Updated on February 15, 2024

Cookies in Puppeteer: How to Accept, Save, Load or Clear Them

In this guide, we’ll cover the following key aspects of cookie management in Puppeteer:

  • Accepting Cookies: We will learn how to effectively interact with websites that prompt users to accept or reject cookies.
  • Saving Cookies: We will discover how to save cookies from Puppeteer sessions to reuse them later. Saving cookies can be valuable when you want to maintain user sessions across multiple interactions with a website.
  • Loading Cookies: We will find out how to load previously saved cookies in Puppeteer sessions. This functionality is essential when you want to emulate returning users or continue sessions from a previous state.
  • Clearing Cookies: We will learn how to clear cookies selectively or entirely from Puppeteer sessions. Clearing cookies is useful for scenarios where you need to simulate a fresh session or reset specific user data.

Accepting Cookies

In this section, we will delve into the process of accepting cookies, specifically from a popup window that many websites display to request user consent for cookie storage.

When you visit a website for the first time or after clearing your browser cookies, you often encounter a popup or banner requesting your consent to store cookies. An automated script needs to handle these popups programmatically, because a consent banner can overlay the page and get in the way of later interactions. To accept cookies from a popup in Puppeteer, you can use the following function:
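A minimal sketch of such a function, written to match the step-by-step description in this section rather than taken from any official listing. The URL and the selectors in the commented usage example (`#cookie-banner`, `#accept-cookies`) are hypothetical placeholders.

```javascript
/**
 * Waits for a cookie consent popup and clicks its "Accept" button.
 * Resolves to true on success and false if any step fails.
 */
async function acceptCookiesFromPopup(page, popupSelector, acceptButtonSelector) {
  try {
    // Wait until an element matching the popup selector is in the DOM.
    await page.waitForSelector(popupSelector);
    // Simulate a user clicking the "Accept" button.
    await page.click(acceptButtonSelector);
    return true;
  } catch (error) {
    // Reached when the popup never appears (timeout) or the click fails.
    console.error('Error accepting cookies:', error);
    return false;
  }
}

// Usage sketch (hypothetical URL and selectors), inside an async function:
// const puppeteer = require('puppeteer');
// const browser = await puppeteer.launch();
// const page = await browser.newPage();
// await page.goto('https://example.com');
// const accepted = await acceptCookiesFromPopup(page, '#cookie-banner', '#accept-cookies');
// await browser.close();
```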

Now, let's break down how this function works in detail:

1) Function Signature: The acceptCookiesFromPopup function accepts three parameters:

  • page: This is a Puppeteer Page object representing the browser page on which we want to accept cookies.
  • popupSelector: A CSS selector for the cookie consent popup element.
  • acceptButtonSelector: A CSS selector for the "Accept" button within the popup.

2) Waiting for the Popup: Inside the function, we use await page.waitForSelector(popupSelector) to make Puppeteer wait until an element matching popupSelector appears in the DOM. This step ensures that Puppeteer doesn't proceed until the cookie consent popup has been added to the page. By default the element only needs to be present, not visible; pass { visible: true } as the second argument if the popup is rendered hidden at first.

3) Clicking the "Accept" Button: Once the popup is present, we simulate a click on the "Accept" button using await page.click(acceptButtonSelector). This action mimics the user clicking the button to accept the cookies.

4) Handling Errors: We wrap the entire process in a try-catch block to handle potential errors gracefully. If any error occurs, for example because waitForSelector times out when no popup appears (after 30 seconds by default), it is caught and an error message is logged to the console. In that case the function returns false to indicate that cookies were not accepted.

5) Success Indicator: If the function successfully completes without encountering any errors, it returns true to indicate that cookies have been accepted.

Saving Cookies

Saving cookies during a Puppeteer session serves several essential purposes:

  • Session Persistence: By saving cookies, you can maintain a user’s session across multiple interactions with a website. This is especially important for scenarios where you need to perform a series of actions that depend on the user’s session state, such as online shopping, where you want to keep items in a cart between visits.
  • Authentication: Cookies often contain authentication tokens or session identifiers. Saving and reusing these cookies allows you to stay logged in as a user, eliminating the need to re-enter login credentials for each interaction.
  • Data Retention: Some websites store user-specific data in cookies, such as preferences or settings. Saving these cookies ensures that the website recognizes the user’s preferences during subsequent visits.

Here is the function for saving cookies:
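A minimal sketch of such a function, reconstructed from the description in this section. Returning false on failure is an assumption made to mirror acceptCookiesFromPopup, and the file name in the usage comment is hypothetical.

```javascript
const fs = require('fs');

/**
 * Saves the cookies of the given Puppeteer page to a JSON file.
 * Resolves to true on success and false on failure (assumed convention).
 */
async function saveCookiesToFile(page, filePath) {
  try {
    // page.cookies() resolves to an array of cookie objects.
    const cookies = await page.cookies();
    // Two-space indentation keeps the JSON file human-readable.
    fs.writeFileSync(filePath, JSON.stringify(cookies, null, 2));
    return true;
  } catch (error) {
    console.error('Error saving cookies:', error);
    return false;
  }
}

// Usage sketch (hypothetical file name), after logging in or browsing:
// await saveCookiesToFile(page, './cookies.json');
```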

Now, let’s break down how this function works:

1) Function Signature: The saveCookiesToFile function accepts two parameters:

  • page: It is the Puppeteer page object representing the browser page from which we want to save cookies.
  • filePath: It is the path of the file where cookies will be saved. This file will store the cookies in JSON format.

2) Getting Cookies: Inside the function, we use await page.cookies() to retrieve all cookies from the current page. Puppeteer’s page.cookies() method returns an array of cookie objects.

3) Writing Cookies to a File: We use the Node.js fs (file system) module to write the retrieved cookies to the specified file. The fs.writeFileSync() function is used to write the cookies in a human-readable JSON format, making it easy to load and reuse them later.

4) Handling Errors: We wrap the entire process in a try-catch block to handle errors.

5) Success Indicator: If the function completes successfully without encountering any errors, it returns true to indicate that cookies have been saved to the specified file. 

Loading Cookies

Loading previously saved cookies lets a new Puppeteer session pick up where an earlier one left off. It is useful for keeping a session going, staying authenticated and restoring user-specific data, for example when you simulate a returning user.

Here is the function for loading cookies:
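A minimal sketch of such a function, reconstructed from the description in this section. The try-catch block and the boolean return value are assumptions made to mirror the other helpers, and the file name and URL in the usage comment are hypothetical.

```javascript
const fs = require('fs');

/**
 * Reads cookies from a JSON file and sets them on the given Puppeteer page.
 * Resolves to true on success and false on failure (assumed convention).
 */
async function loadCookiesFromFile(page, filePath) {
  try {
    // The file is expected to hold a JSON array of cookie objects.
    const cookies = JSON.parse(fs.readFileSync(filePath, 'utf8'));
    // setCookie takes cookie objects as separate arguments, so spread the array.
    await page.setCookie(...cookies);
    return true;
  } catch (error) {
    console.error('Error loading cookies:', error);
    return false;
  }
}

// Usage sketch (hypothetical file name and URL): set the cookies first,
// then navigate so the request is sent with them.
// await loadCookiesFromFile(page, './cookies.json');
// await page.goto('https://example.com');
```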

Let’s break down how this function works:

1) Function Signature: The loadCookiesFromFile function accepts two parameters:

  • page: It is the Puppeteer page object representing the browser page into which we want to load cookies.
  • filePath: It is the path of the file from which cookies will be loaded. This file should contain the cookies in JSON format.

2) Reading Cookies from File: Inside the function, we use the Node.js fs (file system) module to read the cookies from the specified file. We assume that the cookies are stored in JSON format in the file.

3) Setting Cookies in the Page: We use await page.setCookie(...cookies) to set the cookies on the current page. Puppeteer’s page.setCookie method takes one or more cookie objects as separate arguments rather than a single array, which is why the parsed array is spread with the ... operator.

Clearing Cookies

Clearing cookies during a Puppeteer session is essential for various reasons:

  • Simulating a Fresh Session: Clearing cookies allows you to start your automation with a clean slate, simulating a fresh user session. This is particularly useful when you want to ensure that no previous data or settings influence your interactions with a website.
  • Resetting User Data: In certain testing scenarios, you may need to reset specific user data, such as login credentials or session-related information. Clearing cookies selectively enables you to achieve this while retaining other necessary session information.
  • Privacy and Compliance: In compliance with privacy regulations or company policies, you may need to clear cookies regularly to ensure user data is not retained longer than necessary, thus enhancing privacy and compliance.

Here’s the function for clearing cookies:
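A minimal sketch of such a function, reconstructed from the description in this section. The boolean return value is an assumption made to mirror the other helpers, and the cookie name in the usage comment is hypothetical. Note that page.deleteCookie() expects objects with a name property, so the sketch maps each name to { name } before passing it on.

```javascript
/**
 * Clears cookies on the given Puppeteer page.
 * With no names (or an empty array), expires every cookie visible to page scripts.
 * Otherwise deletes only the named cookies.
 */
async function clearCookies(page, cookieNames = []) {
  try {
    if (cookieNames.length === 0) {
      // Runs in the page context: overwrite each cookie with a past expiry date.
      // HttpOnly cookies are not exposed through document.cookie and are not touched.
      await page.evaluate(() => {
        document.cookie.split(';').forEach((cookie) => {
          const name = cookie.split('=')[0].trim();
          if (name) {
            document.cookie = `${name}=; expires=Thu, 01 Jan 1970 00:00:00 GMT; path=/`;
          }
        });
      });
    } else {
      // deleteCookie takes objects such as { name: 'session' }, not bare strings.
      await page.deleteCookie(...cookieNames.map((name) => ({ name })));
    }
    return true;
  } catch (error) {
    console.error('Error clearing cookies:', error);
    return false;
  }
}

// Usage sketch (hypothetical cookie name):
// await clearCookies(page);              // clear everything scripts can see
// await clearCookies(page, ['session']); // clear only the "session" cookie
```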

Let’s break down the working of this function:

1) Function Signature: The clearCookies function accepts two parameters:

  • page: It is a Puppeteer Page object representing the browser page from which we want to clear cookies.
  • cookieNames: It is an optional array of cookie names to be cleared. If provided, only the specified cookies will be cleared. If not provided or an empty array is passed, all cookies on the page will be cleared.

2) Clearing Cookies: Inside the function, we check whether cookieNames is empty. If it is, we clear the cookies on the page by running code in the page context with page.evaluate(). This code iterates over the entries in document.cookie and overwrites each one with an expiration date in the past, which makes the browser delete it. Because page scripts cannot read or modify HttpOnly cookies, this approach leaves those cookies in place.

3) Selective Cookie Clearance: If specific cookie names are provided in the cookieNames array, we use Puppeteer’s page.deleteCookie() method to delete only the specified cookies. This method expects objects with a name property rather than bare strings, so each name is passed as { name }. This allows you to target and clear particular cookies while retaining others.

Similar to the previous sections, we handle errors by wrapping the entire process in a try-catch block.

Conclusion

Mastering cookie management in Puppeteer is crucial for web automation. In this article, we discussed how to accept, save, load and clear cookies programmatically. Accepting cookies allows you to seamlessly interact with websites, while saving and loading cookies ensure session persistence and data retention. Additionally, clearing cookies offers the flexibility to reset sessions or maintain privacy and compliance.
