Updated on May 21, 2024

How to Intercept Requests in Puppeteer?

Puppeteer is a Node.js library that provides a high-level API for controlling Chrome or Chromium over the DevTools Protocol. It is primarily used to automate tasks on web pages, including rendering, scraping, testing, taking screenshots, and generating PDFs. One of Puppeteer's most powerful features is request interception: the ability to observe, block, or modify network requests during page interactions. This capability is essential for developers who need to test how web applications handle various network conditions or to modify requests when testing APIs.

In this tutorial, you will learn how to intercept requests in Puppeteer with step-by-step examples. We will also discuss a real-world example to illustrate these concepts further and provide troubleshooting tips for common errors that might arise.

Observing network requests

One of the main reasons to intercept network requests is simply to observe them. Doing so is useful for several reasons:

  1. Observing network requests is important for performance analysis. It allows developers to identify and address bottlenecks or inefficiencies in data transfer that can impact user experience.
  2. It plays a key role in security audits by enabling developers to scrutinize the data sent to and from the server, helping detect potential security vulnerabilities.
  3. It lets you analyze a redirect chain to see how many redirects happen and what the URLs are, and even modify the requests before the browser follows them.
  4. When integrating external APIs, observing network requests ensures that the application sends correct data and receives expected responses.
  5. It assists in debugging and error tracing by logging request and response data, thus pinpointing where issues such as authentication failures or unexpected server responses occur.

Let's see the steps to intercept network requests to observe them.

Initialize Puppeteer and open a page

Start by setting up Puppeteer and opening a new page. This involves launching the browser, creating a new page instance, and going to the target URL.


const puppeteer = require('puppeteer');

async function startBrowser() {
    // Each await pauses this function until the awaited operation has completed
    const browser = await puppeteer.launch();
    const page = await browser.newPage();
    await page.goto('https://puppeteer-example.com');
    return { browser, page };
}

Set up request interception to observe requests

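To observe the requests, you need to listen to the request event on the page object. This event is emitted for every request made by the page, and by attaching a listener, you can log details about each request. Observation alone does not require calling setRequestInterception; the listener only needs to be attached before the requests you care about are made.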


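(async () => {
    // startBrowser() has already loaded the page, so register the listener
    // first and then reload to see the requests the page triggers
    const { browser, page } = await startBrowser();

    page.on('request', interceptedRequest => {
        console.log(`URL: ${interceptedRequest.url()}`);
        console.log(`Type: ${interceptedRequest.resourceType()}`);
        console.log(`Method: ${interceptedRequest.method()}`);
    });

    await page.reload();

    // Close the browser once the page has finished reloading
    await browser.close();
})();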

This setup logs the URL, resource type, and method of each network request initiated by the page. This information is invaluable for debugging and understanding external interactions.
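Responses and failed requests can be observed in the same way, which helps with the redirect analysis and error tracing mentioned earlier. The following is a minimal sketch, assuming `page` is a Puppeteer page object; `describeResponse` and `attachResponseLogging` are hypothetical helper names introduced only for illustration.

```javascript
// Sketch: summarize a response together with the redirect chain that led to it.
// describeResponse and attachResponseLogging are hypothetical helper names.
function describeResponse(response) {
    const request = response.request();
    // redirectChain() lists the earlier requests that were redirected to reach this one
    const hops = request.redirectChain().map(hop => hop.url());
    return {
        url: response.url(),
        status: response.status(),
        redirectedFrom: hops,
    };
}

function attachResponseLogging(page, log = console.log) {
    // Emitted when a response is received for a request
    page.on('response', response => log(describeResponse(response)));
    // Emitted when a request fails, for example after a timeout or an abort
    page.on('requestfailed', request => {
        const failure = request.failure();
        log({ url: request.url(), error: failure ? failure.errorText : 'unknown' });
    });
}
```

Call `attachResponseLogging(page)` before `page.goto(...)` so that the listeners are in place when the first request goes out.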

Blocking network requests

Blocking specific network requests is useful for testing how your application behaves under conditions where certain types of resources are unavailable. For example, blocking images or stylesheets can simulate a scenario where these resources fail to load due to network issues.

Let's see how to implement blocking network requests in Puppeteer.

Enable Request Interception - Before you can block requests, you must enable request interception on the page object. Once it is enabled, every request is paused until your code continues, aborts, or responds to it.


await page.setRequestInterception(true);

Block Specific Types of Requests - Decide which requests you want to block. For instance, you might want to block all image requests to speed up loading times during testing or to simulate a scenario where images are not available.


page.on('request', interceptedRequest => {
    if (interceptedRequest.resourceType() === 'image') {
        interceptedRequest.abort();
    } else {
        interceptedRequest.continue();
    }
});

await page.goto('https://puppeteer-example.com');
// Close the browser once navigation has finished
await browser.close();

This code snippet stops all image requests while allowing others to continue, which can be particularly useful in performance testing or when ensuring that critical text content is accessible even when images fail to load.
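Requests can also be blocked by destination rather than by resource type. The sketch below is one possible approach, not the only one: `BLOCKED_HOSTS`, `shouldBlock`, and `blockByHost` are hypothetical names, and the host names are placeholders.

```javascript
// Hypothetical blocklist; replace with the hosts you actually want to cut off.
const BLOCKED_HOSTS = new Set(['ads.example.com', 'tracker.example.com']);

function shouldBlock(requestUrl) {
    try {
        return BLOCKED_HOSTS.has(new URL(requestUrl).hostname);
    } catch {
        // Not a parseable URL: let the request through
        return false;
    }
}

// Request handler: abort requests to blocked hosts, continue everything else
function blockByHost(interceptedRequest) {
    if (shouldBlock(interceptedRequest.url())) {
        return interceptedRequest.abort();
    }
    return interceptedRequest.continue();
}
```

With interception enabled, the handler would be registered with `page.on('request', blockByHost)` before navigating.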

Modify network requests

Modifying requests allows developers to change request headers, POST data, or the URL of outgoing requests. This is useful for testing how servers respond to altered inputs or for adding custom headers required by an API without changing the actual application code.

Enable Request Interception - As with blocking, modifying requests requires enabling request interception to manually manage each request.


await page.setRequestInterception(true);

Modify and Continue Requests - You can modify the request by changing its headers, query parameters, or POST data, before calling continue. In the following example, we add a custom header to every request.


page.on('request', interceptedRequest => {
    const headers = Object.assign({}, interceptedRequest.headers(), {
        'My-Custom-Header': 'HeaderValue'
    });
    interceptedRequest.continue({ headers });
});

// Navigate to the target URL; the custom header is added to every request the page makes
await page.goto('https://example.com');
await browser.close();

This approach is beneficial when you need to test how your application or an external API responds to altered header information, which can mimic conditions like authentication tokens being passed or simulate requests from different user agents.
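Headers are not the only thing that can be overridden: the object passed to `continue()` may also carry `url`, `method`, and `postData`. As a minimal sketch, the handler below appends a query parameter to API calls. The `/api/` path prefix, the `debug=1` parameter, and the helper names `withDebugParam` and `addDebugParam` are assumptions made for illustration.

```javascript
// Return the given URL with a debug=1 query parameter set (illustrative parameter)
function withDebugParam(requestUrl) {
    const url = new URL(requestUrl);
    url.searchParams.set('debug', '1');
    return url.toString();
}

// Request handler: rewrite the URL of API calls, pass everything else through
function addDebugParam(interceptedRequest) {
    const originalUrl = interceptedRequest.url();
    // Only rewrite API calls (assumed path prefix) and leave other requests untouched
    if (new URL(originalUrl).pathname.startsWith('/api/')) {
        return interceptedRequest.continue({ url: withDebugParam(originalUrl) });
    }
    return interceptedRequest.continue();
}
```

Overriding the URL this way changes where the request is sent without the page seeing a redirect.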

Real-world examples

There are many real-world use cases for request interception in test automation and scraping.

Testing user authentication

Suppose you want to test how your application handles expired tokens. You can modify the authentication headers to include an expired token and observe how the application behaves, ensuring that it correctly prompts the user to re-authenticate.
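One way to sketch this, assuming the application authenticates with a bearer token in the Authorization header (an assumption, since the scheme depends on your backend): `EXPIRED_TOKEN` is a placeholder value, and `withExpiredToken` and `sendExpiredToken` are hypothetical helper names.

```javascript
// Placeholder value: substitute a token your backend treats as expired.
const EXPIRED_TOKEN = 'expired-token-for-testing';

function withExpiredToken(headers) {
    // Puppeteer reports header names in lower case, so this key replaces any existing value
    return { ...headers, authorization: `Bearer ${EXPIRED_TOKEN}` };
}

// Request handler: forward every request with the expired token attached
function sendExpiredToken(interceptedRequest) {
    return interceptedRequest.continue({
        headers: withExpiredToken(interceptedRequest.headers()),
    });
}
```

After registering the handler on an intercepting page, the test can assert that the application shows its login prompt or redirects to the sign-in page.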

Content availability under poor network conditions

Blocking image and stylesheet requests approximates degraded network conditions in which non-essential resources never arrive, allowing you to test how your application prioritizes critical content and remains functional without them.
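A minimal sketch of such a handler follows; `BLOCKED_TYPES` and `blockNonEssential` are hypothetical names, and the set can be extended with other resource types such as 'font' or 'media'.

```javascript
// Resource types treated as non-essential in this sketch
const BLOCKED_TYPES = new Set(['image', 'stylesheet']);

// Request handler: abort non-essential resources, continue everything else
function blockNonEssential(interceptedRequest) {
    if (BLOCKED_TYPES.has(interceptedRequest.resourceType())) {
        return interceptedRequest.abort();
    }
    return interceptedRequest.continue();
}
```

Register it with `page.on('request', blockNonEssential)` on a page that has request interception enabled.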

API dependency handling

By answering intercepted API requests with predetermined responses instead of letting them reach the server, you can ensure that your application gracefully handles API failures or unexpected responses, such as higher-than-normal latency or incorrect data formats.
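Puppeteer's `respond()` method on an intercepted request fulfills it with a response you supply. The sketch below simulates a failing endpoint; the `/api/user` path, the error payload, and the name `mockApi` are hypothetical.

```javascript
// Request handler: answer one endpoint with a canned error, pass everything else through
function mockApi(interceptedRequest) {
    const { pathname } = new URL(interceptedRequest.url());
    if (pathname === '/api/user') { // hypothetical endpoint
        return interceptedRequest.respond({
            status: 500,
            contentType: 'application/json',
            body: JSON.stringify({ error: 'Internal Server Error' }),
        });
    }
    return interceptedRequest.continue();
}
```

The same pattern can return malformed or unexpected JSON to check how the application copes with incorrect data formats.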

Some websites employ advanced anti-scraping measures. In such cases, consider using puppeteer-extra-plugin-stealth, a plugin for Puppeteer Extra, alongside interception to mask your automation and increase scraping success rates.

It is also worth noting that before modifying a request, you should check whether another handler has already dealt with it, which the request's isInterceptResolutionHandled() method reports. This avoids conflicts and ensures your handler operates on the request as expected.
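A minimal sketch of that check, using the header example from earlier in this article; `addHeaderIfUnhandled` is a hypothetical name.

```javascript
// Request handler: only act if no other listener has resolved the request yet
function addHeaderIfUnhandled(interceptedRequest) {
    // Another listener may already have called abort(), continue(), or respond()
    if (interceptedRequest.isInterceptResolutionHandled()) {
        return undefined;
    }
    const headers = { ...interceptedRequest.headers(), 'my-custom-header': 'HeaderValue' };
    return interceptedRequest.continue({ headers });
}
```

This matters most when several listeners are registered on the same page, for example your own handler plus one added by a plugin.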

Troubleshooting common errors

  1. Requests Not Being Intercepted: This usually occurs if request interception is not enabled before navigation starts. Enable it on the page object right after creating the page and before any navigation or request occurs.
  2. Performance Issues: Enabling request interception can slow down your tests because every request is paused until your handler resolves it, and because enabling interception disables the page cache. Keep your handlers fast, and disable interception when it's not needed.
  3. Unhandled Promise Rejections: Always include error handling in your request interception logic to catch and handle exceptions. This prevents unhandled promise rejections that can crash your application.
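The three points above can be combined in one sketch, assuming `page` is a Puppeteer page object and `handler` is any request handler of yours. `withErrorHandling` and `visitWithInterception` are hypothetical helper names, and continuing the request after a failure is just one possible fallback.

```javascript
// Wrap a handler so a thrown error or rejected promise is logged instead of escaping
function withErrorHandling(handler, log = console.error) {
    return async interceptedRequest => {
        try {
            await handler(interceptedRequest);
        } catch (error) {
            log(`Interception failed for ${interceptedRequest.url()}: ${error.message}`);
            // Release the request so the page does not stall waiting on it
            if (!interceptedRequest.isInterceptResolutionHandled()) {
                await interceptedRequest.continue().catch(() => {});
            }
        }
    };
}

// Enable interception before navigating, and switch it off again afterwards
async function visitWithInterception(page, url, handler) {
    const safeHandler = withErrorHandling(handler);
    await page.setRequestInterception(true);
    page.on('request', safeHandler);
    try {
        await page.goto(url);
    } finally {
        page.off('request', safeHandler);
        await page.setRequestInterception(false);
    }
}
```

A call such as `await visitWithInterception(page, 'https://example.com', myHandler)` keeps interception limited to the one navigation that needs it.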

Conclusion

Intercepting requests with Puppeteer offers powerful capabilities for testing and ensuring the robustness of web applications. Whether observing, blocking, or modifying requests, developers can gain deeper insights into network interactions and simulate various network conditions, leading to better-tested, more reliable applications. By mastering these techniques, you can enhance your testing strategies and effectively handle any network-related challenges in your applications.
