Updated on May 21, 2024

How To Execute JavaScript on Page in Puppeteer: Examples

Puppeteer is a Node.js library for controlling Chrome, headless or not, and is widely used for web scraping and automated browser testing. One of its key features is the ability to execute JavaScript directly within the context of a web page, which lets you interact with elements, manipulate data and extract information from the rendered DOM.

Step-by-step guide to executing JavaScript in Puppeteer

Let’s discuss the step-by-step process of setting up Puppeteer, navigating to a web page and integrating JavaScript execution to enhance your automation tasks.

Setting up Puppeteer

Before using Puppeteer, you need to set it up in your development environment. Puppeteer is installed via npm (Node Package Manager), and by default the install also downloads a compatible browser build, so no separate Chrome setup is required. Run the following command in your terminal to install Puppeteer:


npm install puppeteer

Navigating to a web page

Before you can execute JavaScript on a web page, you need to load it. Puppeteer’s page.goto() method navigates the browser tab to a given URL and resolves once the page has loaded. Here’s a basic example of how to navigate to a web page using Puppeteer:


const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  await page.goto('https://example.com');
  // Further code goes here
  await browser.close();
})();
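If any step between launching and closing throws, the script above would leave the browser process running. One way to guard against that is sketched here. The helper name withPage, its parameters and the choice of the networkidle2 wait condition are assumptions made for this illustration, not part of the original example.

```javascript
// Hypothetical helper: opens a page, runs `task`, and always closes the browser.
// `launcher` is expected to be the object returned by require('puppeteer').
async function withPage(launcher, url, task) {
  const browser = await launcher.launch();
  try {
    const page = await browser.newPage();
    // 'networkidle2' treats navigation as finished once there are no more than
    // two network connections for at least 500 ms. This is an assumption that
    // tends to suit pages which load data after the initial HTML.
    await page.goto(url, { waitUntil: 'networkidle2' });
    return await task(page);
  } finally {
    // Runs even if goto() or task() throws.
    await browser.close();
  }
}
```

With Puppeteer installed, this could be called as withPage(require('puppeteer'), 'https://example.com', (page) => page.title()).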

Executing JavaScript code on the page context

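You can execute JavaScript code using Puppeteer’s page.evaluate() method, which runs the provided function within the context of the page and returns its result to your Node.js script. The return value has to be serializable, so return plain data such as strings, numbers, arrays and objects rather than DOM nodes. Here’s how you can execute JavaScript code on a page with Puppeteer: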


// Runs inside the async function, after page.goto()
const result = await page.evaluate(() => {
  // This function executes in the browser, so document refers to the page's DOM
  return document.title;
});

console.log(result);

In this snippet, Puppeteer evaluates the provided function on the page and returns the title of the web page.
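The function given to page.evaluate() is serialized and run in the browser, so it cannot see variables from the surrounding Node.js scope. Values from your script have to be passed as extra arguments instead. The sketch below illustrates this; countMatches and countOnPage are hypothetical names, and page is assumed to be a Puppeteer page that has already navigated to a URL.

```javascript
// Runs in the browser: `document` here is the page's DOM, not a Node.js object.
function countMatches(selector) {
  return document.querySelectorAll(selector).length;
}

// Runs in Node.js: `page` is assumed to be a Puppeteer Page that has already
// loaded a URL. The selector travels to the browser as an argument, because
// closures over Node.js variables are not available inside the page.
async function countOnPage(page, selector) {
  return page.evaluate(countMatches, selector);
}
```

Inside the async function from the earlier example, await countOnPage(page, 'a') would give the number of links on the loaded page.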

Example: executing JavaScript on a page with Puppeteer

In this section, we’ll use Puppeteer to execute JavaScript that scrapes book data from https://books.toscrape.com/, a demo bookstore website.

Launching Puppeteer and navigating to the website

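First, we load Puppeteer, launch a new browser instance, create a new page and navigate to the website https://books.toscrape.com/. The async function opened here stays open: the snippets in the following steps go inside it, and it is closed in the final step.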


const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  await page.goto('https://books.toscrape.com/');

Executing JavaScript to extract book data

Now, we execute custom JavaScript code within the context of the webpage to extract book titles, prices and availability.


  // Extracting book titles
  const titles = await page.evaluate(() => {
    const titleNodes = document.querySelectorAll('.product_pod h3 a');
    // The link text is truncated for long titles on this site; the full
    // title is in the link's title attribute, so prefer it when present.
    return Array.from(titleNodes).map(
      node => node.getAttribute('title') || node.textContent.trim()
    );
  });

  // Extracting book prices
  const prices = await page.evaluate(() => {
    const priceNodes = document.querySelectorAll('.product_pod p.price_color');
    return Array.from(priceNodes).map(node => node.textContent.trim());
  });

  // Extracting book availability
  const availability = await page.evaluate(() => {
    const availabilityNodes = document.querySelectorAll('.product_pod p.availability');
    return Array.from(availabilityNodes).map(node => node.textContent.trim());
  });

Using document.querySelectorAll(), we target specific elements inside each .product_pod container: the title link nested in an <h3> tag, the price in a <p> tag with the class price_color, and the availability in a <p> tag with the class availability.
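Because each query returns its own array, the three results line up only if every book has all three elements. A sketch of an alternative that reads all fields from each .product_pod container in a single pass follows. The names extractBooks and scrapeBooks are hypothetical, page.$$eval() is used here instead of page.evaluate() for brevity, and the fallback to the link's title attribute assumes the site keeps the full book title there.

```javascript
// Runs in the browser: receives the array of matched .product_pod elements.
function extractBooks(pods) {
  return pods.map((pod) => {
    const link = pod.querySelector('h3 a');
    const price = pod.querySelector('p.price_color');
    const stock = pod.querySelector('p.availability');
    return {
      // Assumption: the title attribute holds the full title; fall back to the text.
      title: link ? link.getAttribute('title') || link.textContent.trim() : null,
      // A missing element becomes null instead of shifting later entries.
      price: price ? price.textContent.trim() : null,
      availability: stock ? stock.textContent.trim() : null,
    };
  });
}

// Runs in Node.js: `page` is assumed to be a Puppeteer Page already on the listing.
async function scrapeBooks(page) {
  return page.$$eval('.product_pod', extractBooks);
}
```

Scoping each lookup to its container keeps a book's fields together even when one of them is absent.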

Combining extracted data into an array of objects

In this step, we merge the extracted book titles, prices and availability into an array of objects. Using the map() method, each book’s information is paired together into a single object within the booksData array.


  const booksData = titles.map((title, index) => ({
    title,
    price: prices[index],
    availability: availability[index]
  }));
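The objects built above hold raw strings, such as a price that includes its currency symbol. If numeric prices or a boolean stock flag are needed downstream, a normalization step might look like the following sketch. The helper normalizeBook, the inStock field and the sample record are hypothetical, and the parsing assumes prices formatted like £51.77.

```javascript
// Hypothetical helper: turns a scraped record of strings into typed fields.
function normalizeBook(book) {
  // Keep digits and the decimal point, dropping the currency symbol: "£51.77" -> "51.77".
  const numeric = Number.parseFloat(String(book.price).replace(/[^0-9.]/g, ''));
  return {
    title: book.title,
    // Unparseable or missing prices become null rather than NaN.
    price: Number.isNaN(numeric) ? null : numeric,
    inStock: /in stock/i.test(String(book.availability)),
  };
}

// Made-up sample record in the same shape as the entries of booksData.
const sample = { title: 'Example Book', price: '£51.77', availability: 'In stock' };
console.log(normalizeBook(sample));
// { title: 'Example Book', price: 51.77, inStock: true }
```

Applied to the scraped data, this would be booksData.map(normalizeBook).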

Writing data to a JSON file

Lastly, we write the extracted data to a JSON file and close the Puppeteer browser instance.


  // Write data to a JSON file
  const fs = require('fs');
  fs.writeFileSync('booksData.json', JSON.stringify(booksData, null, 2));

  await browser.close();
})();
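The synchronous write above is fine for a short script. In a longer-running program a non-blocking write may be preferable, and a minimal sketch using Node’s fs/promises module is shown here. The helper name saveJson is hypothetical, and creating the parent directory is an extra convenience that the original step does not need.

```javascript
const fs = require('fs/promises');
const path = require('path');

// Hypothetical helper: writes `data` as pretty-printed JSON and returns the path.
async function saveJson(filePath, data) {
  // Create the parent directory if needed so the write does not fail on a new path.
  await fs.mkdir(path.dirname(filePath), { recursive: true });
  await fs.writeFile(filePath, JSON.stringify(data, null, 2) + '\n', 'utf8');
  return filePath;
}
```

Inside the async function, the write step would then become await saveJson('booksData.json', booksData).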

The resulting booksData.json file contains an array with one object per book, each holding its title, price and availability.

Conclusion

In this article, we explored the capabilities of Puppeteer in executing JavaScript on web pages, focusing on the example of extracting data from https://books.toscrape.com/. We demonstrated how Puppeteer enables navigation through web pages, execution of JavaScript to extract targeted information such as book titles, prices and availability, and finally, storing the scraped data in JSON format.
