Updated on March 28, 2024

Get Element in Puppeteer: Mastering Class, ID and Text Methods

Efficient element selection is key to successful web automation. In this article, we will discuss Puppeteer’s methods for precisely targeting and manipulating elements on web pages. We’ll explore three essential techniques: getting an element by class, by ID, and by text.

These methods are fundamental to automating web tasks, extracting data, and enhancing your web automation skills. Let’s explore these strategies and equip you with the knowledge and examples needed to master Puppeteer’s element selection capabilities. 

Get Element by Class

In Puppeteer, getting an element by its class is a common and useful operation when automating web interactions. A class is an HTML attribute that can be applied to one or more elements, allowing you to style and group them with CSS. To retrieve an element by class, Puppeteer provides the page.$(selector) method, where selector is a CSS class selector (prefixed with a dot ‘.’). This method returns the first element matching the specified class selector on the page, or null if nothing matches.

Examples

Here are some code examples to illustrate how to use the page.$(selector) method:


const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  
  await page.goto('https://example.com');
  
  // Getting the element with class 'header'
  const headerElement = await page.$('.header');
  
  // Interacting with the element
  await headerElement.click();

  await browser.close();
})();

In this example, we launch a headless browser, navigate to a web page, and then use page.$('.header') to select the element with the class “header”. We can subsequently interact with this element, such as clicking it.

Here’s another example:


const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();

  await page.goto('https://example.com');

  // Waiting for an element with class 'login-button' to appear and then click it
  await page.waitForSelector('.login-button');
  const loginButton = await page.$('.login-button');
  await loginButton.click();

  await browser.close();
})();

In this example, we wait for an element with the class login-button to appear using page.waitForSelector('.login-button') before selecting and clicking it. This ensures that we interact with the element only after it is loaded.
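
Since page.waitForSelector itself resolves to the matched element handle, the separate page.$ lookup is optional. The following is a minimal sketch of that idea, not a definitive implementation: waitAndClick is a hypothetical helper name, the 5000 ms default timeout is an arbitrary choice, and the check on the error name assumes Puppeteer rejects with an error named TimeoutError when nothing appears in time.

```javascript
// Hypothetical helper: waits for `selector`, then clicks the handle that
// page.waitForSelector resolves to, so no second page.$ call is needed.
// Returns true after a click, false if the wait timed out.
async function waitAndClick(page, selector, timeout = 5000) {
  try {
    const handle = await page.waitForSelector(selector, { timeout });
    await handle.click();
    return true;
  } catch (err) {
    // Assumption: a timed-out wait rejects with an error named 'TimeoutError'.
    if (err.name === 'TimeoutError') return false;
    throw err; // anything else is unexpected, so surface it
  }
}
```

Inside the async function from the example above, it could be called as await waitAndClick(page, '.login-button') after page.goto.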

Tips and best practices to Get Element by Class

Here are some useful tips to get an element by its class:

Be Specific: Ensure that the class selector is specific enough to uniquely identify the element you want. Avoid overly generic class names that may match multiple elements on the page.


// Too generic: may match many buttons on the page
const anyButton = await page.$('.button');

// More specific: targets only the login button
const loginButton = await page.$('.login-button');

Error Handling: Always handle the possibility that the element may not exist. Puppeteer will return null if the element is not found, so check for this and handle it in your code.


const element = await page.$('.non-existent-element');
if (element === null) {
  console.log('Element not found.');
} else {
  // Performing actions on the element
  await element.click();
}

Wait for Element: If the element you are looking for is dynamically loaded or takes time to appear on the page, consider using Puppeteer’s page.waitForSelector(selector) method before attempting to select the element.


// Waiting for an element with class 'dynamic-element' to appear
await page.waitForSelector('.dynamic-element');

// Now it's safe to select the element
const element = await page.$('.dynamic-element');
await element.click();

Multiple Elements: If you need to interact with several elements of the same class, use page.$$(selector) to get an array of all matching elements.


// Getting an array of all elements with class 'item'
const elements = await page.$$('.item');

// Looping through and interacting with each element
for (const element of elements) {
  await element.click();
}
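
When you only need data from the matching elements rather than the handles themselves, page.$$eval can read all of them in a single call. The sketch below assumes that use case; getTexts is a hypothetical helper name, and the '.item' class in the usage note is only an illustration.

```javascript
// Hypothetical helper: returns the trimmed text of every element
// that matches `selector`.
async function getTexts(page, selector) {
  // The callback runs inside the page. Only its serializable return value
  // (an array of strings here) is sent back to Node.
  return page.$$eval(selector, (elements) =>
    elements.map((el) => (el.textContent || '').trim())
  );
}
```

For example, const labels = await getTexts(page, '.item') would yield one string per matching element, or an empty array if nothing matches.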

Get Element by ID

In Puppeteer, obtaining an element by its unique ID is a precise and efficient way to interact with a specific element on a web page. An ID is an HTML attribute assigned to a single element, making it distinct from others on the same page. To select an element by its ID, Puppeteer provides the page.$(selector) method, where selector should be a CSS ID selector (prefixed with a hash ‘#’). This method retrieves the first element matching the specified ID selector on the page.

Examples

Below are the examples that show how to use the page.$(selector) method to get an element by ID:


const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  
  await page.goto('https://example.com');
  
  // Getting the element with ID 'unique-element'
  const uniqueElement = await page.$('#unique-element');
  
  // Performing actions on the element
  await uniqueElement.type('Puppeteer is awesome!');
  
  await browser.close();
})();

In this example, we use page.$('#unique-element') to select the element with the unique ID “unique-element”. We then interact with this element by typing into an input field.

Here’s another example:


const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();

  await page.goto('https://example.com');

  // Directly selecting and clicking the element by ID
  const element = await page.$('#unique-element-id');
  if (element) {
    await element.click();
  } else {
    console.log('Element with ID "unique-element-id" not found.');
  }

  await browser.close();
})();

In this example, we select the element with the ID unique-element-id and check that it exists before clicking it. A unique ID guarantees at most one match, but it does not guarantee that the element is already in the DOM, so for dynamically rendered content call page.waitForSelector first.

Tips and best practices to Get Element by ID

Here are some useful tips to get an element by its ID:

Unique IDs: Ensure that the ID you’re using is unique on the page. IDs should only be assigned to a single element. Avoid using duplicate IDs as this can lead to unexpected behavior and may not conform to HTML standards.


// Correct: Using a unique ID
const element = await page.$('#unique-id');

Direct Selection: Because an ID should match at most one element, page.$('#element-id') gives you that element directly, with no further filtering. Uniqueness does not mean the element has loaded, so still check for null, or wait with page.waitForSelector when the content is dynamic.


// Directly selecting the element by ID
const element = await page.$('#unique-element-id');
if (element) {
  await element.click();
}

Consistency in Naming: Maintain a consistent naming convention for the element IDs throughout your project. This makes it easier to locate and manage elements in your Puppeteer scripts, especially when dealing with complex web applications.


// Consistent naming convention for IDs
const usernameInput = await page.$('#username');
const passwordInput = await page.$('#password');

Get Element by Text

In Puppeteer, locating elements based on their textual content is a valuable capability when specific elements on a web page are identifiable by the text they contain. To achieve this, Puppeteer offers the page.$$eval(selector, pageFunction) method. It runs pageFunction inside the page and passes it an array of all elements matching selector, so you can filter those elements by their textContent.

Examples

Below are the examples that demonstrate the use of this method to get an element by text.


const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();

  await page.goto('https://example.com');

  // Finding and clicking a button with the text 'Learn More'
  await page.$$eval('button', buttons => {
    for (const button of buttons) {
      if (button.textContent === 'Learn More') {
        button.click();
        break; // Clicking the first matching button and exiting the loop
      }
    }
  });

  await browser.close();
})();

In this example, we navigate to a webpage and use page.$$eval to find and click the button with the text “Learn More”. We iterate through all matching elements to find the one with the desired text content.

Here’s another example:


const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();

  await page.goto('https://example.com');

  // Finding and highlighting elements with the text 'Special Offer'
  const matchCount = await page.$$eval('*:not(script)', (elements, searchText) => {
    let count = 0;
    elements.forEach(element => {
      if (element.textContent.includes(searchText)) {
        element.style.border = '2px solid red'; // Highlighting matching elements
        count += 1;
      }
    });
    return count; // DOM elements cannot be serialized back to Node, so return a count
  }, 'Special Offer');
  console.log(`Highlighted ${matchCount} elements`);

  await browser.close();
})();

In this example, we locate and highlight every element whose text contains “Special Offer” using page.$$eval. Because textContent includes the text of descendant elements, ancestors such as <body> are highlighted as well, so narrow the selector when you only want the innermost matches.

Tips and best practices to Get Element by Text

Here are some useful tips to get an element by text:

Text Exact Match: Be cautious when using text-based selection. If the text content must match exactly, use ‘===’ for precise matching.


// Exact text match
if (element.textContent === 'Learn More') {
  // Perform actions on the element
}

Text Contains Match: If the text content contains additional characters or spaces, consider using includes() for a partial match.


// Partial text match
if (element.textContent.includes('Special Offer')) {
  // Perform actions on the element
}

Element Specificity: Ensure that the selector used in page.$$eval(selector, pageFunction) is specific enough to narrow down the search scope. Broad selectors may produce unintended matches.


// Correct: A specific selector that narrows the search scope
await page.$$eval('.product-description', descriptions => {
  // ...
});
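
The tips above can be combined into one helper that normalizes whitespace before comparing and returns an element handle you can act on from Node. This is a sketch under stated assumptions: findByText and its exact option are hypothetical names, and it assumes the page does not change between the two lookups.

```javascript
// Hypothetical helper: returns a handle to the first element matching
// `selector` whose text equals (or, with exact: false, contains) `text`.
// Resolves to null when nothing matches.
async function findByText(page, selector, text, { exact = true } = {}) {
  // Step 1: find the index of the first match inside the page.
  const index = await page.$$eval(
    selector,
    (elements, wanted, exactMatch) =>
      elements.findIndex((el) => {
        // Collapse runs of whitespace so 'Learn\n  More' equals 'Learn More'.
        const actual = (el.textContent || '').replace(/\s+/g, ' ').trim();
        return exactMatch ? actual === wanted : actual.includes(wanted);
      }),
    text,
    exact
  );
  if (index === -1) return null;

  // Step 2: fetch handles for the same selector and pick the matching one.
  const handles = await page.$$(selector);
  return handles[index] ?? null;
}
```

A call such as const button = await findByText(page, 'button', 'Learn More') returns a handle that supports click() and the other ElementHandle methods, unlike the in-page click shown earlier.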

Advanced Get Element configuration techniques

In Puppeteer, mastering advanced configurations for element interaction is essential when dealing with complex web pages and dynamic content. In this section, we will cover four aspects of advanced element handling.

Getting multiple elements

In Puppeteer, you often need to select and interact with multiple elements that share the same characteristics, such as the same class. To achieve this, you can use the page.$$ method followed by a selector to get an array of all matching elements.


const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();

  await page.goto('https://example.com');

  // Getting all buttons on the page
  const buttons = await page.$$('button');

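  // Iterating through the buttons and printing their text
  for (const button of buttons) {
    // ElementHandle has no textContent() method, so read the property inside the page
    console.log(await button.evaluate(el => el.textContent));
  }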
  await browser.close();
})();

In this example, we use page.$$('button') to get all buttons on the page and then iterate through them to print their text content. This is useful when you want to work with a set of similar elements.

Getting a list of all elements on a web page

To fetch all elements on a web page, call page.$$('*'). The universal selector ‘*’ matches every element on the page, and the method returns the matches as an array.


const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();

  await page.goto('https://example.com');

  // Getting all elements on the page
  const allElements = await page.$$('*');

  console.log(`Total elements on the page: ${allElements.length}`);

  await browser.close();
})();

Here, we use page.$$('*') to obtain all elements on the page, and we print the total number of elements. This can be useful for comprehensive page analysis.
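
As one possible form of page analysis, the sketch below counts how often each tag occurs. It uses page.$$eval rather than page.$$ so that the counting happens inside the page and only a small object is sent back to Node. countTags is a hypothetical helper name.

```javascript
// Hypothetical helper: returns an object mapping lower-case tag names
// to the number of elements with that tag, e.g. { div: 12, a: 5 }.
async function countTags(page) {
  return page.$$eval('*', (elements) => {
    const counts = {};
    for (const el of elements) {
      const tag = el.tagName.toLowerCase();
      counts[tag] = (counts[tag] || 0) + 1;
    }
    return counts; // a plain object, so it serializes back to Node
  });
}
```

It can be called as const counts = await countTags(page) once the page has loaded.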

Getting href of an element

In web scraping and automation tasks, it is common to extract URLs or links from web pages. Puppeteer provides a straightforward method for fetching the href attribute of an element, such as a hyperlink, making it easy to access and utilize the linked resource.


const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();

  await page.goto('https://example.com');

  // Getting the href of a link with the class 'external-link'
  const link = await page.$('.external-link');
  const href = await link.getProperty('href');
  const hrefValue = await href.jsonValue();

  console.log(`The link's href: ${hrefValue}`);

  await browser.close();
})();

In this example, we select an element with the class external-link, read its href property with getProperty('href'), and convert the resulting handle to a plain string with jsonValue().
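
Note that the href property and the href attribute can differ: the property is the absolute URL resolved by the browser, while the attribute is the raw value written in the HTML (for example a relative path). The sketch below collects both for every matching link. getLinks is a hypothetical helper name.

```javascript
// Hypothetical helper: returns the raw and the resolved href of every
// element matching `selector`.
async function getLinks(page, selector) {
  return page.$$eval(selector, (anchors) =>
    anchors.map((a) => ({
      attribute: a.getAttribute('href'), // raw value as written in the HTML
      resolved: a.href, // absolute URL computed by the browser
    }))
  );
}
```

For scraping, the resolved value is usually the one to follow, since it can be passed straight to page.goto.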

Getting style of an element

In Puppeteer, it’s often necessary to inspect and manipulate the visual properties of web elements, such as their colors, dimensions and positioning. To access the computed styles of an element, you can utilize the page.evaluate method to run code within the context of the web page, allowing you to extract style information.


const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();

  await page.goto('https://example.com');

  // Getting the computed style of an element
  const element = await page.$('.highlighted');
  const styles = await page.evaluate(el => {
    const computedStyle = window.getComputedStyle(el);
    return {
      backgroundColor: computedStyle.backgroundColor,
      color: computedStyle.color,
    };
  }, element);

  console.log('Computed styles:', styles);

  await browser.close();
})();

In this example, we select an element with the class highlighted and use page.evaluate to execute JavaScript code that fetches the computed backgroundColor and color styles of the element.
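
The same lookup can be written more compactly with page.$eval, which selects the element and runs the function in one step. The following is a sketch rather than a definitive implementation: getStyles is a hypothetical helper name, and it assumes property names are given in CSS form (such as 'background-color'), as getPropertyValue expects.

```javascript
// Hypothetical helper: returns the computed values of the given CSS
// properties for the first element matching `selector`.
async function getStyles(page, selector, properties) {
  return page.$eval(
    selector,
    (el, props) => {
      const computed = window.getComputedStyle(el);
      // Build a plain object such as { color: 'rgb(0, 0, 0)' }.
      return Object.fromEntries(
        props.map((name) => [name, computed.getPropertyValue(name)])
      );
    },
    properties
  );
}
```

A call might look like await getStyles(page, '.highlighted', ['background-color', 'color']). Unlike page.$, page.$eval rejects when no element matches, so wrap the call in try/catch if the element may be absent.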

Conclusion

In this article, we explored the powerful capabilities of Puppeteer for element selection and interaction in web automation and scraping tasks. We discussed how to target elements by class, ID and text, along with the best practices for precise and efficient selection. Further, we delved into advanced configurations, such as retrieving multiple elements, obtaining a list of all elements on a page, fetching href attributes, and accessing computed styles. Puppeteer’s versatility and simplicity empower developers to navigate and manipulate web pages with ease. Whether automating repetitive tasks or extracting valuable data, Puppeteer’s element selection methods help unlock the potential of web automation.
