CAPTCHA (Completely Automated Public Turing test to tell Computers and Humans Apart) is a common security mechanism designed to block automated bots from accessing websites. It separates real users from bots by presenting challenges such as image recognition or text entry. In Playwright, a popular end-to-end testing library, a CAPTCHA can stop an automation run outright, so handling it programmatically saves significant time and effort when testing or scraping. In this article, we'll walk through the steps to bypass CAPTCHA automatically with Playwright and keep the automation workflow running smoothly.
How to automatically bypass CAPTCHA with Playwright? →
To bypass CAPTCHA with Playwright effectively, ensure you have the following setup:
python -m venv playwright-captcha-env
source playwright-captcha-env/bin/activate # Use `.\playwright-captcha-env\Scripts\activate` on Windows
pip install playwright
python -m playwright install
pip install 2captcha-python
Follow the steps below to bypass CAPTCHA automatically using Playwright with the 2captcha service:
Import the required libraries, including Playwright for browser automation and 2Captcha to solve CAPTCHA challenges:
from playwright.sync_api import sync_playwright
from twocaptcha import TwoCaptcha
Initialize Playwright to launch a browser and create a new page. Also, instantiate the 2Captcha solver with your API key:
url = "https://patrickhlauke.github.io/recaptcha/"  # Target URL with reCAPTCHA
with sync_playwright() as p:
    browser = p.chromium.launch(headless=True)  # Launch browser in headless mode
    page = browser.new_page()  # Create a new page
    solver = TwoCaptcha("<YOUR_API_KEY>")  # Initialize 2Captcha solver with your API key
Navigate to the URL containing the CAPTCHA and locate the iFrame element that holds the CAPTCHA box. Switch to the iFrame and extract the CAPTCHA site key:
    # (continues inside the `with sync_playwright() as p:` block)
    page.goto(url)  # Open the target URL
    # Obtain the iFrame containing the CAPTCHA box
    captcha_frame = page.wait_for_selector("iframe[src*='recaptcha']")
    # Switch to the content of the CAPTCHA iframe
    captcha_frame_content = captcha_frame.content_frame()
    # Extract the site key from the CAPTCHA iframe
    site_key = captcha_frame.get_attribute("src").split("k=")[-1].split("&")[0]
    # Get the CAPTCHA checkbox element
    captcha_checkbox = captcha_frame_content.wait_for_selector("#recaptcha-anchor")
    # Click the CAPTCHA checkbox to start the challenge
    captcha_checkbox.click()
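The site key extraction above relies on splitting the iframe URL on the text "k=", which can pick up the wrong value if another query parameter happens to end in "k=". As a minimal sketch, the same step can be done by parsing the query string with Python's standard library. The helper name extract_site_key and the example URL are illustrative assumptions, not part of the original script:

```python
from urllib.parse import parse_qs, urlparse


def extract_site_key(iframe_src: str) -> str:
    """Return the value of the `k` query parameter from a reCAPTCHA iframe URL."""
    query = parse_qs(urlparse(iframe_src).query)
    keys = query.get("k")
    if not keys:
        raise ValueError(f"no site key (k=) found in iframe src: {iframe_src!r}")
    return keys[0]


# Hypothetical iframe src, shaped like the ones reCAPTCHA generates
example_src = (
    "https://www.google.com/recaptcha/api2/anchor"
    "?ar=1&k=EXAMPLE_SITE_KEY&co=abc&hl=en"
)
print(extract_site_key(example_src))
```

In the script, this would be called as extract_site_key(captcha_frame.get_attribute("src")).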
Use the 2Captcha service to solve the CAPTCHA by passing the extracted site key. Retrieve the solution and input it into the hidden CAPTCHA response field:
    # Solve the CAPTCHA using 2Captcha
    captcha_response = solver.recaptcha(sitekey=site_key, url=url)
    if captcha_response:
        # Extract the CAPTCHA response token from the result
        # (done after the check so an empty result is never indexed)
        captcha_token = captcha_response["code"]
        # Fill the solved CAPTCHA token into the response field
        filled_value = page.evaluate(
            f'document.querySelector("#g-recaptcha-response").value="{captcha_token}"'
        )
        # Print the field value to confirm the token
        print(filled_value)
        # Take a screenshot of the page for verification
        page.screenshot(path="screengrab.png")
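The token-filling step builds its JavaScript by interpolating the token into a string. That works for typical tokens, but it would break if the value ever contained a double quote. A hedged alternative is to pass the token as an argument to page.evaluate(), which Playwright serializes for us. The helper name inject_captcha_token is an assumption made for this sketch, and the page parameter is expected to be a Playwright Page:

```python
# JavaScript function that receives the token as its argument
INJECT_TOKEN_JS = """(token) => {
    const field = document.querySelector("#g-recaptcha-response");
    if (!field) return null;
    field.value = token;
    return field.value;
}"""


def inject_captcha_token(page, token: str):
    """Write the token into the hidden response field and return the field's value.

    Returns None if the page has no #g-recaptcha-response element.
    """
    # The token travels as data, not as part of the script source
    return page.evaluate(INJECT_TOKEN_JS, token)
```

Inside the script, the call would look like filled_value = inject_captcha_token(page, captcha_token).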
After filling in the CAPTCHA token, proceed with further actions (like form submission) and close the browser:
    page.wait_for_timeout(5000)  # Wait to observe the result
    # Close the browser session
    browser.close()
Apart from this method, the Playwright Stealth plugin also helps bypass CAPTCHAs by making automated interactions appear more like human behavior. This open-source plugin enhances Playwright with various evasion techniques that help to avoid detection by CAPTCHA systems. For instance, it can modify the User Agent to mimic a real browser, spoof runtime environments, disable WebRTC to prevent IP address identification, and alter the WebDriver navigator field to avoid typical scraping patterns.
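To make one of these evasion techniques concrete, here is a hand-rolled sketch of the navigator.webdriver change using Playwright's own add_init_script() method. This is not the Stealth plugin's code or API, only an illustration of the idea. The helper name apply_basic_evasion is an assumption, and the context parameter is expected to be a Playwright BrowserContext:

```python
# Script that runs in every new page before the site's own scripts
HIDE_WEBDRIVER_JS = """
Object.defineProperty(navigator, "webdriver", { get: () => undefined });
"""


def apply_basic_evasion(context) -> None:
    """Register an init script that hides navigator.webdriver on every new page."""
    context.add_init_script(script=HIDE_WEBDRIVER_JS)
```

A typical use would be to create a context with browser.new_context(user_agent="..."), call apply_basic_evasion(context), and then open pages with context.new_page(). A single tweak like this covers far less than the plugin does, so treat it as a starting point only.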
When automating CAPTCHA solving with Playwright and third-party services, you may encounter some common issues. Here’s how to resolve them:
Issue: Playwright fails to locate the CAPTCHA element due to an incorrect selector.
Solution: Double-check the CSS selector for the CAPTCHA. Use a more flexible selector, such as one that matches part of the src attribute of the CAPTCHA image:
captcha_element = page.locator("img[src*='captcha']")
captcha_element.screenshot(path="captcha.png")
You can also use Playwright's built-in debugging tools to inspect the page and find the correct selector. Use page.pause() to pause the script and inspect the elements in the browser.
Issue: Delays in receiving the CAPTCHA solution from the service.
Solution: Implement a retry mechanism with time.sleep() to wait before retrying the request for a solution:
import time
time.sleep(5) # Wait for 5 seconds before retrying
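A single sleep only delays the script; it does not retry anything by itself. The following is a minimal sketch of a retry loop built around it. The helper name solve_with_retry and its parameters are assumptions made for illustration, and the solve argument is any zero-argument callable that requests a solution:

```python
import time


def solve_with_retry(solve, attempts=3, delay=5.0, sleep=time.sleep):
    """Call solve() until it succeeds or the attempts run out."""
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return solve()
        except Exception as error:  # narrow this to the solver's own exceptions in real code
            last_error = error
            if attempt < attempts:
                sleep(delay)  # wait before the next attempt
    raise RuntimeError(f"CAPTCHA not solved after {attempts} attempts") from last_error
```

With the script from this article, it could be used as captcha_response = solve_with_retry(lambda: solver.recaptcha(sitekey=site_key, url=url)).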
Issue: Using an incorrect or expired API key will cause request failures.
Solution: Ensure the API key is valid and correctly included in your requests. Check the service's dashboard for the correct key:
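import requests

api_key = "your_valid_api_key"
# Upload the CAPTCHA image saved earlier as captcha.png
with open("captcha.png", "rb") as captcha_image:
    response = requests.post(
        "https://2captcha.com/in.php",
        data={"key": api_key, "method": "post"},
        files={"file": captcha_image},
    )
print(response.text)  # Starts with "OK|" on success, otherwise an error code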
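To tell a key problem apart from a successful submission, you can inspect the text the service sends back. This sketch assumes the plain-text reply format, where a successful submission looks like "OK|" followed by a task ID and anything else is an error code. The helper name parse_submit_reply is hypothetical:

```python
def parse_submit_reply(text: str) -> str:
    """Return the task ID from a plain-text in.php reply, or raise on an error reply."""
    reply = text.strip()
    status, _, task_id = reply.partition("|")
    if status == "OK" and task_id:
        return task_id
    # Anything else is treated as an error code from the service
    raise RuntimeError(f"2Captcha rejected the request: {reply}")
```

For example, parse_submit_reply(response.text) returns the task ID when the key is accepted and raises with the service's error code when it is not, so a bad key fails loudly instead of silently.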
Bypassing CAPTCHA with Playwright combines automation with third-party CAPTCHA-solving services for efficient handling of challenges. To enhance reliability and handle IP-based restrictions, you can integrate a proxy service like Webshare. This allows you to rotate IPs seamlessly and avoid detection during automation.
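As a rough sketch of IP rotation, the snippet below cycles through a list of proxy settings in round-robin order. The endpoints and credentials are placeholders, not real servers, and the names PROXIES and next_proxy are assumptions for this example. The dictionary keys (server, username, password) match what Playwright's launch() accepts in its proxy option:

```python
from itertools import cycle

# Hypothetical proxy endpoints and credentials; replace with the ones from your provider
PROXIES = [
    {"server": "http://proxy1.example.com:8080", "username": "user", "password": "pass"},
    {"server": "http://proxy2.example.com:8080", "username": "user", "password": "pass"},
]

_proxy_pool = cycle(PROXIES)


def next_proxy() -> dict:
    """Return a copy of the next proxy settings in round-robin order."""
    return dict(next(_proxy_pool))
```

Each new browser can then be started with browser = p.chromium.launch(headless=True, proxy=next_proxy()), so consecutive runs leave from different IP addresses.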