CAPTCHAs are a common hurdle when automating web workflows with Selenium. Designed to distinguish human users from bots, they present tasks such as identifying objects in images, solving puzzles, or entering distorted text, which are straightforward for humans but difficult for automated systems. While CAPTCHAs are effective at preventing abuse, they can interrupt Selenium scripts, making it hard to fully automate processes on CAPTCHA-protected websites.
In this article, we’ll walk you through the steps to automatically bypass CAPTCHA in Selenium, ensuring smooth execution of your test or scraping scripts.
Before setting up the process to bypass CAPTCHA in Selenium, ensure the following prerequisites are in place:
pip install selenium
If Selenium is already installed, upgrade it. Selenium 4.6 and later include Selenium Manager, which downloads a matching browser driver automatically:
pip install --upgrade selenium
Install the 2Captcha Python client, which the example below imports as twocaptcha:
pip install 2captcha-python
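Before moving on, it can help to confirm that both packages are actually installed in the environment your script runs in. The following is a minimal sketch using only the standard library; the helper names installed_version and check_prerequisites are illustrative and not part of Selenium or 2Captcha.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def check_prerequisites(packages=("selenium", "2captcha-python")):
    """Map each distribution name to its installed version (None when not installed)."""
    return {name: installed_version(name) for name in packages}


# Print a short report, one line per package
for name, version in check_prerequisites().items():
    print(f"{name}: {version or 'not installed'}")
```

If either line reports "not installed", rerun the matching pip install command in the same virtual environment.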
CAPTCHAs are often a barrier when automating web interactions. This method demonstrates how to bypass CAPTCHA using Selenium and the 2Captcha service with a simple example.
Here's how you can integrate it step by step.
Start by importing the Selenium classes, the twocaptcha client, and the time module used for pauses.
from selenium.webdriver.common.by import By
from twocaptcha import TwoCaptcha
from selenium import webdriver
import time
Use Selenium to open a browser instance and navigate to the page with the CAPTCHA you want to bypass.
driver = webdriver.Chrome() # Selenium 4.6+ fetches a matching ChromeDriver; older versions need it on your PATH
url = "https://2captcha.com/demo/normal" # Demo page for testing
driver.get(url)
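Locate the CAPTCHA image, extract its URL, and pass it to the solver.normal() method of 2Captcha. Replace Your_2Captcha_API_key with your actual API key. The class names in the XPath expressions below appear to be auto-generated by the demo page and may change, so verify them in your browser's developer tools before running the script.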
imgResults = driver.find_elements(By.XPATH, "//img[contains(@class,'_2hXzbgz7SSP0DXCyvKWcha')]")
solver = TwoCaptcha('Your_2Captcha_API_key') # Initialize the solver
result = solver.normal(imgResults[0].get_attribute("src")) # Solve the CAPTCHA
print("Solved CAPTCHA: " + str(result))
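Solving services can time out or occasionally return an unusable answer, and hardcoding the API key makes the script awkward to share. The sketch below shows one possible way to handle both. It assumes the key is stored in an environment variable named TWOCAPTCHA_API_KEY (a name chosen here for illustration) and that the solver returns a dictionary with a "code" entry, as the example above does. The helper names load_api_key and solve_with_retry are hypothetical.

```python
import os
import time


def load_api_key(env_var="TWOCAPTCHA_API_KEY"):
    """Read the API key from an environment variable (the name is an assumption)."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return key


def solve_with_retry(solve, image_src, attempts=3, delay=5.0):
    """Call solve(image_src) until it returns a dict with a non-empty 'code'.

    `solve` is any callable, for example the `normal` method of a TwoCaptcha
    instance. Raises RuntimeError once all attempts are used up.
    """
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            result = solve(image_src)
            code = result.get("code") if isinstance(result, dict) else None
            if code:
                return code
            last_error = ValueError(f"no 'code' in solver response: {result!r}")
        except Exception as exc:  # the solver may raise its own error types
            last_error = exc
        if attempt < attempts:
            time.sleep(delay)  # pause before the next attempt
    raise RuntimeError(f"CAPTCHA not solved after {attempts} attempts") from last_error
```

With these helpers, the solving step could be written as code = solve_with_retry(TwoCaptcha(load_api_key()).normal, imgResults[0].get_attribute("src")), and the returned code typed into the CAPTCHA field.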
Find the CAPTCHA input field and submit button, then use the solution returned by 2Captcha to complete the process.
captchafield = driver.find_element(By.XPATH, "//input[contains(@class,'_26Pq0m_qFk19UXx1w0U5Kv')]")
captchafield.send_keys(result["code"]) # Enter the CAPTCHA solution
button = driver.find_element(By.XPATH, "//button[contains(@class, 'l2z7-tVRGe-3sq5kU4uu5 _2xjDiWmBxfqem8nGQMmGci _2HIb5VBFp6Oi5_JoLdEcl6 _2vbG_IBm-DpI5KeEAHJkRy')]")
button.click() # Submit the form
time.sleep(10) # Allow time for the page to load
Finally, confirm that the CAPTCHA has been bypassed by checking the success message on the page:
messagefield = driver.find_element(By.XPATH, "//p[contains(@class,'_2WOJoV7Dg493S8DW_GobSK')]")
print("Result: " + messagefield.text) # Output: Captcha is passed successfully!
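A fixed time.sleep(10) either wastes time or is too short on a slow connection. One alternative is to poll for the success message until it appears. The sketch below shows the polling logic in plain Python. The helper name wait_for_text is illustrative, and it relies only on the driver's find_element method, where the string "xpath" is the value behind By.XPATH. Selenium's own WebDriverWait with expected_conditions does the same job and is usually the better choice in real scripts.

```python
import time


def wait_for_text(driver, xpath, timeout=15.0, poll=0.5):
    """Poll until the element at `xpath` has non-empty text, then return it.

    Raises TimeoutError if no text appears within `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            text = driver.find_element("xpath", xpath).text.strip()
            if text:
                return text
        except Exception:
            # Broad catch keeps the sketch free of Selenium imports; in practice
            # you would catch NoSuchElementException and similar errors only.
            pass
        if time.monotonic() >= deadline:
            raise TimeoutError(f"no text found at {xpath!r} within {timeout} seconds")
        time.sleep(poll)
```

In the example above, the sleep and the lookup could then be replaced with a single call such as wait_for_text(driver, "//p[contains(@class,'_2WOJoV7Dg493S8DW_GobSK')]").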
Apart from using 2Captcha, you can also use the selenium-stealth Python package to reduce detection when scraping with Selenium. It patches browser properties that commonly reveal automation (such as navigator.webdriver) so that Chrome looks more like a regular user session, which can lower the likelihood of encountering CAPTCHAs or being blocked.
When automating CAPTCHA bypass with Selenium, there are several common issues you may encounter.
Problem: Some websites can detect that Selenium is running in headless mode and may block access or treat the session as a bot. This can happen even when you set a custom user-agent.
Solution: Use a driver that is patched to hide common automation markers, such as undetected-chromedriver. Combine it with a realistic user-agent and an explicit window size so the headless browser looks closer to a regular desktop session. Some sites may still detect it.
import undetected_chromedriver as uc # The old .v2 submodule import is deprecated
options = uc.ChromeOptions()
options.add_argument('--headless')
options.add_argument('user-agent=your_custom_user_agent')
driver = uc.Chrome(options=options)
Problem: A custom user-agent may appear to apply only to the initial page load, with later page requests falling back to the default value.
Solution: Set the user-agent through the browser options before the driver starts, so it applies to the whole session rather than to a single request. In Chrome this is the user-agent argument shown below. The set_preference method belongs to Firefox options, where the equivalent preference is general.useragent.override.
options = webdriver.ChromeOptions()
options.add_argument('user-agent=your_custom_user_agent')
driver = webdriver.Chrome(options=options)
driver.get("https://example.com")
# Continue with your automation
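To check whether the user-agent really persists, you can read navigator.userAgent on each page you visit and compare it with the value you configured. The sketch below assumes a driver object that exposes get and execute_script, as Selenium drivers do. The function name find_user_agent_mismatches is illustrative.

```python
def find_user_agent_mismatches(driver, expected_user_agent, urls):
    """Visit each URL and return (url, actual_user_agent) pairs that differ.

    An empty list means the configured user-agent was reported on every page.
    """
    mismatches = []
    for url in urls:
        driver.get(url)
        # Ask the page itself which user-agent the browser reports
        actual = driver.execute_script("return navigator.userAgent;")
        if actual != expected_user_agent:
            mismatches.append((url, actual))
    return mismatches
```

If a mismatch does show up in Chrome, one option worth trying is the DevTools override, for example driver.execute_cdp_cmd("Network.setUserAgentOverride", {"userAgent": "your_custom_user_agent"}), which is available on Chromium-based drivers.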
Problem: When using Selenium with Chrome in headless mode, some users report that the browser fails to launch or render correctly, even after setting a custom user-agent.
Solution: Check that the --headless flag is passed correctly. Specifying the window size explicitly or adjusting GPU-related flags (for example, --disable-gpu on older Chrome versions) may resolve the issue.
options = webdriver.ChromeOptions()
options.add_argument('user-agent=your_custom_user_agent')
options.add_argument('--headless')
options.add_argument('--window-size=1920,1080') # Width and height are separated by a comma
driver = webdriver.Chrome(options=options)
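Because the same few flags recur in every configuration above, it can be convenient to build them in one place. The helper below is a sketch only; the name build_chrome_args and its defaults are assumptions. It uses --headless=new, which selects Chrome's newer headless mode on recent Chrome versions (roughly 109 and later); on older versions, plain --headless is the fallback.

```python
def build_chrome_args(user_agent=None, headless=True, window_size=(1920, 1080)):
    """Return a list of Chrome command-line arguments for a Selenium session."""
    args = []
    if headless:
        # Newer headless mode; swap for "--headless" on older Chrome versions
        args.append("--headless=new")
    if window_size:
        width, height = window_size
        # Chrome expects "width,height" with a comma, not "1920x1080"
        args.append(f"--window-size={width},{height}")
    if user_agent:
        args.append(f"--user-agent={user_agent}")
    return args
```

To apply the result, loop over it and call options.add_argument(arg) for each entry before creating the driver with webdriver.Chrome(options=options).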
Bypassing CAPTCHA in Selenium can make your automation tasks far more reliable, but it requires careful handling of web driver settings, custom user-agent configuration, and integration with a CAPTCHA-solving service. By following the steps outlined above and addressing common issues such as headless mode detection, user-agent persistence, and headless launch failures, you can streamline your automation process.