Resolve Captcha
Written by Scarlett
Updated yesterday

CAPTCHA is a very common anti-scraping technique applied by many websites in different forms.

To help improve the efficiency of scraping, Octoparse can currently handle four kinds of Captcha automatically: ImageCaptcha, hCaptcha, ReCaptcha V2, and ReCaptcha V3.

hCaptcha, ReCaptcha V2, and ReCaptcha V3 are set up in the same way, while ImageCaptcha takes a few more steps to configure.

Follow this tutorial to get a basic understanding of each Captcha type and learn how to handle it with Octoparse.


1. What are hCaptcha, ReCaptcha V2 & V3?

  • hCaptcha usually combines:

- an "I am human" button with the logo of hCaptcha

- and simple questions (in pictures) that are easy for humans and difficult for machines.

  • ReCaptcha V2

ReCaptcha V2 usually has an "I'm not a robot" checkbox; sometimes it also presents simple picture questions similar to those of hCaptcha.

  • ReCaptcha V3 looks similar to ReCaptcha V2, but it does not have a checkbox.


2. How to solve hCaptcha, ReCaptcha V2 & V3

  • Click the Add Step button in the workflow

  • Select Solve CAPTCHA

  • Click on the Solve CAPTCHA Type

  • Select the CAPTCHA type based on the Captcha you encounter

Note: If the ReCaptcha or hCaptcha you encounter includes a submit button (see the screenshot below), select ReCaptcha V2 Checkbox or hCaptcha Checkbox. Otherwise, choose ReCaptcha or hCaptcha.

  • Click Apply to save the settings

11.png

Note:

  • For ReCaptcha or hCaptcha with a submit button, you will need to set up one more action.

a. Click the button that takes you to the target page

(It could be a submit button, sign in button, or confirm button)

b. Choose Click element/Click button

  • hCaptcha and ReCaptcha are resolved automatically only during an actual data run. While creating the task, turn on Browse Mode and solve the Captcha manually to proceed.


3. What is Image Captcha?

ImageCaptcha is the original form of human verification. The image can show known words or phrases, or a random combination of digits and letters. Some ImageCaptchas also mix uppercase and lowercase letters.


4. How to solve Image Captcha

To follow through with the tutorial and resolve ImageCaptcha, you may use this URL: https://democaptcha.com/demo-form-eng/image.html

A. Select the Input Box and Image Box for the Captcha

  • Click on the Input Box for the Captcha

  • Select Solve CAPTCHA on the Tips panel

  • Click the Image Box

  • Click the Login/Submit/Confirm button to continue (sometimes it can be other buttons such as 'Send' in this specific case)

  • Click Confirm on the Tips Panel
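Clicking the Input Box and the Image Box tells Octoparse where these two elements are on the page. As a rough illustration of what such element locators can look like when written as XPath, here is a minimal Python sketch. The markup, the element id and name, and both XPath expressions are hypothetical; they are not taken from the demo page or from Octoparse, and the standard-library parser used here supports only a subset of XPath.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified form markup; a real page will be structured differently.
FORM_MARKUP = """
<form id="demo-form">
  <img id="captcha-image" src="/captcha/image.png" alt="captcha" />
  <input type="text" name="captcha_code" />
  <button type="submit">Send</button>
</form>
""".strip()

# Illustrative XPath expressions (assumptions made for this example only).
IMAGE_XPATH = ".//img[@id='captcha-image']"
INPUT_XPATH = ".//input[@name='captcha_code']"


def locate(root, xpath):
    """Return the first element matching the XPath, or None if nothing matches."""
    return root.find(xpath)


root = ET.fromstring(FORM_MARKUP)
image_box = locate(root, IMAGE_XPATH)
input_box = locate(root, INPUT_XPATH)

print(image_box.get("src"))   # /captcha/image.png
print(input_box.get("name"))  # captcha_code
```

If an expression matches nothing, `locate` returns `None`, which is the situation to watch for when a page's layout changes.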

B. Set up a Captcha Solving Failure

Next, show Octoparse what a failed attempt looks like so that it can recognize when the Captcha has not been solved.

  • Click on the error message (in this case: "Some errors were detected in your form: Invalid verification code")

  • Click Confirm Error on the Tips panel

C. Set up a Captcha Solving Success

  • Click Set Up CAPTCHA Solving Success to go through the final steps

  • Input the text shown in the Image Box

  • Click Submit CAPTCHA answer and complete setup

The Image Captcha setup is now complete. The Solve CAPTCHA step is added to the workflow, where you can also modify its settings.
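The failure and success setup in steps B and C can be pictured as a simple retry loop: submit an answer, look for the error message, and try again if it appears. The Python sketch below only illustrates that idea and is not how Octoparse is implemented. The function names, the `max_attempts` limit, the sample answer, and the success message are all assumptions; only the error text comes from the demo form used above.

```python
from typing import Callable, Iterable, Tuple

# Error text shown by the demo form in this tutorial.
ERROR_MARKER = "Invalid verification code"


def solve_with_retries(
    guesses: Iterable[str],
    submit: Callable[[str], str],
    error_marker: str = ERROR_MARKER,
    max_attempts: int = 3,
) -> Tuple[bool, int]:
    """Submit guesses until a response no longer contains the error marker.

    Returns (solved, attempts_used).
    """
    attempts = 0
    for guess in guesses:
        if attempts >= max_attempts:
            break
        attempts += 1
        response = submit(guess)
        if error_marker not in response:
            return True, attempts
    return False, attempts


def fake_submit(answer: str) -> str:
    """Hypothetical stand-in for the form; accepts only the answer 'x7Kp2'."""
    if answer == "x7Kp2":
        return "Thank you, your message was sent."
    return "Some errors were detected in your form: Invalid verification code"


# The first guess has the wrong capitalization, so a second attempt is needed.
print(solve_with_retries(["x7kp2", "x7Kp2"], fake_submit))  # (True, 2)
```

The example also shows why capitalization matters for Image Captcha: an answer that differs only in letter case still triggers the error message.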


Note:

  • hCaptcha, ReCaptcha V2, and ReCaptcha V3 are detected automatically, so there is no need to set an XPath to locate them. Image CAPTCHA cannot be detected without an XPath, so check that the XPath in the settings points to the right elements.

  • The cost is $1 per 1,000 CAPTCHA credits. Each attempt to resolve a CAPTCHA counts as one credit, so resolving one CAPTCHA successfully may cost several credits. You can click Add Credits to top up. CAPTCHA credits cannot be refunded. Standard and Professional plan users have been sent some credits for testing, so you can try the feature before paying for credits.

  • Once the credits are used up, the task will fail to resolve the captchas. Thus, before running the task, make sure there are enough credits in your account.
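To check whether an account has enough credits before a run, a rough estimate can be worked out from the pricing above. The Python sketch below assumes that cost scales linearly at $1 per 1,000 credits. The average number of attempts per CAPTCHA is a guess you have to supply yourself, and the function names are made up for this example.

```python
import math

# From the pricing note above: $1 per 1,000 CAPTCHA credits.
PRICE_PER_1000_CREDITS_USD = 1.0


def estimate_credits(captchas: int, avg_attempts: float) -> int:
    """Credits needed if each CAPTCHA takes avg_attempts tries on average, rounded up."""
    return math.ceil(captchas * avg_attempts)


def estimate_cost_usd(credits: int) -> float:
    """Cost of the given number of credits, assuming linear pricing."""
    return credits * PRICE_PER_1000_CREDITS_USD / 1000


def has_enough_credits(balance: int, captchas: int, avg_attempts: float) -> bool:
    """True if the current balance covers the estimated number of credits."""
    return balance >= estimate_credits(captchas, avg_attempts)


# Example: 5,000 CAPTCHAs at an assumed 1.5 attempts each.
credits = estimate_credits(5000, 1.5)
print(credits)                                # 7500
print(estimate_cost_usd(credits))             # 7.5
print(has_enough_credits(6000, 5000, 1.5))    # False
```

Because the attempt rate is only an estimate, it is safer to keep a margin above the computed number of credits.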
