Why does the task wait for a long time before scraping the second page?
Updated over a week ago

You are browsing a tutorial guide for the latest version of Octoparse. If you are running an older version, we strongly recommend upgrading, because the latest version is faster, easier to use, and more robust. Download and upgrade here if you haven't already done so!

If Octoparse takes a long time to move on to the next action in the workflow, or gets stuck after clicking a "Next Page" button, the likely cause is that the "Next Page" button uses AJAX (short for Asynchronous JavaScript and XML). In this tutorial, I will explain how to work around the issue so you can fetch data faster and more efficiently.

Why "AJAX Load" slows down the process

Before Octoparse executes an action such as Click Item or Click to Paginate, it needs to confirm that the page has fully loaded. To do this, Octoparse treats a page reload as the signal that the web page is ready for the next action in the workflow. On a web page that loads with AJAX, though, new content is usually updated without a reload, so Octoparse never gets the signal to proceed. As a result, the task may stall, or you may get no data or far less data than expected.

To work around this issue, we can set up an AJAX Load timeout for the Click Item or Click to Paginate action. When the timeout is reached, Octoparse proceeds to the next action regardless of whether a page reload has been detected.
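The idea behind the timeout can be sketched in a few lines of Python. This is only an illustrative model, not Octoparse's actual implementation: the function name wait_for_page_ready, the reload_detected callback, and the polling interval are all assumptions made for the example.

```python
import time
from typing import Callable


def wait_for_page_ready(
    reload_detected: Callable[[], bool],
    ajax_timeout_s: float,
    poll_interval_s: float = 0.05,
) -> str:
    """Wait for a page reload, but give up once the AJAX timeout elapses.

    Returns "reload" if the (hypothetical) reload signal is seen in time,
    otherwise "timeout". Either result lets the workflow continue.
    """
    deadline = time.monotonic() + ajax_timeout_s
    while time.monotonic() < deadline:
        if reload_detected():
            return "reload"
        time.sleep(poll_interval_s)
    return "timeout"


# An AJAX-style page never reports a reload, so only the timeout
# allows the next action to start.
print(wait_for_page_ready(lambda: False, ajax_timeout_s=0.2))

# A page that reloads normally is picked up right away.
print(wait_for_page_ready(lambda: True, ajax_timeout_s=0.2))
```

Without the deadline, the first call would wait indefinitely, which matches the stalled behavior described above.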

Where to set up the AJAX Load

  • Click the Click Item or Click to Paginate action in the workflow

  • Tick Load with AJAX in the Options tab at the bottom of the workflow

  • Select the AJAX timeout according to how fast your page loads and click Apply to save the setting


Tip: Make sure the timeout is long enough for the page or the target information to load. In most cases, Octoparse detects AJAX and sets the timeout automatically, but you may still need to extend the AJAX timeout for pages that take longer to load. Learn more about AJAX at Handling AJAX.
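One rough way to pick a value is to time a few page loads by hand and add some headroom. The sketch below assumes you have measured load times in seconds yourself; the helper name suggest_ajax_timeout, the one-second margin, and the rounding to whole seconds are illustrative choices, not Octoparse settings.

```python
import math
from typing import Sequence


def suggest_ajax_timeout(load_times_s: Sequence[float], margin_s: float = 1.0) -> int:
    """Suggest a timeout in whole seconds: slowest observed load plus a margin."""
    if not load_times_s:
        raise ValueError("need at least one measured load time")
    # Base the suggestion on the slowest load so that no observed page is cut off.
    return math.ceil(max(load_times_s) + margin_s)


# Three hand-timed loads of the target page, in seconds.
print(suggest_ajax_timeout([1.2, 2.7, 1.9]))
```

Choose the closest available AJAX timeout that is at least as long as the suggested value.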
