This tutorial applies to Octoparse version 8.4. If you are running an older version of Octoparse, we strongly recommend that you upgrade, because the new version is faster, easier to use, and more robust. Download and upgrade here if you haven't already done so!
After you set up a task and test-run it on your local device, you may sometimes find that the number of data rows in the output doesn't match the number of results on the target website.
If you have run into this problem, check the possible causes and solutions below to see whether any of them applies to your case.
1. Pagination Error
When scraping multiple pages, Octoparse may not always go to the next page correctly, because the auto-generated pagination XPath does not work well on every website.
Click on Pagination, and then click the step Click to Paginate. Repeat these actions several times to see whether Octoparse goes to the next page correctly every time.
If the pagination is okay, meaning Octoparse goes through the pages one by one in the correct order, you can skip this part and check the next possible cause. If you find that Octoparse skips pages, you will need to correct the XPath of Pagination.
Solution: Modify the XPath of Pagination to make sure it locates the next page button precisely.
Click on Pagination
Enter the new XPath and click Apply to save
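To illustrate why a pagination XPath can skip pages, here is a minimal Python sketch. It is not Octoparse code: the two pager snippets, the names PAGE_1, PAGE_2, POSITIONAL, BY_TEXT and located_text are all hypothetical, and Python's built-in ElementTree only supports a subset of XPath. It compares an XPath that locates the next page button by position with one that locates it by its text.

```python
import xml.etree.ElementTree as ET

# Hypothetical pager markup: page 2 gains a "Prev" link, which shifts positions.
PAGE_1 = """
<html><body>
  <div class="pager">
    <a href="?p=1">1</a>
    <a href="?p=2">2</a>
    <a href="?p=3">3</a>
    <a href="?p=2">Next</a>
  </div>
</body></html>
"""

PAGE_2 = """
<html><body>
  <div class="pager">
    <a href="?p=1">Prev</a>
    <a href="?p=1">1</a>
    <a href="?p=2">2</a>
    <a href="?p=3">3</a>
    <a href="?p=3">Next</a>
  </div>
</body></html>
"""

# Position-based XPath: "the fourth link in the pager".
POSITIONAL = ".//div[@class='pager']/a[4]"
# Text-based XPath: "the pager link whose text is Next".
BY_TEXT = ".//div[@class='pager']/a[.='Next']"


def located_text(page, xpath):
    """Return the text of the first element the XPath locates, or None."""
    root = ET.fromstring(page.strip())
    element = root.find(xpath)
    return None if element is None else element.text


for name, page in (("page 1", PAGE_1), ("page 2", PAGE_2)):
    print(name, "positional:", located_text(page, POSITIONAL),
          "| by text:", located_text(page, BY_TEXT))
```

On page 1 both expressions locate the Next link, but on page 2 the positional expression locates the link labelled 3 instead, so a crawler clicking it would jump pages. With a full XPath engine, a text-based expression is often written along the lines of //a[contains(text(),'Next')]; the exact expression depends on the target website.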
2. Page Load Error
When you test-run the task on your local device, keep an eye on the upper part of the progress window, which shows the browser navigating to the next page or opening a new page.
If you find that the browser moves on to another page before the current page has loaded completely, you can try the following solutions to give the page time to load:
Solution A: Set a longer wait time for some steps (e.g., "Extract Data")
Solution B: Increase the timeout for some steps (e.g., "Go to Web Page") or the AJAX timeout for "Click Item"
Solution C: Set up Page Scroll for some steps (e.g., "Go to Web Page" or "Click Item")
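The effect of a longer wait time or timeout can be sketched in Python as follows. This is only an illustration of the general idea, not how Octoparse implements its settings: wait_until, FakeClock and the 2-second load time are hypothetical, and a simulated clock is used so the example runs instantly.

```python
class FakeClock:
    """Simulated clock: sleeping just advances the current time."""

    def __init__(self):
        self.now = 0.0

    def time(self):
        return self.now

    def sleep(self, seconds):
        self.now += seconds


def wait_until(is_ready, timeout, clock, interval=0.5):
    """Poll is_ready() until it is true or `timeout` seconds have passed.

    Returns True if the page became ready in time, False otherwise.
    """
    deadline = clock.time() + timeout
    while True:
        if is_ready():
            return True
        if clock.time() >= deadline:
            return False
        clock.sleep(interval)


def page_loaded_after(load_seconds, clock):
    """Build a readiness check for a page that needs load_seconds to load."""
    return lambda: clock.time() >= load_seconds


# The simulated page needs 2 seconds to finish loading.
short_clock = FakeClock()
short_wait = wait_until(page_loaded_after(2.0, short_clock),
                        timeout=1.0, clock=short_clock)

long_clock = FakeClock()
long_wait = wait_until(page_loaded_after(2.0, long_clock),
                       timeout=5.0, clock=long_clock)

print("timeout 1s, page ready:", short_wait)
print("timeout 5s, page ready:", long_wait)
```

With the 1-second timeout the check gives up before the page is ready, which is the situation where data goes missing; with the 5-second timeout the same page finishes loading and can be extracted in full.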
3. Loop Mode Error
Normally, after checking the pagination, you should check the Loop Item, which loops through each item on the page. Pay particular attention to the Loop Mode, and check whether it is set to Fixed List.
A fixed list locates elements by their fixed positions. If the page structure changes even slightly, for example when some pages have more or fewer items or the items are in different locations, you may receive this kind of error message: "Cannot find any element matching this XPath expression"
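The difference between locating items by fixed positions and locating them with one general XPath can be sketched in Python. This is a simplified illustration rather than Octoparse's own logic: the two sample pages and the names FIXED_LIST, GENERAL_XPATH, extract_fixed and extract_general are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical result pages: the second one has fewer items than the first.
FULL_PAGE = "<html><body><ul><li>A</li><li>B</li><li>C</li><li>D</li></ul></body></html>"
SHORT_PAGE = "<html><body><ul><li>E</li><li>F</li></ul></body></html>"

# A fixed list: one position-based XPath per item, built from the first page.
FIXED_LIST = [".//ul/li[1]", ".//ul/li[2]", ".//ul/li[3]", ".//ul/li[4]"]
# A single general XPath that matches however many items the page has.
GENERAL_XPATH = ".//ul/li"


def extract_fixed(page):
    """Look up each fixed position; record an error where nothing matches."""
    root = ET.fromstring(page)
    rows = []
    for xpath in FIXED_LIST:
        element = root.find(xpath)
        if element is None:
            rows.append("Cannot find any element matching " + xpath)
        else:
            rows.append(element.text)
    return rows


def extract_general(page):
    """Collect every item the general XPath matches on this page."""
    root = ET.fromstring(page)
    return [element.text for element in root.findall(GENERAL_XPATH)]


print(extract_fixed(FULL_PAGE))
print(extract_fixed(SHORT_PAGE))
print(extract_general(SHORT_PAGE))
```

On the shorter page, the third and fourth fixed positions match nothing, while the general XPath simply returns the two items that exist.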