You are reading a tutorial for the latest version of Octoparse. If you are running an older version, we strongly recommend upgrading, because the latest version is faster, easier to use, and more robust. Download and upgrade here if you haven't already done so!
After you set up a task and test-run it on your local device, you may sometimes find that the number of data rows extracted doesn't match the number of results on the target website.
If you have run into this problem, check the possible causes and solutions below to see whether any of them apply to your case.
1. Pagination Error
When scraping multiple pages, Octoparse may not always go to the next page correctly, because the auto-generated pagination XPath does not work on every website.
To check, click on Pagination, and then click the step Click to Paginate. Repeat these two clicks several times to see whether the browser goes to the next page correctly every time.
If the pagination is okay, meaning Octoparse goes through the pages one by one in the correct order, you can skip this part and check the next possible cause. If you find that Octoparse skips pages, you will need to correct the XPath of Pagination.
Solution: Modify the XPath of Pagination to make sure it locates the next page button precisely.
Click on Pagination
Enter the new XPath and click Apply to save
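To illustrate why a precise XPath matters, here is a minimal Python sketch that compares a position-based XPath with an attribute-based one. The pagination markup, the class names, and the `find_href` helper are hypothetical examples, not taken from Octoparse or any real website. Python's built-in ElementTree supports only a subset of XPath, so the expressions are written relative to the pager element; in Octoparse you would normally write a full expression such as `//a[@class='next']`.

```python
import xml.etree.ElementTree as ET

# Hypothetical pagination markup: a middle page and the last page.
MIDDLE_PAGE = """<div class="pager">
  <a class="prev" href="?p=1">Prev</a>
  <a class="page" href="?p=2">2</a>
  <a class="next" href="?p=3">Next</a>
</div>"""

LAST_PAGE = """<div class="pager">
  <a class="prev" href="?p=2">Prev</a>
  <a class="page" href="?p=3">3</a>
</div>"""

# Position-based: "the last link in the pager", whatever that link is.
POSITIONAL = "./a[last()]"
# Attribute-based: only the link that is marked as the next-page button.
PRECISE = "./a[@class='next']"


def find_href(markup, xpath):
    """Return the href of the first element matching xpath, or None."""
    root = ET.fromstring(markup)
    node = root.find(xpath)
    return None if node is None else node.get("href")


for name, xpath in (("positional", POSITIONAL), ("precise", PRECISE)):
    print(name, find_href(MIDDLE_PAGE, xpath), find_href(LAST_PAGE, xpath))
```

On the middle page both expressions find the next-page link. On the last page the positional expression still matches a link (the current page number), so a pagination loop built on it could click the wrong element, while the precise expression matches nothing and the loop can end.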
2. Page Load Error
When you test-run the task on your local device, keep an eye on the upper part of the progress window, which shows the built-in browser as it goes to the next page or opens a new page.
If you notice that the browser skips to another page before the web page fully loads, or if the page doesn’t scroll at all, try the following solutions to ensure the page loads correctly:
Solution A: Set a longer wait time for some steps (e.g., "Extract Data")
Solution B: Increase the timeout for some steps (e.g., "Go to Web Page") or the AJAX timeout for "Click Item"
Solution C: Set up Page Scroll for steps (e.g., "Go to Web Page" or "Click Item")
Solution D: Set up Partial Scroll for steps
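The idea behind Solutions A and B can be sketched in a few lines of Python: a step that gives up before the page has finished loading sees incomplete content, while a longer timeout lets the load complete. This is only a conceptual illustration and does not show how Octoparse implements its wait or timeout settings; `wait_until` and `simulated_page` are hypothetical helpers, and the durations are arbitrary.

```python
import time


def wait_until(condition, timeout, poll_interval=0.01):
    """Poll condition() until it returns True or timeout seconds elapse."""
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval)


def simulated_page(load_seconds):
    """Return a callable that reports True once the pretend page has loaded."""
    ready_at = time.monotonic() + load_seconds
    return lambda: time.monotonic() >= ready_at


# A pretend page that needs 0.2 s to load, checked with a short and a long timeout.
print("short timeout loaded:", wait_until(simulated_page(0.2), timeout=0.05))
print("long timeout loaded:", wait_until(simulated_page(0.2), timeout=1.0))
```

With the short timeout the step moves on before the page is ready, which is the situation that leads to missing rows. With the longer timeout the same page loads in full.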
3. Loop Mode Error
After checking the pagination, you should normally check the Loop Item, which loops through each item on the page. Pay particular attention to the Loop Mode, and check whether it is set to Fixed List.
A fixed list locates elements by their fixed positions. If the page structure varies, for example, some pages have more or fewer items or the items sit in a different location, data can be missed, and you may receive this kind of error message: "Cannot find any element matching this XPath expression"
Solution: Switch "Fixed List" to "Variable List" first and then write a new XPath.
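The difference between the two loop modes can be sketched in Python under a simplifying assumption: a fixed list is modeled as one position-based XPath per item, and a variable list as a single XPath that matches every item. The markup, the class name, and the helper functions are hypothetical and are not Octoparse's own implementation.

```python
import xml.etree.ElementTree as ET

# Hypothetical result pages with different numbers of items.
PAGE_WITH_FOUR = (
    "<ul><li class='item'>A</li><li class='item'>B</li>"
    "<li class='item'>C</li><li class='item'>D</li></ul>"
)
PAGE_WITH_TWO = "<ul><li class='item'>E</li><li class='item'>F</li></ul>"

# Fixed list (assumed model): one XPath per position, built for a three-item page.
FIXED_LIST = ["./li[1]", "./li[2]", "./li[3]"]
# Variable list (assumed model): one XPath that matches however many items exist.
VARIABLE_LIST = "./li[@class='item']"


def extract_fixed(markup):
    """Look up each fixed position; None marks a position with no element."""
    root = ET.fromstring(markup)
    results = []
    for xpath in FIXED_LIST:
        node = root.find(xpath)
        results.append(None if node is None else node.text)
    return results


def extract_variable(markup):
    """Collect the text of every element matched by the single XPath."""
    root = ET.fromstring(markup)
    return [node.text for node in root.findall(VARIABLE_LIST)]


print(extract_fixed(PAGE_WITH_FOUR), extract_variable(PAGE_WITH_FOUR))
print(extract_fixed(PAGE_WITH_TWO), extract_variable(PAGE_WITH_TWO))
```

On the four-item page the fixed positions miss the fourth item, and on the two-item page the third position finds no element, which corresponds to the "Cannot find any element" situation. The single variable XPath returns every item on both pages.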