15.3. Fixing automatic crawling problems
If the plugin has not saved any posts for a long time although it should have, do the following:
Make sure scheduling is enabled under the Scheduling Tab of the General Settings Page. Otherwise, the plugin does not save posts automatically.
Make sure the site is configured for automatic crawling by following the Saving posts automatically guide.
Make sure there are no empty URL inputs in the Category URLs setting. If there are, delete the inputs that have no URL in them.
The plugin automatically creates posts in your site if there are post URLs waiting in the queue. You can see how many post URLs are in the queue by checking out Active Sites Section in Dashboard Page.
If there are no URLs in the queue, then either there are no new posts in the target category pages or the plugin cannot find post URLs in the target category pages. If there are URLs waiting in the queue but the plugin does not save them automatically while it should, please continue with Checking if CRON jobs work and fixing CRON problems instead of continuing with the items below.
You can find out whether the plugin can find post URLs in the target category pages by testing your category settings as explained in the Testing site settings guide. When testing, please select a category URL from the select input, which is explained in Tester Page. This ensures that you do not test a category URL that is not defined in the Category URLs setting, since the plugin only uses the category URLs defined there.
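To make the idea of "finding post URLs in a category page" concrete, the following is a minimal sketch written in Python with only the standard library. It is not the plugin's implementation. It approximates a simple selector such as a.post-link by collecting links that carry a given class name. The class name post-link, the sample HTML, and the example.com URLs are assumptions made for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class PostLinkParser(HTMLParser):
    """Collects href values of <a> tags that carry a given class name."""

    def __init__(self, link_class, base_url):
        super().__init__()
        self.link_class = link_class
        self.base_url = base_url
        self.urls = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        classes = (attr_map.get("class") or "").split()
        href = attr_map.get("href")
        if href and self.link_class in classes:
            # Resolve relative links against the category URL.
            self.urls.append(urljoin(self.base_url, href))


def find_post_urls(category_html, category_url, link_class):
    """Returns the post URLs found in the HTML of one category page."""
    parser = PostLinkParser(link_class, category_url)
    parser.feed(category_html)
    return parser.urls


# Hypothetical category page markup, used only for illustration.
sample_html = """
<div class="posts">
  <a class="post-link" href="/news/first-post">First</a>
  <a class="post-link" href="https://example.com/news/second-post">Second</a>
  <a class="nav" href="/about">About</a>
</div>
"""
post_urls = find_post_urls(
    sample_html, "https://example.com/category/news", "post-link"
)
print(post_urls)
```

An empty result from a sketch like this corresponds to the case where the test results show no post URLs: either the page is not the expected one or the selector does not match.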
If the test results do not show any post URLs, first make sure that the category URLs you tested are valid by opening them in your browser. If they show a "not found" page, you should fix the category URLs defined in the Category URLs setting. If you are sure that the category URLs are valid, then you should refine the CSS selectors defined in the Category Post URL Selectors setting.
If you are sure that the Category URLs and Category Post URL Selectors settings are correctly configured and the test results still do not show any post URLs, then the target site might have restricted access from your server.
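The checks above can also be approximated outside the plugin. The sketch below, assuming Python is available on the same server that runs your site, requests a URL and maps the HTTP status to the likely cause. The function names and the status groupings are illustrative assumptions, not part of the plugin, and some sites return other codes when they block a client.

```python
import urllib.error
import urllib.request


def diagnose_status(status):
    """Maps an HTTP status code (or None) to the likely cause."""
    if status is None:
        return "unreachable: the server could not connect to the target site"
    if 200 <= status < 300:
        return "ok: the URL is valid, so review the CSS selectors"
    if status == 404:
        return "not found: fix the URL in the Category URLs setting"
    if status in (401, 403, 429):
        return "blocked: the target site may be restricting access from this server"
    return f"unexpected status {status}: inspect the response manually"


def fetch_status(url, timeout=10):
    """Returns the HTTP status code for url, or None if no response arrived."""
    request = urllib.request.Request(url, method="GET")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # The server answered, but with an error status such as 404 or 403.
        return error.code
    except (urllib.error.URLError, TimeoutError, OSError):
        # DNS failure, refused connection, timeout, and similar problems.
        return None
```

For example, diagnose_status(fetch_status("https://example.com/category/news")) would describe the result for a hypothetical category URL. Running it on your own computer instead of the server only shows whether the URL is valid, not whether the server is blocked.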
Tip
You can also test if the target site has restricted your server’s access by opening Visual Inspector and trying to load a URL that exists in the target site. If the page cannot be loaded, then your server cannot reach the target site.
Tip
You can also test if the target site has restricted your server’s access by using the test button of the Category Post URL Selectors setting after you make sure a valid URL is entered in Test Category URL and the CSS selector you test is definitely valid. For example, you can temporarily enter "body" as the selector and "html" as the attribute in Category Post URL Selectors, since every HTML page must have a body element, and test that selector. If there is no result, then your server cannot reach the target site.
If the target site has restricted access from your server, then you can try to use a proxy (see: Using proxies) or change the HTTP User Agent as a workaround.
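To check whether a different user agent or a proxy would help before changing the plugin's settings, you could try the same request manually. The following is a rough sketch using Python's standard library. The user agent string, the proxy address 127.0.0.1:8080, and the example.com URL are placeholders chosen for illustration, and the function name is hypothetical.

```python
import urllib.request


def build_opener_and_request(url, user_agent, proxy_url=None):
    """Prepares a request with a custom User-Agent and an optional proxy."""
    handlers = []
    if proxy_url:
        # Route both HTTP and HTTPS traffic through the given proxy.
        handlers.append(
            urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
        )
    opener = urllib.request.build_opener(*handlers)
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    return opener, request


# Placeholder values; replace them with a real URL, user agent, and proxy.
opener, request = build_opener_and_request(
    "https://example.com/category/news",
    user_agent="Mozilla/5.0 (X11; Linux x86_64) ExampleBrowser/1.0",
    proxy_url="http://127.0.0.1:8080",
)
# Calling opener.open(request, timeout=10) would perform the actual fetch.
```

If the page loads with the changed user agent or through the proxy but not otherwise, the same workaround is likely to help in the plugin's settings as well.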
If there are post URLs in the test results, make sure the URLs actually exist in the target site by clicking a few of them. If you see "not found" pages, then you should make sure that the plugin finds valid post URLs by refining your site settings.
By now, you have made sure that your category settings are properly configured such that category URLs are correctly defined and the plugin can find valid post URLs.
The other possible reason is that the target category pages do not contain new post URLs. Check whether all of the posts in the target category pages are already saved in your site. If they are, then nothing is wrong and everything works as it should. You do not have to proceed with the next items.
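If the category pages list many posts, comparing them by hand can be tedious. As a small sketch, assuming you have collected the post URLs from the category page and the source URLs of the posts already saved in your site into two lists, a comparison like the one below shows which posts are still missing. The function name and the sample URLs are hypothetical, and ignoring trailing slashes is a simplifying assumption.

```python
def find_unsaved_urls(category_post_urls, saved_post_urls):
    """Returns the category post URLs that are not among the saved ones."""
    # Ignore trailing slashes so that /post and /post/ count as the same URL.
    saved = {url.rstrip("/") for url in saved_post_urls}
    return [url for url in category_post_urls if url.rstrip("/") not in saved]


# Hypothetical data for illustration.
found_in_category = [
    "https://example.com/news/first-post",
    "https://example.com/news/second-post/",
    "https://example.com/news/third-post",
]
already_saved = [
    "https://example.com/news/first-post/",
    "https://example.com/news/second-post",
]
unsaved = find_unsaved_urls(found_in_category, already_saved)
print(unsaved)
```

An empty list means every post in the category page is already saved, which is the case where nothing is wrong.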
If you reached this item, then it means that there are new post URLs in the target category pages and the plugin cannot automatically save them. In this case, the problem is most likely related to CRON jobs. Please continue with Checking if CRON jobs work and fixing CRON problems.