5. Site Listing Page

WP Content Crawler defines a custom post type named wcc_sites in your WordPress installation. Posts of this type are called Sites. A Site is a WordPress post that stores the settings used to crawl a web site you define. For each web site you want to crawl, you therefore create a separate site. This page lists those sites.

This page shows the sites you have created in a table. You can create a new site by clicking the Add New button, which is available next to the page’s title and in the sidebar of your admin panel under the Content Crawler section.

The table’s columns are as follows:

Title
The name of the site, as you defined it in the site settings.
Author
The user who created the site.
Active for scheduling
When you check this checkbox, the site is activated for scheduling. Scheduling is an umbrella term that covers the URL collection and post crawling CRON events. Checking or unchecking the checkbox is enough to save its status.
Active for recrawling
When you check this checkbox, the site is activated for recrawling. In other words, the posts saved using this site’s settings will be recrawled, i.e. updated, using the recrawling settings. These settings are defined in the general settings or, if the use custom general settings checkbox is checked, in the site settings. Checking or unchecking the checkbox is enough to save its status.
Active for deleting
When you check this checkbox, the posts saved using this site’s settings will be deleted according to the post deleting settings, which are available in the general settings of the plugin.
Counts

This column shows how many posts have been saved, deleted, and updated, as well as how many are in the queue, waiting to be saved. The queue is the list of post URLs collected from the target web site using the settings under the category tab. The plugin first collects post URLs from the target site and stores them in the database; these stored URLs form the queue. It then saves the posts at these URLs using scheduled events triggered by WP-Cron.
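
The two phases described above can be pictured with a small Python model. This is only an illustrative sketch: the plugin itself keeps the queue in the WordPress database, and the class name, method names, and example URLs below are hypothetical.

```python
from collections import deque
from typing import Deque, List, Optional


class SiteQueue:
    """Toy model of one site's URL queue (hypothetical, not the plugin's code)."""

    def __init__(self) -> None:
        self.queue: Deque[str] = deque()  # URLs waiting to be saved
        self.saved: List[str] = []        # URLs already saved as posts

    def collect_urls(self, urls: List[str]) -> None:
        """Phase 1: a URL collection event adds the post URLs it found."""
        self.queue.extend(urls)

    def crawl_next(self) -> Optional[str]:
        """Phase 2: a post crawling event saves the oldest queued URL."""
        if not self.queue:
            return None  # nothing is waiting to be saved
        url = self.queue.popleft()
        self.saved.append(url)
        return url


site = SiteQueue()
site.collect_urls(["https://example.com/a", "https://example.com/b"])
site.crawl_next()
print(len(site.queue), len(site.saved))  # prints: 1 1
```

In this model, each call to crawl_next stands for one scheduled post crawling event, so the in queue count shrinks by one and the saved count grows by one every time it runs.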

The URL count shown as other is the number of URLs that are not categorized as saved, deleted, updated, or in queue. These are probably problematic URLs that cannot be saved because, for example, they point to a duplicate post. Another possibility is that their status could not be updated due to a server error. This number is found using the following formula:

\[\text{Total} - (\text{Queue} + \text{Saved} + \text{Deleted}) = \text{Other}\]

Note that updated posts are also saved posts, so they are already included in Saved and do not appear as a separate term in the formula.
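
As a worked example of the formula, the following minimal Python sketch computes the other count. The function name and the sample numbers are hypothetical and are not part of the plugin; only the arithmetic comes from the formula.

```python
def other_count(total: int, queue: int, saved: int, deleted: int) -> int:
    """Return the number of URLs not counted as queued, saved, or deleted.

    Updated posts are not subtracted separately because they are already
    included in the saved count.
    """
    return total - (queue + saved + deleted)


# Hypothetical numbers: 100 URLs collected in total for a site.
print(other_count(total=100, queue=20, saved=70, deleted=5))  # prints 5
```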

Last X
The columns named last URL collection, last post crawl, last post recrawl, and last post delete show the dates on which the URL collection, post crawling, post update, and post deleting events last ran for the site, respectively.
Date
This column shows when the site settings were published.