5. Site Listing Page
WP Content Crawler defines a custom post type named wcc_sites in your WordPress installation. Posts created in this post type are called Sites. A Site is a WordPress post that stores the settings used to crawl a web site that you define. Therefore, for each web site you want to crawl, you create a different site. These sites are listed on this page.
This page shows the sites you have created as a table. You can create a new site by clicking the Add New button, which is available next to the page’s title and in the sidebar of your admin panel, under the Content Crawler section.
The table’s columns are as follows:
- Title
- Name of the site defined by you in the site settings
- Author
- The user who created the site
- Active for scheduling
- When you check this checkbox, the site is activated for scheduling. Scheduling is an umbrella term that covers the URL collection and post crawling CRON events. Simply checking or unchecking the checkbox also saves its status.
- Active for recrawling
- When you check this checkbox, the site is activated for recrawling. In other words, the posts saved by using this site’s settings will be recrawled, i.e. updated, according to the recrawling settings defined either in the general settings or in the site settings (if the "use custom general settings" checkbox is checked there). Simply checking or unchecking the checkbox also saves its status.
- Active for deleting
- When you check this checkbox, the posts saved by using this site’s settings will be deleted according to the post deleting settings, which are available in the general settings of the plugin.
- Counts
- This column shows how many posts have been saved, deleted, and updated, as well as how many are in the queue, waiting to be saved. The queue is a list of post URLs collected from the target web site by using the settings under the category tab. The plugin first collects post URLs from the target site and stores them in the database, which is the queue. After that, it saves the posts at these URLs by using scheduled events that are triggered by WP-Cron.
  The URL count shown as "other" indicates the URLs that are not categorized as saved, deleted, updated, or in queue. These are probably problematic URLs that cannot be saved because, for example, they are duplicate posts. Another possibility is that their status could not be updated due to a server error. This number is found by using the following formula:
  \[ Total - (Queue + Saved + Deleted) = Other \]
  Note that "updated" posts are also "saved" posts.
- Last X
- The columns named last URL collection, last post crawl, last post recrawl, and last post delete show the dates on which the last URL collection, the last post crawling, the last post update, and the last post deleting events ran for the site, respectively.
- Date
- This column shows when the site settings were published.
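
To make the "other" count in the Counts column concrete, here is a minimal Python sketch of the formula given above. It is only an illustration, not the plugin's own code: the function name other_count and the sample numbers are hypothetical, and it assumes that the saved count already includes updated posts, since updated posts are also saved posts.

```python
def other_count(total: int, queue: int, saved: int, deleted: int) -> int:
    """Return the number of URLs that are not in queue, saved, or deleted.

    Assumption: `saved` already includes updated posts, so the updated
    count is not subtracted separately.
    """
    other = total - (queue + saved + deleted)
    if other < 0:
        # The categories should never add up to more than the total.
        raise ValueError("category counts exceed the total URL count")
    return other


# Hypothetical numbers for one site: 120 URLs collected in total.
print(other_count(total=120, queue=30, saved=80, deleted=6))  # prints 4
```

With these sample numbers, 4 of the 120 collected URLs would appear as "other", for instance because they are duplicates or because their status could not be updated.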