5. Site Listing Page
WP Content Crawler defines a custom post type named
wcc_sites in your WordPress installation.
Posts created in this post type are called sites. A site is a WordPress
post that stores the settings used to crawl a web site defined by you. Therefore, for
each web site you want to crawl, you create a separate
site. These sites are listed
on this page.
This page shows the sites you have created in a table. You can create a new site by clicking
the Add New button, available next to the page’s title or in the Content Crawler section of
your admin sidebar.
The table’s columns are as follows:
- Name of the site defined by you in the site settings
- The user who created the site
- Active for scheduling
- When you check this checkbox, the site is activated for scheduling. Scheduling is an umbrella
term that covers
post crawling CRON events. Simply checking or unchecking the checkbox also saves its status.
- Active for recrawling
- When you check this checkbox, the site is activated for recrawling. In other words, the posts
saved using this site’s settings will be recrawled, i.e. updated, using the recrawling
settings defined either in the
general settings or in the site settings (when the
use custom general settings checkbox is checked). Simply checking or unchecking the checkbox also saves its status.
- Active for deleting
- When you check this checkbox, the posts saved using this site’s settings will be deleted according to the post deleting settings, which are available in the general settings of the plugin.
This column shows how many posts have been saved, deleted, and updated, as well as how many are in the queue, waiting to be saved. The queue is a list of post URLs collected from the target web site using the settings under the
category tab. The plugin first collects post URLs from the target site and stores them in the database; this stored list is the queue. After that, it saves the posts at these URLs using scheduled events triggered by WP-Cron.
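The two phases described above (collect URLs into a queue, then save them in scheduled runs) can be sketched as follows. This is only an in-memory illustration in Python, not the plugin's code: the plugin keeps its queue in the WordPress database, and the function names, the duplicate check, and the one-URL-per-run behavior are all assumptions made for the example.

```python
from collections import deque


def collect_urls(found_urls, queue, seen):
    """Phase 1 (hypothetical): add newly found post URLs to the queue."""
    for url in found_urls:
        if url not in seen:  # assumption: already-known URLs are skipped
            seen.add(url)
            queue.append(url)


def crawl_next(queue, saved):
    """Phase 2 (hypothetical): one scheduled run saves one queued URL."""
    if not queue:
        return None  # nothing is waiting to be saved
    url = queue.popleft()
    saved.append(url)  # stand-in for fetching the page and creating a post
    return url


queue, seen, saved = deque(), set(), []

# A URL collection event finds three links, one of them repeated.
collect_urls(
    ["https://example.com/a", "https://example.com/b", "https://example.com/a"],
    queue,
    seen,
)

# A post crawl event then takes the oldest URL out of the queue.
crawled = crawl_next(queue, saved)
```

After these two events, one URL counts as saved and one is still in the queue, which is the kind of split this column reports.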
The URL count shown as
other indicates the URLs that are not categorized as saved, deleted, updated, or in queue. These are probably problematic URLs that could not be saved because, for example, they point to a duplicate post. Another possibility is that their status could not be updated due to a server error. This number is found using the following formula:
\[\text{Total} - (\text{Queue} + \text{Saved} + \text{Deleted}) = \text{Other}\]
Updated does not appear in the formula above because updated posts are also saved posts.
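As a worked example of the formula above, the short Python sketch below computes the other count from the remaining counts. The function name and the numbers are made up for illustration and do not come from the plugin. The updated count is left out because it is not part of the formula as written.

```python
def other_count(total, queue, saved, deleted):
    """Apply Total - (Queue + Saved + Deleted) = Other."""
    return total - (queue + saved + deleted)


# Hypothetical counts for one site; not real plugin output.
counts = {"total": 120, "queue": 30, "saved": 80, "deleted": 7}

other = other_count(**counts)
print(other)  # 120 - (30 + 80 + 7) = 3
```

A non-zero result means some collected URLs are in none of the three states, for the reasons listed above.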
- Last X
- The columns named last URL collection, last post crawl, last post recrawl, and last post delete show the dates on which the last URL collection, post crawling, post update, and post deleting events ran for the site, respectively.
- This column shows when the site settings were published.