5.3.2.1. Category

5.3.2.1.1. Category URLs

Define URLs of the categories from which you want to save posts. Each item of this setting has two fields:

Post Category
This is a select element that lists the categories available in your site. Select the category into which the plugin should save the posts.

Category URL
Enter the URL of the category page on the target web site. The plugin will collect post URLs from this page. A category URL can be entered as either an absolute URL or a relative URL.

Absolute URL
A URL that starts with http. For example, http://site.com/category/art.

Relative URL
A URL that is defined relative to the base URL of the site, which is defined in the Site URL setting. For example, if the base URL of the site is http://site.com/ and you want to crawl http://site.com/category/art, then you can enter category/art, /category/art, /category/art/, or category/art/ as a relative URL. The plugin will turn this relative URL into an absolute URL using the value defined in the Site URL setting.

Note

If you are not comfortable with using relative URLs, you can directly enter an absolute URL. If a relative URL is not correctly entered, then the plugin will not be able to collect URLs from the category URL.

Fig. 5.9 An example configuration of the Category URLs setting.

An example configuration of the setting can be seen in Fig. 5.9. Here, two category URLs are defined. The first one tells the plugin to go to https://techcrunch.com/, collect post URLs from this page, and save those posts into the Technology News category of WordPress. This URL is an absolute URL, so the plugin will not change anything in it. The second one, /apps/, is a relative URL. The plugin will use the Site URL setting to turn it into an absolute URL. Here, the value of the Site URL setting is https://techcrunch.com/, so the URL becomes https://techcrunch.com/apps/. For the second entry, then, the plugin will collect post URLs from the https://techcrunch.com/apps/ category and save those posts into the Apps category in WordPress.
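The way a category URL entry becomes an absolute URL can be sketched in a few lines of Python. This is only an illustration of the behavior described above, not the plugin's actual code. The function name resolve_category_url is hypothetical, and the joining rule (exactly one slash between the Site URL and the relative part) is an assumption.

```python
def resolve_category_url(site_url: str, category_url: str) -> str:
    """Return an absolute URL for one Category URLs entry (illustrative)."""
    # Absolute URLs (those starting with "http") are used unchanged.
    if category_url.startswith("http"):
        return category_url
    # Assumption: a relative entry is appended to the Site URL with
    # exactly one slash between the two parts.
    return site_url.rstrip("/") + "/" + category_url.lstrip("/")


site_url = "https://techcrunch.com/"
for entry in ["https://techcrunch.com/", "/apps/", "apps/", "/"]:
    print(entry, "->", resolve_category_url(site_url, entry))
```

With these assumptions, /apps/ and apps/ both resolve to https://techcrunch.com/apps/, and / resolves to the Site URL itself.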

Tip

You can enter / as a category URL to tell the plugin that it should collect post URLs from Site URL setting’s value. In other words, / is a relative URL that will eventually become the absolute URL defined in Site URL setting.

Important

  • Changing this setting’s value will clear all of the URLs waiting in the queue.
  • Duplicate category URLs are not allowed. A category URL must not be added more than once.

5.3.2.1.2. Add category URLs automatically?

The category URLs you want to enter into the Category URLs setting can be retrieved automatically from the target site using CSS selectors. Check this option if you want to fill in the category URLs this way. When it is checked, two more settings become visible: URL of a page containing category URLs and Category URL Selectors. Fill in both of them to have the category URLs added automatically.

5.3.2.1.3. URL of a page containing category URLs

Enter the absolute URL of a page that contains the category URLs to be added automatically to the Category URLs setting. The plugin will find the URLs in this page's source code.

Important

This setting is visible only if Add category URLs automatically? is checked.

5.3.2.1.4. Category URL Selectors

Define a CSS selector that finds the elements that store category URLs in the source code of the page whose URL is set in URL of a page containing category URLs. This setting also has an attribute input whose default value is href. In other words, if no attribute is specified, the plugin will try to retrieve the value of the href attribute of the found elements. This is a type of Selector and Attribute Setting.

After you define the values, click the setting's button. The plugin will find the URLs and add them to the Category URLs setting.
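How a selector and an attribute work together can be sketched with Python's standard html.parser module. This is an illustration only, not the plugin's implementation. The AttributeCollector class and find_category_urls function are hypothetical, the sample page is made up, and only selectors of the simple form tag.class are handled here, whereas the plugin accepts real CSS selectors.

```python
from html.parser import HTMLParser


class AttributeCollector(HTMLParser):
    """Collect one attribute from every element matching "tag.class".

    Simplification: only "tag" or "tag.class" selectors are understood;
    this is not a full CSS selector engine.
    """

    def __init__(self, selector: str, attribute: str = "href"):
        super().__init__()
        self.tag, _, self.css_class = selector.partition(".")
        self.attribute = attribute  # defaults to href, as in the setting
        self.values = []

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        classes = (attr_map.get("class") or "").split()
        class_ok = not self.css_class or self.css_class in classes
        if tag == self.tag and class_ok:
            value = attr_map.get(self.attribute)
            if value is not None:
                self.values.append(value)


def find_category_urls(html: str, selector: str, attribute: str = "href"):
    collector = AttributeCollector(selector, attribute)
    collector.feed(html)
    return collector.values


# Made-up page source for illustration.
page_source = """
<ul>
  <li><a class="cat" href="/category/art" data-url="/c/1">Art</a></li>
  <li><a class="cat" href="/category/music" data-url="/c/2">Music</a></li>
  <li><a class="other" href="/about">About</a></li>
</ul>
"""

print(find_category_urls(page_source, "a.cat"))              # href by default
print(find_category_urls(page_source, "a.cat", "data-url"))  # explicit attribute
```

The first call reads href because no attribute is given. The second call shows that a different attribute can be named when the URL is stored elsewhere on the element.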

Note

Do not forget to set post categories after the category URLs are inserted automatically, since the plugin will not set them automatically.

Important

This setting is visible only if Add category URLs automatically? is checked.

An example of inserting the category URLs automatically can be seen in the following video.

5.3.2.1.5. Test Category URL

An absolute category URL used to run the tests for the category page CSS selectors. In other words, when you click a test button that requires a URL under this tab, the URL you enter here will be used. If this is empty, the tests that require a URL cannot be performed.

You can simply enter a category URL here.

5.3.2.1.6. Category Post URL Selectors

CSS selectors for the post URLs in category pages. This is a type of Selector and Attribute Setting. The plugin will use the CSS selectors defined in this setting to find the URLs of the posts existing in the categories whose URLs are defined in Category URLs. If you do not enter a CSS selector that finds post URLs, then the plugin will not be able to find post URLs and, hence, not be able to automatically crawl posts.

Note

If you give more than one selector, each selector will be used to get URLs and the results will be combined.
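The combining behavior can be pictured with a short Python sketch. It is an assumption-laden illustration: collect_post_urls and the fake_matches table are hypothetical, and the find callable stands in for running one selector against a category page. Whether the plugin removes duplicate URLs when combining is not stated here, so this sketch simply concatenates the results.

```python
from typing import Callable, List


def collect_post_urls(
    selectors: List[str], find: Callable[[str], List[str]]
) -> List[str]:
    """Apply every selector and combine the results in selector order."""
    urls: List[str] = []
    for selector in selectors:
        urls.extend(find(selector))
    return urls


# Hypothetical lookup standing in for running a selector on a category page.
fake_matches = {
    "h2.post-title > a": ["/post/1", "/post/2"],
    "div.featured a": ["/post/9"],
}

found = collect_post_urls(list(fake_matches), lambda s: fake_matches.get(s, []))
print(found)
```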

Important

If you want the plugin to automatically crawl posts, you must enter at least one CSS selector that finds post URLs in a category page whose URL is defined in Category URLs.

For an example about how to use this setting, you can watch the following video.

5.3.2.1.7. Collect URLs in reverse order?

When you check this, the URLs found by URL selectors will be ordered in reverse before they are saved into the database. Therefore, the posts will be saved in reverse order for each category page.

Note

This does not make the plugin crawl the category pages starting from the last category page. This only reverses the order of the post URLs found in a category page. The plugin will still crawl the category pages starting from their first page.
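The difference between reversing within a page and reversing the page order can be shown with a small Python sketch. The function order_post_urls and the sample URLs are hypothetical and only model the ordering described above.

```python
from typing import List


def order_post_urls(pages: List[List[str]], reverse: bool) -> List[str]:
    """Flatten per-page URL lists, optionally reversing within each page."""
    ordered: List[str] = []
    for page_urls in pages:  # category pages are still visited first to last
        ordered.extend(reversed(page_urls) if reverse else page_urls)
    return ordered


# Two category pages, each with the post URLs found on it, in page order.
pages = [["a1", "a2", "a3"], ["b1", "b2"]]
print(order_post_urls(pages, reverse=False))
print(order_post_urls(pages, reverse=True))
```

With the option on, the URLs of the first page still come before those of the second page. Only the order inside each page changes.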