13.2. Can I get content from X site?

  • Can I scrape X site?
  • Is the plugin able to grab content from X site?

The plugin can retrieve only the content that is not manipulated by JavaScript, that is, the content already present in the HTML that the site sends. To test this, temporarily disable JavaScript in your browser for the site you want to get content from, and check whether the content shown in your browser is what you need.
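The same check can also be done without a browser. The following is a minimal sketch, not part of the plugin: it downloads the raw HTML of a page, which is never processed by JavaScript, and reports which text snippets are missing from it. The function names `fetch_static_html` and `missing_snippets` are hypothetical helpers made up for this illustration.

```python
import urllib.request
from typing import List


def fetch_static_html(url: str, timeout: float = 15.0) -> str:
    """Download the page source as the server sends it; no JavaScript is run."""
    # Some sites reject requests without a browser-like User-Agent header.
    request = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def missing_snippets(html: str, snippets: List[str]) -> List[str]:
    """Return the snippets that do not occur in the HTML (case-insensitive)."""
    lowered = html.lower()
    return [snippet for snippet in snippets if snippet.lower() not in lowered]


# Offline demonstration on an inline HTML string (an assumed example page):
# the title is in the static HTML, the review text is not.
sample_html = "<html><body><h1>Product Title</h1><div id='reviews'></div></body></html>"
print(missing_snippets(sample_html, ["Product Title", "Great item!"]))
```

To check a real page, pass the result of `fetch_static_html("<page URL>")` as the first argument. A snippet reported as missing is probably added by JavaScript, although it may also be written differently in the source, for example with HTML entities, so a missing snippet is a hint rather than proof.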

For example, let’s say you want to get content from CodeCanyon. First, open CodeCanyon in your browser. Next, temporarily disable JavaScript for this site so that you can inspect its static content, that is, the content not manipulated by JavaScript.

To disable JavaScript in Chrome, first click the icon that comes before the page URL in the address bar of the browser, as shown in Fig. 13.1, and click Site settings. Next, in the site settings page, find JavaScript and select Block, as shown in Fig. 13.2. After that, go back to the site and refresh the page. The content shown now is the content you can get from that site using WP Content Crawler. You can browse any page of the site to see whether all the content you need is there. If some images are not shown, the plugin might still be able to retrieve them with a small replacement made via the Manipulate HTML > Exchange element attributes option (See: Exchange element attributes), available in the site settings of the plugin. When you are done inspecting the pages, go to the site settings in your browser again and re-enable JavaScript by selecting Allow instead of Block.

../_images/chrome-site-settings.png

Fig. 13.1 Opening site settings in Chrome

../_images/chrome-block-javascript.png

Fig. 13.2 Blocking JavaScript in Chrome
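To illustrate why exchanging attributes can bring missing images back, here is a minimal sketch of the idea. It is not the plugin's implementation. It assumes that a site lazy-loads its images by putting a placeholder in `src` and the real URL in `data-src`; the attribute name `data-src`, the function `exchange_attributes`, and the temporary name `x-swap-tmp` are assumptions made for this example.

```python
import re

IMG_TAG = re.compile(r"<img\b[^>]*>", re.IGNORECASE)


def _rename_attribute(tag: str, old: str, new: str) -> str:
    """Rename attribute `old` to `new` inside one tag, keeping its value."""
    pattern = r"(\s)" + re.escape(old) + r"(\s*=)"
    return re.sub(
        pattern,
        lambda m: m.group(1) + new + m.group(2),
        tag,
        flags=re.IGNORECASE,
    )


def exchange_attributes(html: str, first: str = "src", second: str = "data-src") -> str:
    """Swap the values of two attributes on every <img> tag that has `second`."""
    temporary = "x-swap-tmp"  # assumed not to be used as a real attribute name

    def swap(match: "re.Match[str]") -> str:
        tag = match.group(0)
        has_second = re.search(r"\s" + re.escape(second) + r"\s*=", tag, re.IGNORECASE)
        if not has_second:
            return tag  # nothing to exchange, leave the tag untouched
        # Renaming the attributes in three steps swaps their values.
        tag = _rename_attribute(tag, first, temporary)
        tag = _rename_attribute(tag, second, first)
        return _rename_attribute(tag, temporary, second)

    return IMG_TAG.sub(swap, html)


sample = '<p><img src="placeholder.gif" data-src="/img/real.jpg" alt="x"></p>'
print(exchange_attributes(sample))
```

After the exchange, `src` holds the real image URL, so a tool that reads only `src` can find the image. Which attribute actually holds the real URL differs from site to site; inspect the page source with JavaScript disabled to find out.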

If you use a browser other than Chrome, you can search the web for “how to disable javascript in X”, where X is the name of your browser, such as “how to disable javascript in Firefox”.

After inspecting the pages and making sure the content you need is there, you can go to Demo and create site settings for testing purposes to see whether the plugin can successfully get the content, since something else might still prevent the plugin from getting it. Other HTML-related problems can be fixed using the Find and replace in raw HTML setting.