How to Use Crawl Cleanup to Increase Search Engine Crawl Quota

When Google (and other search engines) begin to index your site, it’s common for them to pick up many unwanted URLs that come from your RSS feeds or from users who add query args in the address bar of their browser.

Google sees these URLs as unique and tries to index them separately. While this doesn’t harm your rankings (Google is smart enough to figure these things out), it does use up the crawl quota allotted to your site, which in turn could delay indexing.

To combat this, we’ve introduced a new Crawl Cleanup feature in AIOSEO version 4.2.1 inside our Advanced Search Appearance settings to allow our users to fine-tune what Google can pick up.

To enable the Crawl Cleanup settings, click on Search Appearance in the All in One SEO menu and then click on the Advanced tab.

Scroll down to the bottom of the page and enable the toggle for Crawl Cleanup.

Crawl Cleanup section on the Search Appearance Advanced tab

Once enabled, you’ll be presented with quite a few new options for managing your RSS feeds as well as query args added to your site’s URLs.

Remove Query Args

This option removes all unrecognized query arguments from the URLs your visitors request and redirects each visitor to the same page without those arguments.

Remove Query Args setting under Crawl Cleanup

However, there are a few exceptions to this. By default, we keep the main WordPress query args to prevent your site from breaking. For example, when a user performs a search on your blog, WordPress uses the s query arg: https://yoursite.com/?s=My+search+query.

We’ve also built integrations with many popular plugins that use query args. When we detect that one of these plugins is enabled, its query args are not removed.
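The behavior described above can be sketched in a few lines of Python. This is only an illustration of the idea, not AIOSEO's actual implementation: the function name clean_url and the KEPT_ARGS set are assumptions made for this example, and the set is a small illustrative subset rather than the real list of protected query args.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Illustrative subset of query args to keep. "s" is the WordPress search arg
# mentioned above; this is NOT the plugin's real list.
KEPT_ARGS = {"s", "p", "page_id"}


def clean_url(url, kept_args=KEPT_ARGS):
    """Return (cleaned_url, needs_redirect) for a requested URL."""
    parts = urlsplit(url)
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    # Keep only the recognized query args, in their original order.
    kept = [(key, value) for key, value in pairs if key in kept_args]
    cleaned = urlunsplit(parts._replace(query=urlencode(kept)))
    # A redirect is only needed when at least one arg was dropped.
    return cleaned, len(kept) != len(pairs)


if __name__ == "__main__":
    for requested in (
        "https://yoursite.com/blog/?foo=1&bar=2",
        "https://yoursite.com/?s=My+search+query",
    ):
        print(requested, "->", clean_url(requested))
```

In this sketch, a request with only unrecognized args is redirected to the bare page URL, while a search URL that uses the s query arg is left untouched.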

Allowed Query Args

You can also manually allow any query arg in our Allowed Query Args list. Just add one per line. You can also use regular expressions here.

By default, we include support for UTM parameters for Google Analytics.

Allowed Query Args field under Crawl Cleanup

Only the key is required here. For example, if you want to allow a query arg that looks like this: https://yoursite.com?key=value, then you only need to add the key part to the allowed list, i.e. key.
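To make the "one per line, key only, regular expressions allowed" rules concrete, here is a minimal Python sketch of how such a list could be evaluated. It rests on assumptions: the entries in ALLOWED are hypothetical (the utm_.* pattern is only an example of a regular expression, not necessarily the default entry), and matching each entry against the whole key is a choice made for this example; the plugin's exact matching rules are not described here.

```python
import re

# Hypothetical contents of the Allowed Query Args field, one entry per line.
ALLOWED = """\
key
utm_.*
"""


def compile_allow_list(text):
    """Turn the multi-line field into compiled patterns, skipping blank lines."""
    return [re.compile(line.strip()) for line in text.splitlines() if line.strip()]


def is_allowed(key, patterns):
    """Assumption: an entry must match the entire query arg key."""
    return any(pattern.fullmatch(key) for pattern in patterns)


if __name__ == "__main__":
    patterns = compile_allow_list(ALLOWED)
    for key in ("key", "utm_source", "keyword", "ref"):
        print(key, is_allowed(key, patterns))
```

Note that only the key is compared. The value after the equals sign plays no part in the decision.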

RSS Feeds

WordPress includes many RSS feeds on your site, including feeds that are not necessary at all. If you enable Crawl Cleanup, we automatically disable most feeds, keeping your main site feed and some additional feeds that are important.

Global RSS Feed

The global RSS feed is how users subscribe to any new content that has been created on your site. This is enabled by default with Crawl Cleanup and we do NOT recommend disabling it.

Global RSS Feed setting under Crawl Cleanup

Global Comments RSS Feed

The global comments feed allows users to subscribe to any new comments added to your site. This is disabled by default with Crawl Cleanup.

Global Comments RSS Feed setting under Crawl Cleanup

Static Posts Page Feed

If you are using a static page for your posts (e.g. https://yoursite.com/blog/), then this option will appear. This is enabled by default with Crawl Cleanup and we do NOT recommend disabling it.

Static Posts Page Feed setting under Crawl Cleanup

Author Feeds

The author feed allows your users to subscribe to any new content written by a specific author. This is enabled by default with Crawl Cleanup.

Author Feeds setting under Crawl Cleanup

Post Comment Feeds

The post comments feed allows your users to subscribe to any new comments on a specific page or post. This is disabled by default with Crawl Cleanup.

Post Comment Feeds setting under Crawl Cleanup

Search Feed

The search feed allows visitors to subscribe to your content based on a specific search term. This is disabled by default with Crawl Cleanup.

Search Feed setting under Crawl Cleanup

Attachments Feed

The attachments feed allows users to subscribe to any changes made to media file categories on your site. This is disabled by default with Crawl Cleanup.

Attachments Feed setting under Crawl Cleanup

Paginated RSS Feeds

The paginated RSS feeds are for any posts or pages that are paginated. This is disabled by default with Crawl Cleanup.

Paginated RSS Feeds setting under Crawl Cleanup

Post Type Archive Feeds

This controls which post type archive feeds are enabled. No post type archive feeds are enabled by default with Crawl Cleanup.

Post Type Archive Feeds setting under Crawl Cleanup

Taxonomy Feeds

This controls which taxonomy feeds are enabled. Only the Categories feed is enabled by default with Crawl Cleanup.

Taxonomy Feeds setting under Crawl Cleanup

Atom Feed

This is a global feed of your site which is output in the Atom format. This is disabled by default with Crawl Cleanup.

Atom Feed setting under Crawl Cleanup

RDF/RSS 1.0 Feed

This is a global feed of your site which is output in the RDF/RSS 1.0 format. This is disabled by default with Crawl Cleanup.

RDF/RSS 1.0 Feed setting under Crawl Cleanup
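As a quick reference, the RSS feed defaults described in the sections above can be collected in one place. The Python sketch below only restates those defaults. The dictionary keys and variable names are illustrative labels chosen for this example and are not AIOSEO option names.

```python
# Defaults once Crawl Cleanup is enabled, as described in this article.
# Keys are illustrative labels, not real AIOSEO setting names.
FEED_DEFAULTS = {
    "global": True,              # main site feed (do NOT disable)
    "global_comments": False,
    "static_posts_page": True,   # only shown when a static posts page is used
    "authors": True,
    "post_comments": False,
    "search": False,
    "attachments": False,
    "paginated": False,
    "atom": False,
    "rdf": False,
}

# Only the Categories taxonomy feed is enabled by default.
ENABLED_TAXONOMY_FEEDS = {"category"}

# No post type archive feeds are enabled by default.
ENABLED_POST_TYPE_ARCHIVE_FEEDS = set()


def enabled_feeds(settings=FEED_DEFAULTS):
    """Return the labels of all feeds that are switched on, sorted by name."""
    return sorted(name for name, is_on in settings.items() if is_on)


if __name__ == "__main__":
    print(enabled_feeds())
```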

This feature is new in All in One SEO version 4.2.1. No legacy documentation is available.