Today we’d like to inform you that we’ve changed the default sitewide check settings for single-page crawls set up through the Task POST endpoint of the On-Page API.
Earlier, when you set max_crawl_pages to 1 and specified a start_url to crawl only one page, our system still had to make numerous additional requests to obtain data for sitewide checks, so the crawl took longer to complete.
To minimize the crawl time and our system load, we have made the following changes:
1. If you set max_crawl_pages to 1 and either do not specify start_url or set it to a homepage, the following sitewide checks will be disabled: test_canonicalization, enable_www_redirect_check, test_hidden_server_signature, test_page_not_found, test_directory_browsing, test_https_redirect.
2. If you set max_crawl_pages to 1 and specify a start_url other than a homepage, all sitewide checks will be disabled.
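
The two rules above can be sketched as a small decision function. This is only an illustration of the logic as described in this post, not the actual On-Page API implementation: the function names, the returned labels, and the homepage test (a start_url given as an absolute URL whose path is empty or "/") are all assumptions made for the example.

```python
from urllib.parse import urlparse

# Checks listed in rule 1 (names taken from this announcement).
HOMEPAGE_DISABLED_CHECKS = frozenset({
    "test_canonicalization",
    "enable_www_redirect_check",
    "test_hidden_server_signature",
    "test_page_not_found",
    "test_directory_browsing",
    "test_https_redirect",
})


def is_homepage(start_url):
    """Assumption: a missing start_url, or an absolute URL whose path is
    empty or "/", counts as the homepage."""
    if not start_url:
        return True
    return urlparse(start_url).path in ("", "/")


def sitewide_check_mode(max_crawl_pages, start_url=None):
    """Return a hypothetical label for which new default applies."""
    if max_crawl_pages != 1:
        # The change described here only concerns single-page crawls.
        return "not_affected"
    if is_homepage(start_url):
        # Rule 1: only the six checks in HOMEPAGE_DISABLED_CHECKS are disabled.
        return "homepage_subset_disabled"
    # Rule 2: every sitewide check is disabled.
    return "all_disabled"


if __name__ == "__main__":
    print(sitewide_check_mode(1))
    print(sitewide_check_mode(1, "https://example.com/blog/post"))
```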
However, you can still enable sitewide checks for one-page crawls. For this, we’ve added a new parameter, force_sitewide_checks, that you can simply set to true when necessary.
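
As a rough sketch, a single-page task that keeps sitewide checks enabled might be assembled like this. The max_crawl_pages, start_url, and force_sitewide_checks parameters come from this post; the target field, the array-of-tasks body shape, the helper name, and the example URLs are assumptions, so please verify them against the docs before use. The snippet only builds and prints the request body and does not send anything.

```python
import json


def build_single_page_task(target, start_url, force_sitewide_checks=False):
    """Build a hypothetical Task POST body for a one-page crawl."""
    task = {
        "target": target,        # assumed field name for the domain to crawl
        "max_crawl_pages": 1,    # single-page crawl
        "start_url": start_url,  # page to crawl
    }
    if force_sitewide_checks:
        # Opt back in to sitewide checks for this one-page crawl.
        task["force_sitewide_checks"] = True
    # Assumption: the endpoint accepts a JSON array of task objects.
    return [task]


if __name__ == "__main__":
    body = build_single_page_task(
        "example.com",
        "https://example.com/blog/post",
        force_sitewide_checks=True,
    )
    print(json.dumps(body, indent=2))
```

Leaving force_sitewide_checks out of the body keeps the new defaults, so the flag is only added when it is explicitly requested.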
Check our docs to review the changes
Learn more about disabling sitewide checks for multi-page crawls
Don’t hesitate to contact our support team if you have any questions.