120 OnPage API Metrics Explained

On-page SEO refers to the practices that improve the aspects of a webpage that search engines use to determine its rankings. Unlike off-page SEO, which acts indirectly through backlinks and anchor text, on-page SEO works directly: you create unique, relevant content and optimize metadata. It also covers keywords, headlines, and visuals, and calls for a high level of expertise and trust.

Optimization begins with a website audit, a tedious process that requires immersion in technicalities, and search engines never stop setting new ranking criteria. At DataForSEO, we are aware that providing comprehensive analytics of a website’s health is not enough to succeed in the market. We keep up with niche trends by offering crawling automation, JavaScript execution and analysis, and a user-first approach. If you are developing a new solution for digital marketers or adding features to an existing SEO tool, integrating OnPage API is the way to go.

DataForSEO OnPage API is a cloud-based web crawler encompassing multiple endpoints. It checks websites against technical benchmarks so that you can quickly fix flaws and make optimizing tweaks.

In this article, we will look over 120 flexible on-page metrics and issues, along with their meaning and impact on your website.

Contents:
Setting an OnPage API Task
Retrieving API Task Results
OnPage Score
OnPage API metrics by categories
Takeaway

Setting an OnPage API Task

A website is sent for crawling with a request to the OnPage Task Post endpoint. Alongside the mandatory input fields – URL or domain name and limit of pages, you can add customizable parameters, such as:

  • thresholds for various performance indicators related to errors, size, speed, and content;
  • custom JavaScript code to run while crawling;
  • execution of the JavaScript embedded in a crawled website;
  • storing the raw HTML of the crawled pages;
  • loading resources (images, stylesheets, scripts, broken items);
  • browser rendering to measure Core Web Vitals;
  • keyword density calculation to help avoid keyword stuffing.

You can also prioritize certain pages, disable checks for others, enable or disable a sitewide check, or supply a custom sitemap for crawling. Some parameters relate only to canonical pages. According to Google, a canonical page is the best representative among a group of duplicate pages: link equity from all the duplicates is consolidated to it, and it is the one that is indexed and shown in search results. A canonical tag indicates that a certain URL represents the master copy of a page.
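
As a rough illustration, the Python sketch below assembles a request body for the Task Post endpoint. The helper build_task is hypothetical, and the field names (target, max_crawl_pages, enable_javascript, load_resources, enable_browser_rendering, store_raw_html, calculate_keyword_density, checks_threshold) are assumptions made for this example, so check them against the OnPage API documentation before relying on them.

```python
import json


def build_task(target, max_crawl_pages, **options):
    """Assemble one task object for the Task Post request body.

    All key names are assumptions for illustration; verify them
    against the OnPage API documentation.
    """
    if not target:
        raise ValueError("target (URL or domain name) is mandatory")
    if max_crawl_pages < 1:
        raise ValueError("max_crawl_pages must be a positive page limit")
    task = {"target": target, "max_crawl_pages": max_crawl_pages}
    # Optional parameters are added only when the caller sets them.
    task.update({k: v for k, v in options.items() if v is not None})
    return task


task = build_task(
    "example.com",
    100,
    enable_javascript=True,
    load_resources=True,
    enable_browser_rendering=True,
    store_raw_html=None,  # left unset, so it is omitted from the task
    calculate_keyword_density=True,
    checks_threshold={"high_loading_time": 2000},
)

# Assumed here: the request body is a JSON array of task objects.
payload = json.dumps([task], indent=2)
print(payload)
```

Keeping the optional parameters out of the task unless they are set explicitly means the crawler's defaults apply to everything you do not override.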

Retrieving API Task Results

Once the website has been submitted for crawling, you can start retrieving results using your task identifier (id). You can request the following OnPage endpoints depending on your objectives:

  1. Summary – overall information and exact on-page issues of a scanned website, giving a clue about what functions to use to drill down to the found issues.
  2. Instant Pages (Live) – page-specific data for up to 20 specific URLs in one request from different domains. When addressing this endpoint, you can similarly specify a custom user agent and JavaScript rules, load page scripts, enable spell-checking, and specify threshold values.
  3. Pages – a list of the crawled pages with performance metrics; precise information on how well your pages are configured for search.
  4. Pages by Resource – a list of pages where a specified resource is located, as well as metrics of the pages accommodating it.
  5. Resources – a list of resources, such as images, scripts, stylesheets, and broken items, with their detailed overview.
  6. Duplicate Tags – a list of pages with duplicate title or description tags, with their performance metrics.
  7. Duplicate Content – a list of pages with content similar to the page specified, their performance metrics, and the similarity scores.
  8. Links – a list of inbound and outbound links on a target website.
  9. Redirect Chains – a list of URLs forming at least two redirects between the initial URL and the destination URL or redirect loops.
  10. Non-indexable – a list of pages blocked from being indexed by Google and other search engines by robots.txt, HTTP headers, or meta tags settings.
  11. Waterfall – the page speed insights and parameters for creating the speed tests.
  12. Keyword Density – keyword density and frequency data for terms appearing on the specific website or page, available for filtering and sorting.
  13. Microdata – structured JSON-LD data and Microdata, and detailed results of its validation; supports all microdata types available on schema.org.
  14. Raw HTML – the raw HTML of a page you indicate in the request.
  15. Page Screenshot – returns the screenshots of the pages so that you can spot design and layout issues visually.
  16. Content parsing & Content parsing (Live) – returns the structured content, URLs, anchors, headings, and textual content of the target page.
  17. Lighthouse – leverages the features of Google’s open-source project intended to provide webmasters with the metrics of performance of pages and apps on a mid-level mobile device connected via 4G.

You can fetch results gradually as our crawler processes the pages without waiting for all of them to be crawled. The crawling progress and status are displayed in the Summary endpoint. Or you can request a complete result once it is ready.
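
The gradual retrieval described above could be sketched as a simple polling loop. This is only an illustration: fetch_summary is a hypothetical caller-supplied function standing in for a real HTTP request to the Summary endpoint, and the keys crawl_progress and pages_crawled are assumed field names.

```python
import time


def wait_for_crawl(fetch_summary, task_id, interval=0.0, max_polls=10):
    """Poll until the crawl reports completion or max_polls is reached.

    fetch_summary is a hypothetical callable returning the Summary
    result for task_id as a dict. The keys "crawl_progress" and
    "pages_crawled" are assumed field names.
    """
    summary = {}
    for _ in range(max_polls):
        summary = fetch_summary(task_id)
        if summary.get("crawl_progress") == "finished":
            return summary
        # Partial results could already be requested at this point.
        time.sleep(interval)
    return summary


# Stand-in for a real HTTP call, so the sketch runs offline.
_fake_responses = iter([
    {"crawl_progress": "in_progress", "pages_crawled": 12},
    {"crawl_progress": "in_progress", "pages_crawled": 57},
    {"crawl_progress": "finished", "pages_crawled": 100},
])

result = wait_for_crawl(lambda task_id: next(_fake_responses), "task-id-placeholder")
print(result["pages_crawled"])
```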

Payment is charged only for the actually scanned pages: the difference for unscanned ones will automatically return to your account. The price is based on the number of pages and additional parameters you want to use.

OnPage Score

We chose to start with the OnPage Score since this proprietary DataForSEO metric is probably the most universal and insightful. It shows how user-centric and technical aspects contribute to search engine optimization and, ultimately, to higher rankings and organic traffic.

It evaluates the website quality on a 100-point scale, where 100 is the highest score indicating the absence of critical issues, and 0 is the lowest score, showing that the page accumulates all kinds of errors and warnings and is not optimized at all.

The metric considers critical errors and warnings found on the page. It is returned by several OnPage API endpoints.

There are 34 OnPage API parameters affecting OnPage Score.

Parameters responsible for critical issues include:
high_loading_time, redirect_loop, canonical_to_broken, recursive_canonical, is_http, no_title, duplicate_title_tag, broken_links, links_relation_conflict, canonical_to_redirect, canonical_chain, duplicate_content

Parameters responsible for important issues include:
duplicate_title, broken_resources, large_page_size, duplicate_description, no_description, no_image_alt, seo_friendly_url, flash, duplicate_meta_tags, no_h1_tags, irrelevant_description, title_too_long, is_orphan_page, irrelevant_title, title_too_short, no_doctype, high_content_rate, low_character_count, high_character_count, low_readability_rate, deprecated_html_tags, lorem_ipsum

At DataForSEO, we have analyzed Google’s statements, competing solutions, and common SEO practices in depth to assign a weight to each of these parameters. As a result, our figures are close to those in other tools, although they may differ slightly. Each weight reflects the parameter’s significance for optimization. For example, slow-loading pages score 10 points, since slow loading can frustrate users and lead to higher bounce rates, while canonical-to-redirect and canonical chain issues score only 3 points each, because they can confuse search engines but do not hurt the user experience.

The OnPage API response returns the score for every crawled page. The Summary endpoint, in turn, provides the averaged value for the whole domain, so there the score depends on the number of crawled pages. The principle remains the same, but two variables are added: the number of pages where a specific error or warning appears, and the total number of crawled pages. Broken pages always have a zero OnPage Score.
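
To make the idea of weighted checks more tangible, here is a minimal Python sketch. It assumes a simple subtractive model: a page starts at 100 and loses the weight of each failed check, and the domain-level value scales each weight by the share of affected pages. That model is an assumption made for illustration, not the exact DataForSEO formula, and only the three weights quoted above are included.

```python
# Weights quoted in the text; weights for the other checks are not
# given there, so they are left out of this sketch.
WEIGHTS = {
    "high_loading_time": 10,
    "canonical_to_redirect": 3,
    "canonical_chain": 3,
}


def page_score(failed_checks, is_broken=False):
    """Assumed model: start at 100, subtract the weight of each failed check."""
    if is_broken:
        return 0.0  # broken pages always have a zero score
    penalty = sum(WEIGHTS.get(name, 0) for name in failed_checks)
    return max(0.0, 100.0 - penalty)


def domain_score(pages_with_issue, total_pages):
    """Assumed model: scale each weight by the share of pages affected."""
    if total_pages <= 0:
        raise ValueError("total_pages must be positive")
    penalty = sum(
        WEIGHTS.get(name, 0) * count / total_pages
        for name, count in pages_with_issue.items()
    )
    return max(0.0, 100.0 - penalty)


print(page_score(["high_loading_time", "canonical_chain"]))      # 100 - 10 - 3 = 87.0
print(domain_score({"high_loading_time": 25}, total_pages=100))  # 100 - 10 * 0.25 = 97.5
```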

OnPage API metrics by categories

DataForSEO provides extensive OnPage API documentation with a detailed description of every endpoint: what it returns and how. We have grouped the metrics for convenient searching based on common digital marketing practices. Thus, you can quickly spot a metric or parameter of concern, read about its purpose, and, most importantly, find related parameters. In particular, the on-page metrics returned in the response fall within the following categories:

  1. On-page errors
  2. Non-indexable pages
  3. Website performance speed
  4. Domain security
  5. Usability
  6. Content metrics.

1. On-page errors

These are the ones to start an audit with. Their importance ranges from moderate to high, but all of them keep a website from being indexed and ranked properly. Such issues are costly and lead to unhappy readers. At the same time, they are the easiest to detect and are straightforward to fix.

  • Broken-type issues. Broken links and resources do not affect SEO directly, but they increase bounce rates. When something goes wrong with a page request, resulting in a 4xx or 5xx HTTP status code, the user experience is at risk. Moreover, Google considers a site with a high number of 404 errors unreliable, which damages ranking.
  • is_broken – broken page, resource, or link. Indicates whether a page returns a response code less than 200 or greater than 400. Indicates whether a page with the given resource returns 4xx, or 5xx response codes or has broken elements in the resource. The number of such pages.
  • is_4xx_code – a page with 4xx status code. 4xx response for client errors comes from the browser if the parameters required to visit the requested URL are missing. The number of such pages.
  • is_5xx_code – a page with 5xx status code. 5xx response for server errors indicates that the request could not be handled. The number of such pages.
  • broken_resources – indicates whether a page or a website contains images and other resources with broken links, and their number. An important issue for the OnPage Score.
  • broken_links – indicates whether a page or a website contains broken links. The number of broken links across all crawled pages on a target website. A critical issue for the OnPage Score.
  • Duplicate issues. Of course, duplicate content is unavoidable for e-commerce websites. It doesn’t bring technical penalties from the search engines, but they decide which page is canonical, and disregard the rest. It can also dilute page authority. Other duplicate issues create confusion for engines and visitors, e.g., Google picks duplicate meta-tags only once, and pages with similar titles and descriptions compete.
  • duplicate_title – indicates whether a page has duplicate title tags. The number of such pages. An important issue for the OnPage Score.
  • duplicate_title_tag – indicates whether a canonical page has more than one title tag. The number of such pages. A critical issue for the OnPage Score.
  • duplicate_description – indicates whether a page has a duplicate description tag. The number of such pages. An important issue for the OnPage Score.
  • duplicate_content – indicates whether a page has duplicate content. The number of such pages. A critical issue for the OnPage Score.
  • duplicate_meta_tags – indicates whether a canonical page has duplicate meta tags for structuring the page data. The number of pages with more than one meta tag of the same type. The array of duplicate meta tags on the page. An important issue for the OnPage Score.
  • Canonical issues. Canonical errors mean that your website has multiple duplicate URLs. They occur when 301 redirects are not placed properly. A canonical tag prevents problems from the presence of equivalent or “equal” content on different URLs.
  • canonical_to_broken – indicates whether a page has a canonical link element pointing to a page that responds with a 4xx or 5xx response code. The number of such pages. A critical issue for the OnPage Score.
  • canonical_to_redirect – indicates whether a page has a canonical link element pointing to a page that responds with a 3xx redirect. The number of such pages. A critical issue for the OnPage Score.
  • canonical_chain – indicates whether a page has a canonical link element pointing to a page that has a canonical pointing elsewhere, e.g. page a is canonicalized to page b, which is, in turn, canonicalized to page c. The number of such pages. A critical issue for the OnPage Score.
  • recursive_canonical – error indicating whether a page has a canonical tag to another page, which, in turn, refers back to the initial page. The number of such pages. A critical issue for the OnPage Score.
  • Other on-page errors.
  • redirect_loop – number of redirect chains where the destination URL redirects back to the original URL. A critical issue for the OnPage Score.
  • no_content_encoding – indicates whether a page or resource has no compression algorithm of the content. The number of such pages.
  • no_encoding_meta_tag – indicates whether a page has no meta tag encoding – the content type parameter. Informative only if the encoding is not explicit in the header. The number of such pages.
  • no_doctype – indicates whether a page has no Doctype declaration (used to indicate the page’s type and its markup language). The number of such pages. An important issue for the OnPage Score.
  • no_title – indicates whether a page has an empty or absent title tag. The number of such pages. A critical issue for the OnPage Score.
  • no_description – indicates whether a canonical page has an empty or absent description meta tag. The number of such pages. An important issue for the OnPage Score.
  • no_image_alt – images without alt tags (an attribute setting the alternative text for an image that describes what is shown; used by search engines to rank images in image search), and their number. An important issue for the OnPage Score.
  • no_h1_tags – indicates whether a page has an empty or missing level 1 heading. The number of such pages. An important issue for the OnPage Score.
  • is_link_relation_conflict – indicates whether a link has a conflict with another link. Indicates whether a page receives a mix of both followed and no-followed inbound internal links. The number of such pages on the website. For example, if is_link_relation_conflict is 1, the target website has one page receiving at least one internal no-follow link and at least one do-follow link. A critical issue for the OnPage Score.
  • is_orphan_page – indicates whether a page has no reference from other pages of the domain. The number of such pages. An important issue for the OnPage Score.
  • lorem_ipsum – indicates whether a page contains lorem ipsum, i.e. placeholder content. The number of such pages. An important issue for the OnPage Score.
  • resource_errors – indicates whether an object has resource errors and warnings. Returns the lines and columns containing errors, error status codes, and text messages, as well as the full list of potential HTML, JS, CSS, and image errors.
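
The four canonical checks described in this list can be illustrated with a short Python sketch. The inputs canonical_of and status_of are hypothetical stand-ins for crawl data, and the logic is a simplified reading of the definitions above rather than the crawler's actual implementation.

```python
def canonical_issues(url, canonical_of, status_of):
    """Return the set of canonical issues for url.

    canonical_of maps a page URL to the URL in its canonical link
    element; status_of maps a URL to its HTTP status code. Both are
    hypothetical inputs standing in for crawl data.
    """
    issues = set()
    target = canonical_of.get(url)
    if target is None or target == url:
        return issues  # no canonical, or self-referencing: nothing to flag

    status = status_of.get(target, 200)
    if 400 <= status < 600:
        issues.add("canonical_to_broken")
    elif 300 <= status < 400:
        issues.add("canonical_to_redirect")

    next_target = canonical_of.get(target)
    if next_target == url:
        issues.add("recursive_canonical")  # a -> b -> a
    elif next_target is not None and next_target != target:
        issues.add("canonical_chain")      # a -> b -> c
    return issues


canonical_of = {"/a": "/b", "/b": "/c", "/x": "/y", "/y": "/x", "/old": "/gone"}
status_of = {"/gone": 404}

print(canonical_issues("/a", canonical_of, status_of))    # {'canonical_chain'}
print(canonical_issues("/x", canonical_of, status_of))    # {'recursive_canonical'}
print(canonical_issues("/old", canonical_of, status_of))  # {'canonical_to_broken'}
```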

2. Non-indexable pages

Every page has to meet certain technical conditions to be shown in Google search results. Together they are referred to as indexability. A page is indexable when it is:

  • canonical;
  • not blocked by the ‘noindex’ robots meta tag;
  • not blocked by a disallow directive in robots.txt;
  • responding with a 200 status code.
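
The conditions above can be checked in a few lines of Python. The sketch below uses the standard library's urllib.robotparser for the robots.txt rule; the function name and the reason labels are hypothetical and are not the values returned by the API. Blocking through HTTP headers is left out to keep the example short.

```python
from urllib.robotparser import RobotFileParser


def non_indexable_reason(url, status_code, meta_robots, canonical_url, robots_txt):
    """Return a short reason label if the page is non-indexable, else None.

    The labels are illustrative only, not the values the API returns.
    """
    if status_code != 200:
        return "status_code"
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    if not parser.can_fetch("*", url):
        return "robots_txt"
    directives = {d.strip().lower() for d in meta_robots.split(",")}
    if "noindex" in directives:
        return "meta_noindex"
    if canonical_url and canonical_url != url:
        return "not_canonical"
    return None


ROBOTS = "User-agent: *\nDisallow: /private/"

print(non_indexable_reason("https://example.com/private/a", 200, "", None, ROBOTS))
print(non_indexable_reason("https://example.com/blog", 200, "noindex, follow", None, ROBOTS))
print(non_indexable_reason("https://example.com/blog", 200, "", "https://example.com/blog", ROBOTS))
```

The three calls illustrate a page disallowed in robots.txt, a page carrying a noindex directive, and an indexable self-canonical page.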

It is common for websites to contain plenty of non-indexable pages, like faceted categories or unavailable products. They are of no use to visitors coming from search, who should not be able to find them, so they should not be placed in sitemaps.

The fields responsible for indexation are:

  • dofollow in the Links endpoint – indicates whether a link is do-follow. If the value is true, the link doesn’t have a nofollow attribute;
  • follow in the Pages, Instant Pages, and Pages by Resource endpoints – indicates whether a page’s robots settings allow crawlers to follow the links on the page. If false, the page’s meta_robots tag contains a nofollow parameter instructing crawlers not to follow the links.

DataForSEO allows you to keep track of pages with noindex tags to make sure they appear only on the pages that search engines actually should not consider. We recommend adding this step to your publishing checklist.

For non-indexable pages, we have provided a dedicated Non-Indexable endpoint. It returns a list of pages blocked from being indexed by Google and other search engines by robots.txt, HTTP headers, or meta tags settings. It returns the following information:

  • reason – the reason why the page is non-indexable;
  • URL – the link to the non-indexable page.

3. Website performance speed

In our API, these metrics are collected in the Waterfall endpoint, as well as in the page_timing array of the other endpoints. Speed forms a good first impression and affects conversion rates, bounce rates, and repeat purchases. For example, a one-second improvement can grow mobile conversions by a whopping 27%. Search engines prioritize user experience, so a slow-loading page can negatively impact rankings.

  • General timing issues. Nobody wants to wait for information, and websites have mere seconds to impress. Our API supplements your SEO software with the benchmarks providing speed insights.
  • high_loading_time – indicates whether a page takes too long to load. Specified as the time in milliseconds it takes a page to fully load; the default value is 3000 milliseconds. The number of such pages. If the loading time is greater than or equal to the value specified in the checks_threshold array of the POST request, the pages matching the set criteria will be flagged in the API response. A critical issue for the OnPage Score.
  • waiting_time – Time to First Byte (TTFB) in milliseconds – a measurement of responsiveness. Time a client’s browser needs to receive the first byte of the response from the server.
  • high_waiting_time – indicates whether a page has too long a waiting time. Specified as the TTFB in milliseconds; the default value is 1500 milliseconds. The number of such pages. If the waiting time is greater than or equal to the value specified in the checks_threshold array of the POST request, the pages matching the set criteria will be flagged in the API response.
  • time_to_interactive – the Time to Interactive (TTI) metric. Time in milliseconds it takes until the user can fully interact with a page.
  • dom_complete – time to download resources. Time in milliseconds it takes until the page and all of its subresources are downloaded.
  • connection_time – time in milliseconds it takes until the connection with a server is established.
  • request_sent_time – time in milliseconds it takes until the request to a server is sent.
  • download_time – time in milliseconds it takes a browser to receive a response.
  • duration_time – total time in milliseconds it takes until a browser receives a complete response from a server. Time in milliseconds it takes to fetch a resource.
  • fetch_start – time in milliseconds a browser needs to start downloading the HTML source, or other resources of a page.
  • fetch_end – time in milliseconds a browser needs to complete downloading the HTML source, or other resources of a page.
  • time_to_secure_connection – time in milliseconds it takes until the secure connection with a server is established.
  • ttl – time to live (TTL). Time in milliseconds it takes for the browser to cache a resource.
  • lighthouse – an endpoint running a series of individual tests to produce a numeric score according to the project’s official documentation. Performs audits for performance, accessibility, progressive and mobile web apps, SEO, and best practices compliance.
  • Render-blocking issues. Resources, such as CSS, JS files, and images, upon certain conditions, can block web page rendering. A page having them takes longer to load, which affects user experience and SEO performance.
  • has_render_blocking_resources – indicates whether a page has render-blocking resources, typically, the scripts and stylesheets that prevent a page from loading quickly. The number of such pages.
  • render_blocking_scripts_count – the number of scripts on the page that block page rendering.
  • render_blocking_stylesheets_count – the number of CSS styles on the page that block page rendering.
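
The threshold logic behind high_loading_time and high_waiting_time can be sketched in Python as follows. The defaults are the ones quoted above; the function name and its arguments are hypothetical, and checks_threshold is modeled as a plain dictionary of overrides.

```python
# Defaults quoted in the text, in milliseconds.
DEFAULT_THRESHOLDS = {"high_loading_time": 3000, "high_waiting_time": 1500}


def timing_flags(loading_time_ms, waiting_time_ms, checks_threshold=None):
    """Flag timings that are greater than or equal to their thresholds.

    checks_threshold optionally overrides the defaults, mirroring the
    custom thresholds that can be sent with the POST request.
    """
    limits = {**DEFAULT_THRESHOLDS, **(checks_threshold or {})}
    return {
        "high_loading_time": loading_time_ms >= limits["high_loading_time"],
        "high_waiting_time": waiting_time_ms >= limits["high_waiting_time"],
    }


print(timing_flags(2500, 1500))
# {'high_loading_time': False, 'high_waiting_time': True}
print(timing_flags(2500, 900, {"high_loading_time": 2000}))
# {'high_loading_time': True, 'high_waiting_time': False}
```

Note that a value exactly equal to the threshold is flagged, as in the first call, where a waiting time of 1500 ms meets the default limit.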

4. Domain security

Insecure login pages, phishing scams, and malware lead to lost or stolen sensitive data, such as personal and financial details and credentials. Consequences for users and businesses include identity theft, financial loss, and reputational damage. OnPage API highlights the following security issues:

  • ssl – indicates whether a target website has an SSL certificate (providing https:// connection).
  • http2 – indicates whether a target website is using the HTTP2 protocol.
  • test_hidden_server_signature – hidden server signature (public web server identification containing sensitive information). Indicates whether a server signature is hidden from crawlers. If the value is false, our crawler is able to access the website’s server signature.
  • is_www – indicates whether a page is on a www subdomain. The number of such pages. Indicates whether a page with a given resource is on a www subdomain.
  • is_https – indicates whether a page is encrypted with the HTTPS protocol. The number of such pages.
  • is_http – indicates whether a page has the non-secure HTTP protocol. The number of such pages. A critical issue for the OnPage Score.
  • https_to_http_links – indicates whether a secure HTTPS page has a link to the non-secure HTTP pages. The number of such pages.
  • has_html_doctype – indicates whether a page has the HTML doctype declaration. The number of such pages.

5. Usability

Weak or under-researched usability aspects undermine a website’s performance, leading to missed opportunities, frustrated users, and reduced engagement. Such problems hinder people from efficiently interacting with a website. Below are the usability insights you gain by crawling your website with our API:

  • Size issues are directly related to both speed and usability. The heavier a page is, the longer it takes to load. Search engines will consider a heavy page unhelpful and drop it in the SERP. If a website is not optimized for the user’s screen size, it inevitably annoys the user.
  • size – indicates the size of a given page or resource measured in bytes. In the Filters endpoint, you can sort out the pages by a certain size.
  • is_minified – indicates whether the content of a stylesheet or script is minified. Minification is the process of minimizing code and markup, one of the main methods to reduce load times and bandwidth usage on websites.
  • small_page_size – indicates whether a page is too small. Specified as the weight of the page in bytes. The value will be true if the page size is smaller than 1024 bytes. The number of such pages. If the page weight is less than or equal to the value specified in the checks_threshold array of the POST request, the pages matching the set criteria will be flagged in the API response.
  • large_page_size – indicates whether a page is too heavy. Specified as the weight of the page in bytes. The value will be true if a page size exceeds 1 megabyte. The number of such pages. If the page weight is more than or equal to the value specified in the checks_threshold array of the POST request, the pages matching the set criteria will be flagged in the API response. An important issue for the OnPage Score.
  • size_greater_than_3mb – indicates whether a page size exceeds 3 MB. The number of such pages.
  • encoded_size – indicates a page or resource size after encoding measured in bytes. In the Filters endpoint, you can sort out the pages by a certain encoded size.
  • total_transfer_size – indicates a page or resource compressed size measured in bytes. In the Filters endpoint, you can sort out the pages by a certain transfer size.
  • total_dom_size – total Document Object Model (DOM) size of a page. The browser creates a DOM of the page when it is loaded. All the tags in the HTML document are called nodes and DOM size is the sum of them. In the Filters endpoint, you can sort out the pages by a certain DOM size.
  • Redirect issues can trap web crawlers, preventing them from accessing and indexing content. This can severely impact a site’s visibility in search results. Redirects also have a high cost, as an extra HTTP request occurs, and for each extra page, the user loses interest.
  • is_redirect – indicates whether a page has redirects. The number of pages containing 3xx redirects to other pages.
  • has_redirect – indicates whether a resource has an inbound redirect. If the resource type is an image, this field will indicate whether other pages and/or resources have redirects pointing at this image. If the resource type is a script, this field will indicate whether the script contains a redirect.
  • has_links_to_redirects – indicates whether a page is pointing to a page that redirects elsewhere. The number of pages pointing to a page that responds with a 3xx redirect.
  • redirect_chain – indicates whether a page has multiple redirects. The number of pages with at least two redirects between the original page and the destination page.
  • has_meta_refresh_redirect – indicates whether a page has a meta refresh client-side redirect (with the <meta http-equiv="refresh"> tag), which instructs the web browser to load another web page after a certain time span. The number of such pages.
  • canonical – indicates whether a page is canonical. The number of canonical pages.
  • Friendly URLs have a higher likelihood of being clicked. They have a safe and memorable appearance and make search results more meaningful.
  • seo_friendly_url – indicates whether a canonical page has an SEO-friendly URL. The SEO-friendliness of a page URL is checked by four parameters:
    – the length of the relative path is less than 120 symbols;
    – no special characters;
    – no dynamic parameters;
    – relevance of the URL to the page.
    If at least one of them has failed, then such a URL is not considered SEO-friendly. The number of pages with SEO-friendly URLs. An important issue for the OnPage Score.
  • seo_friendly_url_dynamic_check – URL dynamic check-up. The value will be true if a page has no dynamic parameters in the URL. The number of such pages.
  • seo_friendly_url_characters_check – URL characters check-up. Indicates whether a page URL contains only uppercase and lowercase Latin characters, digits, and dashes. The number of such pages.
  • seo_friendly_url_relative_length_check – URL length check-up. The value will be true if a page URL is not longer than 120 symbols. The number of such pages.
  • seo_friendly_url_keywords_check – URL keyword check-up. Indicates whether a page URL is consistent with the title meta tag. The number of pages that have consistent URLs.
  • Core Web Vitals. Google gauges the first impression a website makes based on its loading speed (Largest Contentful Paint, LCP), responsiveness (First Input Delay, FID), and visual stability of content (Cumulative Layout Shift, CLS), together known as the Core Web Vitals.
  • largest_contentful_paint – Core Web Vitals metric measuring how fast the largest above-the-fold content element is displayed. Time in milliseconds it takes to render the largest content element visible in the viewport from the moment the user requests the URL.
  • cumulative_layout_shift – Core Web Vitals metric measuring the layout stability of a page. Measures the total of all individual layout shift scores for every unexpected layout shift that occurs during the entire lifespan of the page.
  • first_input_delay – Core Web Vitals metric indicating the responsiveness of a page. Time in milliseconds from the moment a user first interacts with your page to the moment when the browser responds to that interaction.
  • cachable – indicates whether a page or resource is cacheable, i.e. reusable responses from the server are stored to make subsequent requests faster.
  • no_favicon – indicates whether a page has no favicon (icon on the page tab). Its absence hinders user navigation. The number of such pages.
  • has_subrequests – indicates whether a resource has subrequests. Indicates whether the content of a stylesheet or a script contains additional requests.
  • Other usability issues
  • deprecated_html_tags – indicates whether a page contains deprecated HTML tags that are allowed, but not recommended. They were replaced by improved alternatives and may affect accessibility. The number of such pages. An important issue for the OnPage Score.
  • flash – indicates whether a page contains heavy and outdated flash elements that slow down page loading and cannot be opened without a flash player. The number of such pages. An important issue for the OnPage Score.
  • frame – indicates whether a page contains frame tags responsible for dividing the browser window into parts that display the content. The number of such pages.
  • click_depth – number of clicks it takes to get to the page. Indicates the number of clicks from the homepage needed before landing at the target page.
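
The four SEO-friendly URL checks listed above could be approximated as follows. This is a sketch under stated assumptions: the function name is hypothetical, slashes are treated as path separators rather than special characters, and the keyword check is simplified to "at least one word from the title appears in the path", since the exact relevance rule is not specified here.

```python
import re
from urllib.parse import urlparse


def seo_friendly_url_checks(url, title):
    """Run four simplified URL checks and combine them into one verdict."""
    parsed = urlparse(url)
    path = parsed.path
    segments = [s for s in path.split("/") if s]
    title_words = set(re.findall(r"[a-z0-9]+", title.lower()))
    path_words = set(re.findall(r"[a-z0-9]+", path.lower()))
    checks = {
        # Relative path shorter than 120 symbols.
        "relative_length_check": len(path) < 120,
        # Only Latin letters, digits, and dashes in each path segment.
        "characters_check": all(re.fullmatch(r"[A-Za-z0-9-]+", s) for s in segments),
        # No dynamic parameters, i.e. no query string.
        "dynamic_check": parsed.query == "",
        # Assumed rule: the path shares at least one word with the title;
        # an empty path has nothing to compare and passes.
        "keywords_check": not path_words or bool(title_words & path_words),
    }
    # The URL is SEO-friendly only if every individual check passes.
    checks["seo_friendly_url"] = all(checks.values())
    return checks


good = seo_friendly_url_checks(
    "https://example.com/blog/on-page-seo-guide", "On-Page SEO Guide for Beginners"
)
bad = seo_friendly_url_checks("https://example.com/p?id=42&ref=home", "On-Page SEO Guide")
print(good["seo_friendly_url"])                      # True
print(bad["seo_friendly_url"], bad["dynamic_check"])  # False False
```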

6. Content metrics

All strategies have content at their core, and all marketers work on it. Content is capable of generating leads, boosting conversions, and increasing engagement and Google rankings. Depending on your business model and offers, you may update content relatively frequently, and even with expertly crafted original copy, subsequent additions or edits can damage its SEO effectiveness.

  • Quantity issues. The more content you publish, the more traffic you are likely to get. Longer content tends to solve users’ problems better, so Google tends to place it at the top of the SERPs. However, visitors may simply not finish reading if it is not engaging enough.
  • plain_text_size – total size of the text on the page measured in bytes. In the Filters endpoint, you can sort out the pages by a certain plain text size.
  • plain_text_rate – plaintext rate value. The ratio of plain_text_size to size values. In the Filters endpoint, you can sort out the pages by a certain plain text rate.
  • plain_text_word_count – number of words on a page. In the Filters endpoint, you can sort out the pages by a certain plain text word count.
  • low_content_rate – indicates that a page content rate is too low. Specified as the plain text size to page size ratio. The number of pages that have this ratio of less than 0.1. If this ratio is less than or equal to the value specified in the checks_threshold array of the POST request, the pages matching the set criteria will be flagged in the API response.
  • high_content_rate – indicates that a canonical page content rate is too high. Specified as the plain text size to page size ratio. The number of pages which have this ratio of more than 0.9. If this ratio exceeds or equals the value specified in the checks_threshold array of the POST request, the pages matching the set criteria will be flagged in the API response. An important issue for the OnPage Score.
  • low_character_count – indicates a page containing less than 1024 characters. The number of such pages. Specified as the number of characters on the page. If the number of characters on the page is less than or equal to the value specified in the checks_threshold array of the POST request, the pages matching the set criteria will be flagged in the API response. An important issue for the OnPage Score.
  • high_character_count – indicates a page that has more than 256,000 characters; the number of such pages. Specified as the number of characters on the page. If the number of characters on the page is more than or equal to the value specified in the checks_threshold array of the POST request, the pages matching the set criteria will be flagged in the API response. An important issue for the OnPage Score.
  • Quantity metrics. DataForSEO API also provides content-related metrics not indicative of page quality as such. However, configured in the filters and thresholds, they will demonstrate performance against certain criteria and reference values.
  • internal_links_count – number of internal links on the page.
  • external_links_count – number of external links on the page.
  • images_count – number of images on the page.
  • images_size – total size of images on the page measured in bytes.
  • scripts_count – number of scripts on the page.
  • scripts_size – total size of scripts on the page measured in bytes.
  • stylesheets_count – number of stylesheets on the page.
  • stylesheets_size – total size of stylesheets on the page measured in bytes.
  • title_length – length of the title tag in characters.
  • description_length – length of the description tag in characters.
  • Quality issues. To truly reign supreme, the content must be consistent and fresh, and deliver value to the audience. Readers will then see the website as a trusted source for advice and recommendations and will return to read more.
  • irrelevant_title – indicates that a canonical page’s title is not relevant to its content. To calculate relevancy, we take all the unique words from the page title and all the unique words from the page content. If fewer than 30% of the title words occur in the content, we consider the title irrelevant. The number of such pages. Specified as the match rate of the page’s title to its content; the relevance threshold is 0.3. If the score is less than or equal to the value specified in the checks_threshold array of the POST request, the pages matching the set criteria will be flagged in the API response. An important issue for the OnPage Score.
  • irrelevant_description – indicates that a canonical page’s description tag is irrelevant to the content of the page. To calculate relevancy, we take all the unique words from the page description and all the unique words from the page content. If fewer than 20% of the description words occur in the content, we consider the description irrelevant. The number of such pages. Specified as the match rate of a page’s description to its content; the relevance threshold is 0.2. If the score is less than or equal to the value specified in the checks_threshold array of the POST request, the pages matching the set criteria will be flagged in the API response. An important issue for the OnPage Score.
  • description_to_content_consistency – consistency of the meta_description tag with the page content measured from 0 to 1.
  • title_to_content_consistency – consistency of the meta_title tag with the page content measured from 0 to 1.
  • charset_consistency – indicates whether a page has a meta charset tag that sets character encoding for this page. The number of such pages.
  • has_misspelling – indicates whether a page has spelling mistakes. Informative if the check_spell field was set to true in the POST request. The number of such pages.
  • spell – Hunspell spellcheck errors and suggestions. Hunspell is the spell checker of Google Chrome, OpenOffice.org, LibreOffice, Mozilla Firefox, and Thunderbird, and is also used by proprietary software packages, such as macOS, MemoQ, SDL Trados, Opera, and InDesign. This object, available in the Pages and Summary endpoints, returns the spellcheck language code, the misspelled words, and the social media tags found on the page and their content. Supported tags include but are not limited to Open Graph and Twitter Card.
  • low_readability_rate – indicates whether a page has a low readability rate, that is, scored less than 15 points on the Flesch-Kincaid readability test. The number of such pages. An important issue for the OnPage Score.
  • title_too_long – indicates whether a page has a long title, that is, whether the content of the title tag exceeds 65 characters. The number of such pages. An important issue for the OnPage Score.
  • title_too_short – indicates whether a page has a short title, that is, whether the content of the title tag is shorter than 30 characters. The number of such pages. An important issue for the OnPage Score.
  • Keyword issues and metrics. Proper keywords help you understand what users are searching for, and communicate to the search engines what a website is about. Search queries have much variance and are the cornerstone of Google’s algorithms.
  • keyword – a keyword found on the website or web page with the number of words specified when setting a task to the Keyword Density endpoint.
  • frequency – keyword frequency. The number of times a keyword appears on the website or page you specified when setting a task to the Keyword Density endpoint.
  • density – keyword density. The ratio of frequency to the total count of keywords with the set keyword length on the website or page.
  • keywords_to_content_consistency – consistency of meta-keywords tag with the page content measured from 0 to 1.
  • irrelevant_meta_keywords – indicates that a canonical page has meta keywords tags irrelevant to its content. Specified as the match rate of the page’s meta keywords to its content. The relevance threshold is 0.6. If the score is less than or equal to the value specified in the checks_threshold array of the POST request, the pages matching the set criteria will be flagged in the API response.
  • Readability indices show how difficult a text is to read after measuring its complexity. Measurable properties of texts such as word and sentence lengths, syllable counts, etc. enable measuring text complexity. It is subsequently compared to how well the reader understands it. This data is entered into a formula that predicts difficulty in different aspects. DataForSEO provides a bunch of such indices:
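
To make the idea of a readability formula concrete, here is a minimal Python sketch of one widely used index, the Flesch reading-ease score, together with the "less than 15 points" check mentioned for low_readability_rate. It is an illustration only: the function names are hypothetical, the syllable counter is a rough vowel-group heuristic for English, and it is an assumption that this formula matches what the API computes internally.

```python
import re


def count_syllables(word: str) -> int:
    """Rough English syllable estimate: count groups of consecutive vowels."""
    lowered = word.lower()
    count = len(re.findall(r"[aeiouy]+", lowered))
    # Heuristic: a trailing "e" is usually silent ("note"), except in "-le" ("table").
    if lowered.endswith("e") and not lowered.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def flesch_reading_ease(text: str) -> float:
    """Flesch reading ease: higher scores mean easier text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        return 0.0
    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)
    return 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word


def has_low_readability(text: str, threshold: float = 15.0) -> bool:
    """Mirror the 'scored less than 15 points' rule quoted in the article."""
    return flesch_reading_ease(text) < threshold
```

Short sentences made of short words score high, while long sentences packed with polysyllabic words can drop below the threshold and would be flagged.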

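The content-rate, character-count, and relevance checks listed above can be reproduced locally to see how the default thresholds behave. The following Python sketch assumes the defaults quoted in this article (0.1, 0.9, 1024, 256,000, 0.3, and 0.2); the function names and the thresholds dictionary are hypothetical and stand in for, but are not, the API's checks_threshold parameter.

```python
import re

# Default thresholds as quoted in the article; override them to experiment.
DEFAULTS = {
    "low_content_rate": 0.1,
    "high_content_rate": 0.9,
    "low_character_count": 1024,
    "high_character_count": 256_000,
    "irrelevant_title": 0.3,
    "irrelevant_description": 0.2,
}


def unique_words(text: str) -> set:
    """Lower-cased set of the words found in the text."""
    return set(re.findall(r"\w+", text.lower()))


def match_rate(tag_text: str, content: str) -> float:
    """Share of unique tag words that also occur in the page content."""
    tag_words = unique_words(tag_text)
    if not tag_words:
        return 0.0
    return len(tag_words & unique_words(content)) / len(tag_words)


def content_checks(title, description, content, page_size_bytes, thresholds=None):
    """Return a dict of boolean flags modeled on the checks described above."""
    t = {**DEFAULTS, **(thresholds or {})}
    plain_text_size = len(content.encode("utf-8"))  # size in bytes
    rate = plain_text_size / page_size_bytes if page_size_bytes else 0.0
    chars = len(content)
    return {
        "low_content_rate": rate <= t["low_content_rate"],
        "high_content_rate": rate >= t["high_content_rate"],
        "low_character_count": chars <= t["low_character_count"],
        "high_character_count": chars >= t["high_character_count"],
        "irrelevant_title": match_rate(title, content) <= t["irrelevant_title"],
        "irrelevant_description": (
            match_rate(description, content) <= t["irrelevant_description"]
        ),
    }
```

Passing a custom dictionary, for example `{"low_character_count": 300}`, shows how loosening or tightening a threshold changes which pages would be flagged.
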
Takeaway

Data-driven SEO is far from a set-and-forget process. User expectations change over time, and search engine requirements change with them. Search engines seek to deliver faster, more interactive, and highly personalized content.

With DataForSEO OnPage API, website owners can pinpoint the reasons for a drop in conversions, growing bounce rates, abandoned carts, and other signs that something no longer delivers. Our friendly customizable crawling engine enables checking websites against 120+ metrics. With reasonable prices and no need to build a complex scraper in-house, you save resources for optimization.

DataForSEO OnPage API has everything SEO agencies and software providers need to build custom tools. Run thorough website audits in-house or sell them. We’re inviting you to evaluate your websites against dozens of on-page parameters – try out DataForSEO’s APIs at no cost. And our responsive 24/7 support team is ready to assist you with this.

Nick Chernets