
How to find broken images with On-Page API?

A broken image is a reference to an image resource that fails to load, typically because the server returns a “File Not Found” error. Browsers usually display it as an icon of a ripped photo or piece of paper.

Broken images damage both user experience and SEO. Visitors who encounter broken graphics may lose trust in the website and be less likely to return. At the same time, broken images can signal to search engine bots that the website neglects user experience or has been abandoned. Ultimately, the website’s conversions and rankings can decrease.

In order to prevent such undesirable effects, website owners should regularly check their website health and fix broken images, links, and other resources.

To find broken images with DataForSEO On-Page API, you should first register and get your API key, which is required for authentication.

Learn more about Authentication in our docs >>

With the API key at hand, you can get a list of broken images in a few simple steps.

1 Set a task to On-Page API and specify the necessary website as a target. Remember to set the “load_resources” parameter to true, as it will instruct our crawler to load the discovered resources on a website, including images. If you do not enable this parameter, our crawler will not initiate resource loading and you will not be able to review broken resources.

Note that additional charges will apply. To learn more about the cost of tasks with the “load_resources” parameter enabled, please refer to this help article.

Example request.

POST: https://api.dataforseo.com/v3/on_page/task_post

[
  {
    "target": "dataforseo.com",
    "max_crawl_pages": 10,
    "load_resources": true,
    "tag": "some_string_123",
    "pingback_url": "https://your-server.com/pingscript?id=$id&tag=$tag"
  }
]
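The request above can be sketched in Python using only the standard library. This is a minimal illustration, not an official client: the helper names (build_task, build_request, send) are hypothetical, and it assumes the API accepts HTTP Basic authentication with your account credentials, so check the Authentication docs for the exact scheme. The optional "tag" and "pingback_url" fields from the example are omitted.

```python
import base64
import json
import urllib.request

TASK_POST_URL = "https://api.dataforseo.com/v3/on_page/task_post"


def build_task(target: str, max_crawl_pages: int = 10) -> list:
    """Build the task_post body; load_resources must be true to review broken images."""
    return [{
        "target": target,
        "max_crawl_pages": max_crawl_pages,
        "load_resources": True,
    }]


def build_request(login: str, password: str, body: list) -> urllib.request.Request:
    """Prepare a POST request. Assumption: HTTP Basic auth with account credentials."""
    token = base64.b64encode(f"{login}:{password}".encode()).decode()
    return urllib.request.Request(
        TASK_POST_URL,
        data=json.dumps(body).encode(),
        headers={
            "Authorization": f"Basic {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send the prepared request and decode the JSON response (performs a network call)."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Calling send(build_request(login, password, build_task("dataforseo.com"))) would submit the task; keeping the payload builder separate from the network call makes the body easy to inspect before any charges apply.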

2 After the task is set and the crawl is finished, use the task’s ID from the API response to call the Summary endpoint.

GET: https://api.dataforseo.com/v3/on_page/summary/$id

In the “page_metrics” object, find the “broken_resources” counter. A value greater than 0 is the number of broken resources our crawler found on the target website.
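Reading the counter from a decoded Summary response might look like the sketch below. The nesting used here (tasks, then result, then page_metrics) is an assumption about the response layout, and broken_resource_count is a hypothetical helper name; verify both against an actual response.

```python
def broken_resource_count(summary: dict) -> int:
    """Return page_metrics.broken_resources from a Summary response, or 0 if absent.

    Assumption: the counter sits at tasks[0].result[0].page_metrics.
    """
    for task in summary.get("tasks") or []:
        for result in task.get("result") or []:
            metrics = result.get("page_metrics") or {}
            return int(metrics.get("broken_resources") or 0)
    return 0


# Hand-written sample mimicking the assumed layout (not real API output).
sample_summary = {
    "tasks": [
        {"result": [{"page_metrics": {"broken_resources": 3}}]}
    ]
}

if broken_resource_count(sample_summary) > 0:
    print("Broken resources found; query the Resources endpoint next.")
```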

3 To extract the list of broken images, call the Resources endpoint. In the body of your request, specify the task’s ID and use a filter as in the example below.

POST: https://api.dataforseo.com/v3/on_page/resources

[
  {
    "id": "07281559-0695-0216-0000-c269be8b7592",
    "filters": [
      [
        "resource_type",
        "=",
        "image"
      ],
      "and",
      [
        [
          "checks.is_broken",
          "=",
          true
        ],
        "or",
        [
          "checks.is_4xx",
          "=",
          true
        ],
        "or",
        [
          "checks.is_5xx",
          "=",
          true
        ]
      ]
    ]
  }
]
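The nested filter above is easy to mistype by hand, so it may help to assemble it in code. The sketch below rebuilds the same structure in Python; any_check_true and broken_image_request are hypothetical helper names, while the field names and operators are taken from the example.

```python
import json

# Check fields used in the example filter above.
BROKEN_CHECKS = ("checks.is_broken", "checks.is_4xx", "checks.is_5xx")


def any_check_true(fields) -> list:
    """Join [field, "=", True] conditions with "or" into one filter group."""
    group = []
    for field in fields:
        if group:
            group.append("or")
        group.append([field, "=", True])
    return group


def broken_image_request(task_id: str) -> list:
    """Build the Resources request body: images AND (broken OR 4xx OR 5xx)."""
    return [{
        "id": task_id,
        "filters": [
            ["resource_type", "=", "image"],
            "and",
            any_check_true(BROKEN_CHECKS),
        ],
    }]


# json.dumps turns Python True into JSON true, matching the example body.
print(json.dumps(broken_image_request("07281559-0695-0216-0000-c269be8b7592"), indent=2))
```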

The results of the task will contain a list of defunct images found on the target site. Besides the URLs to the images, you will get their metadata, including alt text and dimensions.


Alternatively, you can first get a list of pages that contain a broken resource, and then review broken images and/or other broken resources for specific URLs.


To do so, after step 2, set a task to the Pages endpoint with a filter as in the example below.

POST: https://api.dataforseo.com/v3/on_page/pages

[
  {
    "id": "07281559-0695-0216-0000-c269be8b7592",
    "filters": [
      [
        "resource_type",
        "=",
        "html"
      ],
      "and",
      [
        "broken_resources",
        "=",
        true
      ]
    ]
  }
]
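Once the Pages response is decoded, the page URLs can be collected for the next step. The following is a sketch under the assumption that each returned page appears as an entry with a "url" field inside tasks, result, items; the helper name pages_with_broken_resources is hypothetical.

```python
def pages_with_broken_resources(response: dict) -> list:
    """Collect page URLs from a Pages response.

    Assumption: pages are listed at tasks[].result[].items[], each with a "url".
    """
    urls = []
    for task in response.get("tasks") or []:
        for result in task.get("result") or []:
            for item in result.get("items") or []:
                url = item.get("url")
                if url:
                    urls.append(url)
    return urls


# Hand-written sample mimicking the assumed layout (not real API output).
sample_pages = {
    "tasks": [{
        "result": [{
            "items": [
                {"resource_type": "html", "url": "https://dataforseo.com/"},
                {"resource_type": "html", "url": "https://dataforseo.com/apis"},
            ]
        }]
    }]
}

for url in pages_with_broken_resources(sample_pages):
    print(url)
```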

The results of the task will contain the list of pages with broken resources. You can then use the URLs of the returned pages when setting tasks to the Resources endpoint. For example, a task like the one below will return the broken images, stylesheets, scripts, fonts, and other elements discovered on the page you specify.

POST: https://api.dataforseo.com/v3/on_page/resources

[
  {
    "id": "07281559-0695-0216-0000-c269be8b7592",
    "filters": [
      [
        "resource_type",
        "=",
        "broken"
      ],
      "or",
      [
        [
          "checks.is_broken",
          "=",
          true
        ],
        "or",
        [
          "checks.is_4xx",
          "=",
          true
        ],
        "or",
        [
          "checks.is_5xx",
          "=",
          true
        ]
      ]
    ]
  }
]
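To review several pages in turn, one request body per page can be generated from the filter above. Note the assumption in this sketch: the page is selected with a "url" field in the request body, which does not appear in the example above, so confirm the field name in the Resources endpoint docs before relying on it. The helper name per_page_requests is hypothetical.

```python
# Filter copied from the example above: any resource flagged as broken.
BROKEN_FILTER = [
    ["resource_type", "=", "broken"],
    "or",
    [
        ["checks.is_broken", "=", True],
        "or",
        ["checks.is_4xx", "=", True],
        "or",
        ["checks.is_5xx", "=", True],
    ],
]


def per_page_requests(task_id: str, page_urls: list) -> list:
    """Build one Resources request body per page.

    Assumption: a "url" field in the body limits results to that page.
    """
    return [
        [{"id": task_id, "url": url, "filters": BROKEN_FILTER}]
        for url in page_urls
    ]


bodies = per_page_requests(
    "07281559-0695-0216-0000-c269be8b7592",
    ["https://dataforseo.com/", "https://dataforseo.com/apis"],
)
print(len(bodies), "request bodies prepared")
```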

Recommended read:

How to find broken links with On-Page API?
How to find broken (4xx/5xx) pages?
What are broken backlinks/broken pages?
