How to get the HTML of a page?
HTML data can help you build advanced website audit solutions.
To receive the raw HTML of a page with the DataForSEO OnPage API, add the store_raw_html field to the body of the OnPage API Task POST request and set it to true.
Example:
[
  {
    "target": "dataforseo.com",
    "max_crawl_pages": 10,
    "store_raw_html": "true"
  }
]
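For readers who script this step, here is a minimal Python sketch of the same Task POST request, using only the standard library. The endpoint URL, the Basic authentication scheme, the placeholder credentials, and the helper name build_task_post_request are assumptions made for illustration, so check them against the DataForSEO documentation before relying on them.

```python
import base64
import json
import urllib.request

# Assumed endpoint URL; verify it against the DataForSEO documentation.
TASK_POST_URL = "https://api.dataforseo.com/v3/on_page/task_post"


def build_task_post_request(login, password, target, max_crawl_pages=10):
    """Build (but do not send) a Task POST request that asks to store raw HTML."""
    payload = [
        {
            "target": target,
            "max_crawl_pages": max_crawl_pages,
            "store_raw_html": "true",  # same value as in the example above
        }
    ]
    # Basic authentication is assumed here; login and password are placeholders.
    token = base64.b64encode(f"{login}:{password}".encode()).decode()
    return urllib.request.Request(
        TASK_POST_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Basic {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_task_post_request("login", "password", "dataforseo.com")

# Sending is left commented out so the sketch runs without network access.
# Reading the task ID from tasks[0]["id"] is an assumption about the reply layout.
# with urllib.request.urlopen(request) as response:
#     task_id = json.load(response)["tasks"][0]["id"]
```

Building the request separately from sending it makes it easy to inspect the body before any credits are spent.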
Once the task is created, copy its ID and call the Raw HTML endpoint.
In the request body, specify the ID of your task and the URL of the page whose HTML you want to receive.
Example:
[
  {
    "id": "09161530-2692-0216-0000-e429a13680de",
    "url": "https://dataforseo.com/apis"
  }
]
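The Raw HTML call can be sketched in Python along the same lines. The snippet below builds the request body and reads the html field from a parsed reply. The endpoint URL and both helper names are assumptions for illustration, and the field names follow the sample response shown in this article.

```python
import json

# Assumed endpoint URL; verify it against the DataForSEO documentation.
RAW_HTML_URL = "https://api.dataforseo.com/v3/on_page/raw_html"


def build_raw_html_body(task_id, page_url):
    """Serialize the Raw HTML request body for one task ID and one page URL."""
    return json.dumps([{"id": task_id, "url": page_url}])


def extract_html(response):
    """Return the page HTML from a parsed reply, or None if it is not available yet."""
    tasks = response.get("tasks") or []
    if not tasks:
        return None
    results = tasks[0].get("result") or []
    if not results or results[0].get("crawl_progress") != "finished":
        return None  # crawl still running, or no result returned
    items = results[0].get("items") or {}
    return items.get("html")


body = build_raw_html_body(
    "09161530-2692-0216-0000-e429a13680de",
    "https://dataforseo.com/apis",
)

# A trimmed-down reply with the same shape as the sample response in this article.
sample_reply = {
    "tasks": [
        {
            "result": [
                {
                    "crawl_progress": "finished",
                    "items": {"html": "<!DOCTYPE html></html>"},
                }
            ]
        }
    ]
}
print(extract_html(sample_reply))  # prints <!DOCTYPE html></html>
```

Returning None while crawl_progress is anything other than "finished" lets a caller retry later instead of treating an unfinished crawl as an error.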
Once the crawl is finished, the Raw HTML endpoint will provide you with HTML data of the page you specified in the url field.
{
  "version": "0.1.20210917",
  "status_code": 20000,
  "status_message": "Ok.",
  "time": "0.1204 sec.",
  "cost": 0,
  "tasks_count": 1,
  "tasks_error": 0,
  "tasks": [
    {
      "id": "10051429-2806-0216-0000-74f7a98536e8",
      "status_code": 20000,
      "status_message": "Ok.",
      "time": "0.0581 sec.",
      "cost": 0,
      "result_count": 1,
      "path": [
        "v3",
        "on_page",
        "raw_html"
      ],
      "data": {
        "api": "on_page",
        "function": "raw_html",
        "url": "https://dataforseo.com/apis/serp-api",
        "target": "dataforseo.com",
        "max_crawl_pages": 10,
        "store_raw_html": "true"
      },
      "result": [
        {
          "crawl_progress": "finished",
          "crawl_status": {
            "max_crawl_pages": 10,
            "pages_in_queue": 0,
            "pages_crawled": 10
          },
          "items_count": 1,
          "items": {
            "html": "<!DOCTYPE html></html>"
          }