Best practices for handling OnPage API requests

admin

2 years ago

Handling low-volume OnPage API payload

From the user's perspective, a low-volume payload means occasional or periodic use of the API, with up to a few thousand requests a day, or one-time data collection.

Setting API tasks

We advise setting several tasks at once: a single Task Post request can contain up to 100 tasks.
You don’t have to use callbacks, such as pingbacks and postbacks.
After setting a task, make sure that it was actually set. The response for a successfully set task contains the following values:

	"status_code": 20100,
	"status_message": "Task Created.",

With such a task, you can proceed further.
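The batching and confirmation steps above can be sketched as follows. This is a minimal sketch rather than an official client: it assumes the decoded response body carries a `tasks` list whose entries each have `id`, `status_code`, and `status_message`, and the helper names `chunk_tasks` and `split_by_status` are hypothetical. No network call is made here; pass in the JSON response you have already decoded.

```python
from typing import Any, Dict, Iterator, List, Tuple

MAX_TASKS_PER_REQUEST = 100  # limit stated above for a single Task Post request
TASK_CREATED = 20100         # status_code of a successfully set task


def chunk_tasks(tasks: List[Dict[str, Any]],
                size: int = MAX_TASKS_PER_REQUEST) -> Iterator[List[Dict[str, Any]]]:
    """Yield consecutive batches of at most `size` tasks."""
    if not 1 <= size <= MAX_TASKS_PER_REQUEST:
        raise ValueError("batch size must be between 1 and 100")
    for start in range(0, len(tasks), size):
        yield tasks[start:start + size]


def split_by_status(response: Dict[str, Any]) -> Tuple[List[dict], List[dict]]:
    """Separate created tasks from tasks that returned any other status.

    Assumption: response["tasks"] is a list of per-task result objects.
    """
    created: List[dict] = []
    failed: List[dict] = []
    for task in response.get("tasks", []):
        if task.get("status_code") == TASK_CREATED:
            created.append(task)
        else:
            failed.append(task)
    return created, failed
```

Only the tasks in the first list should move on to result retrieval; the second list feeds whatever error handling you choose.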
If the task returns an error, it is not processed further. In this case, your code should re-set the task after some time. Alternatively, you can label such tasks in the database for further processing. If the error originates on the API server, we recommend re-setting the failed task in a few minutes, for example:

	"status_code": 50000,
	"status_message": "Internal Error.",

If the error refers to the validation of input parameters, check the payload submitted in the POST request:

	"status_code": 40501,
	"status_message": "Invalid Field",

Also, refer to the Errors endpoint to get a full list of possible errors.
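One way to turn the error handling above into code is a small classifier that decides what to do with each task. The sketch below rests on assumptions: the text names only `20100`, `50000`, and `40501`, so treating every code from 50000 upward as a retryable server error and every 40000-range code as a payload problem is a guess that should be checked against the Errors endpoint. The function names and the five-minute delay are hypothetical.

```python
from datetime import datetime, timedelta

TASK_CREATED = 20100


def classify_status(status_code: int) -> str:
    """Map a task status_code to 'proceed', 'retry', 'fix_payload', or 'review'.

    Assumption: 5xxxx codes are server-side and 4xxxx codes are client-side.
    """
    if status_code == TASK_CREATED:
        return "proceed"
    if status_code >= 50000:
        return "retry"
    if 40000 <= status_code < 50000:
        return "fix_payload"
    return "review"  # anything unexpected is labeled for manual processing


def next_attempt(now: datetime, delay_minutes: int = 5) -> datetime:
    """Return when a failed task should be set again ("in a few minutes")."""
    return now + timedelta(minutes=delay_minutes)
```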

We have provided the tag string parameter in the POST request for your convenience. Use it to attach additional information to a task and, consequently, make the retrieval of task results more convenient.
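As an illustration of the tag field, the sketch below packs two labels into one string and recovers them later. The `customer:batch` layout, the separator, and the helper names are hypothetical; the text only says that tag is a string you can use for your own information.

```python
def make_tag(customer_id: str, batch_id: str, sep: str = ":") -> str:
    """Join two labels into a single tag string."""
    if sep in customer_id or sep in batch_id:
        raise ValueError("labels must not contain the separator")
    return f"{customer_id}{sep}{batch_id}"


def parse_tag(tag: str, sep: str = ":") -> dict:
    """Split a tag created by make_tag back into its labels."""
    customer_id, batch_id = tag.split(sep, 1)
    return {"customer_id": customer_id, "batch_id": batch_id}
```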

Retrieving API task results

A single-threaded worker service should be launched on your server. It makes requests to the Tasks Ready endpoint, obtains the ids of completed tasks, and collects and processes the results. How often to launch the worker depends on the number of tasks and on the number of pages to be scanned. For example, if you set 1000 tasks with different domains and 100 pages to scan in each task, launching the worker once every 15-30 minutes is sufficient. If the number of pages to be scanned is larger, crawling takes longer, so the interval between calls to the Tasks Ready endpoint should be increased, for example, to 30-60 minutes.
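The single-threaded worker described above could look like the sketch below. The three callables (`fetch_ready_ids`, `fetch_result`, `process`) are hypothetical stand-ins for your own code that calls Tasks Ready, retrieves a task's results, and handles them; they are injected so the loop itself makes no assumptions about the HTTP layer. The 15-minute default mirrors the example interval above.

```python
import time
from typing import Any, Callable, Iterable, Optional


def run_once(fetch_ready_ids: Callable[[], Iterable[str]],
             fetch_result: Callable[[str], Any],
             process: Callable[[str, Any], None]) -> int:
    """Collect and process every task that is currently ready; return the count."""
    handled = 0
    for task_id in fetch_ready_ids():
        process(task_id, fetch_result(task_id))
        handled += 1
    return handled


def run_forever(fetch_ready_ids: Callable[[], Iterable[str]],
                fetch_result: Callable[[str], Any],
                process: Callable[[str, Any], None],
                interval_seconds: float = 15 * 60,
                max_cycles: Optional[int] = None) -> None:
    """Call run_once repeatedly, sleeping between cycles.

    max_cycles exists only so the loop can be bounded in tests.
    """
    cycle = 0
    while max_cycles is None or cycle < max_cycles:
        run_once(fetch_ready_ids, fetch_result, process)
        cycle += 1
        if max_cycles is None or cycle < max_cycles:
            time.sleep(interval_seconds)
```

For larger crawls, pass a longer `interval_seconds`, for example `30 * 60` or `60 * 60`.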

If you need to collect results from several endpoints:

As described above, one worker is launched to receive the ids of completed tasks, but it only marks those tasks as completed in your database.
Another worker takes the ids of completed tasks from the database and calls the endpoints needed to collect the information.
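The two-worker split above can be sketched with a small SQLite table. Everything here is illustrative: the `tasks` table, its columns, the status labels, and the `collectors` mapping of endpoint names to callables are hypothetical, and SQLite merely stands in for whatever database you use.

```python
import json
import sqlite3
from typing import Any, Callable, Dict, Iterable


def init_db(conn: sqlite3.Connection) -> None:
    """Create the hypothetical tasks table if it does not exist yet."""
    conn.execute("""CREATE TABLE IF NOT EXISTS tasks (
        id TEXT PRIMARY KEY, status TEXT NOT NULL, results TEXT)""")


def mark_completed(conn: sqlite3.Connection, ready_ids: Iterable[str]) -> None:
    """Worker 1: only record that the tasks are complete."""
    conn.executemany(
        "UPDATE tasks SET status = 'completed' WHERE id = ? AND status = 'pending'",
        [(task_id,) for task_id in ready_ids])
    conn.commit()


def collect_results(conn: sqlite3.Connection,
                    collectors: Dict[str, Callable[[str], Any]]) -> int:
    """Worker 2: query each needed endpoint for every completed task."""
    rows = conn.execute("SELECT id FROM tasks WHERE status = 'completed'").fetchall()
    for (task_id,) in rows:
        # One entry per endpoint; each callable wraps your own API request.
        results = {name: fetch(task_id) for name, fetch in collectors.items()}
        conn.execute(
            "UPDATE tasks SET status = 'collected', results = ? WHERE id = ?",
            (json.dumps(results), task_id))
    conn.commit()
    return len(rows)
```

Keeping the first worker this small means a slow or failing result endpoint never delays the bookkeeping of which tasks are finished.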

Handling high-volume OnPage API payload

A high-volume payload means frequent use of the API with many thousands of requests a day, or periodic use that collects large volumes of data.

Setting API tasks

Your server should be configured to communicate with external services, API services in particular.
We strongly advise setting 50-100 tasks in every single Task Post request.
You should be using pingback-type callbacks.
After setting a task, make sure that it was actually set. The response for a successfully set task contains the following values:

	"status_code": 20100,
	"status_message": "Task Created.",

With such a task, you can proceed further. We recommend storing the id of each task in the database. This way, you will be able to see the id of each task and when it was submitted for processing.
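A minimal sketch of recording task ids together with their submission time, assuming SQLite as the database. The table name `submitted_tasks`, its columns, and the helper names are hypothetical.

```python
import sqlite3
from datetime import datetime, timezone
from typing import Iterable, List, Optional


def init_submitted_table(conn: sqlite3.Connection) -> None:
    """Create the hypothetical table that tracks submitted tasks."""
    conn.execute("""CREATE TABLE IF NOT EXISTS submitted_tasks (
        id TEXT PRIMARY KEY, submitted_at TEXT NOT NULL)""")


def store_task_ids(conn: sqlite3.Connection, task_ids: Iterable[str],
                   submitted_at: Optional[datetime] = None) -> None:
    """Save each id with a UTC timestamp; ids already stored are left untouched."""
    stamp = (submitted_at or datetime.now(timezone.utc)).isoformat()
    conn.executemany(
        "INSERT OR IGNORE INTO submitted_tasks (id, submitted_at) VALUES (?, ?)",
        [(task_id, stamp) for task_id in task_ids])
    conn.commit()


def submitted_before(conn: sqlite3.Connection, cutoff: datetime) -> List[str]:
    """Return ids submitted before `cutoff`, oldest first.

    ISO-8601 strings compare correctly as text only when every timestamp
    uses the same UTC offset, so pass timezone-aware UTC datetimes throughout.
    """
    rows = conn.execute(
        "SELECT id FROM submitted_tasks WHERE submitted_at < ? "
        "ORDER BY submitted_at, id",
        (cutoff.isoformat(),))
    return [row[0] for row in rows]
```

A query such as `submitted_before` makes it easy to spot tasks that were submitted long ago and have not yet reported completion.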

If the task returns an error, it is not processed further. In this case, your code should re-set the task after some time. Alternatively, you can label such tasks in the database for further processing. If the error originates in the API service, we recommend re-setting the failed task in a few minutes, for example:

	"status_code": 50000,
	"status_message": "Internal Error.",

Also, refer to the Errors endpoint to get a full list of possible errors.

We have provided the tag string parameter in the POST request for your convenience. Use it to attach additional information to a task and, consequently, make the retrieval of task results more convenient. The value of the tag field is also substituted into the $tag variable if the &tag=$tag parameter is specified in the callback URL. The tag parameter can therefore be used to pass labels and implement the logic of your software solution.
In the callback URL, specify ?id=$id, and our system will substitute the id of the completed task into the $id variable.
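To illustrate the placeholders, the sketch below builds a callback URL containing `$id` and `$tag` and attaches it to a task. `pingback_url` is assumed to be the name of the task field that carries the callback URL; verify it against the Task Post documentation. The base URL and helper names are hypothetical. The placeholders are appended as literal text, on the assumption that a percent-encoded `$` would not be recognized for substitution.

```python
from typing import Any, Dict


def build_pingback_url(base_url: str, with_tag: bool = True) -> str:
    """Return base_url with ?id=$id (and optionally &tag=$tag) appended literally."""
    if "?" in base_url:
        raise ValueError("base_url must not already contain a query string")
    url = f"{base_url}?id=$id"
    if with_tag:
        url += "&tag=$tag"
    return url


def with_pingback(task: Dict[str, Any], base_url: str, tag: str = "") -> Dict[str, Any]:
    """Return a copy of the task with the callback URL and optional tag set."""
    updated = dict(task)
    # "pingback_url" is an assumed field name; confirm it in the API reference.
    updated["pingback_url"] = build_pingback_url(base_url, with_tag=bool(tag))
    if tag:
        updated["tag"] = tag
    return updated
```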

Retrieving API task results

A worker service that communicates with a database has to be launched on your server.

1. Your web server receives a GET request in which the id parameter is the id of the completed task.
2. The completed task is labeled accordingly in your database.
3. The worker takes the id of the completed task from the database and calls the endpoints needed to collect the information.
4. After all the previous steps are completed, you can process the retrieved results.
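Steps 1 to 3 can be sketched as a handler that parses the incoming GET request and a worker that drains the labeled tasks. The table layout, the `collected` flag, and the function names are hypothetical, and `fetch_results` stands in for your own code that calls the endpoints you need. The handler does no API work itself, so it can answer the callback quickly.

```python
import sqlite3
from typing import Any, Callable, List, Optional, Tuple
from urllib.parse import parse_qs, urlsplit


def init_db(conn: sqlite3.Connection) -> None:
    """Create the hypothetical table of tasks reported as completed."""
    conn.execute("""CREATE TABLE IF NOT EXISTS completed_tasks (
        id TEXT PRIMARY KEY, tag TEXT, collected INTEGER NOT NULL DEFAULT 0)""")


def handle_pingback(conn: sqlite3.Connection, request_path: str) -> Optional[str]:
    """Steps 1-2: read id (and tag) from the query string and label the task."""
    query = parse_qs(urlsplit(request_path).query)
    task_id = query.get("id", [None])[0]
    if not task_id:
        return None  # not a valid callback; nothing to label
    tag = query.get("tag", [None])[0]
    conn.execute(
        "INSERT OR IGNORE INTO completed_tasks (id, tag) VALUES (?, ?)",
        (task_id, tag))
    conn.commit()
    return task_id


def drain_completed(conn: sqlite3.Connection,
                    fetch_results: Callable[[str], Any]) -> List[Tuple[str, Any]]:
    """Step 3: collect results for every labeled task not yet collected."""
    rows = conn.execute(
        "SELECT id FROM completed_tasks WHERE collected = 0").fetchall()
    collected = []
    for (task_id,) in rows:
        collected.append((task_id, fetch_results(task_id)))
        conn.execute(
            "UPDATE completed_tasks SET collected = 1 WHERE id = ?", (task_id,))
    conn.commit()
    return collected
```

The list returned by `drain_completed` is the input to step 4, where the retrieved results are processed.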

Additional recommendations for using webhooks

Watch a video with a detailed explanation of how to use pingbacks and postbacks.
If your web server responded with an error, you can get a list of such tasks using the Tasks Ready endpoint, but only if you didn’t collect the task result separately from the Task Get endpoint. Once your server is back in operation, you can re-send the callbacks for these tasks using the Webhook Resend endpoint.