OnPage API Resources
This endpoint will provide you with a list of resources, including images, scripts, stylesheets, and broken elements.
Authorizations
Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
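As a minimal sketch of the header described above, the helper below builds the request headers. The function name `auth_headers` and the JSON content type are assumptions made for illustration; only the `Bearer <token>` form comes from this page.

```python
def auth_headers(token: str) -> dict[str, str]:
    """Build HTTP headers for a Bearer-authenticated request."""
    if not token:
        raise ValueError("token must be a non-empty string")
    return {
        # Form stated on this page: "Bearer <token>".
        "Authorization": f"Bearer {token}",
        # Assumption: the request body is sent as JSON.
        "Content-Type": "application/json",
    }
```

The returned dictionary can be passed as the headers argument of whichever HTTP client you use.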
Body
ID of the task. Required field. You can get this ID in the response of the Task POST endpoint. Example: “07131248-1535-0216-1000-17384017ad04”
Page URL. Optional field. Specify this field if you want to get the resources for a specific page. Note that to obtain a resource’s meta from a particular URL, you should specify the URL in this field; if you do not indicate a URL when setting a task, the resource’s meta in the results will be returned based on the data from the page where our crawler first saw the resource.
The maximum number of returned resources. Optional field. Default value: 100. Maximum value: 1000.
Offset in the results array of returned resources. Optional field. Default value: 0. If you specify the value 10, the first ten resources in the results array will be omitted and the data will be provided for the subsequent resources.
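The limit and offset parameters above can be combined to walk through a large result set in fixed-size windows. The sketch below only computes the windows. The key names `limit` and `offset` and the helper name `page_windows` are assumptions, since the parameter names are not shown on this page.

```python
from typing import Iterator

MAX_LIMIT = 1000  # maximum value stated above


def page_windows(total: int, limit: int = 100) -> Iterator[dict[str, int]]:
    """Yield limit/offset pairs that together cover `total` results."""
    if not 1 <= limit <= MAX_LIMIT:
        raise ValueError(f"limit must be between 1 and {MAX_LIMIT}")
    for offset in range(0, total, limit):
        # Each window skips `offset` resources and returns up to `limit`.
        yield {"limit": limit, "offset": offset}
```

For example, 250 results with the default limit of 100 produce offsets 0, 100, and 200.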
Array of results filtering parameters. Optional field. You can add several filters at once (8 filters maximum). You should set a logical operator (and, or) between the conditions. The following operators are supported: regex, not_regex, <, <=, >, >=, =, <>, in, not_in, like, not_like. You can use the % operator with like and not_like to match any string of zero or more characters. Examples: ["resource_type","=","stylesheet"]; [["resource_type","=","image"],"and",["checks.is_https","=",false]]; [["fetch_timing.duration_time",">",1],"and",[["total_transfer_size",">",100],"or",["checks.high_loading_time","=",true]]]. The full list of possible filters is available by this link.
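The filter rules above can be checked locally before a request is sent. The following is a sketch under stated assumptions, not an official validator: the helper names are hypothetical, the operator set assumes the comparison operators are `<`, `<=`, `>`, `>=`, `=`, and `<>`, and the "8 filters" limit is interpreted as the total number of conditions, including nested ones.

```python
OPERATORS = {"regex", "not_regex", "<", "<=", ">", ">=", "=", "<>",
             "in", "not_in", "like", "not_like"}
CONNECTIVES = {"and", "or"}
MAX_FILTERS = 8  # limit stated above


def count_conditions(expr) -> int:
    """Count [field, operator, value] conditions in a filter expression."""
    if not isinstance(expr, (list, tuple)) or not expr:
        raise ValueError(f"expected a non-empty list, got {expr!r}")
    # A single condition looks like [field, operator, value].
    if len(expr) == 3 and isinstance(expr[0], str):
        operator = expr[1]
        if not isinstance(operator, str) or operator not in OPERATORS:
            raise ValueError(f"unsupported operator: {operator!r}")
        return 1
    # Otherwise: conditions or nested groups joined by "and" / "or".
    if len(expr) % 2 == 0:
        raise ValueError("a group must alternate conditions and connectives")
    total = 0
    for index, part in enumerate(expr):
        if index % 2 == 1:
            if not isinstance(part, str) or part not in CONNECTIVES:
                raise ValueError(f"expected 'and' or 'or', got {part!r}")
        else:
            total += count_conditions(part)
    return total


def validate_filters(expr):
    """Return expr unchanged if it is well formed and within the limit."""
    if count_conditions(expr) > MAX_FILTERS:
        raise ValueError(f"no more than {MAX_FILTERS} filters are allowed")
    return expr
```

Applied to the three examples above, `count_conditions` reports 1, 2, and 3 conditions respectively.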
Filter the resources by relevant pages. Optional field. You can use this field to obtain resources from pages matching the defined parameters. You can apply the same filters here as are available for the pages endpoint. You can add several filters at once (8 filters maximum). You should set a logical operator (and, or) between the conditions. The following operators are supported: regex, not_regex, <, <=, >, >=, =, <>, in, not_in, like, not_like. You can use the % operator with like and not_like to match any string of zero or more characters. Example: ["checks.no_image_title","=",true]
Results sorting rules. Optional field. You can use the same values as in the filters array to sort the results. Possible sorting types: asc (results will be sorted in ascending order), desc (results will be sorted in descending order). You should use a comma to set up a sorting type. Example: ["size,desc"]. Note that you can set no more than three sorting rules in a single request. You should use a comma to separate several sorting rules. Example: ["size,desc","fetch_timing.fetch_end,desc"]
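The sorting rules above are plain "field,direction" strings, so they are easy to build and check in code. This is an illustrative sketch; the helper names `sort_rule` and `build_order_by` are hypothetical.

```python
MAX_SORT_RULES = 3  # limit stated above
DIRECTIONS = ("asc", "desc")


def sort_rule(field: str, direction: str = "asc") -> str:
    """Format one rule as "field,direction", for example "size,desc"."""
    if direction not in DIRECTIONS:
        raise ValueError(f"direction must be one of {DIRECTIONS}")
    return f"{field},{direction}"


def build_order_by(rules: list[tuple[str, str]]) -> list[str]:
    """Build the sorting array from (field, direction) pairs."""
    if len(rules) > MAX_SORT_RULES:
        raise ValueError(f"no more than {MAX_SORT_RULES} sorting rules are allowed")
    return [sort_rule(field, direction) for field, direction in rules]
```

Calling `build_order_by([("size", "desc"), ("fetch_timing.fetch_end", "desc")])` reproduces the second example above.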
Token for subsequent requests. Optional field. Provided in the identical field of the response to each request. Use this parameter to avoid timeouts while trying to obtain over 20,000 results in a single request. By specifying the unique search_after_token value from the response array, you will get the subsequent results of the initial task. search_after_token values are unique for each subsequent task. Note: if search_after_token is specified in the request, all other parameters should be identical to the previous request.
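The token-based paging described above can be sketched as a loop that resends the same payload with only `search_after_token` changed. The `fetch` callable is hypothetical: it stands in for your own HTTP call and is assumed to return the items of one response together with the token found in it. Stopping when no token or no items come back is also an assumption, since this page does not say how the last page is signalled.

```python
from typing import Any, Callable, Iterator, Optional

# Hypothetical transport: takes a request payload, returns (items, token).
Fetch = Callable[[dict[str, Any]], tuple[list[Any], Optional[str]]]


def iter_all_items(fetch: Fetch, payload: dict[str, Any]) -> Iterator[Any]:
    """Yield items from successive requests chained by search_after_token."""
    request = dict(payload)
    while True:
        items, token = fetch(request)
        yield from items
        if not token or not items:
            break  # assumption: nothing more to fetch
        # All other parameters stay identical to the previous request.
        request = {**payload, "search_after_token": token}
```

Keeping the original payload untouched and layering the token on top of it is one simple way to satisfy the requirement that every other parameter match the previous request.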
User-defined task identifier. Optional field. The character limit is 255. You can use this parameter to identify the task and match it with the result. You will find the specified tag value in the data object of the response.
Response
Successful response
the current version of the API
General status code. You can find the full list of the response codes here. Note: we strongly recommend designing a system for handling related exceptional or error conditions.
General informational message. You can find the full list of general informational messages here.
execution time, seconds
total tasks cost, USD
the number of tasks in the tasks array
the number of tasks in the tasks array returned with an error
array of tasks
Task identifier. Unique task identifier in our system in the UUID format.
Status code of the task. Generated by DataForSEO; can be within the following range: 10000-60000. You can find the full list of the response codes here.
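Since every task carries its own status code, a response is usually checked task by task. The sketch below collects the tasks that did not succeed. The key names `tasks`, `id`, `status_code`, and `status_message`, and the use of 20000 as the success code, are assumptions here, because this page does not show the field names or list the codes.

```python
from typing import Any

OK_STATUS = 20000  # assumption: success code, not stated on this page


def failed_tasks(response: dict[str, Any]) -> list[tuple[Any, Any, Any]]:
    """Return (id, status_code, status_message) for each unsuccessful task."""
    failures = []
    for task in response.get("tasks") or []:
        code = task.get("status_code")
        if code != OK_STATUS:
            failures.append((task.get("id"), code, task.get("status_message")))
    return failures
```

An empty list means every task in the response reported the assumed success code.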
Informational message of the task. You can find the full list of general informational messages here.
execution time, seconds
cost of the task, USD
number of elements in the result array
URL path
contains the same parameters that you specified in the POST request
array of results
Status of the crawling session. Possible values: in_progress, finished.
Details of the crawling session.
Maximum number of pages to crawl. Indicates the max_crawl_pages limit you specified when setting a task.
number of pages that are currently in the crawling queue
number of crawled pages
total number of relevant items crawled
number of items in the results array
items array
Type of the returned resource. Possible types: script, image, stylesheet, broken.
Resource properties. The value depends on the resource_type. Note that if you do not indicate a URL when setting a task, the resource’s meta is returned based on the data from the page where our crawler first saw the resource; to obtain a resource’s meta from a particular URL, specify that URL when setting a task.
Content of the image alt attribute. The value depends on the resource_type.
title
original image width in px
original image height in px
image width in px
image height in px
status code of the page where a given resource is located
Location header. Indicates the URL to redirect a page to.
resource URL
Resource size. Indicates the size of a given resource measured in bytes.
Resource size after encoding. Indicates the size of the encoded resource measured in bytes.
Compressed resource size. Indicates the compressed size of a given resource in bytes.
Date and time when a resource was fetched, in the UTC format: “yyyy-mm-dd hh-mm-ss +00:00”. Example: 2021-02-17 13:54:15 +00:00
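Timestamps in this form can be turned into timezone-aware values with the standard library. The sketch below follows the example value, which separates the time parts with colons, and not the hyphenated pattern shown in the format string. The function name `parse_fetch_time` is hypothetical.

```python
from datetime import datetime, timezone

# Matches values such as "2021-02-17 13:54:15 +00:00".
FETCH_TIME_FORMAT = "%Y-%m-%d %H:%M:%S %z"


def parse_fetch_time(value: str) -> datetime:
    """Parse a fetch timestamp into a timezone-aware UTC datetime."""
    parsed = datetime.strptime(value, FETCH_TIME_FORMAT)
    # Normalize to UTC in case a non-zero offset is ever returned.
    return parsed.astimezone(timezone.utc)
```

The same helper should apply to the other timestamps in the response that use this format, as long as they are checked for null first.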
Resource fetching time range.
Indicates how many milliseconds it took to fetch a resource.
Time to start downloading the resource. The amount of time a browser needs to start downloading a resource.
Time to complete downloading the resource. The amount of time a browser needs to complete downloading a resource.
instructions for caching
indicates whether the resource is cacheable
Time to live. The amount of time it takes for the browser to cache a resource; measured in milliseconds.
Resource check-ups. Contents of the array depend on the resource_type.
Resource with no content encoding. Indicates whether a page has no content compression algorithm; available for items with the following resource_type: script, image, stylesheet, broken.
Resource with high loading time. Indicates whether a resource loading time exceeds 3 seconds; available for items with the following resource_type: script, image, stylesheet, broken.
Resource with redirects. Indicates whether a page with this resource has 3XX redirects to other pages; available for items with the following resource_type: script, image, stylesheet, broken.
Resource with 4xx status code. Indicates whether a page with this resource has a 4XX response code.
Resource with 5xx status code. Indicates whether a page with this resource has a 5XX response code.
Broken resource. Indicates whether a page with this resource returns 4xx, 5xx response codes or has broken elements inside the resource; available for items with the following resource_type: script, image, stylesheet, broken.
Page with www. Indicates whether a page with this resource is on a www subdomain; available for items with the following resource_type: script, image, stylesheet, broken.
Page with the https protocol. Available for items with the following resource_type: script, image, stylesheet, broken.
Page with the http protocol. Available for items with the following resource_type: script, image, stylesheet, broken.
Image displayed in its original size. Indicates whether the image is displayed in its original size; available for items with the following resource_type: image.
Resource is minified. Indicates whether the content of a stylesheet or script is minified; available for items with the following resource_type: stylesheet, script.
Resource has a redirect. Available for items with the following resource_type: script, image. If the resource_type is image, this field will indicate whether other pages and/or resources have redirects pointing at the image; if the resource_type is script, this field will indicate whether the script contains a redirect.
Resource contains subrequests. Indicates whether the content of a stylesheet or script contains additional requests; available for items with the following resource_type: stylesheet, script.
Resource was found on the website’s sitemap. If true, the resource was found on the sitemap of the website.
resource errors and warnings
resource errors
line where the error was found
column where the error was found
Text message of the error. The full list of possible HTML errors can be found here.
Status code of the error. Possible values: 0: Unidentified Error; 501: Html Parse Error; 1501: JS Parse Error; 2501: CSS Parse Error; 3501: Image Parse Error; 3502: Image Scale Is Zero; 3503: Image Size Is Zero; 3504: Image Format Invalid.
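The error status codes listed above map naturally onto an enumeration, which keeps client code readable when grouping or logging resource errors. The class name `ResourceErrorCode` and the helper `describe_error` are hypothetical; the numeric values are the ones listed on this page.

```python
from enum import IntEnum


class ResourceErrorCode(IntEnum):
    """Resource error status codes as listed on this page."""
    UNIDENTIFIED_ERROR = 0
    HTML_PARSE_ERROR = 501
    JS_PARSE_ERROR = 1501
    CSS_PARSE_ERROR = 2501
    IMAGE_PARSE_ERROR = 3501
    IMAGE_SCALE_IS_ZERO = 3502
    IMAGE_SIZE_IS_ZERO = 3503
    IMAGE_FORMAT_INVALID = 3504


def describe_error(code: int) -> str:
    """Return the symbolic name of a code, or a placeholder if unlisted."""
    try:
        return ResourceErrorCode(code).name
    except ValueError:
        # Codes not listed on this page are kept visible, not dropped.
        return f"UNKNOWN_{code}"
```

Falling back to a placeholder for unlisted codes keeps the client working if new codes are introduced.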
resource warnings
Line the warning relates to. Note that if "line": 0, the warning relates to the whole page.
Column the warning relates to. Note that if "column": 0, the warning relates to the whole page.
Text message of the warning. Possible messages: "Has node with more than 60 childs." (the HTML page has at least 1 tag nesting over 60 tags of the same level); "Has more that 1500 nodes." (the DOM tree contains over 1,500 elements); "HTML depth more than 32 tags." (the DOM depth exceeds 32 nodes).
Status code of the warning. Possible values: 0: Unidentified Warning; 1: Has node with more than 60 childs; 2: Has more that 1500 nodes; 3: HTML depth more than 32 tags.
type of encoding
types of media used to display a resource
Indicates the expected type of resource. For example, if "resource_type": "broken", accept_type will indicate the type of the broken resource. Possible values: any, none, image, sitemap, robots, script, stylesheet, redirect, html, text, other, font.
server version
Contains data on changes related to the resource. If there is no data, the value will be null.
Date and time when the header was last modified, in the UTC format: “yyyy-mm-dd hh-mm-ss +00:00”. Example: 2019-11-15 12:57:46 +00:00. If there is no data, the value will be null.
Date and time when the sitemap was last modified, in the UTC format: “yyyy-mm-dd hh-mm-ss +00:00”. Example: 2019-11-15 12:57:46 +00:00. If there is no data, the value will be null.
Date and time when the meta tag was last modified, in the UTC format: “yyyy-mm-dd hh-mm-ss +00:00”. Example: 2019-11-15 12:57:46 +00:00. If there is no data, the value will be null.