OnPage API Content Parsing
This endpoint allows parsing the content on any page you specify and will return the structured content of the target page, including link URLs, anchors, hea…
Authorizations
Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
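A minimal sketch of building that header in Python. The helper name `build_auth_headers` and the `Content-Type` entry are assumptions made for illustration; only the `Bearer <token>` form comes from this reference.

```python
def build_auth_headers(token: str) -> dict:
    """Return HTTP headers carrying the bearer token (hypothetical helper)."""
    token = token.strip()
    if not token:
        raise ValueError("auth token must not be empty")
    return {
        # Documented form: "Bearer <token>"
        "Authorization": f"Bearer {token}",
        # Assumption: the request body is sent as JSON.
        "Content-Type": "application/json",
    }
```

The returned dictionary can be passed as the headers argument of whichever HTTP client you use.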
Body
URL of the page to parse; required field. Example: https://dataforseo.com/blog/a-versatile-alternative-to-google-trends-exploring-the-power-of-dataforseo-trends-api
ID of the task; required field. You can get this ID in the response of the Task POST endpoint. Note: the enable_content_parsing parameter in the POST request must be set to true. Example: "07131248-1535-0216-1000-17384017ad04"
return page content as markdown; optional field. If set to true, the markdown-formatted content of the page will be returned in the page_as_markdown field of the response. Default value: false
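The three body parameters above could be assembled as in the sketch below. The key names `url` and `id` are assumptions, since this page does not show the field names for the first two parameters; `markdown_view` is the name this reference uses for the markdown flag. Whether the API expects a single object or an array of such objects is also not stated here, so the helper returns one plain object.

```python
import json


def build_content_parsing_body(page_url: str, task_id: str,
                               markdown_view: bool = False) -> dict:
    """Build a request body for content parsing (hypothetical helper)."""
    if not page_url.startswith(("http://", "https://")):
        raise ValueError("page_url must be an absolute http(s) URL")
    if not task_id:
        raise ValueError("task_id is required")
    return {
        "url": page_url,                 # assumed key name
        "id": task_id,                   # assumed key name
        "markdown_view": markdown_view,  # optional, defaults to false
    }


body = build_content_parsing_body(
    "https://example.com/",
    "07131248-1535-0216-1000-17384017ad04",
    markdown_view=True,
)
payload = json.dumps(body)  # JSON text ready to send as the request body
```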
Response
Successful response
the current version of the API
general status code; you can find the full list of the response codes here. Note: we strongly recommend building error handling for exceptional or error conditions
general informational message you can find the full list of general informational messages here
execution time, seconds
total tasks cost, USD
the number of tasks in the tasks array
the number of tasks in the tasks array returned with an error
array of tasks
task identifier unique task identifier in our system in the UUID format
status code of the task generated by DataForSEO; can be within the following range: 10000-60000; you can find the full list of the response codes here
informational message of the task you can find the full list of general informational messages here
execution time, seconds
cost of the task, USD
number of elements in the result array
URL path
contains the same parameters that you specified in the POST request
array of results
status of the crawling session possible values: in_progress, finished
details of the crawling session
number of items in the results array
items array
type of the returned item = ‘content_parsing_element’
date and time when the content was fetched example: "2022-11-01 10:02:52 +00:00"
status code of the page
parsed content of the page
parsed content of the header
primary content on the page you can find more information about content priority calculation in this help center article
content text
page URL displayed in case the text is a link anchor
contains other URLs and anchors found in the content element
other URL found in the content element
text of the URL’s anchor
secondary content on the page you can find more information about content priority calculation in this help center article
content text
page URL displayed in case the text is a link anchor
contains other URLs and anchors found in the content element
other URL found in the content element
text of the URL’s anchor
content of the table on the page
content of the header of the table
content of the row cells of the header
text in the row cell
contains other URLs and anchors found in the cell
URL found in the cell
text of the URL’s anchor
indicates if the text belongs to the header
content of the body of the table
content of the row cells of the body
text in the row cell
contains other URLs and anchors found in the cell
URL found in the cell
text of the URL’s anchor
indicates if the text belongs to the header
content of the footer of the table
content of the row cells of the footer
text in the row cell
contains other URLs and anchors found in the cell
URL found in the cell
text of the URL’s anchor
indicates if the text belongs to the header
parsed content of the footer
main topic on the page you can find more information about topic priority calculation in this help center article
meta title
main title of the block
content author name
content language
HTML level
primary content on the page you can find more information about content priority calculation in this help center article
content text
page URL displayed in case the text is a link anchor
contains other URLs and anchors found in the content element
other URL found in the content element
text of the URL’s anchor
secondary content on the page you can find more information about content priority calculation in this help center article
secondary topic on the page you can find more information about topic priority calculation in this help center article
meta title
main title of the block
content author name
content language
HTML level
primary content on the page you can find more information about content priority calculation in this help center article
content text
page URL displayed in case the text is a link anchor
contains other URLs and anchors found in the content element
other URL found in the content element
text of the URL’s anchor
secondary content on the page you can find more information about content priority calculation in this help center article
contains objects with rating information for the products displayed on the page
rating name Note: this field is not used in this particular object, and its value is always set to null
the value of the rating
maximum value for the rating
the amount of feedback
relative rating can take values from 0 to 1
array of products displayed on the page contains objects with information on products displayed on the page
name of the product
price of the product
price currency
displays the date and time until which the price is valid, in the UTC format: “yyyy-mm-dd hh:mm:ss +00:00” example: "2022-11-01 10:02:52 +00:00"
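Timestamps in this response, such as the example value above, can be turned into timezone-aware objects. A minimal sketch using only the Python standard library; the helper name `parse_api_timestamp` is hypothetical, and the format string is inferred from the example value rather than taken from an official specification.

```python
from datetime import datetime, timezone

# Inferred from the example "2022-11-01 10:02:52 +00:00".
TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S %z"


def parse_api_timestamp(value: str) -> datetime:
    """Parse a timestamp string into a timezone-aware datetime."""
    return datetime.strptime(value, TIMESTAMP_FORMAT)


valid_until = parse_api_timestamp("2022-11-01 10:02:52 +00:00")
```

Keeping the offset in the parsed value avoids mixing naive and aware datetimes when comparing against the current time.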
array of comments displayed on the page contains objects with information on comments related to displayed products
product’s rating contains information about the rating a customer has given to the product
rating name Note: this field is not used in this particular object, and its value is always null
the value of the rating
maximum value for the rating
the amount of feedback Note: this field is not used in this particular object, and its value is always null
relative rating can take values from 0 to 1
title of the customer’s comment
date when the comment was published
author of the comment
primary content on the page you can find more information about content priority calculation in this help center article
text of the comment
displayed in case the text is a link anchor
contains other URLs and anchors found in the content element
contact information contains contact information displayed on the page
array of telephone numbers
array of emails
page content in the text-to-HTML markdown format; set markdown_view to true in the request to return this value
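The nesting described in this section (tasks, then results, then items) can be walked as in the sketch below, which also applies the recommended error handling. Only `page_as_markdown` is named in this reference; the keys `status_code`, `status_message`, `tasks`, `result` and `items` are assumed names for the fields described above, and treating 20000 as the success code is likewise an assumption to verify against the response code list.

```python
from typing import Iterator

OK_STATUS = 20000  # assumption: success code, check the response code list


class ContentParsingError(RuntimeError):
    """Raised when the response or one of its tasks reports an error."""


def iter_markdown_pages(response: dict) -> Iterator[str]:
    """Yield the markdown content of every item in the response."""
    if response.get("status_code") != OK_STATUS:
        raise ContentParsingError(
            f"request failed: {response.get('status_code')} "
            f"{response.get('status_message')}"
        )
    for task in response.get("tasks") or []:
        if task.get("status_code") != OK_STATUS:
            raise ContentParsingError(
                f"task {task.get('id')} failed: {task.get('status_code')} "
                f"{task.get('status_message')}"
            )
        for result in task.get("result") or []:
            for item in result.get("items") or []:
                markdown = item.get("page_as_markdown")
                # The field is absent or null unless markdown_view was true.
                if markdown is not None:
                    yield markdown
```

Raising on the first failed task keeps the sketch short; a caller that prefers partial results could collect failures in a list instead.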