JS Scenario
The ScrapeOps Proxy API Aggregator enables you to send commands to a headless browser when using our headless browser functionality.
For common JS commands we've created dedicated query parameters that you can enable on your requests:
- Wait Time: allows you to wait a specific amount of time before returning the response.
- Wait For Selector: allows you to wait for a specific element on the page before returning the response.
- Scroll: allows you to scroll the page along the y-axis before returning the response, so you can scrape pages with infinite scroll.
However, if you want to execute other actions (like filling in forms or scrolling along the x-axis), or you want to chain multiple actions together, you can create custom JS scenarios and send them to the proxy API.
Using JS Scenarios
Here is a simple JavaScript scenario to show how this functionality works.
For example, if you would like to wait 3000 milliseconds before returning the response (useful if a page doesn't display its data straight away), you can create a JS scenario like this:
{
    "instructions": [
        {"wait": 3000}
    ]
}
You then need to stringify this JSON object and URL-encode it so that any special characters are escaped. In Python, this looks like:
import json
import urllib.parse

js_scenario = {
    "instructions": [
        {"wait": 3000}
    ]
}

# Serialize the scenario to a JSON string, then percent-encode it for use in a URL
js_scenario_string = json.dumps(js_scenario)
encoded_js_scenario = urllib.parse.quote(js_scenario_string)

## Output
## %7B%22instructions%22%3A%20%5B%7B%22wait%22%3A%203000%7D%5D%7D
Then we just need to send this encoded string with our request using the js_scenario query parameter:
curl -k "https://proxy.scrapeops.io/v1/?api_key=YOUR_API_KEY&url=http://example.com/&js_scenario=%7B%22instructions%22%3A%20%5B%7B%22wait%22%3A%203000%7D%5D%7D"
This JavaScript scenario tells the headless browser to wait 3,000 milliseconds before returning a response, giving the website enough time to load the target data onto the page.
Adding a JS Scenario to the request will automatically enable a headless browser (equivalent to adding render_js=true to the request). Using the Headless Browser functionality will consume 10 API Credits.
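The encoding and request steps above can be combined into one helper. The following is a minimal sketch rather than an official client: `build_proxy_url` and `PROXY_ENDPOINT` are hypothetical names introduced here, and the endpoint and parameter names are taken from the curl example. Unlike the curl example, it also percent-encodes the target URL, which keeps the query string unambiguous if the target URL contains its own query parameters.

```python
import json
import urllib.parse

# Endpoint as it appears in the curl example; the constant name is illustrative
PROXY_ENDPOINT = "https://proxy.scrapeops.io/v1/"


def build_proxy_url(api_key, target_url, js_scenario):
    """Return the proxy URL with the scenario serialized to JSON and percent-encoded."""
    params = {
        "api_key": api_key,
        "url": target_url,
        "js_scenario": json.dumps(js_scenario),
    }
    # quote_via=quote escapes spaces as %20 (the default, quote_plus, would use "+")
    query = urllib.parse.urlencode(params, quote_via=urllib.parse.quote)
    return f"{PROXY_ENDPOINT}?{query}"


proxy_url = build_proxy_url(
    "YOUR_API_KEY",
    "http://example.com/",
    {"instructions": [{"wait": 3000}]},
)
print(proxy_url)
```

The resulting URL can be fetched with any HTTP client. Building the URL is free, but sending it enables the headless browser and so consumes API credits as described above.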
Chaining JavaScript Actions
You can also chain multiple JavaScript actions together in the same JS scenario to interact with the page in more complex ways. The actions run in the order they are listed.
import json
import urllib.parse

js_scenario = {
    "instructions": [
        {"wait": 2000},  # Wait 2 seconds
        {"scroll_y": 1000},  # Scroll the screen down 1,000px in the vertical direction
        {"click": "#button"},  # Click on button
        {"wait": 2000},  # Wait 2 seconds
        {"fill": ["#input_field", "value_to_input"]},  # Fill an input field with a value
        {"evaluate": "console.log('Hello')"},  # Run custom JavaScript code
    ]
}

js_scenario_string = json.dumps(js_scenario)
encoded_js_scenario = urllib.parse.quote(js_scenario_string)
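As a chained scenario grows, it can help to confirm that the encoded string decodes back to the same instructions before sending it. The sketch below reverses the two encoding steps; the `#results` and `#button` selectors are hypothetical placeholders.

```python
import json
import urllib.parse

js_scenario = {
    "instructions": [
        {"wait_for": "#results"},  # hypothetical selector
        {"scroll_x": 500},         # scroll 500px horizontally
        {"click": "#button"},      # hypothetical selector
    ]
}

encoded = urllib.parse.quote(json.dumps(js_scenario))

# Reverse both steps: undo the percent-encoding, then parse the JSON
decoded = json.loads(urllib.parse.unquote(encoded))
assert decoded == js_scenario

# List the action names in execution order
actions = [next(iter(step)) for step in decoded["instructions"]]
print(actions)  # ['wait_for', 'scroll_x', 'click']
```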
Here is a full list of the JS scenario actions you can use:
Action | Input | Description |
---|---|---|
wait | Number of milliseconds to wait. | Wait the defined number of milliseconds before returning the response. |
wait_for | String with CSS selector of element to wait for. | Wait until element appears on the page. |
wait_for_and_click | String with CSS selector of element to wait for and click. | Wait until element appears on the page, and click it. |
click | String with CSS selector of element you want to click. | Clicks element on page. |
scroll_x | Integer with number of pixels you want to scroll. | Scroll the screen along the horizontal axis. |
scroll_y | Integer with number of pixels you want to scroll. | Scroll the screen along the vertical axis. |
fill | Array with element and value you want to fill. | Enter values into input fields. |
evaluate | String with JavaScript code you want to execute. | Run custom JavaScript code in the headless browser. |
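Because a misspelled action name or a wrong input type is only discovered after the request has been sent, a small client-side check against the table above can save credits. This is a sketch under assumptions, not part of the API: `validate_scenario` and `ACTION_INPUT_TYPES` are hypothetical names, each instruction is assumed to hold exactly one action (as in the examples above), and `wait` is assumed to take a whole number of milliseconds.

```python
# Expected input type per action, taken from the table above
ACTION_INPUT_TYPES = {
    "wait": int,
    "wait_for": str,
    "wait_for_and_click": str,
    "click": str,
    "scroll_x": int,
    "scroll_y": int,
    "fill": list,
    "evaluate": str,
}


def validate_scenario(js_scenario):
    """Return a list of problems; an empty list means the scenario looks well-formed."""
    instructions = js_scenario.get("instructions")
    if not isinstance(instructions, list):
        return ["'instructions' must be a list"]
    problems = []
    for index, step in enumerate(instructions):
        # Assumption: one action per instruction object
        if not isinstance(step, dict) or len(step) != 1:
            problems.append(f"step {index}: expected an object with exactly one action")
            continue
        action, value = next(iter(step.items()))
        expected = ACTION_INPUT_TYPES.get(action)
        if expected is None:
            problems.append(f"step {index}: unknown action {action!r}")
        elif isinstance(value, bool) or not isinstance(value, expected):
            # bool is rejected explicitly because it is a subclass of int in Python
            problems.append(f"step {index}: {action!r} expects {expected.__name__}")
        elif action == "fill" and len(value) != 2:
            problems.append(f"step {index}: 'fill' expects [selector, value]")
    return problems


print(validate_scenario({"instructions": [{"wait": 2000}, {"scrol_y": 1000}]}))
# ["step 1: unknown action 'scrol_y'"]
```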