Documentation
Parameters
Parameter | Type | Description |
---|---|---|
apiKey | string (required) | Your API key: YOUR_API_KEY |
url | string (required) | The encoded URL you want to scrape. |
premiumProxies | boolean | Routes the request through our residential proxies. |
countryCode | string | Sets which country the premium proxies will be routed through. |
renderJs | boolean | Determines whether the browser executes JavaScript before returning the HTML. |
cookies | string | String representation of the cookies you want to send to the requested URL. |
headers | string | String representation of the headers you want to send to the requested URL. |
width | integer | The width of the browser's viewport, in pixels. |
height | integer | The height of the browser's viewport, in pixels. |
device | string | The type of device to render the browser as. The default is desktop. |
screenshot | boolean | Tells Scrapely to screenshot the page's viewport after it has loaded. |
screenshotFullPage | boolean | Works like screenshot, but captures the entire page, not just the viewport. |
timeout | integer | The time in ms the browser will wait before timing out a request. |
waitForCssSelector | string | The CSS selector the browser will wait for before returning results. |
userAgent | string | Sets the User-Agent header the browser sends with its request(s). |
executeJs | string | Run your own JavaScript through our web browser. |
hasAdvancedDetection | boolean | Hardens Scrapely against sites with more advanced detection methods. |
Getting Started
Scrapely is an easy-to-use API for web scraping.
Using Scrapely only requires two things:
- Your API key: YOUR_API_KEY
- The encoded URL of the page you want to scrape
```shell
curl "https://api.scrapely.net/scrape?apiKey=YOUR_API_KEY&url=https%3A%2F%2Fexample.com%2F"
```

```python
import requests

# requests percent-encodes query parameters itself, so pass the raw URL here.
ploads = {'apiKey': 'YOUR_API_KEY', 'url': 'https://example.com/'}
r = requests.get('https://api.scrapely.net/scrape', params=ploads)
```

```javascript
const axios = require('axios');

axios.get('https://api.scrapely.net/scrape', {
  params: {
    // axios encodes query parameters itself, so pass the raw URL here.
    apiKey: 'YOUR_API_KEY',
    url: 'https://example.com/',
  }
}).then(function (response) {
  console.log(response);
});
```
Api Key
There is no default value for this parameter, and it is required on every request.
Your API key is used to authenticate you to the service.
Encoded Url
There is no default value for this parameter, and it is required on every request.
This parameter is used to navigate to the page you want to scrape. It must be the full URL, including the http/https scheme.
The URL is encoded so that its characters don't interfere with the request you're making to Scrapely.
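The encoding step can be sketched with Python's standard library (the example URL mirrors the Getting Started section):

```python
from urllib.parse import quote

# Percent-encode the target URL so it can travel safely as a query parameter.
target = "https://example.com/"
encoded = quote(target, safe="")
print(encoded)  # https%3A%2F%2Fexample.com%2F
```

Note that client libraries such as requests encode query parameters automatically, so when using a params dict you should pass the raw, unencoded URL.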
Premium Proxies
The default value for this parameter is false, which means that by default requests will NOT be routed through our residential proxies; instead they will route through our data center proxies.
Sometimes that simply isn't enough for your target site's anti-bot measures.
So we offer the option of routing requests through real residential proxies. This means your web request will look like it is coming from a real IP out in the real world, and is much less likely to be blocked.
Requests put through the premium proxies will cost you 10 credits with renderJs set to false and 25 credits with renderJs set to true.
Country Code
If you are using our premium proxies, we allow you to target specific countries for your requests to be routed through.
Note: this is only available when using premium proxies; our regular proxies are used as-is.
If you enable premium proxies and do not pass in a country code, it will default to using a proxy in the United States.
Here is a shortened list of the more popular countries:
countryCode | Country Name |
---|---|
us | United States |
uk | United Kingdom |
ru | Russia |
it | Italy |
fr | France |
You can view the full list of countries our proxies are able to route through here.
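As a sketch, here is how a request routed through a United Kingdom residential proxy might be built (the lowercase string form of the booleans is an assumption about the wire format):

```python
from urllib.parse import urlencode

# Route the request through a UK residential proxy (premium proxies required).
params = {
    "apiKey": "YOUR_API_KEY",
    "url": "https://example.com/",      # urlencode handles the encoding
    "premiumProxies": "true",
    "countryCode": "uk",
}
query = urlencode(params)
print("https://api.scrapely.net/scrape?" + query)
```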
Javascript Rendering
The default value for this parameter is true.
This parameter controls whether the browser waits for JavaScript to finish executing before returning the HTML.
This is useful for Single Page Applications (SPAs) or websites that use JavaScript to build their front-end, such as those written in React, Vue, or Angular.
This parameter must be set to true if you intend to execute your own JavaScript.
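Conversely, for a static page you can turn rendering off, which also lowers the credit cost noted under Premium Proxies. A sketch (boolean wire format assumed):

```python
from urllib.parse import urlencode

# renderJs=false skips JavaScript execution -- fine for static pages.
params = {
    "apiKey": "YOUR_API_KEY",
    "url": "https://example.com/",
    "renderJs": "false",
}
request_url = "https://api.scrapely.net/scrape?" + urlencode(params)
print(request_url)
```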
Execute Js
Sometimes you'll need to perform some actions before what you want to scrape becomes available to you, whether it be clicking a button, typing in some text, or something completely custom.
Scrapely offers you the ability to execute eight JavaScript commands that give you unlimited flexibility.
- click: allows you to click on an HTML element
- type: allows you to specify a CSS selector and a value to enter into that field. This action differs from the others in that it takes an array for its value: the first item in the array is the CSS selector for the element, and the second is the value to be entered into the specified element.
- wait: will force the api to wait the specified time (in ms) before continuing on to the next action or returning the content
- waitFor: waits for the specified css selector to become visible
- waitForAndClick: combines the waitFor and click actions
- scrollX: scrolls the specified amount in pixels along the X axis (horizontal)
- scrollY: scrolls the specified amount in pixels along the Y axis (vertical)
- evaluate: runs custom javascript to cover anything we haven't with the actions above
You'll pass in these commands to the api in JSON. To successfully format the JSON you'll need to wrap all of these commands in an array named js. Below is an example with all of the commands you can execute:
```json
{
  "js": [
    { "click": "#nav-search-submit-button" },
    { "type": ["#twotabsearchtextbox", "gtx 3080ti"] },
    { "wait": 1000 },
    { "waitFor": "#twotabsearchtextbox" },
    { "waitForAndClick": "#nav-search-submit-button" },
    { "scrollX": 1000 },
    { "scrollY": 1000 },
    { "evaluate": "console.log('hello world!');" }
  ]
}
```
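A sketch of sending these commands from Python: serialize the object to JSON and pass it as the executeJs parameter. Exactly how the API expects the JSON to be transmitted is an assumption based on the description above.

```python
import json
from urllib.parse import urlencode

# Build the command list, serialize it, and pass it as executeJs.
commands = {
    "js": [
        {"type": ["#twotabsearchtextbox", "gtx 3080ti"]},
        {"waitForAndClick": "#nav-search-submit-button"},
    ]
}
params = {
    "apiKey": "YOUR_API_KEY",
    "url": "https://example.com/",
    "renderJs": "true",               # required when executing your own JS
    "executeJs": json.dumps(commands),
}
query = urlencode(params)
print(query[:60])
```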
Headers
We allow you to send headers through to your destination url.
To send headers through Scrapely, use the headers parameter. The format for this string is name=value; separate multiple headers with ; (encoded as %3B).
You can also send headers via the HTTP headers of your request to Scrapely. Just prefix any header you want to pass along to the destination URL with scrapely-, and the prefix will be removed before sending.
Do not try to send the Cookie header through this parameter; use the cookies parameter for that instead.
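As a sketch, the header names and values below are illustrative; the name=value format and the ';' separator (encoded as %3B) follow the description above:

```python
from urllib.parse import quote

# Join headers with ';', then percent-encode; safe="=" keeps 'name=value'
# readable while ';' becomes %3B.
raw = ";".join(["Accept-Language=en-US", "Referer=https://example.com/"])
encoded = quote(raw, safe="=")
print(encoded)
```

If you pass this string through a client library's params dict, the library will apply the percent-encoding for you.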
Width
The default is 1920 pixels. If you are using the screenshot functionality, you can set this value to adjust the width of the output screenshot.
Height
The default is 1080 pixels. If you are using the screenshot functionality, you can set this value to adjust the height of the output screenshot.
Device
This parameter tells Scrapely which platform to emulate when accessing the requested URL. There are only two options: desktop and mobile. If no value, or an invalid value, is sent, the platform defaults to desktop, and the browser will render as such.
If you are using this parameter and the value is set to mobile, then setting the height and width parameters is unnecessary.
Screenshot
This parameter tells Scrapely to capture a screenshot of the page's viewport after it has loaded.
Screenshot Full Page
This parameter works similarly to the screenshot parameter. The only difference is that it ignores the height and width parameters and will screenshot the entire page's content.
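A sketch combining a viewport screenshot with a custom width and height (boolean wire format assumed; how the image is returned in the response is not shown here):

```python
from urllib.parse import urlencode

# screenshot=true captures the viewport, so width/height shape the image.
params = {
    "apiKey": "YOUR_API_KEY",
    "url": "https://example.com/",
    "screenshot": "true",
    "width": "1280",
    "height": "800",
}
request_url = "https://api.scrapely.net/scrape?" + urlencode(params)
print(request_url)
```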
Timeout
Some websites take longer to load than others, due to network constraints or the website itself being slow. This parameter sets the timeout (in ms) the browser uses to decide that a page is taking too long to load.
The minimum for this parameter is 10000 ms (10 seconds) and can be set as high as 35000 ms (35 seconds).
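A small client-side helper (hypothetical, not part of the API) can keep values inside the documented bounds before sending:

```python
# Clamp a requested timeout to the documented 10,000-35,000 ms range.
def clamp_timeout(ms: int) -> int:
    return max(10000, min(35000, ms))

print(clamp_timeout(5000))   # 10000
print(clamp_timeout(60000))  # 35000
```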
Wait For Css Selector
This parameter will set the CSS selector that the web browser will use to figure out that the web page is done loading.
If no value is provided, the page will only wait until there are no active network connections for 500 ms.
User Agent
When a browser sends a request, one of the headers it sends is the User-Agent header. This is one way web servers determine things like whether the request originated from a real browser or a program trying to mimic one, the type of browser (desktop or mobile), and a few other details.
If no value is provided, Scrapely will automatically pull one from a list that matches the device parameter.
Has Advanced Detection
Some websites employ advanced anti-bot measures; Scrapely is prepared to take those on. This parameter is unnecessary for the majority of websites and is set to false by default. When it is set to true, Scrapely will set up the browser to employ all of the anti-detection methods we currently have, so the scraping request may take a bit longer. You should only use this parameter if you find that your requests are being blocked without it.