We are a startup looking to obtain relatively clean data from more or less clean sources online.
Your job would be to scrape different websites that include both structured data (tables) and unstructured data (text). Some of the data can be obtained with simple URL requests (wget, requests, urllib), while other websites will require running searches, selecting filters, and clicking buttons that depend on JavaScript (for example, using Selenium).
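To give a rough idea of the structured (table) case, here is a minimal sketch that turns an HTML table into a list of rows using only the Python standard library. The names `TableParser`, `parse_table`, and `SAMPLE_HTML` are made up for illustration, and the sample markup is an assumption, not one of our actual target sites.

```python
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # row currently being read, or None
        self._cell = None   # text fragments of the current cell, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def parse_table(html):
    """Return all table rows found in an HTML string."""
    parser = TableParser()
    parser.feed(html)
    parser.close()
    return parser.rows


# Hypothetical sample page, standing in for a fetched response body.
SAMPLE_HTML = """
<table>
  <tr><th>Name</th><th>Price</th></tr>
  <tr><td>Widget</td><td> 9.99 </td></tr>
</table>
"""
rows = parse_table(SAMPLE_HTML)
print(rows)
```

In a real run the HTML string would come from a fetch such as `requests.get(url).text`, and pages that only render their tables after JavaScript runs would need Selenium instead.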
We would like the code to collect the data several times a day using a cron job, ideally set up on AWS EC2. Your code should be written in Python.
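As an illustration of what a cron-friendly run could look like, the sketch below appends each scrape to a timestamped table so that repeated runs build up a history. It uses SQLite from the standard library purely to stay self-contained; the table layout, the `store_snapshot` helper, and the sample values are all assumptions, and the real project might use MongoDB or another SQL database instead.

```python
import sqlite3
from datetime import datetime, timezone


def store_snapshot(conn, source, rows, scraped_at=None):
    """Append one scrape run to the snapshots table; return the rows written."""
    # Default to the current UTC time when the caller gives no timestamp.
    scraped_at = scraped_at or datetime.now(timezone.utc).isoformat()
    conn.execute(
        "CREATE TABLE IF NOT EXISTS snapshots ("
        "source TEXT NOT NULL, scraped_at TEXT NOT NULL, "
        "name TEXT NOT NULL, value TEXT, "
        "PRIMARY KEY (source, scraped_at, name))"
    )
    # The connection context manager commits on success, rolls back on error.
    with conn:
        conn.executemany(
            # OR REPLACE makes a re-run with the same timestamp idempotent.
            "INSERT OR REPLACE INTO snapshots VALUES (?, ?, ?, ?)",
            [(source, scraped_at, name, value) for name, value in rows],
        )
    return len(rows)


# Two hypothetical runs of the same scraper, six hours apart.
conn = sqlite3.connect(":memory:")
store_snapshot(conn, "example-site", [("Widget", "9.99")],
               scraped_at="2020-03-14T06:00:00+00:00")
store_snapshot(conn, "example-site", [("Widget", "10.49")],
               scraped_at="2020-03-14T12:00:00+00:00")

history = conn.execute(
    "SELECT scraped_at, value FROM snapshots WHERE name = ? ORDER BY scraped_at",
    ("Widget",),
).fetchall()
print(history)
```

With a script like this saved on the server, a crontab entry such as `0 */6 * * * python3 /path/to/scrape.py` (the path is a placeholder) would run it every six hours.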
We would start with a one-off project covering a few of the sites we are interested in and, if we are happy with the person, potentially extend it to an ongoing contractor arrangement.
Skills needed:
Python, web scraping, requests, urllib, Selenium, MongoDB, SQL, data extraction, data acquisition, data cleaning, databases, automation, scripting, cron jobs
We are flexible on payment. We can work hourly or on a fixed-price basis, depending on the freelancer's experience and their time and cost estimate. We are hoping to spend less than 1,000 dollars on the first 2-3 websites, with the scraper, cron job, and database setup included.
About the recruiter
Member since Mar 14, 2020
Techsunware Inc, from Surhondar, Uzbekistan