Remote Web Development Job In IT And Programming

Need help designing a web scraping solution to surface external calendar information

Hi, I need help solving an architectural/algorithmic problem around scraping calendars on the web so that the times we show in our app update in near real time.

The current design has a recurring job scrape a site for all of the free times available in a month for a service that is 30 minutes long (our service duration interval) and store those free times in our database. Then, when a user comes to our site and chooses their services, we pull the free times from our cache, resolve which times can accommodate the aggregate service duration, and show those to the user.

The issue is that one of the external scheduling providers has a bug where they don't show all of the times they have available to book. So the optimization of scraping only the times for a 30 min duration, and using those free times to calculate at runtime which ones will work for a larger duration, goes out the window. The only other option we can think of is scraping for each individual time interval, but that makes the scrape/caching job take way too long to be feasible.

The scraping script takes about 2 min per pass and we need to run it for at least 2 months (the current and next). For 100 stylists using our current implementation, the job takes 2 min * 100 stylists * 2 months * 1 (30 min service duration) = 400 min; if we parallelize that on 8 machines, it runs in less than an hour. However, running the job for every possible aggregate service duration would take 2 min * 100 stylists * 2 months * 16 (8 hours in 30 min intervals) = 6400 min = 106+ hours, and even parallelized on 8 machines it would still take 13+ hours, which is too long.

We're looking for a fresh pair of eyes that can see a solution we aren't seeing, one that lets our times sync with the external scheduling provider on a regular, relatively small interval.
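
For readers following the numbers, the runtime resolution step and the job-time arithmetic described above can be sketched roughly as below. This is only an illustrative sketch, not the poster's actual code: it assumes the cached free times are stored as 30-minute slot start datetimes, and the names `feasible_starts` and `job_minutes` are hypothetical. Note that it models the current design, so it inherits the same weakness: if the provider hides some free times, the consecutive-slot check will miss valid start times.

```python
from datetime import datetime, timedelta
from math import ceil

SLOT = timedelta(minutes=30)  # service duration interval from the post


def feasible_starts(free_starts, total_duration, slot=SLOT):
    """Return start times followed by enough consecutive free slots.

    free_starts: iterable of datetimes, each the start of a free slot
                 (assumed cache format, not confirmed by the post).
    total_duration: timedelta for the aggregate service duration.
    """
    free = set(free_starts)
    needed = ceil(total_duration / slot)  # number of back-to-back slots
    return sorted(
        start
        for start in free
        if all(start + i * slot in free for i in range(needed))
    )


def job_minutes(minutes_per_pass, stylists, months, durations, machines=1):
    """Wall-clock minutes for one full scrape, split evenly over machines."""
    return minutes_per_pass * stylists * months * durations / machines


if __name__ == "__main__":
    day = datetime(2021, 10, 6)
    free = [day.replace(hour=9), day.replace(hour=9, minute=30),
            day.replace(hour=10), day.replace(hour=11)]
    # A 60 min booking needs two consecutive free slots.
    print(feasible_starts(free, timedelta(minutes=60)))
    # One duration vs. all 16 durations, each on 8 machines.
    print(job_minutes(2, 100, 2, 1, machines=8))
    print(job_minutes(2, 100, 2, 16, machines=8) / 60)
```

With the figures from the post, `job_minutes` gives 50 minutes for the single 30 min duration on 8 machines and 800 minutes (a little over 13 hours) for all 16 durations, which matches the estimates above.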

***Some people have asked why the script takes so long. It has to automate choosing a service and walking through the other website's booking flow to see the times available for each day.

The site is (removed by Toogit admin). If you click on one service, then add some more and click save, it will show you the calendar, where you have to click on each individual day to see its hours.***
About the recruiter
Member since Mar 14, 2020
Akshay Kumar
from Austurland, Iceland

Skills & Expertise Required

Automation, Data Extraction, Scripting, Selenium, Web Scraping

Candidate shortlisted and hired

Hiring open till - Oct 6, 2021

Work from Anywhere

40 hrs / week

Hourly Type

Remote Job

$12.52

Cost

Similar Projects

Use Public Data Sources to Create a Prospect List from AirBnB data.

I need help putting together a sales prospecting list using AirBnB data. This is a prospecting list that will be used to reach out to current AirBnB hosts and offer property management services through Direct Mail, email, and social media.

I...read more

Quick Python web scraping help needed. Template will be provided!

Looking for an experienced Python developer who has prior exposure to web scraping. The project will take 1-3 days of work, and the code structure and database schema will be provided. You will work directly with the CTO to build a web scraper using Python an...read more

Fetch data from Sharepoint Online API (XML) and send to Azure SQL Database

Fetch data from Sharepoint Online API (XML) and send to Azure SQL Database
Source is Sharepoint Online API (data would be in XML format)
This API data needs to be loaded into an Azure SQL Server Database or Blob storage as a CSV file
I tri...read more

Google Sheets Expert

Require an expert in spreadsheets, and specifically Google Sheets. Person must be an expert with:

- Pivot Tables
- Writing macros
- Formulas
- Conditional Formatting

Scrape Amazon Product Details for Millions of Products

Looking for a developer who can develop a scraper that can scrape millions of product details from Amazon.

Looking for someone who has experience scraping large amounts of data with proxy rotation, CAPTCHA solving, and other methods. Scrap will...read more