Remote Data Mining And Management Job In Data Science And Analytics

Python web crawler (bot) that builds a bilingual corpus

We need to build a web crawler (bot) that will traverse a high-level domain such as .co.uk or .com. The crawler should:
- Search for bilingual web sites.
- Determine the languages of each site.
- Scrape and align the text from each site.
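The first two steps could be sketched roughly as follows. This is only a minimal illustration, not a proposed design: it assumes that a bilingual site advertises its languages through `<html lang>` and `<link rel="alternate" hreflang="...">` markup, which many sites do not, so a real crawler would also need content-based language identification. The names `in_scope`, `language_hints`, and `looks_bilingual` are hypothetical helpers introduced here for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


def in_scope(url, suffix=".co.uk"):
    """True when the URL's host falls under the domain suffix being crawled."""
    host = urlsplit(url).hostname or ""
    return host.endswith(suffix)


class LanguageHintParser(HTMLParser):
    """Collects the declared page language and any hreflang alternates."""

    def __init__(self):
        super().__init__()
        self.page_lang = None   # value of <html lang="...">, lowercased
        self.alternates = {}    # hreflang code -> alternate URL

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "html" and attrs.get("lang"):
            self.page_lang = attrs["lang"].lower()
        elif tag == "link" and attrs.get("hreflang") and attrs.get("href"):
            rel = (attrs.get("rel") or "").lower().split()
            if "alternate" in rel:
                self.alternates[attrs["hreflang"].lower()] = attrs["href"]


def language_hints(html_text):
    """Return (page language, alternates mapping, set of all declared codes)."""
    parser = LanguageHintParser()
    parser.feed(html_text)
    langs = set(parser.alternates)
    if parser.page_lang:
        langs.add(parser.page_lang)
    return parser.page_lang, parser.alternates, langs


def looks_bilingual(html_text):
    """True when the markup declares at least two distinct base languages."""
    _, _, langs = language_hints(html_text)
    # "x-default" is a fallback marker, not a language, so it is ignored.
    base = {code.split("-")[0] for code in langs if code != "x-default"}
    return len(base) >= 2
```

A crawler built on this sketch would fetch only URLs for which `in_scope` is true, and would queue the URLs in `alternates` as candidate translations of the current page.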

There are many Python libraries and research papers that cover this. I think Bitextor, for example (which extracts and aligns two HTML pages), will take care of the alignment.
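To make the alignment step concrete, here is a deliberately naive sketch. It is not how Bitextor works and does not use its API; it only assumes that two translated pages share the same block structure, pairs their text blocks by position, and drops pairs whose lengths differ too much. The names `BLOCK_TAGS`, `extract_segments`, and `align_by_position`, and the `max_ratio` threshold, are hypothetical choices made for illustration.

```python
from html.parser import HTMLParser

# Tags treated as one text segment each (an assumption for this sketch).
BLOCK_TAGS = {"title", "h1", "h2", "h3", "p", "li"}


class BlockTextParser(HTMLParser):
    """Collects the text of each outermost block-level element."""

    def __init__(self):
        super().__init__()
        self.segments = []
        self._depth = 0      # nesting depth inside block tags
        self._buffer = []    # text pieces of the current segment

    def handle_starttag(self, tag, attrs):
        if tag in BLOCK_TAGS:
            self._depth += 1

    def handle_endtag(self, tag):
        if tag in BLOCK_TAGS and self._depth > 0:
            self._depth -= 1
            if self._depth == 0:
                # Collapse runs of whitespace into single spaces.
                text = " ".join("".join(self._buffer).split())
                if text:
                    self.segments.append(text)
                self._buffer = []

    def handle_data(self, data):
        if self._depth > 0:
            self._buffer.append(data)


def extract_segments(html_text):
    """Return the list of block-level text segments in document order."""
    parser = BlockTextParser()
    parser.feed(html_text)
    return parser.segments


def align_by_position(source_html, target_html, max_ratio=2.5):
    """Pair segments by position, keeping pairs with comparable lengths."""
    pairs = []
    for src, tgt in zip(extract_segments(source_html),
                        extract_segments(target_html)):
        longer, shorter = max(len(src), len(tgt)), min(len(src), len(tgt))
        if longer <= max_ratio * shorter:
            pairs.append((src, tgt))
    return pairs
```

Positional pairing breaks as soon as one page has an extra or missing block, and this parser assumes block tags are explicitly closed, which is why a dedicated aligner is the more realistic choice for the actual project.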

We look forward to a detailed proposal describing how the project will be carried out and the expected time frame.

About the recruiter
Member since Jul 6, 2017
Smith S.
from Scotland, United Kingdom

Skills & Expertise Required

Data Scraping, Web Crawling, Python, Data Extraction

Open for hiring. Apply before May 9, 2024

Work from Anywhere

40 hrs / week

Fixed Type

Remote Job

Cost: $479.04
