
Required Web Scraping, Node.js, Scrapy, Beauty, Python freelancer for web-scraping English conversations job

Posted at - Aug 24, 2024



I need someone to web-scrape some English conversations.

I'm building a chatbot for learning English and want someone to scrape a bunch of English conversations from various websites to use as training material.
I'm looking for short, simple conversations.

There are two parts to the task:
- some googling to find basic, relevant conversations
- writing scraping code for the different sites


Most of these sites are pretty simple, plain-text services. Here are some example sites, but there are hundreds of resources.
I'd like the scraped results in TSV or CSV format:

convoId | line | url | topic | who | text

convoId - an ID for each conversation, so we can sort things later
line - a simple incrementing count for each line in that conversation
url - the page it came from, for attribution later
topic - please try to get a topic from the page; if this is a LOT more work, it may not be needed
who - the speaker; the conversations usually use role-play labels such as 'A: xxx, B: replies'
text - the scraped line of text
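
To make the column layout above concrete, here is a minimal Python sketch of how one scraped conversation could be turned into rows and written as TSV. The function names, the speaker-label pattern, and the example.com URL are assumptions made for illustration only; real pages will label speakers in different ways and will need per-site handling.

```python
import csv
import io
import re

# Column order taken from the schema line in the posting.
FIELDS = ["convoId", "line", "url", "topic", "who", "text"]

# Assumed convention: a short single-word label followed by a colon, e.g. "A: Hello".
SPEAKER_RE = re.compile(r"^([A-Za-z]\w{0,15})\s*:\s*(.+)$")


def conversation_to_rows(convo_id, url, topic, raw_text):
    """Turn the plain text of one conversation into a list of row dicts."""
    rows = []
    for raw_line in raw_text.splitlines():
        raw_line = raw_line.strip()
        if not raw_line:
            continue  # skip blank lines between turns
        match = SPEAKER_RE.match(raw_line)
        if match:
            who, text = match.group(1), match.group(2).strip()
        else:
            who, text = "", raw_line  # no speaker label found; keep the text anyway
        rows.append({
            "convoId": convo_id,
            "line": len(rows) + 1,  # 1-based counter within this conversation
            "url": url,
            "topic": topic,
            "who": who,
            "text": text,
        })
    return rows


def rows_to_tsv(rows):
    """Serialize rows as tab-separated text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, delimiter="\t", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical sample input; a real run would pass text extracted from a page.
sample = "A: Hi, how are you?\nB: I'm fine, thanks.\n\nA: Nice weather today."
rows = conversation_to_rows(1, "https://example.com/dialogue-1", "greetings", sample)
tsv = rows_to_tsv(rows)
print(tsv)
```

Using the csv module rather than joining strings by hand means any tabs or quote characters inside a scraped line get escaped instead of silently breaking the columns.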

You can use NodeJS or Python.

Let me know what experience you have in scraping, although this should not be a challenging scraping task - most of these are amateur sites with no logins or other blockers.

If you're trying to improve your English, this also might be an interesting project!

If you're into machine learning, I've also looked at the various online corpora for dialog training, but haven't found anything great yet.
These datasets don't work for basic language-learning conversations.

I'd like to start with a small sample task, and then manage this as an ongoing project with some regular work each month as we refine the idea. There will be ongoing cleanup of the dataset for training, etc.

Respond to me with some info on what kind of scraping tasks you've done before and how many sites you think you can cover for the initial budget I've proposed.

About the recruiter
Member since May 20, 2018
Naresh Yadav
from New York, United States

Skills & Expertise Required

Web Scraping, Node.js, Scrapy, Beauty, Python

Open for hiring
Apply before - Nov 22, 2024

Work from Anywhere
40 hrs / week
Hourly Type
Remote Job
Cost: $13.34


