Large-Scale Topic Model / Sparse Matrix Construction

I have a large corpus (~38K documents) that I am attempting to run topic models over. The problem right now is that building the corpus with packaged solutions such as Textacy puts too much strain on memory when constructing the document-term matrix. I believe there are more efficient approaches, perhaps collapsing each document into a Counter object and incrementally feeding it into a sparse matrix, but I am looking for someone with more experience in scientific computing on large, memory-intensive data sets to write code that can easily be ported. I can provide the full text files as a corpus, and I am available to discuss the project on an ongoing basis.
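The Counter-based idea described above could look roughly like the sketch below. It is only an illustration under stated assumptions, not a finished solution: the function names `tokenize` and `build_csr`, the regex tokenizer, and the choice of CSR (compressed sparse row) arrays are all hypothetical choices made here, not part of the posting. Each document is reduced to a Counter, and its counts are appended to three compact arrays, so only one document's tokens are held in memory at a time.

```python
import re
from array import array
from collections import Counter

# Assumption: a simple lowercase alphanumeric tokenizer stands in for
# whatever preprocessing the real pipeline would use.
TOKEN_RE = re.compile(r"[a-z0-9]+")


def tokenize(text):
    """Split text into lowercase alphanumeric tokens."""
    return TOKEN_RE.findall(text.lower())


def build_csr(docs):
    """Stream documents into CSR arrays, one Counter at a time.

    `docs` can be any iterable of strings, including a generator that
    reads one file at a time, so the full corpus never sits in memory.
    Returns (data, indices, indptr, vocab).
    """
    vocab = {}                  # term -> column index, grown on the fly
    data = array("i")           # term counts (non-zero entries)
    indices = array("i")        # column index of each non-zero entry
    indptr = array("q", [0])    # row boundaries into data/indices

    for text in docs:
        counts = Counter(tokenize(text))
        for term, n in counts.items():
            col = vocab.setdefault(term, len(vocab))
            indices.append(col)
            data.append(n)
        # Close the row for this document.
        indptr.append(len(indices))

    return data, indices, indptr, vocab


if __name__ == "__main__":
    sample = ["the cat sat", "the cat and the dog"]
    data, indices, indptr, vocab = build_csr(sample)
    print(len(indptr) - 1, "documents,", len(vocab), "terms,", len(data), "non-zeros")
```

If SciPy is available (an assumption), the three arrays can be handed to `scipy.sparse.csr_matrix((data, indices, indptr), shape=(len(indptr) - 1, len(vocab)))` to obtain a document-term matrix for topic modeling. Memory then grows with the number of non-zero counts rather than with documents times vocabulary size.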
About the recruiter
Member since Sep 6, 2017
Pallavi Ghosh
from Delhi, India

Open for hiring. Apply before May 26, 2024

Work from Anywhere

40 hrs / week

Hourly Type

Remote Job

$26.86

Cost

Offer to work on this project closes in 27 days!

Similar Projects

Need someone with deepfake GAN knowledge

I need to use an AI-enabled camera and develop a model that trains, learns, and creates deepfakes with a GAN; accuracy needs to be 95%. I will share the remaining details with shortlisted candidates.
Thanks

Help moving existing Scikit-Learn Model to AWS SageMaker

We have a basic neural network model that is already running using Scikit-Learn and Python. We have been working to get off of EC2 machines and migrate to AWS SageMaker. We have very little experience with SageMaker and are looking for someone to help an...

Convert email SAS code to Python

I have a small section of SAS code that needs to be converted to Python.
You don't need to run it; just provide the equivalent Python code.

Scrape 10 websites

Looking to scrape 10 websites. The scrapers must work on a Windows client and must upload/paste data into Google Sheets in near real time.

E2E Speech Recognition Paper Implementation in PyTorch

Looking for an engineer with a strong background in speech recognition pipelines and really good working knowledge of PyTorch.

I'm working on several projects and need an engineer who'll deliver an implementation of the paper STATE-OF...