Required Natural Language Toolkit (NLTK),Python Scikit-Learn,Natural Language Processing,Python,Machine Learning freelancer for Large-Scale Topic Model / Sparse Matrix Construction job
Posted at - Apr 14, 2020
Toogit Instant Connect Enabled
I have a larger corpus (-38K documents) that I am attempting to run topic models over. The problem right now is that creating a corpus via packaged solutions, like Textacy, puts too much strain on the memory for the doc-term-matrix. I believe there are more efficient solutions, perhaps collapsing each document into a Counter object and incrementally feeding it into a sparse matrix, but I am looking for someone with more experience working with larger, data-intensive memory sets and scientific computing to create code that can easily be transported over. I can provide the full text files as a corpus, and am available to discuss the project on an ongoing basis.
About the recuiterMember since Mar 14, 2020 Ollaniran Monsu from Udine, Italy