Remote Data Mining And Management Job In Data Science And Analytics

Python developer needed for a resume parser project

We are looking for a skilled Python developer to assist in the development of an intelligent resume parser for our ATS.

Key requirements for the resume parser include:

- Accurate and efficient parsing of a predetermined set of structured fields based on a resume's main sections and sub-sections

- Extraction of clean, disaggregated, and normalized data for each field, e.g. Bachelor of Arts in Economics --> { degree: Bachelor of Arts, major: Economics }

- Ability to automatically handle documents of multiple formats, including doc, docx and PDF. This includes both text- and image-based documents (using OCR), as well as multi-columned documents

- Output of parsed resume data in a standardized JSON format

- Ability to programmatically test the accuracy of the parser with an existing sample resume dataset for continuous improvement

- Ability to programmatically train the parser with a growing sample resume dataset to continually increase its level of accuracy
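
To illustrate the disaggregation, JSON output, and accuracy-testing requirements above, here is a minimal sketch. The function names (split_degree, field_accuracy), the regular expression, and the JSON key names are assumptions made for illustration only; they are not part of the existing parser, and a real resume dataset would need far more patterns than the single "<degree> in <major>" form handled here.

```python
import json
import re

# Hypothetical pattern for "<degree> in <major>", e.g. "Bachelor of Arts in Economics".
DEGREE_RE = re.compile(r"^\s*(?P<degree>.+?)\s+in\s+(?P<major>.+?)\s*$", re.IGNORECASE)


def split_degree(text):
    """Split a degree string into separate degree and major fields."""
    match = DEGREE_RE.match(text)
    if match is None:
        # No " in " separator found: keep the whole string as the degree.
        return {"degree": text.strip(), "major": None}
    return {"degree": match.group("degree"), "major": match.group("major")}


def field_accuracy(predicted, expected):
    """Fraction of expected fields whose predicted value matches exactly."""
    if not expected:
        return 0.0
    correct = sum(1 for key, value in expected.items() if predicted.get(key) == value)
    return correct / len(expected)


education = split_degree("Bachelor of Arts in Economics")
# Emit the parsed fields as JSON (the "education" key is an assumed schema name).
print(json.dumps({"education": [education]}, indent=2))
```

Running field_accuracy over every resume in a labeled sample set and averaging the scores would give one simple, repeatable accuracy measure; exact-match scoring is only one possible choice.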

Desirable skills and qualifications for the task include:

- Excellent command of the Python programming language

- Solid understanding of and practical experience with document parsing / data extraction

- Experience with natural language processing (NLP) and relevant Python libraries such as NLTK and/or spaCy

Work on the parser has already begun. The current version is able to:

- Load and read a variety of document types with Python's Textract library

- Break the resume into sections based on a data dictionary of common section headings

- Extract basic fields such as name, email, phone number, and skills
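
The section-splitting and basic-field steps listed above could look roughly like the sketch below. Everything in it is an assumption for illustration: the SECTION_HEADINGS dictionary stands in for the project's data dictionary of section headings, and the email and phone patterns are deliberately loose examples rather than the parser's actual rules.

```python
import re

# Hypothetical data dictionary: canonical section name -> common heading variants.
SECTION_HEADINGS = {
    "experience": {"experience", "work experience", "employment history"},
    "education": {"education", "academic background"},
    "skills": {"skills", "technical skills"},
}

EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# Loose phone pattern (an assumption): optional "+", then digits with common separators.
PHONE_RE = re.compile(r"\+?\d[\d ().-]{6,}\d")


def match_heading(line):
    """Return the canonical section name if the line is a known heading, else None."""
    key = line.strip().rstrip(":").lower()
    for section, variants in SECTION_HEADINGS.items():
        if key in variants:
            return section
    return None


def split_sections(text):
    """Group non-empty resume lines under the most recent recognized heading."""
    sections = {"header": []}
    current = "header"
    for line in text.splitlines():
        section = match_heading(line)
        if section is not None:
            current = section
            sections.setdefault(current, [])
        elif line.strip():
            sections[current].append(line.strip())
    return sections


def extract_contacts(text):
    """Return the first email address and phone number found, or None for each."""
    email = EMAIL_RE.search(text)
    phone = PHONE_RE.search(text)
    return {
        "email": email.group(0) if email else None,
        "phone": phone.group(0) if phone else None,
    }


sample = (
    "Jane Doe\n"
    "jane.doe@example.com | +1 (555) 010-9999\n\n"
    "Work Experience\n"
    "Analyst, Acme Corp\n\n"
    "Education:\n"
    "Bachelor of Arts in Economics\n"
)
print(split_sections(sample))
print(extract_contacts(sample))
```

A phone pattern this loose will also match other digit runs such as date ranges, which is one reason rule-based extraction tends to need validation against a labeled sample set.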

Entities such as skills are currently extracted with a keyword search against a local database. However, this method is both time- and resource-intensive, which is why a greater emphasis on machine learning and NLP will be necessary going forward.
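
For reference, a keyword-based skill lookup of the kind described above can be sketched as follows. This is not the project's implementation: the SKILLS set is a hypothetical stand-in for the local database, and checking word n-grams against a set is just one way to do the lookup. It also shows the limits of the approach, since any skill missing from the list or phrased differently goes undetected.

```python
import re

# Stand-in for the local skills database (hypothetical entries, stored lowercase).
SKILLS = {"python", "machine learning", "natural language processing", "sql"}
# Longest skill phrase, in words, so multi-word skills can be matched.
MAX_WORDS = max(len(skill.split()) for skill in SKILLS)


def extract_skills(text):
    """Return known skills found in the text by checking word n-grams against a set."""
    tokens = re.findall(r"[a-z0-9+#]+", text.lower())
    found = set()
    for start in range(len(tokens)):
        for size in range(1, MAX_WORDS + 1):
            candidate = " ".join(tokens[start:start + size])
            if candidate in SKILLS:
                found.add(candidate)
    return sorted(found)


print(extract_skills("Built machine learning pipelines in Python and SQL."))
# ['machine learning', 'python', 'sql']
```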

Keywords: Python, resume parser, Textract, PDFMiner, OCR, natural language processing, NLP, spaCy, NLTK, named entity recognition, NER, data extraction
About the recruiter
Member since Sep 5, 2017
Cooper
from California, United States

Candidate shortlisted and hired

Hiring open till Feb 11, 2021

Work from Anywhere

40 hrs / week

Hourly Type

Remote Job

$19.48

Cost

Similar Projects

GPS TRACKING SYSTEM

Looking for a GPS tracking system developer.
The developer needs to build a server and API for the GPS tracking system tk007 using the protocol we provide.

Back-end Developer (Django)

We are looking for an experienced back-end developer to join our IT team for the duration of a project. You will be responsible for the server side of our web application. If you have excellent programming skills and a passion for developing applications, w...

Azure Python Batch task with input file and parallel compute

I have a Python script that takes too long to run on a single machine. I want the script to run across a batch and point to the storage account for the input file instead of the local root folder of the script, as it does now. This should be som...

Data Mining and Excel Creation-Airlines BD work

Need a data mining expert who can find, country by country, the airlines operating around the globe and the tonnage they carry.

Please apply only if interested.

MeetPlace AI recognition document

Extract the structure of an invoice.
You can use TensorFlow to detect this area.
The recognized areas should be automatically split into several files.
You can use Python or Java.