Remote Data Mining And Management Job In Data Science And Analytics

Write Program To Merge List Of Leads By Finding Similar Company And Contact Names

We have scraped data on buildings in New York City.

Each building can have up to 3 owners and 1 management company (or 4 owners and no management company).

(NYC buildings are expensive, and owners often partner to buy them.)

Each owner can own an indefinite number of buildings (depending on how wealthy they are).

Given that each building is a partnership of numerous owners, new business entities (companies) are created when each building is purchased.

That means that each building has owners (people) as well as a company that owns the building.

That also means that each owner can be associated with numerous different companies.

There are a few many-to-many relationships here (however, there is always only one building).
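One way to picture these relationships is the minimal sketch below. The class names and fields (Contact, Company, Building, address, the ID fields, the link helper) are illustrative assumptions, not the actual layout of the scraped data. The only rule taken from the description is the owner limit: up to 3 owners with a management company, or up to 4 without one.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Contact:
    contact_id: int
    name: str
    company_ids: set = field(default_factory=set)  # many-to-many with Company


@dataclass
class Company:
    company_id: int
    name: str
    contact_ids: set = field(default_factory=set)  # many-to-many with Contact


@dataclass
class Building:
    building_id: int
    address: str  # hypothetical field, for illustration only
    company_id: int
    owner_ids: list = field(default_factory=list)
    management_company: Optional[str] = None

    def __post_init__(self):
        # Up to 3 owners plus a management company, or up to 4 owners alone.
        limit = 3 if self.management_company else 4
        if len(self.owner_ids) > limit:
            raise ValueError(f"building {self.building_id} has too many owners")


def link(contact: Contact, company: Company) -> None:
    """Record that a contact is associated with a company, in both directions."""
    contact.company_ids.add(company.company_id)
    company.contact_ids.add(contact.contact_id)
```

Storing IDs rather than nested objects keeps each of the three lists flat, which should make later merging simpler: merging two companies means rewriting IDs, not moving objects around.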

In addition, an owner sometimes uses the same company to buy 2 buildings. But since we're dealing with scraped data, which is only as good as the person who entered it on the city's platform, there are very often slight differences in spelling between the two company names, or even between the two owner names of a building, making a straight comparison impossible.

(For example, there could be one building owned by The Carlton Group (company name), which is owned by John Marks and Greg Smith, and another building owned by Carlton Group, which is owned by Jonathan Marks and Gregory Smith.)
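A rough sketch of how such near-duplicates might be detected, using only Python's standard library (difflib.SequenceMatcher). The noise-word list and the 0.75 threshold are assumptions chosen for illustration and would need tuning against the real data.

```python
import re
from difflib import SequenceMatcher

# Hypothetical list of words to ignore when comparing names.
NOISE_WORDS = {"the", "llc", "inc", "corp", "co", "ltd"}


def normalize(name: str) -> str:
    """Lowercase, replace punctuation with spaces, and drop noise words."""
    words = re.sub(r"[^a-z0-9 ]", " ", name.lower()).split()
    return " ".join(w for w in words if w not in NOISE_WORDS)


def similarity(a: str, b: str) -> float:
    """Return a ratio in [0, 1]; 1.0 means identical after normalization."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def is_match(a: str, b: str, threshold: float = 0.75) -> bool:
    """Treat two names as the same entity when they are similar enough."""
    return similarity(a, b) >= threshold


print(similarity("The Carlton Group", "Carlton Group"))  # 1.0 after normalization
print(is_match("John Marks", "Jonathan Marks"))
print(is_match("Greg Smith", "Gregory Smith"))
```

Plain string similarity will not catch every nickname (for example "Bill" and "William"), so a lookup table of common name variants may be needed alongside it.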

So far we've been manually comparing the data to look for duplicates.

The goal is to write a program that will merge and then divide all the data into 3 master lists of:

companies
contacts
buildings

so that all similar companies are merged into one company, and all similar contacts are merged into one contact.
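The merging step could be approached by grouping names into clusters, where every name in a cluster is treated as the same company or contact. The sketch below uses a simple union-find over pairwise comparisons. The function names and the 0.75 threshold are assumptions, and the pairwise loop is quadratic, so a large dataset would likely need some pre-grouping first.

```python
from difflib import SequenceMatcher


def similar(a: str, b: str, threshold: float = 0.75) -> bool:
    """Case-insensitive fuzzy comparison of two names."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold


def cluster_names(names, threshold=0.75):
    """Group names so that names linked by similarity share one cluster."""
    parent = list(range(len(names)))

    def find(i):
        # Follow parent links to the cluster root, shortening the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if similar(names[i], names[j], threshold):
                parent[find(j)] = find(i)  # merge the two clusters

    clusters = {}
    for i, name in enumerate(names):
        clusters.setdefault(find(i), []).append(name)
    return list(clusters.values())


companies = ["The Carlton Group", "Carlton Group", "Empire Holdings"]
print(cluster_names(companies))
```

Because clusters are transitive (A matches B, B matches C, so A, B and C merge), a threshold that is too low can chain unrelated names together, which is one more reason to review the output by hand.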

We want the program to include an audit log that shows what the old data was and what the new data is. That will make the manual part easier, since we will only need to look over what the program changed.
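An audit log of this kind might look like the sketch below, which records one row per changed value and writes the rows as CSV. The column names, and the rule of keeping the longest spelling as the merged name, are assumptions for illustration only.

```python
import csv
import io
from dataclasses import dataclass, asdict


@dataclass
class AuditEntry:
    record_type: str  # e.g. "company" or "contact"
    old_value: str
    new_value: str


def merge_with_audit(record_type, cluster):
    """Pick one name for a cluster and log every value that gets replaced."""
    # Assumption: the longest spelling is the most complete one.
    canonical = max(cluster, key=len)
    entries = [
        AuditEntry(record_type, old, canonical)
        for old in cluster
        if old != canonical
    ]
    return canonical, entries


def write_audit_csv(entries, stream):
    """Write audit entries as CSV rows with a header line."""
    writer = csv.DictWriter(
        stream, fieldnames=["record_type", "old_value", "new_value"]
    )
    writer.writeheader()
    for entry in entries:
        writer.writerow(asdict(entry))


canonical, entries = merge_with_audit("contact", ["John Marks", "Jonathan Marks"])
buffer = io.StringIO()
write_audit_csv(entries, buffer)
print(buffer.getvalue())
```

A CSV file opens directly in a spreadsheet, so a reviewer can scan the old and new values side by side and flag any merge that looks wrong.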

The program will allow us to enter different leads at a later time and run them through the same process.
About the recruiter
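For later runs, each incoming lead could be checked against the existing master list before anything new is added. The sketch below uses difflib.get_close_matches from the standard library. The resolve function and the 0.75 cutoff are hypothetical and would need tuning.

```python
from difflib import get_close_matches


def resolve(name, master, cutoff=0.75):
    """Return the matching master name, or add `name` as a new master entry."""
    lookup = {existing.lower(): existing for existing in master}
    hits = get_close_matches(name.lower(), list(lookup), n=1, cutoff=cutoff)
    if hits:
        return lookup[hits[0]]  # reuse the existing master record
    master.append(name)  # no close match: treat as a new entity
    return name


master_companies = ["Carlton Group"]
print(resolve("The Carlton Group", master_companies))
print(resolve("Empire Holdings", master_companies))
print(master_companies)
```

Running new leads through the same matching rules as the first merge keeps the master lists consistent from one batch to the next.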
Member since May 20, 2018
Soumendra Saha
from Zaghwan, Tunisia

Skills & Expertise Required

Data Analytics 

Candidate shortlisted and hired

Hiring open till - Jun 12, 2021

Work from Anywhere

40 hrs / week

Fixed Type

Remote Job

Cost: $347.35

Similar Projects

I need a spreadsheet to determine survey results using ranked choice voting.

I have a Google form survey that asks two questions with 4 and 8 options respectively.

I need someone to create a formula to determine the winners of the survey.

Quantitative Behavioral Research

We need someone local (Salt Lake City, Utah) that has the software program SPSS and that can work on it for quantitative behavioral research.

Filling position ASAP.

Machine Learning Engineer Needed

The project can be an ongoing gig and is divided as below:
1) I want to run a pricing analysis on some data from an e-commerce store. It should basically test and identify the best selling price of the item and suggest changes based on live traction...

Data Scientist/Predictive Analytics

Ability to work with multiple data sets/sources and build predictive models to drive marketing and strategic direction. Develop meaningful dashboards using data visualization tools for executives and managers throughout the organization.

Teach CLV analysis using Tableau Superstore data

As a marketer, I need to understand my customers as best I can, and I need to understand how to create CLV calculations within Tableau so I can inform my business on current trends. I am not able to upload live data and therefore need to use Superstore sa...