Remote Data Mining And Management Job In Data Science And Analytics

Data Cleaning with Trifacta and R


I have a continuous flow of data extracted from school websites that needs to be checked and validated before it is made available to an R analytics platform.

I am looking at using a combination of R (we already have quite a lot of code) and Trifacta. The data sets are small, but they need to be joined together very accurately. The data often contains errors or is missing the fields needed to link records across sources. To fill those gaps, we either retrieve the required data from previously ingested data or ask the schools for it.

The first task in the process is to identify all validity and completeness issues in each data set, followed by implementing a strategy to fix them.

I am seeking a consultant who is familiar with Trifacta and/or R to build a strategy that targets each data source with a series of analyses that locate the issues in the data that is drawn from that source. In total there could be up to 100 sources for which we need to develop recipes in this cleaning and validation stage.

We want to automate the process as much as possible by adding rules/procedures to each recipe until it contains all the steps required for the data from that specific source.
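
As a rough illustration of the per-source recipe idea described above, the sketch below keeps one list of named rules per source and reports every row that fails a rule. It is written in plain Python only to stay self-contained; the same rules could presumably be expressed in R or as Trifacta recipe steps. The source name `school_a`, the fields `school_id` and `enrolment`, and the helper functions are all hypothetical and not taken from the actual data.

```python
from typing import Callable, Dict, List, Tuple

Row = Dict[str, str]
# A rule is (name, predicate); the predicate returns True when the row passes.
Rule = Tuple[str, Callable[[Row], bool]]


def not_blank(field: str) -> Rule:
    """Completeness rule: the field must be present and non-empty."""
    return (f"{field} is present", lambda row: bool((row.get(field) or "").strip()))


def is_whole_number(field: str) -> Rule:
    """Validity rule: if the field has a value, it must consist only of digits."""
    def check(row: Row) -> bool:
        value = (row.get(field) or "").strip()
        return value == "" or value.isdigit()
    return (f"{field} is a whole number", check)


def run_recipe(rows: List[Row], recipe: List[Rule]) -> List[Tuple[int, str]]:
    """Apply every rule to every row; return (row index, failed rule name) pairs."""
    issues = []
    for index, row in enumerate(rows):
        for name, passes in recipe:
            if not passes(row):
                issues.append((index, name))
    return issues


# One recipe per source, keyed by a hypothetical source name.
RECIPES: Dict[str, List[Rule]] = {
    "school_a": [not_blank("school_id"), is_whole_number("enrolment")],
}

# A recipe grows as new kinds of errors are discovered in its source.
RECIPES["school_a"].append(not_blank("enrolment"))

# Made-up sample rows, only to exercise the rules.
sample_rows = [
    {"school_id": "S001", "enrolment": "420"},
    {"school_id": "", "enrolment": "n/a"},
    {"school_id": "S003", "enrolment": ""},
]
issues = run_recipe(sample_rows, RECIPES["school_a"])
for row_index, rule_name in issues:
    print(f"row {row_index}: failed '{rule_name}'")
```

Keeping each rule as a small named check means the issue report says which rule failed on which row, and a new rule can be appended to one source's recipe without touching the others.
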
About the recruiter
Member since May 20, 2018
Beypeople
from Uttar Pradesh, India

Skills & Expertise Required

R 

Candidate shortlisted and hired

Hiring open till - May 13, 2021

Work from Anywhere

40 hrs / week

Hourly Type

Remote Job

$19.47

Cost

Similar Projects

Correlation Analysis

I am looking for a capable individual who can correlate survey data with public company stock performance. The survey data has predictive implications, hence I am trying to link the forecast to the actual stock performance or another meas...

Developing R package for Mendelian randomization and GWAS analysis on cluster

The deliverables are R packages for processing genetic data.

I am looking for someone who has experience in processing genetic data as well as in R packaging.

The person will be a joint author on resulting publicat...

R Markdown - making a PDF report beautiful

I want to make a PDF report generated by an R Markdown template look better.
I have a static PDF version of what I want the report to look like, but I now need to make it dynamic in the R Markdown template.
1. I will share the current pdf fi...

Query YouTube analytics API with R

I need this problem solved: querying the YouTube Analytics API runs into an authentication loop due to user-level vs. brand-level permissions.

Data analyst for statistical analysis and multivariate regression

I am looking for a quantitative social scientist with extensive experience in data analytics, statistics, the R programming language or similar, and multivariate regression.

The ideal person will have a strong quantitative background in data analysis...