Remote Data Mining And Management Job In Data Science And Analytics

Converting JSON or Avro files to Parquet


I need to convert JSON, Avro, or other row-based files in S3 into the Parquet columnar format using an AWS service such as EMR or Glue.

I already have Python code that converts JSON to Parquet, but the process is very manual: it accounts for NULL values in the JSON elements by checking every field/column and filling in a default value wherever there's a NULL.

I am looking for an easier, less manual way of doing this using something like Spark or other similar methods.

Since I am working exclusively on AWS, I am only looking for solutions that use AWS services such as EMR, Glue, or similar.

I am thus looking for someone with experience using AWS EMR, Glue, Python, PySpark, etc.
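For reference, the kind of approach I have in mind could look roughly like the sketch below. It is only an illustration under assumptions, not a finished solution: the function names `make_session` and `convert_to_parquet` are made up for this post, and it assumes the job runs somewhere PySpark is available (for example an EMR cluster). Spark infers the schema when it reads JSON and represents missing or null fields as nulls in nullable columns, and Parquet can store those nulls directly, so per-field default values should not be needed.

```python
def make_session(app_name="json-to-parquet"):
    """Create (or reuse) a SparkSession.

    PySpark is imported lazily so this module can be loaded
    on a machine where PySpark is not installed.
    """
    from pyspark.sql import SparkSession
    return SparkSession.builder.appName(app_name).getOrCreate()


def convert_to_parquet(spark, src, dst, fmt="json"):
    """Read row-based files at `src` and rewrite them as Parquet at `dst`.

    `src` and `dst` are paths such as S3 URIs (the caller supplies them).
    """
    reader = spark.read
    if fmt == "json":
        # Schema is inferred from the data; fields that are absent or
        # null in a record simply become null values in that column.
        df = reader.json(src)
    else:
        # Other formats go through the generic reader, e.g. fmt="avro".
        # Assumption: the matching data source (such as the spark-avro
        # module) is available on the cluster.
        df = reader.format(fmt).load(src)

    # Overwrite any existing output at the destination path.
    df.write.mode("overwrite").parquet(dst)
    return df
```

A call would then be something like `convert_to_parquet(make_session(), src, dst)` with the real input and output locations filled in. By default Spark's JSON reader expects one JSON object per line; files that hold a single multi-line document need the reader's `multiLine` option instead.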

Please note: since this is going to be a learning experience for me, this will be a live session on Skype, Zoom, Google Hangouts, etc., where you code while I watch and you answer any questions I have along the way.

Thus, I will pay in one-hour increments. The initial contract will be for one hour; if we need more time, we can set up another one-hour contract, and so on.

Please only apply if you're OK with all these conditions and have the required experience.
About the recruiter
Member since Jul 6, 2017
Kieran
from Bergamo, Italy

Skills & Expertise Required

Amazon S3, Amazon Web Services, Apache Spark, PySpark, Python

Candidate shortlisted and hired

Hiring open till - Oct 19, 2022

Work from Anywhere

40 hrs / week

Hourly Type

Remote Job

$25.04

Cost

