
Data Science & Analytics / Data Mining & Management freelancer required: data on Twitter accounts for an R&D ML project for the DoD / Navy Research Center

Posted at - Jun 24, 2020


I have 75k Twitter accounts.

I am looking for the following data on each of them.

Category Feature
1 - Profile Commonality between screen name and user name
1 - Profile Creation Date
1 - Profile Description / Bio
1 - Profile Display Name
1 - Profile Is Profile Picture Egg (Yes/No)
1 - Profile Is Profile Picture Human? (Yes/No)
1 - Profile Is Profile Picture Stock Image? (Yes/No)
1 - Profile Number of Sources (mobile, computer, null)
1 - Profile Primary Language
1 - Profile Handle (@name)
1 - Profile Twitter User ID
2 - Bio / Description Does Description have a URL? (Yes/No)
2 - Bio / Description If so, does the description URL have a clone elsewhere?
2 - Bio / Description Average Word Length
2 - Bio / Description Contains URL
2 - Bio / Description Correlation with a NLP Program
2 - Bio / Description Length
2 - Bio / Description Number of Words
2 - Bio / Description Score - ARI (Automated Readability Index)
2 - Bio / Description Score - Coleman Liau index
2 - Bio / Description Score - Dale-Chall Score
2 - Bio / Description Score - Flesch Kincaid Grade level
2 - Bio / Description Score - Flesch Reading Ease
2 - Bio / Description Score - Linsear Write Formula
2 - Bio / Description Score - SMOG
3 - Activity URL Is Shortened? (Yes/No)
3 - Activity # of Posts
3 - Activity # of Retweets
3 - Activity # of Tweeting @'s
3 - Activity % of Tweets Geo-enabled
3 - Activity Ave. # of Hashtags in Tweets
3 - Activity Ave. # of Links in Tweets
3 - Activity Ave. # of Special Characters in Tweets
3 - Activity Ave. # of User Mentions in Tweets
3 - Activity Average duration between a tweet being posted and this user retweeting it, for all retweets (in minutes)
3 - Activity Average duration between a tweet being posted and this user retweeting it, for the 10 fastest retweets (in minutes)
3 - Activity Average duration between a tweet being posted and this user retweeting it, for the 3 fastest retweets (in minutes)
3 - Activity Average Tweets / Day Since Creation Date
3 - Activity Distribution of Tweets Per Hour
3 - Activity Longest No-Tweet Duration (In Days)
3 - Activity Most Compact Number of Tweets per Hour
3 - Activity Number of Languages
3 - Activity Percentage of tweets ending with punctuation, hashtag, or link
3 - Activity Number of Events / Hour Distribution - Standard Deviation
3 - Activity Number of Events / Hour Distribution - Skew
3 - Activity Number of Events / Hour Distribution - Kurtosis
3 - Activity Sentiment Score
3 - Activity Time from Last Tweet (In Days)
3 - Activity # of Followers
3 - Activity # of Following
3 - Activity # of Likes
3 - Activity Category of website Linked to
4 - Similarity Number of known bots followed by a user - a user following several known bots is more likely to be a bot.
4 - Similarity Number/Percentage of bots in the cluster that a user belongs to - if a clustering algorithm places the user in a cluster with many bots, the user is more likely to be a bot.
4 - Similarity PageRank and betweenness centrality of users in both retweet and mention networks
4 - Similarity Similarity of Profile to Known Bots
4 - Similarity Variables related to star and clique networks associated with users
5 - Outcome Is a Bot? (Yes / No)
5 - Outcome Bot Type (Spambots, Paybots, Influence Bots)
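
To make a few of the features above concrete, here is a rough sketch of how some of the Category 2 (bio) and Category 3 (activity) values could be computed. It is only an illustration under stated assumptions, not a required approach: the function names `bio_features` and `hourly_stats` are made up for this example, sentences are counted with a simple punctuation heuristic, and "Number of Events / Hour Distribution" is read here as tweet counts bucketed by hour of day (0 to 23). ARI and Coleman-Liau use their standard published formulas; the syllable-based scores (Flesch, SMOG, etc.) are left out.

```python
import re
from collections import Counter
from datetime import datetime

URL_RE = re.compile(r"https?://\S+")
WORD_RE = re.compile(r"[A-Za-z0-9]+(?:'[A-Za-z0-9]+)*")


def bio_features(bio: str) -> dict:
    """Length, word stats, URL flag, ARI and Coleman-Liau for one bio."""
    text = URL_RE.sub(" ", bio)  # keep URLs out of the word counts
    words = WORD_RE.findall(text)
    # Assumption: a run of '.', '!' or '?' ends a sentence;
    # a bio with no such punctuation counts as one sentence.
    sentences = len(re.findall(r"[.!?]+", text)) or 1
    feats = {
        "length": len(bio),
        "has_url": bool(URL_RE.search(bio)),
        "word_count": len(words),
        "avg_word_length": None,
        "ari": None,
        "coleman_liau": None,
    }
    if words:
        n = len(words)
        chars = sum(c.isalnum() for w in words for c in w)
        letters = sum(c.isalpha() for w in words for c in w)
        feats["avg_word_length"] = chars / n
        # ARI = 4.71 * (chars/words) + 0.5 * (words/sentences) - 21.43
        feats["ari"] = 4.71 * chars / n + 0.5 * n / sentences - 21.43
        # Coleman-Liau = 0.0588 * L - 0.296 * S - 15.8,
        # with L = letters per 100 words, S = sentences per 100 words
        feats["coleman_liau"] = (
            0.0588 * (100 * letters / n) - 0.296 * (100 * sentences / n) - 15.8
        )
    return feats


def hourly_stats(timestamps: list[datetime]) -> dict:
    """Tweets per hour of day, plus std / skew / excess kurtosis of the counts.

    Uses population (biased) moments; skew and kurtosis are None when
    every hour has the same count.
    """
    counts = Counter(ts.hour for ts in timestamps)
    dist = [counts.get(h, 0) for h in range(24)]
    mean = sum(dist) / 24
    m2 = sum((x - mean) ** 2 for x in dist) / 24
    if m2 == 0:
        return {"distribution": dist, "std": 0.0, "skew": None, "kurtosis": None}
    m3 = sum((x - mean) ** 3 for x in dist) / 24
    m4 = sum((x - mean) ** 4 for x in dist) / 24
    return {
        "distribution": dist,
        "std": m2 ** 0.5,
        "skew": m3 / m2 ** 1.5,
        "kurtosis": m4 / m2 ** 2 - 3,
    }
```

Both functions take plain Python values (a bio string, a list of `datetime` objects), so they do not depend on how the account data is actually collected.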

This is important, but not time sensitive. The right data miner / analyst will be given a few weeks to work on the job.


The final output for this is 2-fold.

1) A Google spreadsheet with this info for all ~75k accounts.
2) A web-based tool where I can upload a CSV file or paste in a list of Google IDs or usernames to get this data for the identified accounts.
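
As a rough idea of the input side of the tool in item 2, the sketch below accepts either CSV content or a pasted list and sorts the entries into numeric IDs and usernames. The function name `parse_accounts` is hypothetical, and the sketch makes two assumptions: an all-digit entry is treated as a numeric ID, and a username is 1 to 15 letters, digits, or underscores (Twitter's handle rules). A CSV header row is not detected and would be classified like any other cell.

```python
import csv
import io
import re

# Assumption: handles follow Twitter's rules (letters, digits, underscore, max 15).
HANDLE_RE = re.compile(r"^[A-Za-z0-9_]{1,15}$")


def parse_accounts(raw: str) -> dict:
    """Split uploaded CSV text or a pasted list into IDs, handles, and rejects."""
    ids, handles, rejected = [], [], []
    seen = set()
    for row in csv.reader(io.StringIO(raw)):
        for cell in row:
            token = cell.strip().lstrip("@")
            if not token:
                continue
            if token.isascii() and token.isdigit():
                kind, value, bucket = "id", token, ids
            elif HANDLE_RE.match(token):
                # Handles are case-insensitive, so normalise to lower case.
                kind, value, bucket = "handle", token.lower(), handles
            else:
                rejected.append(cell.strip())
                continue
            if (kind, value) not in seen:  # drop duplicates, keep first-seen order
                seen.add((kind, value))
                bucket.append(value)
    return {"ids": ids, "handles": handles, "rejected": rejected}
```

The rejected list lets the tool report entries it could not interpret instead of silently dropping them.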

I have a strong opinion about what tools you use to build this solution.

About the recruiter
Member since Sep 14, 2017
Lance Hirahon
from Jalisco, Mexico

Skills & Expertise Required

Data Science & Analytics, Data Mining & Management

Candidate shortlisted and hired
Hiring open till - Apr 20, 2021

Work from Anywhere
40 hrs / week
Hourly Type
Remote Job
$25.05
Cost
