We have a database of ~85k tweets collected over a period of 3 days using the Twitter API.
We want to generate an interactive visualization of the retweets happening during this period.
The visualization should be simple and effective and feature the following:
- A set of nodes representing the users participating in the conversation
- The nodes should be sized according to some measure of user importance (e.g., centrality or activity)
- For each retweet, there should be an edge going from the original user to the retweeting user.
- There should be a timeline slider, and the graph should show only the edges for retweets made up to the point in time selected on the slider.
- Next to the graph, there should be a list of the 10 most retweeted tweets
- Clicking on any of those tweets should filter the nodes and edges in the graph to display only the users that tweeted or retweeted the chosen tweet.
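The filtering and sizing logic behind these features could be sketched roughly as follows. This is only an illustration: the `Retweet` record, the function names, and the sample rows are made up for the example, and "number of times a user was retweeted" is just one possible importance measure.

```python
from collections import Counter
from datetime import datetime
from typing import NamedTuple


class Retweet(NamedTuple):
    # Hypothetical in-memory record for one retweet (one graph edge).
    origin_user_id: str
    retweeting_user_id: str
    tweet_id: str
    retweeted_at: datetime


def edges_until(retweets, cutoff):
    """Keep only retweets made at or before the slider position."""
    return [r for r in retweets if r.retweeted_at <= cutoff]


def edges_for_tweet(retweets, tweet_id):
    """Keep only retweets of the tweet chosen in the top-10 list."""
    return [r for r in retweets if r.tweet_id == tweet_id]


def node_sizes(retweets):
    """Size each user by how often they were retweeted (assumed measure).

    Users who only retweeted others still appear, with size 0.
    """
    sizes = Counter(r.origin_user_id for r in retweets)
    for r in retweets:
        sizes.setdefault(r.retweeting_user_id, 0)
    return dict(sizes)


def top_tweets(retweets, n=10):
    """Return (tweet_id, retweet_count) pairs, most retweeted first."""
    return Counter(r.tweet_id for r in retweets).most_common(n)


# Made-up sample data, for illustration only.
sample = [
    Retweet("u1", "u2", "t1", datetime(2020, 3, 14, 10, 0)),
    Retweet("u1", "u3", "t1", datetime(2020, 3, 14, 11, 0)),
    Retweet("u2", "u3", "t2", datetime(2020, 3, 15, 9, 0)),
]
visible = edges_until(sample, datetime(2020, 3, 14, 12, 0))
sizes = node_sizes(visible)
```

Moving the slider would then amount to calling `edges_until` with a new cutoff and redrawing, and clicking a tweet would apply `edges_for_tweet` on top of that.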
There are no restrictions on the format or tools to be used for this task.
It can be a Gephi visualization, a Jupyter notebook, or any other format that you find convenient.
Data will be provided in two CSV files:
- tweets.csv: contains one row per original tweet in the dataset, with the following columns:
user_id, tweet_id, full_text, datetime_of_creation
- retweets.csv: contains one row per retweet, with the following columns:
origin_user_id, retweeting_user_id, tweet_id, datetime_of_retweet
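As a starting point, the files might be read along these lines. This sketch assumes that each CSV has a header row with exactly the column names listed above and that timestamps are in an ISO 8601-like form such as `2020-03-14 10:00:00`; neither is confirmed here. The `load_rows` helper and the sample rows are hypothetical.

```python
import csv
import io
from datetime import datetime


def load_rows(handle, time_column):
    """Read CSV rows as dicts, parsing `time_column` into a datetime.

    IDs are left as strings so that large tweet IDs are never rounded.
    """
    rows = []
    for row in csv.DictReader(handle):
        # Assumes an ISO 8601-like timestamp; adjust if the real format differs.
        row[time_column] = datetime.fromisoformat(row[time_column])
        rows.append(row)
    return rows


# Tiny made-up stand-in for retweets.csv, for illustration only.
sample_csv = io.StringIO(
    "origin_user_id,retweeting_user_id,tweet_id,datetime_of_retweet\n"
    "u1,u2,t1,2020-03-14 10:00:00\n"
    "u1,u3,t1,2020-03-14 11:30:00\n"
)
retweets = load_rows(sample_csv, "datetime_of_retweet")
```

With the real files, the same helper would be called on an opened file, e.g. `open("retweets.csv", newline="", encoding="utf-8")`, and on tweets.csv with `"datetime_of_creation"` as the time column.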
About the recruiter: Ketan Patel, from Jiangxi, China. Member since Mar 14, 2020.