-
Notifications
You must be signed in to change notification settings - Fork 0
Home
Gabrielle Agrocostea edited this page Mar 14, 2016
·
3 revisions
Welcome to the twitter_models wiki!
The goal of this project is to build a linear model to predict twitter impressions based on followers and retweets.
1st attempt:
- subset data to only 28 properties
- Jan 2016 Columns available: time, impressions, retweets, page_id, twitter_id, fans
Properties subset:
Twitter_id | Twitter_name |
---|---|
'30309979' | '106andpark' |
'18342955' | 'abc11_wtvd' |
'16374678' | 'abc7' |
'119606058' | 'aquiyahorashow' |
'16560657' | 'bet' |
'226299107' | 'betnews' |
'9695312' | 'billboard' |
'32448740' | 'brueggers' |
'1426645165' | 'bustle' |
'21308602' | 'cartoonnetwork' |
'759251' | 'cnn' |
'73200694' | 'coach' |
'634784951' | 'dasaniwater' |
'14946736' | 'directv' |
'27677483' | 'essencemag' |
'25053299' | 'fortunemagazine' |
'436171805' | 'fusionpop' |
'25453312' | 'hallmarkchannel' |
'14934818' | 'instyle' |
'192981351' | 'landroverusa' |
'2367911' | 'mtv' |
'19426551' | 'nfl' |
'25589776' | 'peoplemag' |
'223525053' | 'ringlingbros' |
'5988062' | 'theeconomist' |
'14293310' | 'time' |
'40924038' | 'univision' |
'15513910' | 'valvoline' |
Data Cleaning: Removed outliers in quantiles: .9 and .1
Modeling Training for all models - 90% of data Testing - 10% of data
Model 1 information
- Trained on all 27 properties
- alpha used: 0.1
- coefficients: array([ 8.50609133e-03, 2.16934297e+02])
Model 2 information
- Trained on smaller subset of properties
- alpha tuned using cross validation:
- coefficients: