GitHub - blehman/fake_data_4D3: Visualizing: Feature Set + Label

#Fake Data

This data set is available for visualization practice because it presents some interesting challenges. The clusterID in the last column of line_counts_clusterID.csv aligns with the cluster results; however, the results do not contain all of the same keys.

Huge thanks to @zanstrong for the drawings and brainstorming! Also, thanks to @s1nelson, @micahstubbs, @shirleyxywu and @enjalot for all of the thinking with me!

###Fake Data Structure

line_counts_clusterID.csv - the last column is the cluster label. The preceding 14 columns are boolean values that represent the presence of each feature.
cluster#_results - additional information for each cluster that provides details on the following:

Gender
Language
Interests
TV Genre
TV Show
Location: Country
Location: Region
Location: Metro
Device Category
Device Wireless Network
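
The row layout of line_counts_clusterID.csv described above (14 boolean columns followed by a cluster label) could be read with a sketch like the one below. The sample rows are made up, and the 0/1 encoding, comma delimiter, and absence of a header row are all assumptions; check the actual file before relying on them. `parse_rows` and `SAMPLE` are hypothetical names used only for illustration.

```python
import csv
import io

# Hypothetical sample rows: 14 feature flags followed by a cluster label.
# The 0/1 encoding, comma delimiter, and missing header row are assumptions.
SAMPLE = """\
1,0,0,1,0,0,0,1,0,0,0,0,1,0,3
0,1,0,0,0,0,1,0,0,0,0,0,0,1,7
"""

N_FEATURES = 14


def parse_rows(handle):
    """Yield (features, cluster_id) pairs from a CSV file-like object."""
    for row in csv.reader(handle):
        if not row:
            continue  # skip blank lines
        if len(row) != N_FEATURES + 1:
            raise ValueError(
                f"expected {N_FEATURES + 1} columns, got {len(row)}"
            )
        # Treat the string "1" as True and anything else as False (assumption).
        features = [cell.strip() == "1" for cell in row[:N_FEATURES]]
        cluster_id = row[-1].strip()
        yield features, cluster_id


rows = list(parse_rows(io.StringIO(SAMPLE)))
```

To use the real file, replace `io.StringIO(SAMPLE)` with an open file handle and adjust the boolean parsing to match whatever encoding the file actually uses.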

###Fake Data Visualization & Analysis

What is possible?

Analysis:

Multinomial Logistic Regression (thanks @s1nelson)
t-SNE (thanks @micahstubbs)

Visualization:

% presence of each feature per label.

Heatmap (thanks @zanstrong)
- click merge on feature selection
- option to remove feature
- separation between heatmap rows to indicate distinctions
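
The values behind such a heatmap (percent presence of each feature per label) could be computed roughly as follows. This is a minimal sketch, not the repository's code: `percent_presence` is a hypothetical helper, and the toy rows use 3 features instead of 14 for brevity.

```python
from collections import defaultdict


def percent_presence(rows, n_features):
    """Return {label: [percent of rows where feature i is present, ...]}."""
    totals = defaultdict(int)                        # rows seen per label
    present = defaultdict(lambda: [0] * n_features)  # per-label feature counts
    for features, label in rows:
        totals[label] += 1
        for i, flag in enumerate(features):
            if flag:
                present[label][i] += 1
    return {
        label: [100.0 * count / totals[label] for count in present[label]]
        for label in totals
    }


# Hypothetical toy data: (feature flags, label) pairs.
toy_rows = [
    ([True, False, True], "a"),
    ([True, True, False], "a"),
    ([False, False, True], "b"),
]
heatmap = percent_presence(toy_rows, n_features=3)
```

Each label becomes one heatmap row and each list position one column, so the result can be handed to whatever plotting tool draws the grid.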

How do the users arrive at the label?

Node path ()
- Similar to a decision tree to showcase paths by which tweets receive a label.
Parallel Coordinates (thanks @shirleyxywu and @enjalot)
- Turn on/off labels
- Foreground/Background colors instead of 14 different colors
- Opacity for selected labels
- Randomly arrange the feature ordering.
Crossfilter (thanks @enjalot).
The @zanstrong technique:

Difference from the mean count of each item per each cluster.
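
One possible reading of this technique: for each item (for example a TV genre), take its count in every cluster, average those counts across clusters, and report each cluster's deviation from that average. The sketch below follows that interpretation, which is an assumption; `difference_from_mean` and the toy counts are hypothetical and do not come from the repository's data.

```python
def difference_from_mean(counts):
    """Map {cluster: {item: count}} to {cluster: {item: count - mean}}.

    The mean for an item is taken across all clusters, and an item
    missing from a cluster is treated as a count of 0 (assumption).
    """
    clusters = list(counts)
    items = sorted({item for per_cluster in counts.values() for item in per_cluster})
    means = {
        item: sum(counts[c].get(item, 0) for c in clusters) / len(clusters)
        for item in items
    }
    return {
        c: {item: counts[c].get(item, 0) - means[item] for item in items}
        for c in clusters
    }


# Hypothetical toy counts for three clusters.
toy_counts = {
    "cluster1": {"drama": 6, "comedy": 2},
    "cluster2": {"drama": 2, "comedy": 4},
    "cluster3": {"drama": 1},
}
diffs = difference_from_mean(toy_counts)
```

Positive values mark items that are over-represented in a cluster relative to the average cluster, and negative values mark under-represented ones.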
