Solid questions/ things we want to have in the end #17

jwzimmer-zz · 2020-10-30T17:06:11Z

Now that we've got all this info, what do we want to do with it? Let's lay out the options, then run them by Prof Cheney for feedback before we put too much time into actually analyzing things.

Things we could do (no bad ideas!):

(1) We have tropes the community has manually listed as super, sub, and sister tropes. We could use this as a training and validation set, and see if a model can predict labels for the other trope nodes that haven't been explicitly labeled by the community as super/ sub/ sister. I am not sure with just edges and directions we have enough information for the model to learn anything meaningful from this - maaaybe if they also did some parsing of the names themselves?
- What would this get us? Well, if we also have the structure inferred by a structure detection algorithm as in (2), we could compare how the community consciously groups tropes vs. how they end up grouping tropes when they write about them. This might be interesting if it illuminates what features are used explicitly vs. implicitly. -> Who cares? Application to news?
- We could do this anyway, without the ML part from (1), if we just compared the networks made explicitly by the community (super tropes, sub tropes, sister tropes) vs. what structure is detected from the 27000 tropes (2) qualitatively?
(2) We have 27000 tropes and what they link to. We could use structure detection algorithms to see what clusters and hierarchy exist.
- What would that get us? Well, the way I described how meaning works in the project proposal would imply a lot of trope creation based on connections between other tropes. What would that mean in the graph - loops? High degree? (Since I think meaning comes from connection between things, there should be a lot more edges than nodes.)
(3) How close is the network from (2) to a small-world/ regular/ random graph? Does it fall into the pattern of most real-world networks (small world ish)?
- Who cares? A little loosey-goosey, but anything that shows similarity between formations from the human brain and external systems supports the idea that human brains are expressions of the natural world just as much as anything else is - kind of goes against thinking of sentience as something spiritual.
(4) What's the average shortest path like?
- What's the point? Kind of reflects how closely related the tropes are to each other. Are there some distinct categories? Are they all very tightly related?
(5) What are the features, if any, which make connection or clustering more likely?
- For example, maybe a trope centrally involving gender is likely to be connected to an equivalent involving the "other" gender (going along with binary gender because most of these tropes are contemporaneous with that concept).
(6) Are things we think of as genres reflected in the structure?
(7) It could be the case that everything is connected to everything else, more or less. In that case, we won't be able to see interesting patterns easily. And it seems plausible - creativity lets us connect anything to anything else we want to. So maybe the more solid connections that reflect universally agreed-upon similarity are those which occur multiple times, which we can look at by incrementing the weight of an edge every time it occurs and looking at edges above some threshold.
- These links will be the ones the community has repeatedly identified in different contexts... I think that's a reasonable indication of robustness.
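
A minimal sketch of the edge-weighting idea in (7), assuming we have a flat list of (source, target) link occurrences scraped from trope pages. The function names, the toy trope names, and the choice to treat links as undirected are all assumptions made for illustration, not something we have settled on.

```python
from collections import Counter

def weighted_edges(link_occurrences):
    """Count how many times each trope-trope link occurs (undirected, by assumption)."""
    weights = Counter()
    for source, target in link_occurrences:
        if source == target:
            continue  # ignore self-links
        # Sort the pair so (a, b) and (b, a) count as the same edge.
        weights[tuple(sorted((source, target)))] += 1
    return weights

def strong_edges(weights, threshold):
    """Keep only the edges seen at least `threshold` times."""
    return {edge: w for edge, w in weights.items() if w >= threshold}

# Hypothetical link occurrences, one entry per place a link appears.
links = [
    ("ChekhovsGun", "Foreshadowing"),
    ("Foreshadowing", "ChekhovsGun"),
    ("ChekhovsGun", "Foreshadowing"),
    ("ChekhovsGun", "RedHerring"),
]
weights = weighted_edges(links)
repeated = strong_edges(weights, threshold=2)
print(repeated)
```

The threshold is a free parameter; it probably makes sense to look at the distribution of weights before picking one.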

Advice on visualization from Jane Adams:

Have you looked at it in dendrogram form? Clustering might be a good approach, then you can examine the clusters as one meta-network or a bunch of small networks
This approach always seemed cool to me, though I haven't used it myself yet https://youtu.be/7G3MxyOcHKQ
NodeTrix: A Hybrid Visualization of Social Networks
NetworkX is great for having a lot of built-in functions for clustering and centrality measures https://networkx.org/documentation/stable/reference/algorithms/clustering.html
Also the backbone method is super cool for pulling out the 'most significant' edges in a weighted network (though if you're working with the tvtropes dataset I'm guessing those are unweighted) https://www.pnas.org/content/106/16/6483 if this is a method you do end up using let me know bc I have some other related resources like a python implementation of the algorithm
This is a great resource for network visualization too - biofabric might be a good route, also listed there https://vdl.sci.utah.edu/mvnv/wizard/

jwzimmer-zz · 2020-11-02T16:02:00Z

Other questions:

link prediction in our network = editing a trope page... but does it also say anything about new trope creation? how often are the pages edited? maybe they are already linked to approximately every existing trope the community will ever agree they "should" be linked to?
tropes are shorthand for more complicated stories and context
maybe compare to other wikis? e.g. https://dumps.wikimedia.org/
- http://konect.cc/networks/dbpedia-writer
- or other fictional networks from media: https://arxiv.org/abs/cond-mat/0202174

Other questions (not that fleshed out, I thought I'd write them up nicely but instead ended up copy-pasting messy things from the colab notebook for time!):

Which nodes/tropes get linked to the most? Is there a set of nodes that act as "master" nodes, i.e. the trope of tropes, which I think was one of our main questions in the proposal. How many are there? If exists, does the structure make sense?
are there loops in the network? can we follow a path from root to leaf? do the paths convey anything? it might be cool to randomly sample the network for the sake of visualization. jitter? all roads lead to rome? small world? is it self similar? does it have fractal dimension? what if we look at the connections like syntax and try to carry them all the way out to the individual word in an article - qualitatively, a few, not a lot b/c v exploratory - how much information do you gain from each level in the tree? when you know a leaf trope do you know the path to it?
Random subset samples, Random walks, Jitter, Pick random pairs, subgraphs, are they connected? What's the average shortest path between two random nodes?
Meeting notes 11/1:
generate names for tropes (justification??)
clustering
ask prof cheney about tools, summary statistics (size worry)
how society thinks about topics
stereotypes
psychology
offensive
what are taboos
gender roles
romance, relationships
family structure
race
compare categories of tropes like romance in tv vs. real life
sentiment analysis?
- even just are tropes happy/ sad/ what distribution
- different clusters
symmetry?
the 6 story arcs?
are happy tropes connected to happy tropes?
trope titles - happy or sad using hedonometer word list
classifier - take transcript and output what tropes it thinks are in it
plots for the tropes?
network for indices vs. clustering detection algorithm
which indices? just the ones that include individual tropes
can ask jane about hedonometer sentiment words lists
sentiment of the indices vs. the clusters
centrality - which nodes? sentiment?
how in sync with modern psych is tv? what expertise does tv reference and when is it from?
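
For the "pick random pairs, are they connected, what's the average shortest path" idea above, here is a rough sketch that estimates both by sampling rather than computing all pairs, which may matter at ~27000 nodes. It assumes the network is stored as a dict from each trope to its neighbors; the function names and the toy graph are hypothetical. It uses only the standard library; NetworkX offers `average_shortest_path_length` for the exact value on a connected graph.

```python
import random
from collections import deque

def bfs_distance(adjacency, start, goal):
    """Return the hop count from start to goal, or None if goal is unreachable."""
    if start == goal:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbor in adjacency.get(node, ()):
            if neighbor == goal:
                return dist + 1
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, dist + 1))
    return None

def sampled_path_stats(adjacency, n_pairs, seed=0):
    """Estimate the mean shortest path and the fraction of disconnected pairs."""
    rng = random.Random(seed)
    nodes = sorted(adjacency)
    lengths, disconnected = [], 0
    for _ in range(n_pairs):
        a, b = rng.sample(nodes, 2)  # two distinct random nodes
        d = bfs_distance(adjacency, a, b)
        if d is None:
            disconnected += 1
        else:
            lengths.append(d)
    mean = sum(lengths) / len(lengths) if lengths else None
    return mean, disconnected / n_pairs

# Toy graph: a chain A-B-C-D plus an isolated node E.
toy = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"], "E": []}
mean_length, frac_disconnected = sampled_path_stats(toy, n_pairs=200)
print(mean_length, frac_disconnected)
```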

jwzimmer-zz · 2020-11-02T16:32:17Z

Top priority (per discussion with @nguyenhphilip on 11/1):

(1) describe and visualize the network of 27000 individual tropes: good overview of what we want to do here: https://bigdata.unl.edu/documents/ASA_Workshop_Materials/Tutorial%20Statistical%20Analysis%20of%20Network%20Data.pdf
- network statistics (degree distribution, does it follow power law, how close is it to random, etc.)
- weighted edge network (based on number of connections) for manageable visualization
- cluster/ structure detection algorithms
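
A possible starting point for the network statistics bullet, assuming an undirected graph stored as a dict of neighbor sets (the names and toy graph here are made up). It tabulates the degree distribution and compares the average clustering coefficient with the value expected for a random (Erdos-Renyi) graph with the same number of nodes and edges, which is roughly the edge density p = 2m / (n(n-1)). Clustering far above that baseline is one common hint of small-world-like structure; it is not a full test.

```python
from collections import Counter
from itertools import combinations

def degree_distribution(adjacency):
    """Map degree -> number of nodes with that degree."""
    return Counter(len(neighbors) for neighbors in adjacency.values())

def average_clustering(adjacency):
    """Mean local clustering coefficient; nodes with degree < 2 count as 0."""
    total = 0.0
    for neighbors in adjacency.values():
        k = len(neighbors)
        if k < 2:
            continue
        # Count how many pairs of this node's neighbors are themselves linked.
        links = sum(1 for u, v in combinations(neighbors, 2) if v in adjacency[u])
        total += 2 * links / (k * (k - 1))
    return total / len(adjacency)

def random_baseline_clustering(adjacency):
    """Edge density p = 2m / (n(n-1)), the expected clustering of a random graph."""
    n = len(adjacency)
    m = sum(len(neighbors) for neighbors in adjacency.values()) / 2
    return 2 * m / (n * (n - 1))

# Toy graph: a triangle A-B-C with D hanging off C.
toy_graph = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C"},
}
print(degree_distribution(toy_graph))
print(average_clustering(toy_graph), random_baseline_clustering(toy_graph))
```

Checking for a power law would take more than this (for example, plotting the degree counts on log-log axes and fitting), so this only covers the descriptive part.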

Subsequent priorities:

(2) describe and visualize the network of individual tropes that are also organized by the indices, for comparison with the network mentioned above
(3) some sentiment analysis of trope titles (based on the happy/ sad ratings of the words in the title)
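
One way the title sentiment idea in (3) could work, as a sketch: split a CamelCase trope title into words and average whatever per-word happiness ratings we have. The assumptions here are that titles are CamelCase, that ratings come as a word -> score dict, and that unrated words are simply skipped. The scores below are invented for the example and are not real hedonometer values.

```python
import re

def split_title(title):
    """Split a CamelCase trope title into lowercase words."""
    pattern = r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+"
    return [word.lower() for word in re.findall(pattern, title)]

def title_happiness(title, word_scores):
    """Average the scores of rated words in the title; None if no word is rated."""
    scores = [word_scores[w] for w in split_title(title) if w in word_scores]
    return sum(scores) / len(scores) if scores else None

# Made-up scores purely for illustration.
toy_scores = {"happily": 7.5, "ever": 6.0, "after": 5.0, "kill": 1.5, "all": 5.5}

print(split_title("HappilyEverAfter"))
print(title_happiness("HappilyEverAfter", toy_scores))
print(title_happiness("KillEmAll", toy_scores))
```

Titles are short, so many will have few or no rated words; tracking how many titles return None would tell us whether this is viable.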

Stretch goals:

(4) machine learning to generate trope names and clusters for comparison

jwzimmer-zz · 2020-11-02T16:44:22Z

Visualization ideas/ thoughts:

For visualizing the network of all tropes post-weighting and post-clustering, and for comparing that to the clusters from the indices network, a chart like this might be good: https://vdl.sci.utah.edu/mvnv/techniques/sunburst/

jwzimmer-zz · 2020-11-02T18:18:44Z

Ideas from talking with Juniper Lovato:

stereotypes - gender
character network?
gender to being protagonist?
look at edit history over time for some smallllll subset of tropes
look at existing research on fandom? could be good for comparison over time within some article
wokeometer (from Ale)
offensiveness on website - skew, past problematic categories, flame bait
is it an open wiki?
research on edit histories on wikipedia - do the controversial tropes on tvtropes line up with these? simon dedeo, brian keegan (spelling, sorry)
- usernames, suggested edits, reverts - how controversial

nguyenhphilip · 2020-11-02T19:30:00Z

Some random ideas i had while thinking about why studying tropes is relevant:

tropes are particular abstractions with generally agreed upon meanings used to convey and facilitate the communication of ideas within the structure of some larger/collective narrative
- Modeling society in this way can allow us to gain insight into our own present and future thought processes and behavior if we assume that individuals and groups of people act in ways that correlate with specific tropes
- Can we classify high profile individuals as fitting a specific trope? What higher-level, organizing meta tropes do these specific tropes fall under? How do they interact with other tropes? What implications might this have?
  - this feels similar to psychological profiling / a personality test, except that tropes are built on stories and may depict a more obvious relationship of how the tropified individual may influence the system they are in.
Can we predict the future state of a system based on the actors (individual tropes) and the structures (meta tropes) that define it?
- do stories defined by similar sets of meta and individual tropes follow the same developmental arc i.e. have the same or similar outcomes?
- If not, why?
  - If everything points to being doomed, what might steer the trajectory in a different direction, despite evidence to the contrary?
  - luck? randomness? a lone hero or rebellious group of individuals? the presence of lesser-known, but highly influential actors?
  - things we've yet to tropify or are untropifiable?
- What are the most important components of this story?
are more popular tropes more convincing? Are they better representations of the phenomena they depict?
- are stories that consist of these tropes more effective in persuading/shaping public opinion? (e.g. part of political campaigning seems to be about creating a story that people feel they can get behind)

jwzimmer-zz · 2020-11-02T20:46:12Z

Notes from talking with @janeadams and @nguyenhphilip:

https://dhs.stanford.edu/social-media-literacy/tvtropes-pt-1-the-weird-geometry-of-the-internet/
once we have clusters, look only for inter-cluster links (exclude intra-cluster) - without category linking
- then look only at intra-cluster links - within-category linking
https://hedonometer.org/words/labMT-en-v1/
heatmap with matplotlib and/ or seaborn
adjacency matrix
scipy to create hierarchical dendrogram, then generate heatmap
maybe do something to shorten labels
do stats first
do the weighting thing
- sparsify?
onion decomp https://github.com/junipertcy/onion_decomposition

overall - to do:

do the index thing with in-group, out-group connections
weighted thing
summary stats
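
A sketch of the "index thing with in-group, out-group connections", assuming each trope is assigned to exactly one index or cluster (a simplification, since a trope can plausibly appear in several indexes). It splits links into within-cluster and between-cluster, then counts between-cluster links per cluster pair, which would give the weights for a cluster-level network. All names and data are hypothetical.

```python
from collections import Counter

def split_by_cluster(edges, cluster_of):
    """Separate edges into intra-cluster and inter-cluster lists."""
    intra, inter = [], []
    for u, v in edges:
        if u not in cluster_of or v not in cluster_of:
            continue  # skip tropes with no cluster assignment
        if cluster_of[u] == cluster_of[v]:
            intra.append((u, v))
        else:
            inter.append((u, v))
    return intra, inter

def cluster_level_weights(inter_edges, cluster_of):
    """Count links between each unordered pair of clusters."""
    weights = Counter()
    for u, v in inter_edges:
        pair = tuple(sorted((cluster_of[u], cluster_of[v])))
        weights[pair] += 1
    return weights

# Toy assignment and links; "X" has no cluster and is ignored.
cluster_of = {"A": "romance", "B": "romance", "C": "war", "D": "war"}
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D"), ("A", "X")]

intra, inter = split_by_cluster(edges, cluster_of)
index_weights = cluster_level_weights(inter, cluster_of)
print(intra, inter, index_weights)
```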

=== Links from @janeadams ===
you could probably create (or ask Melissa to create) a #p_tvtropes channel here, or ask peter to add Phil to the compstorylab slack — might be easier to pull other people (eg laurent) into the project on an as-needed basis / share charts with curious folks

here’s a link dump:
Nadieh Bremer’s “Why do cats and dogs?” https://whydocatsanddogs.com/cats
here’s the design process for that viz: https://www.visualcinnamon.com/2019/04/designing-google-cats-and-dogs
happiness scores from hedonometer: https://hedonometer.org/api.html
networkx clustering: https://networkx.org/documentation/stable//reference/algorithms/generated/networkx.algorithms.cluster.clustering.html
my acm iui paper using backbone method: https://www.overleaf.com/project/5f778826c6077c00013f5499
python implementation of backbone method: https://github.com/aekpalakorn/python-backbone-network
graphgen tool for sql (have not used): https://medium.com/district-data-labs/graph-analytics-over-relational-datasets-with-python-89fb14587f07
laurent’s onion decomposition: https://arxiv.org/pdf/1510.08542.pdf
someone else’s python implementation of onion decomp: https://github.com/junipertcy/onion_decomposition
networkx + plotly to create interactive network graph with nodes colored by [centrality, # of connections]: https://plotly.com/python/network-graphs/

and we talked about:
creating a network graph where nodes are user-generated indexes and links are weighted by the total number of connections between all tropes in each index cluster
creating adjacency matrices or heatmaps of within-index connections, trope-to-trope, with edges weighted by the number of times each trope-trope connection occurs
network sparsification to include only: tropes that co-occur often; tropes with a minimum centrality measure (like the stanford demo here https://dhs.stanford.edu/social-media-literacy/tvtropes-pt-1-the-weird-geometry-of-the-internet/), or some other sparsification method (e.g. backbone method https://arxiv.org/abs/0904.2389 or onion decomposition)
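
For reference, a rough sketch of how the backbone (disparity filter) method mentioned above could be applied to a weighted, undirected edge dict. As I understand the method, an edge of weight w at a node with degree k and total strength s gets the score (1 - w/s)^(k-1), and it is kept if that score is below a chosen alpha at either endpoint. The function name, the handling of degree-1 nodes, and the default alpha are assumptions of this sketch; the linked Python implementation may differ in details.

```python
from collections import defaultdict

def disparity_backbone(weights, alpha=0.05):
    """Filter a dict of (u, v) -> positive weight down to its significant edges."""
    strength = defaultdict(float)
    degree = defaultdict(int)
    for (u, v), w in weights.items():
        for node in (u, v):
            strength[node] += w
            degree[node] += 1

    def score(node, w):
        k = degree[node]
        if k < 2:
            return 1.0  # a node's only edge cannot be judged from that side
        return (1 - w / strength[node]) ** (k - 1)

    # Keep an edge if it is significant from the viewpoint of either endpoint.
    return {
        (u, v): w
        for (u, v), w in weights.items()
        if min(score(u, w), score(v, w)) < alpha
    }

# Toy star: hub H has one heavy link and four light ones.
toy_weights = {
    ("H", "A"): 100,
    ("H", "B"): 1,
    ("H", "C"): 1,
    ("H", "D"): 1,
    ("H", "E"): 1,
}
backbone = disparity_backbone(toy_weights)
print(backbone)
```

This only helps once edges carry weights, so it depends on the weighting step being done first.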

jwzimmer-zz · 2020-11-02T21:43:52Z

Want to make sure we're not retreading what has already been done in the dhs.stanford.edu article series...

they include other language namespaces, and I don't think we want to (https://dhs.stanford.edu/social-media-literacy/tvtropes-pt-2-trope-but-not-troper-communities/)
they include links to works, which... maybe we should, maybe not. we aren't now because we're looking at things the community itself made, whereas the lists of works are references to things with definitions outside tv tropes... but on the other hand, every work listed there was listed by the community, and tropes aren't supposed to be unique to the tv tropes website either...
let's see if we get the same clusters??? https://dhs.stanford.edu/social-media-literacy/tvtropes-pt-2-trope-but-not-troper-communities/
- they have lists from their clusters:
- Maybe we could use those lists ^ but exclude things in non-Main namespaces?
they look at works related by what tropes they have in common - https://dhs.stanford.edu/algorithmic-literacy/tv-tropes-pt-3-if-you-liked-dwarf-fortress-youll-love-twilight-breaking-dawn/... we weren't going to do anything like that, but maybe do care about the works rather than ignoring that section completely?

jwzimmer-zz · 2020-11-07T22:34:14Z

Seeing if we get the same categories as in https://github.com/jwzimmer/tv-tropes/tree/main/Stanford_Neighborhoods would be interesting, especially since:

we can see what effect excluding "Works" has (since they used it and we aren't)
we can try several community detection algorithms for comparison: What community detection methods do we care about? #21
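
To check whether we "get the same categories", one simple option is to match each of our clusters to the reference cluster it overlaps most, using Jaccard similarity (intersection size over union size). This is a sketch under the assumption that both sets of clusters are available as name -> set-of-tropes dicts; the cluster names and members below are invented.

```python
def jaccard(a, b):
    """Jaccard similarity of two collections, 0.0 if both are empty."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def best_matches(ours, theirs):
    """For each of our clusters, find the reference cluster with the highest overlap."""
    result = {}
    for name, members in ours.items():
        best = max(theirs, key=lambda other: jaccard(members, theirs[other]))
        result[name] = (best, jaccard(members, theirs[best]))
    return result

# Hypothetical clusters from our detection run and from a reference list.
ours = {"c0": {"A", "B", "C"}, "c1": {"D", "E"}}
theirs = {"romance": {"A", "B"}, "war": {"C", "D", "E"}}

matches = best_matches(ours, theirs)
print(matches)
```

Running this once per community detection algorithm would give a rough side-by-side of how closely each one reproduces the reference categories.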

jwzimmer-zz · 2020-11-14T18:43:34Z

(Closed #7 because the topic there has been subsumed by this issue.)

jwzimmer-zz mentioned this issue Nov 2, 2020

Index page dicts - links to masterlist tropes only #18

Closed
