Solid questions/ things we want to have in the end #17

Open
jwzimmer-zz opened this issue Oct 30, 2020 · 9 comments

Comments

@jwzimmer-zz
Owner

jwzimmer-zz commented Oct 30, 2020

Now that we've got all this info, what do we want to do with it? Let's lay out the options, then run them by Prof Cheney for feedback before we put too much time into actually analyzing things.

Things we could do (no bad ideas!):

  • (1) We have tropes the community has manually listed as super, sub, and sister tropes. We could use this as a training and validation set, and see if a model can predict labels for the other trope nodes that haven't been explicitly labeled by the community as super/ sub/ sister. I am not sure with just edges and directions we have enough information for the model to learn anything meaningful from this - maaaybe if they also did some parsing of the names themselves?
    • What would this get us? Well, if we also have the structure inferred by a structure detection algorithm as in (2), we could compare how the community consciously groups tropes vs. how they end up grouping tropes when they write about them. This might be interesting if it illuminates what features are used explicitly vs. implicitly. -> Who cares? Application to news?
    • We could do this anyway, without the ML part from (1), if we just compared the networks made explicitly by the community (super tropes, sub tropes, sister tropes) vs. what structure is detected from the 27000 tropes (2) qualitatively?
  • (2) We have 27000 tropes and what they link to. We could use structure detection algorithms to see what clusters and hierarchy exist.
    • What would that get us? Well, the way I described how meaning works in the project proposal would imply a lot of trope creation based on connections between other tropes. What would that mean in the graph - loops? High degree? (Since I think meaning comes from connection between things, there should be a lot more edges than nodes.)
  • (3) How close is the network from (2) to a small-world/ regular/ random graph? Does it fall into the pattern of most real-world networks (small world ish)?
    • Who cares? A little loosey-goosey, but anything that shows similarity between formations from the human brain and external systems supports the idea that human brains are expressions of the natural world just as much as anything else is - kind of goes against thinking of sentience as something spiritual.
  • (4) What's the average shortest path like?
    • What's the point? Kind of reflects how closely related the tropes are to each other. Are there some distinct categories? Are they all very tightly related?
  • (5) What are the features, if any, which make connection or clustering more likely?
    • For example, maybe a trope centrally involving gender is likely to be connected to an equivalent involving the "other" gender (going along with binary gender because most of these tropes are contemporaneous with that concept).
  • (6) Are things we think of as genres reflected in the structure?
  • (7) It could be the case that everything is connected to everything else, more or less. In that case, we won't be able to see interesting patterns easily. And it seems plausible - creativity lets us connect anything to anything else we want to. So maybe the more solid connections that reflect universally agreed-upon similarity are those which occur multiple times, which we can look at by incrementing the weight of an edge every time it occurs and looking at edges above some threshold.
    • These links will be the ones the community has repeatedly identified in different contexts... I think that's a reasonable indication of robustness.
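A rough sketch of idea (7), in case it helps make it concrete: count how often each link occurs and keep only the links at or above a cutoff. The trope names, the `links` list, and the threshold value here are all made up for illustration; the real input would be whatever list of (source, target) links we extract from the pages.

```python
from collections import Counter

def edge_weights(links):
    """Count how many times each (source, target) link occurs."""
    return Counter(links)

def strong_edges(weights, threshold):
    """Keep only the edges whose occurrence count is >= threshold."""
    return {edge: w for edge, w in weights.items() if w >= threshold}

# Hypothetical data: each tuple is one occurrence of a link between two tropes.
links = [
    ("TropeA", "TropeB"), ("TropeA", "TropeB"), ("TropeA", "TropeB"),
    ("TropeA", "TropeC"),
    ("TropeB", "TropeC"), ("TropeB", "TropeC"),
]

weights = edge_weights(links)
robust = strong_edges(weights, threshold=2)
```

The threshold is a free parameter, so it would probably be worth looking at how many edges survive at several values before picking one.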

Advice on visualization from Jane Adams:

  • Have you looked at it in dendrogram form? Clustering might be a good approach, then you can examine the clusters as one meta-network or a bunch of small networks
  • This approach always seemed cool to me, though I haven't used it myself yet https://youtu.be/7G3MxyOcHKQ
    NodeTrix: A Hybrid Visualization of Social Networks
  • NetworkX is great for having a lot of built-in functions for clustering and centrality measures https://networkx.org/documentation/stable/reference/algorithms/clustering.html
  • Also the backbone method is super cool for pulling out the 'most significant' edges in a weighted network (though if you're working with the tvtropes dataset I'm guessing those are unweighted) https://www.pnas.org/content/106/16/6483 if this is a method you do end up using let me know bc I have some other related resources like a python implementation of the algorithm
  • This is a great resource for network visualization too - biofabric might be a good route, also listed there https://vdl.sci.utah.edu/mvnv/wizard/
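For questions (3) and (4), here is a minimal sketch using the NetworkX built-ins mentioned above. It compares average clustering and average shortest path length against a random graph with the same number of nodes and edges, which is one common (not the only) way to check for small-world-ish structure. The Watts-Strogatz graph is just a stand-in for the trope network, and the function name `small_world_summary` is made up.

```python
import networkx as nx

def small_world_summary(G, seed=0):
    """Compare clustering and path length of G with a same-size random graph."""
    # Use the largest connected component so path lengths are defined.
    U = G.to_undirected() if G.is_directed() else G
    giant = U.subgraph(max(nx.connected_components(U), key=len)).copy()
    n, m = giant.number_of_nodes(), giant.number_of_edges()

    # Random baseline with the same number of nodes and edges.
    R = nx.gnm_random_graph(n, m, seed=seed)
    R_giant = R.subgraph(max(nx.connected_components(R), key=len))

    return {
        "clustering": nx.average_clustering(giant),
        "path_length": nx.average_shortest_path_length(giant),
        "random_clustering": nx.average_clustering(R),
        "random_path_length": nx.average_shortest_path_length(R_giant),
    }

# Stand-in graph; the real input would be the trope link network.
G = nx.connected_watts_strogatz_graph(200, 6, 0.1, seed=1)
summary = small_world_summary(G)
```

Clustering well above the random baseline with a path length close to it would point toward small-world structure. The exact average shortest path is expensive on a graph with around 27000 nodes, so sampling node pairs may be more practical there.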
@jwzimmer-zz
Owner Author

jwzimmer-zz commented Nov 2, 2020

Other questions (not that fleshed out; I meant to write them up nicely but ended up copy-pasting messy notes from the colab notebook to save time):

  • Which nodes/tropes get linked to the most? Is there a set of nodes that act as "master" nodes, i.e. the trope of tropes, which I think was one of our main questions in the proposal? How many are there? If they exist, does the structure make sense?
  • are there loops in the network? can we follow a path from root to leaf? do the paths convey anything? it might be cool to randomly sample the network for the sake of visualization. jitter? all roads lead to rome? small world? is it self similar? does it have fractal dimension? what if we look at the connections like syntax and try to carry them all the way out to the individual word in an article - qualitatively, a few, not a lot b/c v exploratory - how much information do you gain from each level in the tree? when you know a leaf trope do you know the path to it?
  • Random subset samples, Random walks, Jitter, Pick random pairs, subgraphs, are they connected? What's the average shortest path between two random nodes?

Meeting notes 11/1:

  • generate names for tropes (justification??)
  • clustering
  • ask prof cheney about tools, summary statistics (size worry)
  • how society thinks about topics
  • stereotypes
  • psychology
  • offensive
  • what are taboos
  • gender roles
  • romance, relationships
  • family structure
  • race
  • compare categories of tropes like romance in tv vs. real life
  • sentiment analysis?
    • even just are tropes happy/ sad/ what distribution
    • different clusters
  • symmetry?
  • the 6 story arcs?
  • are happy tropes connected to happy tropes?
  • trope titles - happy or sad using hedonometer word list
  • classifier - take transcript and output what tropes it thinks are in it
    plots for the tropes?
  • network for indices vs. clustering detection algorithm
  • which indices? just the ones that include individual tropes
  • can ask jane about hedonometer sentiment words lists
  • sentiment of the indices vs. the clusters
  • centrality - which nodes? sentiment?
  • how in sync with modern psych is tv? what expertise does tv reference and when is it from?
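Two of the questions above (which tropes get linked to the most, and the average shortest path between random pairs) could be sketched roughly like this with NetworkX. The tiny graph and the function names `most_linked` and `sampled_path_length` are invented for illustration, and treating links as directed edges is an assumption.

```python
import random
import networkx as nx

def most_linked(G, k=5):
    """Return the k nodes with the highest in-degree (most incoming links)."""
    return sorted(G.in_degree(), key=lambda pair: pair[1], reverse=True)[:k]

def sampled_path_length(G, n_pairs=1000, seed=0):
    """Estimate mean shortest path length from randomly chosen node pairs.

    Returns (mean length over connected pairs, fraction of pairs connected).
    """
    rng = random.Random(seed)
    nodes = list(G.nodes())
    lengths = []
    for _ in range(n_pairs):
        a, b = rng.sample(nodes, 2)
        try:
            lengths.append(nx.shortest_path_length(G, a, b))
        except nx.NetworkXNoPath:
            pass  # no directed path from a to b
    mean = sum(lengths) / len(lengths) if lengths else float("nan")
    return mean, len(lengths) / n_pairs

# Hypothetical toy network: three tropes link to a hub, which links back to one.
G = nx.DiGraph([("A", "Hub"), ("B", "Hub"), ("C", "Hub"), ("Hub", "A")])
top = most_linked(G, k=1)
mean_len, frac_connected = sampled_path_length(G, n_pairs=50)
```

The connected fraction matters as much as the mean: if most random pairs have no path between them, the mean over the connected pairs says little about the network as a whole.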

@jwzimmer-zz
Owner Author

jwzimmer-zz commented Nov 2, 2020

Top priority (per discussion with @nguyenhphilip on 11/1):

Subsequent priorities:

  • (2) describe and visualize the network of individual tropes that are also organized by the indices, for comparison with the network mentioned above
  • (3) some sentiment analysis of trope titles (based on the happy/ sad ratings of the words in the title)
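A minimal sketch of what (3) might look like, assuming trope titles are CamelCase strings and that we have a word-to-score lookup. The scores below are invented placeholders on an assumed 1-9 scale, not real hedonometer values, and `split_title` / `title_score` are hypothetical helper names.

```python
import re

# Hypothetical happiness scores on an assumed 1-9 scale; real values would
# come from the hedonometer word list, not this made-up dictionary.
WORD_SCORES = {"love": 8.4, "happy": 8.3, "death": 1.5, "evil": 1.9, "hero": 7.0}

def split_title(title):
    """Split a CamelCase trope title such as 'HappyEnding' into lowercase words."""
    return [w.lower() for w in re.findall(r"[A-Z][a-z]*|[a-z]+|\d+", title)]

def title_score(title, scores=WORD_SCORES):
    """Average the scores of the scored words in a title; None if no word is scored."""
    hits = [scores[w] for w in split_title(title) if w in scores]
    return sum(hits) / len(hits) if hits else None

mixed = title_score("EvilHero")         # averages "evil" and "hero"
unscored = title_score("BigDamnTable")  # no word in the lookup, so None
```

Titles are short, so many will contain no scored words at all; tracking how often `title_score` returns `None` would show how much of the network this approach actually covers.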

Stretch goals:

  • (4) machine learning to generate trope names and clusters for comparison

@jwzimmer-zz
Owner Author

Visualization ideas/ thoughts:

@jwzimmer-zz
Owner Author

Ideas from talking with Juniper Lovato:

  • stereotypes - gender
  • character network?
  • gender to being protagonist?
  • look at edit history over time for some smallllll subset of tropes
  • look at existing research on fandom? could be good for comparison over time within some article
  • wokeometer (from Ale)
  • offensiveness on website - skew, past problematic categories, flame bait
  • is it an open wiki?
  • research on edit histories on wikipedia - do the controversial tropes on tvtropes line up with these? simon dedeo, brian keegan (spelling, sorry)
    • usernames, suggested edits, reverts - how controversial

@nguyenhphilip
Collaborator

nguyenhphilip commented Nov 2, 2020

Some random ideas i had while thinking about why studying tropes is relevant:

  • tropes are particular abstractions with generally agreed upon meanings used to convey and facilitate the communication of ideas within the structure of some larger/collective narrative

    • Modeling society in this way can allow us to gain insight into our own present and future thought processes and behavior if we assume that individuals and groups of people act in ways that correlate with specific tropes
    • Can we classify high profile individuals as fitting a specific trope? What higher-level, organizing meta tropes do these specific tropes fall under? How do they interact with other tropes? What implications might this have?
      • this feels similar to psychological profiling / a personality test, except that tropes are built on stories and may depict a more obvious relationship of how the tropified individual may influence the system they are in.
  • Can we predict the future state of a system based on the actors (individual tropes) and the structures (meta tropes) that define it?

    • do stories defined by similar sets of meta and individual tropes follow the same developmental arc i.e. have the same or similar outcomes?

    • If not, why?
      - If everything points to being doomed, what might steer the trajectory in a different direction, despite evidence to the contrary?
      - luck? randomness? a lone hero or rebellious group of individuals? the presence of lesser-known, but highly influential actors?
      - things we've yet to tropify or are untropifiable?

    • What are the most important components of this story?

  • are more popular tropes more convincing? Are they better representations of the phenomena they depict?

    • are stories that consist of these tropes more effective in persuading/shaping public opinion? (e.g. part of political campaigning seems to be about creating a story that people feel they can get behind)

@jwzimmer-zz
Owner Author

jwzimmer-zz commented Nov 2, 2020

Notes from talking with @janeadams and @nguyenhphilip:

overall - to do:

  • do the index thing with in-group, out-group connections
  • weighted thing
  • summary stats

=== Links from @janeadams ===
you could probably create (or ask Melissa to create) a #p_tvtropes channel here, or ask peter to add Phil to the compstorylab slack — might be easier to pull other people (eg laurent) into the project on an as-needed basis / share charts with curious folks

here’s a link dump:
Nadieh Bremer’s “Why do cats and dogs?” https://whydocatsanddogs.com/cats
here’s the design process for that viz: https://www.visualcinnamon.com/2019/04/designing-google-cats-and-dogs
happiness scores from hedonometer: https://hedonometer.org/api.html
networkx clustering: https://networkx.org/documentation/stable//reference/algorithms/generated/networkx.algorithms.cluster.clustering.html
my acm iui paper using backbone method: https://www.overleaf.com/project/5f778826c6077c00013f5499
python implementation of backbone method: https://github.com/aekpalakorn/python-backbone-network
graphgen tool for sql (have not used): https://medium.com/district-data-labs/graph-analytics-over-relational-datasets-with-python-89fb14587f07
laurent’s onion decomposition: https://arxiv.org/pdf/1510.08542.pdf
someone else’s python implementation of onion decomp: https://github.com/junipertcy/onion_decomposition
networkx + plotly to create interactive network graph with nodes colored by [centrality, # of connections]: https://plotly.com/python/network-graphs/

and we talked about:
creating a network graph where nodes are user-generated indexes and links are weighted by the total number of connections between all tropes in each index cluster
creating adjacency matrices or heatmaps of within-index connections, trope-to-trope, with edges weighted by the number of times each trope-trope connection occurs
network sparsification to include only: tropes that co-occur often; tropes with a minimum centrality measure (like the stanford demo here https://dhs.stanford.edu/social-media-literacy/tvtropes-pt-1-the-weird-geometry-of-the-internet/), or some other sparsification method (e.g. backbone method https://arxiv.org/abs/0904.2389 or onion decomposition)
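The index-level network and the within-index counts described above could be sketched roughly as follows. This assumes each trope belongs to exactly one index, which is a simplification (a trope can plausibly appear in several), and the index names, trope names, and the function name `index_network` are all made up.

```python
from collections import Counter

def index_network(trope_links, trope_to_index):
    """Aggregate trope-to-trope links into weighted index-to-index links.

    Returns (within, between): Counters of link counts inside each index
    and between each unordered pair of distinct indexes.
    """
    within, between = Counter(), Counter()
    for src, dst in trope_links:
        a, b = trope_to_index.get(src), trope_to_index.get(dst)
        if a is None or b is None:
            continue  # skip tropes that belong to no index
        if a == b:
            within[a] += 1
        else:
            between[tuple(sorted((a, b)))] += 1
    return within, between

# Hypothetical data for illustration only.
trope_to_index = {"T1": "Romance", "T2": "Romance", "T3": "Villains", "T4": "Villains"}
trope_links = [("T1", "T2"), ("T2", "T1"), ("T1", "T3"), ("T3", "T4"), ("T4", "T9")]

within, between = index_network(trope_links, trope_to_index)
```

The `between` counts are the edge weights for the index-level graph, and comparing `within` against `between` for each index gives the in-group vs. out-group picture from the to-do list.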

@jwzimmer-zz
Owner Author

jwzimmer-zz commented Nov 2, 2020

Want to make sure we're not retreading what has already been done in the dhs.stanford.edu article series...

@jwzimmer-zz
Owner Author

Seeing if we get the same categories as in https://github.com/jwzimmer/tv-tropes/tree/main/Stanford_Neighborhoods would be interesting, especially since:

@jwzimmer-zz
Owner Author

(Closed #7 because the topic there has been subsumed by this issue.)
