A project, completed as part of an MSc course, that analysed editorial activities (e.g., creating, editing, reverting articles) on Wikipedia to understand the dynamics of human interaction and behaviour on online platforms.

hirokiodd/wiki-analysis

Analysis of editorial activities on Wikipedia

About the repo

This repo contains the code for a project from my MSc course. It documents the work I did as part of the programme and, I hope, demonstrates intermediate coding skills.

Content

Topic

The code analyses editorial activities (e.g., creating, editing, reverting articles) on Wikipedia to understand the dynamics of human interaction and behaviour on online platforms. Here is a list of related research studies focusing on online editorial activities, networks, and communities on Wikipedia.

  • Tsvetkova, M., García-Gavilanes, R., Floridi, L., & Yasseri, T. (2017). Even good bots fight: The case of Wikipedia. PLoS ONE, 12(2), e0171774.
  • Gildersleve, P., Lambiotte, R., & Yasseri, T. (2023). Between news and history: identifying networked topics of collective attention on Wikipedia. Journal of Computational Social Science, 6(2), 845-875.

Code structure

  • output.ipynb: A notebook containing the final output of the analysis
  • /module/create_network.py: A module to create the network data for the analysis
  • /module/find_revert_back.py: A module to find mutual reverts in Wikipedia articles
  • /module/calculate_similarity.py: A module to calculate the similarity of edit activities
  • /module/visualise.py: A module to visualise the output of the analysis
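
As an illustration of what "finding mutual reverts" can mean, here is a minimal sketch using only the standard library. It is not the code in /module/find_revert_back.py: the input format (a list of `(reverter, reverted)` editor pairs), the function name `find_mutual_reverts`, and the sample editors are all assumptions made for this example.

```python
from collections import defaultdict


def find_mutual_reverts(reverts):
    """Return the unordered editor pairs in which each editor reverted the other.

    `reverts` is assumed to be an iterable of (reverter, reverted) pairs;
    the real project data may be structured differently.
    """
    # Map each editor to the set of editors they have reverted.
    reverted_by = defaultdict(set)
    for reverter, reverted in reverts:
        if reverter != reverted:  # ignore self-reverts
            reverted_by[reverter].add(reverted)

    mutual = set()
    for reverter, targets in reverted_by.items():
        for target in targets:
            # .get() avoids creating new keys while iterating over the dict.
            if reverter in reverted_by.get(target, ()):
                mutual.add(frozenset((reverter, target)))
    return mutual


# Hypothetical sample data, for illustration only.
sample = [("alice", "bob"), ("bob", "alice"), ("carol", "alice"), ("dave", "dave")]
mutual_pairs = find_mutual_reverts(sample)
```

Pairs are stored as `frozenset` objects so that (alice, bob) and (bob, alice) count as the same mutual revert.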

Coding environment

I used Poetry to manage the Python environment. The pyproject.toml file contains the dependencies and the Python version used in the project.

Note

  • The code is written to solve specific problems that cannot be shared. The repo also does not contain the data provided by the school, as sharing it is not permitted.
  • To practise writing code, we can only use simple modules such as pickle, random, and datetime, and the packages numpy, matplotlib, and seaborn. We can NOT use advanced data processing packages such as pandas, networkx, scikit-learn, etc.
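
To give a sense of what working within these restrictions can look like, the sketch below builds a weighted directed network with plain dictionaries instead of networkx, and computes a cosine similarity with numpy instead of scikit-learn. The function names, the edge-list input, and the choice of cosine similarity are assumptions for this example; the modules in this repo may take a different approach.

```python
import numpy as np


def build_adjacency(edges):
    """Build a weighted directed adjacency dict from (source, target) pairs.

    Repeated edges increase the weight; every node appears as a key.
    """
    adjacency = {}
    for source, target in edges:
        neighbours = adjacency.setdefault(source, {})
        neighbours[target] = neighbours.get(target, 0) + 1
        adjacency.setdefault(target, {})  # make sure sink nodes are present
    return adjacency


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length activity vectors (0.0 if either is all zeros)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    norm = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / norm) if norm > 0 else 0.0


# Hypothetical example: editor "a" interacts with "b" twice, "b" with "c" once.
network = build_adjacency([("a", "b"), ("a", "b"), ("b", "c")])
similarity = cosine_similarity([3, 0, 1], [6, 0, 2])
```

A nested dict keeps edge lookups cheap and is easy to serialise with pickle, which fits the allowed toolset.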
