add group advantage explorer to website

xehu · xehu · commit 577c981ddb3e · 2025-10-13T21:40:57.000-04:00
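
Note on the two metrics added to `index.Rmd`: the small worked example below illustrates how the Strong and Weak Group Advantage ratios relate to one another. It is only a sketch. The scores are made up, higher scores are assumed to be better, and the variable names are hypothetical; the paper's actual aggregation over nominal teams may differ.

```r
# Hypothetical scores for illustration only; higher is assumed to be better.
group_score <- 12                    # one interacting group of size 3
nominal_team_scores <- c(8, 10, 6)   # three participants who worked alone

# Strong advantage: compare the group to the best nominal-team member.
strong_advantage <- group_score / max(nominal_team_scores)

# Weak advantage: compare the group to a randomly selected member, whose
# expected score is the nominal-team mean.
weak_advantage <- group_score / mean(nominal_team_scores)

print(c(strong = strong_advantage, weak = weak_advantage))
```

Because the best member scores at least as high as the average member, the weak ratio is never smaller than the strong ratio for positive scores. Values above 1 correspond to the blue end of the explorer's color scale, and values below 1 to the red end.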
diff --git a/index.Rmd b/index.Rmd
@@ -51,13 +51,13 @@ We also provide documentation on how to annotate new tasks, including the annota
 
 - The code used to generate the Task Space from raw annotations can be found here: [`generate_task_map_from_raw.Rmd`](https://github.com/Watts-Lab/task-mapping/blob/master/analysis/analysis_task_space/generate_task_map_from_raw.Rmd).
 
-# Accessing our Full Reproduction package
+# Exploring Our Data
 
-Our full reproduction package is hosted [on GitHub](https://github.com/Watts-Lab/task-mapping).
+Please use the interactive tools below to explore the data associated with our paper!
 
-# Visualizing the Task Map
+## Task Cluster Explorer
 
-The following interactive visualizer allows you to explore the Task Map in two dimensions using PCA, and to conduct k-means clustering on the underlying 24-dimensional space. You can choose different values of *k*, and mouse over each dot to see the name of the task.
+The following interactive visualizer allows you to explore the Task Map in two dimensions using PCA, and to conduct k-means clustering on the underlying 24-dimensional space. You can choose different values of *k*, and mouse over each dot to see the name of the task. You can use this tool to get an intuitive sense of the relationships between tasks.
 
 ```{r setup, include=FALSE}
 knitr::opts_chunk$set(echo = TRUE)
@@ -73,13 +73,35 @@ library(tidyverse)
 knitr::include_app("https://xehu.shinyapps.io/interactive-task-map/")
 ```
 
+## Group Advantage Explorer (20 Experimental Tasks)
+
+In our paper, we demonstrate the use of the Task Space by conducting a large-scale integrative experiment measuring the phenomenon of *group advantage* --- in which an interacting group outperforms individuals working alone. In our experiment, we define two types of group advantage:
+
+- **Strong Group Advantage** is the ratio of an interacting group's performance to that of the *best* individual in an equivalently sized "nominal" team (a statistical aggregation of participants who worked alone, used to account for the resource advantage of having more people).
+- **Weak Group Advantage** is the ratio of an interacting group's performance to that of a *randomly selected* individual in an equivalently sized "nominal" team.
+
+Our experiment involved 20 tasks sampled from the task space, which were implemented at three levels of complexity (low, medium, and high) and completed by groups of two different sizes (3 and 6). In the following interactive panel, please explore our data to see how group advantage varies across these experimental conditions. You can toggle between Strong/Weak Advantage, filter by complexity and group size, and hover to see task names.
+
+A key takeaway is that **group advantage is incredibly heterogeneous**; there is no single answer to whether groups outperform individuals. Importantly, these differences are explainable by task features: the Task Space features explain 43% of the variance in group advantage, demonstrating that our framework can systematically account for variation in group outcomes like this one.
+
+If you are interested in learning more, our data, code, and materials are available in our [GitHub repository](https://github.com/Watts-Lab/task-mapping).
+
+```{r echo=FALSE, out.width="100%"}
+knitr::include_app("https://xehu.shinyapps.io/interactive-task-map-group-advantage/")
+```
+
+# Accessing Our Full Reproduction Package
+
+Our full reproduction package is hosted [on GitHub (https://github.com/Watts-Lab/task-mapping)](https://github.com/Watts-Lab/task-mapping).
+
 # Team
+The paper's authors are listed below. For feedback, questions, or suggestions for new tasks and dimensions, please reach out to the Corresponding Authors.
+
 - [Xinlan Emily Hu](https://xinlanemilyhu.com) (Corresponding Author)
 - [Mark Whiting](https://whiting.me/)
 - [Linnea Gandhi](https://www.linneagandhi.com/)
 - [Duncan J. Watts](https://duncanjwatts.com/)
 - [Abdullah Almaatouq](http://amaatouq.io/) (Corresponding Author)
 
-This work was also created with the support from many other people, including many research assistants at the University of Pennsylvania, as well as the with labor of Amazon Mechanical Turk workers. 
+We also acknowledge that this work was created with support from many other people, including research assistants at the University of Pennsylvania, and with the labor of Amazon Mechanical Turk workers.
 
 This project is part of the [group dynamics / integrative experiments research](https://css.seas.upenn.edu/project/integrative-experiments/) at the [Computational Social Science Lab at Penn](https://css.seas.upenn.edu). You can learn more about our lab [here](https://css.seas.upenn.edu/people/). 
diff --git a/index.html b/index.html
@@ -447,33 +447,76 @@ <h1>Annotating New Tasks</h1>
 be found here: <a href="https://github.com/Watts-Lab/task-mapping/blob/master/analysis/analysis_task_space/generate_task_map_from_raw.Rmd"><code>generate_task_map_from_raw.Rmd</code></a>.</p></li>
 </ul>
 </div>
-<div id="accessing-our-full-reproduction-package" class="section level1">
-<h1>Accessing our Full Reproduction package</h1>
-<p>Our full reproduction package is hosted <a href="https://github.com/Watts-Lab/task-mapping">on GitHub</a>.</p>
-</div>
-<div id="visualizing-the-task-map" class="section level1">
-<h1>Visualizing the Task Map</h1>
+<div id="exploring-our-data" class="section level1">
+<h1>Exploring Our Data</h1>
+<p>Please use the interactive tools below to explore the data associated
+with our paper!</p>
+<div id="task-cluster-explorer" class="section level2">
+<h2>Task Cluster Explorer</h2>
 <p>The following interactive visualizer allows you to explore the Task
 Map in two dimensions using PCA, and to conduct k-means clustering on
 the underlying 24-dimensional space. You can choose different values of
-<em>k</em>, and mouse over each dot to see the name of the task.</p>
+<em>k</em>, and mouse over each dot to see the name of the task. You can
+use this tool to get an intuitive sense of the relationships between
+tasks.</p>
 <iframe src="https://xehu.shinyapps.io/interactive-task-map/?showcase=0" width="100%" height="400px" data-external="1">
 </iframe>
 </div>
-<div id="team" class="section level1">
-<h1>Team</h1>
+<div id="group-advantage-explorer-20-experimental-tasks" class="section level2">
+<h2>Group Advantage Explorer (20 Experimental Tasks)</h2>
+<p>In our paper, we demonstrate the use of the Task Space by conducting
+a large-scale integrative experiment measuring the phenomenon of
+<em>group advantage</em> — in which an interacting group outperforms
+individuals working alone. In our experiment, we define two types of
+group advantage:</p>
 <ul>
-<li><a href="https://xinlanemilyhu.com">Xinlan Emily Hu</a>
-(Corresponding Author)</li>
-<li><a href="https://whiting.me/">Mark Whiting</a></li>
-<li><a href="https://www.linneagandhi.com/">Linnea Gandhi</a></li>
-<li><a href="https://duncanjwatts.com/">Duncan J. Watts</a></li>
-<li><a href="http://amaatouq.io/">Abdullah Almaatouq</a> (Corresponding
-Author)</li>
+<li><strong>Strong Group Advantage</strong> is the ratio of an
+interacting group’s performance to that of the <em>best</em> individual
+in an equivalently sized “nominal” team (a statistical aggregation of
+participants who worked alone, used to account for the resource
+advantage of having more people).</li>
+<li><strong>Weak Group Advantage</strong> is the ratio of an
+interacting group’s performance to that of a <em>randomly selected</em>
+individual in an equivalently sized “nominal” team.</li>
 </ul>
-<p>This work was also created with the support from many other people,
-including many research assistants at the University of Pennsylvania, as
-well as the with labor of Amazon Mechanical Turk workers.</p>
+<p>Our experiment involved 20 tasks sampled from the task space, which
+were implemented at three levels of complexity (low, medium, and high)
+and completed by groups of two different sizes (3 and 6). In the
+following interactive panel, please explore our data to see how group
+advantage varies across these experimental conditions. You can toggle
+between Strong/Weak Advantage, filter by complexity and group size, and
+hover to see task names.</p>
+<p>A key takeaway is that <strong>group advantage is incredibly
+heterogeneous</strong>; there is no single answer to whether groups
+outperform individuals. Importantly, these differences are explainable
+by task features: the Task Space features explain 43% of the variance
+in group advantage, demonstrating that our framework can systematically
+account for variation in group outcomes like this
+one.</p>
+<p>If you are interested in learning more, our data, code, and materials
+are available in our <a href="https://github.com/Watts-Lab/task-mapping">GitHub
+repository</a>.</p>
+<iframe src="https://xehu.shinyapps.io/interactive-task-map-group-advantage/?showcase=0" width="100%" height="400px" data-external="1">
+</iframe>
+</div>
+</div>
+<div id="accessing-our-full-reproduction-package" class="section level1">
+<h1>Accessing Our Full Reproduction Package</h1>
+<p>Our full reproduction package is hosted <a href="https://github.com/Watts-Lab/task-mapping">on GitHub
+(https://github.com/Watts-Lab/task-mapping)</a>.</p>
+</div>
+<div id="team" class="section level1">
+<h1>Team</h1>
+<p>The paper’s authors are listed below. For feedback, questions, or
+suggestions for new tasks and dimensions, please reach out to the
+Corresponding Authors.</p>
+<ul>
+<li><a href="https://xinlanemilyhu.com">Xinlan Emily Hu</a>
+(Corresponding Author)</li>
+<li><a href="https://whiting.me/">Mark Whiting</a></li>
+<li><a href="https://www.linneagandhi.com/">Linnea Gandhi</a></li>
+<li><a href="https://duncanjwatts.com/">Duncan J. Watts</a></li>
+<li><a href="http://amaatouq.io/">Abdullah Almaatouq</a> (Corresponding
+Author)</li>
+</ul>
+<p>We also acknowledge that this work was created with support from
+many other people, including research assistants at the University of
+Pennsylvania, and with the labor of Amazon Mechanical Turk workers.</p>
 <p>This project is part of the <a href="https://css.seas.upenn.edu/project/integrative-experiments/">group
 dynamics / integrative experiments research</a> at the <a href="https://css.seas.upenn.edu">Computational Social Science Lab at
 Penn</a>. You can learn more about our lab <a href="https://css.seas.upenn.edu/people/">here</a>.</p>
diff --git a/interactive-task-map-group-advantage.Rmd b/interactive-task-map-group-advantage.Rmd
@@ -0,0 +1,149 @@
+---
+output: html_document
+runtime: shiny
+---
+
+```{r setup, include=FALSE}
+knitr::opts_chunk$set(echo = TRUE)
+
+library(ggplot2)
+library(plotly)
+library(dplyr)
+library(tidyr)
+library(readr)
+library(shiny)
+library(rlang)
+```
+
+```{r load-data, echo=FALSE, message=FALSE, warning=FALSE}
+# Load PCA base
+task_map <- read_csv('./outputs/processed_data/task_map.csv')
+# Load group advantage at condition level (has task, complexity, playerCount, strong, weak)
+ga_cond <- read_csv('./outputs/processed_data/condition_level_group_advantage.csv')
+
+# PCA of task map
+pca <- prcomp(task_map[, -1], scale. = TRUE)
+pca_df <- as.data.frame(pca$x[, 1:2]) %>% mutate(task_pca = task_map$task)
+
+# Normalize task names between sources for robust joining
+norm_task <- function(x) {
+  x <- tolower(x)
+  x <- gsub('[^a-z0-9]+', ' ', x)
+  trimws(x)
+}
+
+pca_df <- pca_df %>% mutate(task_norm = norm_task(.data$task_pca))
+
+# Map GA task names to task_map names when they differ
+# User-provided mapping (GA name -> output/task_map name)
+name_map <- c(
+  'Sudoku' = 'Sudoku',
+  'Moral Reasoning' = 'Moral Reasoning (Disciplinary Action Case)',
+  'Wolf Goat Cabbage' = 'Wolf, goat and cabbage transfer',
+  'Guess the Correlation' = 'Guessing the correlation',
+  'Writing Story' = 'Writing story',
+  'Room Assignment' = 'Room assignment task',
+  'Allocating Resources' = 'Allocating resources to programs',
+  'Divergent Association' = 'Divergent Association Task',
+  'Word Construction' = 'Word construction from a subset of letters',
+  'Whac a Mole' = 'Whac-A-Mole',
+  'Random Dot Motion' = 'Random dot motion',
+  'Recall Association' = 'Recall association',
+  'Recall Word Lists' = 'Recall word lists',
+  'Typing' = 'Typing game',
+  'Unscramble Words' = 'Unscramble words (anagrams)',
+  'WildCam' = 'Wildcam Gorongosa (Zooniverse)',
+  'Advertisement Writing' = 'Advertisement writing',
+  'Putting Food Into Categories' = 'Putting food into categories'
+)
+
+ga_cond <- ga_cond %>% mutate(
+  task_mapped = dplyr::recode(.data$task, !!!name_map, .default = .data$task),
+  task_norm = norm_task(.data$task_mapped)
+)
+
+# Ensure complexity is ordered factor
+ga_cond <- ga_cond %>% mutate(
+  complexity = factor(.data$complexity, levels = c('Low', 'Medium', 'High'), ordered = TRUE),
+  playerCount = as.factor(.data$playerCount)
+)
+
+# For default view, compute task-level means (across complexity and group sizes)
+ga_task_means <- ga_cond %>%
+  group_by(.data$task_norm) %>%
+  summarise(
+    task = dplyr::first(.data$task_mapped),
+    strong = mean(.data$strong, na.rm = TRUE),
+    weak = mean(.data$weak, na.rm = TRUE),
+    .groups = 'drop'
+  )
+
+# Join PCA with GA info; keep the mapped task name
+pca_ga <- pca_df %>%
+  inner_join(ga_task_means, by = 'task_norm') %>%
+  transmute(PC1 = .data$PC1, PC2 = .data$PC2, task_norm = .data$task_norm,
+            task = .data$task, strong = .data$strong, weak = .data$weak)
+
+# Keep a version with all condition rows for filtering
+pca_ga_cond <- pca_df %>% inner_join(ga_cond, by = 'task_norm')
+```
+
+```{r ui-server, echo=FALSE}
+ui <- fluidPage(
+  titlePanel('Interactive Task Map — Group Advantage (20 tasks)'),
+  sidebarLayout(
+    sidebarPanel(
+      radioButtons('dv', 'Color by:', choices = c('Strong' = 'strong', 'Weak' = 'weak'), selected = 'strong', inline = TRUE),
+      selectInput('complexity', 'Complexity:', choices = c('All', 'Low', 'Medium', 'High'), selected = 'All'),
+      checkboxGroupInput('groupSize', 'Group size:', choices = c('3', '6'), selected = c('3','6'), inline = TRUE),
+      checkboxInput('showLabels', 'Show labels', value = FALSE),
+      helpText('Note: Colors are centered at 1. Blue indicates advantage (>1), red indicates disadvantage (<1).')
+    ),
+    mainPanel(
+      plotlyOutput('map_plot')
+    )
+  )
+)
+
+server <- function(input, output) {
+  # reactive filtered data
+  filtered_points <- reactive({
+    dv_col <- input$dv
+
+    # Filter condition-level rows so group size applies even when complexity is 'All'
+    df <- pca_ga_cond %>%
+      filter(as.character(.data$playerCount) %in% req(input$groupSize))
+    if (input$complexity != 'All') {
+      df <- df %>% filter(.data$complexity == input$complexity)
+    }
+    df <- df %>%
+      group_by(.data$task_norm, .data$task_mapped, .data$PC1, .data$PC2) %>%
+      summarise(value = mean(.data[[dv_col]], na.rm = TRUE), .groups = 'drop') %>%
+      rename(task = 'task_mapped')
+
+    df
+  })
+
+  output$map_plot <- renderPlotly({
+    dv_label <- ifelse(input$dv == 'strong', 'Strong Advantage', 'Weak Advantage')
+
+    base <- ggplot() +
+      geom_point(data = pca_df, aes(x = .data$PC1, y = .data$PC2), color = 'grey70', alpha = 0.5, size = 2) +
+      theme_minimal() + labs(x = 'PC1', y = 'PC2')
+
+    pts <- filtered_points()
+
+    g <- base +
+      geom_point(data = pts, aes(x = .data$PC1, y = .data$PC2, color = .data$value, text = paste0(.data$task, '\n', dv_label, ': ', round(.data$value, 3))), size = 4) +
+      scale_color_gradient2(name = dv_label, low = '#b2182b', mid = '#f7f7f7', high = '#2166ac', midpoint = 1)
+
+    if (isTRUE(input$showLabels)) {
+      g <- g + geom_text(data = pts, aes(x = .data$PC1, y = .data$PC2, label = .data$task), vjust = -0.8, size = 3)
+    }
+
+    ggplotly(g, tooltip = c('text'))
+  })
+}
+
+shinyApp(ui, server)
+```
diff --git a/interactive-task-map.Rmd b/interactive-task-map.Rmd
@@ -11,6 +11,9 @@ library(e1071)
 library(plotly)
 library(dplyr)
 library(tidyverse)
+library(readr)
+library(shiny)
+library(rlang)
 ```
 
 ```{r load-data, echo=FALSE, message=FALSE, warning=FALSE}
@@ -73,7 +76,7 @@ server <- function(input, output) {
     plot_data <- clustering_data$data
 
     ggplotly(
-      ggplot(plot_data, aes(x = PC1, y = PC2, color = cluster, text = TaskName)) +
+      ggplot(plot_data, aes(x = .data$PC1, y = .data$PC2, color = .data$cluster, text = .data$TaskName)) +
         geom_point(size = 3) +
         theme_minimal()
     )
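
Note on the name matching in `interactive-task-map-group-advantage.Rmd`: the app joins the group-advantage data to the task map with `inner_join()` on a normalized task name, so any task whose name fails to match is dropped without a message. The sketch below shows one way such mismatches could be surfaced. It reuses the `norm_task()` logic from the diff, but the task lists and the names `task_map_names`, `ga_names_mapped`, and `unmatched` are hypothetical and for illustration only.

```r
# Same normalization as norm_task() in the app: lowercase, collapse runs of
# non-alphanumeric characters to a single space, then trim.
norm_task <- function(x) {
  x <- tolower(x)
  x <- gsub("[^a-z0-9]+", " ", x)
  trimws(x)
}

# Hypothetical task names; the real ones come from the two CSV files.
task_map_names <- c("Whac-A-Mole", "Typing game", "Sudoku")
ga_names_mapped <- c("Whac a Mole", "Typing game", "Sudoku", "Unmapped Task")

# Normalized names that an inner join on this key would silently drop.
unmatched <- setdiff(norm_task(ga_names_mapped), norm_task(task_map_names))
if (length(unmatched) > 0) {
  message("Not found in task map: ", paste(unmatched, collapse = ", "))
}
```

In this toy input, "Whac a Mole" and "Whac-A-Mole" normalize to the same key and therefore match without an explicit `name_map` entry, while "Unmapped Task" is reported as missing.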