This is a work-in-progress course website for Introductory Statistics for Undergraduate Students, produced by Fan. The course covers a limited subset of topics from Statistics for Business and Economics (Anderson, Sweeney, Williams, Camm, and Cochran, 12th edition).

The course uses R with packages from the Tidyverse: tibble for framing data, tidyr and dplyr for reshaping data and aggregating statistics, ggplot2 for graphing, and readr for file I/O. Materials are written in R in Jupyter notebooks and shown as HTML files. To obtain the code and raw files, see here for GitHub setup. For the HTML files, click on the links below.

Please contact FanWangEcon for issues or problems.

1. Dataset, Tables and Graphs

  1. An In-class Survey
  • create a tibble dataset
  • draw 10 random students from 50 and build a survey
  • first use: tibble, add_row, factor, ifelse, group_by, mutate, summarise, write_csv
  1. Opening up a Dataset
  • relative and absolute path
  • first use: read.csv
  1. One Variable Graphs and Tables
  • frequency table
  • bar chart and histogram
  • R function and lapply to generate graphs/tables for different variables
  • first use: function, loop, lapply, !!sym, geom_histogram, geom_bar
  1. Multiple Variables Graphs and Tables
  • two-way frequency table
  • stacked bar chart
  • scatter-plot
  • first use: spread, geom_point, geom_text, geom_smooth, geom_bar
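
As a rough illustration of the survey workflow listed under "An In-class Survey", the sketch below draws 10 of 50 student IDs, stores made-up answers in a tibble, and aggregates by group. The column names (`gender`, `hours_study`), the seed, and the values are hypothetical and are not taken from the course notebooks.

```r
library(tibble)
library(dplyr)

set.seed(123)  # hypothetical seed, only so the draw is reproducible

# Draw 10 distinct student IDs out of 50
sampled_ids <- sort(sample(1:50, size = 10, replace = FALSE))

# Made-up survey answers; the column names are illustrative assumptions
survey <- tibble(
  id = sampled_ids,
  gender = factor(sample(c("female", "male"), 10, replace = TRUE)),
  hours_study = round(runif(10, min = 0, max = 20), 1)
)

# Flag students who study more than 10 hours, then aggregate by gender
summary_tbl <- survey %>%
  mutate(heavy = ifelse(hours_study > 10, "yes", "no")) %>%
  group_by(gender) %>%
  summarise(n = n(), mean_hours = mean(hours_study))

print(summary_tbl)
```

The counts in `summary_tbl$n` form a one-variable frequency table of the kind covered in "One Variable Graphs and Tables".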

2. Summarizing Data

  1. Mean, Standard Deviation
  • a dataset with city-month temperatures
  • mean and standard deviation
  • use: dplyr + ggplot, gather, filter, facet_wrap, show.unique.values, geom_line, geom_point, scale_x_continuous
  1. Rescaling--Coefficient of Variation and Correlation
  • a dataset with state-level wage and education data
  • scatter-plot
  • coefficient of variation rescales standard deviation
  • correlation rescales covariance
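
The two rescaling ideas above can be checked numerically with base R. This is a minimal sketch: the wage and education numbers are invented for illustration and are not the state-level data used in the course.

```r
# Hypothetical data for five states; values are made up
wage <- c(18.5, 22.0, 25.3, 19.8, 30.1)  # average hourly wage
educ <- c(12.1, 13.0, 13.8, 12.5, 14.6)  # average years of schooling

# Coefficient of variation: standard deviation divided by the mean,
# which makes spread comparable across variables with different units
cv_wage <- sd(wage) / mean(wage)
cv_educ <- sd(educ) / mean(educ)

# Correlation: covariance divided by the product of standard deviations
cor_manual <- cov(wage, educ) / (sd(wage) * sd(educ))

# The manual value should agree with the built-in cor()
print(all.equal(cor_manual, cor(wage, educ)))
```

Both statistics are unit-free, which is why they can be compared across variables measured on different scales.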

3. Basics of Probability

  1. Sample Space, Experimental Outcomes, Events, Probabilities
  • definitions of Sample Space, Experimental Outcomes, Events and Probability
  • union, intersection and complements
  • conditional probability
  1. Examples of Sample Space and Probabilities
  • tossing a quarter, four candidates for election, an unfair six-sided die, two basketball games
  • use: tibble, sample
  1. Throw an Unfair Four-Sided Die
  • throw an unfair die many times, law of large numbers
  • use: reduce, full_join, mutate_all, dplyr::mutate; tibble+group_by+summarise+mutate+arrange+select; !!str.var.name!=, sprintf, str_extract; bind_cols, logspace; geom_line, scale_x_continuous(trans='log10'), labs()
  1. Multiple-Step Experiment: Playing the Lottery Three Times
  • paths after 1, 2 and 3 plays
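
The law of large numbers for the unfair four-sided die can be sketched in base R with `sample`. The side probabilities, seed, and numbers of throws below are assumptions chosen for illustration, not the values used in the course notebook.

```r
set.seed(42)  # hypothetical seed, only for reproducibility

# Assumed probabilities for sides 1 to 4 of an unfair die
probs <- c(0.1, 0.2, 0.3, 0.4)

# Share of throws landing on each side, for increasing numbers of throws
n_throws <- c(10, 100, 1000, 100000)
shares <- sapply(n_throws, function(n) {
  draws <- sample(1:4, size = n, replace = TRUE, prob = probs)
  # factor() keeps all four sides even if one of them never comes up
  as.numeric(table(factor(draws, levels = 1:4))) / n
})
colnames(shares) <- n_throws
rownames(shares) <- paste0("side_", 1:4)

# Each column is one experiment; shares approach probs as n grows
print(round(shares, 3))
```

With few throws the shares can be far from the assumed probabilities, while the column for the largest number of throws should sit close to them.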

4. Discrete Probability Distribution

  1. Discrete Random Variable and Binomial Experiment
  • Discrete Random Variable
  • Expected Value and Variance
  • Binomial Properties
  • Examples: USA larceny clearance rate, WWII German soldier survival rate
  • use: dbinom, pbinom; geom_bar, geom_line, geom_point, geom_text; lapply, sprintf, scale_y_continuous(sec.axis), axis.text.y, round
  1. Poisson Probability Distribution
  • Poisson Properties
  • Examples: Ladislaus Bortkiewicz's analysis of Prussian army horse-kick deaths
  • use: dpois, ppois
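
A short base R sketch of the distribution functions listed above. The parameters (`n`, `p`, `lambda`) are hypothetical and are not estimates from the larceny, soldier-survival, or horse-kick examples.

```r
# Binomial: n independent trials, each with success probability p
n <- 10   # hypothetical number of trials
p <- 0.2  # hypothetical success probability
k <- 3

# dbinom gives P(X = k); compare with choose(n, k) * p^k * (1 - p)^(n - k)
p_exact  <- dbinom(k, size = n, prob = p)
p_manual <- choose(n, k) * p^k * (1 - p)^(n - k)

# pbinom gives the cumulative probability P(X <= k)
p_cum <- pbinom(k, size = n, prob = p)

# Expected value and variance of a binomial random variable
ev <- n * p
v  <- n * p * (1 - p)

# Poisson with a hypothetical rate lambda:
# dpois gives P(X = k), ppois gives P(X <= k)
lambda <- 0.5
q_exact  <- dpois(2, lambda)
q_manual <- exp(-lambda) * lambda^2 / factorial(2)
q_cum    <- ppois(2, lambda)

print(c(binomial = p_exact, poisson = q_exact))
```

Comparing `p_exact` with `p_manual`, and `q_exact` with `q_manual`, is a quick way to confirm that the built-in functions match the textbook formulas.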

