
A project I completed in Introduction to Data Science (SDS192) at Smith College. It uses dplyr to wrangle election contribution data and compares the contributions spent supporting and opposing candidates in each state, broken down by political party.


niannucci/sds192-mp2

 
 


sds192-mp2

Mini-project 2:

See https://beanumber.github.io/sds192/mod_data.html for the project instructions.

load("house_elections.rda")
load("candidates.rda")
load("committees.rda")
load("contributions.rda")
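The four `load()` calls above assume the `.rda` files sit in the current working directory. As a minimal sketch, the loading step could be wrapped so that a missing file produces a clear error before anything is loaded. The helper name `load_all()` and its `dir` and `envir` arguments are hypothetical and not part of the project instructions.

```r
# The four .rda files named in this README.
data_files <- c("house_elections.rda", "candidates.rda",
                "committees.rda", "contributions.rda")

# load_all() is a hypothetical helper (not from the project instructions).
# It stops with a readable message if any file is missing, then loads each
# file into the given environment.
load_all <- function(files, dir = ".", envir = globalenv()) {
  paths <- file.path(dir, files)
  not_found <- paths[!file.exists(paths)]
  if (length(not_found) > 0) {
    stop("Missing data files: ", paste(not_found, collapse = ", "))
  }
  # load() returns the names of the objects it restored.
  unlist(lapply(paths, load, envir = envir))
}
```

Calling `load_all(data_files)` from the project directory should then be equivalent to the four separate `load()` calls.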

Verify that your data looks like this:

library(tidyverse)
glimpse(house_elections)
## Observations: 2,178
## Variables: 10
## $ fec_id         <chr> "B2CA08156", "H0AK00097", "H0AL01030", "H0AL020...
## $ state          <chr> "CA", "AK", "AL", "AL", "AL", "AL", "AL", "AR",...
## $ district       <chr> "08", "00", "01", "02", "05", "07", "07", "01",...
## $ incumbent      <chr> "FALSE", "FALSE", "FALSE", "TRUE", "TRUE", "TRU...
## $ candidate_name <chr> "Mitzelfelt, Brad", "Cox, John R.", "Gounares, ...
## $ party          <chr> "R", "R", "R", "R", "R", "D", "R", "R", "R", "R...
## $ primary_votes  <int> 8801, 11179, 3854, 0, 65163, 0, 11537, 0, 0, 0,...
## $ runoff_votes   <int> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,...
## $ general_votes  <int> 0, 0, 0, 180591, 189185, 232520, 73835, 138800,...
## $ ge_winner      <chr> "", "", "", "W", "W", "W", "N", "W", "W", "W", ...
glimpse(candidates)
## Observations: 5,628
## Variables: 15
## $ cand_id                <chr> "H0AK00089", "H0AK00097", "H0AL00016", ...
## $ cand_name              <chr> "CRAWFORD, HARRY T JR", "COX, JOHN ROBE...
## $ cand_party_affiliation <chr> "DEM", "REP", "UNK", "REP", "REP", "DEM...
## $ cand_election_yr       <int> 2010, 2012, 2010, 2012, 2012, 2008, 201...
## $ cand_office_state      <chr> "AK", "AK", "AL", "AL", "AL", "AL", "AL...
## $ cand_office            <chr> "H", "H", "H", "H", "H", "H", "H", "H",...
## $ cand_office_district   <int> 0, 0, 7, 1, 2, 5, 5, 5, 5, 5, 6, 7, 7, ...
## $ cand_ici               <chr> "C", "C", "O", "C", "I", "C", "C", "I",...
## $ cand_status            <chr> "P", "N", "C", "C", "C", "C", "C", "C",...
## $ cand_pcc               <chr> "C00466698", "C00525261", "C00464040", ...
## $ cand_st1               <chr> "4350 BUTTE CIR", "PO BOX 1092", "PO BO...
## $ cand_st2               <chr> "", "", "", "", "", "", "", "", "SUITE ...
## $ cand_city              <chr> "ANCHORAGE", "ANCHOR POINT", "BIRMINGHA...
## $ cand_state             <chr> "AK", "AK", "AL", "AL", "AL", "AL", "AL...
## $ cand_zip               <int> 99504, 8388607, 35201, 36561, 36106, 35...
glimpse(committees)
## Observations: 14,454
## Variables: 15
## $ cmte_id                <chr> "C00000042", "C00000059", "C00000422", ...
## $ cmte_name              <chr> "ILLINOIS TOOL WORKS INC. FOR BETTER GO...
## $ tres_name              <chr> "LYNCH, MICHAEL J. MR.", "GREG SWARENS"...
## $ cmte_st1               <chr> "3600 WEST LAKE AVENUE", "2501 MCGEE", ...
## $ cmte_st2               <chr> "", "MD#288", "SUITE 600", "", "", "", ...
## $ cmte_city              <chr> "GLENVIEW", "KANSAS CITY", "WASHINGTON"...
## $ cmte_state             <chr> "IL", "MO", "DC", "OK", "KS", "IN", "DC...
## $ cmte_zip               <int> 60026, 64108, 20001, 73107, 66612, 4620...
## $ cmte_dsgn              <chr> "B", "U", "B", "U", "U", "U", "B", "B",...
## $ cmte_type              <chr> "Q", "Q", "Q", "N", "Q", "Q", "Q", "Q",...
## $ cmte_party_affiliation <chr> "", "UNK", "", "", "UNK", "", "UNK", "U...
## $ cmte_filing_freq       <chr> "Q", "M", "M", "Q", "Q", "Q", "M", "M",...
## $ org_type               <chr> "C", "C", "M", "L", "T", "M", "M", "L",...
## $ connected_org_name     <chr> "ILLINOIS TOOL WORKS INC.", "", "AMERIC...
## $ cand_id                <chr> "", "", "", "", "", "", "", "", "", "",...
glimpse(contributions)
## Observations: 396,369
## Variables: 22
## $ cmte_id          <chr> "C00478404", "C00140855", "C00140855", "C0014...
## $ amndt_ind        <chr> "N", "N", "N", "N", "N", "N", "N", "N", "N", ...
## $ rpt_type         <chr> "M3", "M3", "M3", "M3", "M3", "M3", "M3", "M3...
## $ transaction_pgi  <chr> "P", "P", "P", "P", "P", "P", "G", "P", "P", ...
## $ image_num        <chr> "11930476751.0", "11930476826.0", "1193047682...
## $ transaction_type <chr> "24K", "24K", "24K", "24K", "24K", "24K", "24...
## $ entity_type      <chr> "COM", "CCM", "CCM", "CCM", "CCM", "CCM", "CC...
## $ name             <chr> "KLINE FOR CONGRESS", "TIM RYAN FOR U.S. CONG...
## $ city             <chr> "BURNSVILLE", "WASHINGTON", "WASHINGTON", "BO...
## $ state            <chr> "MN", "DC", "DC", "MD", "ND", "MI", "MN", "IA...
## $ zip_code         <chr> "55337", "20013", "20005", "20716", "58106", ...
## $ employer         <chr> "", "", "", "", "", "", "", "", "", "", "", "...
## $ occupation       <chr> "", "", "", "", "", "", "", "", "", "", "", "...
## $ transaction_dt   <chr> "02252011", "02012011", "02012011", "02222011...
## $ transaction_amt  <dbl> 2400, 1000, 1000, 2500, 1000, 5000, 1000, 100...
## $ other_id         <chr> "C00326629", "C00373464", "C00289983", "C0014...
## $ cand_id          <chr> "H8MN06047", "H2OH17109", "H4KY01040", "H2MD0...
## $ tran_id          <chr> "B37FBC79414E54DD7A1C", "38595006", "38595007...
## $ file_num         <int> 717033, 717042, 717042, 717042, 717043, 71704...
## $ memo_cd          <chr> "", "", "", "", "", "", "", "", "", "X", "", ...
## $ memo_text        <chr> "", "", "", "", "", "", "", "", "", "CHECK 23...
## $ sub_id           <dbl> 4.03182e+18, 4.03172e+18, 4.03172e+18, 4.0317...

Make sure that the row and column counts match!
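One way to check the counts without reading them off by eye is to compare `dim()` of each table against the numbers printed by `glimpse()` above. This is only a sketch: `check_dims()` and `expected_dims` are hypothetical names introduced here, and the expected values are copied from the output shown in this README.

```r
# Expected dimensions, copied from the glimpse() output in this README.
expected_dims <- list(
  house_elections = c(rows = 2178, cols = 10),
  candidates      = c(rows = 5628, cols = 15),
  committees      = c(rows = 14454, cols = 15),
  contributions   = c(rows = 396369, cols = 22)
)

# check_dims() is a hypothetical helper (not from the project instructions).
# It returns one row per table with the actual dimensions and whether they
# match the expected ones. A table missing from `tables` is reported with NA
# dimensions and matches = FALSE.
check_dims <- function(tables, expected) {
  rows <- lapply(names(expected), function(nm) {
    actual <- dim(tables[[nm]])
    if (is.null(actual)) {
      actual <- c(NA_integer_, NA_integer_)
    }
    data.frame(
      table = nm,
      rows = actual[1],
      cols = actual[2],
      matches = isTRUE(all(actual == expected[[nm]])),
      stringsAsFactors = FALSE
    )
  })
  do.call(rbind, rows)
}

# Run the check only if all four objects exist in the global environment.
loaded <- vapply(
  names(expected_dims),
  function(nm) exists(nm, envir = globalenv(), inherits = FALSE),
  logical(1)
)
if (all(loaded)) {
  print(check_dims(mget(names(expected_dims), envir = globalenv()),
                   expected_dims))
}
```

Every row of the result should have `matches` equal to `TRUE`; a `FALSE` points to the table whose row or column count differs.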
