arrow-r

Background

In organizations with limited resources and siloed teams, sharing data is often a clunky process. This is especially true when one department produces a large file (say, 12 GB) and another team must deliver an analysis on a tight deadline.

These situations inspired me to explore Apache Arrow. I used the following resources:

R for Data Science Chapter 22: Arrow by Hadley Wickham
Doing More With Data: An Introduction to Arrow for R Users by Danielle Navarro at Voltron Data
Using the {arrow} and {duckdb} packages to wrangle medical datasets that are Larger than RAM by Peter Higgins at R Consortium

Data

The Fire Department of New York City (FDNY) maintains data produced by its EMS dispatch system. The EMS Incident Dispatch Data file contains 27 million records covering incident location, perceived call severity, and Fire Department response time.
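
As a rough illustration of how a file like this can be queried without reading it fully into memory, here is a minimal sketch using the {arrow} and {dplyr} packages. It is not this project's actual script. The toy data frame and its column names (`borough`, `severity`, `response_seconds`) are hypothetical stand-ins for the real EMS fields, and a small temporary CSV takes the place of the full download.

```r
library(arrow)
library(dplyr)

# Toy stand-in for the EMS dispatch export; all column names are hypothetical.
toy <- data.frame(
  borough = c("BRONX", "BRONX", "QUEENS", "QUEENS"),
  severity = c(1L, 3L, 2L, 4L),
  response_seconds = c(300, 420, 360, 600)
)
csv_path <- tempfile(fileext = ".csv")
write.csv(toy, csv_path, row.names = FALSE)

# open_dataset() scans the file lazily instead of loading it into memory.
ems <- open_dataset(csv_path, format = "csv")

# The dplyr verbs build a query; collect() runs it and returns a tibble.
avg_response <- ems |>
  group_by(borough) |>
  summarise(mean_response = mean(response_seconds)) |>
  arrange(borough) |>
  collect()

print(avg_response)
```

With the real file, `csv_path` would point at the downloaded CSV, and only the small summarised result would be pulled into the R session.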

R Version

This project was produced with R version 4.3.1.

