Iota87/dealing_with_data

Prerequisite: Setting up Linux on Amazon EC2

The Basics: SSH, the command line, and curl

In class: Find a Web API using Mashape and issue requests using curl

Relational Databases

Entity-Relationship Model

  • Entities, Primary Keys, and Attributes
  • Relations
  • Cardinality: One-to-One, One-to-Many, Many-to-Many
In class: Artist-Gallery-Painting example

From ER Diagram to SQL Tables

  • Translating ER Diagrams to Tables
  • SQL Statements for Creating Tables

Querying a Database Using SQL

  • USE, DESCRIBE queries
  • Selection queries: *, column, column AS, DISTINCT, ORDER BY, LIMIT
  • Where clauses: Boolean conditions, IN, BETWEEN, LIKE
  • Aggregation queries: GROUP BY, SUM, AVG, MAX, MIN, ROLLUP
  • Join queries: INNER JOIN, OUTER JOIN
  • Subqueries and Views
In-class Exercise: Compare Tastes Across Demographic Segments

Additional Resources

Introduction to Python

Primitive Data Types

  • Strings
  • Integers, Floats, and Math operators
  • Booleans

Complex Data Structures

  • Lists
  • Sets
  • Tuples
  • Dictionaries
  • Nested data structures

Control Statements

  • Conditional statements (if-elif-else)
  • Loops (for loops, list comprehensions)

Beyond the Basics

  • Functions
  • Libraries
  • Files
In-class Exercise: Find Similar Company Names
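
As a rough illustration of what the exercise above might involve, here is a minimal sketch that compares company names with the standard library's `difflib`. The normalization rules, the `similarity` helper, and the sample names are assumptions made for this example, not the method used in class.

```python
import re
from difflib import SequenceMatcher

def normalize(name):
    """Lowercase, drop punctuation, and collapse whitespace (assumed cleanup rules)."""
    cleaned = re.sub(r"[^a-z0-9 ]", "", name.lower())
    return " ".join(cleaned.split())

def similarity(name_a, name_b):
    """Return a score between 0.0 and 1.0; higher means more similar."""
    return SequenceMatcher(None, normalize(name_a), normalize(name_b)).ratio()

def find_similar(query, candidates, threshold=0.8):
    """Return candidates scoring at or above the (hypothetical) threshold."""
    return [c for c in candidates if similarity(query, c) >= threshold]

# Invented sample names, for illustration only.
companies = ["APPLE INC", "Apple, Inc.", "Goldman Sachs"]
print(find_similar("Apple Inc.", companies))
```

The threshold is a tuning knob: lowering it catches more spelling variants but also lets in more false matches.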

Additional Resources

Regular Expressions

  • Atoms
  • Anchoring expressions
  • Repetition and Grouping operators
In-class Exercise: Extract Email from Web Page
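
To make the exercise above concrete, the following sketch pulls email addresses out of a string of HTML with Python's `re` module. The pattern is deliberately simplified and will not cover every valid address; the `extract_emails` helper and the sample HTML are assumptions made for this example.

```python
import re

# Simplified pattern: a local part, "@", then a domain containing at least one dot.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+")

def extract_emails(text):
    """Return the unique addresses found in text, in order of first appearance."""
    found = []
    for match in EMAIL_PATTERN.findall(text):
        if match not in found:
            found.append(match)
    return found

# Invented sample page fragment, for illustration only.
sample_html = (
    '<p>Contact <a href="mailto:jane.doe@example.edu">jane.doe@example.edu</a> '
    "or admin@example.org.</p>"
)
print(extract_emails(sample_html))
```

The character class for the local part plays the role of an atom, `+` is a repetition operator, and `(?:...)+` groups the dot-separated domain labels so they can repeat.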

Web APIs, Crawling & XPath

  • Python and Web APIs
  • Beyond the Basics: Parameters and Headers
  • (advanced) Using OAuth for authentication
  • XPath
  • Crawling Websites
In-class Exercise: Retrieve BuzzFeed articles

Python and Databases

  • Interacting with a database using Python
  • Inserting data into a database using Python
  • Retrieving data from a database using Python
In-class Exercise: Retrieve live weather or Citi Bike data and insert it into a database
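
The insert-then-retrieve pattern covered in this section can be sketched as follows. This example assumes the standard library's `sqlite3` module with an in-memory database so that it runs without a server; the class may use a different database. The `station_status` table, its columns, and the readings are hypothetical stand-ins for data that would normally come from a live API.

```python
import sqlite3

# In-memory database: nothing is written to disk (an assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE station_status ("
    "station_id INTEGER PRIMARY KEY, name TEXT, bikes_available INTEGER)"
)

# Invented readings, standing in for records fetched from a web API.
readings = [
    (1, "Station A", 5),
    (2, "Station B", 12),
    (3, "Station C", 0),
]

# Parameterized inserts: the driver fills in the ? placeholders safely.
conn.executemany("INSERT INTO station_status VALUES (?, ?, ?)", readings)
conn.commit()

# Retrieve stations with at least 5 bikes, busiest first.
rows = conn.execute(
    "SELECT name, bikes_available FROM station_status "
    "WHERE bikes_available >= ? ORDER BY bikes_available DESC",
    (5,),
).fetchall()
print(rows)
conn.close()
```

Passing values through `?` placeholders instead of building the SQL string by hand avoids quoting bugs and SQL injection.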

Processing Data using Python Pandas

Data Plotting and Visualization

Text Mining and Natural Language Processing

About

Material for the "Dealing with Data" class
