port over to sql for data science
Josiah Baker committed Feb 8, 2017
1 parent 4189842 commit afcfd2d
Showing 75 changed files with 759,785 additions and 221 deletions.
69 changes: 11 additions & 58 deletions README.md
@@ -1,68 +1,21 @@
# AdventureWorks for Postgres

This project provides the scripts necessary to set up the OLTP part of the go-to database used in
training classes and for sample apps on the Microsoft stack. The result is 68 tables containing HR,
sales, product, and purchasing data organized across 5 schemas. It represents a fictitious bicycle
parts wholesaler with a hierarchy of nearly 300 employees, 500 products, 20000 customers, and 31000
sales each having an average of 4 line items. So it's big enough to be interesting, but not
unwieldy. In addition to being a well-rounded OLTP sample, it is also a good choice to demonstrate
ETL into a data warehouse.
This is based off the work done by [lorint](https://github.com/lorint/AdventureWorks-for-Postgres). We've already done the work to convert the csv's to be compatible with postgres. If you would like the original files, head over to [Adventure Works 2014 OLTP](https://msftdbprodsamples.codeplex.com/downloads/get/880662) download page. The download includes a script for loading the data into MSSQL Server.

Provided is a ruby file to convert CSVs available on CodePlex into a format usable by Postgres, as
well as a Postgres script to create the tables, load the data, convert the hierarchyid columns, add
primary and foreign keys, and create some of the views used by Adventureworks.
## Getting started

## How to set up the database:
First, make sure you have postgres installed. You can do this by typing `psql` in terminal.
If nothing comes up, install postgres using `brew install postgres`. This will install and initialize a postgres database.

Download [Adventure Works 2014 OLTP Script](https://msftdbprodsamples.codeplex.com/downloads/get/880662).
(If this link becomes broken then here's the [original page](https://msftdbprodsamples.codeplex.com/releases/view/125550).)
### Windows

Extract the .zip and copy all of the CSV files into the same folder, also containing update_csvs.rb file and install.sql.
Head over to [https://www.postgresql.org/download/windows/](https://www.postgresql.org/download/windows/) and follow the instructions.

Modify the CSVs to work with Postgres by running:
```
ruby update_csvs.rb
```
Create the database and tables, import the data, and set up the views and keys with:
```
psql -c "CREATE DATABASE \"Adventureworks\";"
psql -d Adventureworks < install.sql
```
All 68 tables are properly set up, and 11 of the 20 views are established. The ones not built are those that rely on XML functions like value and ref. To see a list of tables, open psql, and then connect to the database and show all the tables with these two commands:
```
\c "Adventureworks"
\dt (humanresources|person|production|purchasing|sales).*
```
### Run the script

## Motivation
Once you have confirmed your postgres install, run the following two lines:

Five years ago I was pretty happy developing .NET apps for large organizations. The stack was
mature, and good practices surrounding software development were very respected. The same kind of
approach I appreciated from my days writing Java code was there, and the community was passionate.
psql -c "CREATE DATABASE \"Adventureworks\";"
psql -d Adventureworks < install.sql

Then along came Windows 8. The //build/ conference in September 2011 revealed its first beta, and
even with that early peek at the new direction things were headed, it was clear that everything about
the platform was a haphazard combination of the new Metro apps along with all the traditional control
panel and options and API for classic code. It left a very bad taste in my mouth. Perhaps it would
look pretty, but be very unusable. I couldn't see it ever being successful. Once the "red pill"
registry setting vanished from the builds in mid-2012, the Windows 7 interface was then no longer
available even in Server editions. I knew it was time for a change. For a year I stuck it out
watching to see if there was any hope for some kind of tablet miracle out of Redmond, but I was
consistently unimpressed.

In mid-2013 a friend looped me in on a new project involving Ruby on Rails, and I fervently dove in
and have very much enjoyed the elegance of that ecosystem. A big part of that has been ramping up my
knowledge of Postgres. What a great database engine! I figure that others departing the Microsoft
camp may appreciate the same data samples they're familiar with, so I created this along with the
Northwind sample. It's been useful in the classroom training folks about Rails. I expect with the
heavy-handed tactics Microsoft has now used around Windows 10 that even more organizations will
choose to transition away from that platform, so there will be lots of opportunity for samples like
this to help people learn a new environment.

As well, with the imminent release of SQL Server 2017 for Linux, this sample could be used to
evaluate performance differences between Postgres and SQL 2017. Never thought I'd see the day that
MS SQL got compiled for Linux, but alas, here we are.

Let's keep coding fun.

Enjoy!
You're all set!
1 change: 1 addition & 0 deletions data/AWBuildVersion.csv
@@ -0,0 +1 @@
1 12.0.1800 2014-02-20 04:26:00 2014-07-08 00:00:00