Skip to content

This repository contains common SQL features used in data analysis tasks. The queries are written in PostgreSQL.

License

Notifications You must be signed in to change notification settings

mijanr/Necessary-SQL

Repository files navigation

Necessary-SQL

This repository contains common SQL features used in data analysis tasks. The queries are written in PostgreSQL.

Creating Schema

We will name the schema as Sales in the following examples.

CREATE SCHEMA IF NOT EXISTS Sales;

A schema is a way to organize database objects (tables, views, functions, etc.) into separate namespaces within a database.

Creating Table

There will be two tables called customers and orders in the Sales schema.

customers table:

CREATE TABLE IF NOT EXISTS sales.customers (
    customer_id SERIAL PRIMARY KEY,
    first_name VARCHAR(50),
    last_name VARCHAR(50),
    email VARCHAR(100)
);

orders table:

CREATE TABLE IF NOT EXISTS sales.orders (
    order_id SERIAL PRIMARY KEY,
    customer_id INT REFERENCES sales.customers(customer_id),
    order_date DATE,
    total_amount NUMERIC(10, 2)
);

Inserting Data

Inserting data into customers table:

INSERT INTO 
    Sales.customers (first_name, last_name, email)
VALUES 
    ('John', 'Smith', '[email protected]'),
    ('Jane', 'Doe', '[email protected]'),
    ...

Inserting data into orders table:

INSERT INTO 
    Sales.orders (customer_id, order_date, total_amount)
VALUES 
    (1, '2020-01-01', 100.00),
    (1, '2020-01-02', 200.00),
    ...

Queries Explored

1. Query Foundations

- The SELECT & FROM commands
- Selection using WHERE / IN / AND / BETWEEN
- ORDER BY
- TOP / LIMIT
- Aggregation with GROUP BY + SUM / COUNT
- HAVING

2. Merging & Joining

- Merging data using LEFT & INNER
- Stacking data using UNION and UNION ALL
- Executing multiple queries using TEMP TABLES and CTE

3. Manipulation

- WINDOW functions + OVER / PARTITION BY
- Adding LEAD / LAG
- Rounding numerical data
- Selecting random samples of data
- Manipulating dates
- Manipulating text

This list is taken from Andre Jones LinkedIn profile.

SQL JOINS

The following diagram shows the different SQL JOINs: SQL JOINS

Requirements

requirements.yml file contains the necessary packages to run the Jupyter Notebook. To install the packages, run the following command:

conda env create -f requirements.yml

requirements.yml is generated using the following command:

conda env export --no-builds | grep -v "prefix" > requirements.yml

About

This repository contains common SQL features used in data analysis tasks. The queries are written in PostgreSQL.

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published