FGV_Intro_DS

Introduction to Data Science @ FGV

This repository holds the code for the Introduction to Data Science course.

This class is about the Data Science process, in which we seek to gain useful predictions and insights from data. Through real-world examples and code snippets, we introduce methods for:

- data munging, scraping, sampling, and cleaning in order to get an informative, manageable data set;
- data storage and management in order to be able to access data (even if it is big data);
- exploratory data analysis (EDA) to generate hypotheses and intuition about the data;
- prediction based on statistical learning tools;
- communication of results through visualization, stories, and interpretable summaries.
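
As a rough illustration of how these steps fit together, the sketch below runs a tiny version of the process using only the Python standard library. It is not course material: the records are synthetic values made up for this example, and the names `raw`, `fit_line`, and `predict` are hypothetical. It covers cleaning, a minimal EDA summary, a one-feature least-squares prediction, and a one-line summary of the result.

```python
import statistics

# Synthetic toy records (made up for illustration, not course data).
# None marks a missing value that the cleaning step has to handle.
raw = [
    {"hours": 1.0, "score": 52.0},
    {"hours": 2.0, "score": 54.0},
    {"hours": None, "score": 60.0},
    {"hours": 3.0, "score": 56.0},
    {"hours": 4.0, "score": None},
    {"hours": 5.0, "score": 60.0},
    {"hours": 6.0, "score": 62.0},
]

# 1. Cleaning: keep only records with no missing fields.
clean = [r for r in raw if r["hours"] is not None and r["score"] is not None]

# 2. EDA: basic summary statistics to build intuition about the data.
xs = [r["hours"] for r in clean]
ys = [r["score"] for r in clean]
summary = {
    "n": len(clean),
    "mean_hours": statistics.mean(xs),
    "mean_score": statistics.mean(ys),
}


# 3. Prediction: ordinary least squares with a single feature.
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


slope, intercept = fit_line(xs, ys)


def predict(hours):
    """Predict a score from hours using the fitted line."""
    return intercept + slope * hours


# 4. Communication: report the result as a short, interpretable summary.
print(f"{summary['n']} complete records; "
      f"each extra hour is associated with about {slope:.1f} more points")
```

In practice, tools such as Pandas and a statistical learning library would replace the hand-written steps; the standard-library version is used here only to keep the sketch self-contained.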

Detailed Syllabus:

- Data science concepts and methodologies
  - What is Data Science
  - Data Science process
    - Business Intelligence
    - CRISP-DM
- Data Science Tools
  - Jupyter Notebook
  - Pandas 1, 2
- Exploratory data analysis 1, 2
- Data Science, AI and machine learning
- Linear regression and regularization
- Model selection and evaluation
- Classification: kNN, decision trees
- Classification: SVM
- Ensemble methods: random forests
- Intro to probability:
- Naive Bayes and logistic regression
- Feature engineering and selection
- Clustering: k-means, hierarchical clustering
- Dimensionality reduction: PCA and SVD
- Text mining and information retrieval
- Network Analysis
- Recommender systems
- Relational databases, SQL
- Big data storage and retrieval: NoSQL, GraphDB
- Big data distributed computing: MapReduce, Spark, RDD
- Neural Networks and Deep Learning

CarlaParreiras/FGV_Intro_DS
