
CarlaParreiras/FGV_Intro_DS

FGV_Intro_DS

Introduction to Data Science @ FGV

Instructor: Renato Rocha Souza

This repository holds the code for the Introduction to Data Science course.

This class is about the Data Science process, in which we seek to gain useful predictions and insights from data. Through real-world examples and code snippets, we introduce methods for:

  • data munging, scraping, sampling, and cleaning in order to get an informative, manageable data set;
  • data storage and management in order to be able to access data (even when it is big data);
  • exploratory data analysis (EDA) to generate hypotheses and intuition about the data;
  • prediction based on statistical learning tools;
  • communication of results through visualization, stories, and interpretable summaries.
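
As a rough illustration of the steps listed above, here is a minimal sketch in plain Python. It is not taken from the course materials: the data set (`raw`), the helper names (`clean`, `summarize`, `fit_line`), and the choice of a one-variable least-squares fit are all assumptions made for this example, and the storage step is left out.

```python
from statistics import mean, median

# Hypothetical raw records: (hours_studied, exam_score); None marks a missing value.
raw = [
    (1.0, 52.0), (2.0, 55.0), (None, 60.0), (3.0, 61.0),
    (4.0, None), (5.0, 70.0), (6.0, 74.0),
]


def clean(records):
    """Munging and cleaning: drop rows that have any missing field."""
    return [(x, y) for x, y in records if x is not None and y is not None]


def summarize(values):
    """EDA: a few summary statistics for one column."""
    return {
        "min": min(values),
        "median": median(values),
        "mean": mean(values),
        "max": max(values),
    }


def fit_line(xs, ys):
    """Prediction: ordinary least squares for y = intercept + slope * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


rows = clean(raw)
xs = [x for x, _ in rows]
ys = [y for _, y in rows]
intercept, slope = fit_line(xs, ys)

# Communication: an interpretable one-line summary of each step's result.
print(f"kept {len(rows)} of {len(raw)} rows")
print(f"score summary: {summarize(ys)}")
print(f"each extra hour is associated with about {slope:.2f} more points")
```

In practice each step would typically rely on dedicated libraries rather than hand-written helpers; the sketch only shows how the stages feed into one another.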

Detailed Syllabus:

  • Data science concepts and methodologies

  • Data Science Tools

  • Exploratory data analysis 1, 2

  • Data Science, AI and machine learning

  • Linear regression and regularization

  • Model selection and evaluation

  • Classification: kNN, decision trees

  • Classification: SVM

  • Ensemble methods: random forests

  • Intro to probability

  • Naive Bayes and logistic regression

  • Feature engineering and selection

  • Clustering: k-means, hierarchical clustering

  • Dimensionality reduction: PCA and SVD

  • Text mining and information retrieval

  • Network Analysis

  • Recommender systems

  • Relational databases, SQL

  • Big data storage and retrieval: NoSQL, graph databases

  • Big data distributed computing: MapReduce, Spark, RDDs

  • Neural Networks and Deep Learning
