The MIT FutureMakers Create-a-thon is a virtual, part-time 6-week AI learning program, developed through a collaboration between SureStart and the MIT RAISE (Responsible AI for Social Empowerment and Education) Initiative.
The FutureMakers program, which started on July 6th, 2021, includes:
- Learning AI, machine learning and emotion AI concepts
- Turning skills into confidence through building AI solutions hands-on
- Consistent one-on-one support and encouragement from a mentor
- Regular tech talks and career readiness seminars
- Entrepreneurship and leadership skills development
This 6-week virtual program provided me with a unique opportunity to build AI solutions that tackle some of today’s most pressing challenges.
Here is a list of all projects that I worked on during this program.
The last two weeks were a create-a-thon for which my team developed an app to aid the visually impaired in finding objects using object detection and speech recognition. More information on the project as well as the source code can be found here: https://github.com/mferuscomelo/ai-spy
| Mon | Tue | Wed | Thu | Fri | Sat | Sun |
|---|---|---|---|---|---|---|
|  |  |  | 01 | 02 | 03 | 04 |
| 05 | 06 | 07 | 08 | 09 | 10 | 11 |
| 12 | 13 | 14 | 15 | 16 | 17 | 18 |
| 19 | 20 | 21 | 22 | 23 | 24 | 25 |
| 26 | 27 | 28 | 29 | 30 | 31 |  |

| Mon | Tue | Wed | Thu | Fri | Sat | Sun |
|---|---|---|---|---|---|---|
|  |  |  |  |  |  | 01 |
| 02 | 03 | 04 | 05 | 06 | 07 | 08 |
| 09 | 10 | 11 | 12 | 13 | 14 | 15 |
| 16 | 17 | 18 | 19 | 20 | 21 | 22 |
| 23 | 24 | 25 | 26 | 27 | 28 | 29 |
| 30 | 31 |  |  |  |  |  |
I am looking forward to learning more about AI and its ethical impact on society. I am also hoping to apply the knowledge and skills learned during this program to hands-on projects to ensure that I properly understand the material.
Dr. Kong's seminar on leadership taught me how to use storytelling to take action in my community and make a positive difference.
There are three types of calling: a call of us (community), a call of self (leadership), and a call of now (strategy and action). Once you have decided on the calling, you craft a story that has three parts: a choice, a challenge, and an outcome. A moral at the end is a plus.
This well-planned seminar inspired me to share my own story and experience to become a supportive and encouraging visionary leader.
- Reviewed ML models with this article
Supervised Learning | Unsupervised Learning |
---|---|
Use Cases: Predict outcomes for new data | Use Cases: Gain insights from large volumes of data |
Applications: Spam Detection, Sentiment Analysis, Pricing Predictions | Applications: Anomaly Detection, Recommendation Engines, Medical Image Classification |
Drawbacks: Labeling the dataset and training models can be time-consuming | Drawbacks: Results can be inaccurate and can reflect biases present in the dataset |
Describe why the following statement is FALSE: Scikit-Learn has the power to visualize data without a Graphviz, Pandas, or other data analysis libraries.
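Scikit-Learn is a modeling library built on NumPy and SciPy, and it has no rendering engine of its own. Its plotting helpers draw with Matplotlib, and a decision tree exported with `export_graphviz` needs Graphviz to be rendered. Data is also typically loaded and explored with a library such as Pandas. Therefore, these libraries need to be installed alongside Scikit-Learn before it can be used to visualize data.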
- Gained a high-level understanding of DL models and algorithms through this article
According to the WHO, an estimated 684,000 fatal falls occur each year, making falls the second leading cause of unintentional injury death after road traffic injuries. Not all falls are fatal, however: 37.3 million falls are severe enough to require medical attention. With so many people being injured or killed each year by falls, it is of great social significance to provide them with accurate, dependable, and effective procedures to mitigate the effects of falls.
I am using the SisFall dataset, which consists of data collected from two accelerometers and one gyroscope. This dataset is the only one I could find that includes falls by people over 60. However, due to medical concerns, there is a bias towards younger groups of people.
I'm currently developing a deep learning model to detect falls, which will be deployed to an Arduino. The binary classification model combines a Convolutional Neural Network (CNN) with a Long Short-Term Memory (LSTM) network for time series prediction.
The biggest hurdle I am facing is the low processing power of the Arduino. I managed to train a model that detects falls with over 99.5% accuracy, but the feature extraction it required proved too computationally expensive for the Arduino to handle. I have therefore switched to a deep learning model that is trained on raw data. More information on the project can be found here: https://github.com/mferuscomelo/fall-detection
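Training on raw sensor data usually starts by cutting the continuous stream into fixed-length windows that the model can consume one at a time. The following is a minimal sketch of that idea, not the project's actual pipeline: the function name `sliding_windows`, the window length, the step size, and the sample readings are all assumptions made for illustration.

```python
def sliding_windows(samples, window_size, step):
    """Split a sequence of sensor readings into fixed-length windows.

    Consecutive windows overlap whenever step < window_size.
    Trailing samples that do not fill a whole window are dropped.
    """
    if window_size <= 0 or step <= 0:
        raise ValueError("window_size and step must be positive")
    windows = []
    for start in range(0, len(samples) - window_size + 1, step):
        windows.append(samples[start:start + window_size])
    return windows


# Hypothetical stream of (x, y, z) accelerometer readings, not real SisFall data
readings = [(i, i + 1, i + 2) for i in range(10)]

# Illustrative values only: windows of 4 samples, moving forward 2 samples at a time
windows = sliding_windows(readings, window_size=4, step=2)
```

Each window keeps the raw readings untouched, so no hand-crafted features have to be computed on the device before the model runs.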
- Read this article on the difference between AI and ML.
Tensors are data structures that can be visualized as n-dimensional arrays. Scalars, vectors, and matrices are the 0-, 1-, and 2-dimensional cases, so the name "tensor" is usually reserved for structures with 3 dimensions or more, so as to not confuse them with those lower-dimensional structures.
Tensors usually contain numerical data and are the backbone of neural networks. All transformations of a neural network can be reduced to tensor operations.
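The relationship between scalars, vectors, matrices, and higher-dimensional tensors can be sketched with plain nested Python lists. The helper names `shape_of` and `rank_of` are illustrative and not part of TensorFlow, and the sketch assumes regular (non-ragged) nesting.

```python
def shape_of(data):
    """Return the shape of a regular nested list as a tuple.

    A plain number has the empty shape (), a flat list has shape (n,), and so on.
    Only the first element at each level is inspected, so ragged lists are not detected.
    """
    shape = []
    while isinstance(data, list):
        shape.append(len(data))
        if not data:
            break
        data = data[0]
    return tuple(shape)


def rank_of(data):
    """Number of dimensions: 0 for a scalar, 1 for a vector, 2 for a matrix."""
    return len(shape_of(data))


scalar = 5.0
vector = [1.0, 2.0, 3.0]
matrix = [[1.0, 2.0], [3.0, 4.0]]
tensor3 = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]

for name, value in [("scalar", scalar), ("vector", vector),
                    ("matrix", matrix), ("tensor3", tensor3)]:
    print(name, "rank:", rank_of(value), "shape:", shape_of(value))
```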
What did you notice about the computations that you ran in the TensorFlow programs (i.e. interactive models) in the tutorial?
The datasets had to be processed before training the model so that it could better identify the relationships between the data. This process is called feature extraction or feature engineering.
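One common form of this processing is rescaling each numeric feature to a shared range so that no single column dominates training. The sketch below shows min-max scaling in plain Python as an example of such a step. It is an assumption that the tutorial used something similar, and the function name and sample values are made up for illustration.

```python
def min_max_scale(values):
    """Rescale a list of numbers linearly so the smallest becomes 0.0 and the largest 1.0."""
    lowest, highest = min(values), max(values)
    if highest == lowest:
        # A constant feature carries no information; map every entry to 0.0
        return [0.0 for _ in values]
    return [(v - lowest) / (highest - lowest) for v in values]


# Hypothetical raw feature column
raw = [50.0, 75.0, 100.0]
scaled = min_max_scale(raw)
```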
- Reviewed this guide about common components of neural networks and how they work with different ML functions and algorithms.
- Learned about CNNs using this cheatsheet
- Visualized how CNNs work with handwritten digits
- What is a confusion matrix?
- Reviewed presentation on algorithmic bias
- Played Survival of the Best Fit to learn more about how AI might impact human resources and hiring processes in different fields
This game demonstrated what happens when hiring decisions are automated using data that might be biased. The topic isn't as far-fetched as some may think. In 2014, Amazon decided to try to automate hiring at their company [1]. Just like in the game, the hiring process done by humans was already biased, with the overwhelming majority of hired employees being male. Although the team developing the algorithm might not have intended for this bias to be present, the sheer number of men's resumes in the training set created an algorithmic bias towards hiring male applicants. In real life, men get a lot more support for getting into STEM-related fields, whereas women are often actively discouraged from those careers and are instead steered towards jobs "meant for women."
In the same way, there was an abundance of orange candidates while hiring, showing that blue applicants were removed from the competition even before they could enter. Towards the end, it got so bad that the applicant pool of around 10 people consisted entirely of orange people. Looking at the statistics of my hiring process at the end, around 75% of the people hired and rejected were orange.
This, combined with "Google's" dataset that wasn't inspected for bias, meant that the algorithm was hiring orange people almost twice as often as blue people.
Can you give a real-world example of a biased machine learning model, and share your ideas on how you make this model more fair, inclusive, and equitable? Please reflect on why you selected this specific biased model.
In 2015, Google's image classification algorithm for Google Photos misclassified a Black couple as gorillas [2]. This racist classification was a result of algorithmic bias present in Google's ML model due to insufficient training data from a diverse group of people.
This example showed me that algorithms aren't as infallible as we might think. They propagate biases present in the data they see, some of which might not be noticed by humans. In the informational game Survival of the Best Fit, there was a slight bias towards hiring orange people and rejecting blue people. This, combined with the fact that "Google's" hiring data was used in the dataset without being checked for bias, meant that the algorithm trained on the data amplified these biases to the point of making orange people almost twice as likely as blue people to be hired.
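A disparity like "almost twice as likely to be hired" can be checked by comparing the hiring rate of each group. The sketch below is one simple way to do that. The function names and the applicant counts are hypothetical and are not the game's actual statistics.

```python
def selection_rates(decisions):
    """Compute the fraction of applicants hired in each group.

    decisions is a list of (group, was_hired) pairs.
    """
    totals = {}
    hired = {}
    for group, was_hired in decisions:
        totals[group] = totals.get(group, 0) + 1
        hired[group] = hired.get(group, 0) + (1 if was_hired else 0)
    return {group: hired[group] / totals[group] for group in totals}


def rate_ratio(rates, group, reference):
    """Hiring rate of `group` relative to `reference`; 1.0 means equal rates."""
    return rates[group] / rates[reference]


# Made-up example: 20 applicants per group, 12 orange hires and 6 blue hires
decisions = (
    [("orange", True)] * 12 + [("orange", False)] * 8
    + [("blue", True)] * 6 + [("blue", False)] * 14
)

rates = selection_rates(decisions)
ratio = rate_ratio(rates, "blue", "orange")
print(rates, ratio)
```

With these invented numbers, blue applicants are hired at half the rate of orange applicants, which is the kind of gap that should prompt a closer look at the training data.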
- Reviewed CNN Architecture
- Improved MNIST Digit Classification algorithm and added option to make predictions in the notebook
- Read article on choosing loss functions
- Watched lecture and reviewed slides on loss functions and optimization
- Reviewed article on choosing activation functions
- Learned how to implement the ReLU activation function
Choosing an activation function for a hidden layer
ReLU has become the most widely used activation function for hidden layers. The function is simple to use and effective at overcoming the drawbacks of earlier popular activation functions such as sigmoid and tanh. It is less prone to vanishing gradients, which prevent deep models from being trained. However, it can suffer from other issues, such as saturated or "dead" units.
The ReLU activation function can be used in hidden layers for multilayer perceptrons and convolutional neural networks.
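The behavior described above can be seen in a few lines of plain Python. This is a minimal scalar sketch for illustration, not how a deep learning framework implements these functions. The derivative of ReLU at exactly 0 is set to 0 here by convention.

```python
import math


def relu(x):
    """ReLU passes positive inputs through unchanged and clips the rest to 0."""
    return max(0.0, x)


def relu_grad(x):
    """Derivative of ReLU: 1 for positive inputs, 0 otherwise."""
    return 1.0 if x > 0 else 0.0


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_grad(x):
    """Derivative of sigmoid, which shrinks towards 0 for large |x|."""
    s = sigmoid(x)
    return s * (1.0 - s)


for x in (-5.0, 0.5, 10.0):
    print(x, relu(x), relu_grad(x), sigmoid_grad(x))
```

For a large positive input the sigmoid gradient is close to zero while the ReLU gradient stays at 1, which is why ReLU is less prone to vanishing gradients. For negative inputs the ReLU gradient is exactly 0, so a unit that only ever receives negative inputs stops learning: the "dead" unit problem.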
- Reviewed article on the importance of ethics in the real-world context of AI and automation.
- Reviewed different image classification techniques through this article
- Learned how to avoid overfitting
- Read about the ethics of machine learning
- Reviewed a tutorial on upsampling
- Learned about autoencoders
- Watched a TED Talk on the origins of Affective Computing
- Read about the EMPath Makeathon
- Reviewed this guide on applied NLP projects
- Watched this TED talk
- Flower Classification
- Predicting Housing Prices
- TensorFlow Basics
- Belgian Traffic Sign Classification
- Wine Identification
- Sarcasm Detection
- MNIST Digits Classification
- Housing Prices Classification
- Gender Recognition from Faces
- Animal Classification
- Sentiment Analysis
- Autoencoders
- Speech Emotion Analyzer
- Movie Review Classifier
- Loss functions cheatsheet
- CNN cheatsheet
- Activation functions cheatsheet
- Pandas Cheatsheet
- NLP Resources
Difference between a scalar, a vector, a matrix and a tensor
Choosing an activation function for a hidden layer
Choosing an activation function for an output layer
1: Goodman, Rachel. “Why Amazon's Automated Hiring Tool Discriminated Against Women.” American Civil Liberties Union, American Civil Liberties Union, 15 Oct. 2018, www.aclu.org/blog/womens-rights/womens-rights-workplace/why-amazons-automated-hiring-tool-discriminated-against.
2: “Google Apologizes for Photos App's Racist Blunder.” BBC News, BBC, 1 July 2015, www.bbc.com/news/technology-33347866.