Copy branch of eda work spacev2 for rev #13

Open
wants to merge 35 commits into
base: main
35 commits
399bec4
unzipping and extracting all files from coraal.zip to notebook
BJ-KHALED Jul 18, 2024
7bd81c7
looking into files in the coraal directory
BJ-KHALED Jul 18, 2024
3b83047
working on eda of interview date and making a df
BJ-KHALED Jul 18, 2024
e81fec6
visual bar chart of interviews per year
BJ-KHALED Jul 18, 2024
8d57155
evaluation on other metadata now
BJ-KHALED Jul 18, 2024
b712645
pause for today
BJ-KHALED Jul 18, 2024
a805b36
doing EDA on metadata folder with 6 txt files
BJ-KHALED Jul 19, 2024
e09a2e3
comparing histograms to bar charts
BJ-KHALED Jul 19, 2024
3a17f87
test pushing data to branch to see if I get error and correcting error
BJ-KHALED Jul 19, 2024
d4cba21
evaluating different visuals for metadata folder
BJ-KHALED Jul 19, 2024
bffb055
column analysis for dfs
BJ-KHALED Jul 19, 2024
7d7928d
Identifying useful columns for features
BJ-KHALED Jul 19, 2024
fc7b5d4
cleaning up df for running in model
BJ-KHALED Jul 19, 2024
cdd2b27
eda on cleaning df to be readable by model
BJ-KHALED Jul 19, 2024
56127c3
evaluating different cleaning process for transcript_df
BJ-KHALED Jul 19, 2024
81bd396
checking over random samples of cleaned data for mistakes
BJ-KHALED Jul 19, 2024
1e2dd8c
pause on work
BJ-KHALED Jul 19, 2024
650340a
removing rows from transcript df of Special symbols and keeping puncu…
BJ-KHALED Jul 21, 2024
73260a6
pause on sampling of cleaned df
BJ-KHALED Jul 21, 2024
c8c5c90
combining and cleaning df using only one loop
BJ-KHALED Jul 22, 2024
ca09fa9
cleaned the data for varying desires and implementing into sebastian …
BJ-KHALED Jul 22, 2024
bb22d62
5draft for cleaning data on transcript
BJ-KHALED Jul 22, 2024
6ca4b8e
added in comments
BJ-KHALED Jul 22, 2024
6a52ae7
removed outputs
BJ-KHALED Jul 23, 2024
34b5b25
looking for the max & min duration of audio files in dev folder
BJ-KHALED Jul 23, 2024
0adb7c0
attempting to create a desirable df
BJ-KHALED Jul 23, 2024
8edeceb
fixed issue with getting a duplicate row in indian accent nb and comm…
BJ-KHALED Jul 23, 2024
62a85eb
commenting on changes
BJ-KHALED Jul 23, 2024
22868f6
analysis of column data
BJ-KHALED Jul 23, 2024
1b6fefa
documenting steps
BJ-KHALED Jul 24, 2024
02f326c
commenting on steps of different code blocks
BJ-KHALED Jul 24, 2024
9ff7c61
commenting on steps for code blocks and what they do
BJ-KHALED Jul 24, 2024
ce5bd19
cleaning content column of indian accent df to be more readable
BJ-KHALED Jul 25, 2024
47f2249
getting audio tensors for indian accent df
BJ-KHALED Jul 26, 2024
e938929
evaluating audio tensors made
BJ-KHALED Jul 26, 2024