jayanth-tjvrr/Edgar_data_analysis

Edgar_data_analysis

Data Extraction and Text Analysis of Financial Reports

The objective of this assignment is to extract the sections listed below from SEC / EDGAR financial reports and to perform text analysis on them to compute the variables listed below. Links to the financial reports are given in column F of the Excel spreadsheet “cik_list.xlsx”. Prepend https://www.sec.gov/Archives/ to every cell of column F to form the full link to the financial report.

Example: row 2, column F contains edgar/data/3662/0000950170-98-000413.txt. Prepending https://www.sec.gov/Archives/ gives the report link https://www.sec.gov/Archives/edgar/data/3662/0000950170-98-000413.txt
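The link construction described above can be sketched as follows. This is a minimal illustration, not the repository's own code: the names `ARCHIVES_BASE` and `build_report_url` are hypothetical, and the function assumes each column F cell holds a relative path like the one in the example.

```python
# Base URL taken from the assignment text.
ARCHIVES_BASE = "https://www.sec.gov/Archives/"


def build_report_url(relative_path: str) -> str:
    """Prepend the EDGAR archives base URL to a column F value.

    Surrounding whitespace and any leading slash are dropped so the
    result never contains a double slash after "Archives".
    """
    return ARCHIVES_BASE + relative_path.strip().lstrip("/")


# Row 2, column F from the example in the text.
print(build_report_url("edgar/data/3662/0000950170-98-000413.txt"))
```

When the spreadsheet is loaded into a dataframe, the same function could be applied to every cell of column F to produce a column of full report links.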

  1. “Management's Discussion and Analysis”: MDA
  2. “Quantitative and Qualitative Disclosures about Market Risk”: QQDMR
  3. “Risk Factors”: RF

The output dataframe should contain:

  1. All input variables in “cik_list.xlsx”
  2. mda_positive_score
  3. mda_negative_score
  4. mda_polarity_score
  5. mda_average_sentence_length
  6. mda_percentage_of_complex_words
  7. mda_fog_index
  8. mda_complex_word_count
  9. mda_word_count
  10. mda_uncertainty_score
  11. mda_constraining_score
  12. mda_positive_word_proportion
  13. mda_negative_word_proportion
  14. mda_uncertainty_word_proportion
  15. mda_constraining_word_proportion
  16. qqdmr_positive_score
  17. qqdmr_negative_score
  18. qqdmr_polarity_score
  19. qqdmr_average_sentence_length
  20. qqdmr_percentage_of_complex_words
  21. qqdmr_fog_index
  22. qqdmr_complex_word_count
  23. qqdmr_word_count
  24. qqdmr_uncertainty_score
  25. qqdmr_constraining_score
  26. qqdmr_positive_word_proportion
  27. qqdmr_negative_word_proportion
  28. qqdmr_uncertainty_word_proportion
  29. qqdmr_constraining_word_proportion
  30. rf_positive_score
  31. rf_negative_score
  32. rf_polarity_score
  33. rf_average_sentence_length
  34. rf_percentage_of_complex_words
  35. rf_fog_index
  36. rf_complex_word_count
  37. rf_word_count
  38. rf_uncertainty_score
  39. rf_constraining_score
  40. rf_positive_word_proportion
  41. rf_negative_word_proportion
  42. rf_uncertainty_word_proportion
  43. rf_constraining_word_proportion
  44. constraining_words_whole_report
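Items 2 to 44 follow a regular pattern: the same 14 metrics repeated for each of the three section prefixes, plus one whole-report variable. A short sketch of generating those column names is given here, so the output header does not have to be typed by hand. The names `SECTION_PREFIXES`, `METRICS`, and `output_columns` are hypothetical, and the columns for the input variables from “cik_list.xlsx” are not included because their headers are not listed here.

```python
# Section titles mapped to the prefixes used in the variable names.
SECTION_PREFIXES = {
    "Management's Discussion and Analysis": "mda",
    "Quantitative and Qualitative Disclosures about Market Risk": "qqdmr",
    "Risk Factors": "rf",
}

# The 14 per-section metrics, in the order they appear in the list.
METRICS = [
    "positive_score",
    "negative_score",
    "polarity_score",
    "average_sentence_length",
    "percentage_of_complex_words",
    "fog_index",
    "complex_word_count",
    "word_count",
    "uncertainty_score",
    "constraining_score",
    "positive_word_proportion",
    "negative_word_proportion",
    "uncertainty_word_proportion",
    "constraining_word_proportion",
]


def output_columns() -> list[str]:
    """Return the 43 computed column names (items 2 to 44), in list order."""
    columns = [
        f"{prefix}_{metric}"
        for prefix in SECTION_PREFIXES.values()
        for metric in METRICS
    ]
    # The single variable computed over the whole report comes last.
    columns.append("constraining_words_whole_report")
    return columns


print(len(output_columns()))
```

Generating the names from two small lists keeps the three sections consistent: adding or renaming a metric changes all three prefixed columns at once.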

Check the output data structure spreadsheet for the format of your output.
