Commit

Merge pull request #135 from gsbdarc/merge_data_blog_fix
Closes #133
natalya-patrikeeva authored Jan 25, 2025
2 parents 15b0992 + 31a608c commit 33e87c4
Showing 1 changed file with 7 additions and 1 deletion.
8 changes: 7 additions & 1 deletion docs/blog/posts/merging_data_sets_dask.md
@@ -44,7 +44,7 @@ pip install -r requirements.txt
```

## Data Download
-We are going to use public EDGAR data from 2013. We will merge two data sets - EDGAR log files and financial statements. The Edgar log files are downloaded by grabbing a list of URLs for each log file from [the SEC website](https://www.sec.gov/files/edgar_logfiledata_thru_jun2013.html){:target="_blank"}.
+We are going to use public EDGAR data from 2013. We will merge two data sets - EDGAR log files and financial statements. The Edgar log files are downloaded by grabbing a list of URLs for each log file from [the SEC website](https://www.sec.gov/files/edgar_logfiledata_thru_jun2017.html){:target="_blank"}.

Save the following to a file called `download_logs.py`.
```python title="download_logs.py"
@@ -243,6 +243,12 @@ def download_financials(year, quarters):
download_financials('2013', quarters)
```

Then, run the saved script to download 4 zip financial files:

```title="Terminal Input"
python download_financials_2013.py
```

Now you should have 4 zip files downloaded in `data/financial2013` directory:

```bash title="Terminal Input"