Updated Corpus Size
rll307 committed Sep 24, 2024
1 parent b6f885c commit 69f6880
Showing 1 changed file with 9 additions and 10 deletions.
19 changes: 9 additions & 10 deletions README.md
@@ -22,15 +22,14 @@ For the current time, the following datasets are available:

### Corpus size

| Doc | Types | Tokens |
|------------------------------|------------|-------------|
| CPI | 2615392 | 4563382 |
| Parliamentary Committees | 7985000 | 108251624 |
| Floor Parliamentary speeches | 3423405 | 322893136 |
| Gov. Programmes | 688342 | 5849807 |
| Inaugural Speeches | 31959 | 86206 |
| Total | 14.744.098 | 441.644.155 |

| Doc | Types | Tokens | Texts |
|------------------------------|---------|-----------|--------|
| CPI | 128089 | 3767972 | 75182 |
| Parliamentary Committees | 386534 | 91466136 | 2565 |
| Floor Parliamentary speeches | 1187492 | 367557793 | 434646 |
| Gov. Programmes | 218783 | 11158384 | 1120 |
| Inaugural Speeches | 15103 | 75918 | 35 |
| Total                        | 1.936.001 | 474.026.203 | 513.548 |
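
The Total row can be cross-checked by summing the per-dataset figures. The snippet below is a minimal sketch, not part of the repository: the counts are copied from the table above, and the names `corpus`, `column_totals`, and `dotted` are hypothetical helpers introduced only for illustration.

```python
# Per-dataset counts copied from the table above: (types, tokens, texts).
corpus = {
    "CPI": (128089, 3767972, 75182),
    "Parliamentary Committees": (386534, 91466136, 2565),
    "Floor Parliamentary speeches": (1187492, 367557793, 434646),
    "Gov. Programmes": (218783, 11158384, 1120),
    "Inaugural Speeches": (15103, 75918, 35),
}


def column_totals(rows):
    """Sum each column (types, tokens, texts) across all datasets."""
    return tuple(sum(column) for column in zip(*rows.values()))


def dotted(n):
    """Format an integer with '.' as the thousands separator."""
    return f"{n:,}".replace(",", ".")


types, tokens, texts = column_totals(corpus)
print(dotted(types), dotted(tokens), dotted(texts))
# prints: 1.936.001 474.026.203 513.548
```

Like the table, this adds the per-dataset type counts directly; it does not deduplicate types that occur in more than one dataset.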

## Availability

@@ -105,4 +104,4 @@ I would like to acknowledge CAPES and Alexander Humboldt Foundation for financin
[def2]: mailto:[email protected]
[def3]: mailto:[email protected]
[def4]: https://github.com/dariyash
[def5]: https://orcid.org/0000-0003-3709-4760
