v0.1.0 original #17

Merged
merged 224 commits into from
Jul 15, 2024
Changes from 192 commits
Commits
128c132
Some updates to text functions
henriquesposito Mar 9, 2023
86c28de
Added urgency presentation
henriquesposito May 23, 2023
f3f76e5
Moved `extract_treaties()` function to poldis
henriquesposito Feb 20, 2024
cc1ae90
Added `extract_from_pdf()` draft function
henriquesposito Feb 21, 2024
78b8b56
Removed non standard files in package
henriquesposito Feb 26, 2024
23686b7
Small updates to extract script
henriquesposito Feb 26, 2024
49ad7bd
Reorganized scripts, updated documentation, and functions
henriquesposito Feb 26, 2024
414ade9
Added contributor names to description
henriquesposito Feb 27, 2024
a541a08
Added memo one as a case study, for now
henriquesposito Mar 10, 2024
1d5d065
Cleaned mini urgency dictionary in case study
henriquesposito Mar 11, 2024
84dd49f
Updated `annotate_text()`
henriquesposito Mar 12, 2024
8568bf0
Added `extract_promise()` function
henriquesposito Mar 12, 2024
b099f11
First attempt to extract subjects from text based on entity recognition
henriquesposito Mar 15, 2024
71da392
Updated description file
henriquesposito Mar 17, 2024
adcd55f
Added memo 2 to case study folder
henriquesposito Mar 17, 2024
36c2424
Added `extract_related_terms()` to segment functions
henriquesposito Mar 17, 2024
e88f55c
Added `get_urgency()` function
henriquesposito Mar 17, 2024
364f639
Updated documentation
henriquesposito Mar 17, 2024
d42af7c
Fixed knitting issue with Memo 2
henriquesposito Mar 17, 2024
30c12d0
Re-build README.md
Feb 7, 2022
8e422c2
Added some frequency adverbs
jhollway Feb 26, 2024
3087d0a
Added some temporal adverbs
jhollway Feb 26, 2024
6137a8f
Added adverbs of degree
jhollway Feb 26, 2024
56aaa99
Re-build README.md
Feb 26, 2024
16d6f22
Moved adverbs to urgency script in develop branch
henriquesposito Feb 26, 2024
86d3e4b
Ignore ds store
jhollway Mar 18, 2024
b263572
Added frequency adverbs
jhollway Mar 18, 2024
fcbbde7
Added first time adverb weightings
jhollway Mar 18, 2024
d0b8e9a
Added some words for extracting promises
jaeltan Mar 19, 2024
b8a73fb
Updated `get_urgency()` to account for word weights
henriquesposito Mar 20, 2024
44fe8f6
Updated `extract_promises()` to be more concise
henriquesposito Mar 20, 2024
4ff319c
Updated `extract_subjects()` to work at the sentence level (i.e. for …
henriquesposito Mar 20, 2024
2dad577
Made `get_urgency()` more concise and also to work at the sentence level
henriquesposito Mar 20, 2024
93d6590
Updated MEMO 2 for new changes
henriquesposito Mar 20, 2024
687ca91
First attempt to add commitment level to urgency
henriquesposito Mar 20, 2024
e2f4718
Updated `get_urgency()` to be more concise and faster; the function n…
henriquesposito Mar 21, 2024
460a837
Updated segment to add "class" to objects for later checking/identifi…
henriquesposito Mar 21, 2024
5a92d1c
Updates to case study
henriquesposito Mar 21, 2024
0a6d710
Effort at segmenting text
jaeltan Mar 21, 2024
56011a1
Updated documentation across package for consistency
henriquesposito Mar 21, 2024
04804b7
Updated arguments in `extract_subjects()` to make the function more f…
henriquesposito Mar 21, 2024
f913aee
Updated `get_urgency()` to normalize urgency score by the median numb…
henriquesposito Mar 21, 2024
d5ca67f
Added examples to functions across package
henriquesposito Mar 21, 2024
bb18ec6
some preliminary adjustments to dictionary words and weightings
jaeltan Mar 22, 2024
e53f253
Updated examples, fixed small bugs, and added more todo for `get_urge…
henriquesposito Mar 24, 2024
e56c6e2
Updated Memo 3 for meeting
henriquesposito Mar 24, 2024
6c9e7ea
Updated Memo 3
henriquesposito Mar 24, 2024
e250bd2
Ignored "DS_Store"
henriquesposito Mar 24, 2024
126baa2
Added data on adverbs and adjectives from SO-CALL in package
henriquesposito Mar 25, 2024
5645eea
Added codes for adjectives and adverbs when coding urgency
henriquesposito Mar 25, 2024
018c96c
Updated urgency to score sentences based on lemmas instead of full words
henriquesposito Mar 25, 2024
a4498c0
Updated description
henriquesposito Mar 26, 2024
42c28dc
Updated documentation and examples
henriquesposito Mar 26, 2024
37b3924
`extract_subjects()` also accounts for nouns alongside entities. `ext…
henriquesposito Mar 26, 2024
7a80f10
Added user messages for `get_urgency()`
henriquesposito Mar 26, 2024
3cb26e9
reformulated code for segmenting
jaeltan Mar 27, 2024
d1bce38
Change segmenting to remove problems first before detecting promises
jaeltan Mar 27, 2024
b07b629
added more words for detecting new segments
jaeltan Mar 27, 2024
4d9a7a5
Updated documentation and examples
henriquesposito Mar 27, 2024
5dad533
Updated `annotate_text()` to add variables for adverbs, adjectives, n…
henriquesposito Mar 27, 2024
f3daa3a
`extract_subjects()` and `extract_related_terms()` match only nouns a…
henriquesposito Mar 27, 2024
0b1a058
Updated internal data by adding lemmas to adjectives and adverbs
henriquesposito Mar 27, 2024
9d4f65a
Updated adverbs and adjectives to use lemmas and match only when word…
henriquesposito Mar 27, 2024
9b0e5dc
Added sentence connector data
henriquesposito Mar 27, 2024
feb9c13
Added Memo 4 for meeting
henriquesposito Mar 27, 2024
13a0b47
Added to list of words for segmenting texts
jaeltan Mar 28, 2024
19003c0
Added first version of `rank_urgent_topics()` function to rank urgent…
henriquesposito Mar 28, 2024
908d3c2
Renamed functions to rank topics `get_urgency_rank()` for consistency
henriquesposito Mar 29, 2024
2dc545a
Added WHO texts to Case Study file
henriquesposito Apr 5, 2024
5a69131
Fixed small bug with promises extraction
henriquesposito Apr 12, 2024
4786fcd
added sample texts from UNGDC database to case study
jaeltan Apr 18, 2024
19806ea
modified extract_promises to identify problems and promises so that u…
jaeltan Apr 18, 2024
68e3aa0
Small updates to `extract_promises()`
henriquesposito Apr 18, 2024
cad9f60
`get_urgency()` is now more flexible and works when `extract_related_…
henriquesposito Apr 18, 2024
ce0d54a
Added script on inter-coder reliability
henriquesposito Apr 18, 2024
2e97a7c
Small updates to inter-coder reliability script
henriquesposito Apr 18, 2024
5870092
updated extract_promises to remove negative sentences
jaeltan Apr 18, 2024
69a665f
updated extract_promises to retain only lemmas, adverbs, and adjectiv…
jaeltan Apr 18, 2024
b4be70b
updated namespace
jaeltan Apr 18, 2024
23cf2e6
Updated get_urgency to work with new format of extract_promises
jaeltan Apr 18, 2024
40f4562
renamed segments in extract_promises output
jaeltan Apr 18, 2024
6f9607a
fixed issue with inflated degree scores
jaeltan Apr 18, 2024
2451d4a
fixed issue with getting topics in get_urgency
jaeltan Apr 19, 2024
b6b888c
added code to filter out past sentences for promises
jaeltan Apr 19, 2024
266cd2d
Made some corrections to the way negative and past sentences are deal…
jaeltan Apr 19, 2024
16abbc3
Added memo for updates to extract_promises and get_urgency
jaeltan Apr 22, 2024
591d314
Changes to words for segmenting text
jaeltan Apr 22, 2024
ca8c293
Updated `extract_related_terms()` to rely on `keyATM, keyword-assiste…
henriquesposito Apr 25, 2024
fee39d6
Fixed messages with `annotate_text()`
henriquesposito Apr 26, 2024
e8a7e3e
Fixed issues and updated `extract_promises()` by simplifying the func…
henriquesposito Apr 26, 2024
45a795b
Fixed small issues with `get_urgency()`
henriquesposito Apr 26, 2024
40058b5
Removed `{rlang}` from description file
henriquesposito Apr 26, 2024
4d02e42
Added first draft of urgency methods
henriquesposito Apr 28, 2024
aeb124d
Updated `extract_promises()` to remove duplicates
henriquesposito Apr 29, 2024
181266b
Small updates to documentation for `extract-promises()`
henriquesposito Apr 29, 2024
0720f4c
updates to extract_promises to remove NAs, past sentences, and promis…
jaeltan Apr 29, 2024
2f861ab
Added COP21 Paris leaders meeting data
henriquesposito May 10, 2024
c43ef90
Added first version of website
jaeltan May 15, 2024
a54d913
Added US data and RMD with data/figures to case study
henriquesposito May 15, 2024
e1d3ece
Added EU speeches data to Case Study
henriquesposito May 17, 2024
36d1c4c
Updated markdown in case study folder to include comparison between s…
henriquesposito May 17, 2024
fbc974c
Updated `extract_promises()` to go back to working only at the senten…
henriquesposito May 21, 2024
4a11a5b
removed variables related to segments and added more matches for dete…
jaeltan May 21, 2024
78ab6af
Added more word matches for filtering out statements that are not pro…
jaeltan May 21, 2024
d512a23
Changes to word matches for non-promises
jaeltan May 22, 2024
c47c13d
Added urgency dictionary to case study for now
jaeltan May 24, 2024
ba2ae3a
Updated and reorganized text tools
henriquesposito May 30, 2024
7212886
Remove unused packages from description
henriquesposito May 31, 2024
2b025f3
Renamed and updated promises script
henriquesposito May 31, 2024
1f15163
Updated how topics are extracted from texts
henriquesposito May 31, 2024
9dd8878
Updated documentation for new changes
henriquesposito May 31, 2024
fcc6cc9
Updated package data by removing unnecessary datasets and adding CAP t…
henriquesposito May 31, 2024
87afe15
Added additional files to case study, for now
henriquesposito May 31, 2024
a17a237
Updated variable names in urgency dictionary
henriquesposito May 31, 2024
7180fc7
Updated `get_urgency()` to work with new dictionary and be more flexible
henriquesposito Jun 3, 2024
356da89
Removed `get_urgency_rank()` for now
henriquesposito Jun 3, 2024
7598c28
Fixed documentation issue with promises
henriquesposito Jun 3, 2024
ea5400f
Updated R dependency to 3.5. to avoid test issues
henriquesposito Jun 3, 2024
c65f698
Updated documentation for `extract_promises()`
henriquesposito Jun 3, 2024
e64d36a
Removed base R pipe in favor of `{dplyr}` pipe
henriquesposito Jun 3, 2024
909f09d
Updated documentation for `get_urgency()`
henriquesposito Jun 3, 2024
f25cf4d
Added basic pkgdown website that works
henriquesposito Jun 4, 2024
31bf4f9
Added {`pkgdown`} to description file
henriquesposito Jun 4, 2024
6db3a8e
Added "case study" files to .Rbuildignore
henriquesposito Jun 4, 2024
1372dd8
Renamed promises function `select_promises()`
henriquesposito Jun 4, 2024
e44be85
Renamed topic functions `gather_topics()` and `gather_related_terms()`
henriquesposito Jun 4, 2024
1590b88
Do not run examples for `extract_similarities()` that contain warning…
henriquesposito Jun 4, 2024
afb9d5f
Updated package documentation for new changes
henriquesposito Jun 4, 2024
b732070
Small updates to documentation for text tools
henriquesposito Jun 4, 2024
cc2dfa4
Added new markdown to illustrate new changes
henriquesposito Jun 6, 2024
5ad009e
Updated documentation and fixed issues with examples
henriquesposito Jun 7, 2024
7d6c998
Updated GitHub actions to check code coverage and build website
henriquesposito Jun 7, 2024
744be41
Updated NEWS
henriquesposito Jun 7, 2024
2934044
Updated description file
henriquesposito Jun 7, 2024
d14c1f8
removed default bump from push release workflow file
henriquesposito Jun 7, 2024
65013bf
Updated readme files by adding getting started text
henriquesposito Jun 7, 2024
0e33d67
Updated README file
henriquesposito Jun 7, 2024
b7f1a77
Merge branch 'master' into develop
henriquesposito Jun 7, 2024
9ba9628
Updated tests to avoid check failures
henriquesposito Jun 7, 2024
6818fcc
Fixed some small code factor issues
henriquesposito Jun 7, 2024
405945c
Updated urgency dictionary csv file
jaeltan Jun 7, 2024
d1d8faf
Moved case study files to case-study branch
henriquesposito Jun 10, 2024
8ffea73
Corrected function name for promises data and added CAP citation
jaeltan Jun 10, 2024
dcccf94
Made some minor edits to matched words for selecting promises
jaeltan Jun 10, 2024
b84b49b
Updated urgency words dictionary in sysdata
jaeltan Jun 10, 2024
1dc06bb
Updated CAP_topics in sysdata to include British and American spellin…
jaeltan Jun 11, 2024
3e2d10f
Updated README to make it more concise
henriquesposito Jun 11, 2024
7b569e4
Fixed dependency issues with `extract_similarities()`
henriquesposito Jun 11, 2024
148e338
Fixed issues with identifying similar names in `extract_speaker()`
henriquesposito Jun 11, 2024
2bf586e
Updated normalizing values in `get_urgency()`
henriquesposito Jun 11, 2024
20a198e
Updated package documentation to improve consistency
henriquesposito Jun 11, 2024
323e559
Fixed a few code issues from `{lintr}` checks
henriquesposito Jun 11, 2024
46483ce
Corrected some missing British spelling and extra spaces found in CAP…
jaeltan Jun 11, 2024
3a1c6c8
Updated internal data
henriquesposito Jun 12, 2024
8a824d8
Updated internal data once more
henriquesposito Jun 12, 2024
ec05edf
Updates to readme
jaeltan Jun 13, 2024
cfc82ed
Updated readme file
henriquesposito Jun 13, 2024
a086741
Corrected some spelling and phrasing
jaeltan Jun 13, 2024
1231de7
Delete .DS_Store
henriquesposito Jun 13, 2024
a520fb8
Removed "case study" from .Rbuildignore
henriquesposito Jun 13, 2024
88bf58a
Moved a few imported packages to depends using `thisRequires()`
henriquesposito Jun 13, 2024
d23c2bc
addressed some PR comments
jaeltan Jun 13, 2024
143853d
Updated logo for vectorized version
henriquesposito Jun 13, 2024
7bed39e
Fixed more issues with readme
henriquesposito Jun 13, 2024
23aaaa8
Renamed `extract_similarities()` to `extract_text_similarities()` and…
henriquesposito Jun 13, 2024
2a7287a
Moved `{stringdist}` package to suggests
henriquesposito Jun 13, 2024
b7d69eb
Updated documentation
henriquesposito Jun 13, 2024
907b17b
Small updates to readme
henriquesposito Jun 13, 2024
6e88a66
Replaced "for loops" with vectorised version for efficiency
henriquesposito Jun 13, 2024
1b2d323
Renamed `load_pdf()` function to `read_pdf()`
henriquesposito Jun 13, 2024
c713cd4
Moved `thisRequires()` function to utils file
henriquesposito Jun 13, 2024
d760478
Removed for loops in text tools functions
henriquesposito Jun 14, 2024
9cc8a90
Removed for loops in topic functions
henriquesposito Jun 14, 2024
c4b2a5d
Fixed issues with declaring custom lists as dictionaries
henriquesposito Jun 14, 2024
2ca5884
Updated description branch
henriquesposito Jun 14, 2024
5dd1c00
Updated `extract_location()` to use NLP instead to extract geographic…
henriquesposito Jun 14, 2024
5dd1933
Updated documentation for new changes
henriquesposito Jun 14, 2024
f132a78
Moved `.clean_token()` function to utils
henriquesposito Jun 14, 2024
2dac784
Fixed issues related to how location and names are extracted and grouped
henriquesposito Jun 14, 2024
01f2785
Updated tests for new changes with `extract_names()` and `extract_loc…
henriquesposito Jun 14, 2024
ec6b155
Commented out examples for `extract_locations()`
henriquesposito Jun 14, 2024
522a35f
Added tests for `gather_topics()`
henriquesposito Jun 14, 2024
33bd91f
Added tests for urgency
henriquesposito Jun 14, 2024
1f040a3
Added tests for `select_promises()`
henriquesposito Jun 14, 2024
055c962
Added in line comments to `select_promises()` function
henriquesposito Jun 14, 2024
cd0cea5
Updated how lists and vectors are identified in `gather_topics()`
henriquesposito Jun 14, 2024
8c368c7
Fixed misspelling in `split_text()` documentation
henriquesposito Jun 14, 2024
9d1f255
Updated "return" documentation for all functions
henriquesposito Jun 14, 2024
3956cea
Updates NEWS for consistency
henriquesposito Jun 14, 2024
27868ce
Updated documentation for functions to clarify various issues related…
henriquesposito Jun 14, 2024
a524e16
Updated pkgdown.yml documentation
henriquesposito Jun 14, 2024
8dcb041
Added in line comments for `extract_context()`
henriquesposito Jun 14, 2024
85f6eee
Updated NEWS file
henriquesposito Jun 17, 2024
2183b14
Updated documentation across package to fix spelling mistakes
henriquesposito Jun 17, 2024
e10334e
Updated logos in package and in website to make these consistent
henriquesposito Jun 17, 2024
c4725d4
Fixed notes with win devel checks
henriquesposito Jun 17, 2024
c63cfc8
Updated description file to avoid CRAN check issues
henriquesposito Jun 17, 2024
78df503
More updates to description file
henriquesposito Jun 17, 2024
c0e7cc6
Small updates to description
henriquesposito Jun 17, 2024
a3b7f3c
Small changes in description file
henriquesposito Jun 17, 2024
fa97186
More updates to description file
henriquesposito Jun 17, 2024
f4d4891
Updates to description file
henriquesposito Jun 17, 2024
cd05aa1
`gather_topic()` now returns a vector to facilitate working with 'urg…
henriquesposito Jun 27, 2024
2f53fe0
Updated urgency internal data
henriquesposito Jul 2, 2024
2796f19
Updated number of rows for urgency normalization
henriquesposito Jul 2, 2024
6ecd6be
Added argument to remove missing observations in `select_promises()`
henriquesposito Jul 2, 2024
1edf73f
More examples for `get_urgency()`
henriquesposito Jul 2, 2024
4b205e1
Updated date and suggested packages in description file
henriquesposito Jul 2, 2024
2fd78f7
Updated NEWS for some more changes
henriquesposito Jul 2, 2024
032856a
Added plotting and summary methods for "urgency" and "topics" classes
henriquesposito Jul 2, 2024
f39adb3
Updated examples for `select_topics()`
henriquesposito Jul 2, 2024
bff1ecd
Fixed bugs with plot and summary methods for "topics" and "urgency"
henriquesposito Jul 2, 2024
7c44f4b
Fixed issues with internal data for urgency terms
henriquesposito Jul 3, 2024
e19907e
Made `gather_topics()` function more strict with topic matching by on…
henriquesposito Jul 3, 2024
af6f897
Updated `get_urgency()` function to also be more strict with the matc…
henriquesposito Jul 3, 2024
fcbb72d
Small updates to internal data
henriquesposito Jul 3, 2024
48e14d7
Removed unnecessary suggested packages in description
henriquesposito Jul 3, 2024
33551f9
`.clean_token()` helper function does not remove punctuation or stopw…
henriquesposito Jul 3, 2024
7a390b5
Fixed ordering bug with `get_urgency()`
henriquesposito Jul 3, 2024
8c23719
Small updates to `gather_topic()` to make it more efficient
henriquesposito Jul 3, 2024
4b96a9f
Fixed test issues with new changes in functions
henriquesposito Jul 3, 2024
e6eaa9f
Moved `{tm}` package to suggests
henriquesposito Jul 3, 2024
1e108ea
Updated date in NEWS and description
henriquesposito Jul 15, 2024
5ad0f81
Built website to check
henriquesposito Jul 15, 2024
4 changes: 4 additions & 0 deletions .Rbuildignore
Original file line number Diff line number Diff line change
@@ -5,3 +5,7 @@
^.github$
^.gitignore$
^inst$
.DS_Store
henriquesposito marked this conversation as resolved.
^_pkgdown\.yml$
^docs$
^pkgdown$
5 changes: 4 additions & 1 deletion .github/workflows/prchecks.yml
@@ -2,7 +2,6 @@ on:
pull_request:
branches:
- master
- main

name: Binary checks

@@ -41,6 +40,7 @@ jobs:
any::rcmdcheck
any::lintr
any::spelling
any::covr
needs: check

- uses: r-lib/actions/check-r-package@v2
@@ -63,6 +63,9 @@ jobs:
name: ${{ matrix.config.asset_name }}
path: build/

- name: Calculate code coverage
run: Rscript -e "covr::codecov()"

- name: Lint
run: lintr::lint_package()
shell: Rscript {0}
63 changes: 48 additions & 15 deletions .github/workflows/pushrelease.yml
@@ -1,7 +1,6 @@
on:
push:
branches:
- main
- master

name: Check and release
@@ -39,6 +38,7 @@ jobs:
cache-version: 2
extra-packages: |
any::rcmdcheck
any::covr
any::remotes
needs: check

@@ -62,6 +62,9 @@ jobs:
name: ${{ matrix.config.asset_name }}
path: build/

- name: Calculate code coverage
run: Rscript -e "covr::codecov()"

release:
name: Bump version and release
needs: build
@@ -78,7 +81,6 @@ jobs:
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
WITH_V: true
DEFAULT_BUMP: patch

- name: Checkout two
uses: actions/checkout@v2
@@ -101,7 +103,7 @@ jobs:
run: ls -R

- name: Rename Mac release
run: mv ./macOS/*.tgz pkg_macOS.tgz
run: mv ./macOS/*.tgz poldis_macOS.tgz

- name: Upload Mac binary
id: upload-mac
@@ -110,12 +112,12 @@ jobs:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
with:
upload_url: ${{ steps.create_release.outputs.upload_url }}
asset_path: pkg_macOS.tgz
asset_name: pkg_macOS.tgz
asset_path: poldis_macOS.tgz
asset_name: poldis_macOS.tgz
asset_content_type: application/zip

- name: Rename Linux release
run: mv ./linuxOS/*.tar.gz pkg_linuxOS.tar.gz
run: mv ./linuxOS/*.tar.gz poldis_linuxOS.tar.gz

- name: Upload Linux binary
id: upload-lin
@@ -124,12 +126,12 @@ jobs:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
with:
upload_url: ${{ steps.create_release.outputs.upload_url }}
asset_path: pkg_linuxOS.tar.gz
asset_name: pkg_linuxOS.tar.gz
asset_path: poldis_linuxOS.tar.gz
asset_name: poldis_linuxOS.tar.gz
asset_content_type: application/zip

- name: Rename Windows release
run: mv ./winOS/*.zip pkg_winOS.zip
run: mv ./winOS/*.zip poldis_winOS.zip

- name: Upload Windows binary
id: upload-win
@@ -138,8 +140,8 @@ jobs:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
with:
upload_url: ${{ steps.create_release.outputs.upload_url }}
asset_path: pkg_winOS.zip
asset_name: pkg_winOS.zip
asset_path: poldis_winOS.zip
asset_name: poldis_winOS.zip
asset_content_type: application/zip

render:
@@ -151,16 +153,17 @@ jobs:

- uses: r-lib/actions/setup-r@v2

- uses: r-lib/actions/setup-pandoc@v1
- uses: r-lib/actions/setup-pandoc@v2

- uses: r-lib/actions/setup-r-dependencies@v2
with:
cache-version: 2
extra-packages: |
any::rcmdcheck
any::covr
any::remotes
needs: check

- name: Install package
run: R CMD INSTALL .

# Render README.md using rmarkdown
- name: render README
run: Rscript -e 'rmarkdown::render("README.Rmd", output_format = "md_document")'
@@ -170,3 +173,33 @@ jobs:
git add README.md
git commit -m "Re-build README.md" || echo "No changes to commit"
git push origin master || echo "No changes to commit"

pkgdown:
name: Build and deploy website
needs: render
runs-on: macOS-latest
env:
GITHUB_PAT: ${{ secrets.GITHUB_TOKEN }}
steps:
- uses: actions/checkout@v2

- uses: r-lib/actions/setup-r@v2

- uses: r-lib/actions/setup-pandoc@v2

- uses: r-lib/actions/setup-r-dependencies@v2
with:
cache-version: 2
extra-packages: |
any::rcmdcheck
any::pkgdown
needs: check

- name: Install package
run: R CMD INSTALL .

- name: Deploy package
run: |
git config --local user.email "[email protected]"
git config --local user.name "GitHub Actions"
Rscript -e 'pkgdown::deploy_to_branch(new_process = FALSE)'
2 changes: 2 additions & 0 deletions .gitignore
@@ -2,3 +2,5 @@
.Rhistory
.RData
.Ruserdata
.DS_Store
docs
47 changes: 31 additions & 16 deletions DESCRIPTION
@@ -1,13 +1,22 @@
Package: poldis
Type: Package
Title: Tools for Analyzing Political Discourse
Version: 0.0.3
Date: 2022-09-25
Author: person(given = "Henrique",
family = "Sposito",
role = c("cre", "aut", "ctb"),
email = "[email protected]",
ORDCID = "0000-0003-3420-6085")
Version: 0.1.0
Date: 2024-06-14
Author: c(person(given = "Henrique",
family = "Sposito",
role = c("cre", "aut", "ctb"),
email = "[email protected]",
ORDCID = c("IHEID", "0000-0003-3420-6085")),
person(given = "James",
family = "Hollway",
role = c("ctb"),
email = "[email protected]",
comment = c("IHEID", ORCID = "0000-0002-8361-9647")),
person(given = "Jael",
family = "Tan",
role = "ctb",
comment = c("IHEID", ORCID = "0000-0002-6234-9764")))
Maintainer: Henrique Sposito <[email protected]>
Description: Tools for analyzing political discourse beyond official speeches.
License: MIT + file LICENSE
@@ -16,18 +25,24 @@ Imports:
stringr,
purrr,
stringi,
messydates,
remotes,
stringdist,
entity
quanteda,
spacyr,
tm,
textstem,
tidyr,
stringdist
Suggests:
rmarkdown,
testthat,
covr
Remotes:
trinker/entity
RoxygenNote: 7.2.0
covr,
tesseract,
pkgdown,
quanteda.textstats,
keyATM,
messydates,
pdftools
RoxygenNote: 7.3.1
Encoding: UTF-8
LazyData: True
Depends:
R (>= 2.10)
R (>= 3.5.0)
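
A note on the contributor entries in this DESCRIPTION diff: base R's `utils::person()` has no `ORDCID` argument, and ORCID identifiers are conventionally passed as a named element of `comment`, in a field that R evaluates as code (`Authors@R:`). The following is a minimal sketch of how the three entries could be written under that assumption; it is not the package's final DESCRIPTION, and e-mail addresses are left out because they are redacted on this page.

```r
# Contributor details are taken from the DESCRIPTION diff above.
# The object names `authors` and `family_names` are illustrative only.
authors <- c(
  person(given = "Henrique", family = "Sposito",
         role = c("cre", "aut", "ctb"),
         comment = c("IHEID", ORCID = "0000-0003-3420-6085")),
  person(given = "James", family = "Hollway",
         role = "ctb",
         comment = c("IHEID", ORCID = "0000-0002-8361-9647")),
  person(given = "Jael", family = "Tan",
         role = "ctb",
         comment = c("IHEID", ORCID = "0000-0002-6234-9764"))
)

# `$` on a multi-person object returns a list, one element per person.
family_names <- unlist(authors$family)
print(family_names)
```

Evaluating the calls in an R session before pasting them into DESCRIPTION is a cheap way to catch a misspelled argument name, since `person()` stops with an "unused argument" error.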
29 changes: 25 additions & 4 deletions NAMESPACE
@@ -1,16 +1,37 @@
# Generated by roxygen2: do not edit by hand

export(annotate_text)
export(extract_context)
export(extract_location)
export(extract_date)
export(extract_locations)
export(extract_match)
export(extract_speaker)
export(extract_split)
export(extract_names)
export(extract_text_similarities)
export(extract_title)
export(gather_related_terms)
export(gather_topics)
export(get_urgency)
export(read_pdf)
export(select_promises)
export(split_text)
import(dplyr)
import(quanteda)
import(spacyr)
importFrom(dplyr,"%>%")
importFrom(dplyr,distinct)
importFrom(entity,person_entity)
importFrom(dplyr,filter)
importFrom(dplyr,group_by)
importFrom(dplyr,mutate)
importFrom(dplyr,select)
importFrom(dplyr,summarise)
importFrom(dplyr,summarize)
importFrom(dplyr,ungroup)
importFrom(purrr,map_chr)
importFrom(stringdist,stringsimmatrix)
importFrom(stringi,stri_trans_general)
importFrom(stringr,str_detect)
importFrom(stringr,str_extract)
importFrom(stringr,str_extract_all)
importFrom(stringr,str_remove_all)
importFrom(stringr,str_squish)
importFrom(tidyr,unite)
21 changes: 21 additions & 0 deletions NEWS.md
@@ -1,3 +1,24 @@
# poldis 0.1.0

2024-06-14
henriquesposito marked this conversation as resolved.

## Package

- Closed #3 by adding code coverage and code factor (and badges) to package
- Closed #7 by adding a getting started section in README
- Closed #8 by adding a `{pkgdown}` website

## Functions

- Updated text tools
- Renamed old text functions to start with "extract_" (`extract_speaker()`, `extract_title()`, `extract_context()`, `extract_date()`, `extract_location()`, `extract_match()`)
henriquesposito marked this conversation as resolved.
- Closed #11 by adding `extract_similarities()` to fuzzy match texts
henriquesposito marked this conversation as resolved.
- Added `annotate_text()` function to classify words or sentences using NLP
- Added `load_pdfs()` function to help users loading readable and non-readable text files from PDFs
- Closed #14 by adding `select_promises()` function to extract future promises in text
- Closed #15 by adding `gather_topics()` and `gather_related_terms()` for assigning topics to texts
- Added `get_urgency()` function for coding urgency from text

# poldis 0.0.3

2022-09-25
47 changes: 47 additions & 0 deletions R/promises.R
@@ -0,0 +1,47 @@
#' Select future promises from political discourses
#'
#' Political promises are statements in which actors express their
#' intent or commitment to take political action in the future.
#' @param .data A (annotated) data frame or text vector.
#' For data frames, function will search for "text" variable.
#' For annotated data frames, please declare an annotated data frame
#' at the sentence level.
#' @importFrom stringr str_detect str_remove_all
#' @importFrom dplyr mutate distinct %>%
#' @examples
#' #select_promises(US_News_Conferences_1960_1980[1:2,3])
#' @return A data frame with syntax information by sentences and
#' a variable identifying which of these sentences are promises.
#' @export
select_promises <- function(.data) {
tags <- sentence <- lemmas <- promises <- NULL
if (inherits(.data, "data.frame")) {

Check warning on line 18 in R/promises.R (GitHub Actions / Build for macOS-latest, Build for ubuntu-20.04)
file=R/promises.R,line=18,col=3,[brace_linter] Either both or neither branch in `if`/`else` should use curly braces.
if ("token_id" %in% names(.data))
stop("Please declare a text vector or an annotated data frame at the sentence level.")

Check warning on line 20 in R/promises.R (GitHub Actions / Build for macOS-latest, Build for ubuntu-20.04)
file=R/promises.R,line=20,col=81,[line_length_linter] Lines should not be more than 80 characters. This line is 92 characters.
} else .data <- suppressMessages(annotate_text(.data, level = "sentences"))

Check warning on line 21 in R/promises.R (GitHub Actions / Build for macOS-latest, Build for ubuntu-20.04)
file=R/promises.R,line=21,col=36,[object_usage_linter] no visible global function definition for 'annotate_text'
out <- .data %>%

Check warning on line 22 in R/promises.R (GitHub Actions / Build for macOS-latest, Build for ubuntu-20.04)
file=R/promises.R,line=22,col=16,[object_usage_linter] no visible global function definition for '%>%'
dplyr::mutate(lemmas = tolower(lemmas),
promises = ifelse(stringr::str_detect(tags, "PRP MD ")|

Check warning on line 24 in R/promises.R (GitHub Actions / Build for macOS-latest, Build for ubuntu-20.04)
file=R/promises.R,line=24,col=73,[infix_spaces_linter] Put spaces around all infix operators.
stringr::str_detect(lemmas,
"going to|go to |need to|ready to|

Check warning on line 26 in R/promises.R (GitHub Actions / Build for macOS-latest, Build for ubuntu-20.04)
file=R/promises.R,line=26,col=81,[line_length_linter] Lines should not be more than 80 characters. This line is 92 characters.
|is time to|commit to|promise to|have to|
|plan to|intend to|let 's|let us|urge|
|require|want to"),
paste(sentence), NA), # detect promises
promises = ifelse(stringr::str_detect(promises, " not |
|yesterday|last week|
|last month|last year|
|thank|honor|honour|
|applause|greet|laugh|
|privilege to|great to|
|good to be|good to see") |

Check warning on line 37 in R/promises.R (GitHub Actions / Build for macOS-latest, Build for ubuntu-20.04)
file=R/promises.R,line=37,col=81,[line_length_linter] Lines should not be more than 80 characters. This line is 83 characters.
stringr::str_detect(tags, "MD VB( RB)? VBN|

Check warning on line 38 in R/promises.R (GitHub Actions / Build for macOS-latest, Build for ubuntu-20.04)
file=R/promises.R,line=38,col=81,[line_length_linter] Lines should not be more than 80 characters. This line is 81 characters.
|VBD( RB)? VBN|VBZ( RB)? VBN|

Check warning on line 39 in R/promises.R (GitHub Actions / Build for macOS-latest, Build for ubuntu-20.04)
file=R/promises.R,line=39,col=81,[line_length_linter] Lines should not be more than 80 characters. This line is 86 characters.
|VBD( RB)? JJ|PRP( RB)? VBD TO|
|VBN( RB)? VBN"),
henriquesposito marked this conversation as resolved.
# Combinations of NLP tags to select
NA, promises)) %>%
dplyr::distinct()
class(out) <- c("promises", class(out))
out
}
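
The patterns in `select_promises()` are written as string literals that continue over several lines. In R, such a literal keeps the line break and the indentation, so they end up inside the regular expression as an extra alternative. One possible way to stay within the 80-character lint limit without that side effect is to hold the cues in a character vector and collapse them with `paste()`. The sketch below assumes a hypothetical subset of the cues and two made-up sentences, and uses base R `grepl()` as a stand-in for `stringr::str_detect()`.

```r
# Hypothetical subset of the promise cues used in `select_promises()`;
# the vector name and the example sentences are illustrative only.
promise_terms <- c(
  "going to", "need to", "ready to", "is time to", "commit to",
  "promise to", "have to", "plan to", "intend to", "want to"
)

# Collapse into one alternation, with no stray whitespace alternatives.
promise_pattern <- paste(promise_terms, collapse = "|")

# For comparison: a literal broken across lines keeps the newline and
# the indentation, which become part of the regular expression.
multi_line <- "going to|need to|
               |ready to"
has_newline <- grepl("\n", multi_line, fixed = TRUE)

sentences <- c(
  "We are going to cut taxes next year.",
  "Thank you all for coming."
)

# Flag sentences that contain at least one promise cue.
is_promise <- grepl(promise_pattern, tolower(sentences))
print(is_promise)
```

Keeping the cues in a vector also makes it easier to add or remove single terms in review, since each change touches one element rather than a long literal.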
Binary file modified R/sysdata.rda
Binary file not shown.