*********
ChangeLog
*********
0.9.8 (2015-10-20)
==================
- Fix url-gathering for pubmedcentral (after they had a schema change)
0.9.7 (2015-09-18)
==================
- Bugfixes for OAI url gathering
0.9.6 (2015-09-17)
==================
- Elasticsearch now retries requests after connection errors
- Calhoun harvester now ignores the invalid SSL certificate
- OAI url parser now terminates the regex capture after finding an invalid DOI
character
- harvester invoke task now sets the default start date to settings.DAYS_BACK
days before the end date
- scrapi.requests now exposes the requests.exceptions module
- Update README.md with updated date information
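
The default start date described above can be sketched as follows. This is a minimal illustration, not scrapi's actual code: ``resolve_date_range`` is a hypothetical helper, and the module-level ``DAYS_BACK`` is an assumed stand-in for ``settings.DAYS_BACK``.

```python
from datetime import date, timedelta

# Assumed stand-in for settings.DAYS_BACK; the real value lives in scrapi's settings.
DAYS_BACK = 1


def resolve_date_range(start=None, end=None, days_back=DAYS_BACK):
    """Return (start, end) as dates.

    A missing end date falls back to today; a missing start date falls back
    to days_back days before the end date.
    """
    end = end or date.today()
    start = start or end - timedelta(days=days_back)
    return start, end
```

Calling ``resolve_date_range(end=date(2015, 9, 17), days_back=2)`` would yield a range starting on 2015-09-15, while an explicitly passed start date is left untouched.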
0.9.5 (2015-09-14)
==================
- Clinical Trials harvester now dumps lxml elements to dictionaries in
the otherProperties field
0.9.4 (2015-09-10)
==================
- Biomedcentral harvester now filters out results from the future
0.9.3 (2015-09-01)
==================
- Capture more uris from pubmedcentral harvester
- Update favicons so that all favicons are .icos (fixes IE display bug)
- Fix longname for Portland State University harvester
0.9.2 (2015-09-01)
==================
- fix specification of canonicalUri requirements in schema
- update harvesters to reflect change in specification
0.9.1 (2015-09-01)
==================
- Document __repr__ no longer throws exceptions (allowing errors to be reported)
0.9.0 (2015-08-27)
==================
- Update setup documentation
- add harvesters:

  - WHOAS at MBLWHOI Library
  - The OAKTrust Digital Repository at Texas A&M
  - DigitalCommons@PCOM
  - PDXScholar
  - ScholarsArchive@OSU

- stricter date/time, email, uri validation
- automated data cleaning (strip out optional values with no semantic information)
- extraction of PDF links for OAI harvesters
- more consistent date formatting
- more consistent DOI extraction
- fix URLs for auto generated OAI harvesters
- OSF harvester now sorts by correct date
0.8.4 (2015-08-21)
==================
- Fix url gathering for datacite harvester
0.8.3 (2015-08-19)
==================
- Add funding information to the crossref harvester
0.8.2 (2015-08-11)
==================
- Add harvester for Washington University Open Scholarship
0.8.1 (2015-08-06)
==================
- Scitech harvester now uses the correct start and end dates
0.8.0 (2015-07-28)
==================
- Add harvesters for Smithsonian Digital Repository, Hacettepe,
Harvard Dataverse, Cyberleninka, Howard University, Scholarworks Umass,
Inter-University Consortium for Political and Social Research
- Python 3 support
- Fix DOI harvesting for OAI harvesters
- Fix OAI harvesters having their otherProperties overwritten when they
defined a new schema.
- Fix resumption tokens in OAI harvesters
- Fix date parsing for DOE schema harvesters
- Stop JSON processor from swallowing exceptions
- Update harvesters to make their schemas more closely match the spec
0.7.6 (2015-07-10)
==================
- Fix language harvesting for DOE and OAI harvesters
0.7.5 (2015-07-10)
==================
- Fix shareok harvester (SSL verification failures ignored)
0.7.4 (2015-07-08)
==================
- Fix probabilistic test failures
0.7.3 (2015-07-07)
==================
- Add Daily SSRN harvester
0.7.2 (2015-06-30)
==================
- Make harvesters run monday-sunday by default
0.7.1 (2015-06-15)
==================
- Base OAI schema now includes DOIs as object URIs
- If a migration begins to fail due to cassandra connection errors, we now
attempt to re-establish the connection
0.7.0 (2015-06-12)
==================
- Add University of Delaware, Harvard Dash,
Data Dryad, and Iowa Research harvesters
- Update skip logic for shareok
- Rewrote cassandra models to partition data to make migrations more efficient
- Added migration script for new models
- Rewrote migrations to take advantage of celery
- Added automatic malformed XML recovery
0.6.6 (2015-06-08)
==================
- Fixed small bug in dryad where documents without URIs were created
0.6.5 (2015-06-08)
==================
- Add harvard-dash, iowa research, and data dryad harvesters
- Make migrations a little more resilient (with autoretries)
- Fix a bug with introspection into function arguments for logging
0.6.0 (2015-05-04)
==================
- Better logging
- Add tests for harvesters
- Add the rename migration script
- Add the delete migration script
- Add the Zenodo, Scholarsbank, SHARE OK, CU Scholar, Calhoun, Caltech
Authors, BHL, and CogPrints harvesters
0.5.0 (2015-04-13)
==================
- Add the OSF harvester
0.4.0 (2015-04-10)
==================
- Data One now uses XMLHarvester
- PLoS now uses XMLHarvester
- Crossref is no longer limited to collecting 1000 documents
- Add the BioMed harvester
- Requests no longer crashes when recording is turned off
- Cassandra now only stores new versions of documents, no more duplicate
versions
- Use the jsonschema library for JSON transformer
- Implement the new schema
0.2.0 (2015-03-16)
==================
- Requests made with scrapi.requests are now recorded and replayed via
cassandra
- Improved test coverage
- Removed website; see erinspace/shareregistration or osf.io/share/ for its
replacement
- Manifest system for harvesters removed and replaced with metaclassing
- Added an img/ folder that stores the favicons of providers
- Implemented the transformer system which refactors how normalize is defined
for xml based harvesters
- Removed the storage module
0.1.0 (2015-03-09)
==================
Initial release