jacobthill edited this page Jul 8, 2024 · 2 revisions

Data Limitation Notes

  • SUL-Pub harvests publications for all active Stanford researchers. When a researcher leaves Stanford, SUL-Pub retains the publications it has already harvested but sets the capActive field to False, so no additional publications are harvested for that researcher. The RIALTO orgs app harvests organizational data from the Profiles API, which does not retain organizational data for researchers who are no longer employed by Stanford. As a result, SUL-Pub contains publications by researchers who do not appear in the authors.csv file exported from the RIALTO orgs app. We can still count the total number of publications and contributions, but we do not know the School, Department, or Academic Council status of those researchers, so reports that rely on these fields exclude publications for which the fields are empty. The problem compounds the farther back in time we go, because earlier years are more likely to include researchers who have since left Stanford: for example, more researchers have left Stanford since 2015 than since 2016. This must be taken into account when interpreting reports. For example, the answer to "Are Medical School publications increasing year over year?" is skewed by the fact that we are more likely to discard publications from earlier years, since we lack organizational data for their authors.
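One way to gauge the size of this effect before interpreting a trend is to measure, per publication year, the share of contributions whose author is missing from authors.csv. The following is a minimal sketch, not the RIALTO implementation: the field names `pub_year` and `author_id`, the function name, and the toy data are all hypothetical and would need to be mapped to the actual SUL-Pub and authors.csv columns.

```python
from collections import defaultdict


def missing_org_share_by_year(contributions, known_author_ids):
    """Return {year: (total, missing, share_missing)}.

    contributions: iterable of dicts with hypothetical keys
        "pub_year" and "author_id" (one row per contribution).
    known_author_ids: set of author ids present in authors.csv
        (i.e. authors that still have organizational data).
    """
    totals = defaultdict(int)
    missing = defaultdict(int)
    for row in contributions:
        year = row["pub_year"]
        totals[year] += 1
        # An author absent from authors.csv has no School/Department data.
        if row["author_id"] not in known_author_ids:
            missing[year] += 1
    return {
        year: (totals[year], missing[year], missing[year] / totals[year])
        for year in sorted(totals)
    }


# Toy data (invented for illustration only).
contributions = [
    {"pub_year": 2015, "author_id": "a1"},
    {"pub_year": 2015, "author_id": "a2"},
    {"pub_year": 2015, "author_id": "a3"},
    {"pub_year": 2015, "author_id": "a4"},
    {"pub_year": 2016, "author_id": "a1"},
    {"pub_year": 2016, "author_id": "a4"},
    {"pub_year": 2016, "author_id": "a1"},
    {"pub_year": 2016, "author_id": "a3"},
]
known_author_ids = {"a1", "a4"}  # authors still found in authors.csv

report = missing_org_share_by_year(contributions, known_author_ids)
for year, (total, n_missing, share) in report.items():
    print(f"{year}: {n_missing}/{total} contributions lack org data ({share:.0%})")
```

If the missing share differs noticeably between years, a year-over-year comparison of School or Department counts is likely to reflect that difference as well as any real change in output.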