Update knowledge-submissions-past-wikipedia.md #105

Merged Jul 2, 2024 (2 commits)
Changes from 1 commit
48 changes: 31 additions & 17 deletions docs/knowledge-submissions-past-wikipedia.md
@@ -24,27 +24,41 @@ Status:
- `denied`: Denied by the legal team, and posted on the [avoided list][avoided].
- `submitted`: Sent to the legal team for review
- `proposed`: The community would like to propose this as a possible place to take knowledge submissions from.
- `reviewed - manually verify`: The legal team has reviewed this domain; much of its source material meets our open licensing criteria, but not all of it does. Each submission from this source must be manually verified to be under an appropriate content license or definitively in the public domain.

For the purposes of Knowledge submissions to the InstructLab project, data sourced from items in the `approved` category requires no further vetting by the Triage or other Maintainer teams. Submissions that draw on items in the `reviewed - manually verify` category must be vetted before they can be accepted.
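
The status rules above can be sketched as a small lookup. This is a minimal illustration only, not part of the project's tooling: the names `SourceStatus` and `triage_action` are hypothetical, and the actions returned for `denied`, `submitted`, and `proposed` are assumptions inferred from the status descriptions rather than documented policy.

```python
from enum import Enum


class SourceStatus(Enum):
    """Status values as listed in this document."""
    APPROVED = "approved"
    DENIED = "denied"
    SUBMITTED = "submitted"
    PROPOSED = "proposed"
    MANUALLY_VERIFY = "reviewed - manually verify"


def triage_action(status: SourceStatus) -> str:
    """Return an illustrative triage action for a source status (hypothetical helper)."""
    if status is SourceStatus.APPROVED:
        # Stated in the document: no further vetting is required.
        return "accept without further vetting"
    if status is SourceStatus.MANUALLY_VERIFY:
        # Stated in the document: vetting is required before acceptance.
        return "vet each submission's license before accepting"
    if status is SourceStatus.DENIED:
        # Assumption: a denied source on the avoided list is not accepted.
        return "do not accept; source is on the avoided list"
    # Assumption: `submitted` and `proposed` sources have not finished legal review.
    return "hold until legal review completes"


print(triage_action(SourceStatus("reviewed - manually verify")))
```

Keeping the status strings identical to the table values means a status read from the table can be passed straight to `SourceStatus(...)`.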

To ensure that the data you would like to include in your knowledge submission meets the project licensing criteria, please make sure to talk to the Taxonomy maintainer team *before* you begin work on your submission. We would hate for you to do a great deal of work only to be told that the data source you selected would not work for the project. Please make sure you review the [Getting Started with Knowledge Submissions](https://github.com/instructlab/taxonomy?tab=readme-ov-file#getting-started-with-knowledge-contributions) documentation prior to submitting your request.

| Domain name | Status | Notes |
| :-- | :-- | :-- |
| <https://en.wikipedia.org/wiki/Main_Page> | approved | |
| Wikipedia: <https://en.wikipedia.org/wiki/Main_Page> | approved | |
| Project Gutenberg: <https://www.gutenberg.org/> | approved | Pre-1927 works; public domain under US copyright law |
| <https://www.congress.gov/> | proposed | |
| <https://www.whitehouse.gov/> | proposed | |
| <https://www.senate.gov/> | proposed | |
| <https://www.irs.gov/> | proposed | |
| NASA: <https://www.nasa.gov/> | proposed | See guidelines: <https://www.nasa.gov/nasa-brand-center/images-and-media/> |
| Smithsonian Libraries: <https://library.si.edu/> | proposed | For any material marked "No Copyright - United States" or "CC0" as described here: <https://library.si.edu/copyright> |
| European Union (EU): <https://european-union.europa.eu/> | proposed | Specifically documents submitted under "public registrars": <https://european-union.europa.eu/principles-countries-history/principles-and-values/access-information_en> |
| Internet Archive: <https://archive.org/> | proposed | Pre-1927 works; public domain under US copyright law |
| Wikisource (library): <https://en.wikisource.org/> | proposed | "free library that anyone can improve" |

### Next steps

1. We need to identify the right person on the legal team to serve as the point person for this project.
1. Collect suggested places from the community and add them to the above table
1. Work with our legal team to get approvals and denials.
1. Inform the triage team and triagers of the new locations we can or cannot accept.
| Wikisource (library): <https://en.wikisource.org/> | approved | "free library that anyone can improve" |
| OpenStax textbooks family of publications <https://openstax.org/subjects> | approved | |
| The Open Organization publications <https://theopenorganization.org/> | approved | |
| The Scrum Guide <https://scrumguides.org/index.html> | approved | |
| <https://www.congress.gov/> | reviewed - manually verify | |
| <https://www.whitehouse.gov/> | reviewed - manually verify | |
| <https://www.senate.gov/> | reviewed - manually verify | |
| <https://www.irs.gov/> | reviewed - manually verify | |
| NASA: <https://www.nasa.gov/> | reviewed - manually verify | See guidelines: <https://www.nasa.gov/nasa-brand-center/images-and-media/> |
| Smithsonian Libraries: <https://library.si.edu/> | reviewed - manually verify | For any material marked "No Copyright - United States" or "CC0" as described here: <https://library.si.edu/copyright> |
| European Union (EU): <https://european-union.europa.eu/> | reviewed - manually verify | Specifically documents submitted under "public registrars": <https://european-union.europa.eu/principles-countries-history/principles-and-values/access-information_en> |
| Internet Archive: <https://archive.org/> | reviewed - manually verify | Pre-1927 works; public domain under US copyright law |
| PLOS family of open access journals: <https://plos.org/publish/> | reviewed - manually verify | |
| Open Practice Library: <https://openpracticelibrary.com/> | reviewed - manually verify | |
| Cynefin.io wiki: <https://cynefin.io/wiki/Main_Page> | reviewed - manually verify | |
| The Open Education Project: <https://research.redhat.com/blog/research_project/foundations-in-open-source-education/> | reviewed - manually verify | |
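
For triagers who want to list the sources that still need manual verification, the table above could be read programmatically. The following is a rough sketch under stated assumptions: `parse_source_table` and `sources_needing_vetting` are hypothetical helper names, the sample rows are copied from the table, and the parser assumes exactly three columns with no literal `|` inside a cell.

```python
def parse_source_table(markdown: str) -> list[dict[str, str]]:
    """Parse three-column Markdown table rows into dicts (hypothetical helper)."""
    rows = []
    for line in markdown.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # not a table row
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if len(cells) != 3:
            continue  # assumption: every data row has exactly three cells
        # Skip the header row and the alignment row (":--").
        if cells[0] == "Domain name" or set(cells[0]) <= set(":- "):
            continue
        rows.append({"source": cells[0], "status": cells[1], "notes": cells[2]})
    return rows


def sources_needing_vetting(rows: list[dict[str, str]]) -> list[str]:
    """Return sources whose status requires per-submission verification."""
    return [r["source"] for r in rows if r["status"] == "reviewed - manually verify"]


SAMPLE = """\
| Domain name | Status | Notes |
| :-- | :-- | :-- |
| Wikipedia: <https://en.wikipedia.org/wiki/Main_Page> | approved | |
| <https://www.irs.gov/> | reviewed - manually verify | |
"""

rows = parse_source_table(SAMPLE)
print(sources_needing_vetting(rows))
```

Splitting on `|` is deliberately simple; a note containing a literal pipe character would need a real Markdown parser instead.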

### Process steps

1. Collect suggested places from the community by requesting they submit a pull request against this dev doc.
1. Work with our legal team to adjudicate. [@lhawthorn](https://github.com/lhawthorn) is currently the owner of this step, but is happy to educate & empower other folks to do this work.
1. Inform the triage team and triagers of the new locations we can or cannot accept. This is currently done via an announcement in the [daily Triager Standup](https://github.com/instructlab/community/blob/main/Collaboration.md#triager-standup) and via a pull request to update the Knowledge Guide in one of the two locations listed below.

- Approved sources: <https://github.com/instructlab/taxonomy/blob/main/docs/KNOWLEDGE_GUIDE.md#accepted-knowledge>
- Rejected sources: <https://github.com/instructlab/taxonomy/blob/main/docs/KNOWLEDGE_GUIDE.md#avoid-these-topics>

[approved]: https://github.com/instructlab/taxonomy/blob/main/docs/KNOWLEDGE_GUIDE.md#accepted-knowledge
[avoided]: https://github.com/instructlab/taxonomy/blob/main/docs/KNOWLEDGE_GUIDE.md#avoid-these-topics