Update: This repository is no longer actively maintained. Development has moved to a new location: New Repository Link
Please refer to the new repository for the latest updates and bug fixes.
This page holds all major files for the OnTheBooks project at the University of South Carolina, carried out as part of the university's Digital Research Services. It extends the OnTheBooks project at the University of North Carolina at Chapel Hill Libraries.
The project consisted of several phases:
- Marginalia Removal
- Sentence Splitting and Cleaning
- Classification
- Corpus Analysis
More information about each phase can be found in the README.md file within that phase's folder.
Since GitHub might not display IPython notebooks nicely, at least as of writing this README, a better, static view of this repository and its files (especially the IPython notebooks) can be found on nbviewer's link for this repository. Note that nbviewer does not host notebooks; it only renders notebooks available on other websites.