Natural language processing support for Pandas DataFrames.
Text Extensions for Pandas adds extension types to Pandas DataFrames for representing natural language data, plus a library of functions for working with these extension types.
- Connect features with regions of a document
- Visualize the internal data of your NLP application
- Analyze the accuracy of your models
- Combine the results of multiple models
- Represent BERT embeddings in a Pandas series
- Store logits and other feature vectors in a Pandas series
- Store an entire time series in each cell of a Pandas series
It works with the output of the following NLP libraries and services:
- SpaCy
- Transformers
- IBM Watson Natural Language Understanding
- IBM Watson Discovery Table Understanding
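To make the span and feature-vector bullets above more concrete, here is a minimal sketch of the underlying idea using only plain Pandas and NumPy. It deliberately does not use this library's extension types: the sample text, the column names (`begin`, `end`, `label`, `covered_text`, `embedding`), and the toy vectors are all assumptions made for illustration. See the API documentation for the actual types the library provides.

```python
import numpy as np
import pandas as pd

# Hypothetical sample document.
text = "Ada Lovelace wrote the first program."

# Each row is a span: character offsets [begin, end) into `text`,
# plus a made-up label standing in for a model's output.
spans = pd.DataFrame({
    "begin": [0, 23],
    "end": [12, 36],
    "label": ["PERSON", "WORK"],
})

# Connect each row back to the region of the document it covers.
spans["covered_text"] = [
    text[b:e] for b, e in zip(spans["begin"], spans["end"])
]

# Toy feature vectors, one per span. A real application might put
# embeddings or logits here instead.
embeddings = np.arange(6, dtype=float).reshape(2, 3)

# Plain Pandas holds one array per cell in an object-dtype column.
spans["embedding"] = list(embeddings)

# Stack the per-row vectors back into a single 2-D array.
stacked = np.stack(list(spans["embedding"]))

print(spans[["label", "covered_text"]])
print(stacked.shape)
```

An object-dtype column like `embedding` works for small experiments, but every cell is a separate Python object, so vectorized operations and serialization are awkward. Dedicated extension types for spans and tensors are meant to avoid that.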
For examples of how to use the library, take a look at the notebooks in this directory.
API documentation can be found at https://text-extensions-for-pandas.readthedocs.io/en/latest/
The source code for Text Extensions for Pandas is available at https://github.com/CODAIT/text-extensions-for-pandas.
We welcome code and documentation contributions! See the README file for more information on contributing.