Urduhack is an NLP library for the Urdu language. It comes with many batteries-included features to help you process Urdu data as easily as possible.
You can reach out to core contributor Ikram Ali at https://github.com/akkefa
- Academic users: Experiment more easily and test hypotheses without coding from scratch.
- NLP beginners: Learn how to build an NLP project with production-level code quality.
- NLP developers: Build a production-level application within minutes.
- Normalization
- Preprocessing
- Tokenization
- Pipeline Module
- Models
- POS tagger
- Lemmatizer
- Named entity recognition
- Sentiment analysis
- Image to text
- Question answering system
- Datasets loader
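To give a feel for what the normalization step in the list above is about, here is a minimal standard-library sketch. It is not Urduhack's implementation and does not call its API; the `ARABIC_TO_URDU` table and the `normalize_chars` helper are hypothetical names introduced only for illustration. It assumes the common case where Arabic code points (Yeh U+064A, Kaf U+0643) are typed in place of their Urdu look-alikes (U+06CC, U+06A9).

```python
# Illustrative only: a tiny character-level normalizer, not Urduhack's own code.
# The mapping covers two Arabic code points that are often typed in place of
# their Urdu look-alikes; a real normalizer handles many more cases.
ARABIC_TO_URDU = {
    "\u064A": "\u06CC",  # ARABIC LETTER YEH -> ARABIC LETTER FARSI YEH
    "\u0643": "\u06A9",  # ARABIC LETTER KAF -> ARABIC LETTER KEHEH
}

# Build the translation table once so repeated calls stay cheap.
_TABLE = str.maketrans(ARABIC_TO_URDU)


def normalize_chars(text: str) -> str:
    """Replace the mapped Arabic characters with their Urdu equivalents."""
    return text.translate(_TABLE)


if __name__ == "__main__":
    sample = "\u0643\u062A\u0627\u0628"  # the word for "book", typed with an Arabic kaf
    print(normalize_chars(sample))
```

Mapping look-alike characters to a single code point matters because otherwise the same word can appear under several encodings, which fragments vocabularies and breaks exact-match lookups.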
Urduhack officially supports Python 3.6–3.7, and runs great on PyPy.
To install with the CPU version of TensorFlow:
$ pip install urduhack[tf]
To install with the GPU version of TensorFlow:
$ pip install urduhack[tf-gpu]
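import urduhack

# Download the pretrained models (needed before the first run)
urduhack.download()

# Build the processing pipeline
nlp = urduhack.Pipeline()
text = ""  # the Urdu text to process
doc = nlp(text)

for sentence in doc.sentences:
    print(sentence.text)
    # Part-of-speech tag for each word
    for word in sentence.words:
        print(f"{word.text}\t{word.pos}")

    # Named-entity tag for each token
    for token in sentence.tokens:
        print(f"{token.text}\t{token.ner}")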
Full documentation is available at https://urduhack.readthedocs.io/
Documentation | |
---|---|
Installation | How to install Urduhack and download models |
Quickstart | New to Urduhack? Here's everything you need to know! |
API Reference | The detailed reference for Urduhack's API. |
Contribute | How to contribute to the code base. |
Special thanks to everyone who contributed to getting Urduhack to its current state.
Thank you to all our backers! 🙏 [Become a backer]
Support this project by becoming a sponsor. [Become a sponsor]
Code released under the MIT License.