This repository has been archived by the owner on Apr 24, 2023. It is now read-only.

CPPNLPLib

Cal Poly Pomona NLP Lib

Tools

Crawler
  Amazon - using Amazon RESTful API (moved to https://github.com/AnakinFoxe/AmazonCrawler)
  Facebook - using Facebook4j
  Twitter - using Twitter4j
  
Translator
  Google - using Google Translate RESTful API

Utils

ChineseSeg - Chinese word segmentation using mmseg4j
FileProcessor - Batch processing of files
MapUtil - Sorting, updating, summation, and other helpers for Map
NGram - N-gram manipulation
Preprocessor - Text preprocessing
SemSimilarity - Word semantic similarity using ws4j
SentenceDetector - Sentence boundary detection
Stemmer - Stemming using the Snowball stemmer
Stopword - Stopword removal for different languages
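
To give a rough idea of what utilities such as NGram and MapUtil are for, here is a minimal sketch in plain Java. It does not use this library's API: the class name `NGramSketch` and the methods `ngrams`, `count`, and `sortByValueDesc` are hypothetical names chosen for illustration, and the actual classes may be organized differently.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: hypothetical names, not the CPPNLPLib API.
class NGramSketch {

    // Returns every contiguous run of n tokens, joined by single spaces.
    static List<String> ngrams(List<String> tokens, int n) {
        if (n <= 0) {
            throw new IllegalArgumentException("n must be positive");
        }
        List<String> result = new ArrayList<>();
        for (int i = 0; i + n <= tokens.size(); i++) {
            result.add(String.join(" ", tokens.subList(i, i + n)));
        }
        return result;
    }

    // Counts how often each item occurs, keeping first-seen order.
    static Map<String, Integer> count(List<String> items) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String item : items) {
            counts.merge(item, 1, Integer::sum);
        }
        return counts;
    }

    // Returns a copy of the map ordered by descending value.
    // List.sort is stable, so ties keep their original order.
    static Map<String, Integer> sortByValueDesc(Map<String, Integer> map) {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(map.entrySet());
        entries.sort(Map.Entry.<String, Integer>comparingByValue().reversed());
        Map<String, Integer> sorted = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> e : entries) {
            sorted.put(e.getKey(), e.getValue());
        }
        return sorted;
    }

    public static void main(String[] args) {
        List<String> tokens = List.of("to", "be", "or", "not", "to", "be");
        Map<String, Integer> bigramCounts = sortByValueDesc(count(ngrams(tokens, 2)));
        System.out.println(bigramCounts); // prints {to be=2, be or=1, or not=1, not to=1}
    }
}
```

The sketch assumes the input is already tokenized; in a real pipeline the tokens would typically come from a preprocessing step (stopword removal, stemming, and so on) before n-grams are counted.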