Released 23 May 18:22
474d74c Add a build release task
97208f8 Write a simple README
ed0c97d Don't include symlinks in the corpus
a8ae6a2 Split words manually instead of by regexp
cf33283 Check that this is a text file before opening
45f20a4 Document use with non-English documents
2b745c5 Replace deprecated use of ioutil with io
71d71f7 Backfill stoplist tests
911c082 Move corpus parsing out of main
85c23ad Backfill similarity tests
9b10bb2 If no query file's provided, read from STDIN
0913d52 Add --no-stemming flag to skip stemming
b41825e Add --no-stoplist flag to skip stoplist
adb900a Default to searching current directory
1fa579d License with the GPLv3
68c9692 Add a simple Makefile
69dee47 Include a simple manual page
2be26f1 Move code into a lib directory
da14b80 Save memory by clearing term freqs after TF-IDF
30caf51 Recursively search directories for files
25f1c88 Add a --verbose flag
750d5f2 Add --omit-target flag to skip target in results
f81720b Just print errors to stderr, don't log
08612d3 Only search files that seem to contain text
144d473 Add flags for sort order, limit, showing scores
43be014 Sort results, low-to-high
3cf6192 Display search results more readably
27eff9a Search the corpus with a query document
371c11d Maintain TF-IDF weights for each document
471fcd1 Corpus stores its inverse document frequency
4b7b1ac Documents store their term frequency
5358a2d Maintain a corpus of documents
e023141 Track count of term occurrences
cc1a73d Memoize stemming results in a local cache
765d3af Stem words with the Porter Stemmer
bee8fd0 Don't include words in a standard English stoplist
7959b68 Split tokens but retain contractions
c5a08da Instantiate a document containing words
e36485e Get target file and search files from args
8cfa20f Hello, docsim.