This software computes the cardinalities of the Levenshtein neighborhoods of words. The counting technique is described in [On the Levenshtein Automaton and the Size of the Neighbourhood of a Word](https://link.springer.com/chapter/10.1007/978-3-319-30000-9_16), H. Touzet, LATA 2016.
The computation relies on automata intersection (see the paper). One of the automata is the Deterministic Universal Levenshtein Automaton (DULA). It depends only on the number of allowed errors, not on the words being evaluated, so it can be generated once for a fixed number of errors. The software can either load the DULA from a file or generate it on the fly. Because these automata are huge, please generate the file before running experiments with this software.
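To make the computed quantity concrete, here is a minimal brute-force sketch in Python. It is not the automata-based technique of the paper and is not part of this software: it simply enumerates every candidate word and checks its edit distance, so it is only usable for very short words and small k. The function names are hypothetical, and it assumes that the neighborhood contains all words at Levenshtein distance at most k, the word itself included.

```python
from itertools import product


def levenshtein(u: str, v: str) -> int:
    """Classical dynamic-programming edit distance (insert, delete, substitute)."""
    prev = list(range(len(v) + 1))
    for i, cu in enumerate(u, start=1):
        curr = [i]
        for j, cv in enumerate(v, start=1):
            curr.append(min(prev[j] + 1,               # delete cu
                            curr[j - 1] + 1,           # insert cv
                            prev[j - 1] + (cu != cv))) # match or substitute
        prev = curr
    return prev[-1]


def neighbourhood_size(word: str, k: int, alphabet: str) -> int:
    """Count words over `alphabet` within distance k of `word` (assumption: word included)."""
    count = 0
    # A word within distance k differs in length from `word` by at most k.
    for length in range(max(0, len(word) - k), len(word) + k + 1):
        for letters in product(alphabet, repeat=length):
            if levenshtein(word, "".join(letters)) <= k:
                count += 1
    return count
```

A call such as `neighbourhood_size("aaa", 1, "abcd")` can serve as a sanity check on tiny inputs. The running time grows exponentially with the word length, which is the reason the software relies on automata instead.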
Compilation:
make
Typical use:
wordbourhood -a 4 -k 5 -d dula5.fsm caaccab aaa bcbbdadddcbcb
Command options:
-alphabet-size, -a: alphabet size [default 4]
-dula, -d: path to the DULA file in fsm format. This file can be generated by the ula program.
-help, -h: help
-k: number of allowed errors [default 1]
-verbose, -v: verbose
-words, -w: words filename
Warning: DULA files are huge; please generate them before using this software.
- [On the Levenshtein Automaton and the Size of the Neighbourhood of a Word](https://link.springer.com/chapter/10.1007/978-3-319-30000-9_16), H. Touzet, LATA 2016
- [Oral presentation](http://www.gdr-bim.cnrs.fr/seqbio2016/wp-content/uploads/2016/11/transparents_touzet.pdf), Y. Dufresne, H. Touzet, SeqBio 2016