Sync man pages with the actual options

finalfusion · Nov 3, 2019 · f678410
1 parent 64429da

Showing 2 changed files with 55 additions and 13 deletions.
diff --git a/man/finalfrontier-deps.1.md b/man/finalfrontier-deps.1.md
@@ -41,18 +41,18 @@ discarded during training. The default context discard threshold is *1e-4*.
 :   The minimum count controls discarding of infrequent contexts. Contexts
 occurring fewer than *FREQ* times are not considered during training.  The
 default minimum count is 5.
-
-`--dims` *DIMS*
-
-:   The dimensionality of the trained word embeddings. The default
-dimensionality is 300.
 
 `--dependency_depth` *DEPTH*
 
 :   Dependency contexts up to *DEPTH* distance from the focus word in the
 dependency graph will be used to learn the representation of the focus word. The
 default depth is *1*.
 
+`--dims` *DIMS*
+
+:   The dimensionality of the trained word embeddings. The default
+dimensionality is 300.
+
 `--discard` *THRESHOLD*
 
 :   The discard threshold influences how often frequent focus words are
@@ -83,14 +83,15 @@ minimum count is 5.
 
 :   The minimum n-gram length for subword representations. Default: 3
 
-`--normalize_contexts`
+`--ngram_mincount` *FREQ*
 
-:   Normalize the attached form in the dependency contexts.
+:   The minimum n-gram frequency. n-grams occurring fewer than *FREQ*
+    times are excluded from training. This option is only applicable
+    with the *ngrams* argument of the `subwords` option.
 
-`--no_subwords`
+`--normalize_contexts`
 
-:   Train embeddings without subword information. This option overrides
-arguments for `buckets`, `minn` and `maxn`.
+:   Normalize the attached form in the dependency contexts.
 
 `--ns` *FREQ*
 
@@ -108,6 +109,26 @@ arguments for `buckets`, `minn` and `maxn`.
     threads increases the probability of update collisions, requiring
     more epochs to reach the same loss.
 
+`--subwords` *SUBWORDS*
+
+:   The type of subword embeddings to train. The possible types are
+    *buckets*, *ngrams*, and *none*. Subword embeddings are used to
+    compute embeddings for unknown words by summing embeddings of
+    n-grams within unknown words.
+
+    The *none* type does not use subwords. The resulting model will
+    not be able to assign embeddings to unknown words.
+
+    The *ngrams* type stores subword n-grams explicitly. The included
+    n-gram lengths are specified using the `minn` and `maxn`
+    options. The frequency threshold for n-grams is configured with
+    the `ngram_mincount` option.
+
+    The *buckets* type maps n-grams to buckets using the FNV1 hash.
+    The considered n-gram lengths are specified using the `minn` and
+    `maxn` options.  The number of buckets is controlled with the
+    `buckets` option.
+
 `--untyped_deps`
 
 :   Only use the word of the attached token in the dependency relation as
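
The *buckets* description above can be illustrated with a short Python sketch. It is not finalfrontier's implementation: the 64-bit FNV-1 variant, the `<` and `>` word boundary markers, the modulo reduction, and the names `fnv1_64`, `char_ngrams`, `bucket_indices` and `n_buckets` are all assumptions made for illustration. How the value of the `buckets` option translates into a bucket count is not stated in this diff.

```python
# Assumption: 64-bit FNV-1 with the standard offset basis and prime.
FNV_OFFSET_BASIS_64 = 0xCBF29CE484222325
FNV_PRIME_64 = 0x100000001B3
MASK_64 = 0xFFFFFFFFFFFFFFFF


def fnv1_64(data: bytes) -> int:
    """FNV-1: multiply by the prime first, then XOR in the next byte."""
    h = FNV_OFFSET_BASIS_64
    for byte in data:
        h = (h * FNV_PRIME_64) & MASK_64
        h ^= byte
    return h


def char_ngrams(word: str, minn: int, maxn: int) -> list[str]:
    """All character n-grams with minn <= n <= maxn.

    Assumption: the word is wrapped in '<' and '>' boundary markers
    before n-grams are extracted.
    """
    marked = f"<{word}>"
    return [
        marked[i:i + n]
        for n in range(minn, maxn + 1)
        for i in range(len(marked) - n + 1)
    ]


def bucket_indices(word: str, minn: int, maxn: int, n_buckets: int) -> list[int]:
    """Map each n-gram of `word` to a bucket index (hypothetical helper)."""
    return [
        fnv1_64(ngram.encode("utf-8")) % n_buckets
        for ngram in char_ngrams(word, minn, maxn)
    ]
```

For example, `char_ngrams("ab", 3, 4)` yields `['<ab', 'ab>', '<ab>']`. Because distinct n-grams can hash to the same bucket, this scheme trades exactness for a fixed-size subword table, which is the main difference from the *ngrams* type.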

diff --git a/man/finalfrontier-skipgram.1.md b/man/finalfrontier-skipgram.1.md
@@ -89,15 +89,36 @@ OPTIONS
 
     The default model is *skipgram*.
 
-`--no_subwords`
+`--ngram_mincount` *FREQ*
 
-:   Train embeddings without subword information. This option overrides
-arguments for `buckets`, `minn` and `maxn`.
+:   The minimum n-gram frequency. n-grams occurring fewer than *FREQ*
+    times are excluded from training. This option is only applicable
+    with the *ngrams* argument of the `subwords` option.
 
 `--ns` *FREQ*
 
 :   The number of negatives to sample per positive example. Default: 5
 
+`--subwords` *SUBWORDS*
+
+:   The type of subword embeddings to train. The possible types are
+    *buckets*, *ngrams*, and *none*. Subword embeddings are used to
+    compute embeddings for unknown words by summing embeddings of
+    n-grams within unknown words.
+
+    The *none* type does not use subwords. The resulting model will
+    not be able to assign embeddings to unknown words.
+
+    The *ngrams* type stores subword n-grams explicitly. The included
+    n-gram lengths are specified using the `minn` and `maxn`
+    options. The frequency threshold for n-grams is configured with
+    the `ngram_mincount` option.
+
+    The *buckets* type maps n-grams to buckets using the FNV1 hash.
+    The considered n-gram lengths are specified using the `minn` and
+    `maxn` options.  The number of buckets is controlled with the
+    `buckets` option.
+
 `--threads` *N*
 
 :   The number of threads to use during training for
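
The interaction between `ngram_mincount` and the *ngrams* subword type can be sketched in Python as follows. This is an illustration under stated assumptions, not finalfrontier's code: the helper names (`char_ngrams`, `frequent_ngrams`, `embed_unknown`), the `<` and `>` boundary markers, and counting n-grams once per corpus token are all hypothetical choices.

```python
from collections import Counter


def char_ngrams(word, minn, maxn):
    """Character n-grams of a word; '<' and '>' markers are an assumption."""
    marked = f"<{word}>"
    return [
        marked[i:i + n]
        for n in range(minn, maxn + 1)
        for i in range(len(marked) - n + 1)
    ]


def frequent_ngrams(tokens, minn, maxn, mincount):
    """Keep only n-grams that occur at least `mincount` times.

    Mirrors the described behaviour of `ngram_mincount`: n-grams occurring
    fewer than FREQ times are excluded from training.
    """
    counts = Counter(
        ngram for token in tokens for ngram in char_ngrams(token, minn, maxn)
    )
    return {ngram for ngram, count in counts.items() if count >= mincount}


def embed_unknown(word, ngram_vectors, minn, maxn):
    """Sum the vectors of the word's n-grams that have an embedding.

    Returns None when no n-gram of the word is known.
    """
    known = [
        ngram_vectors[ngram]
        for ngram in char_ngrams(word, minn, maxn)
        if ngram in ngram_vectors
    ]
    if not known:
        return None
    return [sum(column) for column in zip(*known)]


tokens = ["cat", "cat", "car"]
kept = frequent_ngrams(tokens, minn=3, maxn=3, mincount=2)
# "car" and "ar>" occur only once and are dropped.
```

With the *none* type there are no n-gram vectors at all, so a lookup like `embed_unknown` has nothing to sum and an unknown word gets no embedding.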