Commit
Sync man pages with the actual options
danieldk committed Nov 3, 2019
1 parent 64429da commit f678410
Showing 2 changed files with 55 additions and 13 deletions.
41 changes: 31 additions & 10 deletions man/finalfrontier-deps.1.md
@@ -41,18 +41,18 @@ discarded during training. The default context discard threshold is *1e-4*.
: The minimum count controls discarding of infrequent contexts. Contexts
occurring fewer than *FREQ* times are not considered during training. The
default minimum count is 5.

`--dims` *DIMS*

: The dimensionality of the trained word embeddings. The default
dimensionality is 300.

`--dependency_depth` *DEPTH*

: Dependency contexts up to *DEPTH* distance from the focus word in the
dependency graph will be used to learn the representation of the focus word. The
default depth is *1*.

`--dims` *DIMS*

: The dimensionality of the trained word embeddings. The default
dimensionality is 300.

`--discard` *THRESHOLD*

: The discard threshold influences how often frequent focus words are
@@ -83,14 +83,15 @@ minimum count is 5.

: The minimum n-gram length for subword representations. Default: 3

`--normalize_contexts`
`--ngram_mincount` *FREQ*

: Normalize the attached form in the dependency contexts.
: The minimum n-gram frequency. n-grams occurring fewer than *FREQ*
times are excluded from training. This option is only applicable
with the *ngrams* argument of the `subwords` option.

`--no_subwords`
`--normalize_contexts`

: Train embeddings without subword information. This option overrides
arguments for `buckets`, `minn` and `maxn`.
: Normalize the attached form in the dependency contexts.

`--ns` *FREQ*

@@ -108,6 +109,26 @@ arguments for `buckets`, `minn` and `maxn`.
threads increases the probability of update collisions, requiring
more epochs to reach the same loss.

`--subwords` *SUBWORDS*

: The type of subword embeddings to train. The possible types are
*buckets*, *ngrams*, and *none*. Subword embeddings are used to
compute embeddings for unknown words by summing embeddings of
n-grams within unknown words.

The *none* type does not use subwords. The resulting model will
not be able to assign embeddings to unknown words.

The *ngrams* type stores subword n-grams explicitly. The included
n-gram lengths are specified using the `minn` and `maxn`
options. The frequency threshold for n-grams is configured with
the `ngram_mincount` option.

The *buckets* type maps n-grams to buckets using the FNV1 hash.
The considered n-gram lengths are specified using the `minn` and
`maxn` options. The number of buckets is controlled with the
`buckets` option.
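
The bucket mapping described for the *buckets* type can be sketched as follows. This is a minimal illustration, not finalfrontier's implementation: the 64-bit hash width, the UTF-8 encoding of n-grams, the modulo reduction, and the helper names `fnv1_64` and `bucket_index` are all assumptions made for the example.

```python
# 64-bit FNV-1 parameters (standard published constants).
FNV1_64_OFFSET_BASIS = 0xCBF29CE484222325
FNV1_64_PRIME = 0x100000001B3
MASK_64 = (1 << 64) - 1


def fnv1_64(data: bytes) -> int:
    """64-bit FNV-1: multiply by the prime, then XOR in each byte."""
    h = FNV1_64_OFFSET_BASIS
    for byte in data:
        h = (h * FNV1_64_PRIME) & MASK_64  # keep the hash within 64 bits
        h ^= byte
    return h


def bucket_index(ngram: str, n_buckets: int) -> int:
    """Map an n-gram to a bucket index in [0, n_buckets).

    Hypothetical helper: the reduction by modulo is an assumption.
    """
    return fnv1_64(ngram.encode("utf-8")) % n_buckets
```

Because several n-grams can hash to the same bucket, this scheme trades exactness for a fixed-size embedding matrix, whereas the *ngrams* type stores each n-gram explicitly.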

`--untyped_deps`

: Only use the word of the attached token in the dependency relation as
27 changes: 24 additions & 3 deletions man/finalfrontier-skipgram.1.md
@@ -89,15 +89,36 @@ OPTIONS

The default model is *skipgram*.

`--no_subwords`
`--ngram_mincount` *FREQ*

: Train embeddings without subword information. This option overrides
arguments for `buckets`, `minn` and `maxn`.
: The minimum n-gram frequency. n-grams occurring fewer than *FREQ*
times are excluded from training. This option is only applicable
with the *ngrams* argument of the `subwords` option.

`--ns` *FREQ*

: The number of negatives to sample per positive example. Default: 5

`--subwords` *SUBWORDS*

: The type of subword embeddings to train. The possible types are
*buckets*, *ngrams*, and *none*. Subword embeddings are used to
compute embeddings for unknown words by summing embeddings of
n-grams within unknown words.

The *none* type does not use subwords. The resulting model will
not be able to assign embeddings to unknown words.

The *ngrams* type stores subword n-grams explicitly. The included
n-gram lengths are specified using the `minn` and `maxn`
options. The frequency threshold for n-grams is configured with
the `ngram_mincount` option.

The *buckets* type maps n-grams to buckets using the FNV1 hash.
The considered n-gram lengths are specified using the `minn` and
`maxn` options. The number of buckets is controlled with the
`buckets` option.
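
How the `minn`, `maxn` and `ngram_mincount` options interact for the *ngrams* type, and how an unknown word gets an embedding from its n-grams, can be sketched roughly as below. This is an illustration under stated assumptions, not finalfrontier's code: all function names are hypothetical, n-grams are counted per word occurrence, and any word-boundary marking the real tool may apply is omitted.

```python
from collections import Counter


def char_ngrams(word, minn, maxn):
    """All character n-grams of `word` with minn <= n <= maxn."""
    return [
        word[i:i + n]
        for n in range(minn, maxn + 1)
        for i in range(len(word) - n + 1)
    ]


def ngram_vocab(words, minn, maxn, mincount):
    """Keep only n-grams seen at least `mincount` times (assumed counting scheme)."""
    counts = Counter(ng for w in words for ng in char_ngrams(w, minn, maxn))
    return {ng for ng, c in counts.items() if c >= mincount}


def unknown_word_embedding(word, vocab, vectors, minn, maxn, dims):
    """Sum the vectors of the word's known n-grams (zero vector if none match)."""
    total = [0.0] * dims
    for ng in char_ngrams(word, minn, maxn):
        if ng in vocab:
            total = [t + v for t, v in zip(total, vectors[ng])]
    return total
```

With the *none* type there are no n-gram vectors to sum, which is why such a model cannot produce embeddings for unknown words.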

`--threads` *N*

: The number of threads to use during training for