This repository has been archived by the owner on Mar 25, 2019. It is now read-only.
===========
crossing
===========
crossing simplifies the creation of transformation matrices using scikit-learn.
In theory, crossing can create a transformation matrix that maps a vector
in language A to language B, with vector data provided by programs such as
word2vec.
Requirements
------------
Three files are needed for the generation of transformation matrices:
1. A vector space model for language A
2. A vector space model for language B
3. A dictionary file A->B
The word vectors should look like this:
a 0.1 0.2 0.3
word 1.0 2.0 3.0
another 1.1 2.2 3.3
thing 2.0 3.0 4.0
...
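As a minimal sketch of reading this format, assuming whitespace-separated
fields with the word first: the helper name ``load_vectors'' is hypothetical
and not part of crossing.

```python
def load_vectors(lines):
    """Parse 'word v1 v2 ...' lines into a dict mapping word -> list of floats."""
    vectors = {}
    for line in lines:
        parts = line.split()
        if len(parts) < 2:
            continue  # skip blank lines and lines without any vector components
        vectors[parts[0]] = [float(value) for value in parts[1:]]
    return vectors

# The sample rows from the format description above.
sample = [
    "a 0.1 0.2 0.3",
    "word 1.0 2.0 3.0",
    "another 1.1 2.2 3.3",
    "thing 2.0 3.0 4.0",
]
vectors = load_vectors(sample)
```

Passing an open file object instead of a list works the same way, since both
yield one line at a time.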
The dictionary file should look like this:
word_1 translation_1
...
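A dictionary file in this format could be read along the following lines.
This is only a sketch: it assumes exactly two whitespace-separated tokens per
line, and ``load_dictionary'' is a hypothetical helper, not part of crossing.

```python
def load_dictionary(lines):
    """Parse 'word translation' lines into a list of (word, translation) pairs."""
    pairs = []
    for line in lines:
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        pairs.append((parts[0], parts[1]))
    return pairs

# Placeholder entries in the same shape as the format description above.
sample = ["word_1 translation_1", "", "word_2 translation_2"]
pairs = load_dictionary(sample)
```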
Usage
-----
It is best used inside the Python interpreter or another script.
Depending on accuracy and regression model, several transformation matrices
can be collected in a ``VectorTransformator'' object:
>>> vt = crossing.VectorManager.VectorTransformator()
You have to fill VectorTransformator.V, VectorTransformator.W and
VectorTransformator.Dictionary with suitable language data.
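Conceptually, the dictionary selects which vectors of language A and
language B belong together as training pairs. The sketch below shows that
alignment step on toy data. It does not reflect how ``VectorTransformator''
stores its data internally; ``align_pairs'' and all words and values are
made up for illustration.

```python
def align_pairs(vectors_a, vectors_b, dictionary):
    """Return (X, Y): row i of X is a source vector, row i of Y its translation's vector."""
    X, Y = [], []
    for source, target in dictionary:
        # Only pairs with a vector in both models can serve as training data.
        if source in vectors_a and target in vectors_b:
            X.append(vectors_a[source])
            Y.append(vectors_b[target])
    return X, Y

# Toy data; words and values are illustrative only.
vectors_a = {"hund": [1.0, 0.0], "katze": [0.0, 1.0]}
vectors_b = {"dog": [0.0, 2.0], "cat": [2.0, 0.0]}
dictionary = [("hund", "dog"), ("katze", "cat"), ("vogel", "bird")]

X, Y = align_pairs(vectors_a, vectors_b, dictionary)
```

The pair ("vogel", "bird") is dropped because neither toy model contains it.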
A transformation matrix can then be created using:
>>> vt.createTransformationMatrix()
The matrix is then represented by a ``TransformationMatrix'' object. Both
``TransformationMatrix'' and ``VectorTransformator'' can then be used with
other NumPy data (matrices/vectors) using standard multiplication.
For instance, to transform a vector to language B:
>>> vt * "word"
Or:
>>> vt * [1.0, 2.0, 3.0]
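To illustrate the idea behind such a transformation matrix, the following
sketch fits a linear map between two toy vector spaces and applies it to a
vector. Note that it calls NumPy's least-squares solver directly instead of
crossing or a scikit-learn regression model; the data, ``true_map'' and
``transform'' are assumptions made for the example.

```python
import numpy as np

# Toy aligned training data (illustrative values): row i of X is a vector in
# language A, row i of Y is the vector of its translation in language B.
X = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0],
              [1.0, 1.0, 1.0]])
true_map = np.array([[0.0, 1.0],
                     [1.0, 0.0],
                     [2.0, 2.0]])
Y = X @ true_map

# Least-squares solution of X @ M = Y; M plays the role of the
# transformation matrix from language A to language B.
M, residuals, rank, singular_values = np.linalg.lstsq(X, Y, rcond=None)

def transform(vector, matrix=M):
    """Map a language-A vector into the language-B space."""
    return np.asarray(vector, dtype=float) @ matrix

translated = transform([1.0, 2.0, 3.0])
```

With real embeddings the fit is only approximate, so the result is usually
compared against the vectors of language B (for example by cosine similarity)
to find the nearest word.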