# knor for Python
These are Python 2.7+ and Python 3 bindings for the standalone, single-machine,
in-memory portion of the clustering NUMA Optimized Routines (`knor`). See the
full C++ library for details: https://github.com/flashxio/knor
## Supported OSes
The `knor` Python package has been tested on the following OSes:
- macOS Sierra, Mojave
- Ubuntu LTS 14.04, 16.04, 18.04
- Debian Linux
## Python Dependencies
- Numpy: `pip install -U numpy`
- Setuptools: `pip install -U setuptools`
- pybind11: `pip install -U pybind11`
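Before installing, it can help to see which of these packages are already importable. The snippet below is a minimal sketch using only the standard library. It assumes Python 3 (`importlib.util` is not available on Python 2.7), and the helper name `missing_dependencies` is illustrative, not part of `knor`.

```python
import importlib.util


def missing_dependencies(names=("numpy", "setuptools", "pybind11")):
    """Return the names in `names` that cannot be found on the import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_dependencies()
if missing:
    # Each missing package can be installed with `pip install -U <name>`.
    print("Missing packages: " + ", ".join(missing))
else:
    print("All listed dependencies are importable.")
```

This only checks that each package can be located; it does not check versions.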
## Mac Installation
```
pip install knor
```
## Linux Installation
### Best Performance Configuration
For the best performance, make sure the `numa` system packages are installed:
```
apt-get install -y libnuma-dbg libnuma-dev libnuma1
```
Then simply install the package via pip:
```
pip install knor
```
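To see whether the NUMA shared library is visible on a machine, a quick check along the following lines may help. This is a sketch using the standard library's `ctypes.util.find_library`; the function name `find_numa_library` is hypothetical. It only reports whether the runtime library can be located, and it does not check for the development headers from `libnuma-dev` or how `knor` itself was built.

```python
import ctypes.util


def find_numa_library():
    """Return the NUMA shared library name if it can be located, else None."""
    return ctypes.util.find_library("numa")


lib = find_numa_library()
if lib:
    print("libnuma found: %s" % lib)
else:
    print("libnuma not found; consider installing the numa system packages.")
```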
## Installation Errors
1.
```
File "/Library/Python/2.7/site-packages/pip/utils/__init__.py", line X, in
call_subprocess
% (command_desc, proc.returncode, cwd))
InstallationError: Command "python setup.py egg_info" failed with error
code 1 in /private/tmp/pip-build-vaASFl/knor/
```
### Solution
Update your Python version to at least `2.7.10`.
2.
```
In file included from knor/cknor/libkcommon/clusters.cpp:23:
knor/cknor/libkcommon/util.hpp:29:10: fatal error: 'random' file not found
#include <random>
^
1 error generated.
error: command '/usr/bin/clang' failed with exit status 1
```
### Solution
This usually occurs on Mac when Xcode and the Xcode command line tools are out of
date. Update them to at least version 8.
3.
```
unable to execute 'x86_64-linux-gnu-gcc': No such file or directory
error: command 'x86_64-linux-gnu-gcc' failed with exit status 1
```
### Solution
Install the `gcc` compiler via `apt-get install build-essential`
4.
```
fatal error: Python.h: No such file or directory compilation terminated.
```
### Solution
Install the development headers for the version of Python you intend to use with `knor`, e.g.:
```
apt-get install python-dev # for python2.x installs
apt-get install python3-dev # for python3.x installs
```
5.
```
fatal error: 'pybind11/pybind11.h' file not found
#include <pybind11/pybind11.h>
^~~~~~~~~~~~~~~~~~~~~
ImportError: No module named pybind11
```
### Solution
This occurs on Mac and is best solved by utilizing virtual environments as
follows:
```
sudo pip install virtualenv
virtualenv -p <python-version> <desired-path>
source <desired-path>/bin/activate
```
Here `<python-version>` is either `python2.7` or `python3`.
Then install `pybind11` with `pip install pybind11` and try installing `knor` again.
6.
```
ImportError: No module named 'setuptools'
```
### Solution
Use a recent version of `setuptools` from `pip`
```
sudo apt remove python-setuptools python3-setuptools
pip install -U setuptools
pip3 install -U setuptools
```
## Documentation
```
from knor import *
help(Kmeans)
help(SKmeans)
help(KmeansPP)
help(FuzzyCMeans)
help(Kmedoids)
help(Xmeans)
help(Hmeans)
help(Gmeans)
```
## Example
```
import knor
import numpy as np
data = np.random.random((100, 10))
km = knor.Kmeans(k=5)
ret = km.fit(data)
print(ret)
```
## The `cluster_t` return object
The `cluster_t` return object has the following attributes:
- `k`: The number of clusters requested
- `nrow`: The number of rows/samples in the dataset
- `ncol`: The number of columns/features in the dataset
- `sizes`: The number of samples in each cluster/centroid
- `iters`: The number of iterations performed
- `centroids`: A `list` where each row is a cluster center
- `clusters`: A `list` giving, for each sample, the index of the cluster it falls into
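
The attributes above are related to one another, which gives a simple way to sanity-check a result. The sketch below uses a hand-built stand-in object rather than a real `knor` result so that it runs without the package installed. The helper `check_cluster_result` is hypothetical. It assumes cluster indices are zero-based and that `centroids` is a nested list with one row per cluster, neither of which is confirmed here.

```python
from collections import Counter
from types import SimpleNamespace


def check_cluster_result(ret):
    """Return a list of inconsistencies found in a cluster_t-like object."""
    problems = []
    if len(ret.clusters) != ret.nrow:
        problems.append("clusters should have one entry per sample")
    if len(ret.sizes) != ret.k:
        problems.append("sizes should have one entry per cluster")
    elif sum(ret.sizes) != ret.nrow:
        problems.append("sizes should sum to nrow")
    else:
        # Assumption: cluster indices are zero-based integers.
        counts = Counter(int(c) for c in ret.clusters)
        if any(counts.get(i, 0) != ret.sizes[i] for i in range(ret.k)):
            problems.append("sizes should match the assignment counts")
    # Assumption: centroids is a nested list of k rows with ncol values each.
    if len(ret.centroids) != ret.k or any(
        len(row) != ret.ncol for row in ret.centroids
    ):
        problems.append("centroids should be k rows of ncol values")
    return problems


# Hand-built stand-in with the same attribute names as `cluster_t`.
fake = SimpleNamespace(
    k=2, nrow=4, ncol=2, iters=3,
    sizes=[3, 1],
    centroids=[[0.0, 0.0], [5.0, 5.0]],
    clusters=[0, 0, 1, 0],
)
print(check_cluster_result(fake))  # prints []
```

If the assumptions hold for your installed version, the value returned by `km.fit(data)` in the example above could be passed to the same helper.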