forked from datumbox/datumbox-framework
NEW ALGORITHMS
==============
- include an anomaly detection algorithm
- develop FunkSVD; also consider PLSI as a probabilistic counterpart of SVD
- add the ability to search the configuration space and select the best-performing algorithm configuration
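
Note on FunkSVD: a minimal sketch of the idea, assuming ratings arrive as {userIndex, itemIndex, value} triples. The class name, method names and hyperparameters are hypothetical and not part of the framework. All factors are updated together in each step, which is a common simplification of Funk's original one-feature-at-a-time training.

```java
import java.util.Random;

// Hypothetical sketch, not part of the framework: FunkSVD-style matrix
// factorization trained with stochastic gradient descent.
final class FunkSvdSketch {
    private final double[][] userFactors; // one row of k latent factors per user
    private final double[][] itemFactors; // one row of k latent factors per item

    FunkSvdSketch(int users, int items, int k, long seed) {
        Random rnd = new Random(seed);
        userFactors = randomMatrix(users, k, rnd);
        itemFactors = randomMatrix(items, k, rnd);
    }

    private static double[][] randomMatrix(int rows, int cols, Random rnd) {
        double[][] m = new double[rows][cols];
        for (double[] row : m) {
            for (int f = 0; f < cols; f++) {
                row[f] = 0.1 * rnd.nextDouble(); // small positive start values
            }
        }
        return m;
    }

    // Predicted rating: dot product of the user and item factor vectors.
    double predict(int user, int item) {
        double sum = 0.0;
        for (int f = 0; f < userFactors[user].length; f++) {
            sum += userFactors[user][f] * itemFactors[item][f];
        }
        return sum;
    }

    // Each rating is {userIndex, itemIndex, value}; only observed ratings are visited.
    void fit(double[][] ratings, int epochs, double learningRate, double regularization) {
        for (int epoch = 0; epoch < epochs; epoch++) {
            for (double[] r : ratings) {
                int u = (int) r[0];
                int i = (int) r[1];
                double error = r[2] - predict(u, i);
                for (int f = 0; f < userFactors[u].length; f++) {
                    double pu = userFactors[u][f];
                    double qi = itemFactors[i][f];
                    // Gradient step on the squared error with L2 regularization.
                    userFactors[u][f] += learningRate * (error * qi - regularization * pu);
                    itemFactors[i][f] += learningRate * (error * pu - regularization * qi);
                }
            }
        }
    }

    // Root mean squared error over the given ratings.
    double rmse(double[][] ratings) {
        double sumSquares = 0.0;
        for (double[] r : ratings) {
            double error = r[2] - predict((int) r[0], (int) r[1]);
            sumSquares += error * error;
        }
        return Math.sqrt(sumSquares / ratings.length);
    }
}
```

Possible usage: new FunkSvdSketch(3, 3, 2, 42L), then fit(ratings, 500, 0.02, 0.02) and compare rmse(ratings) before and after training. The configuration search item above could wrap such a call in a loop over candidate values of k and the learning rate, scored on held-out ratings.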
CODE IMPROVEMENT
================
- avoid final static memory configuration that cannot be changed during analysis
- create the required exceptions
- create a proper logger for the messages
- document the code
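
Note on the first three items: one possible shape, sketched under assumptions. MemoryConfiguration, InvalidConfigurationException and the two settings shown are hypothetical names chosen for illustration; the framework's real settings may differ. The point is only that the configuration lives in an instance that can be changed at runtime, invalid values raise a dedicated exception, and messages go through java.util.logging.

```java
import java.util.logging.Logger;

// Hypothetical exception type for rejected configuration values.
final class InvalidConfigurationException extends RuntimeException {
    InvalidConfigurationException(String message) {
        super(message);
    }
}

// Hypothetical instance-based configuration: no final static fields,
// so values can be changed between (or during) analyses.
final class MemoryConfiguration {
    private static final Logger LOGGER =
            Logger.getLogger(MemoryConfiguration.class.getName());

    private boolean storeInMemory = true; // assumed setting, for illustration
    private int cacheSize = 1024;         // assumed setting, for illustration

    boolean isStoreInMemory() {
        return storeInMemory;
    }

    void setStoreInMemory(boolean storeInMemory) {
        LOGGER.fine("storeInMemory set to " + storeInMemory);
        this.storeInMemory = storeInMemory;
    }

    int getCacheSize() {
        return cacheSize;
    }

    void setCacheSize(int cacheSize) {
        if (cacheSize <= 0) {
            throw new InvalidConfigurationException(
                    "cacheSize must be positive, got " + cacheSize);
        }
        LOGGER.fine("cacheSize set to " + cacheSize);
        this.cacheSize = cacheSize;
    }
}
```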
MODEL PERSISTENCE
=================
- check out Morphia's @PrePersist and @PostPersist lifecycle hooks instead of reflection: http://stackoverflow.com/questions/8312101/storing-multidimensional-arrays-with-morphia
- create MySQLDBStructureFactory, MemCacheStructureFactory (http://dustin.sallings.org/java-memcached-client/apidocs/net/spy/memcached/CacheMap.html)
- create a MapDBStructureFactory: https://github.com/jankotek/MapDB/blob/master/src/test/java/examples/_HelloWorld.java
- Building such factories is not a simple task. The library is tightly bound to Morphia and relies on its lifecycle methods (pre/post savers and loaders) and annotations to store data. Transient fields are also added to avoid in-memory serialization. The whole serialization layer should be redesigned to allow a custom, modular solution.
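
Note on the redesign: a minimal sketch of what a storage-agnostic factory could look like, assuming algorithms only need named java.util.Map instances. The StructureFactory interface, its method names and the in-memory implementation are hypothetical; the MySQL, MemCache and MapDB factories listed above would be further implementations of the same interface.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical interface: algorithms request named maps and never
// reference the storage engine (Morphia, MapDB, ...) directly.
interface StructureFactory {
    <K, V> Map<K, V> getMap(String collectionName);

    void dropMap(String collectionName);
}

// Hypothetical reference implementation that keeps everything in memory.
final class InMemoryStructureFactory implements StructureFactory {
    private final Map<String, Map<?, ?>> collections = new ConcurrentHashMap<>();

    @Override
    @SuppressWarnings("unchecked")
    public <K, V> Map<K, V> getMap(String collectionName) {
        // The same name always returns the same backing map.
        return (Map<K, V>) collections.computeIfAbsent(
                collectionName, name -> new ConcurrentHashMap<Object, Object>());
    }

    @Override
    public void dropMap(String collectionName) {
        collections.remove(collectionName);
    }
}
```

With this shape, swapping the storage engine means passing a different StructureFactory to the model, without touching annotations or transient fields in the algorithm classes.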
DOCUMENTATION
=============
- Documenting the code
- How-to blog post on the installation of the framework
- How-to blog post on building a Text Classification model
CHECK OUT HUGE COLLECTION LIBS, DBS AND STORAGE:
================================================
Java StoredMap + BerkeleyDB:
http://docs.oracle.com/cd/E17277_02/html/java/com/sleepycat/collections/StoredMap.html
http://www.oracle.com/technetwork/database/berkeleydb/overview/index-093405.html
Redis:
http://redis.io/
Cassandra Collections:
https://github.com/otaviojava/Easy-Cassandra/wiki
Vanilla-java - HugeCollections:
https://code.google.com/p/vanilla-java/wiki/HugeCollections
Fastutil:
http://fastutil.di.unimi.it/#install
http://search.maven.org/#search%7Cga%7C1%7Cg%3A%22it.unimi.dsi%22
Joafip:
http://joafip.sourceforge.net/javadoc/net/sf/joafip/java/util/PHashMap.html
Hibernate:
http://docs.jboss.org/hibernate/orm/3.6/reference/en-US/html/collections.html
MapDB:
https://github.com/jankotek/MapDB/blob/master/src/test/java/examples/_HelloWorld.java
Lazy loading persistent objects:
https://today.java.net/pub/a/today/2006/07/13/lazy-loading-is-easy.html
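
Note on lazy loading: the general idea can be sketched with a small wrapper that defers an expensive load until first access and then reuses the result. The Lazy class below is a hypothetical illustration using only java.util.function.Supplier; it is not taken from the linked article or from any of the libraries above.

```java
import java.util.function.Supplier;

// Hypothetical wrapper: runs the loader on the first get() and caches the result.
final class Lazy<T> implements Supplier<T> {
    private final Supplier<T> loader;
    private T value;
    private boolean loaded;

    Lazy(Supplier<T> loader) {
        this.loader = loader;
    }

    @Override
    public synchronized T get() {
        if (!loaded) {
            value = loader.get(); // e.g. read the object from the database
            loaded = true;
        }
        return value;
    }
}
```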