Update project to clojure 1.5.x #1

Open: wants to merge 2 commits into base: master
2 changes: 2 additions & 0 deletions .gitignore
@@ -23,3 +23,5 @@ aws.clj
*.project
*.settings
*.pyc
/.lein-failures
/.lein-repl-history
27 changes: 16 additions & 11 deletions project.clj
@@ -1,13 +1,18 @@
(defproject infer "1.0-SNAPSHOT"
(defproject infer "1.1-SNAPSHOT"
:min-lein-version "2.0.0"
:description "inference and machine learning for clojure"
:dependencies [[org.clojure/clojure "1.2.0-master-SNAPSHOT"]
[org.clojure/clojure-contrib "1.2.0-SNAPSHOT"]
[clojure-csv/clojure-csv "1.1.0"]
[org.apache.commons/commons-math "2.0"]
:dependencies [[org.clojure/clojure "1.5.1"]
[clojure-csv "2.0.0-alpha2" :exclude org.clojure/clojure]
[org.apache.commons/commons-math "2.2"]
[ujmp-complete "0.2.4"]
[org.apache.mahout/mahout-core "0.3"]
[colt/colt "1.2.0"]
[incanter/parallelcolt "0.9.4"]]
:dev-dependencies [[org.clojars.mmcgrana/lein-javac "0.1.0"]
[swank-clojure "1.2.0"]
[lein-clojars "0.5.0"]])
[colt/colt "1.2.0"]
[net.sourceforge.parallelcolt/parallelcolt "0.10.0"]
[org.clojure/algo.monads "0.1.4" :exclude org.clojure/clojure]
[org.clojure/math.combinatorics "0.0.4" :exclude org.clojure/clojure]
[org.clojure/math.numeric-tower "0.0.2" :exclude org.clojure/clojure]
[org.clojure/algo.generic "0.1.1" :exclude org.clojure/clojure]
;; [org.apache.mahout/mahout-math "0.7"]
]
:java-source-paths ["src/jvm"]
:jvm-opts ["-Xmx512m"]
)
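A note on the `:exclude` key in the new dependency vectors: Leiningen's sample project.clj spells per-dependency exclusions as `:exclusions` followed by a vector of coordinates. The fragment below is a minimal sketch under that assumption, reusing three coordinates from the diff. Whether `:exclude` is ignored or rejected here has not been verified; `lein deps :tree` should show what actually gets resolved.

```clojure
;; Sketch only, not the PR's project.clj: the same coordinates with the
;; exclusion written as :exclusions plus a vector (assumed intent of :exclude).
(defproject infer "1.1-SNAPSHOT"
  :min-lein-version "2.0.0"
  :dependencies [[org.clojure/clojure "1.5.1"]
                 [clojure-csv "2.0.0-alpha2" :exclusions [org.clojure/clojure]]
                 [org.clojure/algo.monads "0.1.4" :exclusions [org.clojure/clojure]]])
```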
22 changes: 10 additions & 12 deletions src/infer/classification.clj
@@ -4,12 +4,10 @@

Classifiers are maps of classifier-name -> functions, data are maps of
feature-name -> features."
(:use infer.features)
(:use infer.neighbors)
(:use infer.linear-models)
(:use [clojure.contrib.map-utils :only [deep-merge-with]])
(:use [infer.core :only [safe threshold-to map-map levels-deep all-keys]])
(:use [infer.probability :only [bucket +cond-prob-tuples]]))
(:use [infer features neighbors compat linear-models]
[infer.core :only [safe threshold-to map-map levels-deep all-keys]]
[infer.probability :only [bucket +cond-prob-tuples]]
))

(defn discretizing-classifiers
"Makes a discretizing classifier out of each key-range pair."
@@ -57,26 +55,26 @@

(defn map-as-matrix [m]
(let [ordered (map sort (vals (sort m)))]
(map (comp vec vals) ordered)))
(mapv (comp vec vals) ordered)))

(defn real-precision [confusion-matrix]
(map (fn [v i]
(/ (nth v i)
(mapv (fn [v i]
(/ (float (nth v i))
(apply + v)))
confusion-matrix
(range 0 (count confusion-matrix))))

(defn real-recall
"Computes recall by class label from confusion matrix."
[confusion-matrix]
(real-precision (seq-trans confusion-matrix)))
(real-precision (seq-trans confusion-matrix)))

(defn precision
"Computes precision by class label from confusion matrix."
[m]
(real-precision (map-as-matrix m)))
(real-precision (map-as-matrix m)))

(defn recall
"Computes recall by class label from confusion matrix."
[m]
(real-precision (seq-trans (map-as-matrix m))))
(real-precision (seq-trans (map-as-matrix m))))
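The `real-precision` change above wraps the numerator in `float` and switches to `mapv`. A minimal sketch of why that matters, assuming the confusion matrix holds integer counts: dividing two integers in Clojure yields a Ratio such as `2/3`, while a floating-point numerator yields a floating-point result, and `mapv` returns an eager vector instead of a lazy seq. The name `real-precision-sketch` and the sample counts are hypothetical.

```clojure
;; Sketch only: mirrors the shape of real-precision in the diff.
(defn real-precision-sketch
  "For each row i, divides the diagonal entry by the row total."
  [confusion-matrix]
  (mapv (fn [row i]
          (/ (float (nth row i))   ; floating-point numerator avoids a Ratio
             (apply + row)))
        confusion-matrix
        (range (count confusion-matrix))))

;; Hypothetical 2x2 matrix of counts.
(def counts [[2 1]
             [0 3]])

(real-precision-sketch counts)   ; a vector of floating-point values
(/ 2 3)                          ; integer division stays exact: a Ratio
```

A row that sums to zero is not handled here; the sketch assumes every class appears at least once.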
18 changes: 18 additions & 0 deletions src/infer/compat.clj
@@ -0,0 +1,18 @@
(ns infer.compat
"Compatibility functions"
)

(defn deep-merge-with
"Like merge-with, but merges maps recursively, applying the given fn
only when there's a non-map at a particular level.

(deep-merge-with + {:a {:b {:c 1 :d {:x 1 :y 2}} :e 3} :f 4}
{:a {:b {:c 2 :d {:z 9} :z 3} :e 100}})
-> {:a {:b {:z 3, :c 3, :d {:z 9, :x 1, :y 2}}, :e 103}, :f 4}"
[f & maps]
(apply
(fn m [& maps]
(if (every? map? maps)
(apply merge-with m maps)
(apply f maps)))
maps))
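The docstring example can be checked directly. The sketch below repeats the function body from the diff under a hypothetical name, `deep-merge-with-sketch`, so that it runs on its own, then evaluates the docstring's input.

```clojure
;; Sketch only: same body as infer.compat/deep-merge-with in the diff.
(defn deep-merge-with-sketch
  [f & maps]
  (apply
   (fn m [& maps]
     (if (every? map? maps)
       (apply merge-with m maps)   ; all maps at this level: recurse per key
       (apply f maps)))            ; otherwise combine the values with f
   maps))

(def merged
  (deep-merge-with-sketch +
                          {:a {:b {:c 1 :d {:x 1 :y 2}} :e 3} :f 4}
                          {:a {:b {:c 2 :d {:z 9} :z 3} :e 100}}))
```

Note that `f` is also called when a key holds a map on one side and a non-map on the other, so with `+` that case would throw rather than merge.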
7 changes: 4 additions & 3 deletions src/infer/core.clj
@@ -1,7 +1,8 @@
(ns infer.core
(:import org.apache.commons.math.util.MathUtils)
(:use clojure.contrib.monads)
(:use [clojure.set :only [intersection]]))
(:use clojure.algo.monads
[clojure.set :only [intersection]])
)

;;TODO: find tests for this stuff.

@@ -296,4 +297,4 @@
(best-by > keyfn coll))

(defn min-by [keyfn coll]
(best-by < keyfn coll))
(best-by < keyfn coll))
13 changes: 5 additions & 8 deletions src/infer/cross_validation.clj
@@ -1,11 +1,8 @@
(ns infer.cross-validation
(:use infer.features)
(:use infer.neighbors)
(:use infer.linear-models)
(:use [clojure.contrib.seq-utils :only [flatten]])
(:use [clojure.contrib.map-utils :only [deep-merge-with]])
(:use [infer.core :only [safe threshold-to map-map levels-deep all-keys]])
(:use [infer.probability :only [bucket +cond-prob-tuples]]))
(:use [infer features compat neighbors linear-models]
[infer.core :only [safe threshold-to map-map levels-deep all-keys]]
[infer.probability :only [bucket +cond-prob-tuples]])
)

(defn probs-only
"Compute probability from computed counts.
@@ -174,4 +171,4 @@ holds each seq of vectors out in turn as the test set, merges the rest as traini
discretized-knn
to-nn-model
nn-confusion-matrix
feature-vecs)))
feature-vecs)))
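The consolidated `:use` clauses in this PR rely on prefix lists such as `[infer features compat neighbors linear-models]`. The sketch below shows the same form with namespaces that ship with Clojure, so it runs without the project on the classpath; the `example.*` namespace names are hypothetical. It also shows `:require` with `:refer`, available since Clojure 1.4, as one possible alternative that names each referred var explicitly.

```clojure
;; Sketch only: example.prefix-use is a hypothetical namespace.
(ns example.prefix-use
  ;; Prefix list: refers every public var of clojure.set and clojure.walk,
  ;; the same shape as (:use [infer features compat neighbors linear-models]).
  (:use [clojure set walk]
        [clojure.string :only [upper-case]]))

(def u (union #{1} #{2}))            ; union comes from clojure.set
(def k (keywordize-keys {"a" 1}))    ; keywordize-keys comes from clojure.walk
(def s (upper-case "infer"))         ; the only clojure.string var referred

;; Alternative: refer only the vars that are actually used.
(ns example.require-refer
  (:require [clojure.set :refer [intersection]]))

(def i (intersection #{1 2} #{2 3}))
```

A bare `:use` makes it hard to see where a name such as `deep-merge-with` comes from, which is the trade-off against the shorter `ns` forms.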
19 changes: 8 additions & 11 deletions src/infer/features.clj
@@ -1,15 +1,12 @@
(ns infer.features
(:import java.util.Random)
(:use clojure.contrib.combinatorics)
(:use clojure.contrib.math)
(:use clojure.set)
(:use infer.measures)
(:use infer.information-theory)
(:use infer.probability)
(:use infer.matrix)
(:use infer.core)
(:use [clojure.contrib.map-utils :only [deep-merge-with]])
(:use clojure.set))
(:use clojure.math.combinatorics
clojure.math.numeric-tower
clojure.set
[infer measures information-theory probability matrix core compat]
)

)

;;TODO: check on all these vec operations.
;;what about pop on vecs and butlast?
@@ -201,4 +198,4 @@

(defn marginalize-map [n m]
(map-from-vectors
(marginalize n (feature-vectors2 m missing-smoother))))
(marginalize n (feature-vectors2 m missing-smoother))))
8 changes: 4 additions & 4 deletions src/infer/io.clj
@@ -1,10 +1,10 @@
(ns infer.io
(:use clojure.contrib.duck-streams)
(:use clojure-csv.core)
(:use infer.matrix))
(:use clojure.java.io
clojure-csv.core ;; TODO: change to data.csv?
infer.matrix))

(defn csv->matrix [path]
(let [strings (parse-csv (slurp path))]
(matrix (for [row strings
:when (not (some #(= "" %) row))]
(map #(Float/parseFloat %) row)))))
(map #(Float/parseFloat %) row)))))
14 changes: 5 additions & 9 deletions src/infer/learning.clj
@@ -1,12 +1,8 @@
(ns infer.learning
(:use clojure.set)
(:use clojure.contrib.math)
(:use infer.core)
(:use infer.matrix)
(:use infer.measures)
(:use infer.probability)
(:use infer.information-theory)
(:use infer.features))
(:use clojure.set
clojure.math.numeric-tower
[infer core matrix measures probability information-theory features]
))

;;optimization, regularization, and subset selection
;;TODO: should be split into a few libs
@@ -191,4 +187,4 @@ A more robust implementation of the algorithm would also check whether the funct
;; (recur snext enext (+ k 1))))))


;;http://en.wikipedia.org/wiki/Regularization_(mathematics)
;;http://en.wikipedia.org/wiki/Regularization_(mathematics)
13 changes: 5 additions & 8 deletions src/infer/linear_models.clj
@@ -1,11 +1,8 @@
(ns infer.linear-models
(:use clojure.contrib.math)
(:use clojure.set)
(:use infer.core)
(:use infer.matrix)
(:use infer.learning)
(:use infer.measures)
(:use infer.probability))
(:use clojure.math.numeric-tower
clojure.set
[infer core matrix learning measures probability])
)

(defn vecize-1d
"if this is the 1d case, put each value in a vec."
@@ -137,4 +134,4 @@ http://en.wikipedia.org/wiki/Tikhonov_regularization
via hard thresholding."
[X lambda precision]

)
)
12 changes: 5 additions & 7 deletions src/infer/lsh.clj
@@ -1,8 +1,9 @@
(ns infer.lsh
(:use [clojure.contrib.math :only (floor)])
(:use [clojure.set :only (union intersection difference)])
(:import [java.util Random])
(:use [infer.random-variate :only (random-normal)]))
(:use [clojure.math.numeric-tower :only (floor)]
[clojure.set :only (union intersection difference)]
[infer.random-variate :only (random-normal)])
(:import [java.util Random])
)

(defn dot-product
[x y]
@@ -39,9 +40,6 @@
(fn [data]
(floor (/ (+ b (dot-product data v)) r))))

(defn spherical-l2-hash
"Proposed by Terasawa and Tanaka (2007)")

(defn- apply-hash-ensemble
"Takes a list of minhash functions and data."
[hash-ensemble data]
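The hash visible in this hunk, `(floor (/ (+ b (dot-product data v)) r))`, has the shape of a random-projection hash for L2 distance: project onto a random vector `v`, shift by an offset `b`, and cut the line into buckets of width `r`. The sketch below is one way to build such a function, not the file's implementation. It assumes `v` is drawn from a unit normal and `b` uniformly from `[0, r)`, and it uses `java.util.Random` and `Math/floor` in place of `infer.random-variate` and `numeric-tower`. All names ending in `-sketch` are hypothetical.

```clojure
(import 'java.util.Random)

(defn dot-product-sketch
  "Sum of element-wise products of two equal-length vectors."
  [x y]
  (reduce + (map * x y)))

(defn l2-hash-sketch
  "Returns a hash fn for vectors of dimension dim, with bucket width r."
  [^Random rng dim r]
  (let [v (vec (repeatedly dim #(.nextGaussian rng)))   ; random direction
        b (* r (.nextDouble rng))]                      ; offset in [0, r)
    (fn [data]
      (long (Math/floor (/ (+ b (dot-product-sketch data v)) r))))))

;; Seeded generator so repeated runs build the same hash fn.
(def h (l2-hash-sketch (Random. 7) 3 4.0))
(h [1.0 2.0 3.0])   ; an integer bucket id
```

Points that are close in L2 distance are more likely to land in the same bucket than distant ones, which is what makes an ensemble of such functions useful for approximate neighbor search.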
18 changes: 0 additions & 18 deletions src/infer/matrix.clj
@@ -7,7 +7,6 @@
(:import [org.ujmp.core.matrix Matrix2D])
(:import [org.ujmp.colt
ColtSparseDoubleMatrix2D])
;; (:import [org.apache.mahout.core SparseMatrix])
(:import [org.ujmp.parallelcolt
ParallelColtSparseDoubleMatrix2D])
(:import [org.ujmp.core.doublematrix
@@ -69,23 +68,6 @@
(defn sparse-pcolt-matrix [xs]
(sparse-matrix* xs #(ParallelColtSparseDoubleMatrix2D. %)))

;; (defn sparse-mahout-matrix [xs]
;; (let [n-rows (count xs)
;; cols (reduce (fn [acc row]
;; (union acc (into #{} (keys row))))
;; #{}
;; xs)
;; m (SparseMatrix. (long-array [n-rows (+ (apply max cols) 1)]))
;; row-indices (range 0 (count xs))]
;; (dorun
;; (map (fn [row r]
;; (dorun (map (fn [[c v]]
;; (.setQuick m r c v)
;; row)))
;; xs
;; row-indices))
;; m)))

(defn from-sparse-matrix [m]
(map (fn [coord]
(conj (into [] (map int coord)) (.getDouble m coord)))
12 changes: 5 additions & 7 deletions src/infer/measures.clj
@@ -1,10 +1,8 @@
(ns infer.measures
(:use clojure.contrib.math)
(:use clojure.contrib.map-utils)
(:use clojure.set)
(:use infer.core)
(:use infer.matrix)
(:use [infer.probability :only [gt lt binary]])
(:use [infer core compat matrix]
[infer.probability :only [gt lt binary]]
clojure.math.numeric-tower
clojure.set)
(:import org.apache.commons.math.stat.StatUtils)
(:import [org.apache.commons.math.stat.correlation
PearsonsCorrelation Covariance])
@@ -686,4 +684,4 @@ The Levenshtein distance has several simple upper and lower bounds that are usef
(reduce +
(map #(* % %)
(flatten (from-matrix A)))))]
Af))
Af))
21 changes: 9 additions & 12 deletions src/infer/neighbors.clj
@@ -1,11 +1,10 @@
(ns infer.neighbors
(:use infer.measures)
(:use infer.core)
(:use infer.features)
(:use clojure.contrib.math)
(:use [clojure.set :only (union intersection difference)])
(:import [java.util Random])
(:use [infer.random-variate :only (random-normal)]))
(:use [infer measures core features]
clojure.math.numeric-tower
[infer.random-variate :only (random-normal)]
[clojure.set :only (union intersection difference)])
(:import [java.util Random])
)

;;TODO: is motthing really the right name for this lib? Density estimation? k-NN & kernels?
;;TODO: change sigs to match the matrix apis of xs & ys rather than [xs & ys]
@@ -111,12 +110,13 @@
;;TODO:
;;1. pass the distance fn and weighing fn separately rather than composing into weigh prior to calling?
;;for kernels, but weighted mean calc is identical for k-nn
(defn nadaraya-watson-estimator [point weigh points]
"takes a query point, a weight fn, and a seq of points, and returns the weighted sum of the points divided by the sum of the weights. the weigh fn is called with the query point and each point in the points seq. the weigh fn is thus a composition of a weight fn and a distance measure.
(defn nadaraya-watson-estimator
"takes a query point, a weight fn, and a seq of points, and returns the weighted sum of the points divided by the sum of the weights. the weigh fn is called with the query point and each point in the points seq. the weigh fn is thus a composition of a weight fn and a distance measure.

http://en.wikipedia.org/wiki/Kernel_regression#Nadaraya-Watson_kernel_regression

"
[point weigh points]
(let [weights* (weights point weigh points)
divisor (sum weights*)]
(if (single-class? points)
@@ -175,9 +175,6 @@ http://en.wikipedia.org/wiki/Kernel_regression#Nadaraya-Watson_kernel_regression
(fn [data]
(floor (/ (+ b (dot-product data v)) r))))

(defn spherical-l2-hash
"Proposed by Terasawa and Tanaka (2007)")

(defn- apply-hash-ensemble
"Takes a list of minhash functions and data."
[hash-ensemble data]
9 changes: 4 additions & 5 deletions src/infer/probability.clj
@@ -1,11 +1,10 @@
(ns infer.probability
(:import [java.io File])
(:import [java.util Date Calendar])
(:use [clojure.set :only [difference]])
(:use [clojure.contrib.map-utils :only [deep-merge-with]])
(:use [infer.core :only [tree-comp any?]])
(:use [infer.core
:only [set-to-unit-map bottom-level? map-map same-length?]]))
(:use [clojure.set :only [difference]]
infer.compat
[infer.core :only [tree-comp any? set-to-unit-map bottom-level? map-map same-length?]])
)

(defn binary
"A function for binary classification that takes a boolean value and returns
7 changes: 4 additions & 3 deletions src/infer/random_variate.clj
@@ -1,6 +1,7 @@
(ns infer.random-variate
(:use [clojure.contrib.math :only (expt sqrt)])
(:use [clojure.contrib.generic.math-functions :only (tan log cos sin)]))
(:use [clojure.math.numeric-tower :only (expt sqrt)]
[clojure.algo.generic.math-functions :only (tan log cos sin)])
)

(defn exp-rv
"Simulate an exponential distribution with
@@ -43,4 +44,4 @@
"Generate a lazy sequence of unit normal random variables."
[]
(let [bm (box-muller)]
(lazy-seq (concat bm (random-normal)))))
(lazy-seq (concat bm (random-normal)))))
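The tail of `random-normal` above concatenates each `box-muller` result onto a lazy recursive call. The body of `box-muller` is not shown in this diff, so the sketch below is only an assumption about its shape: the standard Box-Muller transform, which turns two uniform draws into two independent unit-normal draws. It takes an explicit `java.util.Random` for repeatability, which the original zero-argument function does not do, and all names ending in `-sketch` are hypothetical.

```clojure
(import 'java.util.Random)

(defn box-muller-sketch
  "Returns a vector of two independent unit-normal draws."
  [^Random rng]
  (let [u1    (- 1.0 (.nextDouble rng))        ; in (0, 1], keeps the log finite
        u2    (.nextDouble rng)
        r     (Math/sqrt (* -2.0 (Math/log u1)))
        theta (* 2.0 Math/PI u2)]
    [(* r (Math/cos theta)) (* r (Math/sin theta))]))

(defn random-normal-sketch
  "Lazy, infinite sequence of unit-normal draws, two per Box-Muller step."
  [rng]
  (lazy-seq (concat (box-muller-sketch rng) (random-normal-sketch rng))))

(def sample (doall (take 4 (random-normal-sketch (Random. 42)))))
```

Because the recursive call sits inside `lazy-seq`, only as many pairs are generated as the consumer asks for, so `take` on the infinite sequence terminates.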