Printing out the metrics after you perform a cluster and added a true/false condition for shuffling
GregJohnsonJr committed Jun 10, 2024
1 parent 50c3a7c commit b717404
Showing 7 changed files with 64 additions and 58 deletions.
4 changes: 2 additions & 2 deletions R/Cluster.R
@@ -13,12 +13,12 @@
#' @param cutoff A cutoff value
#' @param iterations The number of iterations
#' @return A data.frame of the clusters.
opti_cluster <- function(sparse_matrix, cutoff, iterations) {
opti_cluster <- function(sparse_matrix, cutoff, iterations, shuffle = TRUE) {
index_one_list <- sparse_matrix@i
index_two_list <- sparse_matrix@j
value_list <- sparse_matrix@x
clustering_output_string <- MatrixToOpiMatrixCluster(index_one_list, index_two_list, value_list, cutoff,

[lintr warning] no visible global function definition for 'MatrixToOpiMatrixCluster'
[lintr notice] Lines should not be more than 80 characters. This line is 106 characters.
iterations)
iterations, shuffle)
df <- t(read.table(text = clustering_output_string,
sep = "\t", header = TRUE))
df <- data.frame(df[-1, ])
4 changes: 2 additions & 2 deletions R/RcppExports.R
@@ -1,7 +1,7 @@
# Generated by using Rcpp::compileAttributes() -> do not edit by hand
# Generator token: 10BE3573-1514-4C36-9D1C-5A225CD40393

MatrixToOpiMatrixCluster <- function(xPosition, yPosition, data, cutoff, iterations = 2L) {
.Call(`_Opticluster_MatrixToOpiMatrixCluster`, xPosition, yPosition, data, cutoff, iterations)
MatrixToOpiMatrixCluster <- function(xPosition, yPosition, data, cutoff, iterations = 2L, shuffle = TRUE) {

[lintr notice] Variable and function name style should match snake_case or symbols. (reported 3 times)
[lintr notice] Lines should not be more than 80 characters. This line is 107 characters.
.Call(`_Opticluster_MatrixToOpiMatrixCluster`, xPosition, yPosition, data, cutoff, iterations, shuffle)

[lintr notice] Indentation should be 2 spaces but is 4 spaces.
[lintr warning] no visible binding for global variable '_Opticluster_MatrixToOpiMatrixCluster'
[lintr notice] Lines should not be more than 80 characters. This line is 107 characters.
}

97 changes: 49 additions & 48 deletions src/ClusterCommand.cpp
@@ -17,16 +17,16 @@ ClusterCommand::~ClusterCommand() {
/// @param optiMatrix
/// @return
std::string ClusterCommand::runOptiCluster(OptiMatrix *optiMatrix) {
std::string clusterOutputData;
std::string clusterMetrics;
std::string sensFile;
std::string outStep;
std::string clusterMatrixOutput;
if (!cutOffSet) {
clusterOutputData += ("\nYou did not set a cutoff, using 0.03.\n");
clusterMetrics += ("\nYou did not set a cutoff, using 0.05.\n");
cutoff = 0.05;
}

clusterOutputData += ("\nClustering " + distfile + "\n");
clusterMetrics += ("\nClustering " + distfile + "\n");

ClusterMetric *metric = nullptr;
metricName = "mcc";
@@ -57,11 +57,11 @@ std::string ClusterCommand::runOptiCluster(OptiMatrix *optiMatrix) {
// string distfile = columnfile;
// if (format == "phylip") { distfile = phylipfile; }
//
// sensFile += "label\tcutoff\ttp\ttn\tfp\tfn\tsensitivity\tspecificity\tppv\tnpv\tfdr\taccuracy\tmcc\tf1score\n";
// clusterOutputData += (
// "\n\niter\ttime\tlabel\tnum_otus\tcutoff\ttp\ttn\tfp\tfn\tsensitivity\tspecificity\tppv\tnpv\tfdr\taccuracy\tmcc\tf1score\n");
// outStep +=
// "iter\ttime\tlabel\tnum_otus\tcutoff\ttp\ttn\tfp\tfn\tsensitivity\tspecificity\tppv\tnpv\tfdr\taccuracy\tmcc\tf1score\n";
sensFile += "label\tcutoff\ttp\ttn\tfp\tfn\tsensitivity\tspecificity\tppv\tnpv\tfdr\taccuracy\tmcc\tf1score\n";
clusterMetrics += (
"\n\niter\ttime\tlabel\tnum_otus\tcutoff\ttp\ttn\tfp\tfn\tsensitivity\tspecificity\tppv\tnpv\tfdr\taccuracy\tmcc\tf1score\n");
outStep +=
"iter\ttime\tlabel\tnum_otus\tcutoff\ttp\ttn\tfp\tfn\tsensitivity\tspecificity\tppv\tnpv\tfdr\taccuracy\tmcc\tf1score\n";

bool printHeaders = true;

@@ -74,22 +74,22 @@ std::string ClusterCommand::runOptiCluster(OptiMatrix *optiMatrix) {
int iters = 0;
double listVectorMetric = 0; //worst state
double delta = 1;
// long long numBins;
long long numBins;
double tp, tn, fp, fn;
vector<double> results;
cluster.initialize(listVectorMetric, false, initialize);
// results = cluster.getStats(tp, tn, fp, fn);
// numBins = cluster.getNumBins();
// clusterOutputData += ("0\t0\t" + std::to_string(cutoff) + "\t" + std::to_string(numBins) + "\t" +
// std::to_string(cutoff) + "\t" + std::to_string(tp) + "\t" + std::to_string(tn) + "\t" +
// std::to_string(fp) + "\t" + std::to_string(fn) + "\t");
// outStep += "0\t0\t" + std::to_string(cutoff) + "\t" + std::to_string(numBins) + "\t" + std::to_string(cutoff) +
// "\t" + std::to_string(tp) + '\t' + std::to_string(tn) + '\t' + std::to_string(fp) + '\t' +
// std::to_string(fn) + '\t';
// for (double result : results) {
// clusterOutputData += (std::to_string(result) + "\t");
// outStep += std::to_string(result) + "\t";
// }
vector<double> stats;
cluster.initialize(listVectorMetric, canShuffle, initialize);
stats = cluster.getStats(tp, tn, fp, fn);
numBins = cluster.getNumBins();
clusterMetrics += ("0\t0\t" + std::to_string(cutoff) + "\t" + std::to_string(numBins) + "\t" +
std::to_string(cutoff) + "\t" + std::to_string(tp) + "\t" + std::to_string(tn) + "\t" +
std::to_string(fp) + "\t" + std::to_string(fn) + "\t");
outStep += "0\t0\t" + std::to_string(cutoff) + "\t" + std::to_string(numBins) + "\t" + std::to_string(cutoff) +
"\t" + std::to_string(tp) + '\t' + std::to_string(tn) + '\t' + std::to_string(fp) + '\t' +
std::to_string(fn) + '\t';
for (double result : stats) {
clusterMetrics += (std::to_string(result) + "\t");
outStep += std::to_string(result) + "\t";
}
//m->mothurOutEndLine();
// outStep += "\n";
// Stable Metric -> Keep the data stable, to prevent errors (rounding errors)
@@ -104,27 +104,27 @@ std::string ClusterCommand::runOptiCluster(OptiMatrix *optiMatrix) {
delta = std::abs(oldMetric - listVectorMetric);
iters++;

// results = cluster.getStats(tp, tn, fp, fn);
// numBins = cluster.getNumBins();

// clusterOutputData += (std::to_string(iters) + "\t" + std::to_string(time(nullptr) - start) + "\t" +
// std::to_string(cutoff) + "\t" + std::to_string(numBins) + "\t" +
// std::to_string(cutoff) + "\t" + std::to_string(tp) + "\t" + std::to_string(tn) + "\t"
// + std::to_string(fp) + "\t" + std::to_string(fn) + "\t");
// outStep += (std::to_string(iters) + "\t" + std::to_string(time(nullptr) - start) + "\t" +
// std::to_string(cutoff) + "\t" + std::to_string(numBins) + "\t" + std::to_string(cutoff) + "\t")
// + std::to_string(tp) + '\t' + std::to_string(tn) + '\t' + std::to_string(fp) + '\t' +
// std::to_string(fn) +
// '\t';
// for (double result : results) {
// clusterOutputData += (std::to_string(result) + "\t");
// outStep += std::to_string(result) + "\t";
// }
// outStep += "\n";
stats = cluster.getStats(tp, tn, fp, fn);
numBins = cluster.getNumBins();

clusterMetrics += (std::to_string(iters) + "\t" + std::to_string(time(nullptr) - start) + "\t" +
std::to_string(cutoff) + "\t" + std::to_string(numBins) + "\t" +
std::to_string(cutoff) + "\t" + std::to_string(tp) + "\t" + std::to_string(tn) + "\t"
+ std::to_string(fp) + "\t" + std::to_string(fn) + "\t");
outStep += (std::to_string(iters) + "\t" + std::to_string(time(nullptr) - start) + "\t" +
std::to_string(cutoff) + "\t" + std::to_string(numBins) + "\t" + std::to_string(cutoff) + "\t")
+ std::to_string(tp) + '\t' + std::to_string(tn) + '\t' + std::to_string(fp) + '\t' +
std::to_string(fn) +
'\t';
for (double result : stats) {
clusterMetrics += (std::to_string(result) + "\t");
outStep += std::to_string(result) + "\t";
}
outStep += "\n";
}
ListVector *list = nullptr;

// clusterOutputData += "\n\n";
clusterMetrics += "\n\n";
list = cluster.getList();
//
if (printHeaders) {
@@ -133,13 +133,14 @@ std::string ClusterCommand::runOptiCluster(OptiMatrix *optiMatrix) {
} else { list->setPrintedLabels(printHeaders); }
clusterMatrixOutput = list->print(listFile);
delete list;
// results = cluster.getStats(tp, tn, fp, fn);
//
// sensFile += std::to_string(cutoff) + '\t' + std::to_string(cutoff) + '\t' + std::to_string(tp) + '\t' +
// std::to_string(tn) + '\t' +
// std::to_string(fp) + '\t' + std::to_string(fn) + '\t';
// for (double result : results) { sensFile + std::to_string(result) + '\t'; }
}
stats = cluster.getStats(tp, tn, fp, fn);

sensFile += std::to_string(cutoff) + '\t' + std::to_string(cutoff) + '\t' + std::to_string(tp) + '\t' +
std::to_string(tn) + '\t' +
std::to_string(fp) + '\t' + std::to_string(fn) + '\t';
for (double result : stats) { sensFile += std::to_string(result) + '\t'; }
Rcpp::Rcout << "Metrics for the current cluster:\n\n " << sensFile << "\n\n" << clusterMetrics << "\n\n";
}
delete matrix;
return clusterMatrixOutput;
}
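The metric rows in this hunk are assembled by chaining `std::to_string` calls with tab separators. A minimal sketch of the same column layout using `std::ostringstream` is shown here for comparison. `formatMetricsRow` and its parameter names are hypothetical and not part of the commit. Note two differences from the committed code: stream formatting prints doubles with default precision rather than the fixed six decimals of `std::to_string`, and this sketch omits the trailing tab.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not in the commit): builds one tab-separated row in
// the column order of the header used above:
// iter, time, label, num_otus, cutoff, tp, tn, fp, fn, then the stats.
// As in the commit, the cutoff is written for both "label" and "cutoff".
std::string formatMetricsRow(int iter, long long elapsed, double cutoff,
                             long long numBins, double tp, double tn,
                             double fp, double fn,
                             const std::vector<double> &stats) {
    std::ostringstream row;
    row << iter << '\t' << elapsed << '\t' << cutoff << '\t' << numBins
        << '\t' << cutoff << '\t' << tp << '\t' << tn << '\t' << fp
        << '\t' << fn;
    for (const double value : stats) {
        row << '\t' << value;  // one column per metric value
    }
    row << '\n';
    return row.str();
}
```

Building the row in one place would keep the initial row and the per-iteration rows in the same column order.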
3 changes: 3 additions & 0 deletions src/MothurDependencies/ClusterCommand.h
@@ -20,6 +20,7 @@
#include <map>
#include "OptiMatrix.h"
#include "ListVector.h"
#include <Rcpp.h>
#include "Metrics/accuracy.hpp"
#include "Metrics/f1score.hpp"
#include "Metrics/fdr.hpp"
@@ -53,6 +54,7 @@ class ClusterCommand {
ClusterCommand() {}
~ClusterCommand();
bool SetMaxIterations(const int iterations) {maxIters = iterations; return maxIters == iterations;}
bool SetOpticlusterRandomShuffle(const bool shuffle) {canShuffle = shuffle; return canShuffle == shuffle;}
std::string runOptiCluster(OptiMatrix*);


@@ -72,6 +74,7 @@ class ClusterCommand {
bool print_start;
time_t start;
unsigned long loops;
bool canShuffle = true;
vector<string> outputNames;
};
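
The `canShuffle` member added in this header is passed to `cluster.initialize`, whose body is not part of this diff. As a rough illustration of what such a flag typically gates, the sketch below returns a processing order that is randomized only when the flag is set. `makeOrder` and its `seed` parameter are hypothetical and do not reflect the package's actual implementation.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <random>
#include <vector>

// Hypothetical illustration (not the package's code): returns the indices
// 0..n-1, randomly permuted only when shuffle is true.
std::vector<int> makeOrder(std::size_t n, bool shuffle, unsigned seed) {
    std::vector<int> order(n);
    std::iota(order.begin(), order.end(), 0);  // 0, 1, ..., n-1
    if (shuffle) {
        std::mt19937 generator(seed);
        std::shuffle(order.begin(), order.end(), generator);
    }
    return order;
}
```

With the flag off the order is always the identity, which is why a test can compare the result against a stored expected output.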

9 changes: 5 additions & 4 deletions src/RcppExports.cpp
@@ -11,8 +11,8 @@ Rcpp::Rostream<false>& Rcpp::Rcerr = Rcpp::Rcpp_cerr_get();
#endif

// MatrixToOpiMatrixCluster
std::string MatrixToOpiMatrixCluster(const std::vector<int>& xPosition, const std::vector<int>& yPosition, const std::vector<double>& data, const double cutoff, const int iterations);
RcppExport SEXP _Opticluster_MatrixToOpiMatrixCluster(SEXP xPositionSEXP, SEXP yPositionSEXP, SEXP dataSEXP, SEXP cutoffSEXP, SEXP iterationsSEXP) {
std::string MatrixToOpiMatrixCluster(const std::vector<int>& xPosition, const std::vector<int>& yPosition, const std::vector<double>& data, const double cutoff, const int iterations, const bool shuffle);
RcppExport SEXP _Opticluster_MatrixToOpiMatrixCluster(SEXP xPositionSEXP, SEXP yPositionSEXP, SEXP dataSEXP, SEXP cutoffSEXP, SEXP iterationsSEXP, SEXP shuffleSEXP) {
BEGIN_RCPP
Rcpp::RObject rcpp_result_gen;
Rcpp::RNGScope rcpp_rngScope_gen;
@@ -21,15 +21,16 @@ BEGIN_RCPP
Rcpp::traits::input_parameter< const std::vector<double>& >::type data(dataSEXP);
Rcpp::traits::input_parameter< const double >::type cutoff(cutoffSEXP);
Rcpp::traits::input_parameter< const int >::type iterations(iterationsSEXP);
rcpp_result_gen = Rcpp::wrap(MatrixToOpiMatrixCluster(xPosition, yPosition, data, cutoff, iterations));
Rcpp::traits::input_parameter< const bool >::type shuffle(shuffleSEXP);
rcpp_result_gen = Rcpp::wrap(MatrixToOpiMatrixCluster(xPosition, yPosition, data, cutoff, iterations, shuffle));
return rcpp_result_gen;
END_RCPP
}

RcppExport SEXP run_testthat_tests(SEXP);

static const R_CallMethodDef CallEntries[] = {
{"_Opticluster_MatrixToOpiMatrixCluster", (DL_FUNC) &_Opticluster_MatrixToOpiMatrixCluster, 5},
{"_Opticluster_MatrixToOpiMatrixCluster", (DL_FUNC) &_Opticluster_MatrixToOpiMatrixCluster, 6},
{"run_testthat_tests", (DL_FUNC) &run_testthat_tests, 1},
{NULL, NULL, 0}
};
3 changes: 2 additions & 1 deletion src/main.cpp
@@ -14,11 +14,12 @@
//[[Rcpp::export]]
std::string MatrixToOpiMatrixCluster(const std::vector<int> &xPosition,
const std::vector<int> &yPosition, const std::vector<double> &data, const double cutoff,
const int iterations = 2)
const int iterations = 2, const bool shuffle = true)
{
OptimatrixAdapter adapter(cutoff);
const auto optiMatrix = adapter.ConvertToOptimatrix(xPosition,yPosition,data);
auto* command = new ClusterCommand();
command->SetOpticlusterRandomShuffle(shuffle);
command->SetMaxIterations(iterations);
return command->runOptiCluster(optiMatrix);
}
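
In the lines shown, the exported function allocates the `ClusterCommand` with `new` and returns without a matching `delete`. A minimal sketch of the same call sequence with `std::unique_ptr` (C++14 for `std::make_unique`) follows. `DemoCommand`, `SetShuffle`, `run`, and `runDemo` are hypothetical stand-ins, since the real `ClusterCommand` and `OptiMatrix` are not reproduced here.

```cpp
#include <memory>
#include <string>

// Hypothetical stand-in for ClusterCommand; only the setter pattern is mirrored.
class DemoCommand {
public:
    bool SetMaxIterations(const int iterations) { maxIters = iterations; return true; }
    bool SetShuffle(const bool shuffle) { canShuffle = shuffle; return true; }
    std::string run() const {
        return "iters=" + std::to_string(maxIters) +
               " shuffle=" + (canShuffle ? "true" : "false");
    }
private:
    int maxIters = 2;        // mirrors the default of the iterations argument
    bool canShuffle = true;  // mirrors the default of the shuffle argument
};

// The command is owned by a unique_ptr, so it is destroyed when the
// function returns, including when run() throws.
std::string runDemo(const int iterations, const bool shuffle) {
    auto command = std::make_unique<DemoCommand>();
    command->SetShuffle(shuffle);
    command->SetMaxIterations(iterations);
    return command->run();
}
```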
2 changes: 1 addition & 1 deletion tests/testthat/test-test-opticluster.R
@@ -1,7 +1,7 @@
test_that("Clustering returns proper results", {
expected_df <- readRDS(test_path("extdata","df_test_file.RDS"))

[lintr notice] Commas should always have a space after.
matrix <- readRDS(test_path("extdata","matrix_data.RDS"))

[lintr notice] Commas should always have a space after.
df <- Opticluster::opti_cluster(matrix, 0.2, 2)
df <- Opticluster::opti_cluster(matrix, 0.2, 2, FALSE)
df$exists <- do.call(paste0, df) %in% do.call(paste0, expected_df)
expect_equal(class(df), "data.frame")
expect_true(all(df$exists == TRUE))
