Add PCA as dimensionality reduction method #122

Open
wants to merge 30 commits into
base: master
Commits
71e11db
Edit comments in pom.xml
stefanhahmann Nov 26, 2024
5b9cf99
Add mvn -v to sonarcloud.yml
stefanhahmann Nov 27, 2024
6f47388
Add exclusion of net.imagej:imagej-legacy in maven surefire plugins
stefanhahmann Nov 27, 2024
92045cb
Set JDK to 21 for build job
stefanhahmann Nov 27, 2024
1c371a6
Set JDK to 21 for sonarcloud job
stefanhahmann Nov 26, 2024
bf2ca16
Add ij1-patcher explicit version 1.2.7
stefanhahmann Nov 28, 2024
6e40493
Change version of jacoco to 0.8.11 for compatibility with JDK 21
stefanhahmann Nov 27, 2024
74fa3ab
Add smile machine learning library to pom.xml
stefanhahmann Nov 26, 2024
de5e7f5
Add new unit test PCATest using smile machine learning library
stefanhahmann Nov 26, 2024
bd97e85
Add visual PCADemo
stefanhahmann Nov 27, 2024
f2d017e
Modify PlotPoints to accept 'null' input data and 'null' filter
stefanhahmann Dec 2, 2024
d0149d6
Add demo applications with the tgmm-mini dataset for PCA, t-SNE and UMAP
stefanhahmann Dec 2, 2024
8482bdb
Replace t-SNE library and enable t-SNE unit tests
stefanhahmann Dec 3, 2024
268db16
Activate t-SNE option in DimensionalityReductionView
stefanhahmann Dec 4, 2024
c265db7
Modify unit tests for PCA, t-SNE and UMAP such that it is checked whe…
stefanhahmann Dec 3, 2024
c7acfea
Add demo applications with the iris dataset for UMAP
stefanhahmann Dec 2, 2024
67258c3
Add a python script to run equivalent demos in python
stefanhahmann Dec 2, 2024
7f71ecb
Add PCA Feature, Computer and Serializer classes
stefanhahmann Dec 2, 2024
ad13910
Rename key of Branch Umap Feature to 'Branch UMAP outputs'
stefanhahmann Dec 2, 2024
32216bd
Add unit tests for BranchPcaFeature and SpotPcaFeature
stefanhahmann Dec 2, 2024
e335969
Add PCA as new dimensionality reduction method to DimensionalityReduc…
stefanhahmann Dec 4, 2024
abfca94
Add demo applications and a unit test for the UMAP implementation of …
stefanhahmann Dec 9, 2024
f99ed9c
Add t-SNE to parameter section of dimensionality reduction in README.md
stefanhahmann Dec 6, 2024
b542a56
Add PCA to parameter section of dimensionality reduction of README.md
stefanhahmann Dec 6, 2024
99bd9a5
Update umap_dialog.png
stefanhahmann Dec 6, 2024
929560a
Rename image from umap_dialog.png to dialog.png
stefanhahmann Dec 9, 2024
a2bdb0c
Add a link to the UMAP implementation used in the project
stefanhahmann Dec 9, 2024
d8d4554
Add mnist test data sets to gitignore
stefanhahmann Dec 12, 2024
3c97be5
Add comments to pom.xml re usage of which library for which dimension…
stefanhahmann Dec 17, 2024
0be6d3f
Deactivate smile demos and unit tests
stefanhahmann Dec 17, 2024
4 changes: 2 additions & 2 deletions .github/workflows/build.yml
@@ -16,10 +16,10 @@ jobs:

steps:
- uses: actions/checkout@v2
- name: Set up Java
- name: Set up JDK 21
uses: actions/setup-java@v2
with:
java-version: '8'
java-version: '21'
distribution: 'zulu'
cache: 'maven'
- name: Set up CI environment
5 changes: 3 additions & 2 deletions .github/workflows/sonarcloud.yml
@@ -39,10 +39,10 @@ jobs:

steps:
- uses: actions/checkout@v2
- name: Set up JDK 17
- name: Set up JDK 21
uses: actions/setup-java@v2
with:
java-version: '17'
java-version: '21'
distribution: 'zulu'
cache: 'maven'

@@ -53,6 +53,7 @@ jobs:
run: |
export DISPLAY=:99
sudo Xvfb -ac :99 -screen 0 1280x1024x24 > /dev/null 2>&1 &
mvn -v
mvn -B verify --file pom.xml -Pcoverage sonar:sonar -Dsonar.projectKey=mastodon-sc_mastodon-deep-lineage -Dsonar.organization=mastodon-sc

- name: Upload artifacts for subsequent review
3 changes: 3 additions & 0 deletions .gitignore
@@ -11,3 +11,6 @@ buildNumber.properties
.mvn/wrapper/maven-wrapper.jar
/logs/
/mastodon-deep-lineage.iml
/src/test/resources/org/mastodon/mamut/feature/dimensionalityreduction/mnist_784.csv
/src/test/resources/org/mastodon/mamut/feature/dimensionalityreduction/mnist2500_X.txt
/src/test/resources/org/mastodon/mamut/feature/dimensionalityreduction/mnist2500_labels.txt
34 changes: 28 additions & 6 deletions README.md
@@ -304,11 +304,22 @@ Tree2
## Dimensionality reduction

To visualize high-dimensional data, e.g. in two dimensions, and potentially gain more insight into it, you can
reduce the dimensionality of the measurements, using this algorithm:
reduce the dimensionality of the measurements, using these algorithms:

* UMAP
* [Uniform Manifold Approximation Projection (UMAP)](https://arxiv.org/abs/1802.03426)
* [UMAP Python implementation](https://umap-learn.readthedocs.io/en/latest/)
* This [UMAP Java implementation](https://github.com/tag-bio/umap-java) is used.
* Further reading: [UMAP Python implementation](https://umap-learn.readthedocs.io/en/latest/)
* t-SNE
* [t-distributed Stochastic Neighbor Embedding (t-SNE)](https://lvdmaaten.github.io/tsne/)
* This [t-SNE Java implementation](https://haifengl.github.io/manifold.html#t-sne) is used.
* Further
reading: [t-SNE Python implementation](https://scikit-learn.org/stable/modules/generated/sklearn.manifold.TSNE.html)
* PCA
* [Principal Component Analysis](https://en.wikipedia.org/wiki/Principal_component_analysis)
* This [PCA Java implementation](https://haifengl.github.io/manifold.html#pca) is used.
* Further
reading: [PCA Python implementation](https://scikit-learn.org/stable/modules/generated/sklearn.decomposition.PCA.html)
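
As a rough illustration of what PCA computes, the following plain-Java sketch centers the data, builds the sample covariance matrix, approximates the first principal component by power iteration, and projects the rows onto it. This is not the Smile-based code that this PR uses; the class `PcaSketch` and its methods are hypothetical names chosen for the example, and the sketch assumes at least two rows and at least one iteration.

```java
/** Hypothetical, dependency-free sketch; not the Smile-based code used in this PR. */
public class PcaSketch
{
	/** Returns the per-column means of row-major data. */
	static double[] columnMeans( final double[][] data )
	{
		final double[] mean = new double[ data[ 0 ].length ];
		for ( final double[] row : data )
			for ( int j = 0; j < mean.length; j++ )
				mean[ j ] += row[ j ] / data.length;
		return mean;
	}

	/** Approximates the unit-length first principal component by power iteration (iterations >= 1). */
	public static double[] firstComponent( final double[][] data, final int iterations )
	{
		final int n = data.length;
		final int d = data[ 0 ].length;
		final double[] mean = columnMeans( data );

		// Sample covariance matrix of the centered data.
		final double[][] cov = new double[ d ][ d ];
		for ( final double[] row : data )
			for ( int j = 0; j < d; j++ )
				for ( int k = 0; k < d; k++ )
					cov[ j ][ k ] += ( row[ j ] - mean[ j ] ) * ( row[ k ] - mean[ k ] ) / ( n - 1 );

		// Arbitrary start vector; power iteration fails if it happens to be orthogonal to the component.
		double[] v = new double[ d ];
		for ( int j = 0; j < d; j++ )
			v[ j ] = j + 1;

		for ( int it = 0; it < iterations; it++ )
		{
			// Multiply by the covariance matrix ...
			final double[] next = new double[ d ];
			for ( int j = 0; j < d; j++ )
				for ( int k = 0; k < d; k++ )
					next[ j ] += cov[ j ][ k ] * v[ k ];
			// ... and rescale to unit length.
			double norm = 0;
			for ( final double x : next )
				norm += x * x;
			norm = Math.sqrt( norm );
			if ( norm == 0 )
				throw new IllegalArgumentException( "Data has no variance along the start vector." );
			for ( int j = 0; j < d; j++ )
				next[ j ] /= norm;
			v = next;
		}
		return v;
	}

	/** Projects every centered row onto the given component, yielding one output value per row. */
	public static double[] project( final double[][] data, final double[] component )
	{
		final double[] mean = columnMeans( data );
		final double[] scores = new double[ data.length ];
		for ( int i = 0; i < data.length; i++ )
			for ( int j = 0; j < component.length; j++ )
				scores[ i ] += ( data[ i ][ j ] - mean[ j ] ) * component[ j ];
		return scores;
	}
}
```

For points lying on the line y = 2x, `firstComponent` returns a unit vector whose second entry is twice the first, and `project` returns one coordinate per row along that direction. A library implementation would typically compute all components at once via an eigen or singular value decomposition instead.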

### Usage

@@ -333,7 +344,7 @@ If they are selected, the algorithm will use the link feature value of its incom
all incoming edges, if there is more than one incoming edge.

The dialog will look like this:
![umap_dialog.png](doc/dimensionalityreduction/umap_dialog.png)
![dialog.png](doc/dimensionalityreduction/dialog.png)

By default, all measurements are selected in the box.

@@ -360,6 +371,17 @@ By default, all measurements are selected in the box.
representation. This parameter controls how tightly UMAP is allowed to pack points together.
Further reading: [Minimum Distance](https://umap-learn.readthedocs.io/en/latest/parameters.html#min-dist).

#### t-SNE Parameters

* Perplexity: The perplexity is related to the number of nearest neighbors that are used in other manifold learning
algorithms. Larger datasets usually require a larger perplexity. The recommended range is between 5 and 50.
Further
reading: [Perplexity](https://scikit-learn.org/stable/modules/generated/sklearn.manifold.TSNE.html#sklearn.manifold.TSNE).
* Maximum number of iterations: The maximum number of iterations for the optimization. The default is 1000. More
iterations usually let the optimization converge further, but also take longer to compute.
Further
reading: [Maximum Number of Iterations](https://scikit-learn.org/stable/modules/generated/sklearn.manifold.TSNE.html#sklearn.manifold.TSNE).
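
To make the perplexity parameter more concrete: in the t-SNE literature, perplexity is defined as 2 raised to the Shannon entropy (in bits) of a point's neighbor distribution, so a uniform distribution over k neighbors has perplexity k. The sketch below only computes that quantity for a given probability vector. It is not code from this PR, and `PerplexitySketch` is a hypothetical name.

```java
/** Hypothetical sketch of the perplexity definition; not part of this PR. */
public class PerplexitySketch
{
	/** Returns 2^H(p), where H is the Shannon entropy of the distribution in bits. */
	public static double perplexity( final double[] probabilities )
	{
		double entropyBits = 0;
		for ( final double p : probabilities )
			if ( p > 0 ) // zero-probability entries contribute nothing to the entropy
				entropyBits -= p * Math.log( p ) / Math.log( 2 );
		return Math.pow( 2, entropyBits );
	}
}
```

Read this way, the perplexity chosen in the dialog acts as a soft target for the effective number of neighbors each point takes into account.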

When you are done with the selection, click on `Compute`.
The resulting values will be added as additional columns to the selected table.

@@ -445,7 +467,7 @@ the Mastodon repository.
channel is used to check whether the dimensions of the label image in ImageJ match the dimensions of the image data in
Mastodon.
* ![plugin_import_example_8.png](doc/import/label_image/plugin_import_example_08.png)
* Click `OK` and the spots are imported into Mastodon.
* Click `OK` and the spots are imported into Mastodon.
* ![plugin_import_example_4.png](doc/import/label_image/plugin_import_example_04.png)

#### Label image as BDV channel
@@ -467,7 +489,7 @@ the Mastodon repository.
* Import the image sequence encoding the label images into ImageJ contained in folder: `Fluo-C3DL-MDA231/01_ERR_SEG/`
* Set the dimensions of the label image to 512x512x1x30x12 (XYCTZ) using `Image > Properties`
* ![plugin_import_example_3.png](doc/import/label_image/plugin_import_example_03.png)
* Merge the 2 images into a single image using the `Image > Color > Merge Channels...` command
* Merge the 2 images into a single image using the `Image > Color > Merge Channels...` command
* ![plugin_import_example_5.png](doc/import/label_image/plugin_import_example_05.png)
* Open Mastodon from Fiji and create a new project with merged image
* `Plugins > Mastodon > new Mastodon project > Use an image opened in ImageJ > Create`
@@ -480,7 +502,7 @@ the Mastodon repository.
* Select the BDV channel containing the label image that has been used to create the segmented label image. This is
used to check whether the dimensions of the label image and the image data in BDV match, which is required.
* ![plugin_import_example_9.png](doc/import/label_image/plugin_import_example_08.png)
* Click `OK` and the spots are imported into Mastodon.
* Click `OK` and the spots are imported into Mastodon.
* ![plugin_import_example_7.png](doc/import/label_image/plugin_import_example_07.png)

## Export
Binary file added doc/dimensionalityreduction/dialog.png
Binary file removed doc/dimensionalityreduction/umap_dialog.png
Binary file not shown.
25 changes: 17 additions & 8 deletions pom.xml
@@ -14,6 +14,7 @@

<properties>
<package-name>org.mastodon</package-name>
<!-- license information -->
<license.licenseName>bsd_2</license.licenseName>
<license.projectName>mastodon-deep-lineage</license.projectName>
<license.organizationName>Mastodon authors</license.organizationName>
@@ -23,11 +24,14 @@
<mastodon.group>org.mastodon</mastodon.group>
<!-- when a pom-scijava exists that references the scijava-common 2.99.1 release, this can be removed again -->
<scijava-common.version>2.99.1-SNAPSHOT</scijava-common.version>
<!-- when a pom-scijava exists that references the ij1-patcher 1.2.7 release, this can be removed again -->
<ij1-patcher.version>1.2.7</ij1-patcher.version>

<releaseProfiles>sign,deploy-to-scijava</releaseProfiles>

<enforcer.skip>true</enforcer.skip>

<!-- SonarCloud configuration. Used for code quality metric reports -->
<sonar.host.url>https://sonarcloud.io</sonar.host.url>
<sonar.java.coveragePlugin>jacoco</sonar.java.coveragePlugin>
<sonar.dynamicAnalysis>reuseReports</sonar.dynamicAnalysis>
@@ -115,21 +119,21 @@
<artifactId>jfreesvg</artifactId>
</dependency>

<!-- UMAP -->
<!-- UMAP (production) -->
<dependency>
<groupId>tech.molecules</groupId>
<artifactId>external-umap-java</artifactId>
<version>1.0</version>
</dependency>

<!-- t-SNE -->
<!-- PCA, t-SNE (production), UMAP (tests only) -->
<dependency>
<groupId>com.github.lejon.T-SNE-Java</groupId>
<artifactId>tsne</artifactId>
<version>v2.6.4</version>
<groupId>com.github.haifengl</groupId>
<artifactId>smile-core</artifactId>
<version>4.0.0</version>
</dependency>

<!-- Standardization for UMAP preprocessing -->
<!-- Standardization for dimensionality reduction preprocessing -->
<dependency>
<groupId>org.apache.commons</groupId>
<artifactId>commons-math3</artifactId>
@@ -150,7 +154,6 @@
<scope>test</scope>
</dependency>


<!-- hierarchical clustering, tests only -->
<dependency>
<groupId>nz.ac.waikato.cms.weka</groupId>
@@ -239,6 +242,9 @@
<version>3.1.0</version>
<configuration>
<argLine>-Xmx3g</argLine>
<classpathDependencyExcludes>
<classpathDependencyExclude>net.imagej:imagej-legacy</classpathDependencyExclude>
</classpathDependencyExcludes>
</configuration>
</plugin>
</plugins>
@@ -255,12 +261,15 @@
<version>3.1.0</version>
<configuration>
<argLine>@{argLine} -Xmx3g</argLine>
<classpathDependencyExcludes>
<classpathDependencyExclude>net.imagej:imagej-legacy</classpathDependencyExclude>
</classpathDependencyExcludes>
</configuration>
</plugin>
<plugin>
<groupId>org.jacoco</groupId>
<artifactId>jacoco-maven-plugin</artifactId>
<version>0.8.7</version>
<version>0.8.11</version>
<executions>
<execution>
<id>prepare-agent</id>
@@ -0,0 +1,85 @@
/*-
* #%L
* mastodon-deep-lineage
* %%
* Copyright (C) 2022 - 2024 Stefan Hahmann
* %%
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions are met:
*
* 1. Redistributions of source code must retain the above copyright notice,
* this list of conditions and the following disclaimer.
* 2. Redistributions in binary form must reproduce the above copyright notice,
* this list of conditions and the following disclaimer in the documentation
* and/or other materials provided with the distribution.
*
* THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
* AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
* IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
* ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDERS OR CONTRIBUTORS BE
* LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
* CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
* SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
* INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
* CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
* ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
* POSSIBILITY OF SUCH DAMAGE.
* #L%
*/
package org.mastodon.mamut.feature.branch.dimensionalityreduction.pca;

import java.util.List;

import org.mastodon.feature.Feature;
import org.mastodon.feature.FeatureProjectionKey;
import org.mastodon.feature.FeatureProjectionSpec;
import org.mastodon.feature.FeatureSpec;
import org.mastodon.feature.Multiplicity;
import org.mastodon.mamut.feature.dimensionalityreduction.pca.AbstractPcaFeature;
import org.mastodon.mamut.model.branch.BranchSpot;
import org.mastodon.properties.DoublePropertyMap;
import org.scijava.plugin.Plugin;

/**
* Represents a PCA feature for BranchSpots in the Mastodon project.
* <br>
* This feature is used to store the PCA outputs for BranchSpots.
* <br>
* The PCA outputs are stored in a list of {@link DoublePropertyMap}s. The size of the list is equal to the number of dimensions of the PCA output.
*/
public class BranchPcaFeature extends AbstractPcaFeature< BranchSpot >
{
public static final String KEY = "Branch PCA outputs";

private final BranchSpotPcaFeatureSpec adaptedSpec;

public static final BranchSpotPcaFeatureSpec GENERIC_SPEC = new BranchSpotPcaFeatureSpec();

public BranchPcaFeature( final List< DoublePropertyMap< BranchSpot > > outputMaps )
{
super( outputMaps );
FeatureProjectionSpec[] projectionSpecs =
projectionMap.keySet().stream().map( FeatureProjectionKey::getSpec ).toArray( FeatureProjectionSpec[]::new );
this.adaptedSpec = new BranchSpotPcaFeatureSpec( projectionSpecs );
}

@Plugin( type = FeatureSpec.class )
public static class BranchSpotPcaFeatureSpec extends FeatureSpec< BranchPcaFeature, BranchSpot >
{
public BranchSpotPcaFeatureSpec()
{
super( KEY, HELP_STRING, BranchPcaFeature.class, BranchSpot.class, Multiplicity.SINGLE );
}

public BranchSpotPcaFeatureSpec( final FeatureProjectionSpec... projectionSpecs )
{
super( KEY, HELP_STRING, BranchPcaFeature.class, BranchSpot.class, Multiplicity.SINGLE, projectionSpecs );
}
}

@Override
public FeatureSpec< ? extends Feature< BranchSpot >, BranchSpot > getSpec()
{
return adaptedSpec;
}
}
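
The Javadoc above describes one `DoublePropertyMap` per output dimension. As a minimal sketch of that layout, assuming a row-major embedding `double[][]` with one row per object and using `java.util.HashMap` as a stand-in for `DoublePropertyMap`, the embedding can be split column by column. `OutputMapsSketch` and `toOutputMaps` are hypothetical names and not classes from this PR.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch; java.util.Map stands in for Mastodon's DoublePropertyMap. */
public class OutputMapsSketch
{
	/** Splits a row-major embedding into one map per output dimension. */
	public static < V > List< Map< V, Double > > toOutputMaps( final List< V > vertices, final double[][] embedding )
	{
		if ( vertices.size() != embedding.length )
			throw new IllegalArgumentException( "One embedding row per vertex is required." );
		final int dimensions = embedding.length == 0 ? 0 : embedding[ 0 ].length;
		final List< Map< V, Double > > outputMaps = new ArrayList<>();
		for ( int d = 0; d < dimensions; d++ )
		{
			// Column d of the embedding becomes the d-th output map.
			final Map< V, Double > map = new HashMap<>();
			for ( int i = 0; i < vertices.size(); i++ )
				map.put( vertices.get( i ), embedding[ i ][ d ] );
			outputMaps.add( map );
		}
		return outputMaps;
	}
}
```

With two objects and a two-dimensional embedding, the result is a list of two maps, each holding one value per object, which matches the "size of the list equals the number of output dimensions" statement in the Javadoc.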
@@ -0,0 +1,76 @@
/*-
* #%L
* mastodon-deep-lineage
* %%
* Copyright (C) 2022 - 2024 Stefan Hahmann
* %%
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions are met:
*
* 1. Redistributions of source code must retain the above copyright notice,
* this list of conditions and the following disclaimer.
* 2. Redistributions in binary form must reproduce the above copyright notice,
* this list of conditions and the following disclaimer in the documentation
* and/or other materials provided with the distribution.
*
* THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
* AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
* IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
* ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDERS OR CONTRIBUTORS BE
* LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
* CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
* SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
* INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
* CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
* ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
* POSSIBILITY OF SUCH DAMAGE.
* #L%
*/
package org.mastodon.mamut.feature.branch.dimensionalityreduction.pca;

import java.util.Collection;
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;

import org.mastodon.RefPool;
import org.mastodon.mamut.feature.dimensionalityreduction.pca.AbstractPcaFeature;
import org.mastodon.mamut.feature.dimensionalityreduction.pca.AbstractPcaFeatureComputer;
import org.mastodon.mamut.model.Model;
import org.mastodon.mamut.model.branch.BranchLink;
import org.mastodon.mamut.model.branch.BranchSpot;
import org.mastodon.mamut.model.branch.ModelBranchGraph;
import org.mastodon.properties.DoublePropertyMap;
import org.scijava.Context;

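/**
* Computes the PCA feature for BranchSpots and wraps the resulting output maps in a {@link BranchPcaFeature}.
*/
public class BranchPcaFeatureComputer extends AbstractPcaFeatureComputer< BranchSpot, BranchLink, ModelBranchGraph >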
{

public BranchPcaFeatureComputer( final Model model, final Context context )
{
super( model, context );
}

@Override
protected AbstractPcaFeature< BranchSpot > createFeatureInstance( final List< DoublePropertyMap< BranchSpot > > pcaOutputMaps )
{
return new BranchPcaFeature( pcaOutputMaps );
}

@Override
protected RefPool< BranchSpot > getRefPool()
{
return model.getBranchGraph().vertices().getRefPool();
}

@Override
protected ReentrantReadWriteLock getLock( final ModelBranchGraph branchGraph )
{
return branchGraph.getLock();
}

@Override
protected Collection< BranchSpot > getVertices()
{
return model.getBranchGraph().vertices();
}
}