Open
86 commits
eade1a5
storage: Remove variant-secondary-annotation-index with samples. #TAS…
j-coll May 13, 2025
62dc57b
storage: Extend ProjectMetadata to include explicit solr configuratio…
j-coll May 14, 2025
26e8cd1
storage: Use separate versioning for configset. #TASK-6217
j-coll May 16, 2025
560a864
storage: Disable waitSearcher and openSearcher when writing into solr…
j-coll May 20, 2025
52a746d
storage: Replace the pending secondary annot index table with a file …
j-coll May 22, 2025
6950971
benchmark: Allow benchmark over https. Upgrade jmeter version. Add ge…
j-coll May 26, 2025
6721225
storage: Deprecate previous active searchmetadata if needed. #TASK-6217
j-coll May 26, 2025
b7c8095
storage: Remove "stored" fields from manage-schema. Update solr-confi…
j-coll May 27, 2025
cb335f4
storage: Add files integrity check. If it fails, force MR execution. #TA…
j-coll May 27, 2025
7917f63
storage: Fix VariantSearchToVariantConverter. #TASK-6217
j-coll May 27, 2025
e4194fa
storage: Test with solr cloud instead of solr core. #TASK-6217
j-coll May 28, 2025
b4193cb
storage: Fix final search index synchronization. #TASK-6217
j-coll May 28, 2025
bc90599
storage: Reduce dfs.replication for temporary files. #TASK-6217
j-coll May 28, 2025
e758974
storage: Fix MiniSolrCloudCluster configuration. Add configsets. #TAS…
j-coll May 29, 2025
6fb012a
storage: Log dfs.replication when writing MR writes to hdfs. #TASK-6217
j-coll May 29, 2025
486454f
storage: Create collections with Tlog replicas in multi-node solrclou…
j-coll May 29, 2025
50b060a
storage: On solr write error, abort any hbase mutation or file clean.…
j-coll May 29, 2025
5af7f6e
storage: Do not clean pending variant files on reader error. #TASK-6217
j-coll May 30, 2025
105febc
storage: Mark for cleaning only one file per variant. Ignore end posi…
j-coll May 30, 2025
aae1744
storage: Add automatic sharding on collection creation. #TASK-6217
j-coll Jun 2, 2025
3d53309
storage: Fix NPE reading status. Add numDocs and size in bytes. #TASK…
j-coll Jun 3, 2025
10e42c4
storage: Move stats to a different solr collection. #TASK-6217
j-coll Jun 4, 2025
c686a72
pom: Remove mortbay servlet-api dependencies. #TASK-6217
j-coll Jun 5, 2025
bead342
storage: Fix active search index metadata selection. #TASK-6217
j-coll Jun 5, 2025
b3409c7
storage: Use join for filtering by stats in other collection. #TASK-6217
j-coll Jun 5, 2025
54d8567
storage: Force sort by "id" on each solr query. #TASK-6217
j-coll Jun 5, 2025
77e4e2a
storage: Remove SEARCH_INDEX_LAST_TIMESTAMP. Use "lastUpdateDate" fro…
j-coll Jun 5, 2025
b31dd45
storage: Increase insertBatchSize to 15k. #TASK-6217
j-coll Jun 6, 2025
5c843d5
storage: Add command opencga-admin.sh benchmark variant. #TASK-6217
j-coll Jun 6, 2025
f0c7dad
storage: Store variantStats values as counters instead of freq values…
j-coll Jun 10, 2025
629f58a
storage: Add functional queries for variant stats filters. #TASK-6217
j-coll Jun 10, 2025
fd54084
storage: Remove stats collection in solr. #TASK-6217
j-coll Jun 10, 2025
619d13d
storage: Add aux query to speed up function stats queries. #TASK-6217
j-coll Jun 11, 2025
d6eed32
storage: Restore filter cohortStatsPass. #TASK-6217
j-coll Jun 11, 2025
218c299
storage: Centralize HBase-Search sync operations in HadoopVariantSear…
j-coll Jun 11, 2025
f78e461
storage: Move VariantSearchSyncInfo to opencga-core. #TASK-6217
j-coll Jun 12, 2025
a54448e
storage: Add VariantStatsHash to synchronize only modified stats. #TAS…
j-coll Jun 16, 2025
77e3ef6
storage: Remove deprecated solr collections. #TASK-6217
j-coll Jun 16, 2025
11b6471
storage: Add a WARNING event on queries when updating solr. #TASK-6217
j-coll Jun 16, 2025
49b4730
storage: Add sortable id to variant search. Ensure backward-compatibi…
j-coll Jun 17, 2025
1854fd4
storage: Preserve backward-compatibility on functionalstats feature. …
j-coll Jun 17, 2025
e63eb06
storage: Ensure events are copied to final QueryResult. #TASK-6217
j-coll Jun 17, 2025
eb258c1
storage: Add testWhileLoadingEvent. Improve watchdog. #TASK-6217
j-coll Jun 17, 2025
f74924c
storage: Fix variant-prune cleaning from solr. #TASK-6217
j-coll Jun 17, 2025
fde108b
storage: Use creationDate to discard old sync flags. #TASK-6217
j-coll Jun 17, 2025
7bafae6
storage: Delete secondary index pending variant files from hdfs at en…
j-coll Jun 18, 2025
e3c15f7
storage: Execute discover pending variants MR only if outdated. #TASK…
j-coll Jun 18, 2025
5b2cd12
storage: Query by any cohort in benchmark. Add benchmark_plot_series.…
j-coll Jun 19, 2025
5dceb48
storage: Force new collection if any search index attribute changes. …
j-coll Jul 3, 2025
4baf853
storage: Only check attributes from configuration. #TASK-6217
j-coll Jul 3, 2025
feea050
storage: Check hdfs rename output. #TASK-6217
j-coll Jul 9, 2025
2124eae
storage: Ignore cohorts sync hash map column if not enabled. #TASK-6217
j-coll Jul 9, 2025
933e460
storage: Fix partially updating stats. Discard other fields. #TASK-6217
j-coll Jul 9, 2025
f84ed11
storage: Ensure sync if running stats and solrIndex concurrently. #TA…
j-coll Jul 9, 2025
25a1b5f
storage: Update specific VariantStats instead of them all. #TASK-6217
j-coll Jul 11, 2025
bad4045
storage: Remove debug log from ProgressLogger. #TASK-6217
j-coll Jul 11, 2025
b37b15e
Merge branch 'develop' into TASK-6217
j-coll Jul 11, 2025
1f2e0c0
storage: Ensure updated documents have at least one variantStats. #TA…
j-coll Jul 11, 2025
5e47f86
storage: fix converter setIncludeIndexStatus #TASK-6217
j-coll Jul 15, 2025
2027394
analysis: Start HadoopExternalResource before VariantSolrExternalReso…
j-coll Jul 16, 2025
c45fbae
storage: Fix tests. #TASK-6217
j-coll Jul 16, 2025
7473c18
storage: Fix MAF filter in solr #TASK-6217
j-coll Jul 18, 2025
fb23ae1
analysis: Fix VariantOperationsTest. #TASK-6217
j-coll Jul 18, 2025
505937e
storage: Fix checkstyle. #TASK-6217
j-coll Jul 18, 2025
3486c5e
storage: Fix SolrQueryParser MAF. #TASK-6217
j-coll Jul 21, 2025
89a1f26
storage: Add searchIndex information to opencga-admin storage status.…
j-coll Jul 22, 2025
0bb98d9
storage: Do not flip MAF filters over 0.5. #TASK-6217
j-coll Jul 22, 2025
6405b66
storage: Close output asynchronously at DiscoverVariantsFileBasedMapp…
j-coll Jul 22, 2025
28a6f61
storage: Add default list of queries to benchmark. #TASK-6217
j-coll Jul 23, 2025
81e471a
storage: set final name for configset. #TASK-6217
j-coll Jul 23, 2025
01fb512
storage: Increase "search.load.numThreads" from 1 to 2. #TASK-6217
j-coll Jul 24, 2025
3f055f6
storage: Add simple constructor for VariantSearchToVariantConverter. …
j-coll Aug 1, 2025
9a42970
storage: Make SolrQueryParser abstract. Add VariantStorageSolrQueryPa…
j-coll Aug 1, 2025
2b4ca34
storage: Fix discover pending files with unusual contig names. #TASK-…
j-coll Aug 18, 2025
86c568f
storage: Remove repeated metadata queries. #TASK-6217
j-coll Aug 18, 2025
50ffd1d
storage: Fix "isSampleDataColumn". #TASK-6217
j-coll Aug 27, 2025
be4c252
storage: Improve csv parse. #TASK-6217
j-coll Aug 28, 2025
9a06fda
storage: Add ShutdownHookUtils. Ensure PendingVariantsFileCleaner abo…
j-coll Aug 28, 2025
ba9fc64
storage: Add a size threshold to input and commit Solr batches. #TASK…
j-coll Aug 28, 2025
441a3bf
storage: Prefer active over staging search indexes for queries. #TASK…
j-coll Aug 28, 2025
096e315
storage: Allow blue-green deployment on overwrite=true. #TASK-6217
j-coll Aug 28, 2025
0fcb936
storage: Fix SolrQueryParserTest. #TASK-6217
j-coll Aug 29, 2025
120aaca
storage: Missing SearchIndexMetadata should be created in ACTIVE stat…
j-coll Oct 10, 2025
2d3675a
Merge branch 'develop' into TASK-6217
j-coll Nov 6, 2025
bcb08ce
storage: Fix compilation issues. #TASK-6217
j-coll Nov 6, 2025
a8e3a1b
Merge branch 'develop' into TASK-6217
j-coll Nov 10, 2025
38 changes: 38 additions & 0 deletions opencga-analysis/pom.xml
@@ -43,6 +43,10 @@
<groupId>org.opencb.commons</groupId>
<artifactId>commons-datastore-core</artifactId>
</dependency>
<dependency>
<groupId>org.opencb.commons</groupId>
<artifactId>commons-datastore-solr</artifactId>
</dependency>
<dependency>
<groupId>org.opencb.opencga</groupId>
<artifactId>opencga-core</artifactId>
@@ -248,6 +252,40 @@
<groupId>org.apache.solr</groupId>
<artifactId>solr-test-framework</artifactId>
<scope>test</scope>
<exclusions>
<exclusion>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-auth</artifactId>
</exclusion>
<exclusion>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-common</artifactId>
</exclusion>
<exclusion>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-hdfs</artifactId>
</exclusion>
<exclusion>
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-hdfs-client</artifactId>
</exclusion>
<exclusion>
<groupId>org.apache.htrace</groupId>
<artifactId>htrace-core</artifactId>
</exclusion>
<exclusion>
<groupId>org.apache.httpcomponents</groupId>
<artifactId>httpcore</artifactId>
</exclusion>
<exclusion>
<groupId>org.apache.httpcomponents</groupId>
<artifactId>httpmime</artifactId>
</exclusion>
<exclusion>
<groupId>org.apache.zookeeper</groupId>
<artifactId>zookeeper</artifactId>
</exclusion>
</exclusions>
</dependency>

<dependency>
@@ -82,9 +82,11 @@
import org.opencb.opencga.storage.core.StorageEngineFactory;
import org.opencb.opencga.storage.core.StoragePipelineResult;
import org.opencb.opencga.storage.core.exceptions.StorageEngineException;
import org.opencb.opencga.storage.core.exceptions.VariantSearchException;
import org.opencb.opencga.storage.core.metadata.VariantMetadataFactory;
import org.opencb.opencga.storage.core.metadata.VariantStorageMetadataManager;
import org.opencb.opencga.storage.core.metadata.models.*;
import org.opencb.opencga.storage.core.metadata.models.project.SearchIndexMetadata;
import org.opencb.opencga.storage.core.utils.CellBaseUtils;
import org.opencb.opencga.storage.core.variant.BeaconResponse;
import org.opencb.opencga.storage.core.variant.VariantStorageEngine;
@@ -98,6 +100,7 @@
import org.opencb.opencga.storage.core.variant.query.projection.VariantQueryProjectionParser;
import org.opencb.opencga.storage.core.variant.score.VariantScoreFormatDescriptor;
import org.opencb.opencga.storage.core.variant.search.solr.VariantSearchLoadResult;
import org.opencb.opencga.storage.core.variant.search.solr.VariantSearchManager;

import java.io.IOException;
import java.net.URI;
@@ -239,22 +242,6 @@ public List<StoragePipelineResult> index(String study, List<String> files, Strin
.index(study, files, UriUtils.createDirectoryUriSafe(outDir), params, token));
}

-public void secondaryIndexSamples(String study, List<String> samples, ObjectMap params, String token)
-throws CatalogException, StorageEngineException {
-secureOperation(VariantSecondaryIndexSamplesOperationTool.ID, study, params, token, engine -> {
-engine.secondaryIndexSamples(study, samples);
-return null;
-});
-}
-
-public void removeSearchIndexSamples(String study, List<String> samples, ObjectMap params, String token)
-throws CatalogException, StorageEngineException {
-secureOperation("removeSecondaryIndexSamples", study, params, token, engine -> {
-engine.removeSecondaryIndexSamples(study, samples);
-return null;
-});
-}
-
public VariantSearchLoadResult secondaryAnnotationIndex(String project, String region, boolean overwrite, ObjectMap params, String token)
throws CatalogException, StorageEngineException {
return secureOperationByProject(VariantSecondaryAnnotationIndexOperationTool.ID, project, params, token, engine -> {
@@ -1236,6 +1223,35 @@ public void variantPrune(String project, URI outdir, VariantPruneParams params,
});
}

public List<ObjectMap> getSearchStatus(String project, String token)
throws StorageEngineException, CatalogException, IOException, VariantSearchException {
List<ObjectMap> results = new ArrayList<>();
try (VariantStorageEngine engine = getVariantStorageEngineByProject(project, new ObjectMap(), token)) {
VariantSearchManager variantSearchManager = engine.getVariantSearchManager();
ProjectMetadata pm = engine.getMetadataManager().getProjectMetadata();
SearchIndexMetadata active = pm.getSecondaryAnnotationIndex().getActiveIndex();
if (active != null) {
String collection = variantSearchManager.buildCollectionName(active);
ObjectMap result = new ObjectMap();
result.put("metadata", active);
if (variantSearchManager.exists(active)) {
result.put("collection", collection);
}
results.add(result);
}
for (SearchIndexMetadata stagingIndex : pm.getSecondaryAnnotationIndex().getStagingIndexes()) {
String collection = variantSearchManager.buildCollectionName(stagingIndex);
ObjectMap result = new ObjectMap();
result.put("metadata", stagingIndex);
if (variantSearchManager.exists(stagingIndex)) {
result.put("collection", collection);
}
results.add(result);
}
return results;
}
}

// Permission related methods

private interface VariantReadOperation<R> {
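
Review note on `getSearchStatus` above: the active-index branch and the staging-index loop build the same entry, with `metadata` always present and `collection` added only when the collection exists. A minimal sketch of factoring that into one helper follows. It is not the code in this PR: `SearchStatusSketch`, `IndexMeta` and the `Function`/`Predicate` parameters are hypothetical stand-ins for `SearchIndexMetadata`, `VariantSearchManager.buildCollectionName` and `VariantSearchManager.exists`, and a plain `Map` replaces `ObjectMap` so the example runs without OpenCGA on the classpath.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.function.Predicate;

final class SearchStatusSketch {

    // Hypothetical stand-in for SearchIndexMetadata; only a version number is modelled.
    static final class IndexMeta {
        final int version;
        IndexMeta(int version) { this.version = version; }
    }

    // One status entry: "metadata" is always set, "collection" only if the collection exists.
    static Map<String, Object> entry(IndexMeta meta,
                                     Function<IndexMeta, String> collectionName,
                                     Predicate<IndexMeta> exists) {
        Map<String, Object> result = new LinkedHashMap<>();
        result.put("metadata", meta);
        if (exists.test(meta)) {
            result.put("collection", collectionName.apply(meta));
        }
        return result;
    }

    // Active index first (when there is one), then every staging index.
    static List<Map<String, Object>> status(IndexMeta active, List<IndexMeta> staging,
                                            Function<IndexMeta, String> collectionName,
                                            Predicate<IndexMeta> exists) {
        List<Map<String, Object>> results = new ArrayList<>();
        if (active != null) {
            results.add(entry(active, collectionName, exists));
        }
        for (IndexMeta meta : staging) {
            results.add(entry(meta, collectionName, exists));
        }
        return results;
    }

    public static void main(String[] args) {
        IndexMeta active = new IndexMeta(1);
        IndexMeta staging = new IndexMeta(2);
        // The naming scheme and the existence check are assumptions for illustration only.
        List<Map<String, Object>> status = status(active, Collections.singletonList(staging),
                m -> "variants_v" + m.version,
                m -> m.version == 1);
        for (Map<String, Object> e : status) {
            System.out.println(e.containsKey("collection") ? e.get("collection") : "(no collection)");
        }
    }
}
```

With a helper of this shape, the two blocks in `getSearchStatus` would reduce to one call each, and any later change to the entry layout would happen in a single place.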

This file was deleted.

This file was deleted.

@@ -123,8 +123,8 @@ public class VariantOperationsTest {
public static Object[][] parameters() {
return new Object[][]{
// {MongoDBVariantStorageEngine.STORAGE_ENGINE_ID},
-{DummyVariantStorageEngine.STORAGE_ENGINE_ID},
-{HadoopVariantStorageEngine.STORAGE_ENGINE_ID}
+{HadoopVariantStorageEngine.STORAGE_ENGINE_ID},
+{DummyVariantStorageEngine.STORAGE_ENGINE_ID}
};
}

@@ -174,7 +174,7 @@ public void tearDown() {

try {
VariantStorageEngine engine = opencga.getStorageEngineFactory().getVariantStorageEngine(storageEngine, DB_NAME);
-if (storageEngine.equals(HadoopVariantStorageEngine.STORAGE_ENGINE_ID)) {
+if (storageEngine.equals(HadoopVariantStorageEngine.STORAGE_ENGINE_ID) && hadoopExternalResource.isReady()) {
VariantHbaseTestUtils.printVariants(((VariantHadoopDBAdaptor) engine.getDBAdaptor()), Paths.get(opencga.createTmpOutdir("_hbase_print_variants_AFTER")).toUri());
}
} catch (Exception e) {
@@ -188,10 +188,6 @@ public void tearDown() {

@BeforeClass
public static void beforeClass() throws Exception {
-if (HadoopVariantStorageTest.HadoopSolrSupport.isSolrTestingAvailable()) {
-solrExternalResource = new VariantSolrExternalResource();
-solrExternalResource.before();
-}
}

@AfterClass
@@ -201,8 +197,9 @@ public static void afterClass() {
hadoopExternalResource.after();
hadoopExternalResource = null;
}
-if (HadoopVariantStorageTest.HadoopSolrSupport.isSolrTestingAvailable()) {
+if (solrExternalResource != null) {
solrExternalResource.after();
+solrExternalResource = null;
}
}

@@ -244,6 +241,14 @@ private void loadDataset() throws Throwable {
DummyVariantStorageEngine.configure(opencga.getStorageEngineFactory(), true);
}

if (HadoopVariantStorageTest.HadoopSolrSupport.isSolrTestingAvailable()) {
if (solrExternalResource == null) {
solrExternalResource = new VariantSolrExternalResource();
solrExternalResource.before();
} else {
solrExternalResource.clearCollections();
}
}
catalogManager = opencga.getCatalogManager();
if (HadoopVariantStorageTest.HadoopSolrSupport.isSolrTestingAvailable()) {
variantStorageManager = opencga.getVariantStorageManager(solrExternalResource);
@@ -265,6 +270,12 @@ private void loadDataset() throws Throwable {
}

setUpCatalogManager();
if (HadoopVariantStorageTest.HadoopSolrSupport.isSolrTestingAvailable()) {
solrExternalResource.configure(variantStorageManager.getVariantStorageEngine(STUDY, token));
solrExternalResource.configure(variantStorageManager.getVariantStorageEngineForStudyOperation(STUDY, new ObjectMap(), token));
}

dummyVariantSetup(variantStorageManager, STUDY, token);

file = opencga.createFile(STUDY, "variant-test-file.vcf.gz", token);
// variantStorageManager.index(STUDY, file.getId(), opencga.createTmpOutdir("_index"), new ObjectMap(VariantStorageOptions.ANNOTATE.key(), true), token);
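
Review note on the `VariantOperationsTest` hunks above: the eager start-up in `beforeClass` is replaced by a create-once, reset-afterwards pattern in `loadDataset`. The first call starts the Solr resource, later calls only clear its collections, and `afterClass` stops it only if it was ever started. A self-contained sketch of that pattern follows. `LazyResourceSketch` and `FakeResource` are hypothetical; `FakeResource` mirrors only the `before()`, `clearCollections()` and `after()` calls seen in the diff, and its counters exist purely for illustration.

```java
final class LazyResourceSketch {

    // Hypothetical stand-in for VariantSolrExternalResource.
    static final class FakeResource {
        int starts;
        int clears;
        boolean stopped;

        void before() { starts++; }
        void clearCollections() { clears++; }
        void after() { stopped = true; }
    }

    private static FakeResource resource;

    // Called at the start of every dataset load: start once, afterwards only reset state.
    static FakeResource acquire() {
        if (resource == null) {
            resource = new FakeResource();
            resource.before();
        } else {
            resource.clearCollections();
        }
        return resource;
    }

    // Mirrors afterClass: stop the resource only if it was started, then drop the reference.
    static void release() {
        if (resource != null) {
            resource.after();
            resource = null;
        }
    }

    public static void main(String[] args) {
        FakeResource r = acquire(); // first load starts the resource
        acquire();                  // second load only clears collections
        System.out.println("starts=" + r.starts + " clears=" + r.clears); // prints "starts=1 clears=1"
        release();
    }
}
```

Checking the reference for `null` in `release()` rather than re-querying availability keeps start and stop symmetric: a resource that was never created is never stopped.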
4 changes: 2 additions & 2 deletions opencga-app/app/misc/solr/INSTALL.md
@@ -11,6 +11,6 @@ After compiling and installing OpenCGA, the six Solr config sets are located at
In order to upload all of them, you need to execute the following commands:

```
-$ ./bin/solr zk upconfig -n opencga-variant-configset-REPLACEME_OPENCGA_VERSION -d ~/opencga/build/misc/solr/opencga-variant-configset-REPLACEME_OPENCGA_VERSION -z localhost:9983
-$ ./bin/solr zk upconfig -n opencga-rga-configset-REPLACEME_OPENCGA_VERSION -d ~/opencga/build/misc/solr/opencga-rga-configset-REPLACEME_OPENCGA_VERSION -z localhost:9983
+$ ./bin/solr zk upconfig -n opencga-variant-configset-XYZ -d ~/opencga/build/misc/solr/opencga-variant-configset-XYZ -z localhost:9983
+$ ./bin/solr zk upconfig -n opencga-rga-configset-XYZ -d ~/opencga/build/misc/solr/opencga-rga-configset-XYZ -z localhost:9983
```