Create FASTA writer for QuatSymm multiple alignment #100

Merged: 4 commits, Jul 24, 2020

27 changes: 18 additions & 9 deletions README.md
@@ -1,6 +1,6 @@
[![Build Status](https://travis-ci.org/rcsb/symmetry.png)](https://travis-ci.org/rcsb/symmetry)

# Symmetry in Biomolecular Structures
# Symmetry in Protein Structures

This project collects tools to detect, analyze, and visualize **protein symmetry**. This includes the CE-Symm tool for **internal symmetry**, a tool for **quaternary symmetry**, and other experiments relating to symmetry.

@@ -14,34 +14,43 @@ User interfaces are available within the [symmetry-tools](https://github.com/rcs

<img src="docu/img/1u6d_symmetry.png" align="left" width="150" alt="C6 internal symmetry in PDB:1U6D" title="PDB:1U6D" />

CE-Symm is a tool for detecting internal symmetry in protein structures. CE-Symm version 2 is able to detect both open and closed symmetry and provide a multiple alignment of all repeats.
CE-Symm is a tool for detecting internal symmetry in protein structures.
CE-Symm version 2 is able to detect both open and closed symmetry and provide a multiple alignment of all repeats.

See [CE-Symm documentation](symmetry-tools/docs/CeSymm.md) for more details.
See [CE-Symm documentation](symmetry-tools/docs/CeSymm.md) for more details on how to use the tool.

When using CE-Symm, please cite:
If you find CE-Symm useful for your research, please consider citing:

CE-Symm version 2:
#### CE-Symm version 2:

**Analyzing the symmetrical arrangement of structural repeats in proteins with CE-Symm**<br/>
*Spencer E Bliven, Aleix Lafita, Peter W Rose, Guido Capitani, Andreas Prlić, & Philip E Bourne* <br/>
[PLOS Computational Biology (2019) 15 (4):e1006842.](https://journals.plos.org/ploscompbiol/article/citation?id=10.1371/journal.pcbi.1006842) <br/>
[![doi](https://img.shields.io/badge/doi-10.1371%2Fjournal.pcbi.1006842-blue.svg?style=flat)](https://doi.org/10.1371/journal.pcbi.1006842) [![pubmed](https://img.shields.io/badge/pubmed-31009453-blue.svg?style=flat)](http://www.ncbi.nlm.nih.gov/pubmed/31009453)

CE-Symm version 1:
#### CE-Symm version 1:

**Systematic detection of internal symmetry in proteins using CE-Symm**<br/>
*Douglas Myers-Turnbull, Spencer E Bliven, Peter W Rose, Zaid K Aziz, Philippe Youkharibache, Philip E Bourne, & Andreas Prlić* <br/>
[J Mol Biol (2013) 426 (11): 2255–2268.](https://doi.org/10.1016/j.jmb.2014.03.010) <br/>
[![doi](https://img.shields.io/badge/doi-10.1016%2Fj.jmb.2014.03.010-blue.svg?style=flat)](https://doi.org/10.1016/j.jmb.2014.03.010) [![pubmed](https://img.shields.io/badge/pubmed-24681267-blue.svg?style=flat)](http://www.ncbi.nlm.nih.gov/pubmed/24681267)

### Quaternary Symmetry
### QuatSymm

<img src="docu/img/1G63.jpg" alt="Tetrahedral symmetry of PDB:1G63" title="PDB:1G63" style="float: left; width:150px" align="left" width="150"/>

The QuatSymm tool is used for identifying quaternary symmetry in protein complexes. It is also able to tolerate mutations and detect pseudosymmetry in complexes with structurally homologous subunits.
The QuatSymm tool is used for identifying quaternary symmetry in protein complexes.
It is also able to detect pseudosymmetry in complexes with structurally homologous subunits.
The QuatSymm tool is used by the RCSB Protein Data Bank's [quaternary symmetry analysis](http://www.rcsb.org/pdb/browse/stoichiometry.do) and it has also been utilized to assess protein assembly predictions at [CASP challenges](https://predictioncenter.org/).

See [QuatSymm documentation](symmetry-tools/docs/QuatSymm.md) for more details.

The QuatSymm algorithms are used by the RCSB Protein Data Bank's [quaternary symmetry analysis](http://www.rcsb.org/pdb/browse/stoichiometry.do). The tool was also utilized for biological assembly assessment during the 1[2th Critical Assessment of Structure Prediction in 2016](https://doi.org/10.1002/prot.25408).
If you find the QuatSymm tool useful for your research, please consider citing the following publications where it has been described:

**BioJava 5: A community driven open-source bioinformatics library**<br/>
*Aleix Lafita, Spencer Bliven, Andreas Prlić, Dmytro Guzenko, Peter W. Rose, Anthony Bradley, Paolo Pavan, Douglas Myers-Turnbull, Yana Valasatava, Michael Heuer, Matt Larson, Stephen K. Burley, Jose M. Duarte* <br/>
[PLOS Computational Biology 15(2): e1006791](http://dx.plos.org/10.1371/journal.pcbi.1006791) <br/>
[![doi](http://img.shields.io/badge/doi-10.1371%2Fjournal.pcbi.1006791-blue.svg?style=flat)](https://doi.org/10.1371/journal.pcbi.1006791)


## Dependencies
21 changes: 21 additions & 0 deletions symmetry-tools/src/main/java/main/QuatSymmMain.java
@@ -37,6 +37,7 @@
import org.slf4j.LoggerFactory;

import workers.QuatSymmWorker;
import writers.QuatSymmFastaWriter;
import writers.QuatSymmStatsWriter;
import writers.QuatSymmWriter;

@@ -162,6 +163,18 @@ public static void main(String[] args) throws InterruptedException {
                logger.error(e.getMessage());
            }
        }

        if (cli.hasOption("fasta")) {
            String filename = cli.getOptionValue("fasta");
            if (filename == null || filename.isEmpty())
                filename = "-"; // standard out
            try {
                writers.add(new QuatSymmFastaWriter(filename));
            } catch (IOException e) {
                logger.error("Error: Ignoring file " + filename + ".");
                logger.error(e.getMessage());
            }
        }

        // Default Writer
        if (writers.isEmpty() && !cli.hasOption("noverbose")) {
@@ -379,6 +392,14 @@ private static Options getOptions() {
                .argName("file")
                .desc("Output a tsv file with detailed symmetry information (default)")
                .build());
        options.addOption(Option
                .builder("f")
                .longOpt("fasta")
                .hasArg()
                .optionalArg(true)
                .argName("file")
                .desc("Output the multiple alignment in FASTA format")
                .build());

        // jmol
        grp = new OptionGroup();
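
The `fasta` option takes an optional argument, and the handling in `main` falls back to `-` when no file name is given. A minimal sketch of that fallback follows. The class and method names are hypothetical, and mapping `-` to `System.out` is an assumption based on the `// standard out` comment, not code taken from `QuatSymmWriter`.

```java
import java.io.FileWriter;
import java.io.IOException;
import java.io.PrintWriter;

// Hypothetical stand-in for the option handling in QuatSymmMain; not project code.
class OutputTargetSketch {

    /** Returns the given file name, or "-" when the option carried no usable value. */
    static String resolveTarget(String optionValue) {
        if (optionValue == null || optionValue.isEmpty()) {
            return "-"; // "-" is used here as the marker for standard output
        }
        return optionValue;
    }

    /** Opens a writer for the target; "-" maps to System.out (assumed behaviour). */
    static PrintWriter open(String target) throws IOException {
        if ("-".equals(target)) {
            return new PrintWriter(System.out, true);
        }
        return new PrintWriter(new FileWriter(target));
    }

    public static void main(String[] args) throws IOException {
        // Simulates the option being passed without a file argument.
        PrintWriter out = open(resolveTarget(null));
        out.println("writing to standard output");
        out.flush();
    }
}
```

Resolving the target in one place keeps the "no argument" and "empty argument" cases on the same path, so every writer sees either a real file name or `-`.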
55 changes: 55 additions & 0 deletions symmetry-tools/src/main/java/writers/QuatSymmFastaWriter.java
@@ -0,0 +1,55 @@
package writers;

import java.io.IOException;
import java.util.List;
import java.util.stream.Collectors;

import org.biojava.nbio.structure.StructureException;
import org.biojava.nbio.structure.StructureIdentifier;
import org.biojava.nbio.structure.align.client.StructureName;
import org.biojava.nbio.structure.align.multiple.MultipleAlignment;
import org.biojava.nbio.structure.align.multiple.util.MultipleAlignmentWriter;
import org.biojava.nbio.structure.cluster.SubunitCluster;
import org.biojava.nbio.structure.symmetry.core.QuatSymmetryResults;

/**
 * Writes the QuatSymm multiple alignments in FASTA format to a single
 * file. Entries are separated by a line containing '//'.
 *
 * @author Aleix Lafita
 *
 */
public class QuatSymmFastaWriter extends QuatSymmWriter {

    public QuatSymmFastaWriter(String filename) throws IOException {
        super(filename);
    }

    @Override
    public synchronized void writeResult(String identifier,
            QuatSymmetryResults result) throws StructureException {
        if (result != null) {
            for (SubunitCluster cluster : result.getSubunitClusters()) {
                // Quick fix for a bug: the structure identifiers are null, so
                // build them from the entry identifier and the subunit names.
                List<StructureIdentifier> structident = cluster.getSubunits().stream()
                        .map(n -> new StructureName(identifier + "_" + n.getName()))
                        .collect(Collectors.toList());

                MultipleAlignment alignment = cluster.getMultipleAlignment();
                // Check for null before dereferencing the alignment.
                if (alignment != null) {
                    alignment.getEnsemble().setStructureIdentifiers(structident);
                    writer.write(MultipleAlignmentWriter.toFASTA(alignment));
                }
                writer.println("//");
                writer.flush();
            }
        }
    }

    @Override
    public synchronized void writeHeader() throws IOException {
        // No header for FASTA files
    }

}
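
As a rough picture of the file layout this writer produces, the sketch below formats one cluster as FASTA records followed by the `//` separator. It assumes each record is a `>` header line built from `identifier + "_" + subunitName` followed by one aligned sequence line; the exact text emitted by `MultipleAlignmentWriter.toFASTA` may differ. The class name, method name, and sequences are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration of the output layout; does not call BioJava.
class FastaLayoutSketch {

    /** Formats one cluster: one FASTA record per subunit, then a "//" line. */
    static String formatCluster(String identifier, Map<String, String> alignedRows) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> row : alignedRows.entrySet()) {
            // Header mirrors the identifier + "_" + subunit name scheme used by the writer.
            sb.append('>').append(identifier).append('_').append(row.getKey()).append('\n');
            sb.append(row.getValue()).append('\n');
        }
        sb.append("//\n"); // separator written after every cluster
        return sb.toString();
    }

    public static void main(String[] args) {
        // Made-up aligned rows for two subunits of a made-up entry.
        Map<String, String> rows = new LinkedHashMap<>();
        rows.put("A", "MKV-L");
        rows.put("B", "MKVAL");
        System.out.print(formatCluster("demo", rows));
    }
}
```

Because the separator is written once per cluster, an entry with several subunit clusters yields several alignment blocks in the same file.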
2 changes: 1 addition & 1 deletion symmetry-tools/src/main/java/writers/QuatSymmWriter.java
@@ -26,6 +26,6 @@ public QuatSymmWriter(String filename) throws IOException {
* @throws IOException
*/
abstract public void writeResult(String identifier,
QuatSymmetryResults result) throws IOException;
QuatSymmetryResults result) throws Exception;

}
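
The change to `QuatSymmWriter` widens the `throws` clause of the abstract `writeResult` from `IOException` to `Exception`. This is presumably needed because the new FASTA writer declares `throws StructureException`, and in Java an overriding method may only declare checked exceptions that are the same as, or subclasses of, those declared by the method it overrides. The sketch below illustrates the rule with hypothetical stand-in classes; `DemoStructureException` merely plays the role of a checked exception that is not an `IOException`.

```java
import java.io.IOException;

// Hypothetical stand-ins; none of these classes exist in the project.
class ThrowsClauseSketch {

    /** A checked exception unrelated to IOException. */
    static class DemoStructureException extends Exception {
        DemoStructureException(String message) {
            super(message);
        }
    }

    static abstract class BaseWriter {
        // With "throws IOException" here, FastaLikeWriter below would not compile.
        abstract void writeResult(String identifier) throws Exception;
    }

    static class FastaLikeWriter extends BaseWriter {
        @Override
        void writeResult(String identifier) throws DemoStructureException {
            if (identifier == null) {
                throw new DemoStructureException("no identifier");
            }
        }
    }

    static class TsvLikeWriter extends BaseWriter {
        @Override
        void writeResult(String identifier) throws IOException {
            if (identifier == null) {
                throw new IOException("no identifier");
            }
        }
    }

    /** Calls the writer and reports "ok" or the simple name of the exception thrown. */
    static String tryWrite(BaseWriter writer, String identifier) {
        try {
            writer.writeResult(identifier);
            return "ok";
        } catch (Exception e) {
            return e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        System.out.println(tryWrite(new FastaLikeWriter(), null));
        System.out.println(tryWrite(new TsvLikeWriter(), "demo"));
    }
}
```

The trade-off is that callers of the abstract method now have to handle the broad `Exception` type, while each subclass can still narrow its own declaration.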