GeneScoPy


Genome assembly sequence and GFF/GTF file analyzer

Overview

GeneScoPy is a Python-based standalone graphical user interface (GUI) tool for working with genome assembly sequence files (FASTA) and genome annotation files (GTF/GFF). It provides a platform for:

  • Inspecting and managing genome assembly files.
  • Viewing genome annotation details.
  • Performing basic analyses such as computing assembly statistics (e.g., N50, GC content, scaffold sizes).
  • Searching and navigating annotation files efficiently.
  • Highlighting the region of interest in the FASTA sequence, based on the selected annotation.

Key Features

  • File Compatibility: Supports FASTA and GTF/GFF file formats.
  • Assembly Details: Displays total assembly length, scaffold counts, largest and smallest scaffolds, N50, and GC content.
  • Annotation Table: Presents GTF/GFF data in an easy-to-navigate table with fields like scaffold, source, feature, start and end positions, strand, frame, product, and gene name.
  • Sequence Viewer: Allows users to view scaffold sequences in a text editor.
  • Search Functionality: Provides tools for searching and navigating annotation records by keywords.
  • Highlight Functionality: Highlights the sequence region of interest based on the selected annotation.
  • User-Friendly Interface: Built with a modern and intuitive GUI.

Installation

  1. Clone this repository:
    git clone https://github.com/SamakshSingh99/GeneScoPy/
    cd GeneScoPy
  2. Ensure you have Python installed (version 3.7 or higher).
  3. Install required dependencies:
    pip install tk
  4. Run the tool:
    python ./Script/Script.py
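
Before running the tool, it can help to confirm that the interpreter meets the version requirement and that the Tk bindings are importable. The sketch below is not part of GeneScoPy; `environment_report` is a hypothetical helper name. It relies on the fact that `tkinter` is part of the Python standard library, although some Linux distributions ship it as a separate system package.

```python
import importlib.util
import sys


def environment_report(min_version=(3, 7)):
    """Report whether this interpreter looks suitable for a tkinter GUI tool.

    Hypothetical helper for illustration; it only inspects the environment
    and never opens a window.
    """
    return {
        # The README asks for Python 3.7 or higher.
        "python_ok": sys.version_info >= min_version,
        # find_spec returns None when the module cannot be located.
        "tkinter_found": importlib.util.find_spec("tkinter") is not None,
    }


report = environment_report()
print(report)
```

If `tkinter_found` is false, installing the operating system's Tk package for Python is usually the fix.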

How to Use

Opening Files

  1. Launch the application.
  2. Use the File menu to open a FASTA file or GTF/GFF file.

Viewing Assembly Details

  • After loading a FASTA file, the "Assembly Details" section will display information about:
    • Total assembly length.
    • Number of scaffolds.
    • Largest and smallest scaffolds.
    • N50 value.
    • GC content.
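
For readers unfamiliar with these metrics, the following is a minimal sketch of how they can be computed from loaded scaffolds. It is an illustration under stated assumptions, not the code GeneScoPy uses: `assembly_stats` is a hypothetical function, scaffolds are assumed to be held in a dict of name to sequence, and GC content is taken over all bases (including any `N` characters).

```python
def assembly_stats(scaffolds):
    """Summarise a dict of {scaffold_name: sequence} (hypothetical layout)."""
    lengths = sorted((len(seq) for seq in scaffolds.values()), reverse=True)
    total = sum(lengths)

    # N50: walking from the longest scaffold downwards, the length of the
    # scaffold at which the running total first reaches half the assembly.
    running, n50 = 0, 0
    for length in lengths:
        running += length
        if running * 2 >= total:
            n50 = length
            break

    # GC content as a percentage of all bases, case-insensitive.
    gc = sum(
        seq.upper().count("G") + seq.upper().count("C")
        for seq in scaffolds.values()
    )

    return {
        "total_length": total,
        "scaffold_count": len(lengths),
        "largest": lengths[0] if lengths else 0,
        "smallest": lengths[-1] if lengths else 0,
        "n50": n50,
        "gc_percent": 100.0 * gc / total if total else 0.0,
    }


# Toy assembly: three scaffolds of length 6, 4 and 2.
example = {"scaf1": "ATGCGC", "scaf2": "ATAT", "scaf3": "GG"}
stats = assembly_stats(example)
print(stats)
```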

Viewing Annotation Data

  • Load a GTF/GFF file to populate the annotation table.
  • Use the table columns to explore scaffold details, gene annotations, and other metadata.
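
To make the table columns concrete, here is a rough sketch of turning one GTF or GFF line into a row with the fields listed under Key Features. The function names and the attribute keys consulted for the gene name (`gene_name`, `gene`, `Name`) are assumptions for illustration; GeneScoPy's own parser may differ. The sketch does not handle semicolons inside quoted attribute values.

```python
def parse_attributes(text):
    """Parse the ninth column in either GTF or GFF3 style."""
    attrs = {}
    for part in text.strip().split(";"):
        part = part.strip()
        if not part:
            continue
        eq, sp = part.find("="), part.find(" ")
        if eq != -1 and (sp == -1 or eq < sp):
            # GFF3 style: key=value
            key, value = part[:eq], part[eq + 1:]
        elif sp != -1:
            # GTF style: key "value"
            key, value = part[:sp], part[sp + 1:]
        else:
            continue
        attrs[key.strip()] = value.strip().strip('"')
    return attrs


def parse_annotation_line(line):
    """Return a table row as a dict, or None for comments and malformed lines."""
    if not line.strip() or line.startswith("#"):
        return None
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9:
        return None
    scaffold, source, feature, start, end, _score, strand, frame, raw = fields
    attrs = parse_attributes(raw)
    return {
        "scaffold": scaffold,
        "source": source,
        "feature": feature,
        "start": int(start),  # 1-based, inclusive
        "end": int(end),      # 1-based, inclusive
        "strand": strand,
        "frame": frame,
        "product": attrs.get("product", ""),
        "gene_name": attrs.get("gene_name") or attrs.get("gene") or attrs.get("Name", ""),
    }


gtf_row = parse_annotation_line(
    'scaf1\tRefSeq\tCDS\t10\t50\t.\t+\t0\tgene_id "g1"; gene_name "abc"; product "kinase";'
)
gff_row = parse_annotation_line(
    "scaf1\tRefSeq\tgene\t10\t50\t.\t-\t.\tID=g1;Name=abc;product=hypothetical protein"
)
```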

Searching Annotations

  • Use the search bar to find specific entries in the annotation table.
  • Navigate through results using the Previous and Next buttons.
  • Check the sequence box to see the region highlighted for the selected record.
  • Reset the search to view the entire table again.
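
The search workflow above can be sketched without any GUI code. The class below is a hypothetical illustration, not GeneScoPy's implementation: it assumes annotation rows are dicts, matches the keyword case-insensitively against every field, and wraps around at either end of the result list.

```python
class AnnotationSearch:
    """Keyword search over annotation rows with Previous/Next navigation."""

    def __init__(self, records):
        self.records = records
        self.matches = []    # indices into self.records
        self.position = -1   # index into self.matches, -1 when empty

    def search(self, keyword):
        needle = keyword.lower()
        self.matches = [
            i for i, rec in enumerate(self.records)
            if any(needle in str(value).lower() for value in rec.values())
        ]
        self.position = 0 if self.matches else -1
        return self.current()

    def current(self):
        if not self.matches:
            return None
        return self.records[self.matches[self.position]]

    def next(self):
        if self.matches:
            # Wrap to the first match after the last one (assumed behaviour).
            self.position = (self.position + 1) % len(self.matches)
        return self.current()

    def previous(self):
        if self.matches:
            self.position = (self.position - 1) % len(self.matches)
        return self.current()

    def reset(self):
        """Clear the search so the whole table is shown again."""
        self.matches, self.position = [], -1
        return self.records


rows = [
    {"scaffold": "scaf1", "feature": "CDS", "product": "protein kinase"},
    {"scaffold": "scaf1", "feature": "gene", "product": "transporter"},
    {"scaffold": "scaf2", "feature": "CDS", "product": "Kinase inhibitor"},
]
finder = AnnotationSearch(rows)
first = finder.search("kinase")
second = finder.next()
wrapped = finder.next()
```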

File Management

  • Scaffold sequences can be selected from the list and displayed in the sequence viewer for detailed inspection.
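
As an illustration of what sits behind the scaffold list and the highlighted region, the sketch below reads FASTA records into a dict and converts annotation coordinates into offsets. All names are hypothetical. It assumes GTF/GFF coordinates are 1-based and inclusive (as both formats define them) and that the sequence is inserted into a tkinter `Text` widget as a single unbroken line starting at index `1.0`; if the viewer inserts line breaks, the offsets would need adjusting.

```python
import io


def read_fasta(handle):
    """Read FASTA records into {scaffold_name: sequence}."""
    scaffolds, name, chunks = {}, None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                scaffolds[name] = "".join(chunks)
            header = line[1:].split()
            # Use the first word of the header as the scaffold name.
            name = header[0] if header else ""
            chunks = []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        scaffolds[name] = "".join(chunks)
    return scaffolds


def region_offsets(start, end):
    """Convert 1-based inclusive coordinates to a 0-based half-open pair."""
    return start - 1, end


def text_widget_range(start, end):
    """Build tkinter Text index strings for the region (single-line assumption)."""
    lo, hi = region_offsets(start, end)
    return f"1.0+{lo}c", f"1.0+{hi}c"


fasta = io.StringIO(">scaf1 example scaffold\nATGCGC\nTTAA\n>scaf2\nGGCC\n")
scaffolds = read_fasta(fasta)
lo, hi = region_offsets(3, 6)
region = scaffolds["scaf1"][lo:hi]
tag_range = text_widget_range(3, 6)
```

With a `Text` widget in hand, the pair in `tag_range` could be passed to `tag_add` to mark the region, and `tag_configure` could give that tag a background colour.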

Contributing

Contributions are welcome! If you'd like to enhance the tool or fix any bugs:

  1. Fork the repository.
  2. Create a new branch for your feature or bug fix.
  3. Submit a pull request with a detailed description.

License

This project is licensed under the MIT License. See the LICENSE file for details.

Acknowledgments

Special thanks to the bioinformatics community for inspiring this project.

Contact

For questions or support, please open an issue on the GitHub repository.
