ftp: Protein data files (features and aa composition) #2221

Open
ValWood opened this issue Sep 3, 2024 · 4 comments

@ValWood
Member

ValWood commented Sep 3, 2024

https://www.pombase.org/data/Protein_data/

We need to decide which of these files are required:

PeptideStats.tsv      2017-11-23 21:10  190K
Protein_Features.tsv  2017-10-20 21:35  2.8M
aa_composition.tsv    2017-10-20 21:35  333K

Make sure these are copied and updated along with the other files.

@ValWood
Member Author

ValWood commented Sep 3, 2024

I think we should write out updated versions of the amino acid composition file. I think we have been asked for this in the past.
I'm not sure about the protein features. I don't think we need to make this available.
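
For illustration, writing out per-protein amino acid composition could look roughly like the sketch below. This is not the code that generates `aa_composition.tsv`; the column layout (`id`, `length`, then one count column per standard residue) and the function names are assumptions made for this example.

```python
from collections import Counter

# The 20 standard amino acids, in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def aa_composition(sequence):
    """Count each standard amino acid in a protein sequence.

    Characters outside the 20 standard residues (e.g. '*' or 'X')
    are ignored in the counts.
    """
    counts = Counter(sequence.upper())
    return {aa: counts.get(aa, 0) for aa in AMINO_ACIDS}


def composition_tsv(proteins):
    """Build TSV text from a dict mapping identifier -> sequence.

    The header and column order are hypothetical, not the layout
    of the existing PomBase file.
    """
    lines = ["\t".join(["id", "length"] + list(AMINO_ACIDS))]
    for protein_id, seq in sorted(proteins.items()):
        comp = aa_composition(seq)
        row = [protein_id, str(len(seq))]
        row += [str(comp[aa]) for aa in AMINO_ACIDS]
        lines.append("\t".join(row))
    return "\n".join(lines) + "\n"
```

Calling `composition_tsv({"demo1": "MKKA"})` returns a header line followed by one row for the made-up identifier `demo1`.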

Note that we link to both of these files here:
https://www.pombase.org/downloads/protein-datasets

If we do decide the protein features file is worth keeping, we should:
i) remove the TMHMM predictions from it, because they are in their own file
ii) rename it to "protein domains and families"

Make sure all currently linked files are up to date.

ValWood changed the title from "Protein data" to "ftp: Protein data files (features and aa composition)" on Sep 3, 2024
kimrutherford added a commit to pombase/pombase-legacy that referenced this issue Sep 4, 2024
These files were being generated nightly but not updated in SVN:
  ftp_site/pombe/Protein_data/aa_composition.tsv
  ftp_site/pombe/Protein_data/PeptideStats.tsv
  ftp_site/pombe/Protein_data/Protein_Features.tsv

Refs: pombase/website#2221
@kimrutherford
Member

> make sure all currently linked files are up to date

They'll be updated nightly from Wednesday night. (I'll check on Thursday)

For now I haven't made any other changes.
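
A check that nightly files are actually being refreshed could be sketched as follows. This is only an illustration: the function name, the two-day threshold and the directory argument are assumptions, and it inspects modification times in a local checkout rather than the live FTP site.

```python
import os
import time


def stale_files(directory, names, max_age_days=2, now=None):
    """Return the names that are missing or older than max_age_days.

    `now` can be passed explicitly (seconds since the epoch) to make
    the check reproducible; it defaults to the current time.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    stale = []
    for name in names:
        path = os.path.join(directory, name)
        # A missing file counts as stale, as does one not modified recently.
        if not os.path.exists(path) or os.path.getmtime(path) < cutoff:
            stale.append(name)
    return stale
```

For example, `stale_files("ftp_site/pombe/Protein_data", ["aa_composition.tsv", "PeptideStats.tsv", "Protein_Features.tsv"])` would list any of the three files that have not been rewritten in the last two days.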

@kimrutherford
Member

> If we do decide the protein features file is worth keeping, we should
> i) remove tmhmms from it because they are in their own file

The TM domains aren't in that file anymore.

> ii) Rename it as protein domains and families

We renamed the gene page section to "Protein domains and features" so how about "protein_domains_and_features.tsv" as the file name?

@ValWood
Member Author

ValWood commented Nov 4, 2024

> so how about "protein_domains_and_features.tsv" as the file name?

Definitely!
