biosynthesis processes that are single-step in some organisms and multi-step in others #29672

Open
cmungall opened this issue Feb 6, 2025 · 5 comments

Comments

@cmungall
Member

cmungall commented Feb 6, 2025

Consider this process

id: GO:0009229
name: thiamine diphosphate biosynthetic process
namespace: biological_process
def: "The chemical reactions and pathways resulting in the formation of thiamine diphosphate, a derivative of thiamine (vitamin B1) which acts as a coenzyme in a range of processes including the Krebs cycle." [GOC:jl, ISBN:0140512713]
synonym: "thiamin diphosphate biosynthetic process" EXACT [GOC:cuators]
synonym: "thiamin pyrophosphate biosynthesis" EXACT []
synonym: "thiamin pyrophosphate biosynthetic process" EXACT []
synonym: "thiamine diphosphate anabolism" EXACT []
synonym: "thiamine diphosphate biosynthesis" EXACT []
synonym: "thiamine diphosphate formation" EXACT []
synonym: "thiamine diphosphate synthesis" EXACT []
synonym: "thiamine pyrophosphate biosynthesis" EXACT []
synonym: "thiamine pyrophosphate biosynthetic process" EXACT []
synonym: "TPP biosynthesis" EXACT []
synonym: "TPP biosynthetic process" EXACT []
intersection_of: GO:0009058 ! biosynthetic process
intersection_of: has_primary_output CHEBI:58937 ! thiamine(1+) diphosphate(3-)

the start point is undefined (formally or informally)
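
To make the point about the undefined start point concrete, here is a minimal Python sketch that reads an abridged copy of the stanza above and pulls out its logical definition. The parser is hand-rolled for illustration (it is not a real OBO library), and `has_primary_input` is an assumption about which relation would constrain a start point if one were declared.

```python
from collections import defaultdict

# Abridged copy of the stanza quoted above (synonyms and def omitted).
STANZA = """\
id: GO:0009229
name: thiamine diphosphate biosynthetic process
namespace: biological_process
intersection_of: GO:0009058 ! biosynthetic process
intersection_of: has_primary_output CHEBI:58937 ! thiamine(1+) diphosphate(3-)
"""


def parse_stanza(text):
    """Collect 'tag: value' lines into a dict of lists, dropping '! label' comments."""
    tags = defaultdict(list)
    for line in text.splitlines():
        if not line.strip():
            continue
        tag, _, value = line.partition(": ")
        tags[tag].append(value.split(" ! ")[0].strip())
    return dict(tags)


def logical_definition(tags):
    """Split intersection_of values into a genus and (relation, filler) pairs."""
    genus = None
    differentia = []
    for value in tags.get("intersection_of", []):
        parts = value.split()
        if len(parts) == 1:
            genus = parts[0]
        else:
            differentia.append((parts[0], parts[1]))
    return genus, differentia


tags = parse_stanza(STANZA)
genus, differentia = logical_definition(tags)
relations = {relation for relation, _ in differentia}

print("genus:", genus)
print("differentia:", differentia)
# Assumption: a start point would show up as a has_primary_input differentia.
print("start point constrained:", "has_primary_input" in relations)
```

Run as-is, this reports a genus of GO:0009058 and a single has_primary_output differentia, with nothing constraining where the process begins.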

Some euks, such as plants and yeast, have a whole pathway for this, and the annotations in these species look more or less like what you might expect (with a few annotations that seem to be indirect roles). In all cases the penultimate step is making thiamine, which is then turned into ThDP by TPK1.

But we mammals rely on other organisms to make thiamine, and our "thiamine diphosphate biosynthetic process (sensu stricto)" is trivially the activity of a single gene (TPK1). Ideally we would see only one human gene annotated to this "pathway" -- and in fact we do, if we look only at IBAs. But there is a lot of annotation of genes (THTPA, SLC19A2, SLC19A3, SLC25A19) that are related in some other way.

These other annotations make sense when we look at what Reactome calls "thiamin metabolism" https://reactome.org/PathwayBrowser/#/R-HSA-196819

[Image: Reactome pathway diagram for "thiamin metabolism" (R-HSA-196819)]

I think Reactome's use of the catch-all "metabolism" is right here, because really it's only one step here that is biosynthesis (TPK1).

IMO it's clear the PAINT annotators and other annotators are annotating to a different concept here.

My proposal here is to ban single-step pathways and restrict ThDP biosynthesis to bacteria, plants, yeast, etc. However, I do think it's useful to have a concept for what is shown in the Reactome diagram and retain the same genes. ThDP transport (no such term in GO) isn't quite right since it's really uptake-modification-transport. Metabolism is not wrong but we use metabolism as a general grouping.

This would be a bit of work and reannotation. But right now the concept we have in GO is being interpreted in at least two different ways which is not good.

@cmungall
Member Author

cmungall commented Feb 6, 2025

Aside: this is my technique for initial review:

runoak -i amigo:NCBITaxon:9606 associations GO:0009229 | llm -m 4o -t summarize-BP-annotations

Using this template
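
For anyone wanting to repeat this review on other terms or taxa, the one-liner above can be parameterized. The sketch below only builds the command string; it uses exactly the flags shown in the command above and nothing else. The function name `review_pipeline` is hypothetical, and it assumes `runoak` and `llm` are installed and that the `summarize-BP-annotations` template exists locally.

```python
import shlex


def review_pipeline(taxon_id, go_term, model="4o",
                    template="summarize-BP-annotations"):
    """Build the 'runoak | llm' shell pipeline for one taxon and one GO term."""
    fetch = ["runoak", "-i", f"amigo:{taxon_id}", "associations", go_term]
    summarize = ["llm", "-m", model, "-t", template]
    # shlex.join quotes each argument safely for a POSIX shell.
    return " | ".join(shlex.join(cmd) for cmd in (fetch, summarize))


command = review_pipeline("NCBITaxon:9606", "GO:0009229")
print(command)
```

The printed string can be pasted into a shell; swapping in a different taxon ID or GO term gives the same review for another species or process.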

Results:

The biological process in focus is the thiamine diphosphate biosynthetic process. Thiamine diphosphate (ThDP) is an essential cofactor for several enzymatic reactions critical to cellular metabolism. The gene products associated with this process play roles in either the synthesis or transport of components crucial for ThDP biosynthesis.

  1. TPK1: Thiamine pyrophosphate kinase 1 (TPK1) is directly involved in the biosynthesis of ThDP. It catalyzes the conversion of thiamine monophosphate to thiamine diphosphate through ATP-dependent phosphorylation, which is a direct step in the biosynthetic pathway of ThDP. Therefore, TPK1 plays a central role in the generation of functional ThDP from its precursor forms.

  2. THTPA: Thiamine triphosphatase (THTPA) regulates the levels of thiamine triphosphate (ThTP) and indirectly ensures the balance of ThDP. Although THTPA's primary function is not the direct synthesis of ThDP but hydrolyzes ThTP, its activity affects the overall pool and regulation of thiamine phosphate derivatives, assisting in the maintenance of sufficient ThDP levels for cellular metabolic needs.

  3. SLC19A3 and SLC19A2: These are members of the solute carrier family facilitating thiamine transport across cell membranes. Although they do not directly catalyze the biosynthesis of ThDP, they supply thiamine (vitamin B1) to cells, which is the precursor for ThDP synthesis. SLC19A3 and SLC19A2 are critical for providing adequate substrate availability intracellularly for TPK1 to convert into ThDP.

  4. SLC25A19: This transporter is involved in the exchange of mitochondrial thiamine with cytosolic thiamine phosphate derivatives. While its function is not directly described as synthesizing ThDP, the transport of these components is necessary for compartmentalizing reactions where ThDP is utilized or synthesized and ensuring appropriate distribution in the mitochondria and cytosol.

The intricate interplay of these gene products outlines the biological context of thiamine and ThDP within cellular metabolism, from uptake and transport to modification into active cofactor form. There's nothing particularly unclear about this model, but it is worth noting that the involvement of THTPA might be seen as more indirect unless explicitly considering organism-specific regulatory needs for ThDP and related metabolites. Overall, this path includes dependent and supportive roles rather than a strict linear A -> B style synthesis route.

@ValWood
Contributor

ValWood commented Feb 7, 2025

fission yeast GO-CAM curated by @PCarme
https://www.pombase.org/gocam/gene/66c7d41500000963/SPCC1223.02/nmt1

We would not annotate anything upstream of thiamine to thiamine diphosphate biosynthetic process, and these annotations are not included in the model (although we have lots of inferred annotations to this term which are upstream; I will query these if we agree).
https://www.pombase.org/term/GO:0009229

However, we currently have tnr3 annotated to thiamine salvage, which I'm not sure is quite right (I have only used salvage to apply to the output molecule; this would be salvage in the opposite direction).

I have been thinking that there is a case for single-step pathways for some functions. Quite often, we have 2 clearly modular pathways and a 'linking reaction'. There seems to be no other way to describe these, and it seems natural that the MF really is a single-step process.
For example, for some of the reactions providing activated sugars to glycosylation pathways, we decided on an editors' call a while back that they were not 'part of' the pathways they donate to, but the molecule is synthesized in a single step (and used by many pathways).

Also, since thiamine diphosphate is the active form of vitamin B1, wouldn't it seem weird not to have a biosynthesis term for it?

@pgaudet
Contributor

pgaudet commented Feb 7, 2025

I agree with Val: the "1 step = MF, > 1 step = BP" rule is misleading.

@deustp01

deustp01 commented Feb 7, 2025

> This would be a bit of work and reannotation. But right now the concept we have in GO is being interpreted in at least two different ways which is not good.

Except here is an argument, put out for sanity testing, that there are not really two different uses of a single term here.

Our prototrophic distant ancestors evolved a pathway to synthesize thiamine de novo from simple starting materials. More recent ancestors evolved the ability to consume prototrophs and their products, relieving the selective pressure to maintain the full, metabolically expensive, de novo pathway. As a result, many descendants including us have lost some or all of the molecular functions needed for the de novo process. But, as here, we have retained a few. I don't know the genomics in this case, but it's easy, for other recently truncated processes like purine catabolism, to find pseudogene relics of now-lost function steps, reinforcing the link between the current stub and the ancestral process.

Now, how do we label this few- or single-step stub that remains of the multistep de novo process? The founding statement for GO describes it as a "tool for the unification of biology" [over large evolutionary distances]. With that statement in mind, the present-day stub is still thiamine metabolism, viewed in terms of both present-day chemistry and evolutionary origin. We need a label that ties it to the large de novo process it is descended from. And setting some minimum amount of the original process that needs to be preserved (here, two or more functions) seems like tidy bookkeeping but not good evolutionary biology.

This is not an argument for allowing single-function processes in general, but only for preserving the "process" label in cases where evolution has partly dismantled an ancestral multi-step process.

@deustp01

deustp01 commented Feb 7, 2025

> The intricate interplay of these gene products outlines the biological context of thiamine and ThDP within cellular metabolism, from uptake and transport to modification into active cofactor form. There's nothing particularly unclear about this model, but it is worth noting that the involvement of THTPA might be seen as more indirect unless explicitly considering organism-specific regulatory needs for ThDP and related metabolites. Overall, this path includes dependent and supportive roles rather than a strict linear A -> B style synthesis route.

There is a pathway boundary issue here. I think that by default we treat metabolism / biosynthesis / catabolism of a chemical entity as including only the steps that transform a starting chemical into the final product or products, but not the steps that move the starting chemical to the place where the metabolism will happen, nor the steps that move the products away. Thus glycolysis starts with cytosolic glucose and ends with cytosolic pyruvate.

Should we try to be consistent about this? Maybe variation in this aspect of defining process specific boundaries is OK as long as there is consensus on how to define these various boundaries correctly / consistently in each case?
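
As a rough illustration of the default boundary rule described above (conversion steps are inside the process, transport steps are not), here is a small Python sketch applied to four of the human genes discussed in this thread. The `Step` type and the "transport" / "conversion" labels are assumptions made for this example, taken loosely from the role descriptions earlier in the issue. THTPA is left out because its role is described above as indirect and this simple rule does not settle it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    gene: str
    kind: str  # "transport" or "conversion" (illustrative labels, not GO terms)


# Roles as described earlier in the thread; the labels are a simplification.
HUMAN_STEPS = [
    Step("SLC19A2", "transport"),
    Step("SLC19A3", "transport"),
    Step("TPK1", "conversion"),
    Step("SLC25A19", "transport"),
]


def within_boundary(steps):
    """Genes whose step chemically transforms the input toward the product."""
    return [s.gene for s in steps if s.kind == "conversion"]


def outside_boundary(steps):
    """Genes whose step only moves a chemical to or from the site of metabolism."""
    return [s.gene for s in steps if s.kind == "transport"]


inside = within_boundary(HUMAN_STEPS)
outside = outside_boundary(HUMAN_STEPS)
print("inside boundary:", inside)
print("outside boundary:", outside)
```

Under these labels only TPK1 falls inside the boundary, which matches the single-gene view of the human process; a looser, per-process boundary would be modeled by changing which kinds count as inside.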

4 participants