Gene name aliases #37

Gregory94 · 2021-03-11T07:27:39Z

There has been some confusion about gene names between datasets generated with our workflow and files created by others. This is most likely caused by different naming conventions or gene aliases (i.e. the same gene can have multiple names).

One such difference is between the _pergene.txt files from the Kornmann lab workflow and those from our workflow. I compared the two using this Python script. It takes two _pergene.txt files as input and, for each file, creates a list of all gene names present in that file. It then looks for all genes that are in one list but not the other, and vice versa.
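For readers without the script at hand, the comparison described above can be sketched roughly as follows. This is not the script itself, only a minimal illustration of the idea. It assumes a _pergene.txt file is a delimited text file with one header line and the gene name in the first column; the function names `read_gene_names` and `compare_gene_sets` and the toy file contents are hypothetical.

```python
import io


def read_gene_names(handle, skip_header=True):
    """Collect the first whitespace-delimited field of every non-empty line.

    Assumption: the gene name is the first column and the first line is a header.
    """
    names = set()
    for line_number, line in enumerate(handle):
        if skip_header and line_number == 0:
            continue
        fields = line.split()
        if fields:
            names.add(fields[0])
    return names


def compare_gene_sets(names_a, names_b):
    """Return (genes only in a, genes only in b), each sorted alphabetically."""
    return sorted(names_a - names_b), sorted(names_b - names_a)


# Toy stand-ins for two _pergene.txt files (hypothetical contents and header).
ours = io.StringIO("gene\ttn_count\tread_count\nMRX3\t5\t40\nBOL3\t2\t11\nCDC42\t0\t0\n")
theirs = io.StringIO("gene\ttn_count\tread_count\nYBL095W\t5\t40\nAIM1\t2\t11\nCDC42\t0\t0\n")

only_ours, only_theirs = compare_gene_sets(read_gene_names(ours), read_gene_names(theirs))
print("Only in our file:  ", only_ours)
print("Only in their file:", only_theirs)
```

With real files, the `io.StringIO` objects would be replaced by `open(path)` handles. Using sets rather than lists makes the two-way difference a single operation in each direction.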

I found 80 genes that differ between the Kornmann files and our files. I checked all of them and noticed that the difference was either a different naming convention (e.g. we use MRX3 whilst they use YBL095W, which are two names for the same gene) or an alias (e.g. we use BOL3 whilst they use AIM1, again both referring to the same gene).

When comparing data files from different sources that include gene names, be aware that the same gene may appear under different names.

This issue can be solved using the Yeast_Protein_Names.txt file, which stores all the different names for each gene.
Alternatively, you can use the genomicfeatures_dataframe.py script, which creates a Python dataframe that includes, for each gene, its aliases and its names under the different naming conventions (it also uses the Yeast_Protein_Names.txt file).
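As a rough illustration of the idea (not the way genomicfeatures_dataframe.py does it), gene names from two sources can be mapped onto one canonical name before comparing. The alias groups below are hard-coded from the two examples in this issue. In practice they would be read from Yeast_Protein_Names.txt, whose layout is not assumed here. The names `build_alias_index` and `normalise`, and the choice of the first entry in each group as canonical, are hypothetical.

```python
def build_alias_index(alias_groups):
    """Map every known name (upper-cased) to the first name of its group."""
    index = {}
    for group in alias_groups:
        canonical = group[0]
        for name in group:
            index[name.upper()] = canonical
    return index


def normalise(names, index):
    """Replace each name by its canonical form; unknown names are kept as-is."""
    return {index.get(name.upper(), name) for name in names}


# Hypothetical alias groups taken from the examples in this issue.
ALIAS_GROUPS = [
    ["YBL095W", "MRX3"],
    ["BOL3", "AIM1"],
]

index = build_alias_index(ALIAS_GROUPS)
ours = normalise({"MRX3", "BOL3", "CDC42"}, index)
theirs = normalise({"YBL095W", "AIM1", "CDC42"}, index)

# After normalisation the two sets should no longer differ.
print(sorted(ours ^ theirs))
```

Names missing from the alias table pass through unchanged, so they still show up as differences and can be checked by hand.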

Gregory94 · 2021-03-11T10:20:42Z

Important note when using Yeast_Protein_Names.txt:
There has been a major update to the gene names and aliases in Yeast_Protein_Names.txt.
More gene names are present and some genes have updated aliases.
The updated file is on the master branch.
