allow/convert spaces inside relationships in column 16? #28

Open
austinmeier opened this issue Feb 1, 2017 · 2 comments
Comments

@austinmeier

@cmungall is there a way to allow spaces in column 16?

We have lots of germplasm annotations that receive the relationship "has_phenotype_score()", and the string that gets pulled from the Samara scrape file often has spaces in it. For example:

"Culm_diameter_(mm)_of_basal_internode_at_repro.=6"

Currently I am simply replacing spaces and other illegal characters with "_". It works, but it looks really bad when displayed on the browser. I was wondering if there is a way to allow spaces when the string containing them falls inside the "( )" of the relationship, or if there is something I can replace the spaces with that would be converted back to spaces when viewing them in the browser (think "%20" in URLs). In the above example, column 16 of the GAF would look like:

has_phenotype_score(Culm%20diameter%20(mm)%20of%20basal%20internode%20at%20repro.=6)

That way the browser would display this as:

has phenotype score Culm diameter (mm) of basal internode at repro.=6

Let me know if this is not clear.
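
A minimal Python sketch of the percent-encoding idea proposed above, assuming the writer and the browser both agree on URL-style escaping. The helper names `encode_extension` and `display_extension` are hypothetical and not part of any GAF tool or of AmiGO; the sketch only illustrates the round trip being asked about.

```python
from urllib.parse import quote, unquote

def encode_extension(relation, value):
    """Build a column-16 style expression, percent-encoding the value.

    Parentheses and '=' are kept literal (assumption: they are acceptable
    inside the expression); spaces become %20.
    """
    return f"{relation}({quote(value, safe='()=')})"

def display_extension(expr):
    """Turn an encoded expression back into a human-readable label."""
    relation, _, rest = expr.partition("(")
    # Drop the closing parenthesis that belongs to the relation wrapper.
    value = rest[:-1] if rest.endswith(")") else rest
    return f"{relation.replace('_', ' ')} {unquote(value)}"

encoded = encode_extension(
    "has_phenotype_score",
    "Culm diameter (mm) of basal internode at repro.=6",
)
print(encoded)
print(display_extension(encoded))
```

The first print reproduces the encoded column 16 string shown above, and the second reproduces the display form.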

@cmungall
Member

cmungall commented Feb 1, 2017

> is there a way to allow spaces in column 16?

No

You're already abusing poor col16 enough!

> it looks really bad when displayed on the browser.

This is all solved if we make sure there is an ontology providing labels (and ideally defs) for all relations used.

The underlying storage in amigo actually uses the RO ID. The c16 format is already slightly hacky in that it allows the use of what are called 'shorthand' IDs. This is exactly analogous to what you see in obo files:

id: PO:0000002
name: anther wall
relationship: part_of PO:0009066 ! anther

...

[Typedef]
id: part_of
name: part_of
xref: BFO:0000050 ! magic xref
is_transitive: true

you'll notice in the OWL the part_of is gone, it's just the URI and the label (name).
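
To make the shorthand mechanism concrete, here is a rough Python sketch that reads `[Typedef]` stanzas and maps each shorthand ID to its xref. It is a toy parser written for this example, not how AmiGO or any OBO library does it; the function name `shorthand_to_xref` is hypothetical, and the `[Term]` header in the sample text is added on the assumption that the first stanza is a term.

```python
OBO_SNIPPET = """\
[Term]
id: PO:0000002
name: anther wall
relationship: part_of PO:0009066 ! anther

[Typedef]
id: part_of
name: part_of
xref: BFO:0000050 ! magic xref
is_transitive: true
"""

def shorthand_to_xref(obo_text):
    """Map shorthand relation IDs to the ID given in their xref tag."""
    mapping = {}
    # Stanzas are assumed to be separated by blank lines.
    for block in obo_text.strip().split("\n\n"):
        lines = [ln.strip() for ln in block.splitlines() if ln.strip()]
        if not lines or lines[0] != "[Typedef]":
            continue
        tags = {}
        for ln in lines[1:]:
            tag, _, value = ln.partition(":")
            # Text after '!' is a trailing comment in OBO format.
            tags.setdefault(tag.strip(), value.split("!")[0].strip())
        if "id" in tags and "xref" in tags:
            mapping[tags["id"]] = tags["xref"]
    return mapping

print(shorthand_to_xref(OBO_SNIPPET))
```

With the snippet above, `part_of` resolves to `BFO:0000050`, which is the kind of lookup that lets the stored annotation carry the real ID while the label comes from the ontology.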

Now, I don't think we want Culm_diameter_(mm)_of_basal_internode_at_repro in RO. You can make a private ontology.

But we may want to explore another pattern

e.g.

has_measurement(FOO:nn),has_unit(UO:nn),has_value(6)

where FOO:nn is scale-independent

or

has_measurement(FOO:nn),has_value(6)

where FOO:nn has the scale

(and FOO:nn may be a CO class)
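
As an illustration of why the decomposed pattern is easier to handle, here is a small Python sketch that splits such a column 16 string into relation/filler pairs. The function name `parse_extensions` is hypothetical, `FOO:nn` and `UO:nn` are the placeholder IDs from the examples above, and the regular expression assumes fillers never contain parentheses.

```python
import re

# One extension looks like relation(filler); the filler may not contain
# parentheses (an assumption of this sketch).
EXTENSION = re.compile(r"(\w+)\(([^()]*)\)")

def parse_extensions(col16):
    """Return the (relation, filler) pairs found in a column 16 string."""
    return [(m.group(1), m.group(2)) for m in EXTENSION.finditer(col16)]

pairs = parse_extensions("has_measurement(FOO:nn),has_unit(UO:nn),has_value(6)")
for relation, filler in pairs:
    print(relation, filler)
```

Note that a filler such as Culm_diameter_(mm)_of_basal_internode_at_repro.=6 would not fit this simple pattern because of the nested "(mm)", which is one more reason to move the unit into its own has_unit() relation.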

@austinmeier
Author

Oh I've been abusing column 16 since I got started on this GAF business!!

We intentionally added the "has_phenotype_value()" relationship in the TO so that we could have it displayed, but due to the insane variation in the "values" that we are pulling, between Samara scraping GRIN and IRRI's GRIMS database, using a pre-composed pattern such as the one you've suggested becomes rather labor intensive.
I will look into doing something similar for the Samara scrape, as the data for that seems to be relatively "uniform".

If using the pattern suggested, can FOO:nn be the GRINDescr:nn for each grin descriptor? Because that might actually work out quite nicely for the GRIN data.

I'll do some poking around, and see what I find. The main issue I see is that the scales used in GRIN are not CO scales (or at least not exactly.) Perhaps Jorrit may be able to scrape the Descriptors and their respective scales, and we could whip up an internal ontology to support the phenotypes in GRIN... (just thinking out loud.)
