How do I capture known antigen/epitope reactivity to AIRR objects (Rearrangement/Cell) #781
Discussion starts here: |
From the call:
Reactivity:
Rearrangement:
|
PR with initial attempt created in #784 Need to look at For For |
This might be silly, but... By "level of annotation" I mean capturing how complete the match is between the entity in the ADC and the entity in the external repository (e.g. IEDB). It is kind of a quality score that describes how exact the match is between the two repositories. In IEDB we can have receptors with full v domain for both chains (a In the ADC we can have a similar range of completeness for When annotating a |
From the meeting:
Drop the old fields:
Create an example for this for an "observed" rearrangement that was observed in IEDB for discussion. |
Here is an example of using |
Example of annotated
This is the beta chain of a known receptor in IEDB: https://www.iedb.org/receptor/182992 which is specific to this epitope: https://www.iedb.org/epitope/1616345 My equivalence criterion for annotation is that the CDR3, V and J calls must be an exact match. That holds here. I want to annotate this so I create a
I then set, for this Rearrangement, Rearrangement.reactivity_id = XXXXXXX. If I have another Rearrangement that matches, I can also set its Rearrangement.reactivity_id = XXXXXXX. The only problem I see is that there is no way for me to differentiate this I would still lean to having |
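To make the matching step concrete, here is a minimal sketch of the exact-match annotation described above. It assumes Rearrangements are plain dicts keyed by the AIRR field names `junction_aa`, `v_call` and `j_call`, and that `junction_aa` is the field compared against the IEDB CDR3 (an assumption, since CDR3 and junction boundaries can differ). The helper names and the sequence values are hypothetical and invented for illustration; they are not the values of IEDB receptor 182992.

```python
# Hypothetical placeholder identifier, as in the comment above; not a real ID.
REACTIVITY_ID = "XXXXXXX"

# Fields that must match exactly (assumed mapping: IEDB CDR3 -> junction_aa).
MATCH_FIELDS = ("junction_aa", "v_call", "j_call")


def is_exact_match(rearrangement, known_chain):
    """Return True only if every match field is present and identical."""
    return all(
        rearrangement.get(field) is not None
        and rearrangement.get(field) == known_chain.get(field)
        for field in MATCH_FIELDS
    )


def annotate(rearrangements, known_chain, reactivity_id):
    """Set reactivity_id on each matching rearrangement; return the match count."""
    matched = 0
    for rearrangement in rearrangements:
        if is_exact_match(rearrangement, known_chain):
            rearrangement["reactivity_id"] = reactivity_id
            matched += 1
    return matched


# Invented toy data, for illustration only.
known_chain = {"junction_aa": "CASSEXAMPLEF", "v_call": "TRBV1*01", "j_call": "TRBJ1-1*01"}
rearrangements = [
    {"sequence_id": "r1", "junction_aa": "CASSEXAMPLEF", "v_call": "TRBV1*01", "j_call": "TRBJ1-1*01"},
    {"sequence_id": "r2", "junction_aa": "CASSEXAMPLEF", "v_call": "TRBV2*01", "j_call": "TRBJ1-1*01"},
    {"sequence_id": "r3", "junction_aa": "CASSEXAMPLEF", "v_call": "TRBV1*01"},
]
matched_count = annotate(rearrangements, known_chain, REACTIVITY_ID)
```

Every matching Rearrangement ends up pointing at the same `reactivity_id`, which is the many-to-one relationship described in the comment; nothing on the Rearrangement itself records which matching criteria were used.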
FYI, in the T1D repository, there are 21 If I was to fully annotate this repository for that reactivity record, I would have 21 I should also note that there are 347 such If you flip this around, what we have done is taken one There are 180297 Who knows what the real frequency per IEDB |
One other quick observation, the alpha chain for the IEDB In reality each repository would have a different |
For completeness, here is an example of a
This is what we would have stored previously, but with the suggestion that we don't store these things any more, we lose this information.
This seems important to me, are we sure we want to drop this? |
My suggestion is to consider We would need to figure out names for the two fields that are Thoughts? |
IIRC, our logic from the last call was there were certainly valid use cases for this information, but that we didn't want to hassle with enumerating them all because there's quite a few experimental approaches we'd need to accommodate if we went down that road. So we're trading schema completeness for tractability.
I think Could |
Based on discussion at the meeting today, I think we are leaning to have Just like if using an inference algorithm that works on some closeness criteria of the CDR3 to predict reactivity, we are using an algorithm with some matching criteria to a Receptor in IEDB to predict reactivity based on the Receptor/Reactivity data in IEDB. I think this works. |
Haven't thought of that to date, but... Since we are documenting inference of a I think this is a pretty extreme edge case? I suppose that my hope would be that any Receptor/Reactivity evidence found in a study in the ADC would be curated into a repository like IEDB, where other evidence for that Receptor/Reactivity would reside. There is a curation process and methodology around that, where evidence is gathered through multiple studies and assays to support that Receptor/Reactivity. When storing an inferred reactivity in the ADC, by using an external resource such as IEDB, we get stronger evidence to support the inference. With that said, since the ADC is storing Reactivity, your use case should not be considered out of the question I suppose. |
I would almost suggest that the flow might go the other way. If a study stored in the ADC detects a Once it is curated in IEDB, then any |
Small steps I suppose. 8-) We have in the past used a string to keep such fields flexible, with documentation on recommended or possible examples to guide users in the right direction. That way we don't need to be complete, but we can still capture the data. One possibility... 8-) If we don't have something like this, when we (iReceptor) load these data we will probably just do this ourselves by storing the data in custom internal iReceptor fields. We don't want to skip this data while loading everything else, and then try to load it piecemeal after we come up with a mechanism to do it properly 8-) It is easier for us to store these fields as our best guess and then map/convert them later if necessary... So look for the fields `ir_reactivity_method`, `ir_reactivity_measure`, `ir_reactivity_value`, `ir_reactivity_unit` in our repositories 8-) |
If I understand correctly, you are asking if the ADC had published epitope specific experimental data that was not yet in the IEDB, we (the IEDB) could use the ADC to identify and curate it? If so, yes, as long as the data is connected to its PMID.
|
I think the crux of what remains is what we do with these fields. The discussion above seems to imply that our meetings came to the conclusion that these should be removed from the So it seems wrong to me to remove the fields we decided we needed for Cell->Reactivity interactions because we are adding Rearrangement->Reactivity interactions. If this is the case, then I would suggest that we need a different way to link Rearrangements to Reactivity rather than dumb down our Reactivity object. |
This pull request is currently adding a minor field to Rearrangement but removing important functionality from Reactivity - which was not what was intended for this issue and the related pull request. The original intent for If we can't come up with another solution, I would rather drop If we need to come up with a different way of capturing inferred reactivity then I would prefer some other mechanism over changing Reactivity as we are discussing here. |
* Initial changes for this issue See #781
* Undo mistaken change.
* Change reference to IEDB epitope as an example
* Change examples to be internally more consistent
* Sync schemas.
* Add reactivity_id to v3 spec
* Fix nullable
* Sync v2 and v3 specs
* Fix nullable
* Change reactivity fields from enum to string reactivity_readout and reactivity_method
* Change reactivity fields from enum to string reactivity_readout and reactivity_method
* Sync
* Add reactivity_ref As per discussion here: #784 (comment)
* Add IEDB CURIE info
* Add reactivity_ref Add CURIE info for IEDB
* Sync python and R
* Change case of UNIPROT CURIE
* Sync
* Add IEDB providers.
* Add IEDB providers.
* Sync schemas
* Add provider links in CURIE map
* Schema consistency updates
* Consistency check
* Sync open api 3 version of spec
* Add second URL to IEDB_RECEPTOR provider.
* Change url to be an array for IEDB_RECEPTOR
* Change quotes - yaml parser is interpreting ? in string as a special character
Moving discussion of linking AIRR objects of known epitope/antigen reactivity/specificity to this issue and out of #776