Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Inconsistency in cvterm table when setting dbxref_id on delete set null #121

Open
dreyes17 opened this issue Sep 17, 2021 · 2 comments
Open

Comments

@dreyes17
Copy link

Hello,

here is the table cvterm. The column dbxref_id couldn't be null nut the foreign key is set to on delete set null. I have checked other tables with dbxref_id and they can be null such as the table feature for example. But in this case I think that having a cvterm without knowing its provenance doesn't make sense. Do you agree? So for me it should be on delete cascade rather than making dbxref_id nullable.

create table cvterm (
cvterm_id bigserial not null,
primary key (cvterm_id),
cv_id bigint not null,
foreign key (cv_id) references cv (cv_id) on delete cascade INITIALLY DEFERRED,
name varchar(1024) not null,
definition text,
dbxref_id bigint not null,
foreign key (dbxref_id) references dbxref (dbxref_id) on delete set null INITIALLY DEFERRED,

is_obsolete int not null default 0,
is_relationshiptype int not null default 0,
constraint cvterm_c1 unique (name,cv_id,is_obsolete),
constraint cvterm_c2 unique (dbxref_id)
);

There are also some tables where you explicitly put null even if it is initialized to that value by default so I don't know if you are missing the not or not. For instance:

create table featuremap (
featuremap_id bigserial not null,
primary key (featuremap_id),
name varchar(255),
description text,
unittype_id bigint null,
foreign key (unittype_id) references cvterm (cvterm_id) on delete set null INITIALLY DEFERRED,
constraint featuremap_c1 unique (name)
);

Maybe you can give an overview to the nulls.

Thank you

@scottcain
Copy link
Member

Hi @dreyes17

This requires some serious code archeology, as the decisions on this table structure happened at least 15 years ago. Making cvterm.dbxref_id not null was the more recent item and it does seem to conflict with the on delete set null. Basically, I recall that the thinking on not allowing cascading delete is that the cvterm table is so integrated into everything in the rest of the schema that allowing a cascading delete was too likely to cause havoc. I'm honestly not sure what would happen when someone deleted a dbxref_id in this case. If I had time right now I'd create a test database just to find out (I'll make a note to try later). The fact that dbxref_id was allowed to be null in the first iterations of the schema is not too surprising: the idea was that, yes, if you were loading GO you'd want the dbxref but if you were just creating a simple ad hoc vocabulary then why would you need a reference? (That thinking was later changed to be more like "no matter what you're creating, document it in some way.")

For featuremap (which I don't really think gets used much these days--I think it's original idea was to support CMAP), the unittype I think is supposed to be a simple vocabulary with items like "base pair" and "centimorgans" as items. I think again, the allowing it to be nulled if the cvterm is deleted is to keep a whole bunch of data associated with the map from being deleted should somebody delete the unit term.

Does this make sense and/or help?

@dreyes17
Copy link
Author

Yes I understand, if I delete a cvterm record I could destroy several records such as features or companalysis in the database. My issue was actually that I was using this docker image which I guess is unofficial and I realized that cv_relationships such as i_a were not already inserted for the SO ontology. I need that inheritance because I am introducing the EDAM and IOBC ontologies that have some terms mapped with the SO. My thoughts were to delete the SO ontology and introduce it again using the XORT to have the cv_relationships. I think I can still do it by using an alter table but is not optimal.

It also surprised me that postgres didn't raise an exception when creating the cvterm table. Maybe using an on delete restrict would make more sense, but I am not sure.

Thank you for the rapid response

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants