Solution to frequently missing taxonomy specifications in UK submissions #112
#76 is related |
Hi there - apologies for my lack of knowledge of this, but for UK data I am frequently unable to parse statements because of:
Was this fixed? I assume it's a problem with the UK data. Thanks very much |
Hey, no worries. But finding some solution to this problem is in the top 3 list of fixes I want to implement in the next month. Will keep you updated. |
@manusimidt I realise that this is provided for free and it's an excellent solution, but I wonder if the UK aspect is any closer to completion? I am close to completing a research report and would ideally like to use this (with attribution!) |
The issue I still have with UK submissions is that many of them do not properly import their underlying taxonomies. If I could find the taxonomy schema files on the web, it would be pretty easy to implement. For example, the taxonomy endpoints for US submissions are pretty straightforward and easy to find:
If I had the same table for the UK taxonomies, I could easily fix this issue. However, I can't find proper taxonomy endpoints for the taxonomies where the parser fails:
So if I find the missing taxonomy schema files on the web (and complete the table for all the years), it would be relatively easy to fix this issue. |
@manusimidt does this page contain what you need? https://www.frc.org.uk/accountants/frc-taxonomies#current-taxonomies-downloads or Realise they're not end points but have the xsd files |
https://uk-taxonomies-tdp.corefiling.com/yeti/resources/yeti-gwt/Yeti.jsp https://www.frc.org.uk/accountants/frc-taxonomies#current-taxonomies-downloads
But this would not even solve the full problem, since their site provides only the FRC taxonomies (as far as I can see). I still don't know which taxonomies the other prefixes and namespaces belong to (e.g. "dpl-countries", "dpl-frc", ...). |
On your last point, I think (but admit I don't know much about this) countries and frc are in the zip folders, but named slightly differently. But yeah, I see your point. How frustrating! I can try and raise this with Companies House. |
Was there any progress on this with Companies House? I'm currently running into the same issue. They are in zip format, but could we put them in a public S3 bucket or a dedicated GitHub project and map them to the right namespace? |
Are these any of the missing Schema URLs?
This also works for previous years ie |
Hi @manusimidt - what is needed for me to create a PR with these additional taxonomies? Are you able to link to a similar PR demonstrating what's needed? Happy to have a go at getting these added (and any more I can find). |
@Cave-Johnson thanks for researching and providing the links. They indeed look promising! I am not aware of any PR which addressed this problem. Keep in mind that most of the errors are caused not because the schema URL of the taxonomy is missing but because both the schema URL and the namespace of a given taxonomy are missing. This creates for example the rather frequent error Here a simple map between the namespace and schema URL is not enough since the namespace
|
Another option would be that the parser just throws a warning and just ignores all facts that were tagged with the taxonomy which could not be imported. However, this could lead to cascading problems since the parser for example also parses XBRL footnotes. If the fact which the footnote references is not present due to missing taxonomy then the assigning of the footnote to the fact will also fail. |
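To make the cascading footnote problem concrete, here is a minimal sketch of how footnote assignment could tolerate missing facts. It is not py-xbrl code: the function name `attach_footnotes`, the dict-based fact representation and the `(fact_id, text)` footnote tuples are all assumptions made for illustration.

```python
def attach_footnotes(facts_by_id, footnotes):
    """Attach footnote texts to facts; return the footnotes that had no target.

    facts_by_id: dict mapping a fact id to a dict describing the fact (hypothetical shape).
    footnotes:   list of (fact_id, text) tuples (hypothetical shape).
    """
    orphaned = []
    for fact_id, text in footnotes:
        fact = facts_by_id.get(fact_id)
        if fact is None:
            # The referenced fact was skipped (e.g. its taxonomy could not be
            # located), so remember the footnote instead of raising an error.
            orphaned.append((fact_id, text))
            continue
        fact.setdefault("footnotes", []).append(text)
    return orphaned


facts = {"f1": {"concept": "core:Turnover", "value": "100"}}
notes = [("f1", "Restated."), ("f2", "Refers to a fact that was skipped.")]
orphans = attach_footnotes(facts, notes)
print(facts["f1"]["footnotes"], orphans)
```

Collecting the orphaned footnotes rather than discarding them silently would let a caller see what was lost.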
Probably the last option (to just ignore facts tagged with taxonomies that cannot be located) would be the best one. In the last two years 90% of my time I invested in I also tried to formulate this in #84. I hope that I have some time over Christmas to dive into this. In the meantime, any PRs or suggestions are highly appreciated :) |
I just ran a test with new UK submissions and noticed that the schema URL for the following namespace was often missing:
@Cave-Johnson already found the DPL taxonomy for 2023: https://xbrl.frc.org.uk/dpl/2023-01-01/dpl-2023-01-01.xsd |
I'll do some digging and see what I can find
From a brief search, this might be a good source for UK accepted taxonomies:
https://www.gov.uk/government/publications/taxonomies-accepted-by-hm-revenue-and-customs/taxonomies-accepted-by-hmrc
On Tue, 21 Nov 2023, 22:21 Manuel Schmidt wrote:
I just ran a test with new UK submissions and noticed that the schema URL for the following namespace was often missing:
The taxonomy with namespace http://www.hmrc.gov.uk/schemas/ct/dpl/2021-01-01 could not be found
@Cave-Johnson already found the DPL taxonomy for 2023: https://xbrl.frc.org.uk/dpl/2023-01-01/dpl-2023-01-01.xsd
If we find a 2021 version these errors simply could be resolved for now by adding it to the mapping: https://github.com/manusimidt/py-xbrl/blob/485a428e74425d7bbdd5aa632d0310955166e6f6/xbrl/taxonomy.py#L31
|
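For anyone wondering what "adding it to the mapping" could look like, here is a rough sketch of a namespace-to-schema-URL fallback. It is not the mapping from `xbrl/taxonomy.py`: the names `EXTRA_NS_SCHEMA_MAP` and `resolve_schema_url` are hypothetical, and the URL in the map is a deliberate placeholder, since no confirmed 2021 DPL schema location has been established in this thread.

```python
from typing import Dict, Optional

# Hypothetical fallback map: namespace -> schema URL.
# The URL is a placeholder, NOT a real taxonomy location.
EXTRA_NS_SCHEMA_MAP: Dict[str, str] = {
    "http://www.hmrc.gov.uk/schemas/ct/dpl/2021-01-01":
        "https://example.invalid/dpl-2021-01-01.xsd",
}


def resolve_schema_url(
    namespace: str,
    declared_location: Optional[str] = None,
    fallback_map: Dict[str, str] = EXTRA_NS_SCHEMA_MAP,
) -> Optional[str]:
    """Prefer the location declared in the filing; otherwise consult the map."""
    if declared_location:
        return declared_location
    # Returns None when the namespace is unknown, so the caller can decide
    # whether to raise an error or to skip the affected facts.
    return fallback_map.get(namespace)


print(resolve_schema_url("http://www.hmrc.gov.uk/schemas/ct/dpl/2021-01-01"))
print(resolve_schema_url("urn:unknown-namespace"))
```

Note that such a map only helps when the filing declares the namespace at all; it does nothing for submissions that use a prefix without any namespace declaration.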
After a bit more digging, I have found the following: https://www.hmrc.gov.uk/schemas/ct/combined/2021-01-01/dpl-2021-frs-102-2021-v1.0.0.xsd which led me to
Are these what was missing? |
I've been having the same issue recently when parsing Companies House docs and was wondering if there was any update on this, please? |
I have tried it with the taxonomies provided by @Cave-Johnson, however, unfortunately, they did not work for all of the failed submissions I tested on. |
I think the only viable option is to build in some "fail gracefully" mode into py-xbrl in which the parser returns a partially parsed xbrl instance document and just omits every fact/context that is labeled/uses a concept from a taxonomy which could not be located. |
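A minimal sketch of what such a "fail gracefully" pass might look like, assuming facts are already available as simple objects carrying a `prefix:LocalName` concept name. The `Fact` class, the function `split_by_resolvable_taxonomy` and the logger name are hypothetical and are not part of py-xbrl.

```python
import logging
from dataclasses import dataclass
from typing import Iterable, List, Set, Tuple

logger = logging.getLogger("partial_xbrl_sketch")  # hypothetical logger name


@dataclass(frozen=True)
class Fact:
    concept: str  # QName written as "prefix:LocalName"
    value: str


def split_by_resolvable_taxonomy(
    facts: Iterable[Fact], resolvable_prefixes: Set[str]
) -> Tuple[List[Fact], List[Fact]]:
    """Return (kept, skipped): facts whose taxonomy was located vs. not."""
    kept: List[Fact] = []
    skipped: List[Fact] = []
    for fact in facts:
        prefix = fact.concept.split(":", 1)[0]
        if prefix in resolvable_prefixes:
            kept.append(fact)
        else:
            # Warn instead of aborting the whole parse.
            logger.warning("Skipping %s: taxonomy for prefix %r not located",
                           fact.concept, prefix)
            skipped.append(fact)
    return kept, skipped


facts = [Fact("core:Turnover", "100"), Fact("dpl-frc:SomeConcept", "x")]
kept, skipped = split_by_resolvable_taxonomy(facts, {"core"})
print(len(kept), len(skipped))
```

Returning the skipped facts alongside the kept ones makes it explicit that the result is partial, which matters for anyone relying on the data being complete.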
Understand your point re the fail gracefully logic, and that makes sense. It would still be good to get the taxonomies linked where possible. I've done a bit more digging and come across the attached that references dpl countries in the |
@manusimidt Thanks for all the looking into this! I was just wondering if @Cave-Johnson's solution could have helped? |
Many UK submissions are missing a proper taxonomy declaration.
They just use the prefix without specifying the corresponding namespace and/or do not reference the taxonomy schema file.
Currently, the parser is not able to handle those issues, since it does not know where this taxonomy is located.
This issue occurs frequently with the following prefixes:
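As a rough way to see which prefixes a given document uses without declaring them, the following sketch scans an inline XBRL-like document with the standard library. It is an illustration only: the function `undeclared_prefixes`, the sample document and the concept names in it are made up, and for simplicity it treats a prefix as declared if it is declared anywhere in the document, ignoring XML scoping rules.

```python
import io
import xml.etree.ElementTree as ET
from typing import Set


def undeclared_prefixes(xml_text: str) -> Set[str]:
    """Prefixes used in `name="prefix:Local"` attributes but never declared via xmlns."""
    declared: Set[str] = set()
    used: Set[str] = set()
    source = io.BytesIO(xml_text.encode("utf-8"))
    for event, payload in ET.iterparse(source, events=("start-ns", "start")):
        if event == "start-ns":
            prefix, _uri = payload  # payload is a (prefix, uri) tuple
            declared.add(prefix)
        else:
            name = payload.get("name", "")  # payload is the element
            if ":" in name:
                used.add(name.split(":", 1)[0])
    return used - declared


# Made-up sample: "core" is declared, "dpl-frc" is used but never declared.
SAMPLE = """<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:ix="http://www.xbrl.org/2013/inlineXBRL"
      xmlns:core="http://example.invalid/core">
  <body>
    <ix:nonNumeric name="core:EntityName" contextRef="c1">Example Ltd</ix:nonNumeric>
    <ix:nonNumeric name="dpl-frc:SomeConcept" contextRef="c1">x</ix:nonNumeric>
  </body>
</html>"""

print(undeclared_prefixes(SAMPLE))
```

Running something like this over a batch of failing submissions could help build the list of affected prefixes.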