Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Check and debug inter-table and inter-dimensional calculations for both exploded tables #2703

Closed
3 tasks done
Tracked by #2016
e-belfer opened this issue Jun 28, 2023 · 7 comments · Fixed by #2890
Closed
3 tasks done
Tracked by #2016
Assignees
Labels
data-repair Interpolating or extrapolating data that we don't actually have. ferc1 Anything having to do with FERC Form 1 rmi

Comments

@e-belfer
Copy link
Member

e-belfer commented Jun 28, 2023

In #2669 we added inter-table and inter-dimensional inter-table calculations into the process of the table explosion. We should look at these calculations and debug signs, component names etc. to improve calculation accuracy.

Tasks

  1. 4 of 4
    data-repair ferc1 rmi
    cmgosnell
  • Update accumulated_depreciation component to in_service plant status for depreciation_utility_plant_in_service
  • add total plant_status for construction_work_in_progress link between balance_sheet_assets_ferc1 and utility_plant_summary_ferc1
  • add total plant_status for the accumulated_provision_for_depreciation_amortization_and_depletion_of_plant_utility link between balance_sheet_assets_ferc1 and utility_plant_summary_ferc1
@e-belfer e-belfer added ferc1 Anything having to do with FERC Form 1 data-repair Interpolating or extrapolating data that we don't actually have. rmi labels Jun 28, 2023
@cmgosnell cmgosnell moved this from New to Backlog in Catalyst Megaproject Jul 3, 2023
@e-belfer e-belfer changed the title Check and correct inter-table and inter-dimensional calculations for both exploded tables Check and debug inter-table and inter-dimensional calculations for both exploded tables Jul 12, 2023
@cmgosnell cmgosnell assigned e-belfer and unassigned cmgosnell Jul 17, 2023
@cmgosnell cmgosnell self-assigned this Jul 31, 2023
@zaneselvans zaneselvans moved this from Backlog to In progress in Catalyst Megaproject Aug 14, 2023
@e-belfer e-belfer moved this from In progress to Backlog in Catalyst Megaproject Aug 28, 2023
@e-belfer e-belfer moved this from Backlog to In progress in Catalyst Megaproject Aug 29, 2023
@e-belfer
Copy link
Member Author

e-belfer commented Sep 25, 2023

First round of investigations from August:
Balance sheet assets
depreciation_utility_plant_in_service_correction: 3873

  • 3026 of these have a relative difference of 1 (i.e. the component is duplicated).
  • Fixed plant status on component to "in service" - reduced to 3841 wrong.

utility_plant_in_service_classified_and_property_under_capital_leases_correction: 3530

utility_plant_net_correction: 2353

  • 1:1 link from balance sheet assets to utility plant summary tables to remove duplicated calculation. There isn't a utility type for the parent factoid in the calculation components table, perhaps this is causing issues?

accumulated_provision_for_depreciation_amortization_and_depletion_of_plant_utility_correction: 2077

  • 1:1 link from balance sheet assets to utility plant summary tables to remove duplicated calculation. There isn't a utility type for the parent factoid in the calculation components table, perhaps this is causing issues?

utility_plant_and_construction_work_in_progress: 2061

  • 1:1 link from balance sheet assets to utility plant summary tables to remove duplicated calculation. There isn't a utility type for the parent factoid in the calculation components table, perhaps this is causing issues?

construction_work_in_progress_correction: 1904

  • 1:1 link from balance sheet assets to utility plant summary tables to remove duplicated calculation. There isn't a utility type for the parent factoid in the calculation components table, perhaps this is causing issues?
  • Added total to child factoid dimensions, to match other parallel examples.

Balance sheet liabilities
retained_earnings_correction: 159

  • We're comparing retained earnings from Current Year End of Quarter/Year Balance (Balance sheet liabilities table) and Current Quarter/Year Year to Date Balance (Retained earnings table). Might this have something to do with the differences?

Income statement
General observations: errors are relatively distributed across years and utilities.

operation_expense_correction: 3061

  • Tried removing power_production_expenses from sub-components (increases wrong calcs to 3410). This component sums operation + maintenance costs from all sub-sections and should be removed.

maintenance_expense_correction: 358

  • Manually reviewed all components, no obvious source of error.
  • Often appears just to match one maintenance component and not the other (e.g. just general maintenance, just distribution maintenance), so this is more likely user error than error on our end.

other_operating_revenues_correction: 217
depreciation_expense_correction: 163
administrative_and_general_expenses_correction: 102

@cmgosnell
Copy link
Member

cmgosnell commented Oct 10, 2023

Balance sheet asset explosion inter-table calc debugging notes:

Starting w/ this error rate:

balance_sheet_assets_ferc1: has 12015 (59.92%) records whose calculations don't match. Adding correction records to make calculations match reported values.

High level # of factoids in the inter-table reported factoids that are off from their calculated value from their sub-components:

xbrl_factoid
utility_plant_in_service_classified_and_property_under_capital_leases                 3741
utility_plant_net                                                                     2106
accumulated_provision_for_depreciation_amortization_and_depletion_of_plant_utility    2080
utility_plant_and_construction_work_in_progress                                       2073
construction_work_in_progress                                                         1906
depreciation_utility_plant_in_service                                                  488
utility_plant_in_service_plant_purchased_or_sold                                        35
Name: count, dtype: Int64

Image
^^ i removed utility_plant_in_service_classified_and_property_under_capital_leases for reasons listed below

utility_plant_in_service_classified_and_property_under_capital_leases

In trying my little debugging test on the calcs that checks whether the absolute difference between the reported and calculated values are the exact same value of other components I learned that the vast majority (3167/3741) of the errors in utility_plant_in_service_classified_and_property_under_capital_leases are because of the error that Jon id-ed when we were first starting this journey: many utilities include utility_plant_in_service_completed_construction_not_classified in utility_plant_in_service_classified_and_property_under_capital_leases. This happens across all years. It doesn't totally make sense because the sub-component oft included is specifically "not-classified". I. tried adding this into the mannual calc fixes and it caused a 67% error rate instead of the existing 59%.

this accounts for 24% of the errors in the balance sheet asset explosion.

another 680 records are of a similar fashion but with a smaller subset of records (~600) including utility_plant_in_service_property_under_capital_leases

depreciation_utility_plant_in_service

I noticed that a decent number of the calculated values here had a rel_diff of 2 because the sign of the accumulated_depreciation was negative while the depreciation_utility_plant_in_service was positive. The worst offender was utility_id_ferc1 == 211. i looked into it and 🧼-ed it up but it feels vv bespoke & time consuming for fixing ~20 calculated values.

so I'm planning on leaving in these bb's that fall into this clear pattern:
Image

@zaneselvans
Copy link
Member

You probably already saw this but it looks like there might be a small population of records with a sign error -- the ones that head down and to the right from the origin.

@cmgosnell
Copy link
Member

You probably already saw this but it looks like there might be a small population of records with a sign error -- the ones that head down and to the right from the origin.

yes i already fixed some of them and described the problem in the section above depreciation_utility_plant_in_service

@cmgosnell
Copy link
Member

After some work via #3064 and #3078... i got the balance sheet asset inter-table calcs cleaned up a bit but they are still not awesome and none of these fall into the "off-by fact x" pattern. Here is the summary outputs:

image

image

@zaneselvans
Copy link
Member

Ugh I love your plotting theme.

@cmgosnell
Copy link
Member

I'm going to call this good enough for now. If we get further direction for further cleaning we can certainly open a new more directed version of this issue.

@github-project-automation github-project-automation bot moved this from Backlog to Done in Catalyst Megaproject Jan 2, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
data-repair Interpolating or extrapolating data that we don't actually have. ferc1 Anything having to do with FERC Form 1 rmi
Projects
Archived in project
Development

Successfully merging a pull request may close this issue.

4 participants