-
Notifications
You must be signed in to change notification settings - Fork 8
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Parse trailing comments from buses data in v30 files #78
base: main
Are you sure you want to change the base?
Changes from 5 commits
5794808
935a243
2f2eb66
b471eda
8348816
a1beb1c
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -244,6 +244,51 @@ end | |
return block | ||
end | ||
|
||
### | ||
### Buses | ||
### | ||
|
||
# In v30 files, we've seen trailing end-of-line comments. These can be present | ||
# in any section, but so far we have only had a use-case for the comments in the | ||
# buses section. | ||
# See https://github.com/nickrobinson251/PowerFlowData.jl/issues/27 | ||
# | ||
# The currently handles data which looks like: | ||
# 111,'STBC ',161.00,1, 0.00, 0.00,227, 1,1.09814, -8.327, 1 /* [STBC 1 ] */ | ||
# And does _not_ handle data with a comma before the comment like: | ||
# 111,'STBC ',161.00,1, 0.00, 0.00,227, 1,1.09814, -8.327, 1, /* [STBC 1 ] */ | ||
@generated function parse_row!(rec::R, bytes, pos, len, options) where {R <: Buses30} | ||
block = Expr(:block) | ||
n = fieldcount(R) | ||
# The last column is the comment. We need to handle the second-last column separately | ||
# too in order to be sure we stop parsing it before hitting the comment. | ||
for col in 1:(n - 2) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. we need to handle the last two entires differently, becase the second-last entry (the last "real" column) we need to be sure to stop parsing before the comment There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Sorry, I'm unfamiliar with the package, so I have some potentially basic questions. I see that this function dispatches over There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
we define
As far as i understood only the Buses comments ("augmented names") were needed, so this PR only parses comments from the Buses section (and only for v30 files). I doesn't extract comments present in any other section. Do we need to? These are good question and i'm very happy to answer as many questions as you have! There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yeah, we do use augmented names for other components, namely generators, loads, and branches (incl. transformers) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. so we need to functionality not just for buses? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. oh, wait, for other components, these comments can be reconstructed from columns already in the data, no? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. In the case of loads and generators the augmented names are constructed from other columns. But for branches (including transformers) we use the end of line comments. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. transformers are the biggest pain 😂 |
||
T = eltype(fieldtype(R, col)) | ||
push!(block.args, quote | ||
rec, pos, code = parse_value!(rec, $col, $T, bytes, pos, len, options) | ||
end) | ||
end | ||
m = n - 1 | ||
Tm = eltype(fieldtype(R, m)) | ||
Tn = eltype(fieldtype(R, n)) | ||
push!(block.args, quote | ||
# @show (rec, pos, code) | ||
# Assume no comma delim before the comment, so need to set whitespace as delim. | ||
# And move to next non-whitespace character, so we don't immediately hit a delim. | ||
pos = checkdelim!(bytes, pos, len, OPTIONS_SPACE) | ||
(rec, pos, code) = parse_value!(rec, $m, $Tm, bytes, pos, len, OPTIONS_SPACE) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. we need to set whitespace as the delim in order to not parse the comment when parsing the second-last entry (hence |
||
if newline(code) | ||
# If we hit a newline before delimiter there is no trailing comment. | ||
push!(getfield(rec, $n)::Vector{$Tn}, missing) | ||
else | ||
(rec, pos, code) = parse_value!(rec, $n, $Tn, bytes, pos, len, options) | ||
end | ||
end) | ||
push!(block.args, :(return rec, pos)) | ||
# @show block | ||
return block | ||
end | ||
|
||
### | ||
### transformers | ||
### | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -94,7 +94,7 @@ using Test | |
@test startswith(repr(mime, net.caseid), "CaseID: (ic = 0, sbase = 100.0, ") | ||
|
||
@test repr(mime, net.buses; context=(:compact => true)) == "Buses with 2 records" | ||
@test startswith(repr(mime, net.buses), "Buses with 2 records, 11 columns:\n──") | ||
@test startswith(repr(mime, net.buses), "Buses with 2 records, 12 columns:\n──") | ||
|
||
mt_dc_line = net.multi_terminal_dc.lines[1] | ||
@test eval(Meta.parse(repr(mt_dc_line))) isa MultiTerminalDCLine | ||
|
@@ -497,6 +497,23 @@ using Test | |
@test net_space.branches.j == net_space_manual.branches.j | ||
end | ||
|
||
@testset "End-of-line comments" begin | ||
net_eol = parse_network("testfiles/eolcomments.raw") | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. todo: add comment here about only |
||
|
||
# Only `Buses30` parse trailing EOL comments. | ||
buses = net_eol.buses | ||
@test length(buses) == 2 | ||
@test buses.i == [10010, 337918] # first col | ||
@test buses.owner == [1, 1] # last col before comments | ||
@test all(contains.(buses.comment, ["NOBO 1", "NAU_E2 1"])) | ||
|
||
loads = net_eol.loads | ||
@test length(loads) == 1 | ||
@test loads.i == [10010] # first col | ||
@test loads.owner == [1] # last non-missing col | ||
@test isequal(loads.intrpt, [missing]) # last col | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'm wondering what exactly this test is checking for. Is it checking that the end of line comment has not been interpreted as There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. it's just checking that
i don't follow this - can you explain? Anyway it sounds like if we want something like this PR then it does need to parse commnts from more than just Buses data, so this'd need updating There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
I thought this test might be intended to check that the end of line comment hadn't been parsed and placed into the last column by mistake. In the test file for this test, |
||
end | ||
|
||
@testset "issues" begin | ||
sz = parse_network("testfiles/spacezero.raw") | ||
@test length(sz.buses) == 2 | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,7 @@ | ||
0,100.0,30 / PSS(tm)E-30 RAW created Thu, Oct 13 2018 13:08 | ||
SEP 2018 V2 MODEL | ||
MADE ON 13-Oct-2018 13:08@ | ||
10010,'NOBO ',161.00,1, 0.00, 0.00,327, 1,1.03182, 5.469, 1 /* [NOBO 1 ] */ | ||
337918,'3NAUNEPPE-2+',115.00,1, 0.00, 0.00,327, 1,1.01644, 0.679, 1 /* [NAU_E2 1 ] */ | ||
0 / end of bus cards | ||
10010,'T1',1,327, 1, 7.698, 4.063, 9.463, 16.549, -1.802, 2.043, 1 /* [NOBO T1 ] */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Out of interest, why does this need to be a generated function?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unfortunately thiis is the necessary to avoid dynamic dispatch (#22)
whenever we call:
the type of
T
is not known til runtime, meaaning theparse_value!
call has to look-up decide at runtime which method to call, which adds a big performance cost, since we have oneparse_value!
call per value/element in the fileso we use a generated function to unroll the loop and generate code like
I think the other way we might have been able to avoid the dynamic dispatch is by manually doing what "union splitting" does e.g. like is done in ODBC.jl
CSV.jl uses a mix of these techniques e.g. splitting the big union of expected types and generating code for custom types