Potential bug: Rows from one table appear in the parsing of another when 1 row is added to a third #82
@dmulyalin - Sorry for tagging you directly, but do you have any idea what could be the cause of this?
I would recommend simplifying your template, e.g. this gives the same results as yours but is a bit easier to read, IMHO:
For undesired matches: I was not able to reproduce the problem by doing this:
but there are several techniques to avoid unnecessary matches:
Sorry I haven't gotten back to you yet. I find it a bit unsettling that you are not able to reproduce the problem. It makes me doubt whether there is a setup issue at my end. But I did test it multiple times and tried to boil it down to the very core before creating the issue.

Regarding the change of template: You are probably right, but this template was taken from a bigger template, perhaps 10 times as large, with a lot of complexity. It might not be possible to make these simplifications in real life. And since the data we are parsing is quite messy, outside our control, and sometimes unpredictable, we need at least some general matching. I think you might be right that we need to pre-process the input data. I had just hoped that TTP would spare us that, because it is such a strong parser. Regardless, I will work more on the template.

Regarding reproducing: Before I close this, I would like to make one last effort to see if anyone else can reproduce it. Let me think a bit about how.
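For reference, the pre-processing idea discussed above could look roughly like the sketch below. It only uses the Python standard library and cuts one section out of the raw text before anything is handed to the parser. The function name `extract_section` and the `### START ...` marker strings are assumptions made for illustration. Only `### END BUILDING DATA` appears in the actual input file, and the sample rows are made up.

```python
def extract_section(text, start_marker, end_marker):
    """Return the lines strictly between the start and end marker lines."""
    kept = []
    inside = False
    for line in text.splitlines():
        stripped = line.strip()
        if not inside:
            # Skip everything until the requested section begins.
            if stripped == start_marker:
                inside = True
            continue
        if stripped == end_marker:
            # Stop at the first end marker after the section has started.
            break
        kept.append(line)
    return "\n".join(kept)


# Hypothetical sample input; the real file layout may differ.
sample = """\
### START BUILDING DATA
id  mode  on/off
1   auto  on
### END BUILDING DATA
### START NETWORK DATA
10  20  30
### END NETWORK DATA
"""

network_only = extract_section(
    sample, "### START NETWORK DATA", "### END NETWORK DATA"
)
print(network_only)
```

With the text reduced to a single section like this, rows from the other tables can no longer be picked up by a loosely written template, at the cost of maintaining the marker strings outside the template.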
I have now reproduced the issue in PythonAnywhere.
main.py
I then installed
Calling main.py with the input file without the line mentioned above:
Calling main.py with the input file with the problematic line:
In the red circle, I have highlighted a couple of matches that contain data from a different table than they are supposed to, for instance:
{
"A": "id",
"B": "mode",
"C": "on/off",
"D": "Light",
"J": "Freq.",
"K": "Ship.",
"L": "[log.dec.]",
"active": "[Hz]"
}
Here's the live console you can play around with:
This is kind of a weird one. I am not sure if I am doing something wrong or if there is a bug in some of the `_start_`/`_end_` logic (or somewhere else).
Here's the setup: I have a text file with many tables and other values that need to be extracted. For the purpose of this issue I have reduced the text file to only 3 tables and a few other values to be a minimal reproducible example.
Here is the file:
example-output-01.txt
Here is the template file:
In this template I only care about extracting values from one table, namely `Network_SectionData` (second table), plus a few other values.
In the text file, we also have a building table (first table) and a summary table (third table).
If I run
then I see the list of expected extracted rows in `network.section_data`.
However, if the following line is added to the end of the building table, just above `### END BUILDING DATA`:
then these values from the third table start to appear in the parsed output:
These are the values from the summary table (third table) that happen to match the
I find this very peculiar, because `ttp` seemingly shouldn't care about. `_end_` indicators and if just one of them found a correct match, it should never look down in the summary table section in the first place.
Note: I know that I could probably find a way around this by making sure that my match indicators only match numbers, for instance, but for my use case I need to rely solely on `_start_` and `_end_` indicators.
Windows 10, python 3.7, ttp 0.9.1
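To illustrate the workaround mentioned in the note (making the match indicators only match numbers), here is a small sketch using plain `re` from the standard library. It is not TTP's internal logic, just an analogy. The header row reuses the `id`, `mode`, `on/off` values from the unexpected output, while the data row and both patterns are assumptions made for illustration.

```python
import re

# Generic row pattern: every column accepts any non-whitespace token.
GENERIC_ROW = re.compile(r"^\s*(?P<A>\S+)\s+(?P<B>\S+)\s+(?P<C>\S+)\s*$")
# Stricter row pattern: the first column must be an integer.
NUMERIC_ROW = re.compile(r"^\s*(?P<A>\d+)\s+(?P<B>\S+)\s+(?P<C>\S+)\s*$")

rows = [
    "id  mode  on/off",  # header-like row from another table
    "12  auto  on",      # hypothetical data row
]

# Collect the named groups of every row each pattern accepts.
generic_hits = [m.groupdict() for m in map(GENERIC_ROW.match, rows) if m]
numeric_hits = [m.groupdict() for m in map(NUMERIC_ROW.match, rows) if m]

print(len(generic_hits), "rows matched by the generic pattern")
print(len(numeric_hits), "rows matched by the numeric pattern")
```

The generic pattern accepts both rows, including the header-like one, whereas the numeric pattern keeps only the data row. This is the trade-off described in the note: stricter patterns filter stray rows but stop working when the data is too unpredictable to constrain.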