I would like to propose a fix for a bug caused by fixed string indexing of the lines in pdb_string. The fix slides the indices whenever the serial number is wider than expected.
The truncate_chain method assumes that serial_number, the second field of the line, has at most 5 digits, and it slices each line of pdb_string at fixed string indices.
However, if serial_number has 6 or more digits (100000 or greater), every later field shifts to the right and an error occurs, as in the following case.
Example line where the bug occurs
'ATOM 100000 HA ASP V 50 -84.500 -6.184-184.148 1.00 55.62 H '
Error message
The truncate_chain method is designed to capture ' 50', but it captures 'V 5' instead, because serial_number has 6 digits and the rest of the line is shifted by one character.
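The misalignment can be reproduced with a short sketch. The helper `make_atom_line` is hypothetical and exists only to build a fixed-width ATOM record; it is not part of truncate_chain. The slice `[22:26]` is the standard PDB position of the residue number (columns 23-26), which is assumed here to be the field the method reads.

```python
def make_atom_line(serial: int) -> str:
    # Hypothetical helper: builds an ATOM record with fixed-width fields.
    # A serial wider than 5 characters pushes every later field to the right.
    return (
        f"ATOM  {serial:>5}  HA  ASP V  50     "
        "-84.500  -6.184-184.148  1.00 55.62           H  "
    )


normal = make_atom_line(99999)     # 5-digit serial, fields in standard columns
overflow = make_atom_line(100000)  # 6-digit serial, fields shifted by one

# The residue number normally occupies columns 23-26, i.e. line[22:26].
print(repr(normal[22:26]))    # '  50'
print(repr(overflow[22:26]))  # 'V  5'
```

With the 6-digit serial, the same slice picks up the chain ID and only the first digit of the residue number.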
What
A slideIdx variable shifts the string indices used on each line, so the fix needs no significant changes to the existing code.
How
If serial_number has more than 5 digits, slideIdx is set to the number of extra digits.
(slideIdx is 0 if serial_number has 5 digits or fewer.)
In every operation based on a string index, slideIdx is added to the line index currently in use.
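The steps above can be sketched as follows. This is a minimal illustration, not the code in this PR: the function names `slide_index` and `residue_number` are hypothetical, and the sketch assumes the serial number is the first whitespace-delimited token after the 6-character record name.

```python
def slide_index(line: str) -> int:
    """Return how many columns the fields after the serial number are shifted."""
    # Assumption: the serial number is the first whitespace-delimited token
    # after the 6-character record name ("ATOM  " or "HETATM").
    serial = line[6:].split()[0]
    # 0 for serials of up to 5 digits, otherwise the number of extra digits.
    return max(0, len(serial) - 5)


def residue_number(line: str) -> int:
    """Read the residue number, sliding the standard [22:26] slice if needed."""
    slide_idx = slide_index(line)
    return int(line[22 + slide_idx:26 + slide_idx])


line5 = "ATOM  99999  HA  ASP V  50"   # 5-digit serial
line6 = "ATOM  100000  HA  ASP V  50"  # 6-digit serial, shifted by one column

print(slide_index(line5), residue_number(line5))  # 0 50
print(slide_index(line6), residue_number(line6))  # 1 50
```

Computing the offset once per line and adding it to each existing index keeps the original slicing logic intact, which is the point of the approach described above.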
Results
5,361 Fv regions are extracted as pdb files.
Additionally, to parse the 5,361 Fv regions extracted above into a json file...
To run solvent/tools/preprocess_multimer_datasets.py,
the _parse_coordinates method in Biopython must also be revised in the same way, by sliding the string indices.
(The method is at https://github.com/biopython/biopython/blob/d416809344f1e345fbabbdaca4dd6dcf441e53bd/Bio/PDB/PDBParser.py#L168-L320)