Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature Request: Option to treat subsequent points with different SWC tags as fork points #469

Open
Helveg opened this issue Jul 25, 2023 · 23 comments
Assignees

Comments

@Helveg
Copy link
Contributor

Helveg commented Jul 25, 2023

Consider a morphology with an axon consisting of a single branch, but with 3 different SWC tags: 2 for axon, 6 for axon hillock, and 7 for axon initial segment. With the current MorphIO parser this gets reduced into 1 Section with 1 type, depending on the SWC tag of the first point of the branch.

I'd like to request a new option to split this branch into 3 Sections, wherever there is an SWC tag transition (taking into account the no_duplicates option)

@mgeplf
Copy link
Contributor

mgeplf commented Jul 25, 2023

I think I noticed something similar just recently; let me create an example file, and we can check that we're on the same page.

@Helveg
Copy link
Contributor Author

Helveg commented Jul 25, 2023

Conceptually a series of SWC samples with the following tags:

1-1-1-6-6-6-7-7-7

gets converted in MorphIO to a single Section with type 1:

|--------1-------|

more equivalent to this SWC:

1-1-1-1-1-1-1-1-1

I'd like an option to parse this as 3 Sections with their respective types:

|--1--|--6--|--7--|

@mgeplf
Copy link
Contributor

mgeplf commented Jul 25, 2023

Yup, we're on the same page:

contents =('''1 1 0 4 0 3.0 -1
              2 6 0 0 2 0.5 1
              3 7 0 0 3 0.5 2
              4 8 0 0 4 0.5 3
              5 9 0 0 5 0.5 4''')
n = Morphology(contents, "swc")

should have len(n.sections) == 4

I will take a look when I have moment. I have a rewrite of the SWC parser in progress, that covers this case, afaik. I need to finish it though.

@mgeplf
Copy link
Contributor

mgeplf commented Jul 27, 2023

@Helveg I have implemented a change that should fix this.
However, it's in a completely new SWC reader - I'm passing all the tests that I have in NeuroM and nearly all of them in MorphIO (the ones that I'm not, I think we should re-evaluate what is right.

I'd really appreciate it if you could try with your code, because I think:
a) it would fix this issue
b) you may turn up other issues.

The branch w/ my new reader is: https://github.com/BlueBrain/MorphIO/compare/efficient-swc-build
Let me know if you have any luck.

@Helveg
Copy link
Contributor Author

Helveg commented Aug 11, 2023

Hmm, my version is MorphIO-3.3.6.dev54+g57e4e98.d20230811 from the latest commit on the efficient-swc-build branch and I get this error:

>       morpho = morphio.Morphology(os.fspath(file_like))
E       morphio._morphio.RawDataError: 
E       /mnt/c/Users/pwd06/git/arborize/tests/data/morphologies/multitagged.swc:1:error
E       Unable to parse this line

for this SWC file, which looks good:

1 1 0.0 0.0 0.0 2.9000000953674316 -1
2 1 0.0 5.622320175170898 0.0 2.9000000953674316 1
3 6 0.0 5.622320175170898 0.0 0.75 2
4 6 0.0 6.622320175170898 0.0 0.75 3
5 7 0.0 6.622320175170898 0.0 0.3499999940395355 4
6 7 0.0 16.6223201751709 0.0 0.3499999940395355 5
7 8 0.0 16.6223201751709 0.0 0.15000000596046448 6
8 8 0.0 29.222320556640625 0.0 0.15000000596046448 7
9 8 0.0 41.82231903076172 0.0 0.15000000596046448 8
10 8 0.0 54.42232131958008 0.0 0.15000000596046448 9
11 8 0.0 67.02232360839844 0.0 0.15000000596046448 10
12 8 0.0 79.62232208251953 0.0 0.15000000596046448 11
13 8 0.0 92.22232055664062 0.0 0.15000000596046448 12
14 8 0.0 104.82231903076172 0.0 0.15000000596046448 13
15 8 0.0 117.42231750488281 0.0 0.15000000596046448 14
16 8 0.0 130.02232360839844 0.0 0.15000000596046448 15
17 8 0.0 142.622314453125 0.0 0.15000000596046448 16

I'm on WSL2, my files might have weird line endings and other fun cross-platform freakiness

@mgeplf
Copy link
Contributor

mgeplf commented Aug 22, 2023

Cool, thanks for the example; I will have a look.

@mgeplf
Copy link
Contributor

mgeplf commented Aug 24, 2023

I tried loading the above snippet, and it seemed to work. I will try some variations on it though. I don't have a WSL2 machine at my disposal, though.

@Helveg
Copy link
Contributor Author

Helveg commented Aug 25, 2023

It's the line endings, I believe. After running dos2unix on the file the parser works. I think Windows line endings are \r\n, maybe you determine the line ending in a platform specific manner?

>>> import morphio
>>> morphio.Morphology("git/arborize/tests/data/morphologies/multitagged.swc")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
morphio._morphio.RawDataError:
git/arborize/tests/data/morphologies/multitagged.swc:1:error
Unable to parse this line
(b4py9) robin@LAPTOP-JF6T3PSU:~$ dos2unix git/arborize/tests/data/morphologies/multitagged.swc
dos2unix: converting file git/arborize/tests/data/morphologies/multitagged.swc to Unix format...
>>> import morphio
>>> morphio.Morphology("git/arborize/tests/data/morphologies/multitagged.swc")
<morphio._morphio.Morphology object at 0x7f6ff26c2a70>

@mgeplf
Copy link
Contributor

mgeplf commented Aug 25, 2023

It's the line endings, I believe. After running dos2unix on the file the parser works. I think Windows line endings are \r\n, maybe you determine the line ending in a platform specific manner?

Yup, thanks for catching that.

@Helveg
Copy link
Contributor Author

Helveg commented Sep 13, 2023

Any status updates? :)

@mgeplf
Copy link
Contributor

mgeplf commented Sep 14, 2023

Sorry, I was away, and haven't had a single stitch of time to look since I got back. I think it's a quick fix, though, so I can maybe find time today.

@mgeplf
Copy link
Contributor

mgeplf commented Sep 15, 2023

I have added handling of windows end of line here:
cad7bdd

Could you try it, and see if it fixes this problem?

@Helveg
Copy link
Contributor Author

Helveg commented Sep 22, 2023

Parses into 1 soma and 3 sections! 🎉

@mgeplf
Copy link
Contributor

mgeplf commented Sep 22, 2023

Parses into 1 soma and 3 sections!

I hope that's what you expected!

@Helveg
Copy link
Contributor Author

Helveg commented Sep 22, 2023

It is, thanks :)

@Helveg
Copy link
Contributor Author

Helveg commented Sep 28, 2023

When would this land in release? It's a blocker for me. If you can sneak it in in the next few days I would not have to fix a whole series of pesky bugs with my own SWC parser ;)

@mgeplf
Copy link
Contributor

mgeplf commented Sep 29, 2023

I'm super busy with other things at the moment.

There are a couple tests that fail, and I think it might be a semantic decision that needs to be made. For that, I will need to discuss with others what the right decision is.

@mgeplf
Copy link
Contributor

mgeplf commented Sep 29, 2023

@Helveg you're input would be helpful: do you think the following is a valid (but small) SWC file:

1 1 0 0 0 1 -1
1 3 1 0 0 1  1

@Helveg
Copy link
Contributor Author

Helveg commented Sep 29, 2023

Nitpick: the sample identifiers (first column) need to be unique. I'd say since SWC points from the child to the parent, the inferred segment should have the child's properties all the way up to the parent. The resulting output of:

1 1 0 0 0 1 -1
2 3 1 0 0 1  1

should then be a disc shaped soma segment of length 0, and a dendritic segment of length 1, just like:

1 1 0  0 0 1 -1
2 1 1  0 0 1  1
3 3 2  1 0 1  2
4 3 2 -1 0 1  2

would intuitively create a soma and 2 dendrites, not a Y shaped soma.

@Helveg
Copy link
Contributor Author

Helveg commented Sep 29, 2023

I'd say it's a design choice of MorphIO to allow for 0 length segments or not, but I don't see the harm in allowing it. I'd find it needlessly restrictive for valid edge cases, and would leave these kinds of quality checks to dedicated morphology auditing/repairing tools that one can build on top of MorphIO.

@mgeplf
Copy link
Contributor

mgeplf commented Sep 29, 2023

Nitpick: the sample identifiers (first column) need to be unique.

Yeah, that was a typo; sorry.

I'd say it's a design choice of MorphIO to allow for 0 length segments or not, but I don't see the harm in allowing it.

Okay, makes sense.

Currently, we're also transforming:

1 1 0 0 0 1 -1
2 3 1 0 0 1  1
3 3 1 1 0 1  2
4 3 1 0 1 1  2

Into:

1 1  0  0  0  1 -1
2 3  1  0  0  1  1
3 3  1  1  0  1  2
4 3  1  0  0  1  1
5 3  1  0  1  1  4

Which seems wrong: ie now the soma has two children instead of one. Thoughts?

@Helveg
Copy link
Contributor Author

Helveg commented Sep 29, 2023

I don't understand? Why add a point?

@mgeplf
Copy link
Contributor

mgeplf commented Sep 29, 2023

I don't understand? Why add a point?

Yeah, that's my confusion too; I don't in the "new" parser, and that's what I have to work out. Thanks for your input.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants