Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Extra needles selected when knitting lace #40

Open
Adrienne200 opened this issue Feb 1, 2021 · 44 comments · May be fixed by #218
Open

Extra needles selected when knitting lace #40

Adrienne200 opened this issue Feb 1, 2021 · 44 comments · May be fixed by #218
Labels
Milestone

Comments

@Adrienne200
Copy link

Users have reported mispatterning in lace quite often; I've seen it myself from the beginning. I originally thought that because the symptom is similar to the issue I logged in #161 of the Desktop software, its report should go there too. But it's quite likely that the lace issue is different, so while I'm moving #161 to the firmware section where it belongs, I'm splitting it into two separate reports. The lace one might even be easier to diagnose.

Here's the report from #161 about the lace carriage, edited slightly. Still reproducible in v0.95:
For this you need a lace carriage; typically the problem happens within 30-50 rows, (or fewer, I've see under 20) no need to knit hundred of rows.
Tips for using the lace carriage:
never let it select two needles forwards side by side, it MUST be only single needle patterns. The carriage will jam on the next row if 2 needles are selected together. Only use patterns designed for lace.
to actually knit from the sample lace pattern, you need to Mirror the image. That is, put it back to the way Brother designed it; for lace the automatic mirroring is not what you want. Not needed if you are just air-knitting to see this issue.

Use the pattern “910 lace 10-35 60x30.png” (attached) basic settings as usual, 2-color, start from row 0, start/stop=30, Infinite Repeat, Single, center.
Knit as usual, except using the lace carriage instead of the K carriage. If air-knitting, no need to use the K-carriage at all.
Keep knitting, waiting for the beep every row as usual. it’s OK to cross a turn mark sometimes, it will still fail anyway.
Notice that the pattern is regular, every row has only one needle selected per repeat, 10 needles apart.
Watch the selection, the every-10th pattern is consistent, until after 30 - 50 rows suddenly one row it’s different. It has selected some extra needles. They are regular, usually 4 needles offset from the actual pattern.

The rest of #161 is discussion that may not apply to the lace-carriage issue. Except there's this statement from Chris:
"The decision which pixel has to be taken for setting the needle is slightly different for the L carriage (https://github.com/AllYarnsAreBeautiful/ayab-firmware/blob/master/knitter.cpp#L315 and https://github.com/AllYarnsAreBeautiful/ayab-firmware/blob/master/knitter.cpp#L340, depending on the direction of the carriage) because it has a different mechanical length than the K carriage. These values were found empirically and therefore might be not perfect.
But in terms of timing for recognition of the current carriage position, it's the same as for the K carriage."

Lace.zip

@dl1com
Copy link
Contributor

dl1com commented Feb 27, 2023

Probably same as #41?

@Adrienne200
Copy link
Author

When I moved #161 out of the Desktop section, I intentionally created two separate firmware issues #40 and #41, as the symptom with the Lace carriage was sufficiently different that it could be a separate bug from the K-carriage one. And in both cases, the symptom is different from what I see described in #48 and #50, so I'm reserving judgment until I have a working build and can retest these. I hope I'm wrong and both #40 and #41 are indeed fixed by PR #50, but I won't be surprised if they're not. I see both #40 and #41 as belonging in the "Needs Testing" category for now.

@erikbent
Copy link

erikbent commented Mar 4, 2023

I have knit 9 different patterns with the L-carriage (KH910 with Shield V1 version 0.95). I used the full bed (192 needles because the left 8 are not selected if the L-carriage comes from the left). Only with the last pattern I saw 1 wrong needle selection (pattern 105 with 27 repeats).

Sorry, I see now that it is alreadt known that knitting past the turnmarks is working correctly.

@SamB
Copy link

SamB commented Jun 24, 2023

@Adrienne200 ... So are you saying someone needs to add the ready - please test label to #40 and #41 ? I guess @dl1com has the power to do that ...

@t0mpr1c3
Copy link
Contributor

t0mpr1c3 commented Jun 24, 2023

@Adrienne200 ... So are you saying someone needs to add the ready - please test label to #40 and #41 ?

thanks for the prompt!

@Adrienne200
Copy link
Author

I have knitted many rows with the lace carriage and 1.0 build test230626, without any mispatterning. I think this long-standing issue is finally fixed, thanks everyone.

@github-project-automation github-project-automation bot moved this from Needs Testing to Done in AYAB 1.0.0 Release Tracking Jul 4, 2023
@Tuuleh
Copy link

Tuuleh commented Jul 4, 2023

I've tested this with v1.0 and while it no longer selects adjacent extra needles for me, there are other selection problems. Namely, after knitting some rows (around 10), needle selection when pushing to the left gets broken (selects no needles in my test pattern). The patterning seems to work OK with K carriage, other than that there are some blank rows where some needle selection happens with both K and L carriage, but that was a problem for me before already. So in short - after about 10 rows of knitting on my test pattern, the L carriage stops selecting needles when pushing to the left.

I can't exclude there isn't something funky in my test pattern, so will also try with ones supplied under this issue. You can find my test pattern below.

Panna_Chart_1_resized

@Adrienne200
Copy link
Author

My 910 machine has been working perfectly, but as soon as I moved to my 930, it's right back to frequent mispatterning with the lace carriage. (No sightings so far of the similar issue with the K carriage (#41) so maybe that's fixed across the board, to be determined as more testing happens.)
It's the same "shadowing" behavior that I reported previously, where a selected (D- position, solenoid unpowered) is repeated 16 needles later, where that spot should have been unselected (B, solenoid powered.) This time it seems to be triggered by crossing the turn mark - if the errant behavior is not seen for a few rows, crossing the turn mark can often cause it to happen. (Which is the exact opposite of #41, where crossing the turn mark caused it to start behaving correctly. Weird.) And it often happens right away in row 1, rarely any need to knit 30+ rows to see it fail. I've shot some video, posting it in the Discord.

@Adrienne200
Copy link
Author

AYAB software version: test230626
Computer/OS: Intel Mac/Big Sur. Same behavior on Windows 7.
Knitting machine: KH930
AYAB hardware: interface

Steps:

  1. Install AYAB hardware on a 930/940 (which are identical from an AYAB perspective)
  2. Go into Set Preferences and check that you have told AYAB that you're on a 930.
  3. Flash the firmware.
  4. Open the attached test file. It's a simple repeat of 1-black 9-white on every row.
  5. Needles L30 - R30
  6. Leave all the settings at default, except check Infinite Repeat. Normally for lace you'd also check Knit Side Image, not critical for this test.
  7. No yarn, we're just going to "air knit" So no need to do any K carriage rows.
  8. Start AYAB as usual. Lace carriage on left, cross the left turn and hesitate for the multiple beep.
  9. Knit back and forth as usual, watching the needle selection.
  10. If yours doesn't fail immediately, try crossing the left turn mark.
    Expected: every row should be 1 needle to D followed by 9 at B, repeating.
    Actual: Right to left always works, left to right almost always fails. After the first set of 10, the 6th needle of each group of 9 has been selected to D when it should have remained at B. It appears to be a copy or shadow where it has failed to forget its previous data and is repeating it one cycle later. Video in the Discord's Testing channel.
    Every 10th column 60 x 6

@dl1com
Copy link
Contributor

dl1com commented Jul 14, 2023

AYAB software version: test230626 Computer/OS: Intel Mac/Big Sur. Same behavior on Windows 7. Knitting machine: KH930 AYAB hardware: interface

Steps:

1. Install AYAB hardware on a 930/940 (which are identical from an AYAB perspective)

2. Go into Set Preferences and check that you have told AYAB that you're on a 930.

3. Flash the firmware.

4. Open the attached test file. It's a simple repeat of 1-black 9-white on every row.

5. Needles L30 - R30

6. Leave all the settings at default, except check Infinite Repeat. Normally for lace you'd also check Knit Side Image, not critical for this test.

7. No yarn, we're just going to "air knit" So no need to do any K carriage rows.

8. Start AYAB as usual. Lace carriage on left, cross the left turn and hesitate for the multiple beep.

9. Knit back and forth as usual, watching the needle selection.

10. If yours doesn't fail immediately, try crossing the left turn mark.
    Expected: every row should be 1 needle to D followed by 9 at B, repeating.
    Actual: Right to left always works, left to right almost always fails. After the first set of 10, the 6th needle of each group of 9 has been selected to D when it should have remained at B. It appears to be a copy or shadow where it has failed to forget its previous data and is repeating it one cycle later. Video in the Discord's Testing channel.
    ![Every 10th column 60 x 6](https://user-images.githubusercontent.com/24358225/251931212-bb0d9b1d-7885-4f14-8175-41751920f9ce.png)

@AllYarnsAreBeautiful/ayab-contributors anybody who can take a look into this?

@clholgat
Copy link
Contributor

I can repro on @Adrienne200's 930.

It is much more common when the carriage is moving quickly. I'm guessing it's old state stuck in the solenoid buffer. Need to do more digging.

@X-sam X-sam removed the v0.95 label Mar 30, 2024
@clholgat
Copy link
Contributor

This also reproduces on my 910 so it's not machine specific.

@jonathanperret
Copy link
Contributor

@Adrienne200 I just followed your latest instructions on my 910 (I know you said to test on a 930 but since @clholgat reproduced it on a 910 I figured I should be able to see it as well) and found no mispatterning, even when moving the carriage very fast, crossing the turn marks any number of times, etc.
I suspect @clholgat 's recent changes in #196 may have fixed this, would you mind giving it another go when you get a chance? 🙏

@Adrienne200 Adrienne200 moved this from Needs Testing to Todo in AYAB 1.0.0 Release Tracking Dec 3, 2024
@Adrienne200
Copy link
Author

Still seeing this on a 930 with Beta3 Desktop and firmware. Left-right only, and somewhat speed dependent. I'm air-knitting, and if I zoom across faster than you'd go when actually knitting it fails multiple times per row. It is shadowing, that is it repeats a D needle 16 later in the direction of travel. At more reasonable speed it only fails occasionally, but the issue is still there. I've never seen it on a 910 since 0.95 or very early in the 1.0 cycle, and I've used more than one different 910. But it is speed dependent, so could be more about different machines than different models.

@jonathanperret
Copy link
Contributor

With another round of testing on my KH910, looking closely at I think I finally understood what was happening: in the affected direction (left to right), solenoids are switched slightly too late. In fact, by changing the value in the firmware to trigger solenoid activation just one needle later, I managed to reproduce on my KH910 the exact behavior described in this issue.

So, it should be a simple matter of tweaking that value so that activation occurs a bit sooner, right? Well there's another problem that rears its head when you do that: when knitting a full-width pattern, starting with the lace carriage from the left, the first few needles (on the left of the bed) may not be correctly selected on the very first pass of the lace carriage (further passes in either direction correctly select all needles).

Although it may seem related, this is not caused by the time it takes (or rather, used to take in previous releases) for the desktop app to send the first line's data to the firmware once the carriage is identified. In fact the problem can be seen even if you move very slowly across the left turn mark, or take a long pause right after the first beep.

What happens is that by the time the firmware decides the current carriage is a lace carriage, the carriage center has traveled well past the turn mark and needle L100, and is closer to needle 4-5. And because on the lace carriage, the needle selectors are so close to the center (12 needle widths on either side), at this point it is too late to activate the solenoid that would influence the selection of needle L100.

Now, part of the reason why the carriage is detected so late is that when we detect an S magnet in front of the left sensor, we can't be sure it is not a garter carriage's first magnet of its right-side N-S pair. So, the firmware waits until at least GARTER_SLOP needles have been traveled without another magnet being seen before declaring that it is really a lace carriage and that we can start selecting needles for it.

So it looks like there's some fine-tuning to be done here, to make sure we can reliably both distinguish a garter carriage from a lace carriage, and also start activating solenoids soon enough for the first needles to be selected correctly. It can be done obviously since the Brother firmware manages it just fine.

@jpcornil-git
Copy link
Contributor

You shouldn't wait for more than 3 needles after a first detection to make a carriage decision
=> For L/when moving to the right, ideal solenoid to set is at -4 (vs carriage center) for the needle selector (at -12)
==> If you are at 3, you are setting -1, i.e. one needle margin (no uncertainty here as solenoid to set is related to FW carriage position not the physical one)

@jonathanperret
Copy link
Contributor

You shouldn't wait for more than 3 needles after a first detection to make a carriage decision => For L/when moving to the right, ideal solenoid to set is at -4 (vs carriage center) for the needle selector (at -12) ==> If you are at 3, you are setting -1, i.e. one needle margin (no uncertainty here as solenoid to set is related to FW carriage position not the physical one)

The thing is, center - 4 is already the timing for setting solenoids in ayab-firmware (START_OFFSET is 32, with an END_OFFSET of 28), and that's what fails on Adrienne's 930. And center - 5 (1 needle later) fails on my 910, which leads me to thinking that center - 4 is marginal, so I want to try center - 3, but this hits the detection delay issue I mentioned above.

In fact, in ayabAsync-firmware (which probably fails on that 930 since it has the exact same timing as ayab-firmware for that case) if I change the timing to center - 3, I can see in ayab-simavr that solenoid 0 (driving needle L100) is not activated until the carriage center gets to position 20, i.e. 16 needles later, so missing selection for needle L100.

@jpcornil-git
Copy link
Contributor

jpcornil-git commented Dec 11, 2024

ayabAsync is setting carriage position 3 needles after detection and therefore, for L, code sets position to exactly 3 while updating solenoid 3-4=-1 at the same/that time. If you move offset to -3 then it will start with solenoid 0 while with -2 it will start with 1 ... and you will miss 0 (not because of cam/solenoid timing/... but because you start with 1/too late)

=> with -4 ayabAsync should therefore never miss L100 (by construction, no marginalities here)

NB: I may be off by one here, I'll check tomorrow code & simavr (you are running latest version I suppose ?)

@jonathanperret
Copy link
Contributor

jonathanperret commented Dec 12, 2024

NB: I may be off by one here, I'll check tomorrow code & simavr (you are running latest version I suppose ?)

Yes, I just tested with jpcornil-git/ayabAsync-firmware@60fa8a3 and jpcornil-git/simavr@ca908be .

Command line: ayab.elf --mcu atmega328 --freq 16000000 ../ayabAsync-firmware/.pio/build/uno/firmware.hex --machine KH910 --carriage L

Using this pattern:

With center-3 you can see the first solenoid (shown rightmost by ayab-simavr) is not changed:

Solenoids =[................], Carriage =   5, BP = 0, Sensors = (1650, 1650)
Solenoids =[.............|..], Carriage =   6, BP = 0, Sensors = (1650, 1650)
Solenoids =[............||..], Carriage =   7, BP = 0, Sensors = (1650, 1650)
Solenoids =[............||..], Carriage =   8, BP = 0, Sensors = (1650, 1650)
Solenoids =[..........|.||..], Carriage =   9, BP = 0, Sensors = (1650, 1650)
Solenoids =[.........||.||..], Carriage =  10, BP = 0, Sensors = (1650, 1650)

@jpcornil-git
Copy link
Contributor

jpcornil-git commented Dec 12, 2024

Just had a look at the ayabAsync code and this is expected.

When the carriage is detected (carriage position sets to 3 and solenoid to set at 3-4=-1), the first linereq message is issued and therefore the pattern to knit not yet available, i.e. solenoid 15 (-1) is reset. One needle later (carriage center at 4 and solenoid to set at 0), pattern should be available and solenoid (at 0) set.

=> If you shift this by one (-3 iso -4) then the first solenoid will always (i.e. no uncertainty here) be reset while with -4 L100 will never be missed.

You have to make sure of course that the pattern is available on time from desktop and that the firmware doesn't lag behind.
This seems to be the case for the former and for ayabAsync as it never blocks (lot of tests done by "doo" with all carriage types starting from the left/write at max speed without waiting for carriage detection/pattern reception beeps).

Next rows will be less critical wrt that as you will get the reqline well before (at least 16 needles for L, 12 + 4)
=> with -3 first needle will be missed but only for the first row.

@jpcornil-git
Copy link
Contributor

@Adrienne200 could you check that https://code.jonathanperret.net/ayab-webtool/#pr=209 has no similar issue (can then be used as working baseline to investigate differences) ?

@jonathanperret
Copy link
Contributor

jonathanperret commented Dec 12, 2024

=> If you shift this by one (-3 iso -4) then the first solenoid will always (i.e. no uncertainty here) be reset while with -4 L100 will never be missed.

Yes, this is demonstrated clearly in simavr as shown above : ayabAsync-firmware (like ayab-firmware) does not handle the first needle if timing is set to center-3.

But my point is that carriage-4 appears marginal: on some machines, at high speed, the solenoid can fail to grab the armature in time, resulting in a needle going to D when it should not. This is unrelated to serial communications, as the problem happens even in the middle of the bed where no communication is taking place.

Now, there’s a possibility that ayabAsync-firmware’s center-4 is different from ayab-firmware’s center-4. For example, if the encoder signals are swapped, there will be a 1/4 needle change in timing. Also, as we know ayab-firmware suffers from a penalty due to calling analogRead in its ISR. But this penalty is only about 100 microseconds — even at 1 m/s this would be only 0.1 mm of carriage movement, or 0.5 degrees of cam shaft rotation. A test on Adrienne’s 930 will tell us if that’s enough to make things work, but if this is what makes the difference I would still want to move away from center-4 for safety. Another interesting test would be to do like I did with ayab-firmware on my 910: change the timing to center-5 and see if it breaks at high speed.

But other than that there’s no blocking in ayab-firmware. Line data is requested and received in milliseconds. However there may be a way to win back one needle of latency: I think that currently, even once line data is available, we don’t set a solenoid until the carriage has moved to the next needle. We should be able to immediately set the solenoid for the current position as soon as data is here. I’ll be looking into this for ayab-firmware.

@jonathanperret
Copy link
Contributor

In the longer term we could change the API so that line data is available before carriage detection (even though it would not technically not require new message types, the order of messages would change so that’d be a breaking API change).

This would be helpful not to save on transmission time (again, milliseconds) but because it would let the firmware start activating solenoids even before the exact carriage type is known (i.e. see an L magnet => start selecting for L).

Observe that the magnet-to-selector distance is the same (12 needles) between the L carriage and the G carriage, so selecting the first needles as for an L carriage would actually be correct even if it turns out it’s a G carriage! This might not be a coincidence…

@jpcornil-git
Copy link
Contributor

jpcornil-git commented Dec 12, 2024

But my point is that carriage-4 appears marginal: on some machines, at high speed, the solenoid can fail to grab the armature in time, resulting in a needle going to D when it should not. This is unrelated to serial communications, as the problem happens even in the middle of the bed where no communication is taking place.

carriage-4 is optimal for L apriori, just when the armature is against the solenoid (the 'no click' scenario), when firmware and hardware positions match perfectly (mismatch of +/- 1 possible). I believe it is safer to set the solenoid earlier on (ready ahead of being against the armature) rather than later and, if correct, -3 would therefore be a better choice than -5 to absorb position mismatch (but not for a correct L100 handing)

However there may be a way to win back one needle of latency: I think that currently, even once line data is available, we don’t set a solenoid until the carriage has moved to the next needle. We should be able to immediately set the solenoid for the current position as soon as data is here. I’ll be looking into this for ayab-firmware.

I thought about that as well but this is not integrating well as firmware "moves" with position changes as of today.

In the longer term we could change the API so that line data is available before carriage detection (even though it would not technically not require new message types, the order of messages would change so that’d be a breaking API change).

This is only a solution for issues where the desktop doesn't reply quickly enough (doesn't seem to be the case but there could be old/slow serial implementation)

Observe that the magnet-to-selector distance is the same (12 needles) between the L carriage and the G carriage, so selecting the first needles as for an L carriage would actually be correct even if it turns out it’s a G carriage! This might not be a coincidence…

Not sure I understand that one, magnet position/solenoid to set/needles selector are at (for Right:Left carriage directions):

  • L 0:0/-4:+4/-12:+12
  • G +12:-12/+8:-8/0:0
    => magnet position and needle selector are indeed 12 needles apart for these two but solenoid to set is different (<> 4) ?

@jonathanperret
Copy link
Contributor

But my point is that carriage-4 appears marginal: on some machines, at high speed, the solenoid can fail to grab the armature in time, resulting in a needle going to D when it should not. This is unrelated to serial communications, as the problem happens even in the middle of the bed where no communication is taking place.

carriage-4 is optimal for L apriori, just when the armature is against the solenoid (the 'no click' scenario), when firmware and hardware positions match perfectly (mismatch of +/- 1 possible). I believe it is safer to set the solenoid earlier on (ready ahead of being against the armature) rather than later and, if correct, -3 would therefore be a better choice than -5 to absorb position mismatch (but not for a correct L100 handing)

Exactly, which is why I want to move it to center - 3, and find a way to still handle L100 correctly.

However there may be a way to win back one needle of latency: I think that currently, even once line data is available, we don’t set a solenoid until the carriage has moved to the next needle. We should be able to immediately set the solenoid for the current position as soon as data is here. I’ll be looking into this for ayab-firmware.

I thought about that as well but this is not integrating well as firmware "moves" with position changes as of today.

I don't know about ayabAsync-firmware but it is pretty easy to do in ayab-firmware, you just have to not start selecting until you have data, this way the needle that was current when the carriage detection happened is not considered "done" (m_sOldPosition is not set) and it can be processed as soon as we get the data (which happens generally way before we have moved to the next needle). Note that m_sOldPosition is the only reason why ayab-firmware does not appear to do anything when the carriage does not move — if you don't let it set that variable, it will keep trying to select even without carriage movement.

I've tried the necessary change, it works in the simulator but the unit tests are giving me a bit of trouble.

In the longer term we could change the API so that line data is available before carriage detection (even though it would not technically not require new message types, the order of messages would change so that’d be a breaking API change).

This is only a solution for issues where the desktop doesn't reply quickly enough (doesn't seem to be the case but there could be old/slow serial implementation)

Actually, it is a solution for the issue even without serial timing problems, as I tried to explain above. Having the data sooner lets the firmware start activating solenoids as soon as the L magnet is seen (i.e. center at needle 0), by assuming it is an L carriage. We cannot be sure it's not a G carriage until needle 2-3 is passed (what ayab-firmware calls GARTER_SLOP), but it would not matter because the timing is correct for a G carriage as well…

Observe that the magnet-to-selector distance is the same (12 needles) between the L carriage and the G carriage, so selecting the first needles as for an L carriage would actually be correct even if it turns out it’s a G carriage! This might not be a coincidence…

Not sure I understand that one, magnet position/solenoid to set/needles selector are at (for Right:Left carriage directions):

  • L 0:0/-4:+4/-12:+12
  • G +12:-12/+8:-8/0:0
    => magnet position and needle selector are indeed 12 needles apart for these two but solenoid to set is different (<> 4) ?

Regardless of the carriage in use, the solenoid used for needle L100 can only be solenoid 0 or solenoid 7, depending on belt engagement (belt shift) — this is determined by the association of the needle selecting plates with the rotary cams and in turn the solenoids. So right off the bat, a difference of 4 solenoids for the same needle does not make sense.

Your "solenoid to set" values here appear to be timing values, i.e. -4 for the L carriage would mean "when the carriage center is over needle X, activate solenoid for needle X - 4". But the relationship between a needle and a solenoid is fixed (modulo belt shift), so when you activate the solenoid for needle X - 4 you'll actually be manipulating solenoid e.g. (X - 4) % 16 (assuming regular belt engagement). So, if your timing were -3 instead, when over needle X you'd be activating a different solenoid ((X - 3) % 16) but for a different needle (X - 3)! You'd have activated the solenoid for needle X - 4 one needle earlier, while the carriage center was over needle X - 1.

Anyway, what I meant is that the distance between the (rightmost for G) L magnet and the needle selector is the same for both carriages:
image

Mechanically this means that for needle L100, for both carriages if you activate that needle's solenoid (i.e. 0 or 7) as soon as the L magnet crosses the left sensor, you will have activated the solenoid 12 needles before selection happens, which is a comfortably safe value (8 needles before selection is theoretically optimal and should be correct and silent on an perfectly tuned machine, but it appears they are not all so, so it seems safer to select earlier even though it does "click" — note that I tested the Brother firmware and its timing did make the solenoids click).

@jpcornil-git
Copy link
Contributor

jpcornil-git commented Dec 12, 2024

OK I see what you meant now, it is valid at both sides, i.e. needle to select for L/G is -4/+4 at the Left/Right side wrt magnet position (inner one for G) and beltshift is also not playing a role if detection is incorrect (beltshift is the same at the left side while at the right side they are opposite but L/Regular == G/Shifted wrt solenoid to set selection).

In fact, G carriage can be seen as a L carriage centered around the inner magnet (right/left one when moving to the right/left)

This observation could indeed help to start selecting solenoid earlier; this is only required if:

  • -4 is marginal and you have to reduce it by 1 or more -> jury is still out
  • Desktop app is not able to reply within one needle after carriage detection -> don't seem to be the case

You have to take North detection into account as well (you may want to continue to wait here to avoid declaring K while it is effectively G outer magnet that has to stay hidden)

Note also that desktop would require some adaptation as well as I don't believe it tracks carriage changes (would be nice in general to track carriage changes)

@jpcornil-git
Copy link
Contributor

jpcornil-git commented Dec 12, 2024

In the longer term we could change the API so that line data is available before carriage detection (even though it would not technically not require new message types, the order of messages would change so that’d be a breaking API change).

Agree, but we should rather change the Q&A direction, i.e. Desktop App sends a setLine while Firmware replies with cnfLine here + indState when something change (line completed, carriage detection, ...), i.e. only autonomous status updates everything else are answers to questions/request from Host. There are no reasons for the firmware to manage/know the current line number, it should be managed by the host (would allow to introduce new capabilities like rewind/redo, resume work, append new patterns, ...)

@jonathanperret
Copy link
Contributor

Agree, but we should rather change the Q&A direction, i.e. Desktop App sends a setLine while Firmware replies with cnfLine here + indState when something change (line completed, carriage detection, ...), i.e. only autonomous status updates everything else are answers to questions/request from Host. There are no reasons for the firmware to manage/know the current line number, it should be managed by the host (would allow to introduce new capabilities like rewind/redo, resume work, append new patterns, ...)

Yes, as we discussed before I agree it would be saner for the commands to be one-way, with only acknowledgements and status updates coming from firmware (or status updates could be omitted in favor of polling by the host, there are benefits to either approach I think).

I agree too that the firmware has no business knowing the row number — in advanced multicolor modes it's not even a simple pattern row index.

@jonathanperret
Copy link
Contributor

You have to take North detection into account as well (you may want to continue to wait here to avoid declaring K while it is effectively G outer magnet that has to stay hidden)

Note also that desktop would require some adaptation as well as I don't believe it tracks carriage changes (would be nice in general to track carriage changes)

Yes, the desktop needs to change to send the data before knowing the carriage type (in fact, the whole "start machine" state is not necessary IMHO, you could treat the first row just like any other, and just start in "unknown carriage" state); but it wouldn't see a carriage change in that case, as the firmware wouldn't report a carriage type until enough needles have been passed to make sure it is not a G carriage, as currently.

Alternately, the firmware could report an L carriage immediately and a G carriage later — this would actually "work" with the current desktop app. But as you said the desktop app doesn't handle carriage changes currently, so it would show an L carriage when using a G carriage.

@jpcornil-git
Copy link
Contributor

Agree with that as well, the first row is nothing special for the machine and can be transmitted before carriage detection/after machine configuration, it should also include start/stop needles to enable variable pattern width (and reqStart removed/agreed).

API could therefore be something like:

  • From Host
    • reset: reset firmware
    • getFirmwareInfo: report FW (& HW if any), API info
    • setMachine: set machineType (only thing firmware should know to work correctly and move to operation)
    • setConfig: enable/disable beeper, continuous reporting, color changer, ...
    • setPattern: set next row pattern with associated properties (start/stop needles, color, ...)
    • setHardware: set solenoids, leds, buzzer, ... (for test purposes, mostly makes sense before setMachine/firmware control of these)
    • reportState: see below
  • To Host
    • Ack to the above messages
    • stateReport: report status changes (state, carriage type/position/direction, beltPhase)
      • hall sensor values, solenoid states, encoder state can either be always added or only when requested by the request (test purpose)

stateReport could be autonomous (when enabled by setConfig) or not (polling using reportState)

No specific test mode/state/console either, everything can be driven from the above API.

Should probably be discussed in a different forum/not the purpose of the issue :-)

@jpcornil-git
Copy link
Contributor

Comments from Discord for the records
From:@Adrienne200

Jp asked in fw#40 to check fw 209, the ayabsync version. It does not display the shadowing issue, however fast I zoom across L-R. (930, Mac, 1.4 shield, Beta3 Desktop, JP’s firmware)

From:@jonathanperret

This is interesting, I will look closer at what could be causing the difference (I shared some hypotheses in #40 (comment) but I need to confirm these). In any case, I have a fix almost ready to go into ayab-firmware for #40.

From:@jpcornil-git

Thanks @adrienneh -> given that ayabAsync doesn't show the issue, timing of the solenoid update (-4 in github discussion) is not the root cause of the issue but should be looked elsewhere in the ayab-firmware implemention (compared with ayabAsync for differences)

@jonathanperret
Copy link
Contributor

So, I used a logic analyzer to record the signals sent to the solenoids (well, just solenoid 0 in this test) in relation with the encoder signals (belt phase, V1 and V2), using ayab-firmware and ayabAsync-firmware, both in their current configuration (i.e. selecting at center - 4).

I used a 16 on/16 off pattern to maximize solenoid switching (every time the carriage moves one needle to the right, it should switch one solenoid on or off):

Here's one instance of the firmware switching solenoid 0 off as the carriage center passes needle 16n + 4 — the crossing of a needle is marked by a rising edge on V2 (note that the names V1 and V2 here are chosen arbitrarily but that should not matter):
image
Since the belt phase signal (topmost in this picture) repeats every 16 needles, the same selection pattern repeats as well all over the bed:
image

The first observation I made is that ayab-firmware and ayabAsync-firmware do have the same selection timing, as mostly expected from looking at the code. The traces above come out identical in both firmwares: both react to a rising edge on V2, and switch the solenoid right after the second rising edge of V2 after a belt phase edge.

So, that eliminates the possibility of either triggering on different edges of the encoder signals, or of an off-by-one error that would cause either firmware to trigger one needle earlier or later than expected.

Looking at the precise timing of activation now, i.e. how much actual time it takes for the firmware to process a rising edge and select a solenoid, which we can see by zooming in on the highlighted area of the first picture above, here's what ayab-firmware does:

image

This is telling us that it takes roughly 480 microseconds for ayab-firmware to notice the V2 signal has gone high, find out it needs to change solenoid 0, and send the change out to the solenoid.

Here's the same measurement for ayabAsync-firmware:

image

What we see is that ayabAsync-firmware takes about 650 microseconds to do the same job.

Both times are well within the tolerance of the mechanical systems downstream, so there's nothing to worry about here, but the fact that ayabAsync-firmware actually reacts slightly slower than ayab-firmware eliminates the hypothesis that its reaction time could explain the difference.

So, this leaves me with no explanation for the observed difference on Adrienne's machine so far, unfortunately.

Re-reading the reports though, this phrase struck me:

If yours doesn't fail immediately, try crossing the left turn mark.

And I realized when recording these traces, I actually took care of never crossing a turn mark, because I only wanted to focus on needle selection timing and not position detection. So, there remains a glimmer of hope that ayab-firmware somehow gets confused when crossing a turn mark and this shifts the assumed carriage position enough that the selection timing becomes off.

I will need to record more traces to test that last hypothesis. And, if anyone has another hypothesis to offer, I'll gladly look into it!

@jpcornil-git
Copy link
Contributor

jpcornil-git commented Dec 14, 2024

ISR processing starts with v1 not v2 and differs for ayab-firmware and ayabAsync

... but, if there is an issue with that, it should apply to all carriages, not sure I understand how Lace would be more sensitive to this.

Wrt beltPhase, there is a difference with ayabAsync as well (I sample it at the extremum of the sensor waveform while ayab-firmware samples it at the first threshold crossing occurence -> bias to 'early' and more sensitive to threshold level) but it is less critical for Lace than others (South is more centered within the beltPhase waveform than North) -> shouldn't lead to a wrong beltshift selection for L but a more probable shift by 1 of the carriage position.

Is my understanding of the issue below correct ?

  • only seen with Lace
  • mostly left to right
  • crossing the sensor may turn the issue on/off
  • mispatterning is regular, i.e. not random ?
  • higher carriage speed increases probability of the issue
    • still present if you keep knitting high speed but extend carriage travel a bit (to make sure the new row is really in) ?
  • sticky ?, i.e. when issue is not there, you have to e.g. cross the sensor to see it ?

@jonathanperret
Copy link
Contributor

ISR processing starts with v1 not v2 and differs for ayab-firmware and ayabAsync

... but, if there is an issue with that, it should apply to all carriages, not sure I understand how Lace would be more sensitive to this.

As I mentioned, the V1/V2 naming in my report is arbitrary, I didn't care to check back with the service manual because it was not relevant here — particularly since it is clearly handled the same between the two firmwares.

Also note I measured the time from encoder pulse to solenoid change, which is downstream from whichever ISR/main loop arrangement the firmware uses. ayabAsync-firmware may well have a shorter ISR (which I agree is a good thing in principle), but in the end I could reliably measure a slightly longer reaction time from it (650 vs 500 microseconds). But as I said, this is well within the requirements so I wouldn't spend too much time digging into that difference.

My theory for how a difference in reaction time could make the lace carriage fail was that if ayabAsync-firmware reacted much faster than ayab-firmware, i.e. 500 microseconds versus let's say 5 milliseconds, it might push the solenoid selection timing just past the requirement to activate the solenoid before the rotary cam has released the U-lever and the armature is no longer in contact with the solenoid core. But since the difference actually goes the other way, the hypothesis is invalidated.

Wrt beltPhase, there is a difference with ayabAsync as well (I sample it at the extremum of the sensor waveform while ayab-firmware samples it at the first threshold crossing occurence -> bias to 'early' and more sensitive to threshold level) but it is less critical for Lace than others (South is more centered within the beltPhase waveform than North) -> shouldn't lead to a wrong beltshift selection for L but a more probable shift by 1 of the carriage position.

Interesting but since the waveforms are practically identical in the end between the two firmwares it seems unlikely there was a problem reading the belt phase signal. Note that the belt phase signal is not used to determine carriage position so an error reading it could not lead to a shift by 1 of the position.

However the carriage position is indeed reset by async-firmware at the first Hall sensor threshold crossing, so that could lead to a difference of 1 in assumed carriage position as you say. However, further on the bed, this would show up as a measured difference in the number of encoder edges between the belt phase change and the solenoid switch wouldn't it?

There may still be an idea here. It might be that Adrienne's 930 is such that reading the left Hall sensor at the threshold vs extremum does lead to a ±1 difference in carriage position, but this doesn't happen on my machine. I'll try to think of a test that Adrienne could run to see if that could explain the difference. Thanks for the suggestion.

Is my understanding of the issue below correct ?

All good questions, I'll let @Adrienne200 fill in as I'm interested in the answers as well.

@jpcornil-git
Copy link
Contributor

  • Reaction time
    It is expected that ayabAsync is slightly slower to react given that almost all of the processing is done outside of the ISR but the extra latency is not playing a role here as it is marginal (200us within a budget of 10ms assuming 100needles/sec for carriage speed or ~2% of the budget ). What may play a role on the other hand is the "ISR occupation", e.g. if ayab ISR is taking 1ms and Serial ISR 6ms but time between v1 raising/falling edges is only 5ms you have a problem/will miss at least a carriage position update.
    Unlikely the issue here as this would be an issue for all carriages (unless you move L, lighter, more quickly :-))
  • BeltPhase
    Unlikely the issue here, as I said more budget with L/South than other K/North + the whole pattern should be shifted.
  • Sensor threshold
    This will be compounded with the +/-1 needle error related to the ideal sampling grid, i.e. could make it +/-2 if pulse width is to wide (ayab-firmware takes the edge of the pulse iso center estimation for ayabAsync). This is also machine/carriage-specific btw

The "directionality" tends to indicate that there is a bias of the carriage position but I believe you tested that on simavr and didn't see a difference.

I'll wait for @Adrienne200 feedback as well.

@jonathanperret
Copy link
Contributor

It is expected that ayabAsync is slightly slower to react given that almost all of the processing is done outside of the ISR but the extra latency is not playing a role here as it is marginal (200us within a budget of 10ms assuming 100needles/sec for carriage speed or ~2% of the budget ).

Indeed, either value is safe. A quick note about the budget though, it looks like it is safer to consider it closer to 2ms than 10ms for the worst case. Here you can see I clocked my 910 (well, I was driving, officer 😅) doing about 460 needles/sec:

image

This was with all needles out of work — I wouldn't try running the lace carriage with needles in work at even close to that speed (actually I tried, and the result was bent needles — for science!). But still, I can imagine a small jerk of the hand while knitting at a regular speed could produce a large instantaneous speed for a needle or two.

But anyway, 650µs is still well within that tightened budget, so nothing to worry about here.

@jpcornil-git
Copy link
Contributor

... 0.5s to cross the needle bed was extremely fast, you should apply for a guinness book world record :-)

We should probably take 5ms/a 2x margin wrt expected worst case. Stress point for ayab-firmware will be when you are out of the active needles (less resistance/higher speed) and get Serial rx activity (consuming irq budget; ~100 instructions@16MHz or max. 10us/byte, and at 115200 baud about one byte/100us -> max. 10%/90% left/OK based on your measurement above).

Hopefully the issue is sticky, i.e. when not there it is never there, as this would point towards initial carriage positioning (given that timing is the same as ayabAsync that doesn't seem to have the issue but has a different alignment strategy). Directionality points into that direction as well (too early in one direction means too late in the other, former is usually OK but not the latter, mostly left->right => shifted to the right=too early detection ?) while speed is probably related to inertia/mechanical ability for the solenoid to still pull the armature while already passed for a needle or two ...

@jpcornil-git
Copy link
Contributor

jpcornil-git commented Dec 14, 2024

One thing you could do to test this hypothesis is to make a test build with START_OFFSET modified for KH910 when heading to the right (leave to the left as-is) with -3 or -2 timing vs -4 today (I believe) and leave KH930 as is => easy to switch between the two (same of KH910 when starting from the left) to check if it makes a difference or '-3' for KH910 and '-2' for KH930 to test sensitivity/offset or ...

@jonathanperret
Copy link
Contributor

jonathanperret commented Dec 14, 2024

OK so there may be something in this sensor hypothesis. At least, I’ve managed to create a configuration in simavr, where ayab-firmware fails but ayab-asyncFirmware doesn’t.

To do this I changed the width of the sensor detection area in simavr to be 5 needles wide instead of just 1. Note that to exhibit the bug you’d also need my patch that simulates the inability to grab an armature when the cam isn’t pushing it down (see jonathanperret/ayab-simavr@5514458).

Triggering the sensor for 5 needle widths seems excessive (on my 910 it is triggered for two needle widths), but not inconceivable if that machine’s sensor is particularly sensitive.

Looking at the code in ayab-firmware, what happens is that it keeps resetting the carriage position to 0 (well, 28 because of the annoying offset) as long as the magnet is detected. In effect, when travelling left-to-right this sets position 0 at the rightmost edge of the trigger area.

With a 5-needle wide trigger area, this means the position computed by ayab-firmware will be two needles less than the position computed by ayabAsync-firmware (assuming a symmetrical sensor output). This will cause ayab-firmware to activate solenoids two needles later, which we’ve seen can cause the symptoms described above.

Even a 3-needle wide area would cause a one needle position difference between the firmwares.

@jpcornil-git
Copy link
Contributor

jpcornil-git commented Dec 14, 2024

5 is probably a strech but 3 is very common and varies over time, e.g. from old ayabAsync log files with L carriage:

...
INFO:Analyzer:IndState: 0 3   1 L   (Regular    ,Left ) Hall:?     (  338,   368) [92 OK]
INFO:Analyzer:IndState: 0 3   0 L   (Regular    ,Left ) Hall:Left  (   16,   367) [cd OK] <= below 200
INFO:Analyzer:IndState: 0 3 255 L   (Regular    ,Left ) Hall:Left  (  230,   367) [92 OK] <= not far from 200
INFO:Analyzer:IndState: 0 3 254 L   (Regular    ,Left ) Hall:Left  (  266,   366) [c2 OK] <= not far from 200
INFO:Analyzer:IndState: 0 3 253 L   (Regular    ,Left ) Hall:?     (  374,   366) [04 OK]
... (same run/other direction)
INFO:Analyzer:IndState: 0 3 255 L   (Regular    ,Right) Hall:?     (  418,   367) [05 OK]
INFO:Analyzer:IndState: 0 3   0 L   (Regular    ,Right) Hall:Left  (   15,   367) [c8 OK] <= below 200
INFO:Analyzer:IndState: 0 3   1 L   (Regular    ,Right) Hall:Left  (  184,   367) [f9 OK] <= below 200
INFO:Analyzer:IndState: 0 3   2 L   (Regular    ,Right) Hall:Left  (  375,   367) [f4 OK]
INFO:Analyzer:IndState: 0 3   3 L   (Regular    ,Right) Hall:?     (  360,   367) [11 OK]
... (same run/right sensor)
INFO:Analyzer:IndState: 0 3 198 L   (Regular    ,Right) Hall:?     (  324,   305) [03 OK]
INFO:Analyzer:IndState: 0 3 199 L   (Regular    ,Right) Hall:Right (  325,    14) [aa OK] <= below 200
INFO:Analyzer:IndState: 0 3 200 L   (Regular    ,Right) Hall:Right (  327,    14) [b6 OK] <= below 200
INFO:Analyzer:IndState: 0 3 201 L   (Regular    ,Right) Hall:Right (  326,   179) [37 OK] <= below 200
INFO:Analyzer:IndState: 0 3 202 L   (Regular    ,Right) Hall:?     (  327,   349) [6a OK]
... (same run/other direction)
INFO:Analyzer:IndState: 0 3 201 L   (Regular    ,Left ) Hall:?     (  327,   301) [d1 OK]
INFO:Analyzer:IndState: 0 3 200 L   (Regular    ,Left ) Hall:Right (  327,    22) [e7 OK] <= below 200
INFO:Analyzer:IndState: 0 3 199 L   (Regular    ,Left ) Hall:Right (  326,    14) [c4 OK] <= below 200
INFO:Analyzer:IndState: 0 3 198 L   (Regular    ,Left ) Hall:Right (  326,   199) [42 OK] <= below 200
INFO:Analyzer:IndState: 0 3 197 L   (Regular    ,Left ) Hall:Right (  324,   347) [bd OK]
INFO:Analyzer:IndState: 0 3 196 L   (Regular    ,Left ) Hall:?     (  324,   367) [92 OK]
... other run/log
INFO:Analyzer:IndState: 0 3 198 L   (Shifted    ,Right) Hall:?     (  354,   309) [06 OK]
INFO:Analyzer:IndState: 0 3 199 L   (Shifted    ,Right) Hall:Right (  353,    14) [f5 OK] <= below 200
INFO:Analyzer:IndState: 0 3 200 L   (Shifted    ,Right) Hall:Right (  352,    14) [2c OK] <= below 200
INFO:Analyzer:IndState: 0 3 201 L   (Shifted    ,Right) Hall:Right (  353,   177) [c3 OK] <= below 200
INFO:Analyzer:IndState: 0 3 202 L   (Shifted    ,Right) Hall:Right (  352,   348) [56 OK]
...
INFO:Analyzer:IndState: 0 3 201 L   (Shifted    ,Left ) Hall:?     (  353,   294) [f0 OK] <= not (that) far from 200
INFO:Analyzer:IndState: 0 3 200 L   (Shifted    ,Left ) Hall:Right (  353,    14) [c4 OK] <= below 200
INFO:Analyzer:IndState: 0 3 199 L   (Shifted    ,Left ) Hall:Right (  353,    14) [5e OK] <= below 200
INFO:Analyzer:IndState: 0 3 198 L   (Shifted    ,Left ) Hall:Right (  353,   204) [20 OK] <= not far from 200
INFO:Analyzer:IndState: 0 3 196 L   (Shifted    ,Left ) Hall:Right (  353,   348) [ab OK]
...

@jpcornil-git
Copy link
Contributor

You may also enable continuous reporting in ayab-desktop to fetch sensor data and confirm or not the above hypothesis
FWIW, I just commited my local changes to my ayab-desktop/main fork, see jpcornil-git/ayab-desktop@32edada

@jonathanperret
Copy link
Contributor

So I just opened #218 — full writeup there but in essence what I'm trying is to use the left edge of the sensor signal instead of the right edge.

Note that I kind of like what ayabAsync-firmware does with finding the extremum of the signal, but I felt it would be complicated to replicate that logic in ayab-firmware at this point. And also, that using the left edge should add a difference of at most 1 needle, but in the "right" direction.

@jpcornil-git
Copy link
Contributor

jpcornil-git commented Dec 16, 2024

Just added some comments to #218 (mostly additional code cleanup), the same change should be applied to the right sensor (same problem -> you should use the first detection event as well).

Proposed changes should fix the current issue if we are correct about the root cause, only side-effect is that there will be an offset of 2 on average between the position sets by the left and right sensors but this should be OK (offset in a safe direction)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
Status: In Progress
Development

Successfully merging a pull request may close this issue.

10 participants