Build failures on non-amd64 architectures #48

Open
MTecknology opened this issue Sep 1, 2017 · 13 comments

Comments

@MTecknology

I noticed that tests fail on non-amd64 architectures. Failure logs are available on Debian's reproducible build project [1]. These failure logs were produced using rev:8ea508f. I've also reproduced this same behavior on rev:61afce4.

--- PASS: TestCollectionStatsClose (0.00s)
=== RUN   TestCollectionGet
--- PASS: TestCollectionGet (0.00s)
=== RUN   TestMossDGM
--- FAIL: TestMossDGM (0.00s)
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
        panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x804931b]

goroutine 1022 [running]:
testing.tRunner.func1(0x1881aa00)
        /usr/lib/go-1.8/src/testing/testing.go:622 +0x226
panic(0x81d40c0, 0x82c2248)
        /usr/lib/go-1.8/src/runtime/panic.go:489 +0x22a
sync/atomic.LoadUint64(0x186f42bc, 0x0, 0x0)
        /usr/lib/go-1.8/src/sync/atomic/asm_386.s:159 +0xb
github.com/couchbase/moss.(*dgmTest).getDGMStats(0x186f4240, 0x23d, 0x0, 0x82c6080)
        /build/golang-github-couchbase-moss-0.0~git20170828.0.61afce4/obj-i686-linux-gnu/src/github.com/couchbase/moss/dgm_moss_test.go:469 +0x88
github.com/couchbase/moss.TestMossDGM(0x1881aa00)
        /build/golang-github-couchbase-moss-0.0~git20170828.0.61afce4/obj-i686-linux-gnu/src/github.com/couchbase/moss/dgm_moss_test.go:887 +0x2f9
testing.tRunner(0x1881aa00, 0x8204518)
        /usr/lib/go-1.8/src/testing/testing.go:657 +0x7e
created by testing.(*T).Run
        /usr/lib/go-1.8/src/testing/testing.go:697 +0x242
exit status 2
FAIL    github.com/couchbase/moss       0.032s

[1] https://tests.reproducible-builds.org/debian/rb-pkg/buster/i386/golang-github-couchbase-moss.html

@steveyen
Member

steveyen commented Sep 1, 2017

Hi @MTecknology - thanks for the report!

Looks like the offending line of code (dgm_moss_test.go:469) is...

stats["numKeysWrite"] = atomic.LoadUint64(&dgm.numKeysWrite)

...so, it's doing some 64-bit maneuver. The builder seems to be i386, so it's possibly an alignment issue, as mentioned at https://golang.org/pkg/sync/atomic/#pkg-note-BUG

"On both ARM and x86-32, it is the caller's responsibility to arrange for 64-bit alignment of 64-bit words accessed atomically. The first word in a variable or in an allocated struct, array, or slice can be relied upon to be 64-bit aligned."

Unfortunately, I don't have a 32-bit environment to test a fix on, and I'm somewhat lethargic about getting my hands on one. Also, there's probably a not insignificant number of other alignment issues throughout the moss codebase (lots of atomic usage for stats), since the main dev work, afaik, is done by folks who are on 64-bit.

This might devolve into one of those things where this project needs to clearly advertise, for now, that it "works with 64-bit" platforms.

@MTecknology
Author

MTecknology commented Sep 2, 2017

I'm delighted you were able to quickly identify the problem, but the rest of the things you said have me a bit nervous. I've seen this issue across a number of packages; my first filing of bugs has been against couchbase/moss and blevesearch/bleve because they seemed like the best bet for a resolution that I could watch, learn from, and bring to the other projects (ideally patch 'n PR). Unfortunately, I'm not a strong developer and Golang is still very new to me so it's not likely I can do much to contribute to the fix.

If it helps, I would be more than eager to provide a 32-bit host! If the resolution can be turned into a formula/template or if anyone is interested in helping me understand at least that much of Golang, I would be eager to help get this fixed up all over (including the other packages).

@steveyen
Member

steveyen commented Sep 5, 2017

Hi @MTecknology,
Indeed, your nervousness is not unwarranted! The reason is that you happened to run into and raise a (legitimate!) issue that's not in my sweet spot and, honestly, also not high on my "keeps me up at night" priority list (i.e., 32-bit platforms).

Thanks for the offer of a 32-bit host, but I'd have to politely and sheepishly decline -- perhaps in the same way that I might sheepishly decline an offer to try a unicycle... I might damage both myself and, more importantly, the borrowed unicycle :-)

On a "formula" or "template" for how to address this, I'd probably end up dusting off memory-alignment articles to learn more, for example about adding padding fields to structs, and would have to chase down each struct padding problem one by one as it appears. Some links...

@mckaygerhard

Hi all, I have the same problem; I arrived here thanks to @MTecknology. @steveyen, you may not use 32-bit, but in our environments we cannot use 64-bit because the requirements are too high. We have a lot of hardware, but all of it is 32-bit and very old,
yet it works perfectly and is fully operational. Spending money on new 64-bit hardware makes no sense when the existing machines work fine; software is made to adapt to people, not people to software.

This is a very important point, so please don't forget to keep it among your task priorities.

@hisundar
Contributor

Hi @steveyen, @scottlashley, since the failure is in a test, and this specific test is only meant for detailed benchmark evaluations, I am wondering whether it might be easier to disable the problematic getDGMStats() call for normal go test runs and enable it only when a special flag like -dbpath has been specified.
This way 32-bit users will not be affected, and there won't be any impact on normal DGM tests, which would pass in -dbpath anyway.
Any thoughts?
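A rough Go sketch of the flag-gating idea above. Only the -dbpath flag name comes from the thread; the variable name, help text, and helper function are assumptions made for illustration, and the collector passed in is a stand-in for the real getDGMStats().

```go
package main

import (
	"flag"
	"fmt"
)

// dbPath mirrors the -dbpath flag mentioned in the thread; the variable
// name and help text are assumptions for this sketch.
var dbPath = flag.String("dbpath", "", "directory for the detailed DGM benchmark run")

// maybeCollectStats calls collect only when a db path was supplied, so a
// plain test run never reaches the 64-bit atomic loads.
func maybeCollectStats(path string, collect func() map[string]uint64) map[string]uint64 {
	if path == "" {
		return nil
	}
	return collect()
}

func main() {
	flag.Parse()
	stats := maybeCollectStats(*dbPath, func() map[string]uint64 {
		// Stand-in for the real getDGMStats() call.
		return map[string]uint64{"numKeysWrite": 0}
	})
	fmt.Println("stats collected:", stats != nil)
}
```

This hides the failing call rather than fixing the alignment, so other misaligned atomics elsewhere in the code would still surface on 32-bit builds.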

@steveyen
Member

Hi @hisundar, quick thought is I'd be a little bit leery of disabling some calls in one test in a special case. That said, it might work, but my guess instead is that this will reveal itself to be an overall multi-step issue, in that disabling or fixing one thing will just reveal the next thing and the next and the next -- e.g., perhaps several places where 32-bit is handled incorrectly.

Hi @mckaygerhard -- thank you for the comment. I understand what you are saying and your hopes for priority. To be clear, I would certainly not reject any efforts or fixes to make 32-bit work if folks make it happen, but unfortunately I do not have the time to work on it and make it happen myself.

Attempting a pun, which I hope translates well: so many things to do, and 32-bit doesn't quite reach the higher-order bits on that list. :-)

cheers,
steve

@MTecknology
Author

I hope it is someday fixed in a way that lets me copy/paste the resolution to other projects where I've seen this problem, but the point about higher-order bits is understood. For the Debian build system, I'm disabling tests that don't work on all architectures.
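As an alternative to disabling tests in the packaging, a test could skip itself on narrow platforms. The Go sketch below is an assumption-laden illustration, not what the Debian package or moss does: the helper and the recorder type are hypothetical, and the small interface lists the one method of *testing.T that the helper needs, so the sketch runs outside go test.

```go
package main

import (
	"fmt"
	"strconv"
)

// skipper is the part of *testing.T this helper needs.
type skipper interface {
	Skipf(format string, args ...interface{})
}

// skipUnless64Bit asks t to skip when int is narrower than 64 bits. With a
// real *testing.T, Skipf stops the test, so the return value only matters
// for stand-ins like recorder below.
func skipUnless64Bit(t skipper, intSize int) bool {
	if intSize < 64 {
		t.Skipf("needs a 64-bit platform, int is %d bits", intSize)
		return true
	}
	return false
}

// recorder is a stand-in for *testing.T that records the skip message.
type recorder struct{ msg string }

func (r *recorder) Skipf(format string, args ...interface{}) {
	r.msg = fmt.Sprintf(format, args...)
}

func main() {
	r := &recorder{}
	skipped := skipUnless64Bit(r, strconv.IntSize)
	fmt.Println("skipped:", skipped, r.msg)
}
```

Inside a test this would be called as skipUnless64Bit(t, strconv.IntSize) at the top of the test function. The check only covers the 32-bit case; the 64-bit ARM failure reported later in the thread would need a different condition.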

@MTecknology
Author

@steveyen Is it possible to reach out to you directly for a little bit of help? I now have this build failure and I'm struggling to figure out how to fix it. (I'm MTecknology on Freenode/OFTC if that works..)

@mckaygerhard

Hi @steveyen, thanks for answering; it demonstrates some care about the quality of the results, and I understand the amount of work that you and the team are doing. If I could, I'd analyze the problem and send patches, but my Go programming skills are very new. And so, umm, I noticed a pretty and elegant way of saying that there is no current interest in 32-bit!? Please don't forget that we are in need of it because of the packaging of Gitea for many distros such as Debian and Fedora.

@tyronelucero

Hello! I'm actually working with 32-bit too and, like mckaygerhard, I'm in need of some attention; not all of us have the latest technology... thanks!

@mckaygerhard

Hi @steveyen, I was able to reproduce the same problem on a 64-bit ARM setup, so the problem is not related to 32-bit only.

@steveyen
Member

hi @mckaygerhard -- thank you again for the report and more information. I am less sure what the 64-bit ARM issue is about -- it might be a different kind of issue -- since I am not as familiar with the ARM platform.

Also, unfortunately, as with the 32-bit issue, I do not have a 64-bit ARM system either. :-/

@mckaygerhard

OK, I understand, @steveyen. I posted to make clear that the problem is not related to 32-bit only; it comes from programming that assumes only 64-bit Intel/AMD. I think making an application for only one specific arch is not common practice (remember that ARM is so widespread now...).
