Unusable SingleStore deployment on m1 mac #72

Open
caseybrown89 opened this issue May 29, 2024 · 1 comment

caseybrown89 commented May 29, 2024

Describe the bug

I recently upgraded my Mac from macOS Monterey 12.5 to Sonoma 14.5, and Docker Desktop from 4.16.1 (engine 20.10.22, compose v2.15.1) to 4.30 (engine 26.1.1, compose v2.27.0-desktop.2).

Prior to upgrading the OS and Docker, the SingleStore database worked as expected, though it was a bit on the slow side. I was able to run integration tests against the database, which included activities like:

  1. Multiple schema creation (targeted at a schema per tenant in a multi-tenant application)
  2. Running of migrations (creation of tables, loading via INSERT statements)
  3. Execution of read and write queries across schemas

After upgrading, the local SingleStore database on Docker is no longer usable. The database fails in different ways across the three steps above, depending on the Docker Desktop configuration:

  1. Docker configuration: "Use virtualization framework" off; File sharing implementation: osxfs
     Result: Schema creation and migrations succeed (albeit very slowly), tests fail during execution of read and write queries
     Error:
     time="2024-05-28T15:40:23Z" level=error msg="Error 1777 (HY000): Partition xxx:0 has no master instance. This is likely because the node or nodes that hold a copy of the partition are down. Check for offline leaf nodes by running SHOW LEAVES and bring them back online to restore access to the partition"

  2. Docker configuration: "Use virtualization framework" off; File sharing implementation: gRPC FUSE
     Result: Schema creation and migrations succeed (albeit very slowly), tests fail during execution of read and write queries
     Error:
     time="2024-05-28T16:02:20Z" level=error msg="Error 1777 (HY000): Partition xxx:0 has no master instance. This is likely because the node or nodes that hold a copy of the partition are down. Check for offline leaf nodes by running SHOW LEAVES and bring them back online to restore access to the partition"

  3. Docker configuration: "Use virtualization framework" on; File sharing implementation: VirtioFS; Use Rosetta for x86_64/amd64 emulation on Apple Silicon: off
     Result: Migrations fail
     Error:
     286636318 2024-05-28 16:37:28.619 ERROR: Thread 99999 (ntid 342, conn id 29): ShardingAlterTableV6: Alter Table timed out sending PREPARE messages to all the leaves
     286636432 2024-05-28 16:37:28.620 WARN: Thread 99999 (ntid 342, conn id 29): operator(): Alter table on xxx.client has failed, rolling back transaction. Error: 2286: Operation ALTER timed out while waiting for concurrent operation to finish. Use SHOW PROCESSLIST to investigate long running concurrent operation, or consider increasing the value of alter statement's timeout value or the default_distributed_ddl_timeout global variable

  4. Docker configuration: "Use virtualization framework" on; File sharing implementation: VirtioFS; Use Rosetta for x86_64/amd64 emulation on Apple Silicon: on
     Result: Migrations fail
     Error:
     ==> /var/lib/memsql/ce0473ab-fc9f-45ae-a5ea-0e1c6c236947/tracelogs/memsql.log <==
     276996548 2024-05-28 20:31:03.363 ERROR: Thread 99999 (ntid 293, conn id 28): ShardingAlterTableV6: Alter Table timed out sending PREPARE messages to all the leaves
     276997002 2024-05-28 20:31:03.364 WARN: Thread 99999 (ntid 293, conn id 28): operator(): Alter table on xxx.client has failed, rolling back transaction. Error: 2286: Operation ALTER timed out while waiting for concurrent operation to finish. Use SHOW PROCESSLIST to investigate long running concurrent operation, or consider increasing the value of alter statement's timeout value or the default_distributed_ddl_timeout global variable

  5. Docker configuration: "Use virtualization framework" on; File sharing implementation: osxfs; Use Rosetta for x86_64/amd64 emulation on Apple Silicon: on
     Result: Migrations fail
     Error:
     ==> /var/lib/memsql/b91e2313-f74d-4cdb-847d-1bff188babe5/tracelogs/memsql.log <==
     286541859 2024-05-28 20:39:57.493 ERROR: Thread 99999 (ntid 332, conn id 28): ShardingAlterTableV6: Alter Table timed out sending PREPARE messages to all the leaves
     286542303 2024-05-28 20:39:57.494 WARN: Thread 99999 (ntid 332, conn id 28): operator(): Alter table on labsengpte.contact has failed, rolling back transaction. Error: 2286: Operation ALTER timed out while waiting for concurrent operation to finish. Use SHOW PROCESSLIST to investigate long running concurrent operation, or consider increasing the value of alter statement's timeout value or the default_distributed_ddl_timeout global variable

  6. Docker configuration: "Use virtualization framework" on; File sharing implementation: VirtioFS; Use Rosetta for x86_64/amd64 emulation on Apple Silicon: on; Remove migration file causing alter failures (deadlock)
     Result: Schema creation and migrations succeed (albeit very slowly), tests time out after 20 minutes
     Error: (none listed)
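
For anyone triaging this, the error messages above already name their own follow-up checks (SHOW LEAVES for error 1777, SHOW PROCESSLIST and the default_distributed_ddl_timeout variable for error 2286). A minimal Python sketch of that mapping is below. The function names are hypothetical, the `cursor` is assumed to be any DB-API cursor connected to the SingleStore container, and this is only a starting point for collecting diagnostics, not a fix.

```python
import re

# Follow-up statements taken from the text of the error messages quoted above.
# Reading the variable (rather than changing it) is a choice made for this sketch.
FOLLOW_UPS = {
    1777: ["SHOW LEAVES"],
    2286: [
        "SHOW PROCESSLIST",
        "SHOW GLOBAL VARIABLES LIKE 'default_distributed_ddl_timeout'",
    ],
}

# Matches both "Error 1777 (HY000)" and "Error: 2286:" as they appear in the logs.
ERROR_CODE = re.compile(r"Error:? (\d+)")


def follow_up_statements(log_line):
    """Return the diagnostic statements suggested for the error code in log_line."""
    match = ERROR_CODE.search(log_line)
    if match is None:
        return []
    return FOLLOW_UPS.get(int(match.group(1)), [])


def run_follow_ups(cursor, log_line):
    """Run each suggested statement on a DB-API cursor and collect the rows."""
    results = {}
    for statement in follow_up_statements(log_line):
        cursor.execute(statement)
        results[statement] = cursor.fetchall()
    return results
```

Capturing the SHOW LEAVES and SHOW PROCESSLIST output at the moment of failure would show whether the leaf node is actually offline or just slow to respond under emulation.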

To Reproduce
Steps to reproduce the behavior:

  1. It is hard to share the schemas and migration files causing the issues, as they belong to a proprietary application. I'm happy to share them privately or work toward a smaller, representative test case.
    1. At a high level, there are 36 migration files applied across three separate schemas (or "databases"). There are 36 tables in total in each schema (the migration files do not correspond 1-1 with the tables).
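
As a starting point for a smaller, representative test case, the shape described above (three schemas, 36 tables each, with ALTER statements in the mix) could be approximated with synthetic statements. This is only a sketch: the schema names, table names, and columns are invented placeholders and do not reflect the real application's migrations.

```python
def synthetic_migrations(schemas, tables_per_schema=36):
    """Yield (schema, statement) pairs for a synthetic multi-schema workload.

    All names and columns are placeholders; only the overall shape
    (several schemas, many tables, CREATE + ALTER + INSERT) mirrors the report.
    """
    for schema in schemas:
        yield schema, f"CREATE DATABASE IF NOT EXISTS {schema}"
        for i in range(tables_per_schema):
            table = f"t{i:02d}"
            yield schema, (
                f"CREATE TABLE {schema}.{table} "
                "(id BIGINT PRIMARY KEY, payload TEXT)"
            )
            # The reported failures occur during ALTER TABLE, so include one per table.
            yield schema, f"ALTER TABLE {schema}.{table} ADD COLUMN note TEXT"
            yield schema, (
                f"INSERT INTO {schema}.{table} (id, payload) VALUES (1, 'seed')"
            )


statements = list(synthetic_migrations(["tenant_a", "tenant_b", "tenant_c"]))
```

Each statement can then be executed in order through whatever client is at hand. If this synthetic workload reproduces the timeouts, it could be shared publicly in place of the proprietary migration files.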

Expected behavior

I expect the database to execute the migration files successfully and query performance to be reasonable. Prior to the macOS and Docker upgrade, the test suite worked locally, though its runtime was high (> 10 minutes end-to-end).

Desktop (please complete the following information):

  • OS: macOS Sonoma 14.5

  • Chip: Apple M1 Max

  • Docker version: 26.1.1

    • Resources:
      • 10 cores
      • 16GB RAM
      • 1GB swap
      • Virtual Disk Limit: 200GB
      • Filesystem: varies, see above
  • Image tag: ghcr.io/singlestore-labs/singlestoredb-dev:0.2.21

Additional context
The sql-migrate tool is used to execute the migrations.

@caseybrown89
Author

The last version of Docker Desktop on which I can get SingleStore to work is 4.27.2. The GUI lists its Docker engine version as 25.0.3, but according to the release notes it may actually be 25.0.2. Docker Desktop 4.28.0 is bundled with engine 25.0.3 and fails to run SingleStore in our application, which also leads me to believe that Docker Desktop 4.27.2 actually ships engine 25.0.2.

According to some Docker GitHub issues ("PHP-FPM issue in Docker Desktop 4.27.2: WARNING: [pool www] child 85 exited on signal 11 (SIGSEGV)" #7182, and "Mac M1 - after upgrade to Docker Desktop 4.27.1 docker container with java fails with qemu: uncaught target signal 11 (Segmentation fault) - core dumped"), engine 25.0.3 "fixes some issues with Rosetta and QEMU". That is where I would start looking for what may have changed.
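
To make the regression window explicit, the data points reported in this issue can be written down and the boundary computed. This is a small sketch in Python: the helper names are hypothetical, and the 4.16.1 observation was made on macOS Monterey, so it is not a like-for-like comparison with the Sonoma results.

```python
# Observations reported in this issue: (Docker Desktop version, SingleStore works?)
# Note: 4.16.1 was observed on macOS Monterey 12.5, before the OS upgrade.
OBSERVATIONS = [
    ("4.16.1", True),
    ("4.27.2", True),
    ("4.28.0", False),
    ("4.30", False),
]


def version_key(version):
    """Turn '4.27.2' into (4, 27, 2) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def regression_window(observations):
    """Return (newest working version, oldest failing version)."""
    good = [v for v, ok in observations if ok]
    bad = [v for v, ok in observations if not ok]
    return max(good, key=version_key), min(bad, key=version_key)


last_good, first_bad = regression_window(OBSERVATIONS)
```

With the observations above, the window is Docker Desktop 4.27.2 to 4.28.0, which matches the engine 25.0.2 to 25.0.3 change suspected in this comment.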
