Context broker stuck on startup when the number of tenants is bigger than the mongo connection pool #1704

Open
sw-libelium opened this issue Nov 8, 2024 · 2 comments · Fixed by #1712
Assignees
Labels
bug Something isn't working Fixed - needs validation

Comments

@sw-libelium

Description

OrionLD gets stuck on startup when the number of orion databases in MongoDB exceeds the size of the mongo connection pool. No error is reported, and the last log message is the following:

time=Friday 08 Nov 09:11:24 2024.098Z | lvl=INFO | corr=N/A | trans=N/A | from=N/A | srv=N/A | subsrv=N/A | comp=Orion | op=MongoGlobal.cpp[238]:mongoInit | msg=Connected to mongo at mongo-db:orion as user 'mongo'

Details

Setup

Using the following docker compose configuration (docker-compose.yml):

services:

  mongo-db:
    image: mongo:4.4
    hostname: mongo-db
    container_name: mongo-db
    networks:
      - fiware-core
    ports:
      - "27017:27017"
    volumes:
      - mongo-db-data:/data/db
      - mongo-db-config:/data/configdb

    environment:
      - MONGO_READ_CONCERN_LEVEL=local
      - MONGO_WRITE_CONCERN=w:majority
      - MONGO_INITDB_ROOT_USERNAME=mongo
      - MONGO_INITDB_ROOT_PASSWORD=mongo

  orion-ld:
    image: fiware/orion-ld:1.7.0
    hostname: orion-ld
    container_name: orion-ld
    networks:
      - fiware-core

    ports:
      - "1026:1026"

    depends_on:
      - mongo-db

    command: -logLevel DEBUG

    environment:
      - ORIONLD_MULTI_SERVICE=TRUE
      - ORIONLD_MONGO_HOST=mongo-db
      - ORIONLD_MONGO_USER=mongo
      - ORIONLD_MONGO_PASSWORD=mongo
      - ORIONLD_MONGO_POOL_SIZE=10

networks:
  fiware-core:
    driver: bridge

volumes:
  mongo-db-data:
  mongo-db-config:

And the following tenant creation script (named create_tenants.sh), which creates as many tenants as given in the first command line parameter:

#!/bin/bash

if [ $# -ne 1 ]; then
    echo "Usage: $0 <number_of_tenants>" >&2
    exit 1
fi

N_TENANTS=$1

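for i in $(seq 1 "$N_TENANTS"); do
    # --fail makes curl exit non-zero on HTTP error responses, so the check below catches them
    curl --fail --location --request POST 'http://localhost:1026/ngsi-ld/v1/entities' \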
        --header "NGSILD-Tenant: t$i" \
        --header 'Content-Type: application/json' \
        --data-raw '{
            "id": "urn:ngsi-ld:Device:001",
            "type": "Device",
            "name": {
                "type": "Property",
                "value": "Device001"
            }
        }'

    if [ $? -ne 0 ]; then
        echo "ERROR CREATING TENANT t$i" >&2
        exit 1
    fi
done

The Docker Compose version used is v2.29.7.

Replication

First make sure that the environment is completely new:

docker compose down -v

Then start the services:

docker compose up -d

Then run the script above to create more tenants (11) than the size of the mongo connection pool (10):

./create_tenants.sh 11

The script should give no error messages. Finally, restart orion-ld:

docker restart orion-ld

After the restart, the orion-ld logs look like this:

time=Friday 08 Nov 09:29:44 2024.435Z | lvl=TMP | corr=N/A | trans=N/A | from=N/A | srv=N/A | subsrv=N/A | comp=Orion | op=orionld.cpp[816]:versionInfo | msg=Version Info:
time=Friday 08 Nov 09:29:44 2024.435Z | lvl=TMP | corr=N/A | trans=N/A | from=N/A | srv=N/A | subsrv=N/A | comp=Orion | op=orionld.cpp[817]:versionInfo | msg=-----------------------------------------
time=Friday 08 Nov 09:29:44 2024.435Z | lvl=TMP | corr=N/A | trans=N/A | from=N/A | srv=N/A | subsrv=N/A | comp=Orion | op=orionld.cpp[818]:versionInfo | msg=orionld version:    1.7.0
time=Friday 08 Nov 09:29:44 2024.435Z | lvl=TMP | corr=N/A | trans=N/A | from=N/A | srv=N/A | subsrv=N/A | comp=Orion | op=orionld.cpp[819]:versionInfo | msg=based on orion:     1.15.0-next
time=Friday 08 Nov 09:29:44 2024.435Z | lvl=TMP | corr=N/A | trans=N/A | from=N/A | srv=N/A | subsrv=N/A | comp=Orion | op=orionld.cpp[820]:versionInfo | msg=core @context:      https://uri.etsi.org/ngsi-ld/v1/ngsi-ld-core-context-v1.6.jsonld
time=Friday 08 Nov 09:29:44 2024.435Z | lvl=TMP | corr=N/A | trans=N/A | from=N/A | srv=N/A | subsrv=N/A | comp=Orion | op=orionld.cpp[821]:versionInfo | msg=git hash:           nogitversion
time=Friday 08 Nov 09:29:44 2024.436Z | lvl=TMP | corr=N/A | trans=N/A | from=N/A | srv=N/A | subsrv=N/A | comp=Orion | op=orionld.cpp[822]:versionInfo | msg=build branch:       
time=Friday 08 Nov 09:29:44 2024.436Z | lvl=TMP | corr=N/A | trans=N/A | from=N/A | srv=N/A | subsrv=N/A | comp=Orion | op=orionld.cpp[823]:versionInfo | msg=compiled by:        root
time=Friday 08 Nov 09:29:44 2024.436Z | lvl=TMP | corr=N/A | trans=N/A | from=N/A | srv=N/A | subsrv=N/A | comp=Orion | op=orionld.cpp[824]:versionInfo | msg=compiled in:        
time=Friday 08 Nov 09:29:44 2024.436Z | lvl=TMP | corr=N/A | trans=N/A | from=N/A | srv=N/A | subsrv=N/A | comp=Orion | op=orionld.cpp[825]:versionInfo | msg=-----------------------------------------
time=Friday 08 Nov 09:29:44 2024.439Z | lvl=TMP | corr=N/A | trans=N/A | from=N/A | srv=N/A | subsrv=N/A | comp=Orion | op=mongocInit.cpp[303]:mongocInit | msg=Connecting to mongo for the C driver (URI: mongodb://mongo:mongo@mongo-db/)
time=Friday 08 Nov 09:29:44 2024.463Z | lvl=TMP | corr=N/A | trans=N/A | from=N/A | srv=N/A | subsrv=N/A | comp=Orion | op=mongoConnectionPool.cpp[313]:mongoConnectionPoolInit | msg=Connecting to mongo for the C++ legacy driver
time=Friday 08 Nov 09:29:44 2024.572Z | lvl=INFO | corr=N/A | trans=N/A | from=N/A | srv=N/A | subsrv=N/A | comp=Orion | op=MongoGlobal.cpp[238]:mongoInit | msg=Connected to mongo at mongo-db:orion as user 'mongo'
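The symptom above (startup output ending at the mongoInit "Connected to mongo" line) can be checked mechanically. The following is a minimal sketch, not part of Orion-LD: the function name last_line_is_mongo_init is hypothetical, and the match string is taken from the log line shown in this report, so it may differ in other versions.

```shell
#!/bin/bash
# Hypothetical helper: read broker log lines on stdin and succeed if the
# final line is the "Connected to mongo" message logged by mongoInit,
# i.e. nothing was logged after it.
last_line_is_mongo_init() {
    tail -n 1 | grep -q 'mongoInit | msg=Connected to mongo'
}

# Example usage (assumes the container name from the compose file above):
#   docker logs orion-ld 2>&1 | last_line_is_mongo_init \
#       && echo "log ends at mongoInit; the broker may be stuck" >&2
```

A broker that has only just reached this point would also match, so the check is only a hint and is best combined with a request such as the /version call.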

If I try to get the version:

curl --location --request GET 'http://localhost:1026/version' -v

The result is curl: (56) Recv failure

Solution

Increasing ORIONLD_MONGO_POOL_SIZE solves the problem, which means that ORIONLD_MONGO_POOL_SIZE must be greater than the number of orion databases (tenants) in MongoDB. I have noticed that this count includes all databases whose names start with "orion" and excludes the rest.
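
As a rough way to apply this workaround, the check below counts the database names that start with "orion" and compares the count with a given pool size. It is a sketch under the assumptions stated in this report (prefix match on "orion", pool size strictly greater than the count). The function names are hypothetical, and the mongo shell invocation in the comment is an assumption about how to list database names in the mongo:4.4 container.

```shell
#!/bin/bash
# Hypothetical helper: count the database names on stdin (one per line)
# that start with "orion". grep -c exits non-zero when the count is 0,
# so "|| true" keeps the function usable under "set -e".
count_orion_dbs() {
    grep -c '^orion' || true
}

# Hypothetical helper: succeed only if the pool size is strictly greater
# than the number of orion databases, as the observation above suggests.
pool_is_large_enough() {
    local pool_size=$1 db_count=$2
    [ "$pool_size" -gt "$db_count" ]
}

# Example usage (assumption: the legacy mongo shell inside the mongo-db
# container and the root credentials from the compose file above):
#   n=$(docker exec mongo-db mongo --quiet -u mongo -p mongo \
#         --authenticationDatabase admin \
#         --eval 'db.getMongo().getDBNames().forEach(function (d) { print(d); })' \
#       | count_orion_dbs)
#   pool_is_large_enough 10 "$n" || echo "ORIONLD_MONGO_POOL_SIZE too small for $n databases" >&2
```

Running such a check before each broker restart would flag the condition early in a long-lived deployment where tenants accumulate slowly.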

Notes

Although I provide a script that rapidly creates the tenants, the error was detected in a SaaS environment where the tenants were created over many months.

@kzangeli
Collaborator

kzangeli commented Nov 8, 2024

ok, interesting. You found a corner case nobody has seen so far.
I'll try to make time to fix this problem asap, just, all next week I'm away for a European project meeting (aerOS).
I'll get something done there during the talks, I just can't promise anything ...
The week after I'm "free".

@kzangeli kzangeli self-assigned this Nov 8, 2024
@kzangeli kzangeli added the bug Something isn't working label Nov 8, 2024
@kzangeli kzangeli mentioned this issue Nov 8, 2024
kzangeli added a commit that referenced this issue Nov 28, 2024
@kzangeli kzangeli mentioned this issue Nov 28, 2024
@kzangeli
Collaborator

Sorry for taking so long but I've been quite swamped lately with issues that take precedence over anything else, from the European projects that actually pay my salary.
Anyway, I finished the last urgent feature this morning and got time over for this issue.
Hopefully the PR launched fixes the issue, even though you already have a pretty solid workaround for it.
So, after PR #1712 is merged (hopefully very soon), please test again and let me know.
