out_doris: add new doris out plugin #9514

Status: Open (13 commits into base: master)

@joker-star-l commented Oct 23, 2024

Add a new Doris output plugin.

#9501

Enter [N/A] in the box, if an item is not applicable to your change.

Testing
Before we can approve your change; please submit the following in a comment:

  • Example configuration file for the change
[INPUT]
    Name cpu
    Interval_Sec 3

[OUTPUT]
    name doris
    match *
    host localhost
    port 8030
    user admin
    password admin
    database d_fb
    table t_fb
    time_key date
    header columns date, cpu_p, timestamp=from_unixtime(date), log=cast(cpu_p as string)
  • Debug log output from testing the change
Fluent Bit v3.2.0
* Copyright (C) 2015-2024 The Fluent Bit Authors
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
* https://fluentbit.io

______ _                  _    ______ _ _           _____  _____ 
|  ___| |                | |   | ___ (_) |         |____ |/ __  \
| |_  | |_   _  ___ _ __ | |_  | |_/ /_| |_  __   __   / /`' / /'
|  _| | | | | |/ _ \ '_ \| __| | ___ \ | __| \ \ / /   \ \  / /  
| |   | | |_| |  __/ | | | |_  | |_/ / | |_   \ V /.___/ /./ /___
\_|   |_|\__,_|\___|_| |_|\__| \____/|_|\__|   \_/ \____(_)_____/


[2024/10/23 18:16:58] [ info] Configuration:
[2024/10/23 18:16:58] [ info]  flush time     | 1.000000 seconds
[2024/10/23 18:16:58] [ info]  grace          | 5 seconds
[2024/10/23 18:16:58] [ info]  daemon         | 0
[2024/10/23 18:16:58] [ info] ___________
[2024/10/23 18:16:58] [ info]  inputs:
[2024/10/23 18:16:58] [ info]      cpu
[2024/10/23 18:16:58] [ info] ___________
[2024/10/23 18:16:58] [ info]  filters:
[2024/10/23 18:16:58] [ info] ___________
[2024/10/23 18:16:58] [ info]  outputs:
[2024/10/23 18:16:58] [ info]      doris.0
[2024/10/23 18:16:58] [ info] ___________
[2024/10/23 18:16:58] [ info]  collectors:
[2024/10/23 18:16:58] [ info] [fluent bit] version=3.2.0, commit=1828228f55, pid=47353
[2024/10/23 18:16:58] [debug] [engine] coroutine stack size: 24576 bytes (24.0K)
[2024/10/23 18:16:58] [ info] [storage] ver=1.5.2, type=memory, sync=normal, checksum=off, max_chunks_up=128
[2024/10/23 18:16:58] [ info] [cmetrics] version=0.9.8
[2024/10/23 18:16:58] [ info] [ctraces ] version=0.5.7
[2024/10/23 18:16:58] [ info] [input:cpu:cpu.0] initializing
[2024/10/23 18:16:58] [ info] [input:cpu:cpu.0] storage_strategy='memory' (memory only)
[2024/10/23 18:16:58] [debug] [cpu:cpu.0] created event channels: read=29 write=30
[2024/10/23 18:16:58] [debug] [doris:doris.0] created event channels: read=31 write=32
[2024/10/23 18:16:58] [ info] [output:doris:doris.0] worker #0 started
[2024/10/23 18:16:58] [ info] [output:doris:doris.0] worker #1 started
[2024/10/23 18:16:58] [ info] [sp] stream processor started
[2024/10/23 18:17:01] [debug] [task] created task=0x7f003c026da0 id=0 OK
[2024/10/23 18:17:01] [debug] [output:doris:doris.0] task_id=0 assigned to thread #0
[2024/10/23 18:17:01] [debug] [output:doris:doris.0] http body: [{"date":1729678620,"cpu_p":0.5925925925925926,"user_p":0.3333333333333333,"system_p":0.2592592592592593,"cpu0.p_cpu":1.0,"cpu0.p_user":0.0,"cpu0.p_system":1.0,"cpu1.p_cpu":0.3333333333333333,"cpu1.p_user":0.3333333333333333,"cpu1.p_system":0.0,"cpu2.p_cpu":1.0,"cpu2.p_user":0.6666666666666666,"cpu2.p_system":0.3333333333333333,"cpu3.p_cpu":1.0,"cpu3.p_user":1.0,"cpu3.p_system":0.0,"cpu4.p_cpu":1.0,"cpu4.p_user":0.6666666666666666,"cpu4.p_system":0.3333333333333333,"cpu5.p_cpu":0.0,"cpu5.p_user":0.0,"cpu5.p_system":0.0,"cpu6.p_cpu":0.6666666666666666,"cpu6.p_user":0.0,"cpu6.p_system":0.6666666666666666,"cpu7.p_cpu":0.6666666666666666,"cpu7.p_user":0.3333333333333333,"cpu7.p_system":0.3333333333333333,"cpu8.p_cpu":0.6666666666666666,"cpu8.p_user":0.3333333333333333,"cpu8.p_system":0.3333333333333333,"cpu9.p_cpu":0.6666666666666666,"cpu9.p_user":0.6666666666666666,"cpu9.p_system":0.0,"cpu10.p_cpu":1.0,"cpu10.p_user":0.3333333333333333,"cpu10.p_system":0.6666666666666666,"cpu11.p_cpu":0.0,"cpu11.p_user":0.0,"cpu11.p_system":0.0,"cpu12.p_cpu":0.6666666666666666,"cpu12.p_user":0.3333333333333333,"cpu12.p_system":0.3333333333333333,"cpu13.p_cpu":0.0,"cpu13.p_user":0.0,"cpu13.p_system":0.0,"cpu14.p_cpu":0.6666666666666666,"cpu14.p_user":0.6666666666666666,"cpu14.p_system":0.0,"cpu15.p_cpu":0.0,"cpu15.p_user":0.0,"cpu15.p_system":0.0,"cpu16.p_cpu":0.6666666666666666,"cpu16.p_user":0.0,"cpu16.p_system":0.6666666666666666,"cpu17.p_cpu":0.6666666666666666,"cpu17.p_user":0.6666666666666666,"cpu17.p_system":0.0}]
[2024/10/23 18:17:01] [debug] [upstream] KA connection #63 to localhost:8030 is connected
[2024/10/23 18:17:01] [debug] [http_client] not using http_proxy for header
[2024/10/23 18:17:01] [debug] [http_client] server localhost:8030 will close connection #63
[2024/10/23 18:17:01] [debug] [output:doris:doris.0] localhost:8030, HTTP status=307
(null)

[2024/10/23 18:17:01] [debug] [upstream] KA connection #64 to 127.0.0.1:8040 is connected
[2024/10/23 18:17:01] [debug] [http_client] not using http_proxy for header
[2024/10/23 18:17:02] [debug] [output:doris:doris.0] 127.0.0.1:8040, HTTP status=200
{
    "TxnId": 28028,
    "Label": "ce92e8a5-c4d6-430e-8e0d-3298f2f9c738",
    "Comment": "",
    "TwoPhaseCommit": "false",
    "Status": "Success",
    "Message": "OK",
    "NumberTotalRows": 1,
    "NumberLoadedRows": 1,
    "NumberFilteredRows": 0,
    "NumberUnselectedRows": 0,
    "LoadBytes": 1525,
    "LoadTimeMs": 218,
    "BeginTxnTimeMs": 17,
    "StreamLoadPutTimeMs": 127,
    "ReadDataTimeMs": 0,
    "WriteDataTimeMs": 24,
    "CommitAndPublishTimeMs": 49
}


[2024/10/23 18:17:02] [debug] [upstream] KA connection #64 to 127.0.0.1:8040 is now available
[2024/10/23 18:17:02] [debug] [out flush] cb_destroy coro_id=0
[2024/10/23 18:17:02] [debug] [task] destroy task=0x7f003c026da0 (task_id=0)
[2024/10/23 18:17:04] [debug] [task] created task=0x7f003c0670d0 id=0 OK
[2024/10/23 18:17:04] [debug] [output:doris:doris.0] task_id=0 assigned to thread #1
[2024/10/23 18:17:04] [debug] [output:doris:doris.0] http body: [{"date":1729678623,"cpu_p":3.444444444444445,"user_p":3.0,"system_p":0.4444444444444444,"cpu0.p_cpu":8.666666666666666,"cpu0.p_user":8.0,"cpu0.p_system":0.6666666666666666,"cpu1.p_cpu":0.6666666666666666,"cpu1.p_user":0.6666666666666666,"cpu1.p_system":0.0,"cpu2.p_cpu":5.0,"cpu2.p_user":4.333333333333333,"cpu2.p_system":0.6666666666666666,"cpu3.p_cpu":1.333333333333333,"cpu3.p_user":1.333333333333333,"cpu3.p_system":0.0,"cpu4.p_cpu":12.33333333333333,"cpu4.p_user":11.33333333333333,"cpu4.p_system":1.0,"cpu5.p_cpu":0.0,"cpu5.p_user":0.0,"cpu5.p_system":0.0,"cpu6.p_cpu":1.0,"cpu6.p_user":0.3333333333333333,"cpu6.p_system":0.6666666666666666,"cpu7.p_cpu":2.0,"cpu7.p_user":2.0,"cpu7.p_system":0.0,"cpu8.p_cpu":6.666666666666667,"cpu8.p_user":6.0,"cpu8.p_system":0.6666666666666666,"cpu9.p_cpu":5.0,"cpu9.p_user":5.0,"cpu9.p_system":0.0,"cpu10.p_cpu":1.333333333333333,"cpu10.p_user":0.6666666666666666,"cpu10.p_system":0.6666666666666666,"cpu11.p_cpu":0.0,"cpu11.p_user":0.0,"cpu11.p_system":0.0,"cpu12.p_cpu":4.333333333333333,"cpu12.p_user":3.666666666666667,"cpu12.p_system":0.6666666666666666,"cpu13.p_cpu":0.6666666666666666,"cpu13.p_user":0.3333333333333333,"cpu13.p_system":0.3333333333333333,"cpu14.p_cpu":3.666666666666667,"cpu14.p_user":2.333333333333333,"cpu14.p_system":1.333333333333333,"cpu15.p_cpu":0.0,"cpu15.p_user":0.0,"cpu15.p_system":0.0,"cpu16.p_cpu":3.333333333333333,"cpu16.p_user":3.333333333333333,"cpu16.p_system":0.0,"cpu17.p_cpu":6.0,"cpu17.p_user":4.666666666666667,"cpu17.p_system":1.333333333333333}]
[2024/10/23 18:17:04] [debug] [upstream] KA connection #63 to localhost:8030 is connected
[2024/10/23 18:17:04] [debug] [http_client] not using http_proxy for header
[2024/10/23 18:17:04] [debug] [http_client] server localhost:8030 will close connection #63
[2024/10/23 18:17:04] [debug] [output:doris:doris.0] localhost:8030, HTTP status=307
(null)

[2024/10/23 18:17:04] [debug] [upstream] KA connection #64 to 127.0.0.1:8040 is connected
[2024/10/23 18:17:04] [debug] [http_client] not using http_proxy for header
[2024/10/23 18:17:04] [debug] [output:doris:doris.0] 127.0.0.1:8040, HTTP status=200
{
    "TxnId": 28029,
    "Label": "1be0852a-1bbb-4029-a643-11c3cc2b426e",
    "Comment": "",
    "TwoPhaseCommit": "false",
    "Status": "Success",
    "Message": "OK",
    "NumberTotalRows": 1,
    "NumberLoadedRows": 1,
    "NumberFilteredRows": 0,
    "NumberUnselectedRows": 0,
    "LoadBytes": 1537,
    "LoadTimeMs": 55,
    "BeginTxnTimeMs": 0,
    "StreamLoadPutTimeMs": 8,
    "ReadDataTimeMs": 0,
    "WriteDataTimeMs": 19,
    "CommitAndPublishTimeMs": 26
}


[2024/10/23 18:17:04] [debug] [upstream] KA connection #64 to 127.0.0.1:8040 is now available
[2024/10/23 18:17:04] [debug] [out flush] cb_destroy coro_id=0
[2024/10/23 18:17:04] [debug] [task] destroy task=0x7f003c0670d0 (task_id=0)
[2024/10/23 18:17:07] [debug] [task] created task=0x7f003c027190 id=0 OK
[2024/10/23 18:17:07] [debug] [output:doris:doris.0] task_id=0 assigned to thread #0
[2024/10/23 18:17:07] [debug] [output:doris:doris.0] http body: [{"date":1729678626,"cpu_p":14.07407407407407,"user_p":12.40740740740741,"system_p":1.666666666666667,"cpu0.p_cpu":13.66666666666667,"cpu0.p_user":10.66666666666667,"cpu0.p_system":3.0,"cpu1.p_cpu":14.0,"cpu1.p_user":12.0,"cpu1.p_system":2.0,"cpu2.p_cpu":15.0,"cpu2.p_user":14.0,"cpu2.p_system":1.0,"cpu3.p_cpu":23.66666666666667,"cpu3.p_user":23.0,"cpu3.p_system":0.6666666666666666,"cpu4.p_cpu":22.0,"cpu4.p_user":19.0,"cpu4.p_system":3.0,"cpu5.p_cpu":5.333333333333333,"cpu5.p_user":4.333333333333333,"cpu5.p_system":1.0,"cpu6.p_cpu":20.33333333333333,"cpu6.p_user":20.0,"cpu6.p_system":0.3333333333333333,"cpu7.p_cpu":8.666666666666666,"cpu7.p_user":7.0,"cpu7.p_system":1.666666666666667,"cpu8.p_cpu":16.0,"cpu8.p_user":12.33333333333333,"cpu8.p_system":3.666666666666667,"cpu9.p_cpu":8.333333333333334,"cpu9.p_user":8.333333333333334,"cpu9.p_system":0.0,"cpu10.p_cpu":8.0,"cpu10.p_user":5.666666666666667,"cpu10.p_system":2.333333333333333,"cpu11.p_cpu":16.66666666666667,"cpu11.p_user":15.0,"cpu11.p_system":1.666666666666667,"cpu12.p_cpu":16.0,"cpu12.p_user":12.33333333333333,"cpu12.p_system":3.666666666666667,"cpu13.p_cpu":17.66666666666667,"cpu13.p_user":16.66666666666667,"cpu13.p_system":1.0,"cpu14.p_cpu":8.333333333333334,"cpu14.p_user":8.0,"cpu14.p_system":0.3333333333333333,"cpu15.p_cpu":8.0,"cpu15.p_user":7.333333333333333,"cpu15.p_system":0.6666666666666666,"cpu16.p_cpu":9.0,"cpu16.p_user":6.666666666666667,"cpu16.p_system":2.333333333333333,"cpu17.p_cpu":22.66666666666667,"cpu17.p_user":21.0,"cpu17.p_system":1.666666666666667}]
[2024/10/23 18:17:07] [debug] [upstream] KA connection #63 to localhost:8030 is connected
[2024/10/23 18:17:07] [debug] [http_client] not using http_proxy for header
[2024/10/23 18:17:07] [debug] [http_client] server localhost:8030 will close connection #63
[2024/10/23 18:17:07] [debug] [output:doris:doris.0] localhost:8030, HTTP status=307
(null)

[2024/10/23 18:17:07] [debug] [upstream] KA connection #64 to 127.0.0.1:8040 is connected
[2024/10/23 18:17:07] [debug] [http_client] not using http_proxy for header
[2024/10/23 18:17:07] [debug] [output:doris:doris.0] 127.0.0.1:8040, HTTP status=200
{
    "TxnId": 28030,
    "Label": "8c20c58f-ffd6-4aa1-8946-58c72a6b761a",
    "Comment": "",
    "TwoPhaseCommit": "false",
    "Status": "Success",
    "Message": "OK",
    "NumberTotalRows": 1,
    "NumberLoadedRows": 1,
    "NumberFilteredRows": 0,
    "NumberUnselectedRows": 0,
    "LoadBytes": 1553,
    "LoadTimeMs": 53,
    "BeginTxnTimeMs": 0,
    "StreamLoadPutTimeMs": 4,
    "ReadDataTimeMs": 0,
    "WriteDataTimeMs": 23,
    "CommitAndPublishTimeMs": 25
}


[2024/10/23 18:17:07] [debug] [upstream] KA connection #64 to 127.0.0.1:8040 is now available
[2024/10/23 18:17:07] [debug] [out flush] cb_destroy coro_id=1
[2024/10/23 18:17:07] [debug] [task] destroy task=0x7f003c027190 (task_id=0)
^C[2024/10/23 18:17:09] [engine] caught signal (SIGINT)
[2024/10/23 18:17:09] [ warn] [engine] service will shutdown in max 5 seconds
[2024/10/23 18:17:09] [ info] [input] pausing cpu.0
[2024/10/23 18:17:09] [ info] [engine] service has stopped (0 pending tasks)
[2024/10/23 18:17:09] [ info] [input] pausing cpu.0
[2024/10/23 18:17:09] [ info] [output:doris:doris.0] thread worker #0 stopping...
[2024/10/23 18:17:09] [ info] [output:doris:doris.0] thread worker #0 stopped
[2024/10/23 18:17:09] [ info] [output:doris:doris.0] thread worker #1 stopping...
[2024/10/23 18:17:09] [ info] [output:doris:doris.0] thread worker #1 stopped
  • Attached Valgrind output that shows no leaks or memory corruption was found
    c33568e835a2e883993e4a05fb03fcf

If this is a change to packaging of containers or native binaries then please confirm it works for all targets.

  • Run local packaging test showing all targets (including any new ones) build.
  • Set ok-package-test label to test for all targets (requires maintainer to do).

Documentation

  • Documentation required for this feature

fluent-bit-docs/pull/1483

Backporting

  • Backport to latest stable release.

Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.

@patrick-stephens
Contributor

Does this need additional dependencies?
Do we need to update the packaging builds for the releases for each target (specifically older ones like CentOS 7 if new deps are required)?

@joker-star-l
Author

This component has no new additional dependencies.

@joker-star-l
Author

joker-star-l commented Oct 31, 2024

It seems that MSVC does not have the __sync_fetch_and_add function, and <stdatomic.h> is only available from C11 onward. How can I implement atomic counting in a portable way? @patrick-stephens

@patrick-stephens
Contributor

Yes, you have to support all legacy targets, I'm afraid, so the various vendored libraries have atomic support in place. I'm not sure whether there is a general way to do it in Fluent Bit, @cosmo0920?

@cosmo0920
Contributor

We need to use InterlockedAdd on Windows as the equivalent of __sync_fetch_and_add.
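
The suggestion above can be sketched as a small wrapper that picks the right primitive at compile time. This is only an illustration, not the code from this PR: the names `doris_counter` and `doris_counter_fetch_add` are hypothetical. One detail worth noting is that `InterlockedAdd` returns the value after the addition, while `__sync_fetch_and_add` returns the value before it, so the sketch uses `InterlockedExchangeAdd64`, which also returns the previous value.

```c
#include <assert.h>
#include <stddef.h>

#if defined(_MSC_VER)
#include <windows.h>
typedef volatile LONG64 doris_counter;    /* hypothetical type name */
#else
#include <stdint.h>
typedef volatile int64_t doris_counter;   /* hypothetical type name */
#endif

/*
 * Atomically add `delta` to *counter and return the value the counter
 * held BEFORE the addition (fetch-and-add semantics on every platform).
 */
static long long doris_counter_fetch_add(doris_counter *counter, long long delta)
{
    assert(counter != NULL);
#if defined(_MSC_VER)
    /* Returns the previous value; InterlockedAdd64 would return the new one. */
    return (long long) InterlockedExchangeAdd64(counter, (LONG64) delta);
#else
    /* GCC/Clang builtin, also returns the previous value. */
    return (long long) __sync_fetch_and_add(counter, (int64_t) delta);
#endif
}
```

With a counter initialised to 5, `doris_counter_fetch_add(&c, 3)` returns 5 and leaves 8 in the counter on both code paths.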

@joker-star-l
Author

Thank you both!

@joker-star-l
Author

It will be in the review queue so be patient and wait for a review.

OK, thank you!

@github-actions
Contributor

This PR is stale because it has been open 45 days with no activity. Remove stale label or comment or this will be closed in 10 days.

github-actions bot added the Stale label Mar 16, 2025

@joker-star-l
Author

Hello community, is this PR still in review?

github-actions bot removed the Stale label Mar 23, 2025

@patrick-stephens
Contributor

It still seems to be under development plus we cannot merge until the DCO check is satisfied.

Signed-off-by: composer <[email protected]>

@rohan-changejar

Is there a timeline for merging this PR into the main line?


/* Append headers */
flb_http_add_header(c, "format", 6, "json", 4);
flb_http_add_header(c, "Expect", 6, "100-continue", 12);

This "Expect: 100-continue" header makes the server return status code 100, which causes a retransmission. Can it be removed? If it is needed, could users add it manually?

Author

It is necessary for the Doris HTTP Stream Load API. Requiring users to add it themselves would make their configuration more complex.

Because of the "100-continue" header, Fluent Bit retries sending the data in group commit mode, and if the Doris table uses the Duplicate Key model, the logs end up duplicated.

This behavior is not user-friendly.

The plugin could also provide a parameter to control whether 100-continue is enabled.
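
A rough sketch of what such a switch could look like is shown here. It is not taken from this PR: the option name `expect_continue` and the `request_*` helpers are hypothetical stand-ins (the plugin itself calls `flb_http_add_header`, as in the snippet under review), used only to keep the example self-contained.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_HEADERS 8

struct header {
    const char *key;
    const char *val;
};

/* Hypothetical stand-in for the HTTP client's header list. */
struct request {
    struct header headers[MAX_HEADERS];
    int count;
};

/* Hypothetical plugin option: non-zero sends "Expect: 100-continue". */
struct doris_opts {
    int expect_continue;
};

/* Append one header; returns 0 on success, -1 if the list is full. */
static int request_add_header(struct request *r, const char *key, const char *val)
{
    assert(r != NULL);
    if (r->count >= MAX_HEADERS) {
        return -1;
    }
    r->headers[r->count].key = key;
    r->headers[r->count].val = val;
    r->count++;
    return 0;
}

/* Return 1 if a header with this exact key is present, 0 otherwise. */
static int request_has_header(const struct request *r, const char *key)
{
    int i;

    for (i = 0; i < r->count; i++) {
        if (strcmp(r->headers[i].key, key) == 0) {
            return 1;
        }
    }
    return 0;
}

/* Always send "format: json"; send "Expect" only when the option is set. */
static int append_headers(struct request *r, const struct doris_opts *opts)
{
    if (request_add_header(r, "format", "json") != 0) {
        return -1;
    }
    if (opts->expect_continue) {
        if (request_add_header(r, "Expect", "100-continue") != 0) {
            return -1;
        }
    }
    return 0;
}
```

Whether Stream Load works reliably without the header is the open question in this thread, so a default of enabled would preserve the behavior the author describes as necessary.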

@joker-star-l
Author

joker-star-l commented May 20, 2025

Are you sure it is caused by 100-continue?

Yes, I'm sure

Labels
docs-required ok-package-test Run PR packaging tests