Fix stat_post_url functionality (Fixes #43) #54

Open
wants to merge 1 commit into
base: master


@odensc odensc commented Aug 22, 2020

I fixed a few things here to generally bring the stat post functionality up to snuff.

  • Change Content-Type header for stat POST to application/json
    • Not sure why this was urlencoded before, nothing in the code seems to send urlencoded bodies - on_event uses query params
  • Fix sprintf usage in HttpClient
    • m_http_method was being nulled out on subsequent requests.
  • Fix Content-Length header string find operation
    • HTTP headers are case insensitive, so some servers would send back lower case content-length and the request would fail.
  • Set callback for http_stat_client
    • It seems this was the crux of the issue here, the request body callback in SLSManager was never actually hooked up.
  • Send properly formatted JSON
    • Before, all the stat objects were simply appended into one big string. I fixed this so it POSTs a properly formatted JSON array.
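
The case-insensitive Content-Length lookup from the list above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `find_content_length` is a hypothetical helper, and it assumes the raw header block uses CRLF line endings.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper (not part of the SLS codebase): returns the value of the
// Content-Length header, or -1 if the header is missing or malformed.
long find_content_length(const std::string& headers) {
    // Lower-case a copy so the header name matches regardless of the casing
    // the server used (HTTP field names are case-insensitive).
    std::string lower(headers);
    std::transform(lower.begin(), lower.end(), lower.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });

    const std::string key = "content-length:";
    std::size_t pos = 0;
    while (pos < lower.size()) {
        std::size_t eol = lower.find("\r\n", pos);
        if (eol == std::string::npos) eol = lower.size();

        // Match only at the start of a line, so "x-content-length" is ignored.
        if (lower.compare(pos, key.size(), key) == 0) {
            std::size_t i = pos + key.size();
            while (i < eol && (lower[i] == ' ' || lower[i] == '\t')) ++i;
            if (i == eol || !std::isdigit(static_cast<unsigned char>(lower[i]))) return -1;
            long value = 0;
            while (i < eol && std::isdigit(static_cast<unsigned char>(lower[i]))) {
                value = value * 10 + (lower[i] - '0');
                ++i;
            }
            return value;
        }
        pos = eol + 2;  // skip past the CRLF to the next header line
    }
    return -1;
}
```

Matching at line starts rather than with a plain substring search avoids false hits on headers whose names merely contain `content-length`.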

As a side note, the indentation and formatting of the repo seemed to be all over the place... so I hope I got that right. Also I don't use C++ much, so hopefully some more experienced eyes can take a look here.

Fixes #43.

@odensc odensc mentioned this pull request Aug 22, 2020
@Edward-Wu
Owner

hi, odensc
thanks for your great job!

@ravenium

Can confirm this is working, tested with docker and a couple of independent streams. FWIW it gets my thumbs up for a merge :)

Couple thoughts:
@odensc Whatever magic you worked with stats, can you see if some of these issues exist in on_event? I can see event triggers firing for on_connect but on_close doesn't seem to fire properly (maybe 10% of the time)

@Edward-Wu Is the fact that on_event is a bunch of URL parameters and stats is a json payload by design? It might make parsing more sane if they were both json POST bodies.
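
To make the suggestion concrete, here is a rough sketch of what serializing an event as a JSON body could look like. Nothing here exists in SLS: `json_escape` and `event_to_json` are hypothetical helpers, and the field names are simply the query parameters that on_event already sends.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: escapes a string for use inside a JSON string literal.
std::string json_escape(const std::string& s) {
    std::string out;
    for (unsigned char c : s) {
        switch (c) {
            case '"':  out += "\\\""; break;
            case '\\': out += "\\\\"; break;
            case '\n': out += "\\n";  break;
            case '\r': out += "\\r";  break;
            case '\t': out += "\\t";  break;
            default:
                if (c < 0x20) {
                    // Remaining control characters use the \u00XX form.
                    char buf[7];
                    std::snprintf(buf, sizeof(buf), "\\u%04x", static_cast<unsigned>(c));
                    out += buf;
                } else {
                    out += static_cast<char>(c);
                }
        }
    }
    return out;
}

// Hypothetical helper: builds a flat JSON object of string fields, in order.
std::string event_to_json(const std::vector<std::pair<std::string, std::string>>& fields) {
    std::string out = "{";
    for (std::size_t i = 0; i < fields.size(); ++i) {
        if (i != 0) out += ',';
        out += '"' + json_escape(fields[i].first) + "\":\"" +
               json_escape(fields[i].second) + '"';
    }
    out += '}';
    return out;
}
```

For example, `event_to_json({{"on_event", "on_close"}, {"role_name", "publisher"}})` would give a body that a receiver can parse the same way it parses the stats payload.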

@ravenium

ravenium commented Sep 28, 2020

2 more things:

  1. on_close works for output functions (e.g. pulling from SLS to something like OBS), but doesn't show up in the stats_url (which claims nobody is pulling). Tested with OBS on two different machines. So maybe there's something else missing?

  2. Forgot to say thanks to both of you - my C++ is terrible at best :)

@odensc
Author

odensc commented Oct 3, 2020

@ravenium on_event seemed to work in my limited testing. Can you share the logs from around the time when you close the stream? I'm wondering whether the HTTP request was attempted and just failed, or whether SLS even detected that the stream closed at all.

(also I just remembered your name, I have used your docker container for a few things :)

@ravenium

ravenium commented Oct 5, 2020

@odensc Awesome! :) I haven't been able to get our streaming group to give it a whirl until they get a way of receiving notifications. Well, that and they like RTMP's ability to preview in a browser. Bah!

I did a quick start, count to 5, stop with on_event turned on. Naturally the first time I tried, it worked for both on_connect and on_close, but here's one where it didn't:
2020-10-05 20:17:22:077 SLS INFO: [0x5579705e2520]CSLSRole::add_to_epoll, listener, sock=276841123, m_is_write=0, ret=0.
2020-10-05 20:17:22:077 SLS INFO: [0x5579706291a0]CSLSGroup::check_new_role, worker_number=0, listener=0x5579705e2520, add_to_epoll fd=276841123, role_map.size=1.
2020-10-05 20:17:40:390 SLS INFO: [0x557970604da0]CSLSSrt::libsrt_accept ok, new sock=276841122, 10.0.0.5:62977.
2020-10-05 20:17:40:390 SLS INFO: [0x5579705e2520]CSLSListener::handler, new client[10.0.0.5:62977], fd=276841122.
2020-10-05 20:17:40:390 SLS INFO: [0x5579705e2520]CSLSListener::handler, [10.0.0.5:62977], sid 'input/live/desktop'
2020-10-05 20:17:40:391 SLS INFO: [0x5579705e2520]CSLSListener::handler, new pub=0x55797067fb20, key_stream_name=input/live/desktop.
2020-10-05 20:17:40:391 SLS INFO: [0x7fcb3a581e88]CSLSMapData::add ok, key='input/live/desktop'.
2020-10-05 20:17:40:391 SLS INFO: [0x55796f800d48]CSLSMapPublisher::set_push_2_pushlisher, ok, publisher=0x55797067fb20, app_streamname=input/live/desktop, m_map_push_2_pushlisher.size()=1.
2020-10-05 20:17:40:391 SLS INFO: [0x5579706a1d40]CTCPRole::setup, create sock ok, m_fd=7.
2020-10-05 20:17:40:391 SLS INFO: [0x5579706a1d40]CTCPRole::setup, setsockopt reused ok, m_fd=7.
2020-10-05 20:17:40:392 SLS INFO: [0x5579706a1d40]CTCPRole::set_nonblock, set O_NONBLOCK ok, m_fd=7.
2020-10-05 20:17:40:392 SLS INFO: [0x5579706a1d40]CTCPRole::connect, ok, m_fd=7, host=10.0.0.10, port==8080.
2020-10-05 20:17:40:392 SLS INFO: [0x5579706a1d40]CHttpClient::generate_http_request, ok, m_url='http://10.0.0.10:8080/test?on_event=on_connect&role_name=publisher&srt_url=input/live/desktop&remote_ip=10.0.0.5&remote_port=62977', content len=0.
2020-10-05 20:17:40:393 SLS INFO: [0x5579705e2520]CSLSListener::handler, new publisher[10.0.0.5:62977], key_stream_name=input/live/desktop.
2020-10-05 20:17:40:393 SLS INFO: [0x55796f800e48]CSLSMapRelay::add_relay_manager, no relay conf info, app_uplive=input/live, stream_name=desktop.
2020-10-05 20:17:40:393 SLS INFO: [0x5579705e2520]CSLSListener::handler, m_map_pusher->add_relay_manager failed, new role[10.0.0.5:62977], key_stream_name=input/live/desktop.
2020-10-05 20:17:40:394 SLS INFO: [0x55797067fb20]CSLSRole::add_to_epoll, publisher, sock=276841122, m_is_write=0, ret=0.
2020-10-05 20:17:40:394 SLS INFO: [0x5579706291a0]CSLSGroup::check_new_role, worker_number=0, publisher=0x55797067fb20, add_to_epoll fd=276841122, role_map.size=2.
2020-10-05 20:17:40:536 SLS INFO: [0x5579706a1d40]CHttpClient::parse_http_response, m_response_code:'200', url='http://10.0.0.10:8080/test?on_event=on_connect&role_name=publisher&srt_url=input/live/desktop&remote_ip=10.0.0.5&remote_port=62977', http_method='POST'.
2020-10-05 20:17:40:536 SLS INFO: [0x5579706a1d40]CHttpClient::parse_http_response, m_response_content_length=0.
2020-10-05 20:17:40:536 SLS INFO: [0x5579706a1d40]CHttpClient::parse_http_response, finished, url='http://10.0.0.10:8080/test?on_event=on_connect&role_name=publisher&srt_url=input/live/desktop&remote_ip=10.0.0.5&remote_port=62977', http_method='POST', content_len=0.
2020-10-05 20:17:40:536 SLS INFO: [0x5579706a1d40]CHttpClient::recv, finished.
2020-10-05 20:17:40:536 SLS INFO: [0x55797067fb20]CSLSRole::check_http_client_response, http finished, publisher, http_url='http://10.0.0.10:8080/test', response_code=200, response=''.

Here's the event log when I clicked "stop streaming" in OBS:

2020-10-05 20:17:40:536 SLS INFO: [0x5579706a1d40]CTCPRole::close ok, m_fd=7.
2020-10-05 20:17:46:440 SLS INFO: [0x55797067fb20]CSLSRole::get_state, get_sock_state, ret=6, call invalid_srt.
2020-10-05 20:17:46:440 SLS ERROR: CSLSSrt::libsrt_neterrno, err=6003, Non-blocking call failure: transmission timed out.
2020-10-05 20:17:46:440 SLS INFO: [0x55797067fb20]CSLSRole::invalid_srt, close sock=276841122, m_state=2.
2020-10-05 20:17:46:440 SLS INFO: [0x55797067f540]CSLSSrt::libsrt_close, fd=276841122.
2020-10-05 20:17:46:441 SLS INFO: [0x5579706d4e20]CTCPRole::setup, create sock ok, m_fd=7.
2020-10-05 20:17:46:441 SLS INFO: [0x5579706d4e20]CTCPRole::setup, setsockopt reused ok, m_fd=7.
2020-10-05 20:17:46:441 SLS INFO: [0x5579706d4e20]CTCPRole::set_nonblock, set O_NONBLOCK ok, m_fd=7.
2020-10-05 20:17:46:441 SLS INFO: [0x5579706d4e20]CTCPRole::connect, ok, m_fd=7, host=10.0.0.10, port==8080.
2020-10-05 20:17:46:441 SLS INFO: [0x5579706d4e20]CHttpClient::generate_http_request, ok, m_url='http://10.0.0.10:8080/test?on_event=on_close&role_name=publisher&srt_url=input/live/desktop&remote_ip=10.0.0.5&remote_port=62977', content len=0

It looks like the second generated http request fails in some way. I strapped a pcap to the test web server I'm using and it (correctly) sees the entire on_connect exchange, but on_close only gets as far as a SYN/SYNACK, at which point it RSTs.

@fullbrightness

I spun up a new version of SLS this week and I'm not receiving the on_close call. I'm going to revert to an old version, as this is a critical call to our management service.

@odensc
Author

odensc commented Oct 24, 2020

@fullbrightness do you know what version it broke on? or are you saying this PR broke it?

@fullbrightness

fullbrightness commented Oct 24, 2020

@odensc I compiled early this week, but I didn't include this pull request, so probably the main version. The other instance I have running I compiled in June. I read your post before realizing this hadn't been merged back in yet.

@fullbrightness

[0x7fd29034c090]CHttpClient::generate_http_request, ok, m_url='http://:<>/sls/on_event?on_event=on_close&role_name=publisher&srt_url=video.tx/live/streamname-mv&remote_ip=<>&remote_port=59310', content len=0.

I see the event being generated but it's not actually happening. The on_connect works every time. The version that's not working is actually running on Ubuntu 18.04, whereas the version that is working is running in the cloud on 20.04.

@ravenium

> [0x7fd29034c090]CHttpClient::generate_http_request, ok, m_url='http://:<>/sls/on_event?on_event=on_close&role_name=publisher&srt_url=video.tx/live/streamname-mv&remote_ip=<>&remote_port=59310', content len=0.
>
> I see the event being generated but it's not actually happening. The on_connect works every time. The version that's not working is actually running on Ubuntu 18.04, whereas the version that is working is running in the cloud on 20.04.

That debug log looks awfully close to mine - I'm not a C++ guy (Golang is my crutch) but it looks like the request gets reset before it can even handshake.

@odensc
Author

odensc commented Oct 28, 2020

Seems like some sort of race condition possibly. I don't currently have the time to look into this in depth, as it's not a feature I require.

You could add a breakpoint/log to each step in HttpClient.cpp and see where it gets tripped up. Since generate_http_request is being logged with "ok," that means the next step is handler. If there was an error there, it should have been logged, but it wasn't - which means the TCP connection was established, as that's the next step. For some reason it's prematurely terminating that connection before the request data is sent.

{
info_str += '[';

Great job debugging and fixing this!

JSON response best practice is to always return a JSON object, not just an Array. I recommend that when sending this data in the stats request body, it be wrapped in an object like { workers: [...workers] }. This way, you can extend the data returned later without making breaking changes.

For reference, this is what I'm seeing in the request body:

[
  {
    port: '8080',
    role: 'player',
    pub_domain_app: 'input/live',
    stream_name: 'roger',
    url: 'output/live/roger',
    remote_ip: '192.168.1.114',
    remote_port: '61426',
    start_time: '2020-11-03 11:35:37',
    kbitrate: '11033'
  },
  {
    port: '8080',
    role: 'publisher',
    pub_domain_app: 'input/live',
    stream_name: 'roger',
    url: 'input/live/roger',
    remote_ip: '192.168.1.114',
    remote_port: '64291',
    start_time: '2020-11-03 11:35:27',
    kbitrate: '11765'
  },
  {
    port: '8080',
    role: 'listener',
    pub_domain_app: '',
    stream_name: '',
    url: '',
    remote_ip: '',
    remote_port: '',
    start_time: '2020-11-03 11:35:08',
    kbitrate: '0'
  }
]
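
The wrapping suggested above could look roughly like this on the sending side. It is only a sketch: `wrap_workers` is a hypothetical helper, not something in this PR, and it assumes the array has already been serialized to a valid JSON string.

```cpp
#include <string>

// Hypothetical helper: places an already-serialized JSON array under a
// top-level "workers" key, so further keys can be added later without
// breaking receivers that only read "workers".
std::string wrap_workers(const std::string& workers_array_json) {
    return "{\"workers\":" + workers_array_json + "}";
}
```

A receiver would then read `body.workers` instead of treating the whole body as the array.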


@timokorkalainen timokorkalainen Jan 26, 2021


Also, at least for me the response now starts with "[,{ port: ..." in which the comma of course breaks any JSON parsing. This happens when the first worker in the list actually has an empty info. Great direction though, really waiting for these changes to get merged!
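
One way to avoid the stray comma is to skip empty entries while joining, and to emit the separator only before the second and later entries that are actually written. The sketch below assumes each worker's info is already a serialized JSON object string (or empty); `join_json_array` is a hypothetical name, not a function in SLS.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: joins pre-serialized JSON objects into one JSON array,
// skipping empty entries so that no leading or doubled comma is produced.
std::string join_json_array(const std::vector<std::string>& items) {
    std::string out = "[";
    bool first = true;
    for (const std::string& item : items) {
        if (item.empty()) continue;  // a worker with no info contributes nothing
        if (!first) out += ',';
        out += item;
        first = false;
    }
    out += ']';
    return out;
}
```

Tracking `first` separately from the loop index is what keeps an empty first worker from producing `[,{...`.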


I don't expect any pull requests to be merged. It would be great if someone interested in maintaining the project would fork it, because there are a lot of very small improvements that would make it much more flexible and usable in different scenarios. I've personally moved on to a different solution for SRT streaming, but this project was a great help to me initially.


@ravenium ravenium Jan 28, 2021


Agreed, this project was one of the first I noticed and really got me going for open source SRT streaming. People get busy and such, so it's understandable that not everyone has time to keep up with their stuff. Luckily, it's inspired others to do their own versions - if it's ok to mention other projects here, I've been following voc/srtrelay (which is based in part on sls). The author wrote it in golang so a little more merciful to my coding wheelhouse.

Successfully merging this pull request may close these issues.

Statistics