Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RSDK-7403: Improve remote camera clients. #4294

Merged
merged 35 commits into from
Sep 6, 2024

Conversation

dgottlieb
Copy link
Member

No description provided.

@viambot viambot added the safe to test This pull request is marked safe to test from a trusted zone label Aug 19, 2024
@viambot viambot added safe to test This pull request is marked safe to test from a trusted zone and removed safe to test This pull request is marked safe to test from a trusted zone labels Aug 19, 2024
@viambot viambot added safe to test This pull request is marked safe to test from a trusted zone and removed safe to test This pull request is marked safe to test from a trusted zone labels Aug 19, 2024
@viambot viambot added safe to test This pull request is marked safe to test from a trusted zone and removed safe to test This pull request is marked safe to test from a trusted zone labels Aug 19, 2024
@viambot viambot added safe to test This pull request is marked safe to test from a trusted zone and removed safe to test This pull request is marked safe to test from a trusted zone labels Aug 19, 2024
@viambot viambot added safe to test This pull request is marked safe to test from a trusted zone and removed safe to test This pull request is marked safe to test from a trusted zone labels Aug 19, 2024
@viambot viambot added safe to test This pull request is marked safe to test from a trusted zone and removed safe to test This pull request is marked safe to test from a trusted zone labels Aug 19, 2024
@viambot viambot removed the safe to test This pull request is marked safe to test from a trusted zone label Aug 19, 2024
@viambot viambot added safe to test This pull request is marked safe to test from a trusted zone and removed safe to test This pull request is marked safe to test from a trusted zone labels Aug 20, 2024
robot/web/stream/state/state.go Outdated Show resolved Hide resolved
robot/web/stream/server.go Outdated Show resolved Hide resolved
robot/web/stream/server.go Outdated Show resolved Hide resolved
robot/web/stream/state/state.go Outdated Show resolved Hide resolved
robot/web/stream/stream.go Show resolved Hide resolved
robot/impl/resource_manager.go Outdated Show resolved Hide resolved
robot/client/client.go Outdated Show resolved Hide resolved
components/camera/client.go Outdated Show resolved Hide resolved
grpc/shared_conn.go Show resolved Hide resolved
@viambot viambot added safe to test This pull request is marked safe to test from a trusted zone and removed safe to test This pull request is marked safe to test from a trusted zone labels Sep 5, 2024
Copy link
Member Author

@dgottlieb dgottlieb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pushed with a merge I screwed up -- but I think git figured it out.

Relying on CI to tell me if things are working.

// GetSleepTimeFromErrorCount returns a sleep time from an error count.
func (opts *BackoffTuningOptions) GetSleepTimeFromErrorCount(errorCount int) time.Duration {
if errorCount < 1 || opts == nil {
return 0
}
multiplier := math.Pow(2, float64(errorCount-1))
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bugfix: This would go wildly negative after few seconds of errors.

Cooldown: 5 * time.Second,
}
if err := streamFunc(opts); err != nil {
if err := streamFunc(&webstream.BackoffTuningOptions{}); err != nil {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Settings removed because it never worked. And now streamFunc disobeys the the input anyways. So make it explicit this code has no expectations about backoff. Removing the need to pass a BackoffTuningOptions was a bit noisy, so I'm avoiding doing that in this PR.

components/camera/client.go Show resolved Hide resolved
c.rtpPassthroughMu.Lock()
for _, tmp := range c.runningStreams {
if count.Add(1)%10000 == 0 {
c.logger.Infow("ReadRTP called. Sampling 1/10000", "count", count.Load(), "packetTs", pkt.Header.Timestamp)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed to debug

test.That(t, startCount.Load(), test.ShouldEqual, 2)
test.That(t, stopCount.Load(), test.ShouldEqual, 2)

// multiple Decrement() calls when the count is already at zero doesn't call any methods
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This assertion was no longer true. Excessive decrements were calling logger.Fatal. I believe that's how I inherited the PR (i might have smashed git history -- on multiple branches by accident).

Should definitely flag if there's a reason we expect multiple decrements to take place.

robot/impl/resource_manager.go Outdated Show resolved Hide resolved
robot/web/stream/server.go Outdated Show resolved Hide resolved
robot/web/stream/state/state.go Outdated Show resolved Hide resolved
robot/web/stream/state/state.go Outdated Show resolved Hide resolved
robot/web/stream/stream.go Show resolved Hide resolved
@dgottlieb
Copy link
Member Author

Test failure caught a stream state event handler leak -- taking a look:

2024-09-05T13:43:18.1171423Z     logging.go:136: 2024-09-05T13:40:22.218Z	ERROR	process._/tmp/TestShutdownshutdown_functionality3422911741/002/server.StdErr	pexec/managed_process.go:275	
2024-09-05T13:43:18.1171786Z         \_ goroutine leak(s) detected: found unexpected goroutines:
2024-09-05T13:43:18.1172913Z     logging.go:136: 2024-09-05T13:40:22.218Z	ERROR	process._/tmp/TestShutdownshutdown_functionality3422911741/002/server.StdErr	pexec/managed_process.go:275	
2024-09-05T13:43:18.1173776Z         \_ [Goroutine 368 in state select, with go.viam.com/rdk/robot/web/stream/state.(*StreamState).sourceEventHandler on top of the stack:
2024-09-05T13:43:18.1174879Z     logging.go:136: 2024-09-05T13:40:22.218Z	ERROR	process._/tmp/TestShutdownshutdown_functionality3422911741/002/server.StdErr	pexec/managed_process.go:275	
2024-09-05T13:43:18.1175053Z         \_ goroutine 368 [select]:
2024-09-05T13:43:18.1176160Z     logging.go:136: 2024-09-05T13:40:22.218Z	ERROR	process._/tmp/TestShutdownshutdown_functionality3422911741/002/server.StdErr	pexec/managed_process.go:275	
2024-09-05T13:43:18.1176716Z         \_ go.viam.com/rdk/robot/web/stream/state.(*StreamState).sourceEventHandler(0x4000f78be0)
2024-09-05T13:43:18.1177963Z     logging.go:136: 2024-09-05T13:40:22.218Z	ERROR	process._/tmp/TestShutdownshutdown_functionality3422911741/002/server.StdErr	pexec/managed_process.go:275	
2024-09-05T13:43:18.1178306Z         \_ 	/__w/rdk/rdk/robot/web/stream/state/state.go:166 +0x138
2024-09-05T13:43:18.1179441Z     logging.go:136: 2024-09-05T13:40:22.218Z	ERROR	process._/tmp/TestShutdownshutdown_functionality3422911741/002/server.StdErr	pexec/managed_process.go:275	
2024-09-05T13:43:18.1179667Z         \_ go.viam.com/utils.ManagedGo.func1()
2024-09-05T13:43:18.1180788Z     logging.go:136: 2024-09-05T13:40:22.218Z	ERROR	process._/tmp/TestShutdownshutdown_functionality3422911741/002/server.StdErr	pexec/managed_process.go:275	
2024-09-05T13:43:18.1181221Z         \_ 	/home/testbot/go/pkg/mod/go.viam.com/[email protected]/runtime.go:180 +0x54
2024-09-05T13:43:18.1182344Z     logging.go:136: 2024-09-05T13:40:22.218Z	ERROR	process._/tmp/TestShutdownshutdown_functionality3422911741/002/server.StdErr	pexec/managed_process.go:275	
2024-09-05T13:43:18.1182839Z         \_ go.viam.com/utils.PanicCapturingGoWithCallback.func1()
2024-09-05T13:43:18.1183969Z     logging.go:136: 2024-09-05T13:40:22.218Z	ERROR	process._/tmp/TestShutdownshutdown_functionality3422911741/002/server.StdErr	pexec/managed_process.go:275	
2024-09-05T13:43:18.1184412Z         \_ 	/home/testbot/go/pkg/mod/go.viam.com/[email protected]/runtime.go:164 +0x54
2024-09-05T13:43:18.1185509Z     logging.go:136: 2024-09-05T13:40:22.218Z	ERROR	process._/tmp/TestShutdownshutdown_functionality3422911741/002/server.StdErr	pexec/managed_process.go:275	
2024-09-05T13:43:18.1185979Z         \_ created by go.viam.com/utils.PanicCapturingGoWithCallback in goroutine 1
2024-09-05T13:43:18.1187077Z     logging.go:136: 2024-09-05T13:40:22.218Z	ERROR	process._/tmp/TestShutdownshutdown_functionality3422911741/002/server.StdErr	pexec/managed_process.go:275	
2024-09-05T13:43:18.1187515Z         \_ 	/home/testbot/go/pkg/mod/go.viam.com/[email protected]/runtime.go:151 +0x70
2024-09-05T13:43:18.1188623Z     logging.go:136: 2024-09-05T13:40:22.218Z	ERROR	process._/tmp/TestShutdownshutdown_functionality3422911741/002/server.StdErr	pexec/managed_process.go:275	
2024-09-05T13:43:18.1188778Z         \_ ]

@viambot viambot added safe to test This pull request is marked safe to test from a trusted zone and removed safe to test This pull request is marked safe to test from a trusted zone labels Sep 5, 2024
@viambot viambot added safe to test This pull request is marked safe to test from a trusted zone and removed safe to test This pull request is marked safe to test from a trusted zone labels Sep 5, 2024
@viambot viambot added safe to test This pull request is marked safe to test from a trusted zone and removed safe to test This pull request is marked safe to test from a trusted zone labels Sep 5, 2024
@viambot viambot added safe to test This pull request is marked safe to test from a trusted zone and removed safe to test This pull request is marked safe to test from a trusted zone labels Sep 5, 2024
Copy link
Member

@nicksanford nicksanford left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great, just some non blocking feedback to confirm you intended to delete some tests & some suggestions around logging level.

pc, ok := rpc.ContextPeerConnection(ctx)
server.logger.Infow("Adding video stream", "name", req.Name, "peerConn", pc)
defer server.logger.Warnf("AddStream END %s", req.Name)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to keep this at warn or downgrade this to Info?

Users may be surprised to see warning logs whenever they open a video stream.

// If it no longer exists, it:
// 1. calls RemoveTrack on the senders of all peer connections that called AddTrack on the camera name.
// 2. decrements the number of active peers on the stream state (which should result in the
// stream state having no subscribers and calling gostream.Stop() or rtppaserverthrough.Unsubscribe)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// stream state having no subscribers and calling gostream.Stop() or rtppaserverthrough.Unsubscribe)
// stream state having no subscribers and calling gostream.Stop() or rtp passthrough.Unsubscribe()

pc, ok := rpc.ContextPeerConnection(ctx)
server.logger.Infow("Adding video stream", "name", req.Name, "peerConn", pc)
defer server.logger.Warnf("AddStream END %s", req.Name)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
defer server.logger.Warnf("AddStream END %s", req.Name)
defer server.logger.Infof("AddStream END %s", req.Name)

ss.activePeers--
if ss.activePeers <= 0 {
ss.activePeers = 0
state.logger.Warn("rtp_passthrough not possible, falling back to GoStream", "err", err)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
state.logger.Warn("rtp_passthrough not possible, falling back to GoStream", "err", err)
state.logger.Info("rtp_passthrough not possible, falling back to GoStream", "err", err)

I don't think we want to emit warn logs for cameras like webcam, how about we just emit this at info level?

if err != nil {
ss.logger.CDebugw(ctx, "rtp_passthrough not possible, falling back to GoStream", "err", err.Error(), "name", ss.Stream.Name())
state.logger.Warnw("tick: rtp_passthrough not possible, falling back to GoStream", "err", err)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
state.logger.Warnw("tick: rtp_passthrough not possible, falling back to GoStream", "err", err)
state.logger.InfoW("tick: rtp_passthrough not possible, falling back to GoStream", "err", err)

}
ss.logger.CWarnw(ctx, "Stream using experimental H264 passthrough", "name", ss.Stream.Name())
ss.monitorSubscription(sub)
state.logger.Warnw("Stream using experimental H264 passthrough", "name", state.Stream.Name())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
state.logger.Warnw("Stream using experimental H264 passthrough", "name", state.Stream.Name())
state.logger.Infow("Stream using experimental H264 passthrough", "name", state.Stream.Name())

@@ -1912,232 +1910,6 @@ func TestConfigMethod(t *testing.T) {
test.That(t, actualCfg, test.ShouldResemble, &expectedCfg)
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just confirming, we want to be deleting TestReconnectRemote & TestReconnectRemoteChangeConfig?

@dgottlieb dgottlieb merged commit 33d9f80 into viamrobotics:main Sep 6, 2024
20 checks passed
@dgottlieb dgottlieb deleted the RSDK-7403 branch September 6, 2024 16:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
safe to test This pull request is marked safe to test from a trusted zone
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants