
IW3 - Any possibility of a realtime mode? #319

Open
IkariDevGIT opened this issue Mar 3, 2025 · 95 comments

@IkariDevGIT

I really think adding a real-time mode could be a solid addition. I don’t know all the technical details on how it would work under the hood, but the benefits are definitely there. Here are a few ways it could be useful:

  • Streaming without having to wait for a full video to download. This could work for series, movies, anime, YouTube videos, or whatever else someone wants to watch.
  • Flatscreen gaming.

To give some context on how this would even be used, there's a program called Virtual Desktop that lets you remote into your PC from VR. It has an SBS mode, which means you can already watch anything on your PC in 3D if the output is SBS. If real-time support were possible, this could turn everything you see on your PC into 3D.

I actually think it would be pretty practical. I only have 12GB of VRAM and still manage to get around 26 FPS with IW3, and that’s with just a slight adjustment to the default config.

Not sure how realistic this is, but I’d love to hear your thoughts on it!

@nagadomi
Owner

nagadomi commented Mar 4, 2025

If you specify Video Format=mkv, you can play back the video during conversion.
SKYBOX and Pigasus also support SMB (Windows shared folders).
By playing the mkv over SMB, a video can be watched in near realtime while it is still being converted.

However, those who ask for realtime usually expect to be able to choose the video they want to play from the VR headset.
So I think the realtime request is really about UI, and what is needed is the development of a video player or a DLNA server.

@IkariDevGIT
Author

@nagadomi Thank you for the response!

What I’m suggesting isn’t about watching a file while it’s being converted. Rather, it’s about applying depth estimation and 3D conversion in real time to anything displayed on a PC screen, including games, web videos, or live streams. Essentially, the idea would be to process frames dynamically as they appear, rather than converting a pre-existing video file.

The reason I brought up Virtual Desktop is that it already allows you to stream your PC screen to VR with SBS support. If IW3 had a real-time mode, it could theoretically take the live video feed from a PC screen and apply its depth-based conversion before sending it to VR, effectively turning anything you see into 3D.

@nagadomi
Owner

nagadomi commented Mar 4, 2025

Maybe it is technically possible.

  • ffmpeg can take screen recordings as input and output realtime stream such as RTMP.
  • Since iw3 uses ffmpeg bindings, it can input the desktop stream, convert it to 3D, and output another stream.
  • Several VR video players can play streaming video. (The oculus browser's HTML5 video player supports SBS display, so it may be possible to play streaming video with it.)

However, for anything other than fullscreen video playback, the results may not be good.
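The bullets above could be wired together roughly as follows. This is a hypothetical sketch, not iw3's actual code: the capture device (`gdigrab`, which is ffmpeg's Windows desktop capture; `x11grab` on Linux), the encoder flags, the RTMP URL, and the `run` helper are all illustrative assumptions.

```python
import subprocess

def capture_cmd(width=1920, height=1080, fps=30):
    # ffmpeg captures the desktop and writes raw RGB frames to stdout
    return ["ffmpeg", "-f", "gdigrab", "-framerate", str(fps),
            "-video_size", f"{width}x{height}", "-i", "desktop",
            "-pix_fmt", "rgb24", "-f", "rawvideo", "-"]

def stream_cmd(width=1920, height=1080, fps=30,
               url="rtmp://localhost/live/desktop"):
    # reads converted SBS frames (double width) from stdin, pushes RTMP
    return ["ffmpeg", "-f", "rawvideo", "-pix_fmt", "rgb24",
            "-video_size", f"{width * 2}x{height}", "-framerate", str(fps),
            "-i", "-", "-c:v", "libx264", "-preset", "ultrafast",
            "-tune", "zerolatency", "-f", "flv", url]

def run(width=1920, height=1080):
    # not executed here; requires ffmpeg and a desktop to capture
    src = subprocess.Popen(capture_cmd(width, height), stdout=subprocess.PIPE)
    dst = subprocess.Popen(stream_cmd(width, height), stdin=subprocess.PIPE)
    frame_size = width * height * 3
    row = width * 3
    while (frame := src.stdout.read(frame_size)):
        # placeholder for the real 3D conversion: duplicate each row
        # left/right to build a side-by-side frame of double width
        sbs = b"".join(frame[i:i + row] + frame[i:i + row]
                       for i in range(0, len(frame), row))
        dst.stdin.write(sbs)
```

The point of the two-process layout is that capture, conversion, and encoding each run concurrently, which is what makes near-realtime throughput plausible.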

@nagadomi
Owner

nagadomi commented Mar 5, 2025

If a toy-level version is acceptable (an HTML5 video player fed by an MJPEG stream, maybe laggy),
I can develop it in a day. I'll give it a try.

@nagadomi
Owner

nagadomi commented Mar 5, 2025

I implemented this.
It works better than I expected, but when I look at GUI windows other than fullscreen video, the image is slightly distorted and gives me 3D sickness.

@nagadomi
Owner

nagadomi commented Mar 5, 2025

If anyone gets interested, please give it a try.

Switch to the dev branch first:
https://github.com/nagadomi/nunif/blob/dev/windows_package/docs/README.md#dev-branch

then see
https://github.com/nagadomi/nunif/blob/dev/iw3/docs/desktop.md

@IkariDevGIT
Author

I tested it, and I’m genuinely impressed that you were able to implement this in just one day. It works pretty well!

I had a couple of questions and suggestions regarding performance optimizations:

  1. Batch Size for Real-Time Mode – Is batch size adjustable in real-time mode? Allowing it to process multiple frames at once could help improve performance on lower-end GPUs, even if it introduces some latency.

  2. Frame Interpolation for Depth Processing – A potential optimization could be an option to process depth estimation only every n frames (e.g., every 2 frames) and interpolate the depth information between them. This would likely introduce some latency but could significantly improve FPS while maintaining reasonable depth accuracy.
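The "every n frames" idea above could be sketched like this. `estimate_depth` is a stand-in for the real model call, and the linear blend between keyframe depth maps is an assumption about how the interpolation would work; note that the in-between frames can only be emitted once the next keyframe's depth exists, which is the latency mentioned above.

```python
import numpy as np

def depth_stream(frames, estimate_depth, n=2):
    """Yield one depth map per frame, running the model only every n frames.

    Skipped frames get a linear interpolation between the surrounding
    keyframe depth maps (adds up to n-1 frames of latency).
    """
    prev_depth = None
    prev_idx = 0
    for i, frame in enumerate(frames):
        if i % n == 0:
            depth = estimate_depth(frame)
            if prev_depth is not None:
                # emit interpolated depths for the frames we skipped
                for j in range(prev_idx + 1, i):
                    t = (j - prev_idx) / (i - prev_idx)
                    yield (1 - t) * prev_depth + t * depth
            yield depth
            prev_depth, prev_idx = depth, i
```

With n=2 the model runs on half the frames, so the depth-estimation cost roughly halves while every frame still gets a (approximate) depth map.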

Let me know what you think! I’d be happy to test any further improvements.

@IkariDevGIT
Author

Little follow-up to my last message:

I wanted to propose an additional optimization that could help improve performance in real-time mode. Instead of processing depth estimation for every frame, an adaptive motion-based depth frame skipping mechanism could be implemented.

The idea is to dynamically adjust depth processing frequency based on motion intensity:

  • High Motion: If significant movement is detected between consecutive frames, depth estimation would be computed normally for each frame.
  • Low/No Motion: If little to no movement is detected, instead of recalculating depth, the previous depth frame would be reused or interpolated between the depth frames for rendering, reducing the computational load.

Combining motion-based depth frame skipping, depth frame interpolation, and a proper batch size implementation for real-time mode (this may already exist, but my testing didn’t yield noticeable improvements) could lead to significantly better performance and higher FPS.
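The motion-gated idea above could be sketched like this. Everything here is an assumption for illustration: `estimate_depth` stands in for the real model call, and the mean-absolute-pixel-difference metric and its threshold are just one simple way to measure motion.

```python
import numpy as np

class MotionGatedDepth:
    """Recompute depth only when the frame changed enough."""

    def __init__(self, estimate_depth, threshold=4.0):
        self.estimate_depth = estimate_depth  # stand-in for the model
        self.threshold = threshold            # mean abs pixel diff (0-255)
        self.prev_frame = None
        self.prev_depth = None

    def __call__(self, frame):
        if self.prev_frame is not None:
            motion = np.abs(frame.astype(np.float32)
                            - self.prev_frame.astype(np.float32)).mean()
            if motion < self.threshold:
                # low motion: reuse the previous depth map
                self.prev_frame = frame
                return self.prev_depth
        # high motion (or first frame): recompute depth
        self.prev_frame = frame
        self.prev_depth = self.estimate_depth(frame)
        return self.prev_depth
```

On a mostly static desktop this skips the model entirely most of the time; one caveat is that small drifts can accumulate between recomputations, so a real version might force a refresh every N frames regardless.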

@nagadomi
Owner

nagadomi commented Mar 6, 2025

I will consider improving the FPS, but when using Any_V2_S, it seems that the performance of the depth estimation model is not the bottleneck.

On my machine (RTX 3070 Ti), Any_V2_S (resolution=392) itself achieves about 120 FPS even with batch-size=1.
But with python -m iw3.desktop --depth-model Any_V2_S it only achieves about 14 FPS.

Increasing the batch size will be helpful for improvement because it can improve everything including device memory transfer, image resizing and warping, etc.

@IkariDevGIT
Author

Wait, what? I'm barely getting 17 FPS in normal mode and around 7–9 FPS in desktop mode.

My specs:
CPU: Intel Core i9-10850K
RAM: 16GB (3200MHz)
Storage: Installed on an SSD
GPU: RTX 3060

Any idea what could be causing this?

@nagadomi
Owner

nagadomi commented Mar 6, 2025

I made some minor improvements to iw3.desktop: 14 FPS -> 24 FPS. JPEG encoding was slow, so I changed that.


120 FPS is the performance of the depth estimation model itself, not the video conversion.

def _bench():
    import time
    import torch
    from iw3.depth_anything_model import DepthAnythingModel

    B = 4    # batch size
    N = 100  # number of timed iterations
    model = DepthAnythingModel("Any_L")
    model.load(gpu=0)
    x = torch.randn((B, 3, 392, 392)).cuda()
    model.infer(x)  # warmup
    torch.cuda.synchronize()
    with torch.no_grad():
        t = time.time()
        for _ in range(N):
            model.infer(x)
        torch.cuda.synchronize()
        print(round(1.0 / ((time.time() - t) / (B * N)), 4), "FPS")

I changed B = 4 -> B = 1, Any_L -> Any_V2_S,
then

python -m iw3.depth_anything_model

Video conversion speed depends on the resolution of the input video, but it can reach 30 FPS (*1) for HD and 100 FPS for SD on my machine.
*1: 30 FPS is the same result as with Any_B; most of the processing time is spent on things other than depth estimation.

If you think your env is too slow, check the following.

  • Depth Model: Any_V2_S
  • Depth Batch Size: 4
  • Worker Threads: 4
  • FP16: ON
  • Stream: ON
  • TTA: OFF
  • Low VRAM: OFF

Other settings are at their default values.

@IkariDevGIT
Author

After implementing these changes, my performance has improved to around 15-17 FPS on average, already a good improvement. Do you have any additional ideas for further optimizing performance?

Also, I believe the following features would greatly enhance usability, particularly when the user is not directly in front of the PC:

  • Simple controls: Basic input methods such as mouse clicks, scrolling, and keyboard input would make it much easier to use.
  • Audio transmission: The ability to transmit audio would be a valuable addition.

@Salmaun321

Hi, first of all, nice work!

I can get over 800 FPS in the benchmark if I set batch size = 8, but only about 27 FPS streaming. My specs:
CPU: Intel Core i5-13600K
RAM: 32GB
GPU: RTX 4090

It seems that changing the preset, model, CRF, etc. has little to no effect.

@nagadomi
Owner

nagadomi commented Mar 7, 2025

I parallelized screenshot capture and 3D conversion; on my machine this improved 24 FPS -> 48 FPS.
Also, probably due to browser limitations, video updates will not go above 30 FPS, so this should be sufficient performance.
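The parallelization described here can be sketched as a two-stage producer/consumer pipeline: one thread grabs screenshots into a small queue while the main loop converts them, so capture latency overlaps with conversion instead of adding to it. `grab` and `convert` are stand-ins for the real capture and 3D-conversion steps, not iw3's actual functions.

```python
import queue
import threading

def run_pipeline(grab, convert, n_frames):
    """Capture in a background thread while converting in the foreground."""
    frames = queue.Queue(maxsize=2)  # small buffer keeps latency low

    def producer():
        for _ in range(n_frames):
            frames.put(grab())
        frames.put(None)  # sentinel: capture finished

    threading.Thread(target=producer, daemon=True).start()
    results = []
    while (frame := frames.get()) is not None:
        results.append(convert(frame))
    return results
```

The bounded queue is the important design choice: with `maxsize=2` the producer can run at most two frames ahead, so throughput improves without the stream drifting seconds behind the desktop.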

@Salmaun321

it seems that changing the preset, model or CRF etc, makes little to no effect

iw3.desktop uses a sequence of JPEG images (MJPEG), not a video codec, so those options are ignored.
You can only specify the --stream-quality option for MJPEG (90 by default). Specifying a lower quality value reduces network traffic.

EDIT:
If you want to change the depth model, use --depth-model option. e.g, --depth-model Any_L
https://github.com/nagadomi/nunif/blob/dev/iw3/docs/desktop.md#stereo-setting
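For reference, an MJPEG endpoint like /stream.jpg is just a long-lived HTTP response that pushes JPEG frames separated by a multipart boundary. A minimal sketch of the wire format follows; the boundary name and header layout are illustrative, not necessarily iw3's actual values.

```python
BOUNDARY = b"frame"

def mjpeg_headers():
    """HTTP headers that tell the browser to expect an MJPEG stream."""
    return {"Content-Type":
            "multipart/x-mixed-replace; boundary=" + BOUNDARY.decode()}

def mjpeg_part(jpeg_bytes):
    """Wrap one encoded JPEG frame as a chunk of the multipart stream."""
    return (b"--" + BOUNDARY + b"\r\n"
            + b"Content-Type: image/jpeg\r\n"
            + ("Content-Length: %d\r\n\r\n" % len(jpeg_bytes)).encode()
            + jpeg_bytes + b"\r\n")
```

A server sends `mjpeg_headers()` once and then writes `mjpeg_part(frame)` for each new frame; the browser replaces the displayed image every time a part arrives, which is why codec options like preset and CRF do not apply.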

@Salmaun321

Salmaun321 commented Mar 8, 2025

That last update really helped; I'm getting ~60 FPS. It is still very stuttery, but much better than before.
I'm using the Pico 4 default browser. The issue now is that it doesn't get the right aspect ratio; I always get a square image no matter whether I set half or full SBS.

@loawizard

loawizard commented Mar 9, 2025

This is a dream come true for me! I only get 14 streaming FPS though, on a 4070 Ti. It's the same whether I stream locally on my PC or watch in the browser app on my mibox. Is there any way I can increase the stream speed? I'm currently using this command:

python -m iw3.desktop --depth-model Any_V2_S --divergence 2.75 --convergence 0.7

Thank you so much for doing this. If I can get it to 20-25 frames, that would be amazing!

@nagadomi
Owner

nagadomi commented Mar 9, 2025

@Salmaun321
I added the --full-sbs option. Half SBS is the default, because Meta Quest's browser only supports Half SBS.
python -m iw3.desktop --stream-fps 30 --full-sbs

@loawizard
You can set the target streaming FPS with the --stream-fps option (15 by default). Try adding --stream-fps 30.
Also, as far as I have tried, it does not work correctly when the streaming FPS is higher than 30, probably due to some browser limitation.
I may change the default to --stream-fps 30 eventually.


Also note that while the iw3.desktop web page is open on the PC side, performance will be degraded by the browser rendering load.

@Salmaun321

Salmaun321 commented Mar 9, 2025

With the --full-sbs option the aspect ratio issue was solved, but I noticed another one.

When I start the stream it reaches 60+ FPS, then over time stabilizes at ~40 while on the desktop, but as soon as I open any fullscreen video it gradually drops to ~12 FPS and never goes back above that.

@francdn

francdn commented Mar 9, 2025

Does anyone know how to watch on the Apple Vision Pro? The Safari browser does not seem to play the SBS in 3D.

@nagadomi
Owner

nagadomi commented Mar 9, 2025

@Salmaun321
I probably fixed it. Update and try again.

@nagadomi
Owner

nagadomi commented Mar 9, 2025

@francdn
I only have Meta Quest so I don't know anything about the other devices.

Is the video not animated, with only the first frame displayed? Or is no image shown at all?
Or does Safari's HTML5 video player not support 3D display?

@francdn

francdn commented Mar 9, 2025

The video plays, but not in 3D; I can only see the SBS image.

@nagadomi
Owner

nagadomi commented Mar 9, 2025

This can be used with any VR video player that allows playback by specifying a video URL (and that supports MJPEG video), but I don't know which players support it.
The URI of the streaming video is /stream.jpg, e.g. http://192.168.11.2:1303/stream.jpg
(Maybe I need to change it to something like stream_LR.avi, but I haven't figured anything out yet.)

@Salmaun321

Dude, you're awesome, now it's playing at 28 FPS, very stable.

It still lacks audio, but man, that's incredible. Real-time 2D-to-3D of anything beats any pre-rendered movie. Fucking awesome.

@nagadomi
Owner

nagadomi commented Mar 9, 2025

Thank you.
You can use your PC's headphones to play audio, not the VR headset's headphones.

@loawizard

loawizard commented Mar 9, 2025

Got Estimated FPS = 19.85 (Streaming FPS), much better already!

python -m iw3.desktop --depth-model Any_V2_S --divergence 2.75 --convergence 0.7 --batch-size 2 --stream-fps 30

Is there any way I can crank it up a bit more, or have we reached the limit? 4070 Ti.

@nagadomi
Owner

nagadomi commented Mar 9, 2025

Even if you close the web browser, does it still only run at 20 FPS?
My RTX 3070 Ti probably has lower performance than an RTX 4070 Ti, but I'm getting Estimated FPS = 40-50.
It's possible that a high screen resolution is the reason. My desktop is HD (1920x1080).

@nagadomi
Owner

nagadomi commented Mar 9, 2025

@loawizard
I improved screenshot performance. Update and try again.

@IkariDevGIT
Author

@nagadomi Can you please add audio streaming?

@nagadomi
Owner

nagadomi commented Mar 9, 2025

@IkariDevGIT
You can use your PC's headphones to play audio.

@francdn

francdn commented Mar 12, 2025

@nagadomi Would it be too complicated to make the streaming server run over HTTPS?

@nagadomi
Owner

@francdn
What is the purpose of that?
SSL setup for a local network is very complicated for users. At the least, you need to create your own certificates. Additionally, various tricks are needed to prevent Chrome from reporting SSL errors.

@francdn

francdn commented Mar 12, 2025

@francdn What is the purpose of that? SSL setup for local network is very complicated for users. At the least you need to create your own certificates. Additionally, various tricks will be needed to prevent Chrome from reporting SSL errors.

Just secure streaming.

@tufeixp

tufeixp commented Mar 12, 2025

Regarding windows_capture, it may be stopping when there are no screen changes.

I vote for window capturing support because I don't have two screens, just one glasses-free 3D screen on my desktop.

@nagadomi
Owner

@francdn
iw3.desktop only binds local network addresses (unless you specify --bind-addr 0.0.0.0).
That means iw3.desktop is not accessible from the internet,
and the communication between you and the server is not exposed to the internet either.
So it is secure as long as iw3.desktop is used on your home network.

However, it is not recommended to use iw3.desktop on networks that are not managed by you, such as free Wi-Fi.
That's the kind of security level iw3.desktop has.

@nagadomi
Owner

@tufeixp
I don't fully understand your environment, but I can support display/monitor selection. I don't plan to support selecting an application window for now, because its size and position can change and the mouse might get lost.

@tufeixp

tufeixp commented Mar 13, 2025

@tufeixp I don't fully understand your environment, but I can support display/monitor selection. I don't plan to support selecting Application Window for now because its size and position can change, and the mouse might get lost.

I'm a user of a Lenovo 27-inch 3D monitor, see: https://www.lenovo.com/nl/nl/p/accessories-and-software/monitors/professional/63f1uar3eu?msockid=2856b3b76490673105dca7fa65f666c5
With the current monitor-capturing method I have to connect another 2D screen in order to watch the content converted by iw3.desktop. Maybe selecting and capturing a fullscreen window would be easier to support, if size and position really matter?

@nagadomi
Owner

nagadomi commented Mar 13, 2025

@tufeixp
Selecting the monitor index and region(x, y, width, height) is easy.
Currently, the screen buffer size and streaming video size are fixed at initialization and cannot be changed dynamically. Also, hidden or minimized windows cannot be captured at the moment.

Is your request to convert application windows on one monitor to 3D and view them on the same monitor?
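Monitor-index and region selection can be sketched with a capture library. The use of the third-party `mss` package here is an assumption for illustration (iw3 may use a different backend); the `region_for` helper is hypothetical, but the monitor/region dictionaries match mss's documented API.

```python
def region_for(mon, x=0, y=0, width=None, height=None):
    """Translate a region relative to a monitor into absolute coordinates.

    `mon` is a dict with left/top/width/height, as mss provides.
    """
    return {
        "left": mon["left"] + x,
        "top": mon["top"] + y,
        "width": width or mon["width"],
        "height": height or mon["height"],
    }

def grab_region(monitor_index=1, **kwargs):
    # third-party capture library (assumption); requires a display
    import mss
    with mss.mss() as sct:
        # sct.monitors[0] is the combined virtual screen; 1, 2, ... are
        # the physical monitors
        mon = sct.monitors[monitor_index]
        return sct.grab(region_for(mon, **kwargs))  # raw BGRA pixels
```

Keeping the coordinate math in a pure function makes it easy to support a second monitor whose `left` offset is nonzero, which is exactly the multi-monitor case discussed here.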

@tufeixp

tufeixp commented Mar 13, 2025

@tufeixp Selecting the monitor index and region(x, y, width, height) is easy. Currently, the screen buffer size and streaming video size is fixed at initialization and cannot be changed dynamically. Also hidden/minimized Windows cannot be captured at the moment.

Your request is to convert application windows on one monitor to 3D and view it on the same monitor?

Yes, that way I'd be able to watch 2D YouTube videos and play games in 3D on the same monitor. It makes sense for everybody!
Streaming adds latency, so I wish you would consider adding a standalone topmost borderless fullscreen window that renders the result directly; even better would be to pass mouse events through to Windows by making the SBS window transparent.

@nagadomi
Owner

@tufeixp
I don't know how that monitor displays 3D.
Does it support 3D display of SBS images outside of browser video?

Anyway, that is different from my intended use, so that support will come later.

@tufeixp

tufeixp commented Mar 13, 2025

@tufeixp I don't know how that monitor displays 3D. Does it support 3D display of SBS images other than browser video?

Anyway, that is different from my intended use, so that support will be later.

Yes, it has built-in support for all kinds of 3D content, including SBS media.

@loawizard

If that were possible it would be cool; then I could hook up my projector through HDMI.

But how can you capture something on one screen and display it in 3D on the same one? That doesn't sound like something that's possible. Capturing something on one screen and displaying it on another monitor through HDMI makes more sense to me.

@francdn

francdn commented Mar 14, 2025

Is there anything we have to do to see the cursor in Windows? I don't see it with pil.

@nagadomi
Owner

@francdn
With pil and pil_mp, the mouse cursor is not visible in the screenshot, so iw3.desktop draws only its position.
pil:

Image

pil_mp
Image

Don't you see these green circles or squares anywhere?
Possibly the mouse cursor is not drawn in the correct position due to multiple monitors or DPI settings.
Is the mouse cursor missing with both pil_mp and pil, or only with pil?

@francdn

francdn commented Mar 14, 2025

I see a small circle but its position does not match the pointer position (and it is not due to lag). I am using 2160p on my computer. See screenshot. The mouse is over "Dark Theme Off" at the right bottom corner and the blue circle shows up more or less in the middle of the screen vertically/horizontally.

Image

If I move close to the left corner of the screen the circle and the pointer align horizontally more or less, but not vertically. See screenshot below. The mouse is over "Inspect" at the left bottom corner and the blue circle shows up more or less in the middle of the screen vertically.

Image

I only have one display connected. Happens for both pil and pil_mp.

@nagadomi
Owner

@francdn
It seems to be related to the display scaling setting.
I changed it to 150% and confirmed that the mouse position is incorrect.
I will find out how to fix it.
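This kind of bug is typically a coordinate-space mismatch: the OS reports the cursor position in logical (DPI-scaled) units while the screenshot is captured in physical pixels, so at 150% scaling the drawn position lands at two-thirds of where it should. A sketch of the correction, assuming the scale factor is obtained from the OS (e.g. 1.5 for 150%); this is the general technique, not necessarily the exact fix applied here.

```python
def logical_to_physical(pos, scale):
    """Map a logical cursor position to physical screenshot pixels.

    pos   -- (x, y) as reported by the cursor API (logical units)
    scale -- display scaling factor, e.g. 1.0, 1.25, 1.5
    """
    x, y = pos
    return (round(x * scale), round(y * scale))
```

The symptom described above (positions roughly correct near the top-left origin but increasingly wrong toward the bottom-right) is the signature of a missing multiplication like this, since the error grows linearly with distance from the origin.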

@nagadomi
Owner

@francdn
Maybe fixed. run update.bat and try again.

@francdn

francdn commented Mar 14, 2025

Thanks a lot. I will try later.

@loawizard

Setting --stream-fps higher than 30 (e.g. 60) will probably cause instability due to browser behavior. If the instability occurs even with --stream-fps 30, it's probably being affected by the load from other programs, such as a video player with hardware acceleration.

It might also be related to process priority. Not sure if it is effective, but it should be possible to change it from Task Manager. If it is effective, I will be able to set a higher priority from the program side.

If I don't set it higher than 30 I get 20 FPS; if I set it to 60 I get a streaming FPS of 35-36.

@nagadomi
Owner

nagadomi commented Mar 17, 2025

@loawizard
Possibly a problem with the precision of time() and sleep().
I made some changes related to that.
And Sleep Precision display was added. In my environment it is less than 0.5 ms.

Estimated FPS = 42.16, Screenshot FPS = 76.44, Sleep Precision = 0.458 ms, Streaming FPS = 28.08
Estimated FPS = 49.52, Screenshot FPS = 102.61, Sleep Precision = 0.085 ms, Streaming FPS = 29.86
Estimated FPS = 50.36, Screenshot FPS = 101.42, Sleep Precision = 0.090 ms, Streaming FPS = 29.92
Estimated FPS = 52.00, Screenshot FPS = 107.68, Sleep Precision = 0.090 ms, Streaming FPS = 29.94
Estimated FPS = 51.24, Screenshot FPS = 108.07, Sleep Precision = 0.087 ms, Streaming FPS = 29.92
Estimated FPS = 50.16, Screenshot FPS = 102.32, Sleep Precision = 0.461 ms, Streaming FPS = 29.58
Estimated FPS = 50.30, Screenshot FPS = 101.67, Sleep Precision = 0.082 ms, Streaming FPS = 29.96
Estimated FPS = 44.67, Screenshot FPS = 80.84, Sleep Precision = 0.126 ms, Streaming FPS = 29.54
Estimated FPS = 48.08, Screenshot FPS = 99.14, Sleep Precision = 0.086 ms, Streaming FPS = 30.05
Estimated FPS = 43.11, Screenshot FPS = 77.70, Sleep Precision = 0.233 ms, Streaming FPS = 29.29
Estimated FPS = 49.20, Screenshot FPS = 95.15, Sleep Precision = 0.094 ms, Streaming FPS = 29.85

@nagadomi
Owner

@loawizard
OK, I confirmed that sleep precision is 15 ms on Python 3.10 + Windows. That is much lower precision than I expected.
So I have fixed it to be around 1 ms (and the Sleep Precision display has been removed).
Hopefully the overall FPS instability has been fixed.
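For context, `time.sleep()` on Windows is bound to the OS timer resolution (historically ~15.6 ms). One common workaround is a hybrid sleep: sleep coarsely for most of the duration, then busy-wait the last couple of milliseconds against a high-resolution clock. This sketch shows the general technique; it is an assumption, not necessarily the exact change made here.

```python
import time

def precise_sleep(duration, spin_margin=0.002):
    """Sleep for `duration` seconds with roughly sub-millisecond precision.

    Sleeps cheaply until `spin_margin` seconds before the deadline, then
    spins on time.perf_counter(), trading a little CPU for accuracy.
    """
    deadline = time.perf_counter() + duration
    coarse = duration - spin_margin
    if coarse > 0:
        time.sleep(coarse)  # cheap, but limited by OS timer resolution
    while time.perf_counter() < deadline:
        pass                # spin for the final stretch
```

In a 30 FPS frame-pacing loop, a 15 ms sleep error is half the frame budget, which is exactly the kind of FPS jitter reported above; the spin phase removes that error at the cost of ~2 ms of busy CPU per frame.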

@Salmaun321

Salmaun321 commented Mar 18, 2025

I think this "motion detection" is what is causing the stuttering.

https://youtu.be/ANeygSrZCdU

If the screenshot FPS were constant and locked to the stream FPS, playback would be smoother.

@nagadomi
Owner

@Salmaun321
I think that is using windows_capture, but do pil/pil_mp show the same behavior?
Also, Screenshot FPS is an estimated value calculated from the time between frame request and response; it is not the running FPS.
pil runs at stream-fps and pil_mp runs at a constant 60 FPS.

@nagadomi
Owner

@Salmaun321
I changed pil_mp and wc_mp to allow sending duplicate frames.
Estimated FPS and Streaming FPS should no longer be affected by screenshot frequency.

@loawizard

I just tried the GUI and the Python command I used earlier, on a fresh reinstall.

For some reason it was extremely choppy, and the browser on my Android TV kept crashing, so it's unusable so far. I will try the backends other than wc_mp.

@loawizard

Yeah, it crashes all my Android browsers no matter what settings I choose.

@nagadomi
Owner

nagadomi commented Mar 21, 2025

@loawizard
I changed the browser-side code, which might be the cause: dd14316

Did iw3.desktop work in that environment before?
Is the issue only with the GUI and not with the CLI?
Is the issue only with the browser on Android TV and not with Chrome on a PC?

@nagadomi
Owner

For now, I changed it to render with requestAnimationFrame again.

@IkariDevGIT
Author

Hey @nagadomi, thanks for merging realtime mode into main and for all the improvements so far. I'm still hitting a critical bottleneck with the current streaming method: the 30 FPS limit, browser lag, and the lack of audio support are becoming a real issue for my use cases. I realize I've asked a couple of times before, but it's now significantly impacting practical usability. Is there any possibility of exploring another streaming method, perhaps a different protocol (like WebRTC) or another approach to optimize performance? I'm more than happy to test any potential changes or provide further feedback. Again, I'm sorry for asking about this so much, but it's kind of crucial at this point. Thanks for your hard work! I've definitely noticed the faster overall performance, so good job on that!

@nagadomi
Owner

@IkariDevGIT
I think I'll wait and see for a while. After all, I've been working on this for the past few weeks...
Also, since many people still haven't used it, various issues may be reported.

Once Torch 2.7 is released, GPU JPEG encoder will be available, so performance will improve even further.
(It already exists, but it's buggy and cannot be used.)

The protocol change might happen at some point later, perhaps in a few months, but it's not certain yet, as I haven't encountered any issues in my own environment.
For audio, a protocol change is necessary; as for performance, it's not yet clear what the bottleneck is.
