X11 not starting properly with no monitor connected to the eGPU #106

Open
T-X opened this issue Apr 7, 2023 · 6 comments
@T-X

T-X commented Apr 7, 2023

Generally the eGPU switcher works great. I have a Razer Core X Chroma eGPU with an HDMI monitor and a VR headset connected via DisplayPort, and both work fine (as long as my GPU does not run out of VRAM; there are still several issues with Monado + SteamVR with the HP Reverb G2). In this setup I can also re-enable the laptop screen via xrandr/arandr with the eGPU enabled (it is just named differently, eDP-1 instead of eDP, and is not enabled by default when the eGPU is enabled). Thanks a lot for this very useful tool!

What does not work for me yet is starting X11/lightdm with no screen connected to the eGPU, that is, using just the laptop screen but with the eGPU enabled. The eGPU switcher correctly notices that no monitor is connected to the eGPU and that it has four HDMI+DisplayPort ports available, and it suggests using the "--override" flag.

Running the following leaves me with a black laptop screen:

$ sudo ~/dev-priv/egpu-switcher/bin/egpu-switcher switch --override
$ sudo /etc/init.d/lightdm restart

On a non-X11 TTY xrandr shows the following error message:

$ DISPLAY=:0 xrandr
Invalid MIT-MAGIC-COOKIE-1 key
Can't open display :0
$ DISPLAY=:1 xrandr
Can't open display :1
$ DISPLAY=:2 xrandr
Can't open display :2

I would have expected to see at least eDP-1 here.
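The "Invalid MIT-MAGIC-COOKIE-1 key" message usually means the TTY shell presented an X authority cookie that the server on :0 does not accept, so it does not by itself show that the server has no outputs. A minimal sketch for querying the greeter's X server from a TTY follows. It assumes LightDM keeps the server cookie under /var/run/lightdm/root/ (a common default, not verified for this setup); `lightdm_xauth_path` and `xrandr_as_greeter` are hypothetical helper names.

```shell
#!/bin/sh
# Assumption: LightDM stores the X authority file for display ":N" at
# /var/run/lightdm/root/:N. Adjust the directory if your setup differs.
lightdm_xauth_path() {
    printf '/var/run/lightdm/root/%s\n' "$1"
}

# Run xrandr against the given display using LightDM's cookie.
# Uses sudo because the cookie file is normally readable by root only.
xrandr_as_greeter() {
    display=${1:-:0}
    sudo env DISPLAY="$display" \
        XAUTHORITY="$(lightdm_xauth_path "$display")" xrandr
}

# Usage (from a TTY):
#   xrandr_as_greeter :0
```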

$ cat /etc/X11/xorg.conf.d/99-egpu-switcher.conf
# autogenerated by egpu-switcher

Section "Module"
    Load           "modesetting"
EndSection


Section "Device"
    Identifier     "Device0"
    Driver         "amdgpu"
    BusID          "PCI:7@0:0:0"
    Option         "AllowEmptyInitialConfiguration"
    Option         "AllowExternalGpus" "True"
EndSection

Xorg log:
Xorg.0.log

lspci:
lspci.txt

  • Did you install egpu-switcher via ppa or via git + make
    • git + make, no install
  • What Linux distribution (+ version) are you using
    • Debian Sid, Linux 6.1.15-1 (from Debian packages)
  • What brand / model is your laptop
    • Lenovo Thinkpad T14s AMD Gen3
  • What brand / model is your GPU (+ enclosure)
    • iGPU: AMD/ATI Ryzen Radeon 680M, eGPU: Radeon RX 6650 XT, eGPU enclosure: Razer Core X Chroma
  • What drivers (+ version) are you using
    • OpenSource amdgpu from Linux 6.1.15-1, firmware-amd-graphics 20230210-4, Mesa 22.3.6-1
  • What Desktop-Environment do you use (+ Display-Manager)
    • lightdm
  • If you are not using a Desktop-Environment, what Window-Manager do you use?
    • Xmonad, started from the LightDM greeter (Xmonad does not start by default). Not even the LightDM greeter appears on the laptop screen when no monitor is connected to the eGPU.
@T-X T-X assigned hertg Apr 7, 2023
@hertg
Owner

hertg commented Apr 8, 2023

Thanks for the report. Unfortunately I'm not really familiar with that particular issue (how to handle a non-NVIDIA GPU with no monitors attached). The original --override flag came from @ewagner12, so they probably have more valuable information about what is happening here.

@ewagner12 just a side note: I rewrote egpu-switcher in Go last year and made my best effort to port your contributions 1:1 without changing their behavior. The exception is the remove feature, which I tried to port but simply wasn't able to get to work.

The source code for the --override flag is here:

egpu-switcher/cmd/switch.go

Lines 127 to 143 in 13b8a06

if driver != "nvidia" {
    outputs, err := gpu.Outputs()
    if err != nil {
        return err
    }
    connectedOutputs, err := gpu.ConnectedOutputs()
    if err != nil {
        return err
    }
    if outputs > 0 && connectedOutputs == 0 {
        logger.Warn("No eGPU attached display detected with open source drivers. (Of %d eGPU outputs detected) Internal mode and setting DRI_PRIME variable are recommended for this configuration.\n", outputs)
        if !override {
            return fmt.Errorf("Not setting eGPU mode. Run the command with the '--override' flag to force loading eGPU mode")
        }
        logger.Debug("-> Overridden: setting eGPU mode")
    }
}
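
For reference, the kernel exposes each connector's state in sysfs, which makes it possible to check by hand what the code above is reacting to. A minimal sketch, assuming connectors appear as /sys/class/drm/cardN-<connector>/status (the usual DRM sysfs layout). It is not claimed to be how gpu.Outputs() and gpu.ConnectedOutputs() are implemented, and `count_outputs` is a hypothetical helper name.

```shell
#!/bin/sh
# count_outputs DRM_DIR CARD
# Prints "<total> <connected>" for the connectors of one DRM card,
# e.g. count_outputs /sys/class/drm card1
count_outputs() {
    dir=$1
    card=$2
    total=0
    connected=0
    for status in "$dir/$card"-*/status; do
        [ -f "$status" ] || continue    # glob matched nothing
        total=$((total + 1))
        # The file holds "connected", "disconnected" or "unknown".
        if [ "$(cat "$status")" = "connected" ]; then
            connected=$((connected + 1))
        fi
    done
    echo "$total $connected"
}
```

Which cardN belongs to the eGPU varies between systems, so the card name is left as a parameter.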

@ewagner12
Contributor

Hi @T-X I actually have the same laptop as you, nice.

For this hardware, switching to the eGPU with the script works best only when an external display is connected to the eGPU. The DRI_PRIME=1 environment variable is the preferred method for using the eGPU on the laptop's internal display.

For this combination of using the eGPU with open source drivers (amdgpu, i915 or nouveau), but with no display connected to the eGPU you should not switch to the "eGPU mode" of the script. Instead let it use the "internal mode" and use the environment variable DRI_PRIME=1 or a similar method to offload rendering to the eGPU. This can be added to /etc/environment or as a prefix before any command to run it on the eGPU.
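
A minimal sketch of the DRI_PRIME approach described above. `on_egpu` is a hypothetical wrapper name, and `glxinfo` (packaged in mesa-utils on Debian) is only used to confirm which GPU renders. The sketch assumes the eGPU is the secondary render device that DRI_PRIME=1 selects.

```shell
#!/bin/sh
# Run any command with rendering offloaded to the secondary GPU.
on_egpu() {
    env DRI_PRIME=1 "$@"
}

# Usage:
#   on_egpu glxinfo -B    # the renderer string should name the eGPU
#   on_egpu glxgears
#
# For a system-wide default, add this line to /etc/environment instead:
#   DRI_PRIME=1
```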

If you force Xorg to run on the eGPU with the --override flag, I find that with an AMD card, even if you get an X session running (I'm able to switch to a TTY and use startx -- to get X running), it will be extremely laggy and unusable. That's why this is not recommended.

@hertg I've used the new Go version of the script and it works well and as expected for me. For the remove command, I wouldn't worry about adding it unless a lot of people ask for it. Since I originally added that part of the script, things have changed on the driver side and it is now unnecessary for me. With an AMD card on the latest kernel and GNOME desktops I can simply log out, turn off any displays connected to the eGPU and unplug the eGPU. So unless someone needs it, I don't think it's necessary to add that functionality to the script anymore.

@T-X
Author

T-X commented Apr 9, 2023

Ha! I found a solution / workaround:

  1. Let lightdm log in automatically: I added "autologin-user=myuser" to /etc/lightdm/lightdm.conf

After that, DISPLAY=:0 xrandr on VT2 shows meaningful values and detects eDP-1-1. However, running DISPLAY=:0 xrandr --output eDP-1-1 --auto still fails with the error "xrandr: Configure crtc 5 failed".

  2. On VT2 (Ctrl+Alt+F2) run xrandr through a terminal emulator instead:
DISPLAY=:0 lxterminal -e xrandr --verbose --output eDP-1-1 --auto
  3. Press Ctrl+Alt+F7 to hop into your logged-in desktop, which is now also visible on the laptop screen with your eGPU enabled, without needing a display connected to the eGPU.

I don't know why the LightDM greeter and Xmonad won't enable and use eDP-1-1 by default in such a case. But this workaround is good enough for me now.
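
Since the internal panel shows up under different names in this thread (eDP, eDP-1, eDP-1-1), the xrandr step of the workaround can be sketched without hard-coding the name. This is a sketch under assumptions: `find_internal_panel` and `enable_internal_panel` are hypothetical helpers, and the second one is meant to be started inside the logged-in X session (for example through a terminal emulator, as in the workaround), because the direct call from VT2 failed above.

```shell
#!/bin/sh
# Read `xrandr` output on stdin and print the first connected eDP output.
find_internal_panel() {
    awk '/^eDP[^ ]* connected/ { print $1; exit }'
}

# Enable the internal panel at its preferred mode.
enable_internal_panel() {
    panel=$(xrandr | find_internal_panel)
    if [ -z "$panel" ]; then
        echo "no connected eDP output found" >&2
        return 1
    fi
    xrandr --output "$panel" --auto
}

# Usage (inside the X session):
#   enable_internal_panel
```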

@hertg hertg added the knowledge This issue contains helpful knowledge / insights label Apr 9, 2023
@ewagner12
Contributor

Hi @hertg I tested this more on my system and I found one bug potentially causing unintended behavior in the script related to this issue.

If /etc/X11/xorg.conf.d/99-egpu-switcher.conf exists:
Expected Behavior: Running sudo egpu-switcher switch auto in the above case of AMD eGPU connected with no displays removes the X11 conf file.
Actual Behavior: Running sudo egpu-switcher switch auto in the above case only prints an error, /etc/X11/xorg.conf.d/99-egpu-switcher.conf still exists.

I believe that at line 139 in

egpu-switcher/cmd/switch.go

Lines 137 to 153 in 13b8a06

        logger.Warn("No eGPU attached display detected with open source drivers. (Of %d eGPU outputs detected) Internal mode and setting DRI_PRIME variable are recommended for this configuration.\n", outputs)
        if !override {
            return fmt.Errorf("Not setting eGPU mode. Run the command with the '--override' flag to force loading eGPU mode")
        }
        logger.Debug("-> Overridden: setting eGPU mode")
    }
}
nomodesetting = nomodesetting || viper.GetBool("egpu.nomodesetting")
conf := xorg.RenderConf("Device0", driver, gpu.XorgPCIString(), !nomodesetting)
if err := xorg.CreateEgpuFile(x11ConfPath, conf, verbose); err != nil {
    return err
}
if post := viper.GetString("hooks.egpu"); post != "" {
    if err := runHook(post); err != nil {
        logger.Error("egpu hook error: %s", err)
    }
the code should also remove the eGPU conf file unless the --override flag is passed. Otherwise the switching behavior is inconsistent in this case: it keeps the file if it previously existed, but doesn't create it if it doesn't exist.

I can't speak to the behavior in Xmonad as I don't use it, but I believe this should avoid a black screen unless the --override flag is used.
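
The expected behavior described above can be modelled with a small sketch. It is a shell illustration of the proposed logic for the non-NVIDIA case, not the actual Go fix; `decide_mode` and `apply_mode` are hypothetical names, and the conf path is passed in as a parameter.

```shell
#!/bin/sh
# decide_mode OUTPUTS CONNECTED OVERRIDE
# Prints "internal" when the eGPU has outputs but none is connected and
# no override was requested; prints "egpu" otherwise.
decide_mode() {
    if [ "$1" -gt 0 ] && [ "$2" -eq 0 ] && [ "$3" != "true" ]; then
        echo internal
    else
        echo egpu
    fi
}

# apply_mode MODE CONF_PATH
# In internal mode, remove a stale conf file so the outcome no longer
# depends on whether the file existed before the switch.
apply_mode() {
    if [ "$1" = "internal" ]; then
        rm -f "$2"
    fi
}

# Usage:
#   apply_mode "$(decide_mode 4 0 false)" \
#       /etc/X11/xorg.conf.d/99-egpu-switcher.conf
```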

@hertg
Owner

hertg commented Apr 9, 2023

@ewagner12 Thanks for your report, that indeed makes sense. I created a new issue #107 for this

@T-X
Author

T-X commented Apr 17, 2023

I've also opened a ticket on LightDM here. I noticed that with just a docking station and no eGPU, the LightDM greeter uses all available monitors that are routed through the iGPU. However, when the eGPU is enabled through egpu-switcher, LightDM only uses monitors connected to the eGPU and no longer uses the eDP display, as originally reported.

Configuring the eDP display via xrandr after login also works. And when I use KDE (instead of xmonad, my preferred window manager) with an eGPU enabled, KDE also enables all connected displays by default, including the eDP-1-1 display. This made me think that maybe LightDM should/could do something different when eGPUs are used.
