nil pointer in e1000 driver on AWS EC2 #81

Open
aidansteele opened this issue Aug 15, 2021 · 3 comments

Comments

@aidansteele

aidansteele commented Aug 15, 2021

Hi,

This is a very cool project, thank you for building it! I tried running the helloworld example on AWS EC2 and got the following error. The output below was captured with the "debug" log level.

[trap] tid:2
[syscall] tid:4
[pci] no pci device found for e1000
[inet] begin dhcp
panic: nil pointer or invalid memory access
goroutine 1 [running]:
github.com/icexin/eggos/kernel.pageFaultPanic()
	/Users/aidan/dev/oss/eggos/kernel/trap.go:73 +0x2a
github.com/icexin/eggos/drivers/e1000.(*driver).Transmit(0x68056140, 0x6808c200, 0x3a29de, 0x6)
	/Users/aidan/dev/oss/eggos/drivers/e1000/e1000.go:224 +0x3b
github.com/icexin/eggos/inet.(*endpoint).WritePacket(0x680561e0, 0x68024cdc, 0x4, 0x3ff958, 0x4, 0x680249e8, 0x6, 0x0, 0x0, 0x300000800, ...)
	/Users/aidan/dev/oss/eggos/inet/endpoint.go:89 +0x67
gvisor.dev/gvisor/pkg/tcpip/link/nested.(*Endpoint).WritePacket(...)
	/Users/aidan/go/pkg/mod/gvisor.dev/[email protected]/pkg/tcpip/link/nested/nested.go:117
gvisor.dev/gvisor/pkg/tcpip/link/ethernet.(*Endpoint).WritePacket(0x68056230, 0x68024cdc, 0x4, 0x3ff958, 0x4, 0x680249e8, 0x6, 0x0, 0x0, 0x300000800, ...)
	/Users/aidan/go/pkg/mod/gvisor.dev/[email protected]/pkg/tcpip/link/ethernet/ethernet.go:66 +0x12d
gvisor.dev/gvisor/pkg/tcpip/stack.(*nic).writePacket(0x680ee000, 0x68024cdc, 0x4, 0x3ff958, 0x4, 0x680249e8, 0x6, 0x0, 0x0, 0x300000800, ...)
	/Users/aidan/go/pkg/mod/gvisor.dev/[email protected]/pkg/tcpip/stack/nic.go:368 +0xfe
gvisor.dev/gvisor/pkg/tcpip/stack.(*nic).writePacketBuffer(0x680ee000, 0x68024cdc, 0x4, 0x3ff958, 0x4, 0x680249e8, 0x6, 0x0, 0x0, 0x300000800, ...)
	/Users/aidan/go/pkg/mod/gvisor.dev/[email protected]/pkg/tcpip/stack/nic.go:314 +0xd0
gvisor.dev/gvisor/pkg/tcpip/stack.(*nic).enqueuePacketBuffer(0x680ee000, 0x6808a960, 0x800, 0x404800, 0x6808c200, 0x6805822a, 0x8, 0x8)
	/Users/aidan/go/pkg/mod/gvisor.dev/[email protected]/pkg/tcpip/stack/nic.go:329 +0x25d
gvisor.dev/gvisor/pkg/tcpip/stack.(*nic).WritePacket(0x680ee000, 0x6808a960, 0x800, 0x6808c200, 0x1c, 0x0)
	/Users/aidan/go/pkg/mod/gvisor.dev/[email protected]/pkg/tcpip/stack/nic.go:307 +0x53
gvisor.dev/gvisor/pkg/tcpip/network/ipv4.(*endpoint).writePacket(0x6806ca00, 0x6808a960, 0x6808c200, 0x6808a900, 0x0, 0x0)
	/Users/aidan/go/pkg/mod/gvisor.dev/[email protected]/pkg/tcpip/network/ipv4/ipv4.go:495 +0x375
gvisor.dev/gvisor/pkg/tcpip/network/ipv4.(*endpoint).WritePacket(0x6806ca00, 0x6808a960, 0x4000000011, 0x6808c200, 0x0, 0x0)
	/Users/aidan/go/pkg/mod/gvisor.dev/[email protected]/pkg/tcpip/network/ipv4/ipv4.go:445 +0x179
gvisor.dev/gvisor/pkg/tcpip/stack.(*Route).WritePacket(0x6808a960, 0x4000000011, 0x6808c200, 0x680b0458, 0xd2d0)
	/Users/aidan/go/pkg/mod/gvisor.dev/[email protected]/pkg/tcpip/stack/route.go:462 +0xad
gvisor.dev/gvisor/pkg/tcpip/transport/udp.(*udpPacketInfo).send(0x6845e930, 0x405290, 0x680639e0, 0x68022340)
	/Users/aidan/go/pkg/mod/gvisor.dev/[email protected]/pkg/tcpip/transport/udp/endpoint.go:874 +0x3f9
gvisor.dev/gvisor/pkg/tcpip/transport/udp.(*endpoint).write(0x680a4f00, 0x405290, 0x680639e0, 0x68022340, 0x0, 0x35f520, 0x1, 0x680639e0)
	/Users/aidan/go/pkg/mod/gvisor.dev/[email protected]/pkg/tcpip/transport/udp/endpoint.go:581 +0x176
gvisor.dev/gvisor/pkg/tcpip/transport/udp.(*endpoint).Write(0x680a4f00, 0x405290, 0x680639e0, 0x68022340, 0x0, 0x36c720, 0x3381e0, 0x680562d0)
	/Users/aidan/go/pkg/mod/gvisor.dev/[email protected]/pkg/tcpip/transport/udp/endpoint.go:431 +0x68
gvisor.dev/gvisor/pkg/tcpip/adapters/gonet.(*UDPConn).WriteTo(0x680562d0, 0x680cc100, 0xfa, 0xfa, 0x406f28, 0x68063860, 0x0, 0x0, 0x0)
	/Users/aidan/go/pkg/mod/gvisor.dev/[email protected]/pkg/tcpip/adapters/gonet/gonet.go:651 +0x1eb
github.com/icexin/eggos/inet/dhcp.(*Client).Request(0x680ec090, 0x409328, 0x6804c660, 0x0, 0x0, 0x0, 0x0)
	/Users/aidan/dev/oss/eggos/inet/dhcp/client.go:150 +0x5e5
github.com/icexin/eggos/inet.dodhcp(0x680249e8, 0x6, 0x6, 0x68056230)
	/Users/aidan/dev/oss/eggos/inet/stack.go:87 +0x147
github.com/icexin/eggos/inet.Init()
	/Users/aidan/dev/oss/eggos/inet/stack.go:50 +0x234
github.com/icexin/eggos.kernelInit()
	/Users/aidan/dev/oss/eggos/eggos.go:34 +0x1dc
github.com/icexin/eggos.init.0()
	/Users/aidan/dev/oss/eggos/eggos.go:38 +0x25

That nil pointer panic is from this code:

func (d *driver) Transmit(pkt *stack.PacketBuffer) error {
	desc := &d.txdescs[d.txidx]
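
One way to turn this panic into a reportable error is to check that the driver was initialized before touching the descriptor ring. The following is a minimal sketch under stated assumptions, not the eggos implementation: the `driver` and `txDesc` types are simplified stand-ins, `ErrNoDevice` is a hypothetical sentinel, and the assumption that `txdescs` stays nil when no PCI device is found is a guess based on the log line "[pci] no pci device found for e1000".

```go
package main

import (
	"errors"
	"fmt"
)

// ErrNoDevice is a hypothetical sentinel error; eggos does not define it.
var ErrNoDevice = errors.New("e1000: no PCI device found")

// txDesc and driver are simplified stand-ins for the real eggos types.
type txDesc struct {
	addr   uint64
	length uint16
}

type driver struct {
	txdescs *[8]txDesc // assumed to stay nil until the device is probed and its rings are mapped
	txidx   int
}

// Transmit returns ErrNoDevice instead of faulting when the ring was never set up.
func (d *driver) Transmit(pkt []byte) error {
	if d == nil || d.txdescs == nil {
		return ErrNoDevice
	}
	desc := &d.txdescs[d.txidx]
	desc.length = uint16(len(pkt))
	d.txidx = (d.txidx + 1) % len(d.txdescs)
	return nil
}

func main() {
	var d driver // simulates the EC2 case: the probe found no e1000, so nothing was mapped
	fmt.Println(d.Transmit([]byte("hello")))
}
```

With a guard like this, the DHCP client would see an ordinary write error rather than a page fault, which makes the missing device easier to spot in the logs.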

A successful run of the same kernel in qemu has different logs:

[trap] tid:2
[video] can't found video info from bootloader, video disabled
[syscall] tid:4
[pci] found 8086:100e for e1000, irq:43

[e1000] enable bus master
[e1000] mmap for bar0 0xfebc0000
[e1000] begin reset
[e1000] reset done
[e1000] link up
[e1000] begin read mac
[e1000] mac:525400123456
[inet] begin dhcp
[dhcp] offer done
[dhcp] offer ip:10.0.2.15 server:10.0.2.2
[dhcp] lease:24h0m0s
[inet] dhcp done
[inet] addr:10.0.2.15
[inet] gateway:10.0.2.2
[inet] mask:255.255.255.0
[inet] dns:10.0.2.3
hello eggos
[syscall] write(1)(0x1, 0x681a4020, 0xc, 0x20, 0xc, 0x4) = 12

I understand that this is unlikely to be enough detail for you to diagnose the problem. I am happy to try to assist, but I might need some direction from you. Let me know what details you need and I will try to provide them.

@aidansteele
Author

aidansteele commented Aug 15, 2021

Some more details: I logged the PCI devices discovered in both qemu and ec2.

PCI devices in qemu
[pci] found devices: ([]*pci.Device) (len=6 cap=8) {
 (*pci.Device)(0x68024560)({
  Ident: (pci.Identity) {
   Vendor: (uint16) 32902,
   Device: (uint16) 4663
  },
  Addr: (pci.Address) {
   Bus: (uint8) 0,
   Device: (uint8) 0,
   Func: (uint8) 0
  },
  Class: (uint8) 6,
  SubClass: (uint8) 0,
  IRQLine: (uint8) 0,
  IRQNO: (uint8) 32
 }),
 (*pci.Device)(0x68024570)({
  Ident: (pci.Identity) {
   Vendor: (uint16) 32902,
   Device: (uint16) 28672
  },
  Addr: (pci.Address) {
   Bus: (uint8) 0,
   Device: (uint8) 1,
   Func: (uint8) 0
  },
  Class: (uint8) 6,
  SubClass: (uint8) 1,
  IRQLine: (uint8) 0,
  IRQNO: (uint8) 32
 }),
 (*pci.Device)(0x68024580)({
  Ident: (pci.Identity) {
   Vendor: (uint16) 32902,
   Device: (uint16) 28688
  },
  Addr: (pci.Address) {
   Bus: (uint8) 0,
   Device: (uint8) 1,
   Func: (uint8) 1
  },
  Class: (uint8) 1,
  SubClass: (uint8) 1,
  IRQLine: (uint8) 0,
  IRQNO: (uint8) 32
 }),
 (*pci.Device)(0x68024590)({
  Ident: (pci.Identity) {
   Vendor: (uint16) 32902,
   Device: (uint16) 28947
  },
  Addr: (pci.Address) {
   Bus: (uint8) 0,
   Device: (uint8) 1,
   Func: (uint8) 3
  },
  Class: (uint8) 6,
  SubClass: (uint8) 128,
  IRQLine: (uint8) 9,
  IRQNO: (uint8) 41
 }),
 (*pci.Device)(0x680245a0)({
  Ident: (pci.Identity) {
   Vendor: (uint16) 4660,
   Device: (uint16) 4369
  },
  Addr: (pci.Address) {
   Bus: (uint8) 0,
   Device: (uint8) 2,
   Func: (uint8) 0
  },
  Class: (uint8) 3,
  SubClass: (uint8) 0,
  IRQLine: (uint8) 0,
  IRQNO: (uint8) 32
 }),
 (*pci.Device)(0x680245b0)({
  Ident: (pci.Identity) {
   Vendor: (uint16) 32902,
   Device: (uint16) 4110
  },
  Addr: (pci.Address) {
   Bus: (uint8) 0,
   Device: (uint8) 3,
   Func: (uint8) 0
  },
  Class: (uint8) 2,
  SubClass: (uint8) 0,
  IRQLine: (uint8) 11,
  IRQNO: (uint8) 43
 })
}
PCI devices in EC2
[pci] found devices: ([]*pci.Device) (len=6 cap=8) {
 (*pci.Device)(0x68024554)({
  Ident: (pci.Identity) {
   Vendor: (uint16) 32902,
   Device: (uint16) 4663
  },
  Addr: (pci.Address) {
   Bus: (uint8) 0,
   Device: (uint8) 0,
   Func: (uint8) 0
  },
  Class: (uint8) 6,
  SubClass: (uint8) 0,
  IRQLine: (uint8) 0,
  IRQNO: (uint8) 32
 }),
 (*pci.Device)(0x68024560)({
  Ident: (pci.Identity) {
   Vendor: (uint16) 32902,
   Device: (uint16) 28672
  },
  Addr: (pci.Address) {
   Bus: (uint8) 0,
   Device: (uint8) 1,
   Func: (uint8) 0
  },
  Class: (uint8) 6,
  SubClass: (uint8) 1,
  IRQLine: (uint8) 0,
  IRQNO: (uint8) 32
 }),
 (*pci.Device)(0x68024570)({
  Ident: (pci.Identity) {
   Vendor: (uint16) 32902,
   Device: (uint16) 28947
  },
  Addr: (pci.Address) {
   Bus: (uint8) 0,
   Device: (uint8) 1,
   Func: (uint8) 3
  },
  Class: (uint8) 0,
  SubClass: (uint8) 0,
  IRQLine: (uint8) 9,
  IRQNO: (uint8) 41
 }),
 (*pci.Device)(0x68024580)({
  Ident: (pci.Identity) {
   Vendor: (uint16) 7439,
   Device: (uint16) 4369
  },
  Addr: (pci.Address) {
   Bus: (uint8) 0,
   Device: (uint8) 3,
   Func: (uint8) 0
  },
  Class: (uint8) 3,
  SubClass: (uint8) 0,
  IRQLine: (uint8) 0,
  IRQNO: (uint8) 32
 }),
 (*pci.Device)(0x68024590)({
  Ident: (pci.Identity) {
   Vendor: (uint16) 7439,
   Device: (uint16) 32865
  },
  Addr: (pci.Address) {
   Bus: (uint8) 0,
   Device: (uint8) 4,
   Func: (uint8) 0
  },
  Class: (uint8) 1,
  SubClass: (uint8) 8,
  IRQLine: (uint8) 11,
  IRQNO: (uint8) 43
 }),
 (*pci.Device)(0x680245a0)({
  Ident: (pci.Identity) {
   Vendor: (uint16) 7439,
   Device: (uint16) 60448
  },
  Addr: (pci.Address) {
   Bus: (uint8) 0,
   Device: (uint8) 5,
   Func: (uint8) 0
  },
  Class: (uint8) 2,
  SubClass: (uint8) 0,
  IRQLine: (uint8) 0,
  IRQNO: (uint8) 32
 })
}
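
The dumps print vendor and device IDs in decimal, which makes them hard to compare with PCI ID databases that use hex. The sketch below converts the EC2 entries to the usual `vendor:device` hex form and names the base class codes that appear in the dumps. The `pciDev` type and the `className` and `describe` helpers are illustrative only and are not part of the eggos `pci` package.

```go
package main

import "fmt"

// pciDev mirrors only the fields needed from the dumps above.
type pciDev struct {
	Vendor, Device uint16
	Class          uint8
}

// className maps the PCI base class codes seen in the dumps to their standard names.
func className(c uint8) string {
	switch c {
	case 0x00:
		return "unclassified"
	case 0x01:
		return "mass storage controller"
	case 0x02:
		return "network controller"
	case 0x03:
		return "display controller"
	case 0x06:
		return "bridge"
	}
	return "other"
}

// describe formats a device as vendor:device in hex followed by its class name.
func describe(d pciDev) string {
	return fmt.Sprintf("%04x:%04x %s", d.Vendor, d.Device, className(d.Class))
}

func main() {
	// Values copied from the EC2 dump above.
	ec2 := []pciDev{
		{32902, 4663, 6},
		{32902, 28672, 6},
		{32902, 28947, 0},
		{7439, 4369, 3},
		{7439, 32865, 1},
		{7439, 60448, 2},
	}
	for _, d := range ec2 {
		fmt.Println(describe(d))
	}
}
```

Read this way, the only class 2 (network) device on EC2 is `1d0f:ec20`, while QEMU's class 2 device is `8086:100e`, the e1000 that the driver looks for.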

@icexin
Owner

icexin commented Aug 15, 2021

Thank you very much for your report.

eggos currently has a driver only for Intel's e1000 series of network cards. From the PCI list you provided, the network card on EC2 is the class 2 device with vendor 7439 (0x1d0f) and device 60448 (0xec20), which is an Amazon-specific model; see https://www.pcilookup.com/?ven=1d0f&dev=&action=submit. I will try to add a virtio driver later so that the network module can be used on cloud servers. Does EC2 support virtio network cards?

I'm also curious how you built eggos into an EC2 image.

Thanks again for sharing eggos' attempts on the cloud server.

@aidansteele
Author

Hi @icexin, thanks for your quick feedback! I realised the next day that you are indeed correct: EC2 does not use the e1000, so I understand why eggos couldn't find it 😄

I did some more research and the network hardware for EC2 is actually quite complex. There are at least three options that I am aware of:

  • Elastic Network Adapter (ENA), which is what appears in the EC2 PCI list above, on the newest instance types (the ones with a "5" in them, e.g. c5, m5, r5).
  • Intel 82599 VF (which uses Intel's ixgbevf driver) on c3, c4, m4 and a few others.
  • Xen's VIF on older instances.

The ENA drivers are open source and seem to be well documented, but I am not enough of an expert to reimplement them in Go: https://github.com/amzn/amzn-drivers/tree/master/kernel/linux/ena. ENA would be the best-performing driver to port, but I assume it's complex.

The older instances (the ones without ENA) are based on Xen and have the "Xen Platform Device" PCI device attached. It has vendor ID 0x5853 and device ID 0x0001. This appears to be documented here (ctrl+F for xen_platform_pci=1). The network driver on Linux is here.
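
To make the options above concrete, here is a rough sketch of how a kernel could pick a network driver from the PCI IDs it discovers. It is an illustration only: the `pciID` type, the `nicDrivers` table and `pickDriver` are hypothetical names, and eggos has only the e1000 driver today. The IDs `1d0f:ec20` and `5853:0001` come from this thread; `8086:10ed` for the 82599 virtual function is recalled from the public PCI ID database and should be checked before relying on it.

```go
package main

import "fmt"

// pciID identifies a device by vendor and device ID.
type pciID struct{ Vendor, Device uint16 }

// nicDrivers maps known IDs to a driver label. Only "e1000" exists in eggos;
// the other labels name the Linux drivers that a port would be modelled on.
var nicDrivers = map[pciID]string{
	{0x8086, 0x100e}: "e1000",        // QEMU's default NIC
	{0x1d0f, 0xec20}: "ena",          // Elastic Network Adapter from the EC2 dump
	{0x8086, 0x10ed}: "ixgbevf",      // Intel 82599 virtual function (assumed ID)
	{0x5853, 0x0001}: "xen-platform", // Xen Platform Device; the NIC itself is paravirtual
}

// pickDriver returns the label of the first discovered device that has a table entry.
func pickDriver(found []pciID) (string, bool) {
	for _, id := range found {
		if name, ok := nicDrivers[id]; ok {
			return name, true
		}
	}
	return "", false
}

func main() {
	ec2 := []pciID{{0x8086, 0x1237}, {0x1d0f, 0x8061}, {0x1d0f, 0xec20}}
	if name, ok := pickDriver(ec2); ok {
		fmt.Println("would need driver:", name)
	} else {
		fmt.Println("no supported NIC found")
	}
}
```

A table like this also gives a natural place to log which unsupported NIC was seen, instead of only reporting that the e1000 is missing.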

Regarding eggos -> EC2 image: I can submit a PR sometime in the next few days with a script to do it. But the approximate process is:

  • egg build
  • egg pack to ISO
  • upload ISO to S3
  • aws ec2 import-snapshot --format RAW --user-bucket ... path-to-iso-in-S3 (gives a snapshot ID)
  • aws ec2 register-image ... snapshot-id $snapshotID (gives an AMI ID)
  • aws ec2 run-instances ... --image-id $amiID

Sorry this is not more detailed; the script is on my other computer. I will submit a PR soon.
