Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Discovery seems to take two or more tries to work #2

Open
fhunleth opened this issue Jul 19, 2017 · 28 comments
Open

Discovery seems to take two or more tries to work #2

fhunleth opened this issue Jul 19, 2017 · 28 comments

Comments

@fhunleth
Copy link
Contributor

This results in a timeout the first time that you try to contact the device running the mdns library. The second or third time seem to be ok.

@fhunleth
Copy link
Contributor Author

@mobileoverlord @tmecklem Do either you have anything to add? I'm trying to capture the conversation on July 12th regarding this issue since others are seeing it as well.

@tmecklem
Copy link

That captures it pretty well. I'm willing to do more investigation or capture logs if needed. I'm working with this (and experiencing it) quite a bit now.

@fhunleth
Copy link
Contributor Author

Believe it or not, I'm not sure that I'm seeing it on Linux. Either that or I'm getting really lucky.

@entone
Copy link
Contributor

entone commented Jul 19, 2017

I'm wondering if you guys can capture the IP address of the node when it's not working and when it is working. There's a TTL setting for the IP registration, I'm wondering if there may be a cache issue from previous deployment attempts.

I'm on Linux as well and will start investigating. The "one" test for the mDNS library "should" fail if this was an issue, but that does run locally, so maybe certain routers are at fault.

@entone
Copy link
Contributor

entone commented Jul 19, 2017

@tmecklem could you run the mix test in you local dev environment?

@entone
Copy link
Contributor

entone commented Jul 19, 2017

I've updated the tests to be a bit more thorough

@tmecklem
Copy link

tmecklem commented Jul 19, 2017

System Software Overview:

  System Version:	macOS 10.12.5 (16F73)
  Kernel Version:	Darwin 16.6.0
  Boot Volume:	Macintosh HD
  Boot Mode:	Normal
  Computer Name:	xxxxxx
  User Name:	xxxxxx
  Secure Virtual Memory:	Enabled
  System Integrity Protection:	Enabled
  Time since boot:	1 day 5:18

Hardware Overview:

  Model Name:	MacBook Pro
  Model Identifier:	MacBookPro11,2
  Processor Name:	Intel Core i7
  Processor Speed:	2 GHz
  Number of Processors:	1
  Total Number of Cores:	4
  L2 Cache (per Core):	256 KB
  L3 Cache:	6 MB
  Memory:	8 GB
  Boot ROM Version:	MBP112.0138.B25
  SMC Version (system):	2.18f15
13:01 $ mix test
warning: variable "address" is unused
  test/mdns_test.exs:45

warning: variable "char_host" is unused
  test/mdns_test.exs:45

warning: variable "address" is unused
  test/mdns_test.exs:56


13:01:03.622 [debug] Testing Server

13:01:03.625 [debug] [flags: [:up, :loopback, :running, :multicast], addr: {127, 0, 0, 1}, netmask: {255, 0, 0, 0}, addr: {0, 0, 0, 0, 0, 0, 0, 1}, netmask: {65535, 65535, 65535, 65535, 65535, 65535, 65535, 65535}, addr: {65152, 0, 0, 0, 0, 0, 0, 1}, netmask: {65535, 65535, 65535, 65535, 0, 0, 0, 0}]

13:01:03.626 [debug] [flags: [:pointtopoint, :multicast]]

13:01:03.626 [debug] [flags: []]

13:01:03.626 [debug] [flags: [:up, :broadcast, :running, :multicast], hwaddr: [120, 49, 193, 198, 55, 92], addr: {65152, 0, 0, 0, 7295, 44286, 11481, 20117}, netmask: {65535, 65535, 65535, 65535, 0, 0, 0, 0}, addr: {10, 20, 30, 37}, netmask: {255, 255, 255, 0}, broadaddr: {10, 20, 30, 255}]

13:01:03.626 [debug] {65152, 0, 0, 0, 7295, 44286, 11481, 20117}

13:01:03.648 [debug] Address: {65152, 0, 0, 0, 7295, 44286, 11481, 20117}

13:01:03.648 [debug] Hostname: usVTVKQYsw.local


  1) test client and server (MdnsTest)
     test/mdns_test.exs:26
     ** (exit) exited in: GenServer.call(Mdns.Server, {:start, []}, 5000)
         ** (EXIT) an exception was raised:
             ** (MatchError) no match of right hand side value: {:error, :eaddrinuse}
                 (mdns) lib/mdns/server.ex:66: Mdns.Server.handle_call/3
                 (stdlib) gen_server.erl:636: :gen_server.try_handle_call/4
                 (stdlib) gen_server.erl:665: :gen_server.handle_msg/6
                 (stdlib) proc_lib.erl:247: :proc_lib.init_p_do_apply/3
     stacktrace:
       (elixir) lib/gen_server.ex:737: GenServer.call/3
       test/mdns_test.exs:33: (test)



Finished in 0.09 seconds
1 test, 1 failure

Randomized with seed 556480

13:01:03.668 [error] GenServer Mdns.Server terminating
** (MatchError) no match of right hand side value: {:error, :eaddrinuse}
    (mdns) lib/mdns/server.ex:66: Mdns.Server.handle_call/3
    (stdlib) gen_server.erl:636: :gen_server.try_handle_call/4
    (stdlib) gen_server.erl:665: :gen_server.handle_msg/6
    (stdlib) proc_lib.erl:247: :proc_lib.init_p_do_apply/3
Last message (from #PID<0.166.0>): {:start, []}
State: %Mdns.Server.State{ip: {0, 0, 0, 0}, services: [], udp: nil}
Client #PID<0.166.0> is alive
    (stdlib) gen.erl:169: :gen.do_call/4
    (elixir) lib/gen_server.ex:734: GenServer.call/3
    test/mdns_test.exs:33: MdnsTest."test client and server"/1
    (ex_unit) lib/ex_unit/runner.ex:302: ExUnit.Runner.exec_test/1
    (stdlib) timer.erl:166: :timer.tc/1
    (ex_unit) lib/ex_unit/runner.ex:250: anonymous fn/3 in ExUnit.Runner.spawn_test/3

@entone
Copy link
Contributor

entone commented Jul 19, 2017

Ok, to run mDNS locally on OSX you need to kill the default mDNS server running locally.
In OS X, you can disable it by opening terminal and typing the following:

launchctl unload /System/Library/LaunchDaemons/com.apple.mDNSResponder.plist
launchctl unload /System/Library/LaunchDaemons/com.apple.mDNSResponderHelper.plist

@entone
Copy link
Contributor

entone commented Jul 19, 2017

The test will probably still not work, as the address function is returning your ipv6 address, and I specifically look for the ipv4. Will try to fix.

@entone
Copy link
Contributor

entone commented Jul 20, 2017

Address function in the test is updated to return ipv4 addresses

@aadavids
Copy link

I'm one of the others seeing something similar, except I haven't been able to contact the device at all. This in Linux (Ubuntu 16.10). Here's my mix test output in my dev env:

$ mix test
warning: variable "address" is unused
  test/mdns_test.exs:46

warning: variable "char_host" is unused
  test/mdns_test.exs:46

warning: variable "address" is unused
  test/mdns_test.exs:57


15:40:02.072 [debug] Testing Server

15:40:02.073 [debug] [flags: [:up, :loopback, :running], hwaddr: [0, 0, 0, 0, 0, 0], addr: {127, 0, 0, 1}, netmask: {255, 0, 0, 0}, addr: {0, 0, 0, 0, 0, 0, 0, 1}, netmask: {65535, 65535, 65535, 65535, 65535, 65535, 65535, 65535}]

15:40:02.073 [debug] [flags: [:up, :broadcast, :running, :multicast], hwaddr: [68, 28, 168, 228, 46, 205], addr: {10, 0, 1, 143}, netmask: {255, 255, 255, 0}, broadaddr: {10, 0, 1, 255}, addr: {65152, 0, 0, 0, 60569, 35875, 46618, 58212}, netmask: {65535, 65535, 65535, 65535, 0, 0, 0, 0}]

15:40:02.073 [debug] {10, 0, 1, 143}

15:40:02.084 [debug] Address: {10, 0, 1, 143}

15:40:02.084 [debug] Hostname: dcFIOwMwed.local

15:40:02.089 [debug] {:error, :nxdomain}


  1) test client and server (MdnsTest)
     test/mdns_test.exs:27
     match (=) failed
     code:  {:ok, {:hostent, char_host, [], :inet, 4, [address]}} = lookup
     right: {:error, :nxdomain}
     stacktrace:
       test/mdns_test.exs:46: (test)



Finished in 0.06 seconds
1 test, 1 failure

Randomized with seed 21530

@entone
Copy link
Contributor

entone commented Jul 20, 2017

Odd that a hostname lookup would fail in .005 seconds. The mDNS server doesn't even log the event on my test for .1 seconds.

15:29:53.363 [debug] Testing Server

15:29:53.364 [debug] [flags: [:up, :loopback, :running], hwaddr: [0, 0, 0, 0, 0, 0], addr: {127, 0, 0, 1}, netmask: {255, 0, 0, 0}, addr: {0, 0, 0, 0, 0, 0, 0, 1}, netmask: {65535, 65535, 65535, 65535, 65535, 65535, 65535, 65535}]

15:29:53.364 [debug] [flags: [:up, :broadcast, :running, :multicast], hwaddr: [128, 250, 91, 28, 72, 194]]

15:29:53.364 [debug] [flags: [:up, :broadcast, :running, :multicast], hwaddr: [172, 253, 206, 104, 164, 192], addr: {192, 168, 1, 112}, netmask: {255, 255, 255, 0}, broadaddr: {192, 168, 1, 255}, addr: {65152, 0, 0, 0, 58013, 57918, 36336, 23701}, netmask: {65535, 65535, 65535, 65535, 0, 0, 0, 0}]

15:29:53.364 [debug] {192, 168, 1, 112}

15:29:53.375 [debug] Address: {192, 168, 1, 112}

15:29:53.375 [debug] Hostname: dg5DoV1zN1.local

15:29:53.481 [debug] Got Query: %DNS.Record{anlist: [], arlist: [], header: %DNS.Header{aa: false, id: 0, opcode: :query, pr: false, qr: false, ra: false, rcode: 0, rd: false, tc: false}, nslist: [], qdlist: [%DNS.Query{class: :in, domain: 'dg5DoV1zN1.local', type: :a}]}

15:29:53.482 [debug] Got Query: %DNS.Record{anlist: [], arlist: [], header: %DNS.Header{aa: false, id: 0, opcode: :query, pr: false, qr: false, ra: false, rcode: 0, rd: false, tc: false}, nslist: [], qdlist: [%DNS.Query{class: :in, domain: 'dg5DoV1zN1.local', type: :a}]}

15:29:53.482 [debug] Got Query: %DNS.Record{anlist: [], arlist: [], header: %DNS.Header{aa: false, id: 0, opcode: :query, pr: false, qr: false, ra: false, rcode: 0, rd: false, tc: false}, nslist: [], qdlist: [%DNS.Query{class: :in, domain: 'dg5DoV1zN1.local', type: :a}]}

15:29:53.482 [debug] {:ok, {:hostent, 'dg5DoV1zN1.local', [], :inet, 4, [{192, 168, 1, 112}]}}

15:29:53.482 [debug] Testing Client

15:29:53.482 [debug] Got Query: %DNS.Record{anlist: [], arlist: [], header: %DNS.Header{aa: false, id: 0, opcode: :query, pr: false, qr: false, ra: false, rcode: 0, rd: false, tc: false}, nslist: [], qdlist: [%DNS.Query{class: :in, domain: 'dg5DoV1zN1.local', type: :a}]}

15:29:53.482 [debug] Got Query: %DNS.Record{anlist: [], arlist: [], header: %DNS.Header{aa: false, id: 0, opcode: :query, pr: false, qr: false, ra: false, rcode: 0, rd: false, tc: false}, nslist: [], qdlist: [%DNS.Query{class: :in, domain: 'dg5DoV1zN1.local', type: :a}]}

15:29:53.483 [debug] Got Query: %DNS.Record{anlist: [], arlist: [], header: %DNS.Header{aa: false, id: 0, opcode: :query, pr: false, qr: false, ra: false, rcode: 0, rd: false, tc: false}, nslist: [], qdlist: [%DNS.Query{class: :in, domain: 'dg5DoV1zN1.local', type: :a}]}

15:29:53.483 [debug] Got Response: %DNS.Record{anlist: [%DNS.Resource{bm: [], class: :in, cnt: 0, data: {192, 168, 1, 112}, domain: 'dg5DoV1zN1.local', func: false, tm: :undefined, ttl: 10, type: :a}], arlist: [], header: %DNS.Header{aa: true, id: 0, opcode: :query, pr: false, qr: true, ra: false, rcode: 0, rd: false, tc: false}, nslist: [], qdlist: []}

15:29:53.483 [debug] Got Query: %DNS.Record{anlist: [], arlist: [], header: %DNS.Header{aa: false, id: 0, opcode: :query, pr: false, qr: false, ra: false, rcode: 0, rd: false, tc: false}, nslist: [], qdlist: [%DNS.Query{class: :in, domain: '_nerves._tcp.local', type: :ptr}]}

15:29:53.483 [debug] Got Response: %DNS.Record{anlist: [%DNS.Resource{bm: [], class: :in, cnt: 0, data: {192, 168, 1, 112}, domain: 'dg5DoV1zN1.local', func: false, tm: :undefined, ttl: 10, type: :a}], arlist: [], header: %DNS.Header{aa: true, id: 0, opcode: :query, pr: false, qr: true, ra: false, rcode: 0, rd: false, tc: false}, nslist: [], qdlist: []}

15:29:53.483 [debug] Got Response: %DNS.Record{anlist: [%DNS.Resource{bm: [], class: :in, cnt: 0, data: '_rosetta._tcp.local', domain: '_nerves._tcp.local', func: false, tm: :undefined, ttl: 10, type: :ptr}], arlist: [], header: %DNS.Header{aa: true, id: 0, opcode: :query, pr: false, qr: true, ra: false, rcode: 0, rd: false, tc: false}, nslist: [], qdlist: []}
.
15:29:53.483 [debug] Device: {:"_nerves._tcp.local", %Mdns.Client.Device{domain: nil, ip: {192, 168, 1, 112}, payload: %{}, services: ["_rosetta._tcp.local", "_nerves._tcp.local"]}}

@entone
Copy link
Contributor

entone commented Jul 20, 2017

Can you modify your test to not include the .local in the host name? and report back

@entone
Copy link
Contributor

entone commented Jul 20, 2017

There's also this bug for 16.10, https://askubuntu.com/questions/837982/how-to-configure-local-dns-lookup-in-ubuntu-16-10#838395

Oh mDNS, the ugly duckling of the network world, so much promise...

@aadavids
Copy link

Yes, I was wondering if it was an issue with my environment. Here's the modified test:

$ mix test
warning: variable "address" is unused
  test/mdns_test.exs:46

warning: variable "char_host" is unused
  test/mdns_test.exs:46

warning: variable "address" is unused
  test/mdns_test.exs:57


16:41:33.399 [debug] Testing Server

16:41:33.401 [debug] [flags: [:up, :loopback, :running], hwaddr: [0, 0, 0, 0, 0, 0], addr: {127, 0, 0, 1}, netmask: {255, 0, 0, 0}, addr: {0, 0, 0, 0, 0, 0, 0, 1}, netmask: {65535, 65535, 65535, 65535, 65535, 65535, 65535, 65535}]

16:41:33.401 [debug] [flags: [:up, :broadcast, :running, :multicast], hwaddr: [68, 28, 168, 228, 46, 205], addr: {10, 0, 1, 143}, netmask: {255, 255, 255, 0}, broadaddr: {10, 0, 1, 255}, addr: {65152, 0, 0, 0, 60569, 35875, 46618, 58212}, netmask: {65535, 65535, 65535, 65535, 0, 0, 0, 0}]

16:41:33.401 [debug] {10, 0, 1, 143}

16:41:33.410 [debug] Address: {10, 0, 1, 143}

16:41:33.410 [debug] Hostname: SnzOJR7UJG

16:41:34.344 [debug] {:error, :nxdomain}


  1) test client and server (MdnsTest)
     test/mdns_test.exs:27
     match (=) failed
     code:  {:ok, {:hostent, char_host, [], :inet, 4, [address]}} = lookup
     right: {:error, :nxdomain}
     stacktrace:
       test/mdns_test.exs:46: (test)



Finished in 0.9 seconds
1 test, 1 failure

Randomized with seed 350735

@entone
Copy link
Contributor

entone commented Jul 20, 2017

Ok, same issue. Are you able to ping any other local machines on your network?

@aadavids
Copy link

Update: I tried reinstalling libnss-resolve (sudo apt purge libnss-resolve and sudo apt install libnss-resolve), and now the unmodified tests pass.

@aadavids
Copy link

$ mix test                                                                                                                                                      
warning: variable "address" is unused
  test/mdns_test.exs:46

warning: variable "char_host" is unused
  test/mdns_test.exs:46

warning: variable "address" is unused
  test/mdns_test.exs:57


16:47:58.365 [debug] Testing Server

16:47:58.366 [debug] [flags: [:up, :loopback, :running], hwaddr: [0, 0, 0, 0, 0, 0], addr: {127, 0, 0, 1}, netmask: {255, 0, 0, 0}, addr: {0, 0, 0, 0, 0, 0, 0, 1}, netmask: {65535, 65535, 65535, 65535, 65535, 65535, 65535, 65535}]

16:47:58.367 [debug] [flags: [:up, :broadcast, :running, :multicast], hwaddr: [68, 28, 168, 228, 46, 205], addr: {10, 0, 1, 143}, netmask: {255, 255, 255, 0}, broadaddr: {10, 0, 1, 255}, addr: {65152, 0, 0, 0, 60569, 35875, 46618, 58212}, netmask: {65535, 65535, 65535, 65535, 0, 0, 0, 0}]

16:47:58.367 [debug] {10, 0, 1, 143}

16:47:58.376 [debug] Address: {10, 0, 1, 143}

16:47:58.376 [debug] Hostname: QWecd3h2EW.local

16:47:58.497 [debug] Got Query: %DNS.Record{anlist: [], arlist: [], header: %DNS.Header{aa: false, id: 0, opcode: :query, pr: false, qr: false, ra: false, rcode: 0, rd: false, tc: false}, nslist: [], qdlist: [%DNS.Query{class: :in, domain: 'QWecd3h2EW.local', type: :a}]}

16:47:58.505 [debug] {:ok, {:hostent, 'QWecd3h2EW.local', [], :inet, 4, [{10, 0, 1, 143}]}}

16:47:58.505 [debug] Testing Client

16:47:58.506 [debug] Got Query: %DNS.Record{anlist: [], arlist: [], header: %DNS.Header{aa: false, id: 0, opcode: :query, pr: false, qr: false, ra: false, rcode: 0, rd: false, tc: false}, nslist: [], qdlist: [%DNS.Query{class: :in, domain: '_nerves._tcp.local', type: :ptr}]}

16:47:58.507 [debug] Got Response: %DNS.Record{anlist: [%DNS.Resource{bm: [], class: :in, cnt: 0, data: '_rosetta._tcp.local', domain: '_nerves._tcp.local', func: false, tm: :undefined, ttl: 10, type: :ptr}], arlist: [], header: %DNS.Header{aa: true, id: 0, opcode: :query, pr: false, qr: true, ra: false, rcode: 0, rd: false, tc: false}, nslist: [], qdlist: []}
.
16:47:58.507 [debug] Device: {:"_nerves._tcp.local", %Mdns.Client.Device{domain: nil, ip: {10, 0, 1, 143}, payload: %{}, services: ["_rosetta._tcp.local", "_nerves._tcp.local"]}}


Finished in 0.1 seconds
1 test, 0 failures

Randomized with seed 314710

@entone
Copy link
Contributor

entone commented Jul 20, 2017

excellent, thank you for testing this. hopefully that resolves your network deployment tests as well.

I always run an LTS version of Ubuntu, for the record.

@aadavids
Copy link

Yeah, sticking with LTS is probably the smarter thing to do :).

Unfortunately still unable to contact the device.

@entone
Copy link
Contributor

entone commented Jul 20, 2017

can you run ping nerves.local or whatever the hostname is set to for the device.

are you sure the device is on the network?

Is this rpi0w?

@aadavids
Copy link

It's a BeagleBone black, running a port of "gadget mode" from the rpi0. So it's supposed to set up a connection over the USB interface.

I'm unable to ping the BBB from linux. Interestingly, I tried the same device with my mac at home, and it worked. The first ping packet always times out, but after that it's good. So the issue on linux might be unrelated.

@fhunleth
Copy link
Contributor Author

@aadavids Hmm... Could you try a couple things on Linux? First, could you check that the virtual Ethernet interface that shows up on Linux is getting assigned a link local IP address? For example, run ifconfig -a in a terminal and look for a 169.254.0.0/16 address . The next thing to try is to run wireshark and capture all of the traffic on the virtual Ethernet interface. Wireshark labels packets, so you should see some mDNS packets go through. Or, maybe this is another Ubuntu 16.10 bug?

@aadavids
Copy link

Sorry, I won't have that much time to test things until Monday, but here's the network config:

enp0s20f0u1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet6 fe80::3d1b:5fa7:7b82:5256  prefixlen 64  scopeid 0x20<link>
        ether 1a:52:97:d4:b5:03  txqueuelen 1000  (Ethernet)
        RX packets 2  bytes 80 (80.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 45  bytes 6338 (6.3 KB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

So it looks like it's getting an ipv6 address?

@entone
Copy link
Contributor

entone commented Jul 24, 2017

Hmm, if the virtual interface is only getting an ipv6 address, that could be an issue. I haven't tested mDNS with ipv6 and I'm pretty sure the rpi0 firmware only uses ipv4 as well.

@fhunleth
Copy link
Contributor Author

I had thought that IPv4 link local addresses would be assigned by default unless they were explicitly disabled. It may be that the system is waiting for DHCP to time out before assigning an address?

Also, regarding rpi0 firmware only using ipv4, that's sadly correct. For some reason I had thought that it just wasn't used. Not that it will change much since applications need to support it, but I'll start moving it into the systems: nerves-project/nerves_system_rpi0#19.

@aadavids
Copy link

I captured some of the traffic on the virtual Ethernet interface and I can see MDNS packets. I also tried to ping the device a couple times while wireshark was running. mdns.pcapng.tar.gz

So then I guess the issue seems to be that the virtual Ethernet interface is getting an ipv6 address?

@fhunleth
Copy link
Contributor Author

Update: Since we switched away from using IPv4 link-local addressing on USB gadget Ethernet with the official Nerves systems, mDNS has been work very well. If anyone else runs into this and is running Nerves, definitely verify that you're not using link-local addresses on the connection.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants