[clug] dns woes

Hal Ashburner hal.ashburner at gmail.com
Tue Apr 9 03:48:01 UTC 2019


I'm having a shocker with dns just now: on my home network it intermittently
fails to resolve some sites, seemingly with no rhyme or reason, then just as
randomly rights itself.

I'm using an old x86-64 computer with two ethernet rj45 ports as a router,
running vyos (debian with a router-like ui): http://vyos.io

This router acts as a dhcp server. It hands out its own local ip address
over dhcp as the default-router and dns-server.

The router is set up as a dns forwarder using pdns_recursor, which is what
vyos 1.2.0 seems to use.

Much of the time it all works just fine. dns seems to always be resolved on
the router itself.

i.e. google.com fails to resolve on my laptop; I ssh onto the router, where
"host google.com" resolves just fine while it is still failing on the laptop;
2 minutes later it works again for no discernible reason.

In desperation I added 1.1.1.1 (Cloudflare) as a second dns server to be
handed out by the dhcp server in addition to the router itself, such that
cat /etc/resolv.conf on (say) a laptop getting its network conf from the
router reads:

nameserver 192.168.18.1
nameserver 1.1.1.1

noting that on the laptop resolv.conf is a symlink, as follows:
$ ls -lh /etc/resolv.conf
lrwxrwxrwx 1 root root 32 Feb  6 00:26 /etc/resolv.conf -> /run/systemd/resolve/resolv.conf



My laptop has 2 nameservers it can query and still fails. If I manually
remove the 192.x entry it immediately, magically works, suggesting that
having 2 nameservers doesn't do what I think it does (i.e. first one fails,
maybe try the second now please? That would make sense to me, otherwise why
bother allowing a second in the config?)
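
For what it's worth, resolv.conf(5) describes the fallback as timeout-driven:
the resolver tries the first nameserver and only moves on to the next when the
query times out (5 seconds by default). As far as I can tell, a first server
that answers quickly with a negative answer such as NXDOMAIN is simply taken
at its word. Below is a minimal sketch, assuming Python 3 on the laptop, of
asking each server on its own to see which case this is. The function names
are mine and not part of any library, and it only sends a single A query over
UDP.

```python
import random
import socket
import struct

RCODES = {0: "NOERROR", 2: "SERVFAIL", 3: "NXDOMAIN", 5: "REFUSED"}


def build_query(name, qtype=1):
    """Return (query_id, packet) for a recursive query; qtype 1 is an A record."""
    query_id = random.randint(0, 0xFFFF)
    # Header: id, flags (only RD set), 1 question, 0 answer/authority/additional.
    header = struct.pack(">HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    labels = name.rstrip(".").encode("ascii").split(b".")
    qname = b"".join(bytes([len(label)]) + label for label in labels) + b"\x00"
    # Question section ends with the query type and class IN (1).
    return query_id, header + qname + struct.pack(">HH", qtype, 1)


def read_header(packet):
    """Return (query_id, rcode name, answer count) from a DNS response."""
    query_id, flags, _qd, ancount, _ns, _ar = struct.unpack(">HHHHHH", packet[:12])
    rcode = flags & 0x000F
    return query_id, RCODES.get(rcode, str(rcode)), ancount


def probe(server, name, timeout=5.0):
    """Ask one nameserver directly over UDP and describe what came back."""
    query_id, packet = build_query(name)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(packet, (server, 53))
            reply, _addr = sock.recvfrom(512)
        except socket.timeout:
            return "timeout"
        except OSError as exc:
            return "error: %s" % exc
    if len(reply) < 12:
        return "short reply"
    reply_id, rcode, ancount = read_header(reply)
    if reply_id != query_id:
        return "mismatched reply id"
    return "%s with %d answer(s)" % (rcode, ancount)
```

Calling probe("192.168.18.1", "google.com") and probe("1.1.1.1", "google.com")
during an outage should show whether the router is silent (timeout) or
actively returning an error. Run on the router against 192.168.18.1, it would
also exercise pdns_recursor itself rather than whatever resolver the router's
own resolv.conf points at, which I'm assuming may be different.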

My laptop here is just an example; dns randomly fails on anything attached
to the network. Most of the time it's just fine, but about 2 - 3 times a
day it fails totally for up to 5 minutes. Sometimes not for every hostname
either.

I really do want my router to control dns so it knows about the local
network and I can address things by name. This fallback to some public dns
server is out of sheer frustration and even that doesn't work! Configuring
a local machine to use 1.1.1.1 or 8.8.8.8 always works for names outside my
network.

I've tried making sure port 53 is accepted through the firewall for both
tcp and udp.
I've tried switching dnssec off as pdns seems to have some bugs there.
I've tried setting the cache size to 0 and to 128 seemingly with no effect.
There is nothing useful in the router logs. Aggressively logging dropped
packets doesn't show anything relating to dns.

I've tested that pdns_recursor is up using
$ sudo rec_control ping
and it's fine - even while I'm getting no dns resolution for something on
my lan and perfect resolution on the router itself.

The fact it fails after considerable time and then fixes itself is
troubling to me. Surely it should fail consistently if it is going to fail
due to config error?

vyos shares its Vyatta ancestry with the EdgeOS in ubiquiti gear, which has
a decent reputation. The configuration commands, rollback, save and other UI
stuff seem like they're not terrible.

To show the struggle is real I take it out on google: as ever when
searching for something specific like this, google basically hates you and
will randomly drop words from your query so it can spam you with irrelevant
results. Even if you use allintext: or quote everything and it claims to
have found 6 results, searching the contents of the pages it found for you
with one of your keywords will frequently come up empty. "Oh you really want
that keyword, no google knows better". Between systemd and everything else
changing, most of what the internet says is now seemingly irrelevant when it
comes to dns anyway, and google isn't the help it could be in excluding all
that information that is now completely wrong for the problem you are trying
to solve. But what's better than goog, right? They've got
the size to have the data so can be deliberately bad when it suits them,
which one would imagine is to make money, same as any other monopoly as
defined by a decreasing long run average total cost. ("You don't have to
buy water from us with the taps, buy it from the supermarket if you don't
like it.")

And I'll whine because whatever problem was being solved with all these dns
changes wasn't a problem I had on my home network for the past decade and a
half or so but now I've got a bunch of them and I feel like a moron in my
ability to solve them.

I'm presently doing a tcpdump of port 53 on the router lan interface and
waiting for it to fail again which it probably won't for hours. I'm not
sure what I would see there anyway. To be honest it seems a little
ridiculous. (And it also weirds me out how much kit now phones home for
some purpose seemingly not to serve its owner, was there really informed
consent for that?)
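
Since the failure may be hours away, something needs to note when it happens
so the capture can be searched around that time. Here is a rough sketch,
assuming Python 3 on a LAN client, that polls the system resolver and prints a
timestamped line only when resolution flips between working and failing. The
names are mine, and socket.getaddrinfo goes through whatever local caching the
client has, so a cached name could hide a short outage.

```python
import socket
import time


def resolves(name):
    """Return True if the system resolver (the path applications use) finds the name."""
    try:
        socket.getaddrinfo(name, None)
        return True
    except socket.gaierror:
        return False


def samples(name, interval=10.0, check=resolves):
    """Yield (unix_time, ok) forever, one sample every `interval` seconds."""
    while True:
        yield time.time(), check(name)
        time.sleep(interval)


def changes(stream):
    """Pass through only the samples where ok differs from the previous sample."""
    previous = None
    for stamp, ok in stream:
        if ok != previous:
            yield stamp, ok
            previous = ok


def watch(name, interval=10.0):
    """Print a line each time resolution of `name` starts or stops working."""
    for stamp, ok in changes(samples(name, interval)):
        when = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(stamp))
        print(when, name, "ok" if ok else "FAILING", flush=True)
```

Leaving watch("google.com") running in a terminal gives a short list of
start and end times to compare against the timestamps in the tcpdump output.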

Any ideas welcome...
I guess deleting vyos and installing debian with dnsmasq, then doing all
the individual pieces of configuration (iptables, nat), is an option but
seems a shame.


Cheers,
Hal



pdns_recursor config file as configured indirectly by the vyos command line

vyos@graham-vyos:~$ cat /etc/powerdns/recursor.conf

### Autogenerated by dns_forwarding.py ###

# Non-configurable defaults
daemon=yes
threads=1
allow-from=0.0.0.0/0, ::/0
log-common-errors=yes
non-local-bind=yes
query-local-address=0.0.0.0
query-local-address6=::

# cache-size
max-cache-entries=128

# negative TTL for NXDOMAIN
max-negative-ttl=3600

# ignore-hosts-file
export-etc-hosts=yes

# listen-on
local-address=192.168.18.1

# domain ... server ...

# dnssec
dnssec=off

# name-server
forward-zones-recurse=.=103.217.165.53;45.248.197.53
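
The two addresses on the forward-zones-recurse line are the upstream servers
the recursor passes queries to, so they seem worth testing directly when the
failure occurs. Here is a small sketch, assuming Python 3 and the
comma-separated zone=addr;addr layout shown above, of pulling them out of the
config text. The function names are hypothetical, not part of pdns or vyos.

```python
def parse_forward_zones(value):
    """Map each zone to its forwarder addresses.

    Expects the value part of a forward-zones-recurse line:
    comma-separated zone=addr;addr entries.
    """
    zones = {}
    for entry in value.split(","):
        entry = entry.strip()
        if not entry:
            continue
        zone, _, servers = entry.partition("=")
        zones[zone.strip()] = [s.strip() for s in servers.split(";") if s.strip()]
    return zones


def forwarders_from_conf(text, key="forward-zones-recurse"):
    """Find the first `key=` line in recursor.conf text and parse its value."""
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("#") or "=" not in line:
            continue
        name, _, value = line.partition("=")
        if name.strip() == key:
            return parse_forward_zones(value)
    return {}
```

With the file above, forwarders_from_conf(open("/etc/powerdns/recursor.conf").read())
should give the "." zone mapped to the two upstream addresses, which can then
be queried one at a time.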

