June 4, 2025
From DNS Failures to Resilience: How NodeLocal DNSCache Saved the Day, Sanu Satyadarshi
I am Sanu Satyadarshi, part of the Platform Engineering division at Mercari, Inc. Platform Engineering provides a cost-effective, safe, and easy-to-use multi-cloud infrastructure service for all engineering teams to make and scale bets.

This article discusses the DNS-related challenges we encountered on our Kubernetes clusters at Mercari and the significant improvements we achieved by implementing NodeLocal DNSCache. By optimizing DNS traffic and reducing errors, we enhanced system reliability and scalability and prevented production outages caused by DNS failures.

Key Takeaways

Reduced DNS calls to kube-dns by 10x, decreasing network overhead and inter-service communication costs.
Lowered DNS query rates by 93% for services on the cluster.
Achieved a 10x-100x reduction in DNS-level errors, improving system resilience.
Eliminated the “failed to refresh DNS cache” errors, mitigating a frequent source of incidents.

DNS on Kubernetes: The Elephant in the Room

The Domain Name System, more commonly known as DNS, is an extremely critical component of internet infrastructure. It is the technology that lets your web browser find the actual IP address of a website when you type example.com into the address bar. DNS is itself a highly complex topic, and understanding it requires a book (or two) of its own.

Like any network infrastructure, Kubernetes depends on DNS to resolve service names such as [service name].[namespace].svc.cluster.local, along with other names, to IP addresses, which allows services to communicate with each other and with the external world. Given this role, any DNS failure or degradation can quickly escalate into increased latency, network congestion, and even complete outages.

On Kubernetes, DNS is installed as a kube-dns deployment running in the kube-system namespace. At Mercari, it comes pre-installed with our managed GKE clusters and handles service discovery and name resolution across the clusters.
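
To make the service-name pattern above concrete, here is a minimal Python sketch that builds a name of the form [service name].[namespace].svc.cluster.local and tries to resolve it. The service name "payments" and namespace "checkout" are hypothetical, and the helper functions are illustrative only, not part of any Mercari tooling. Resolution only succeeds when run from inside a cluster where such a service exists.

```python
import socket


def service_fqdn(service, namespace, cluster_domain="cluster.local"):
    """Build the in-cluster DNS name for a Kubernetes Service."""
    return f"{service}.{namespace}.svc.{cluster_domain}"


def resolve(name, port=80):
    """Return the sorted unique IP addresses for name, or [] if lookup fails."""
    try:
        infos = socket.getaddrinfo(name, port, proto=socket.IPPROTO_TCP)
    except socket.gaierror:
        # A failed or timed-out lookup surfaces here; callers see an empty list.
        return []
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})


# Hypothetical service and namespace, used only for illustration.
name = service_fqdn("payments", "checkout")
print(name)
```

Calling resolve(name) from a pod sends the lookup through the cluster DNS service, which is the path that every service-to-service call depends on.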
kube-dns on Kubernetes can be configured through its ConfigMap, which can be used to change various parameters such as ndots. As kube-dns is responsible for resolving all…
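
As a rough illustration of why a setting like ndots matters for DNS load, the sketch below lists the names a glibc-style stub resolver may try for a single lookup. It assumes the search list and ndots value of 5 that Kubernetes typically writes into a pod's resolv.conf for a pod in the default namespace. The function name candidate_queries is hypothetical, and the list is the worst case: a real resolver stops at the first name that resolves.

```python
def candidate_queries(name, search_domains, ndots=5):
    """List the names a stub resolver would try, in order (glibc-style)."""
    if name.endswith("."):
        # A trailing dot marks the name as fully qualified: no search list.
        return [name]
    expanded = [f"{name}.{domain}." for domain in search_domains]
    absolute = f"{name}."
    if name.count(".") >= ndots:
        # Enough dots: try the name as given first, then the search list.
        return [absolute] + expanded
    # Too few dots: walk the search list first, then the name as given.
    return expanded + [absolute]


# Assumed search list for a pod in the "default" namespace.
SEARCH = ["default.svc.cluster.local", "svc.cluster.local", "cluster.local"]

queries = candidate_queries("example.com", SEARCH)
for q in queries:
    print(q)
```

Under these assumptions, one lookup of an external name such as example.com can turn into four queries, and usually double that when both A and AAAA records are requested.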