Here’s a list of 100 common problems in Kubernetes LoadBalancer implementations, organized by category (architecture, networking, configuration, ingress integration, and cloud provider issues), each stated as a brief technical summary:
🧩 1. Architectural and Design-Level Issues
- Misunderstanding Layer 4 vs. Layer 7 load balancing.
- Using `LoadBalancer` type on bare-metal clusters without MetalLB or similar.
- Multiple LoadBalancers per service causing excessive cloud resource usage.
- No external IP assigned due to pending load balancer provisioning.
- Failure to expose internal services (wrong external/internal annotation).
- Inconsistent behavior across cloud providers (AWS vs GCP vs Azure).
- Exceeding the limit of allowed load balancers per cloud project.
- Misaligned CIDR ranges between cluster and external network.
- Overlapping service CIDRs causing routing conflicts.
- Using external load balancers without proper NAT handling.
- Ignoring idle connection timeouts in cloud LB (common in AWS ELB).
- Lack of HA strategy for single load balancer dependency.
- Not accounting for failover between multiple zones.
- LoadBalancer fronting another LoadBalancer (double LB hop).
- Insufficient throughput capacity for expected workloads.
- Using NodePort underneath without firewall rules for nodes.
- Load balancer not resilient to node restarts or scaling.
- Using wrong protocol type (`TCP` vs `UDP` vs `HTTP`).
- Exposing control plane components accidentally.
- Traffic not routed through kube-proxy (bypassing service rules).
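The CIDR items above (misaligned or overlapping ranges) can be checked mechanically before a cluster is built. The following is a minimal sketch using only Python's standard `ipaddress` module; the function name `find_overlapping_cidrs` and the example ranges are illustrative assumptions, not values from any particular cluster.

```python
import ipaddress
from itertools import combinations


def find_overlapping_cidrs(named_cidrs):
    """Return (name_a, name_b) pairs whose CIDR ranges overlap.

    `named_cidrs` maps a label (hypothetical, e.g. "pod", "service")
    to a CIDR string.
    """
    networks = {name: ipaddress.ip_network(cidr) for name, cidr in named_cidrs.items()}
    overlaps = []
    for (name_a, net_a), (name_b, net_b) in combinations(networks.items(), 2):
        # Only compare networks of the same IP version.
        if net_a.version == net_b.version and net_a.overlaps(net_b):
            overlaps.append((name_a, name_b))
    return overlaps


if __name__ == "__main__":
    # Example ranges are assumptions chosen to show one conflict.
    print(find_overlapping_cidrs({
        "pod": "10.244.0.0/16",
        "service": "10.96.0.0/12",
        "external": "10.100.0.0/24",
    }))
```

In this example the external range `10.100.0.0/24` falls inside the service range `10.96.0.0/12`, so that pair is reported.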
🌐 2. Networking and Connectivity Problems
- Misconfigured CNI plugin blocking external traffic.
- LoadBalancer not accessible due to missing external routes.
- NetworkPolicy blocking health check probes.
- Cloud firewall rules missing for NodePort ranges (30000–32767).
- Incorrect MTU leading to packet fragmentation/loss.
- Node IP not reachable from LB due to NAT misconfig.
- LoadBalancer health checks hitting wrong port or path.
- Source IP preserved incorrectly, breaking backend logic.
- Reverse path filtering causing dropped packets.
- Connection tracking issues (conntrack table overflow).
- Node local routing bypassing kube-proxy IPVS tables.
- Multiple NICs confusing the load balancer routing.
- BGP peering instability (in MetalLB setups).
- ARP/NDP conflicts between MetalLB speakers.
- VXLAN overlay interfering with external routes.
- Routing table overflow (too many routes).
- SNAT masking client IPs (breaking access logs).
- Kubernetes IPVS not syncing with kernel conntrack.
- ARP handling misconfigured on nodes (e.g., strict ARP not enabled with kube-proxy in IPVS mode; MetalLB L2 issue).
- Incorrect egress IP or masquerade setup.
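For the firewall item above, one way to catch a missing rule is to compare the port ranges a firewall allows against the NodePort range (30000–32767). This is a rough sketch; `uncovered_ports` and the input format (a list of inclusive `(low, high)` tuples) are assumptions for illustration and do not correspond to any cloud provider's API.

```python
NODEPORT_RANGE = (30000, 32767)  # NodePort range cited in the list above


def uncovered_ports(allowed_ranges, required=NODEPORT_RANGE):
    """Return the inclusive sub-ranges of `required` that no allowed range covers."""
    start, end = required
    gaps = []
    cursor = start  # lowest port not yet known to be covered
    for low, high in sorted(allowed_ranges):
        if high < cursor:
            continue  # rule lies entirely below the part still unchecked
        if low > end:
            break  # rule lies entirely above the required range
        if low > cursor:
            gaps.append((cursor, low - 1))
        cursor = max(cursor, high + 1)
        if cursor > end:
            break
    if cursor <= end:
        gaps.append((cursor, end))
    return gaps


if __name__ == "__main__":
    # Hypothetical firewall rules: SSH plus two partial NodePort ranges.
    print(uncovered_ports([(22, 22), (30000, 31000), (32000, 32767)]))
```

An empty result means the whole NodePort range is reachable; any tuple returned is a span of ports the load balancer could be assigned but cannot reach.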
⚙️ 3. Configuration and Annotation Errors
- Missing cloud-specific annotations (e.g., AWS ALB ingress annotations).
- Wrong load balancer class (`loadBalancerClass` field not set).
- Misconfigured health check path annotation.
- Backend protocol mismatch (`HTTP` vs `HTTPS`).
- Missing SSL certificate reference.
- Incorrect security group annotations.
- Service selector not matching any pods.
- Missing externalTrafficPolicy configuration.
- Misusing `sessionAffinity` settings.
- Wrong `loadBalancerIP` specified (not in pool).
- Missing `loadBalancerSourceRanges`.
- Disabled cross-zone load balancing by mistake.
- Using unsupported annotations in managed clusters.
- Forgetting to delete dangling LB when service is removed.
- Overly aggressive `externalTrafficPolicy=Local` causing node starvation.
- Conflicting annotations between multiple ingress controllers.
- Cloud provider ignoring unrecognized annotation.
- Unintentionally setting `loadBalancerSourceRanges: 0.0.0.0/0`.
- Auto-assigned IP not in allowed subnet range.
- Health probe ports mismatched with container ports.
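Several of the configuration mistakes above (a selector that matches no pods, open or missing `loadBalancerSourceRanges`, an unset `externalTrafficPolicy`) can be caught by linting the manifest before it is applied. The sketch below works on manifests already parsed into plain dicts; `lint_service` and its warning strings are hypothetical, and treating an unset `loadBalancerSourceRanges` as worth a warning is an assumption about the desired policy rather than a Kubernetes rule.

```python
def lint_service(service, pods):
    """Return a list of warning strings for a Service manifest (a dict)."""
    spec = service.get("spec", {})
    warnings = []

    # A Service selector is equality-based: every key/value must match a pod label.
    selector = spec.get("selector") or {}
    if selector:
        def matches(pod):
            labels = pod.get("metadata", {}).get("labels") or {}
            return all(labels.get(key) == value for key, value in selector.items())

        if not any(matches(pod) for pod in pods):
            warnings.append("selector matches no pods")

    if spec.get("type") == "LoadBalancer":
        ranges = spec.get("loadBalancerSourceRanges") or []
        if not ranges:
            warnings.append("loadBalancerSourceRanges not set")
        elif "0.0.0.0/0" in ranges:
            warnings.append("loadBalancerSourceRanges allows 0.0.0.0/0")
        if "externalTrafficPolicy" not in spec:
            warnings.append("externalTrafficPolicy not set explicitly")

    return warnings
```

Called with a Service whose selector is `app: web` and a pod list labelled only `app: api`, it would report the selector mismatch along with any source-range or traffic-policy warnings.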
🧱 4. Ingress Controller Integration Problems
- Ingress controller using same ports as LoadBalancer.
- Duplicate ingress rules sending traffic to wrong backend.
- Path rewrite rules conflicting with app routes.
- TLS secret not found by ingress controller.
- Default backend misconfigured or missing.
- Ingress not picking up annotations from Service.
- Conflicts between Traefik and NGINX ingress controllers.
- Cert-manager not updating ingress TLS cert.
- Hostname mismatch causing SSL handshake failure.
- Ingress controller pod crashlooping due to invalid config.
- Load balancer health checks failing due to HTTP 301/302 redirects.
- Misconfigured ingress class (`ingressClassName` not set).
- Missing `X-Forwarded-For` header propagation.
- HTTP → HTTPS redirection loop.
- Wildcard hostnames not resolving properly.
- Static IP not associated with ingress LB.
- Overlapping host rules across namespaces.
- Backend timeout lower than LB idle timeout.
- Unsupported path type (`Exact` vs `Prefix` mismatch).
- Controller RBAC not allowing status updates.
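The `Exact` vs `Prefix` item above is easier to reason about with the matching rules written out. This is a simplified sketch of how the two Ingress path types are commonly described: `Exact` compares the whole path, while `Prefix` compares `/`-separated path elements. `path_matches` is a hypothetical helper, it ignores empty segments, and real controllers may differ in edge cases.

```python
def path_matches(rule_path, path_type, request_path):
    """Return True if `request_path` matches an Ingress rule path.

    Simplified model: "Exact" is a case-sensitive string comparison;
    "Prefix" compares path elements, so /foo matches /foo/bar but not /foobar.
    """
    if path_type == "Exact":
        return request_path == rule_path
    if path_type == "Prefix":
        rule_parts = [part for part in rule_path.split("/") if part]
        request_parts = [part for part in request_path.split("/") if part]
        return request_parts[:len(rule_parts)] == rule_parts
    raise ValueError(f"unsupported path type: {path_type}")


if __name__ == "__main__":
    print(path_matches("/foo", "Prefix", "/foo/bar"))  # element-wise prefix
    print(path_matches("/foo", "Prefix", "/foobar"))   # not an element match
    print(path_matches("/foo", "Exact", "/foo/"))      # trailing slash differs
```

A rule written as `Exact` when the application expects sub-paths to be routed is a typical source of unexpected 404s from the ingress controller.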
☁️ 5. Cloud Provider and Infrastructure Problems
- Cloud provider API quota exhausted (cannot create LB).
- Service stuck in “pending” due to missing IAM permissions.
- Firewall rules not auto-created by cloud controller.
- Cloud controller not running in cluster.
- Using private subnet for LoadBalancer IPs unintentionally.
- Cloud LB not supporting IPv6 while cluster does.
- Static IP reservation expired or released.
- Using custom network tags that block LB provisioning.
- Cloud load balancer name too long for provider limit.
- Cloud provider API latency causing update delays.
- Regional vs. zonal LB mismatch.
- Load balancer nodes not detected due to tag mismatch.
- Cloud controller manager version incompatible with cluster.
- IAM policy missing `elasticloadbalancing:*` permissions.
- Cloud load balancer doesn’t support UDP (e.g., AWS Classic ELB).
- Load balancer node pool scaled down automatically.
- Backend instance registration failing silently.
- Security group dependency cycles (common in AWS).
- Subnet exhaustion—no available IPs for new LBs.
- Provider rate limits hit due to frequent service updates.
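Subnet exhaustion, listed above, can be estimated ahead of time from the subnet size and the number of addresses already in use. The sketch below is a rough calculation only; `free_addresses` is a hypothetical helper, and the default of 5 provider-reserved addresses per subnet is an assumption modelled on AWS VPC subnets that should be adjusted for other providers.

```python
import ipaddress


def free_addresses(subnet_cidr, used_count, reserved_by_provider=5):
    """Estimate how many addresses remain for new load balancers in a subnet.

    `reserved_by_provider` is an assumption (5 matches AWS VPC subnets).
    """
    total = ipaddress.ip_network(subnet_cidr).num_addresses
    return max(total - reserved_by_provider - used_count, 0)


if __name__ == "__main__":
    # A /28 has 16 addresses; with 5 reserved and 9 in use, 2 remain.
    print(free_addresses("10.0.1.0/28", used_count=9))
```

Comparing the result against the number of addresses a provider needs per load balancer gives an early warning before provisioning starts to fail.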
🔍 Reference Source Chains
Many of these issues can be traced through:
- Kubernetes source: `pkg/cloudprovider/providers/*`, `pkg/proxy/ipvs/`, `pkg/proxy/iptables/`, `pkg/proxy/topology.go`
- Cloud Controller Manager logic: kubernetes/cloud-provider, `controller/service/service_controller.go`
- MetalLB internals: metallb/metallb → `speaker/arp.go`, `bgp_controller.go`
- Ingress controllers: kubernetes/ingress-nginx, traefik/traefik