Automated Domain Cleanup: Reclaiming Cost and Reducing Attack Surface

2026-02-18

Automate discovery, validation, and retirement of stale domains and DNS records to cut costs and shrink attack surface.

Hook: Why your domain registry is quietly costing you money and expanding your attack surface

Teams that let micro apps, hack days, and ad-hoc tooling create domains and DNS records are building technical debt—and a growing attack surface—by accident. By 2026, the shift to "vibe-coding" and low/no-code micro apps has accelerated. Every ephemeral project can leave behind DNS records and registered domains that rack up renewals, misdirect traffic, and become reconnaissance targets for attackers. This guide gives a practical, automation-first strategy and ready-to-run scripts to find, validate, and retire stale domains and stale DNS records, reclaim cost, and reduce attack surface.

Why this matters now (2026 trends and urgency)

Late 2025 into early 2026 saw two reinforcing trends: an explosion of micro apps and a renewed focus on supply-chain and internet-facing attack surface reduction after several high-profile outages and incidents. Teams are adopting more cloud providers, registrars, and DNS zones—often without centralized governance. The result is fragmented inventories, unnoticed renewals, and stale DNS entries that are inexpensive to create but costly to maintain and secure.

Attackers routinely scan certificate transparency logs, DNS zones, and WHOIS/RDAP records for forgotten targets. Stale domains and dangling records are a low-effort, high-reward target: a lapsed domain can be re-registered and used for phishing, and a record that still points at a deprovisioned cloud resource invites subdomain takeover. Meanwhile, renewal fees accumulate and every extra zone adds complexity, so the cost waste is measurable.

Goal: Automated, repeatable domain cleanup that fits DevOps workflows

The cleanup approach below is designed for platform and infra teams supporting developers and citizen-builders. The goal is to deliver an automated, auditable pipeline that:

Creates a single inventory of registered domains and DNS records across providers
Applies deterministic staleness tests and ownership validation
Automates safe retirement steps (quarantine, redirect, expire) with human-in-the-loop approvals
Generates ongoing cost-saving reports and security risk scores

High-level workflow (inverted pyramid: most important first)

Inventory: Collect domains and DNS records via registrar APIs, cloud DNS APIs, RDAP, and certificate transparency logs.
Validation: Confirm ownership, last-use (traffic), TLS/cert presence, and record TTLs.
Scoring: Compute a risk-and-cost score (renewal cost, exposure, last-seen use).
Quarantine: Apply low-impact mitigations for likely-stale items (short TTL, HTTP 410, telemetry sink).
Retirement: Deregister or delete records via automated workflows with approvals and full audit logs.
Prevent: Enforce GitOps-backed DNS/IaC, and implement policy guardrails to stop future sprawl.

Step 1 — Inventory: Where to look and how to collect consistently

Collecting a single source of truth is the foundation. Use APIs and public sources together:

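Registrar APIs (Cloudflare Registrar, GoDaddy, Namecheap): many registrars expose REST APIs, though access tiers and authentication differ. Fall back to a bulk CSV export where API access is limited. Google Domains registrations were migrated to Squarespace, so check there for legacy domains.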
Cloud DNS APIs: AWS Route 53 (boto3), Cloudflare DNS (cf API), Azure DNS (az sdk), Google Cloud DNS.
RDAP/WHOIS: Use RDAP (replaces traditional WHOIS) to verify registrant contacts and expiry dates.
Certificate Transparency (crt.sh) and CT logs: find issued certificates for domains you control—or that attackers may use.
DNS query logs and metrics: Cloudflare Logpush, Route 53 Query Logging to S3, or your DNS resolver logs to measure last-seen traffic.
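
As one illustration of the certificate transparency source, the sketch below extracts hostnames from entries shaped like crt.sh's JSON output. It assumes each entry carries a `name_value` field with newline-separated names; verify that against the responses you actually receive. The function name `hostnames_from_ct` and the sample data are hypothetical.

```python
from __future__ import annotations

from typing import Iterable


def hostnames_from_ct(entries: Iterable[dict], apex: str) -> set[str]:
    """Collect unique hostnames at or under `apex` from CT log entries.

    Assumes crt.sh-style entries where `name_value` may contain several
    names separated by newlines.
    """
    apex = apex.lower().rstrip(".")
    found: set[str] = set()
    for entry in entries:
        for name in entry.get("name_value", "").splitlines():
            name = name.strip().lower().rstrip(".")
            if name.startswith("*."):
                name = name[2:]  # count a wildcard cert against its parent name
            if name == apex or name.endswith("." + apex):
                found.add(name)
    return found


# Hypothetical sample data, not real CT log output
sample = [
    {"name_value": "old.micro.example.com\nwww.example.com"},
    {"name_value": "*.example.com"},
    {"name_value": "unrelated.example.org"},
]
print(sorted(hostnames_from_ct(sample, "example.com")))
# ['example.com', 'old.micro.example.com', 'www.example.com']
```

Hostnames that show up here but not in your registrar or DNS inventory are worth a closer look, since they may indicate inventory drift.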

Practical inventory script (Python): Cloudflare + Route53 collector

Run on a scheduled runner (GitHub Actions, GitLab CI, internal cron). Save outputs to a central database (e.g., Postgres) or object store.

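# inventory_collector.py
import json
import os

import boto3
import requests

# Cloudflare zones (CF_API holds an API token with zone read access)
CF_API = os.environ['CF_API']
cf_headers = {'Authorization': f'Bearer {CF_API}', 'Content-Type': 'application/json'}

zones = []
page = 1
while True:
    resp = requests.get(
        'https://api.cloudflare.com/client/v4/zones',
        headers=cf_headers,
        params={'page': page, 'per_page': 50},
        timeout=30,
    )
    resp.raise_for_status()
    body = resp.json()
    for z in body.get('result', []):
        zones.append({'provider': 'cloudflare', 'zone': z['name'], 'zone_id': z['id']})
    # result_info carries the paging totals; stop after the last page
    if page >= (body.get('result_info') or {}).get('total_pages', 1):
        break
    page += 1

# Route53 hosted zones (paginated; credentials come from the usual boto3 chain)
r53 = boto3.client('route53')
for result in r53.get_paginator('list_hosted_zones').paginate():
    for z in result['HostedZones']:
        zones.append({'provider': 'route53', 'zone': z['Name'].rstrip('.'), 'zone_id': z['Id']})

print(json.dumps(zones, indent=2))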

Step 2 — Validation: Determine whether a domain or DNS record is truly stale

“Stale” is a composite signal—not just age. Use multiple checks and weight them:

Last DNS query: If query logs show zero queries in 30–90 days, mark as likely stale.
TLS certificates: No active certs or expired certs in CT logs for 90+ days suggests non-use.
HTTP response: Probe endpoints and distinguish 404/410/403 responses (or no response at all) from live content.
RDAP/WHOIS expiry: Upcoming expiries or lapsed contacts matter for action windows.
Owner confirmation: Send automated verification emails to the contacts in your internal owner fields. Public RDAP contact data is often redacted, so use it as a fallback only.
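
The checks above can be combined into a single evidence record per hostname. The following is a minimal sketch, assuming the pipeline has already gathered the last-query timestamp, the latest certificate expiry from CT logs, and an HTTP probe status. The `Evidence` class, the signal names, and the 90-day default are illustrative choices, not a fixed standard.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class Evidence:
    last_query: Optional[datetime]      # last time query logs saw the name
    cert_not_after: Optional[datetime]  # latest cert expiry found in CT logs
    http_status: Optional[int]          # None if the probe could not connect


def stale_signals(ev: Evidence, now: datetime, window_days: int = 90) -> list[str]:
    """Return the names of every staleness signal that fires for this record."""
    cutoff = now - timedelta(days=window_days)
    signals = []
    if ev.last_query is None or ev.last_query < cutoff:
        signals.append("no_recent_queries")
    if ev.cert_not_after is None or ev.cert_not_after < cutoff:
        signals.append("no_active_cert")
    if ev.http_status is None or ev.http_status in (403, 404, 410):
        signals.append("no_live_content")
    return signals


now = datetime(2026, 2, 18, tzinfo=timezone.utc)
quiet = Evidence(
    last_query=datetime(2025, 10, 1, tzinfo=timezone.utc),
    cert_not_after=datetime(2026, 5, 1, tzinfo=timezone.utc),
    http_status=410,
)
print(stale_signals(quiet, now))  # ['no_recent_queries', 'no_live_content']
```

No single signal is conclusive on its own; a record with a valid certificate but no traffic, as in the sample, still deserves owner confirmation before any action.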

Script: Quick RDAP and HTTP validation (bash + curl)

#!/bin/bash
# rdap_check.sh
DOMAIN="$1"
if [ -z "$DOMAIN" ]; then
  echo "usage: $0 example.com"
  exit 1
fi

# RDAP lookup (-L follows rdap.org's redirect to the authoritative RDAP server)
curl -sL "https://rdap.org/domain/$DOMAIN" | jq '{domain: .ldhName, status: .status, expires: (first(.events[]? | select(.eventAction=="expiration") | .eventDate) // null)}'

# HTTP probe
curl -I -s --max-time 10 "http://$DOMAIN" | head -n 5

Step 3 — Scoring: Simple risk and cost model

Compute a numeric score to sort cleanup candidates. Sample weighted model:

Last DNS query > 90 days: +50 points
No cert found in CT logs: +20 points
Registrant email unresolved: +15 points
Annual renewal cost > $10 and low usage: +10 points
Subdomain pointing at a shared cloud service (e.g., S3, Heroku) with no app behind it (a dangling record): +20 points

Thresholds define actions: 80+ auto-quarantine candidate; 40–79 manual review; <40 monitored.
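
The sample model translates directly into code. This sketch uses the weights and thresholds listed above; the `Candidate` field names are hypothetical and would map to whatever your inventory schema provides.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    days_since_last_query: int
    cert_in_ct_logs: bool
    registrant_email_resolved: bool
    annual_renewal_cost: float  # in dollars
    low_usage: bool
    dangling_cloud_target: bool  # points at S3, Heroku, etc. with no app


def score(c: Candidate) -> int:
    """Sum the weighted signals from the sample model."""
    points = 0
    if c.days_since_last_query > 90:
        points += 50
    if not c.cert_in_ct_logs:
        points += 20
    if not c.registrant_email_resolved:
        points += 15
    if c.annual_renewal_cost > 10 and c.low_usage:
        points += 10
    if c.dangling_cloud_target:
        points += 20
    return points


def action_for(points: int) -> str:
    """Map a score to the action bands: 80+, 40-79, below 40."""
    if points >= 80:
        return "auto-quarantine candidate"
    if points >= 40:
        return "manual review"
    return "monitor"


stale = Candidate(120, False, False, 12.0, True, False)
print(score(stale), action_for(score(stale)))  # 95 auto-quarantine candidate
```

Keeping the weights in one function makes them easy to tune once you have compared the scores against a few manually reviewed domains.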

Step 4 — Quarantine: Safe mitigations with minimal blast radius

Before deleting, apply mitigations that reduce risk and give a final human review window:

Lower TTLs to 60–300s for rapid rollback.
Route traffic to an internal sink or HTTP 410 page that explains the record is deprecated.
For domains: point DNS at a parking host or at authoritative nameservers you control that run minimal services.
Lock domain transfers (transfer lock) until retirement is confirmed.
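
Because quarantine is meant to be reversible, it helps to capture the original record before changing it. The sketch below builds both the change to apply and a rollback snapshot. The record shape mirrors a Cloudflare-style DNS record dictionary, and `plan_quarantine` and the sink address are hypothetical placeholders.

```python
import copy

SINK_IP = "203.0.113.10"  # documentation-range placeholder for your sink host


def plan_quarantine(record: dict, sink_ip: str = SINK_IP, ttl: int = 120) -> dict:
    """Return the quarantined version of an A record plus a rollback snapshot."""
    if record.get("type") != "A":
        raise ValueError("this sketch only handles A records")
    if not 60 <= ttl <= 300:
        raise ValueError("quarantine TTL should stay within 60-300 seconds")
    rollback = copy.deepcopy(record)  # store this before applying anything
    quarantined = {**record, "content": sink_ip, "ttl": ttl, "proxied": False}
    return {"apply": quarantined, "rollback": rollback}


original = {
    "type": "A",
    "name": "stale.example.com",
    "content": "198.51.100.7",
    "ttl": 3600,
    "proxied": True,
}
plan = plan_quarantine(original)
print(plan["apply"]["content"], plan["apply"]["ttl"])  # 203.0.113.10 120
```

Persist the `rollback` half alongside the ticket or PR so that restoring the record is a single API call with known values.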

Automation example (Cloudflare): quarantine A record and set TTL

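# cf_quarantine.py
import os
import requests

CF_API = os.environ['CF_API']  # Cloudflare API token with DNS edit access
ZONE_ID = 'your-zone-id'
RECORD_ID = 'record-id'
headers = {'Authorization': f'Bearer {CF_API}', 'Content-Type': 'application/json'}
# 203.0.113.10 is a documentation-range placeholder; replace it with your sink host
payload = {'type': 'A', 'name': 'stale.example.com', 'content': '203.0.113.10', 'ttl': 120, 'proxied': False}
url = f'https://api.cloudflare.com/client/v4/zones/{ZONE_ID}/dns_records/{RECORD_ID}'
resp = requests.put(url, headers=headers, json=payload, timeout=30)
resp.raise_for_status()
if not resp.json().get('success'):
    raise SystemExit(f"Quarantine failed: {resp.json().get('errors')}")
print('Quarantine applied')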

Step 5 — Retirement: Safe, auditable removal

Retirement should be a controlled, auditable process with approvals and rollback plans. Suggested retirement workflow:

Open a ticket in your ticketing system (Jira, ServiceNow) via API with discovery data and risk score.
Notify owners and stakeholders; if no response in N days, auto-approve based on policy.
Execute retirement steps: remove DNS records, cancel domain auto-renew, archive TLS certs and logs.
Record changes in an append-only changelog and optionally send to compliance S3 bucket with signatures.
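
One way to make the changelog tamper-evident is to chain entries by hash, so that editing an earlier entry invalidates everything after it. The sketch below uses a plain SHA-256 hash chain rather than a cryptographic signature; a production setup would likely also sign entries with a managed key. The function names and entry layout are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # prev_hash used for the first entry


def _digest(body: dict) -> str:
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()


def append_entry(log: list, action: dict, timestamp: str) -> dict:
    """Append an action to the log, linking it to the previous entry's hash."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    body = {"timestamp": timestamp, "action": action, "prev_hash": prev_hash}
    entry = {**body, "hash": _digest(body)}
    log.append(entry)
    return entry


def verify_chain(log: list) -> bool:
    """Recompute every hash and check each link back to the genesis value."""
    prev = GENESIS
    for entry in log:
        body = {k: entry[k] for k in ("timestamp", "action", "prev_hash")}
        if entry["prev_hash"] != prev or entry["hash"] != _digest(body):
            return False
        prev = entry["hash"]
    return True


changelog = []
append_entry(changelog, {"op": "delete_record", "name": "old.micro.example.com"},
             "2026-02-18T03:00:00Z")
append_entry(changelog, {"op": "disable_auto_renew", "domain": "hackday.example"},
             "2026-02-18T03:05:00Z")
print(verify_chain(changelog))  # True
```

Running `verify_chain` as part of a scheduled job gives an early warning if the stored log has been altered or truncated in the middle.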

GitOps example: Terraform PR to remove zone records

Use a Git workflow where the automation generates a branch and PR that an approver must merge. Example (conceptual):

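# Conceptual: the stale record as it currently exists in the repo.
# The automation's PR deletes this entire block; after the change,
# `terraform plan` should list the record as a resource to destroy.
resource "aws_route53_record" "stale" {
  zone_id = "ZXYZ"
  name    = "old.micro.example.com"
  type    = "A"
  ttl     = 300
  records = ["203.0.113.10"] # placeholder value
}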

Step 6 — Prevention: Policies and guardrails to stop future sprawl

Automation is necessary but not sufficient. Apply policy at source:

Require DNS and domain provisioning through a central platform or self-service portal that implements TTL, owner, cost center, and expiry metadata.
Enforce GitOps for DNS zones and records so changes are auditable and easy to revert.
Label all records with metadata (owner, project, cost center, expiration policy).
Implement budget and renewal alerts tied to cost centers to avoid surprise bills from ephemeral domains.
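
A guardrail of this kind can start as a small check that runs in CI against proposed records. The sketch below flags records that lack the metadata listed above or carry a TTL outside an allowed range. The field names and the TTL bounds are assumptions chosen for illustration; adapt them to your own schema and policy.

```python
from __future__ import annotations

REQUIRED_FIELDS = ("owner", "project", "cost_center", "expiration_policy")


def policy_violations(record: dict, min_ttl: int = 60, max_ttl: int = 86400) -> list[str]:
    """Return human-readable policy violations for a proposed DNS record."""
    problems = [f"missing {field}" for field in REQUIRED_FIELDS if not record.get(field)]
    ttl = record.get("ttl")
    if not isinstance(ttl, int) or not min_ttl <= ttl <= max_ttl:
        problems.append(f"ttl must be an integer between {min_ttl} and {max_ttl}")
    return problems


proposed = {
    "name": "demo.micro.example.com",
    "ttl": 300,
    "owner": "team-growth",
    "project": "hackday-demo",
    "cost_center": "",  # empty values count as missing
}
print(policy_violations(proposed))
# ['missing cost_center', 'missing expiration_policy']
```

Failing the PR when the list is non-empty keeps untagged records from reaching production in the first place.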

Operationalizing: Scheduling, metrics, and alerts

Make cleanup recurring and measurable:

Schedule nightly inventory runs and weekly exports of cleanup candidates.
Expose metrics: domains inventoried, domains retired, monthly renewal savings, and attack-surface reduction score.
Create automated alerts for suspicious changes (new domains added without owner, sudden inbound queries from new geos, new cert issuance for internal-sounding domains).
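
The "new domains added without owner" alert can be approximated by diffing two inventory snapshots. This sketch assumes snapshots in the same shape the inventory collector prints (a list of dictionaries with `provider` and `zone`) and a hypothetical `owners` mapping maintained by the platform team.

```python
from __future__ import annotations


def new_unowned_zones(previous: list[dict], current: list[dict], owners: dict) -> list[str]:
    """Zones present in `current` but not `previous` that have no known owner."""
    seen = {(z["provider"], z["zone"]) for z in previous}
    return sorted(
        z["zone"]
        for z in current
        if (z["provider"], z["zone"]) not in seen and z["zone"] not in owners
    )


# Hypothetical snapshots from two consecutive inventory runs
yesterday = [{"provider": "route53", "zone": "example.com"}]
today = yesterday + [
    {"provider": "cloudflare", "zone": "hackday-demo.example"},
    {"provider": "cloudflare", "zone": "shop.example"},
]
owners = {"example.com": "platform", "shop.example": "commerce"}
print(new_unowned_zones(yesterday, today, owners))  # ['hackday-demo.example']
```

The returned list can feed whatever alerting channel you already use; an empty list means nothing new appeared without an owner.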

Advanced strategies and 2026 innovations worth using

Leverage newer capabilities that matured by 2026:

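RDAP enrichment: RDAP returns structured JSON (status, events, entities) rather than free-text WHOIS, which makes expiry and ownership checks easier to automate. Public contact fields are often redacted, so treat RDAP as one input alongside internal owner records rather than the sole basis for owner verification. For data retention and jurisdictional concerns, see a data sovereignty checklist.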
CT log scanners as a feed: Set up push notifications from CT mirror services to detect new certs issued for your domains (immediate signal for inventory drift).
DNS over HTTPS (DoH) analytics: If you operate or subscribe to a resolver that exposes DoH query logs or metrics, use them to capture client-side lookups that authoritative logs can miss, such as answers served from resolver cache.
Policy-as-Code: Use tools like Open Policy Agent (OPA) to enforce domain naming, TTLs, and owner fields in PRs before changes merge.
AI-assisted triage: Use LLM/AI tooling to summarize RDAP responses and produce suggested abandonment messages, but ensure a human decision stage for deletions.

Risk and compliance considerations

Retiring domains can have legal and brand implications. Include these checks:

Check regulatory requirements for retention (e.g., logs tied to specific data retention policies).
For domains used in customer-facing systems, include legal and comms in the approval path — brand and media implications are important; see Principal Media and Brand Architecture for domain-linked outcomes.
Document chain-of-custody for deletions and retain copies of content and certs for post-mortem if needed.

Pro tip: Never delete a domain immediately—quarantine first. Transfers and deletions are often final and costly to reverse.

Real-world example: How an infra team saved $12k and closed a phishing vector

In late 2025, a mid-sized SaaS platform used the workflow above and found 74 domains created by product experiments and hackdays. After validation and outreach, they retired 63 domains and removed 120 stale DNS records. The results:

Annual renewal savings: $12,400
Attack surface reduction: removed 18 hostnames that had certs issued and had appeared in CT logs
Process adoption: they moved to GitOps for all DNS requests with an approval SLA

This model scales: small initial automation investment with immediate cost and security ROI.

Example: GitHub Actions workflow to kick off cleanup weekly

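# .github/workflows/domain_cleanup.yml (excerpt)
name: Domain Cleanup
on:
  schedule:
    - cron: '0 3 * * 1' # Mondays at 03:00 UTC
jobs:
  inventory:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: '3.12'
      - name: Install collector dependencies
        run: python -m pip install boto3 requests
      - name: Run inventory collector
        env:
          CF_API: ${{ secrets.CF_API }}
          # Route 53 access also needs AWS credentials (for example via OIDC); not shown here
        run: |
          python inventory_collector.py > inventory.json
          # upload inventory.json to central storage or create a PR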

Practical takeaways: What to set up this week

Day 1: Run a manual inventory (registrar console + Route53/Cloudflare exports). Find obvious stale domains.
Day 3: Deploy the inventory collector and schedule weekly runs. Store results centrally.
Week 2: Implement the scoring model and quarantine the top 10 candidates with owner notifications.
Month 1: Configure GitOps for DNS changes and add policy checks (TTL, owner, expiry).

Common pitfalls and how to avoid them

Relying only on WHOIS: use RDAP and CT logs for better signals.
Deleting before notifying owners: always include a human approval for high-impact records.
Ignoring infrastructure dependencies: use service maps to detect records linked to live services.
Not capturing audit logs: store every automated action with a signature and timestamp. If you need templates for post-incident work, our post-incident templates are a useful complement: Postmortem Templates and Incident Comms.

Final checklist before deletion

Inventory entry exists and is linked to owner and cost center
Traffic and cert analysis show no recent activity
Quarantine applied for a defined period (e.g., 7–30 days)
Approval recorded in ticketing system or Git PR merged
Deletion steps executed and archived
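
The pre-deletion items in this checklist can be enforced as a gate in the retirement job. The sketch below returns the list of unmet checks for a candidate; the `RetirementCandidate` fields and the 7-day minimum are hypothetical and should follow your own policy.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass
class RetirementCandidate:
    owner: Optional[str]
    cost_center: Optional[str]
    recent_activity: bool        # any traffic or cert issuance in the window
    quarantine_days: int         # days the quarantine has been in place
    approval_ref: Optional[str]  # ticket ID or merged PR reference


def unmet_checks(c: RetirementCandidate, min_quarantine_days: int = 7) -> list[str]:
    """Return every checklist item that still blocks deletion."""
    unmet = []
    if not (c.owner and c.cost_center):
        unmet.append("inventory entry not linked to owner and cost center")
    if c.recent_activity:
        unmet.append("recent traffic or certificate activity")
    if c.quarantine_days < min_quarantine_days:
        unmet.append("quarantine period not complete")
    if not c.approval_ref:
        unmet.append("no recorded approval")
    return unmet


ready = RetirementCandidate("team-growth", "cc-42", False, 14, "OPS-1234")
print(unmet_checks(ready))  # []
```

Deletion proceeds only when the list is empty; anything else goes back to the review queue with the reasons attached.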

Conclusion and call-to-action

Stale domains and DNS records from micro apps and ad-hoc tooling are a predictable, fixable source of cost and risk. The pattern in 2026 is clear: teams that treat DNS and domain management as a first-class, automated, and auditable service save money and close simple attack vectors. Start with inventory automation, add validation and a quarantine stage, and finish with GitOps-backed retirement processes.

Actionable next step: Clone the example scripts in this article, run the collector against one account this week, and open a cleanup PR for the top 5 candidates. For a turnkey approach, reach out to your platform or security team to pilot the pipeline across one business unit.
