robots.txt Validator

Validate a robots.txt file and test URL matching against user-agents


A robots.txt file tells crawlers which parts of your site they may fetch. Small mistakes (a misplaced wildcard, an Allow that loses to a more specific Disallow, or a rule placed before any User-agent line) can silently block important pages or leave pages you meant to exclude open to crawlers. This robots.txt validator parses your file according to RFC 9309 and lets you test any URL against any user-agent, entirely in your browser.

How it works

The parser splits the file into groups. A group is one or more User-agent lines followed by Allow and Disallow rules; a blank line, or a new User-agent line after a rule, starts a fresh group. Anything from # to the end of a line is treated as a comment and stripped. Sitemap lines are collected separately, and each is checked to be a fully qualified URL.
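The grouping step can be sketched roughly as follows. This is a minimal illustration, not the tool's actual source: the function name parse_groups and the dict layout (agents, rules) are assumptions made for the example, and lines that cannot be parsed are simply skipped where a real validator would report them.

```python
def parse_groups(text):
    """Split robots.txt text into groups and collect Sitemap values.

    Hypothetical helper: returns (groups, sitemaps), where each group is
    {"agents": [...], "rules": [(kind, pattern), ...]}.
    """
    groups, sitemaps = [], []
    current = None        # group currently being built, if any
    seen_rule = False     # True once the current group has a rule

    for raw in text.splitlines():
        stripped = raw.strip()
        if not stripped:
            current = None            # a blank line closes the group
            continue
        line = stripped.split("#", 1)[0].strip()   # drop comments
        if not line or ":" not in line:
            continue                  # comment-only or malformed line
        field, value = line.split(":", 1)
        field, value = field.strip().lower(), value.strip()

        if field == "user-agent":
            # A User-agent after a rule (or after a blank line) starts a new group.
            if current is None or seen_rule:
                current = {"agents": [], "rules": []}
                groups.append(current)
                seen_rule = False
            current["agents"].append(value.lower())
        elif field in ("allow", "disallow"):
            if current is None:
                continue              # rule before any User-agent: ignored here
            current["rules"].append((field, value))
            seen_rule = True
        elif field == "sitemap":
            sitemaps.append(value)

    return groups, sitemaps
```

Consecutive User-agent lines share one group, which is why the sketch only opens a new group once a rule has been seen.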

When you test a URL, the tool first selects the group whose agent best matches your user-agent (the longest matching substring wins, with * as the fallback). It then evaluates every rule in that group against the URL's path. Patterns support two special characters: * matches any sequence of characters, including none, and a trailing $ anchors the match to the end of the path. Without a trailing $, a pattern matches any path that begins with it.
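As a rough sketch of these two steps, the snippet below selects a group and tests a single pattern. The names select_group and pattern_matches are hypothetical, and groups are assumed to be dicts with an "agents" list and a "rules" list; the page does not show the tool's own code. Agent comparison is assumed to be case-insensitive.

```python
import re

def select_group(groups, user_agent):
    """Pick the group whose agent is the longest substring of user_agent.

    The "*" agent counts as a zero-length match, so it is only chosen
    when no named agent matches. Returns None if nothing applies.
    """
    ua = user_agent.lower()
    best, best_len = None, -1
    for group in groups:
        for agent in group["agents"]:
            if agent == "*":
                length = 0
            elif agent and agent.lower() in ua:
                length = len(agent)
            else:
                continue
            if length > best_len:
                best, best_len = group, length
    return best

def pattern_matches(pattern, path):
    """Test one rule pattern against a path.

    "*" matches any run of characters; a trailing "$" requires the
    pattern to consume the whole path. Otherwise it is a prefix match.
    """
    anchored = pattern.endswith("$")
    body = pattern[:-1] if anchored else pattern
    # Escape literal pieces and join them with ".*" for each wildcard.
    regex = ".*".join(re.escape(part) for part in body.split("*"))
    matcher = re.fullmatch if anchored else re.match
    return matcher(regex, path) is not None
```

For example, pattern_matches("/private/*.pdf$", "/private/a/b.pdf") is true, while the same pattern does not match "/private/a.pdf?x=1" because of the end anchor.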

Most-specific match wins

When several rules match, the one with the longest pattern decides the outcome. If an Allow and a Disallow match with equal pattern length, the Allow wins. This is why Allow: /admin/public/ can override a broader Disallow: /admin/.
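The precedence rule can be illustrated with a short sketch. The function is_allowed and its (kind, pattern) rule tuples are assumptions for this example, and the small _matches helper is a simplified stand-in for the tool's pattern matcher. Two further assumptions are labeled in the comments: a path with no matching rule is treated as allowed, and a rule with an empty pattern is treated as matching nothing.

```python
import re

def _matches(pattern, path):
    # Simplified matcher: "*" is a wildcard, a trailing "$" anchors the end.
    anchored = pattern.endswith("$")
    body = pattern[:-1] if anchored else pattern
    regex = ".*".join(re.escape(part) for part in body.split("*"))
    matcher = re.fullmatch if anchored else re.match
    return matcher(regex, path) is not None

def is_allowed(rules, path):
    """Apply longest-pattern-wins, with Allow winning ties."""
    best_len, allowed = -1, True      # assumption: no matching rule means allowed
    for kind, pattern in rules:
        if not pattern:
            continue                  # assumption: empty pattern matches nothing
        if not _matches(pattern, path):
            continue
        length = len(pattern)
        # A longer pattern always wins; on equal length, Allow wins.
        if length > best_len or (length == best_len and kind == "allow"):
            best_len, allowed = length, (kind == "allow")
    return allowed

rules = [("disallow", "/admin/"), ("allow", "/admin/public/")]
print(is_allowed(rules, "/admin/public/help"))   # Allow is the longer match
print(is_allowed(rules, "/admin/settings"))      # only the Disallow matches
```

Because the comparison uses pattern length rather than rule order, the result is the same whichever of the two rules appears first in the file.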

Notes

The validator models standard behaviour, but real crawlers occasionally differ on edge cases such as percent-encoding and case sensitivity. Use it to catch common errors and to confirm your intent before publishing. Everything runs locally in your browser, so testing a staging or internal robots.txt file stays private.
