Robots.txt Generator

Generate and validate robots.txt files with a visual rule builder.

How it works

1. Add user-agent groups and rules

Use the visual builder to add User-agent groups, Allow/Disallow rules, and Sitemap URLs. Or start with a preset.

2. Review the live preview

Check the generated robots.txt in the live preview panel. Fix any validation warnings.

3. Copy or download

Copy the robots.txt to your clipboard or download it as a file ready to upload to your server.

About This Tool

Build valid robots.txt files using a visual rule builder instead of manually writing directives. Add User-agent groups, Allow and Disallow rules, Sitemap URLs, and Crawl-delay values — the tool generates the correctly formatted robots.txt in real time.

**What is robots.txt?**

Robots.txt is a plain text file placed at the root of your website (e.g., https://example.com/robots.txt) that tells search engine crawlers which pages and directories they may or may not visit. Crawlers like Googlebot, Bingbot, and YandexBot fetch it before crawling the rest of your site. The file is advisory rather than enforced: malicious bots can ignore it, but the major search engines honor its directives. It controls crawling, not indexing, so a blocked URL can still appear in search results if other pages link to it.

The file uses a simple text format built from User-agent groups, each containing Allow and Disallow directives. User-agent specifies which crawler the rules apply to (an asterisk * means all crawlers). Disallow tells the crawler not to visit matching paths. Allow explicitly permits access and is typically used to carve an exception out of a broader Disallow. Sitemap points crawlers to your XML sitemap and applies regardless of group. Crawl-delay sets a minimum wait time (in seconds) between requests; it is a nonstandard directive that not every crawler honors.
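One way to see how these directives behave is to feed a small file to Python's standard-library `urllib.robotparser`. This is only a sketch and not part of the tool: the `ExampleBot` name and the `/private/` paths are made up for illustration. Python's parser applies the first matching rule in file order, so the Allow exception is listed before the broader Disallow; crawlers that pick the longest matching rule give the same answers for this file. `site_maps()` requires Python 3.8 or later.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content used only for this illustration.
ROBOTS_TXT = """\
User-agent: *
Allow: /private/press-kit/
Disallow: /private/
Crawl-delay: 10

Sitemap: https://example.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(path, agent="ExampleBot"):
    """Return True if `agent` may fetch `path` on the hypothetical site."""
    return parser.can_fetch(agent, "https://example.com" + path)


print(allowed("/private/press-kit/logo.png"))  # True: the Allow exception matches first
print(allowed("/private/notes.html"))          # False: covered by Disallow: /private/
print(allowed("/blog/"))                       # True: no rule matches
print(parser.crawl_delay("ExampleBot"))        # 10
print(parser.site_maps())                      # ['https://example.com/sitemap.xml']
```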

**How this tool works**

Instead of typing directives manually and risking syntax errors, you use the visual builder to add groups, rules, and sitemaps. The tool generates the robots.txt text in real time and shows a live preview on the right side. You can see exactly what your robots.txt will look like before copying or downloading it.

**Presets for common configurations**

Three one-click presets cover the most common scenarios. "Allow All" creates a minimal robots.txt that permits all crawlers to access everything. "Block All" disallows all crawlers from the entire site — useful for staging sites or private projects. "Block AI Bots" adds specific Disallow rules for known AI training crawlers including GPTBot (OpenAI), CCBot (Common Crawl), ChatGPT-User, Google-Extended, Bytespider (ByteDance), and anthropic-ai (Anthropic) — a growing use case for website owners who do not want their content used for AI model training.
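The presets can be pictured as small data structures that are rendered to text. The Python sketch below is an assumption about how such a generator might look, not the tool's actual code: the `render` function, the `PRESETS` mapping, and the choice of one group per AI bot are all hypothetical.

```python
# Crawler names taken from the "Block AI Bots" description above.
AI_BOTS = ["GPTBot", "CCBot", "ChatGPT-User", "Google-Extended",
           "Bytespider", "anthropic-ai"]

# Each preset is a list of groups; a group is (user_agents, rules).
# Hypothetical layout: the real tool may structure its presets differently.
PRESETS = {
    "Allow All": [(["*"], [("Allow", "/")])],
    "Block All": [(["*"], [("Disallow", "/")])],
    "Block AI Bots": [([bot], [("Disallow", "/")]) for bot in AI_BOTS],
}


def render(groups, sitemaps=()):
    """Render groups and optional sitemap URLs as robots.txt text."""
    blocks = []
    for user_agents, rules in groups:
        lines = [f"User-agent: {ua}" for ua in user_agents]
        lines += [f"{directive}: {path}" for directive, path in rules]
        blocks.append("\n".join(lines))
    if sitemaps:
        blocks.append("\n".join(f"Sitemap: {url}" for url in sitemaps))
    # Groups are separated by one blank line; the file ends with a newline.
    return "\n\n".join(blocks) + "\n"


print(render(PRESETS["Block AI Bots"]))
```

Calling `render(PRESETS["Allow All"], ["https://example.com/sitemap.xml"])` produces an allow-all group followed by a single Sitemap line.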

**Validation**

The tool checks your robots.txt for common mistakes: empty User-agent values, groups with no rules, conflicting Allow and Disallow directives on the same path, and invalid Sitemap URLs. Warnings appear below the preview so you can fix issues before deploying.
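The four checks listed above could be approximated along these lines. This is a minimal Python sketch under stated assumptions: the `validate` function, the `(user_agents, rules)` group layout, and the warning wording are hypothetical, and a Sitemap URL is treated as valid only if it is an absolute http or https URL.

```python
from urllib.parse import urlparse


def validate(groups, sitemaps=()):
    """Return a list of warning strings for a set of groups and sitemaps.

    `groups` is a list of (user_agents, rules) tuples, where `rules` is a
    list of (directive, path) pairs. This layout is an assumption made for
    the example, not the tool's internal format.
    """
    warnings = []
    for index, (user_agents, rules) in enumerate(groups, start=1):
        # Empty User-agent values
        if not user_agents or any(not ua.strip() for ua in user_agents):
            warnings.append(f"Group {index}: empty User-agent value")
        # Groups with no rules
        if not rules:
            warnings.append(f"Group {index}: no Allow or Disallow rules")
        # Allow and Disallow on the same path
        allowed = {path for directive, path in rules if directive == "Allow"}
        disallowed = {path for directive, path in rules if directive == "Disallow"}
        for path in sorted(allowed & disallowed):
            warnings.append(f"Group {index}: Allow and Disallow both target {path}")
    # Sitemap URLs must be absolute http(s) URLs
    for url in sitemaps:
        parts = urlparse(url)
        if parts.scheme not in ("http", "https") or not parts.netloc:
            warnings.append(f"Invalid Sitemap URL: {url}")
    return warnings


example_groups = [
    (["*"], [("Allow", "/tmp/"), ("Disallow", "/tmp/")]),
    ([""], []),
]
for warning in validate(example_groups, ["sitemap.xml"]):
    print(warning)
```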

**Real-world use cases**

Webmasters block crawlers from admin panels, private directories, and staging environments. SEO professionals fine-tune crawl budgets by disallowing low-value pages (pagination, filter parameters, internal search results). Content creators block AI training bots to protect their work. Developers generate robots.txt for new projects and test configurations before going live.

All processing runs entirely in your browser — nothing is sent to any server. Copy the result to your clipboard or download it as a robots.txt file ready to upload to your server.

Examples

Block all crawlers

Input

Preset: Block All

Output

User-agent: *
Disallow: /

Allow all with sitemap

Input

Preset: Allow All + Sitemap

Output

User-agent: *
Allow: /

Sitemap: https://example.com/sitemap.xml

Frequently Asked Questions

What is robots.txt?

Robots.txt is a text file at the root of your website that tells search engine crawlers which pages they can and cannot access. It uses directives like User-agent, Allow, Disallow, and Sitemap to control crawler behavior.

Where does robots.txt go on my server?

Place it at the root of your domain: https://example.com/robots.txt. Crawlers look for it only at this exact URL, and each host needs its own file (blog.example.com does not inherit the rules of example.com). Many CMS platforms manage the file for you: WordPress serves a virtual robots.txt by default, and Shopify lets you customize it through the robots.txt.liquid theme template.

What does User-agent: * mean?

The asterisk (*) means the rules apply to all crawlers. You can target specific crawlers by name, such as User-agent: Googlebot for Google or User-agent: GPTBot for OpenAI's crawler.

How do I block AI training bots?

Use the "Block AI Bots" preset, which adds Disallow: / rules for known AI crawlers: GPTBot (OpenAI), CCBot (Common Crawl), ChatGPT-User, Google-Extended, Bytespider (ByteDance), and anthropic-ai (Anthropic). Note that only bots that respect robots.txt will honor these rules.

What is Crawl-delay?

Crawl-delay specifies the minimum number of seconds a crawler should wait between successive requests. For example, Crawl-delay: 10 means the crawler should wait at least 10 seconds. Google ignores Crawl-delay, while Bing honors it. Support among other crawlers varies, so check each crawler's documentation.

What are common robots.txt mistakes?

Common mistakes include accidentally blocking important pages, putting the file in the wrong directory (it must be at the root), using robots.txt to hide sensitive pages (the file is publicly visible, so use proper authentication instead), and forgetting to add a Sitemap directive.

Is my data sent to a server?

No. All generation and validation happen entirely in your browser. Nothing is transmitted anywhere.
