Smart Scan Directives Builder (Robots.txt & Sitemap)
An error-proof visual interface for creating robots.txt. Manage your crawl budget and avoid accidentally blocking your site from indexing.
How to create a correct robots.txt
Build a safe file to control search bots in just a few clicks.
- Choose a ready preset: Add base rules in one click. Presets include "WordPress (Optimal)", "E-commerce (Bitrix/Generic)", "Cleanup junk", and a full site block for development, "No-index (Dev mode)" (see the first example after this list).
- Configure directives and Sitemap: Add custom rules by selecting a crawler (User-agent), setting the directive ("Allow" or "Disallow"), and specifying the path. Then enter your Sitemap URL (see the second example after this list).
- Copy or download the result: Get the ready-made code. You can copy it to the clipboard or download it as a robots.txt file.
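For example, a full development-stage block, the kind the "No-index (Dev mode)" preset applies, takes only two lines of standard robots.txt syntax:

```
# Hide the whole site from crawlers while it is under development
User-agent: *
Disallow: /
```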
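And a custom rule group with a Sitemap, as in step 2, might look like the sketch below; the paths and domain are placeholders to replace with your own:

```
# Rules for all crawlers
User-agent: *
Disallow: /admin/       # keep the admin panel out of the crawl
Allow: /admin/public/   # longer path wins: re-opens one subfolder

# Absolute URL of the sitemap
Sitemap: https://your-domain.com/sitemap.xml
```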
Why use a visual robots.txt generator?
- Protection from fatal errors: A single extra slash (/) can block your entire site from Google and Yandex. The visual interface rules out syntax errors, so your pages stay in search (first example after this list).
- Crawl budget savings: Search engines allocate limited time to crawling your site. By blocking technical pages, carts, filters, and pagination, you steer bots toward your important commercial content (second example after this list).
- Anti-AI protection: Don't want OpenAI, Anthropic, or Google bots to use your content for training? Apply the "Anti-AI" preset in one click (third example after this list).
- 100% privacy: The tool runs locally in your browser. We do not collect your URL structure.
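How little it takes to go wrong: an empty Disallow value permits everything, while the same directive with a single added slash forbids the entire site:

```
# Harmless: an empty Disallow blocks nothing
User-agent: *
Disallow:

# Fatal: one added character blocks the whole site
# User-agent: *
# Disallow: /
```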
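A typical crawl-budget cleanup is sketched below; the paths (cart, URL filters, pagination, internal search) are common examples and should be adapted to your site's structure:

```
User-agent: *
Disallow: /cart/        # shopping cart pages
Disallow: /*?filter=    # faceted-filter URLs with query parameters
Disallow: /*?page=      # paginated listing URLs
Disallow: /search/      # internal search results
```

The * wildcard in paths is supported by Google, Yandex, and Bing.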
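And a sketch of what an "Anti-AI" block looks like, using the user-agent tokens these vendors publish: GPTBot (OpenAI), ClaudeBot (Anthropic), and Google-Extended (Google's AI-training control):

```
User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: Google-Extended
Disallow: /
```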
All about the robots.txt file and Sitemap
The robots.txt file is the first document search bots request when visiting your site. It tells them which sections may be crawled and which are off limits, making it a critical tool for controlling server load and indexing.
Along with Allow/Disallow directives, this file typically includes the path to Sitemap.xml. The two work together: robots.txt tells bots where not to go, while the Sitemap shows which pages should be indexed first.
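Put together, a minimal production file often consists of just these two parts (the path and domain here are placeholders):

```
User-agent: *
Disallow: /admin/       # closed to crawling

Sitemap: https://your-domain.com/sitemap.xml
```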
Popular questions
- Where should I upload the robots.txt file?
- The file must be placed in the root directory of your site and be accessible at your-domain.com/robots.txt.
- What does User-agent: * (All bots) mean?
- The asterisk (*) in the User-agent rule means the group applies to all crawlers (Googlebot, YandexBot, Bingbot, etc.) that do not have a more specific group of their own elsewhere in the file (see the example after these questions).
- Will Disallow remove the page from search?
- Not necessarily. Disallow forbids crawling, but if external links point to the page, it may still end up in the index. To remove a page from search reliably, use the <meta name="robots" content="noindex"> tag in the page HTML, and keep the page crawlable (not disallowed), since the bot can only see that tag if it is allowed to fetch the page.
- How do I verify the file is correct?
- After uploading the file to your server, use the "robots.txt analysis" tool in Yandex Webmaster and the crawl reports in Google Search Console.
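To illustrate the User-agent question above: groups do not combine, so each bot follows only the most specific group that matches it. A sketch with hypothetical paths:

```
User-agent: *
Disallow: /private/    # applies to every bot without its own group

User-agent: Googlebot
Disallow: /drafts/     # Googlebot follows ONLY this group,
                       # so /private/ stays open to Googlebot
```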
Building a site for your business?
Shift Box is a product IT studio for the B2B sector. Besides free utilities, we build reliable industry solutions. Check out our products.
Learn more