Crawl Budget
Search Engine Crawl Budget
The number of pages a search engine will crawl on a site within a given time period.
Technical detail
Crawl budget can be conserved through two mechanisms: robots.txt (file-level; prevents crawling but not indexing) and meta robots tags (page-level; control indexing and link following). Common directives: 'noindex' (exclude from search results), 'nofollow' (don't pass link equity), 'noarchive' (no cached copy). X-Robots-Tag HTTP headers provide the same controls for non-HTML resources such as PDFs and images. A page blocked in robots.txt can still appear in search results if other pages link to it; a 'noindex' directive on a crawlable page is the only reliable way to guarantee exclusion from search results.
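A minimal sketch of the two page-level controls described above, assuming a hypothetical PDF resource; the meta tag goes in the page's HTML head, while the X-Robots-Tag header is set in the server's response for non-HTML files:

```
<!-- Page-level: exclude this page from search results, don't follow its links -->
<meta name="robots" content="noindex, nofollow">

# HTTP response header equivalent for a non-HTML resource (e.g. /reports/q3.pdf)
X-Robots-Tag: noindex, noarchive
```

Note that for 'noindex' to take effect, the crawler must be able to fetch the page, so it must not also be blocked in robots.txt.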
Example
```
# robots.txt
User-agent: *
Allow: /
Disallow: /admin/
Disallow: /api/internal/
Sitemap: https://peasytools.com/sitemap.xml
```