In 1994, Martijn Koster introduced robots.txt to help web crawlers navigate websites and avoid restricted areas. Originally called "RobotsNotWanted.txt", this file provided clear guidelines on which parts of a website should be off-limits, quickly becoming a voluntary standard adopted across the web.
With AI and automation becoming more prevalent, it seems likely that the web will soon be used primarily by bots and agents rather than by humans. Today, AI agents navigate the web by manipulating pages much as humans do. However, most websites offer no specific guidance to help these agents, which makes their interactions unreliable. Introducing a standard like agents.txt could help close this gap.
The agents.txt file would give AI agents explicit, site-specific instructions, letting them reach essential information directly instead of parsing pages unnecessarily or issuing redundant requests. It should also spell out clear guidelines for handling dynamic content, multi-step workflows, and complex forms.
To address security concerns, agents.txt can also include rules beyond restricted areas. For example, it can specify how to handle authentication tokens, rate limits for requests, and guidelines for dealing with CAPTCHAs. This approach helps keep sensitive information protected while still giving AI agents the instructions they need.
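As a rough illustration, here is a minimal sketch of how an agent might honor such a rate-limit rule. The MaxRequestsPerMinute and back-off values mirror the [RateLimits] section of the sample file further down and are only a proposed convention, not an existing standard.

import time
import urllib.request

# Hypothetical values an agent would read from a site's agents.txt
# (see the [RateLimits] section in the sample file below).
MAX_REQUESTS_PER_MINUTE = 30
BACKOFF_SECONDS = 60  # "Back off and retry after 1 minute"

class PoliteFetcher:
    """Issues requests while staying under the advertised rate limit."""

    def __init__(self, max_per_minute, backoff_seconds):
        self.max_per_minute = max_per_minute
        self.backoff_seconds = backoff_seconds
        self.window_start = time.monotonic()
        self.count = 0

    def fetch(self, url):
        now = time.monotonic()
        if now - self.window_start >= 60:       # start a fresh one-minute window
            self.window_start, self.count = now, 0
        if self.count >= self.max_per_minute:   # budget spent: back off as directed
            time.sleep(self.backoff_seconds)
            self.window_start, self.count = time.monotonic(), 0
        self.count += 1
        with urllib.request.urlopen(url) as resp:
            return resp.read()

fetcher = PoliteFetcher(MAX_REQUESTS_PER_MINUTE, BACKOFF_SECONDS)
# fetcher.fetch("https://example.com/products")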
If the future is truly agentic, embracing the idea of agents.txt should improve how AI agents use websites, making interactions more efficient. Websites that provide this file will likely see more agent traffic than those that actively block bots. Rewriting the web to be API-first may not be a near-term possibility, but this file could take us one step closer to that reality.
Sample agents.txt file [please share feedback on this]
[GeneralInstructions]
# Provide any general instructions or comments for AI agents here.
# Example: Please follow the navigation and data extraction rules carefully.
GetAnswers = /agents/prompt
[PageStructure]
Header = #header
Footer = #footer
MainContent = #main-content
Sidebar = #sidebar
[Navigation]
LoginButton = #login-btn
NextPageButton = .next-page
PreviousPageButton = .prev-page
# Specify steps for multi-step forms if any
Step1 = #step1
Step2 = #step2
[DataExtraction]
ProductName = .product-name
Price = .price
Availability = .stock-status
Description = .product-description
# Example of nested elements
Reviews = .review-list .review-item
ReviewAuthor = .review-author
ReviewDate = .review-date
[FormSubmission]
SearchForm = #search-form
SearchInput = #search-input
SubmitButton = #submit-btn
# Include hidden fields or tokens if required
HiddenField = input[name='hidden_token']
[RateLimits]
MaxRequestsPerMinute = 30
# Specify the action if the rate limit is exceeded
OnRateLimitExceeded = "Back off and retry after 1 minute"
[Security]
AuthToken = .auth-token
CaptchaHandling = #captcha
# Detail on how to refresh authentication tokens
AuthTokenRefresh = /auth/refresh
[RestrictedAreas]
Disallow = /private/
Disallow = /admin/
Disallow = /user/
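To make this concrete, here is a minimal sketch of how an agent might parse the sample above and consult it before acting. The parsing behavior (INI-style sections, # comments, repeated Disallow lines collected into a list) is an assumption based on this sample, not a published specification.

from collections import defaultdict

def parse_agents_txt(text):
    """Parse the INI-style agents.txt sketch into {section: {key: [values]}}."""
    sections = defaultdict(lambda: defaultdict(list))
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue                        # skip blank lines and comments
        if line.startswith("[") and line.endswith("]"):
            current = line[1:-1]            # section header, e.g. RateLimits
            continue
        if "=" in line and current:
            key, value = (part.strip() for part in line.split("=", 1))
            sections[current][key].append(value.strip('"'))
    return sections

def is_allowed(path, rules):
    """Honor [RestrictedAreas] the way robots.txt Disallow prefixes work."""
    for prefix in rules.get("RestrictedAreas", {}).get("Disallow", []):
        if path.startswith(prefix):
            return False
    return True

# Assuming the sample above has been saved locally as agents.txt.
with open("agents.txt") as f:
    rules = parse_agents_txt(f.read())

print(is_allowed("/admin/settings", rules))         # False
print(rules["DataExtraction"]["Price"])             # ['.price']
print(rules["RateLimits"]["MaxRequestsPerMinute"])  # ['30']

A hand-rolled parser is used here because Python's configparser rejects the repeated Disallow keys in its default strict mode; any real standard would need to settle details like this explicitly.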