Robots.txt is a plain text file containing a few lines of simple directives. It tells web robots (crawlers) which pages on your site they may or may not crawl.
Basic format:

```
User-agent: [user-agent name]
Disallow: [URL string not to be crawled]
```
Here, User-agent names the web crawling software the rules apply to, and Disallow (or Allow) gives that crawler its crawl instructions.
An asterisk (*) after "User-agent" means the rules apply to every web robot that visits the site. A slash (/) after "Disallow" tells those robots not to visit any page on the site. One of the major goals of SEO is to make your site easy for search engines to crawl, which can help improve your ranking.
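The wildcard and the slash described above combine into the most restrictive possible file, which blocks every crawler from every page of the site:

```
User-agent: *
Disallow: /
```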
- User-agent: The name of the web crawler the rules that follow apply to.
- Disallow: The command used to tell a user-agent not to crawl a particular URL.
- Allow: The command used to tell Googlebot it can access a page or subfolder, even inside a disallowed folder.
- Crawl-delay: The number of seconds a crawler should wait before loading and crawling page content. (Note that Googlebot does not support this directive.)
- Sitemap: The location of the XML sitemap(s) associated with this site.
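Putting these directives together, a small illustrative robots.txt might look like this (the paths and sitemap URL here are made-up placeholders, not recommendations for any particular site):

```
User-agent: *
Disallow: /admin/
Allow: /admin/help.html
Crawl-delay: 10

Sitemap: https://www.example.com/sitemap.xml
```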
A robots.txt file is commonly used to:
- Block non-public pages
- Maximize crawl budget
- Prevent indexing of resources
How to Test Your Robots.txt File?
There are many tester tools for checking a robots.txt file. One of them is Google Search Console:
Log in to your Google Search Console account -> Go to the old version
In the old Google Search Console interface -> Crawl -> Click on robots.txt Tester to check the file and find errors.
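If you prefer to test rules locally instead of through a web tool, Python's standard library includes a robots.txt parser. This short sketch (the rules and URLs are made-up examples) checks whether specific URLs are crawlable under a given set of rules:

```python
from urllib.robotparser import RobotFileParser

# Example rules, parsed directly rather than fetched from a live site.
rules = """\
User-agent: *
Disallow: /private/
Allow: /public/
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# can_fetch(user_agent, url) returns True if that agent may crawl the URL.
print(parser.can_fetch("*", "https://example.com/public/page.html"))   # allowed
print(parser.can_fetch("*", "https://example.com/private/data.html"))  # blocked
```

To test a live site instead, call `parser.set_url("https://example.com/robots.txt")` followed by `parser.read()` before querying `can_fetch`.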
Related Blogs: SEO Things to Improve Ranking