Recently, many WordPress users have been researching tips on how to optimize the robots.txt file to improve SEO. The robots.txt file tells search engines how to crawl a website, which makes it a very powerful SEO tool. This article shows how to create an ideal robots.txt file for SEO.
What is a robots.txt file?
Robots.txt is a text file that site owners can create to tell search engine bots how to crawl and index pages on their websites. It is usually stored in the root directory, also known as the website’s home folder. The basic format of the robots.txt file is as follows:
User-agent: [user-agent name]
Disallow: [URL string not to be crawled]
User-agent: [user-agent name]
Allow: [URL string to be crawled]
Sitemap: [URL of site XML Sitemap]
You can use multiple lines of instructions to allow or block specific URLs and to add multiple sitemaps. If a URL is not blocked, search engine robots assume they are allowed to crawl it.
Here is what a sample robots.txt file looks like:
User-Agent: *
Allow: /wp-content/uploads/
Disallow: /wp-content/plugins/
Disallow: /wp-admin/
Sitemap: https://example.com/sitemap_index.xml
In the robots.txt example above, search engines are allowed to crawl and index files in the WordPress uploads folder. After that, search robots are blocked from crawling and indexing the plugins folder and the WordPress admin folder.
Finally, the URL of the XML sitemap is provided.
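If you want to check how a crawler would interpret these rules before publishing them, Python’s built-in urllib.robotparser module offers a quick sanity check. The following is a minimal sketch based on the sample rules above; the example.com URLs are placeholders:

```python
# Minimal sketch: check how a crawler would read the sample rules above.
# Uses Python's built-in urllib.robotparser; example.com URLs are placeholders.
from urllib.robotparser import RobotFileParser

rules = """\
User-Agent: *
Allow: /wp-content/uploads/
Disallow: /wp-content/plugins/
Disallow: /wp-admin/
Sitemap: https://example.com/sitemap_index.xml
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# The uploads folder is allowed, while plugins and wp-admin are blocked.
print(parser.can_fetch("*", "https://example.com/wp-content/uploads/photo.jpg"))    # True
print(parser.can_fetch("*", "https://example.com/wp-content/plugins/some-plugin/")) # False
print(parser.can_fetch("*", "https://example.com/wp-admin/"))                       # False
```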
Does a WordPress website need a robots.txt file?
If a site does not have a robots.txt file, search engines will still crawl and index it. However, without the file, you cannot tell search engines which pages or folders they should not crawl.
When a blog is first created and does not have much content, this will not have much impact. However, as the website grows and accumulates a large amount of content, it is better to control how the website is crawled and indexed. This is because search robots assign each website a crawl quota, meaning they crawl only a certain number of pages during each crawling session. If they have not finished crawling all the pages on the site, they will come back and continue crawling in the next session. This can slow down the website’s indexing speed.
You can solve this problem by preventing search bots from trying to crawl unnecessary pages, such as the WordPress admin pages, plugin files, and theme folders. By disallowing unnecessary pages, you save crawl quota, which helps search engines crawl more pages on your site and index them. Blocking pages with robots.txt is not a reliable way to hide content from the public, but it can help keep those pages from appearing in search results.
What should an ideal robots.txt file look like?
Many popular blogs use a very simple robots.txt file:
User-agent: *
Disallow:
Sitemap: http://www.example.com/post-sitemap.xml
Sitemap: http://www.example.com/page-sitemap.xml
This robots.txt file allows all robots to index all content and gives them links to the site’s XML sitemaps. For WordPress sites, it is best to use the following rules in the robots.txt file:
User-Agent: *
Allow: /wp-content/uploads/
Disallow: /wp-content/plugins/
Disallow: /wp-admin/
Disallow: /readme.html
Disallow: /refer/
Sitemap: http://www.example.com/post-sitemap.xml
Sitemap: http://www.example.com/page-sitemap.xml
These rules tell search bots to index all WordPress images and files. They block search bots from indexing WordPress plugin files, the WordPress admin area, the WordPress readme file, and affiliate links. Adding sitemaps to the robots.txt file makes it easy for Google’s bots to find all the pages on a website.
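Once the file is in place at the root of the site, you may want to confirm that the deployed rules behave as intended. The sketch below again uses Python’s standard urllib.robotparser; the www.example.com domain and the checked paths are placeholders mirroring the rules above, so replace them with your own:

```python
# Minimal sketch: verify a deployed robots.txt, assuming it lives at the site root.
# Replace www.example.com with your own domain; paths mirror the rules shown above.
from urllib.robotparser import RobotFileParser

robots_url = "https://www.example.com/robots.txt"  # placeholder URL

parser = RobotFileParser()
parser.set_url(robots_url)
parser.read()  # downloads and parses the live robots.txt

# Spot-check a few URLs against the deployed rules.
for path in ("/wp-content/uploads/image.png",
             "/wp-content/plugins/",
             "/wp-admin/",
             "/readme.html",
             "/refer/some-product"):
    allowed = parser.can_fetch("*", "https://www.example.com" + path)
    print(f"{path}: {'allowed' if allowed else 'blocked'}")
```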