{"id":50,"date":"2024-03-06T12:10:01","date_gmt":"2024-03-06T12:10:01","guid":{"rendered":"https:\/\/gratisvps.net\/blog\/?p=50"},"modified":"2024-05-25T03:18:30","modified_gmt":"2024-05-25T03:18:30","slug":"how-to-block-ai-crawler-bots-using-robots-txt-file","status":"publish","type":"post","link":"https:\/\/gratisvps.net\/blog\/how-to-block-ai-crawler-bots-using-robots-txt-file\/","title":{"rendered":"How to block AI Crawler Bots using robots.txt file"},"content":{"rendered":"<p>Are you a content creator or a blog author who generates unique, high-quality content for a living? Have you noticed that generative AI platforms like OpenAI or CCBot use your content to train their algorithms without your consent? Don\u2019t worry! You can block these AI crawlers from accessing your website or blog by using the robots.txt file.<br \/>\n<span id=\"more-2121\"><\/span><br \/>\n<a href=\"https:\/\/www.cyberciti.biz\/media\/new\/cms\/2023\/09\/How-to-block-AI-Crawler-Bots-using-robots.txt-file.png\"><img fetchpriority=\"high\" decoding=\"async\" class=\"aligncenter size-full wp-image-2123 entered lazyloaded\" src=\"https:\/\/www.cyberciti.biz\/media\/new\/cms\/2023\/09\/How-to-block-AI-Crawler-Bots-using-robots.txt-file.png\" alt=\"How to block AI Crawler Bots using robots.txt file\" width=\"598\" height=\"425\" data-lazy-src=\"https:\/\/www.cyberciti.biz\/media\/new\/cms\/2023\/09\/How-to-block-AI-Crawler-Bots-using-robots.txt-file.png\" data-ll-status=\"loaded\" \/><\/a><\/p>\n<h2>What is a robots.txt file?<\/h2>\n<p>A robots.txt file is a plain-text file that tells robots, such as search engine crawlers, which pages of a website they may crawl. You can use it to allow or block any bot, good or bad, that follows your robots.txt file. 
The syntax to block a single bot by its user agent is as follows:<\/p>\n<pre>User-agent: {BOT-NAME-HERE}\r\nDisallow: \/<\/pre>\n<p>Here is how to allow a specific bot to crawl your website by its user agent:<\/p>\n<pre>User-agent: {BOT-NAME-HERE}\r\nAllow: \/<\/pre>\n<h3>Where to place your robots.txt file?<\/h3>\n<p>Upload the file to your website\u2019s root folder. Each host, including every subdomain, needs its own file, so the URLs will look like this:<\/p>\n<pre>https:\/\/example.com\/robots.txt\r\nhttps:\/\/blog.example.com\/robots.txt<\/pre>\n<p>See the following resources about robots.txt for more info:<\/p>\n<ol>\n<li>Introduction to robots.txt\u00a0<a href=\"https:\/\/developers.google.com\/search\/docs\/crawling-indexing\/robots\/intro\" target=\"_blank\" rel=\"noopener\">from Google<\/a>.<\/li>\n<li>What is robots.txt? | How a robots.txt file works\u00a0<a href=\"https:\/\/www.cloudflare.com\/learning\/bots\/what-is-robots-txt\/\" target=\"_blank\" rel=\"noopener\">from Cloudflare<\/a>.<\/li>\n<\/ol>\n<h2>How to block AI crawler bots using the robots.txt file<\/h2>\n<p>The syntax is the same:<\/p>\n<pre>User-agent: {AI-Crawler-Bot-Name-Here}\r\nDisallow: \/<\/pre>\n<h3>Blocking OpenAI using the robots.txt file<\/h3>\n<p>Add the following four lines to your robots.txt:<\/p>\n<pre>User-agent: GPTBot\r\nDisallow: \/\r\nUser-agent: ChatGPT-User\r\nDisallow: \/<\/pre>\n<p>Please note that OpenAI has two separate user agents, one for web crawling and one for browsing, each with its own CIDR and IP address ranges. To configure the firewall rules listed below, you will need a solid understanding of networking concepts and root-level access to your Linux server. If you lack these skills, consider hiring a Linux sysadmin to block access from the constantly changing IP address ranges. 
This can become a game of cat and mouse.<\/p>\n<h4>#1: The\u00a0<tt>ChatGPT-User<\/tt>\u00a0is used by\u00a0<strong>plugins<\/strong>\u00a0in ChatGPT<\/h4>\n<p>OpenAI publishes a list\u00a0of the user agents used by its crawlers and fetchers, including the CIDR or IP address ranges of its plugin bot, which you can use with your web server firewall. You can block the range\u00a0<tt>23.98.142.176\/28<\/tt>\u00a0using the\u00a0ufw command\u00a0or\u00a0iptables command\u00a0on your web server. For example,\u00a0here is how to block that CIDR range on ports 80 and 443 using UFW:<br \/>\n<code><span class=\"normaluserprompt\" title=\"The shell prompt usually ends in a $ sign and is not part of the command for the nonprivileged user.\">$\u00a0<\/span>sudo ufw deny proto tcp from 23.98.142.176\/28 to any port 80<br \/>\n<span class=\"normaluserprompt\" title=\"The shell prompt usually ends in a $ sign and is not part of the command for the nonprivileged user.\">$\u00a0<\/span>sudo ufw deny proto tcp from 23.98.142.176\/28 to any port 443<\/code><\/p>\n<h4>#2: The\u00a0<tt>GPTBot<\/tt>\u00a0is used by ChatGPT<\/h4>\n<p><a href=\"https:\/\/platform.openai.com\/docs\/gptbot\" target=\"_blank\" rel=\"noopener\">Here\u2019s a list<\/a>\u00a0of the user agents used by OpenAI crawlers and fetchers, including the\u00a0<a href=\"https:\/\/openai.com\/gptbot.json\" target=\"_blank\" rel=\"noopener\">CIDR<\/a>\u00a0or\u00a0<a href=\"https:\/\/openai.com\/gptbot-ranges.txt\" target=\"_blank\" rel=\"noopener\">IP address ranges<\/a>\u00a0of its AI bot, which you can use with your web server firewall. Again, you can block those ranges using the\u00a0ufw command\u00a0or\u00a0iptables command. 
Here is a shell script to block those CIDR ranges:<\/p>\n<div class=\"wp-geshi-highlight-wrap5\">\n<div class=\"wp-geshi-highlight-wrap4\">\n<div class=\"wp-geshi-highlight-wrap3\">\n<div class=\"wp-geshi-highlight-wrap2\">\n<div class=\"wp-geshi-highlight-wrap\">\n<div class=\"wp-geshi-highlight\">\n<div class=\"bash\">\n<pre class=\"de1\"><span class=\"co0\">#!\/bin\/bash<\/span>\r\n<span class=\"co0\"># Purpose: Block OpenAI ChatGPT bot CIDR <\/span>\r\n<span class=\"co0\"># Tested on: Debian and Ubuntu Linux<\/span>\r\n<span class=\"co0\"># Author: Vivek Gite {https:\/\/www.cyberciti.biz} under GPL v2.x+ <\/span>\r\n<span class=\"co0\"># ------------------------------------------------------------------<\/span>\r\n<span class=\"re2\">file<\/span>=<span class=\"st0\">\"\/tmp\/out.txt.$$\"<\/span>\r\n<span class=\"kw2\">wget<\/span> <span class=\"re5\">-q<\/span> <span class=\"re5\">-O<\/span> <span class=\"st0\">\"<span class=\"es2\">$file<\/span>\"<\/span> https:<span class=\"sy0\">\/\/<\/span>openai.com<span class=\"sy0\">\/<\/span>gptbot-ranges.txt <span class=\"nu0\">2<\/span><span class=\"sy0\">&gt;\/<\/span>dev<span class=\"sy0\">\/<\/span>null\r\n\u00a0\r\n<span class=\"kw1\">while<\/span> <span class=\"re2\">IFS<\/span>= <span class=\"kw3\">read<\/span> <span class=\"re5\">-r<\/span> cidr\r\n<span class=\"kw1\">do<\/span>\r\n    <span class=\"kw2\">sudo<\/span> ufw deny proto tcp from <span class=\"re1\">$cidr<\/span> to any port <span class=\"nu0\">80<\/span>\r\n    <span class=\"kw2\">sudo<\/span> ufw deny proto tcp from <span class=\"re1\">$cidr<\/span> to any port <span class=\"nu0\">443<\/span>\r\n<span class=\"kw1\">done<\/span> <span class=\"sy0\">&lt;<\/span> <span class=\"st0\">\"<span class=\"es2\">$file<\/span>\"<\/span>\r\n<span class=\"br0\">[<\/span> <span class=\"re5\">-f<\/span> <span class=\"st0\">\"<span class=\"es2\">$file<\/span>\"<\/span> <span class=\"br0\">]<\/span> <span class=\"sy0\">&amp;&amp;<\/span> <span 
class=\"kw2\">rm<\/span> <span class=\"re5\">-f<\/span> <span class=\"st0\">\"<span class=\"es2\">$file<\/span>\"<\/span><\/pre>\n<p>&nbsp;<\/p>\n<h3>Blocking Google AI (Bard and Vertex AI generative APIs)<\/h3>\n<p>Add the following two lines to your robots.txt:<\/p>\n<pre>User-agent: Google-Extended\r\nDisallow: \/<\/pre>\n<p>For more information,\u00a0<a href=\"https:\/\/developers.google.com\/search\/docs\/crawling-indexing\/overview-google-crawlers\" target=\"_blank\" rel=\"noopener\">here\u2019s a list<\/a>\u00a0of the user agents used by Google crawlers and fetchers. However, Google does not provide CIDR or IP address ranges, or autonomous system number (ASN) information, for its AI bot that you could use with your web server firewall.<\/p>\n<h3>Blocking Common Crawl (CCBot) using the robots.txt file<\/h3>\n<p>Add the following two lines to your robots.txt:<\/p>\n<pre>User-agent: CCBot\r\nDisallow: \/<\/pre>\n<p>Although Common Crawl is a\u00a0<a href=\"https:\/\/commoncrawl.org\/ccbot\" target=\"_blank\" rel=\"noopener\">non-profit foundation<\/a>, everyone uses the data collected by its bot, CCBot, to train their AI. It is essential to block it, too. However, just like Google, Common Crawl does not provide CIDR or IP address ranges, or autonomous system number (ASN) information, for its bot that you could use with your web server firewall.<\/p>\n<h3>Blocking Perplexity AI using the robots.txt file<\/h3>\n<p>Perplexity AI is another service that takes all your content and rewrites it using generative AI. You can block it as follows:<\/p>\n<pre>User-agent: PerplexityBot\r\nDisallow: \/<\/pre>\n<p>Perplexity also publishes\u00a0<a href=\"https:\/\/www.perplexity.ai\/perplexitybot.json\" target=\"_blank\" rel=\"noopener\">IP address ranges<\/a>\u00a0that you can block using your WAF or web server firewall.<\/p>\n<h2>Can AI bots ignore my robots.txt file?<\/h2>\n<p>Well-established companies such as Google and OpenAI typically honor the robots.txt protocol. 
But some poorly designed AI bots will ignore your robots.txt file.<\/p>\n<h2>Is it possible to block AI bots using AWS or Cloudflare WAF technology?<\/h2>\n<p><a href=\"https:\/\/blog.cloudflare.com\/ai-bots\/\" target=\"_blank\" rel=\"noopener\">Cloudflare recently announced<\/a>\u00a0that they have introduced a new firewall rule that can block AI bots. Search engines and other bots can still access your website or blog under its WAF rules. It is crucial to remember that WAF products require a thorough understanding of how bots function and must be implemented carefully. Otherwise, they could end up blocking legitimate users as well. Here is how to block AI bots using Cloudflare WAF:<br \/>\n<img decoding=\"async\" class=\"alignnone wp-image-83 size-full\" src=\"https:\/\/gratisvps.net\/blog\/wp-content\/uploads\/2024\/03\/Is-it-possible-to-block-AI-bots-using-Cloudflare-WAF-technology-599x465-1.webp\" alt=\"Is it possible to block ai bots using cloudflare waf technology\" width=\"599\" height=\"465\" srcset=\"https:\/\/gratisvps.net\/blog\/wp-content\/uploads\/2024\/03\/Is-it-possible-to-block-AI-bots-using-Cloudflare-WAF-technology-599x465-1.webp 599w, https:\/\/gratisvps.net\/blog\/wp-content\/uploads\/2024\/03\/Is-it-possible-to-block-AI-bots-using-Cloudflare-WAF-technology-599x465-1-300x233.webp 300w\" sizes=\"(max-width: 599px) 100vw, 599px\" \/><\/p>\n<p>Please note that I\u2019m evaluating the Cloudflare solution, but my preliminary testing shows it stopped at least 3.31% of users with a captcha. The 3.31% is the CSR (Challenge Solve Rate), i.e., humans who solved the captcha provided by Cloudflare. That is a high CSR. I need to do more testing. 
I will update this blog post when I start using Cloudflare.<\/p>\n<p>&nbsp;<\/p>\n<h2>Can I block access to my code and documents hosted on GitHub and other cloud-hosting sites?<\/h2>\n<p>Not that I know of. I don\u2019t know if that is possible.<\/p>\n<p>I am concerned about using GitHub because it is owned by Microsoft, the largest investor in OpenAI. They may use your data to train AI through their ToS updates and other loopholes. It would be best if you or your company hosted the git server independently to prevent your data and code from being used for training. Big companies like\u00a0<a href=\"https:\/\/www.theregister.com\/2023\/05\/19\/apple_chatgpt\" target=\"_blank\" rel=\"noopener\">Apple<\/a>\u00a0and others prohibit the internal use of ChatGPT and similar products because they fear it may lead to code and sensitive data leakage.<\/p>\n<h2>Is it ethical to block AI bots for training data when AI is being used for the betterment of humanity?<\/h2>\n<p>I doubt that OpenAI, Google Bard, Microsoft Bing, or any other AI is being used for the benefit of humanity. It seems like a mere money-making scheme, while generative AI replaces white-collar jobs. However, if you have any information about how my data can be used to cure cancer (or something similar), please feel free to share it in the comments section.<\/p>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>Are you a content creator or a blog author who generates unique, high-quality content for a living? Have you noticed that generative AI platforms like OpenAI or CCBot use your content to train their algorithms without your consent? Don\u2019t worry! 
You can block these AI crawlers from accessing your website&hellip;<\/p>\n","protected":false},"author":1,"featured_media":51,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[5,2],"tags":[27,28,7],"class_list":["post-50","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-how-to","category-linux","tag-ai-crawler","tag-bots","tag-how-to"],"yoast_head":"<!-- This site is optimized with the Yoast SEO Premium plugin v23.6 (Yoast SEO v23.6) - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>How to block AI Crawler Bots using robots.txt file - Gratisvps.net | Blog Daily Tech Info<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/gratisvps.net\/blog\/how-to-block-ai-crawler-bots-using-robots-txt-file\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"How to block AI Crawler Bots using robots.txt file\" \/>\n<meta property=\"og:description\" content=\"Are you a content creator or a blog author who generates unique, high-quality content for a living? Have you noticed that generative AI platforms like OpenAI or CCBot use your content to train their algorithms without your consent? Don\u2019t worry! 
You can block these AI crawlers from accessing your website&hellip;\" \/>\n<meta property=\"og:url\" content=\"https:\/\/gratisvps.net\/blog\/how-to-block-ai-crawler-bots-using-robots-txt-file\/\" \/>\n<meta property=\"og:site_name\" content=\"Gratisvps.net | Blog Daily Tech Info\" \/>\n<meta property=\"article:published_time\" content=\"2024-03-06T12:10:01+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2024-05-25T03:18:30+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/gratisvps.net\/blog\/wp-content\/uploads\/2024\/03\/How-to-block-web-crawlers-1.png\" \/>\n\t<meta property=\"og:image:width\" content=\"1200\" \/>\n\t<meta property=\"og:image:height\" content=\"627\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/png\" \/>\n<meta name=\"author\" content=\"ariete\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"ariete\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"6 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\/\/gratisvps.net\/blog\/how-to-block-ai-crawler-bots-using-robots-txt-file\/#article\",\"isPartOf\":{\"@id\":\"https:\/\/gratisvps.net\/blog\/how-to-block-ai-crawler-bots-using-robots-txt-file\/\"},\"author\":{\"name\":\"ariete\",\"@id\":\"https:\/\/gratisvps.net\/blog\/#\/schema\/person\/cddcf8cb5192d0713c19b79425c77fc4\"},\"headline\":\"How to block AI Crawler Bots using robots.txt file\",\"datePublished\":\"2024-03-06T12:10:01+00:00\",\"dateModified\":\"2024-05-25T03:18:30+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/gratisvps.net\/blog\/how-to-block-ai-crawler-bots-using-robots-txt-file\/\"},\"wordCount\":990,\"commentCount\":0,\"publisher\":{\"@id\":\"https:\/\/gratisvps.net\/blog\/#organization\"},\"image\":{\"@id\":\"https:\/\/gratisvps.net\/blog\/how-to-block-ai-crawler-bots-using-robots-txt-file\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/gratisvps.net\/blog\/wp-content\/uploads\/2024\/03\/How-to-block-web-crawlers-1.png\",\"keywords\":[\"AI Crawler\",\"Bots\",\"how to\"],\"articleSection\":[\"How To\",\"Linux\"],\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\/\/gratisvps.net\/blog\/how-to-block-ai-crawler-bots-using-robots-txt-file\/#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/gratisvps.net\/blog\/how-to-block-ai-crawler-bots-using-robots-txt-file\/\",\"url\":\"https:\/\/gratisvps.net\/blog\/how-to-block-ai-crawler-bots-using-robots-txt-file\/\",\"name\":\"How to block AI Crawler Bots using robots.txt file - Gratisvps.net | Blog Daily Tech 
Info\",\"isPartOf\":{\"@id\":\"https:\/\/gratisvps.net\/blog\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\/\/gratisvps.net\/blog\/how-to-block-ai-crawler-bots-using-robots-txt-file\/#primaryimage\"},\"image\":{\"@id\":\"https:\/\/gratisvps.net\/blog\/how-to-block-ai-crawler-bots-using-robots-txt-file\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/gratisvps.net\/blog\/wp-content\/uploads\/2024\/03\/How-to-block-web-crawlers-1.png\",\"datePublished\":\"2024-03-06T12:10:01+00:00\",\"dateModified\":\"2024-05-25T03:18:30+00:00\",\"breadcrumb\":{\"@id\":\"https:\/\/gratisvps.net\/blog\/how-to-block-ai-crawler-bots-using-robots-txt-file\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/gratisvps.net\/blog\/how-to-block-ai-crawler-bots-using-robots-txt-file\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/gratisvps.net\/blog\/how-to-block-ai-crawler-bots-using-robots-txt-file\/#primaryimage\",\"url\":\"https:\/\/gratisvps.net\/blog\/wp-content\/uploads\/2024\/03\/How-to-block-web-crawlers-1.png\",\"contentUrl\":\"https:\/\/gratisvps.net\/blog\/wp-content\/uploads\/2024\/03\/How-to-block-web-crawlers-1.png\",\"width\":1200,\"height\":627},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/gratisvps.net\/blog\/how-to-block-ai-crawler-bots-using-robots-txt-file\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/gratisvps.net\/blog\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"How to block AI Crawler Bots using robots.txt file\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/gratisvps.net\/blog\/#website\",\"url\":\"https:\/\/gratisvps.net\/blog\/\",\"name\":\"Gratisvps.net | Blog Daily Tech Info\",\"description\":\"Discover reliable VPS server 
solutions\",\"publisher\":{\"@id\":\"https:\/\/gratisvps.net\/blog\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/gratisvps.net\/blog\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/gratisvps.net\/blog\/#organization\",\"name\":\"Gratisvps.net | Blog Daily Tech Info\",\"url\":\"https:\/\/gratisvps.net\/blog\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/gratisvps.net\/blog\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/gratisvps.net\/blog\/wp-content\/uploads\/2024\/10\/logo.png\",\"contentUrl\":\"https:\/\/gratisvps.net\/blog\/wp-content\/uploads\/2024\/10\/logo.png\",\"width\":250,\"height\":67,\"caption\":\"Gratisvps.net | Blog Daily Tech Info\"},\"image\":{\"@id\":\"https:\/\/gratisvps.net\/blog\/#\/schema\/logo\/image\/\"}},{\"@type\":\"Person\",\"@id\":\"https:\/\/gratisvps.net\/blog\/#\/schema\/person\/cddcf8cb5192d0713c19b79425c77fc4\",\"name\":\"ariete\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/gratisvps.net\/blog\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/ca385b636b0c0fe0e98479594ff50902?s=96&d=mm&r=g\",\"contentUrl\":\"https:\/\/secure.gravatar.com\/avatar\/ca385b636b0c0fe0e98479594ff50902?s=96&d=mm&r=g\",\"caption\":\"ariete\"},\"sameAs\":[\"https:\/\/gratisvps.net\/blog\"],\"url\":\"https:\/\/gratisvps.net\/blog\/author\/ariete\/\"}]}<\/script>\n<!-- \/ Yoast SEO Premium plugin. 
-->","yoast_head_json":{"title":"How to block AI Crawler Bots using robots.txt file - Gratisvps.net | Blog Daily Tech Info","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/gratisvps.net\/blog\/how-to-block-ai-crawler-bots-using-robots-txt-file\/","og_locale":"en_US","og_type":"article","og_title":"How to block AI Crawler Bots using robots.txt file","og_description":"Are you a content creator or a blog author who generates unique, high-quality content for a living? Have you noticed that generative AI platforms like OpenAI or CCBot use your content to train their algorithms without your consent? Don\u2019t worry! You can block these AI crawlers from accessing your website&hellip;","og_url":"https:\/\/gratisvps.net\/blog\/how-to-block-ai-crawler-bots-using-robots-txt-file\/","og_site_name":"Gratisvps.net | Blog Daily Tech Info","article_published_time":"2024-03-06T12:10:01+00:00","article_modified_time":"2024-05-25T03:18:30+00:00","og_image":[{"width":1200,"height":627,"url":"https:\/\/gratisvps.net\/blog\/wp-content\/uploads\/2024\/03\/How-to-block-web-crawlers-1.png","type":"image\/png"}],"author":"ariete","twitter_card":"summary_large_image","twitter_misc":{"Written by":"ariete","Est. 
reading time":"6 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/gratisvps.net\/blog\/how-to-block-ai-crawler-bots-using-robots-txt-file\/#article","isPartOf":{"@id":"https:\/\/gratisvps.net\/blog\/how-to-block-ai-crawler-bots-using-robots-txt-file\/"},"author":{"name":"ariete","@id":"https:\/\/gratisvps.net\/blog\/#\/schema\/person\/cddcf8cb5192d0713c19b79425c77fc4"},"headline":"How to block AI Crawler Bots using robots.txt file","datePublished":"2024-03-06T12:10:01+00:00","dateModified":"2024-05-25T03:18:30+00:00","mainEntityOfPage":{"@id":"https:\/\/gratisvps.net\/blog\/how-to-block-ai-crawler-bots-using-robots-txt-file\/"},"wordCount":990,"commentCount":0,"publisher":{"@id":"https:\/\/gratisvps.net\/blog\/#organization"},"image":{"@id":"https:\/\/gratisvps.net\/blog\/how-to-block-ai-crawler-bots-using-robots-txt-file\/#primaryimage"},"thumbnailUrl":"https:\/\/gratisvps.net\/blog\/wp-content\/uploads\/2024\/03\/How-to-block-web-crawlers-1.png","keywords":["AI Crawler","Bots","how to"],"articleSection":["How To","Linux"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/gratisvps.net\/blog\/how-to-block-ai-crawler-bots-using-robots-txt-file\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/gratisvps.net\/blog\/how-to-block-ai-crawler-bots-using-robots-txt-file\/","url":"https:\/\/gratisvps.net\/blog\/how-to-block-ai-crawler-bots-using-robots-txt-file\/","name":"How to block AI Crawler Bots using robots.txt file - Gratisvps.net | Blog Daily Tech 
Info","isPartOf":{"@id":"https:\/\/gratisvps.net\/blog\/#website"},"primaryImageOfPage":{"@id":"https:\/\/gratisvps.net\/blog\/how-to-block-ai-crawler-bots-using-robots-txt-file\/#primaryimage"},"image":{"@id":"https:\/\/gratisvps.net\/blog\/how-to-block-ai-crawler-bots-using-robots-txt-file\/#primaryimage"},"thumbnailUrl":"https:\/\/gratisvps.net\/blog\/wp-content\/uploads\/2024\/03\/How-to-block-web-crawlers-1.png","datePublished":"2024-03-06T12:10:01+00:00","dateModified":"2024-05-25T03:18:30+00:00","breadcrumb":{"@id":"https:\/\/gratisvps.net\/blog\/how-to-block-ai-crawler-bots-using-robots-txt-file\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/gratisvps.net\/blog\/how-to-block-ai-crawler-bots-using-robots-txt-file\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/gratisvps.net\/blog\/how-to-block-ai-crawler-bots-using-robots-txt-file\/#primaryimage","url":"https:\/\/gratisvps.net\/blog\/wp-content\/uploads\/2024\/03\/How-to-block-web-crawlers-1.png","contentUrl":"https:\/\/gratisvps.net\/blog\/wp-content\/uploads\/2024\/03\/How-to-block-web-crawlers-1.png","width":1200,"height":627},{"@type":"BreadcrumbList","@id":"https:\/\/gratisvps.net\/blog\/how-to-block-ai-crawler-bots-using-robots-txt-file\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/gratisvps.net\/blog\/"},{"@type":"ListItem","position":2,"name":"How to block AI Crawler Bots using robots.txt file"}]},{"@type":"WebSite","@id":"https:\/\/gratisvps.net\/blog\/#website","url":"https:\/\/gratisvps.net\/blog\/","name":"Gratisvps.net | Blog Daily Tech Info","description":"Discover reliable VPS server 
solutions","publisher":{"@id":"https:\/\/gratisvps.net\/blog\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/gratisvps.net\/blog\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/gratisvps.net\/blog\/#organization","name":"Gratisvps.net | Blog Daily Tech Info","url":"https:\/\/gratisvps.net\/blog\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/gratisvps.net\/blog\/#\/schema\/logo\/image\/","url":"https:\/\/gratisvps.net\/blog\/wp-content\/uploads\/2024\/10\/logo.png","contentUrl":"https:\/\/gratisvps.net\/blog\/wp-content\/uploads\/2024\/10\/logo.png","width":250,"height":67,"caption":"Gratisvps.net | Blog Daily Tech Info"},"image":{"@id":"https:\/\/gratisvps.net\/blog\/#\/schema\/logo\/image\/"}},{"@type":"Person","@id":"https:\/\/gratisvps.net\/blog\/#\/schema\/person\/cddcf8cb5192d0713c19b79425c77fc4","name":"ariete","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/gratisvps.net\/blog\/#\/schema\/person\/image\/","url":"https:\/\/secure.gravatar.com\/avatar\/ca385b636b0c0fe0e98479594ff50902?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/ca385b636b0c0fe0e98479594ff50902?s=96&d=mm&r=g","caption":"ariete"},"sameAs":["https:\/\/gratisvps.net\/blog"],"url":"https:\/\/gratisvps.net\/blog\/author\/ariete\/"}]}},"_links":{"self":[{"href":"https:\/\/gratisvps.net\/blog\/wp-json\/wp\/v2\/posts\/50"}],"collection":[{"href":"https:\/\/gratisvps.net\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/gratisvps.net\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/gratisvps.net\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/gratisvps.net\/blog\/wp-json\/wp\/v2\/comments?post=50"}],"version-history":[{"count":4,"href":"ht
tps:\/\/gratisvps.net\/blog\/wp-json\/wp\/v2\/posts\/50\/revisions"}],"predecessor-version":[{"id":85,"href":"https:\/\/gratisvps.net\/blog\/wp-json\/wp\/v2\/posts\/50\/revisions\/85"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/gratisvps.net\/blog\/wp-json\/wp\/v2\/media\/51"}],"wp:attachment":[{"href":"https:\/\/gratisvps.net\/blog\/wp-json\/wp\/v2\/media?parent=50"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/gratisvps.net\/blog\/wp-json\/wp\/v2\/categories?post=50"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/gratisvps.net\/blog\/wp-json\/wp\/v2\/tags?post=50"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}