# As a condition of accessing this website, you agree to abide by the following
# content signals:

# (a)  If a Content-Signal = yes, you may collect content for the corresponding
#      use.
# (b)  If a Content-Signal = no, you may not collect content for the
#      corresponding use.
# (c)  If the website operator does not include a Content-Signal for a
#      corresponding use, the website operator neither grants nor restricts
#      permission via Content-Signal with respect to the corresponding use.

# The content signals and their meanings are:

# search:   building a search index and providing search results (e.g., returning
#           hyperlinks and short excerpts from your website's contents). Search does not
#           include providing AI-generated search summaries.
# ai-input: inputting content into one or more AI models (e.g., retrieval
#           augmented generation, grounding, or other real-time taking of content for
#           generative AI search answers).
# ai-train: training or fine-tuning AI models.

# ANY RESTRICTIONS EXPRESSED VIA CONTENT SIGNALS ARE EXPRESS RESERVATIONS OF
# RIGHTS UNDER ARTICLE 4 OF THE EUROPEAN UNION DIRECTIVE 2019/790 ON COPYRIGHT
# AND RELATED RIGHTS IN THE DIGITAL SINGLE MARKET.

# BEGIN Cloudflare Managed content

User-agent: *
Content-Signal: search=yes,ai-train=no
Allow: /

User-agent: Amazonbot
Disallow: /

User-agent: Applebot-Extended
Disallow: /

User-agent: Bytespider
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: CloudflareBrowserRenderingCrawler
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: GPTBot
Disallow: /

User-agent: meta-externalagent
Disallow: /

# END Cloudflare Managed Content

# Comprehensive SEO-Friendly Robots.txt for Ledger Live Website
# Website: https://ledge-live.org/
# Last Updated: January 2025
# Purpose: Maximize search engine visibility while blocking harmful bots

# =============================================================================
# MAJOR SEARCH ENGINES - FULL ACCESS ALLOWED
# =============================================================================

# Google Search Engine - Full Access
User-agent: Googlebot
Allow: /
Crawl-delay: 1

# Google Image Search - Full Access
User-agent: Googlebot-Image
Allow: /
Crawl-delay: 1

# Google Mobile Search - Full Access
User-agent: Googlebot-Mobile
Allow: /
Crawl-delay: 1

# Google News - Full Access
User-agent: Googlebot-News
Allow: /
Crawl-delay: 1

# Bing Search Engine - Full Access
User-agent: Bingbot
Allow: /
Crawl-delay: 1

# Yahoo Search Engine - Full Access
User-agent: Slurp
Allow: /
Crawl-delay: 2

# Yandex Search Engine - Full Access
User-agent: YandexBot
Allow: /
Crawl-delay: 2

# Baidu Search Engine - Full Access
User-agent: Baiduspider
Allow: /
Crawl-delay: 3

# DuckDuckGo Search Engine - Full Access
User-agent: DuckDuckBot
Allow: /
Crawl-delay: 2

# =============================================================================
# LEGITIMATE CRAWLERS AND TOOLS - CONTROLLED ACCESS
# =============================================================================

# Facebook Social Media Crawler
User-agent: facebookexternalhit
Allow: /
Crawl-delay: 2

# Twitter Social Media Crawler
User-agent: Twitterbot
Allow: /
Crawl-delay: 2

# LinkedIn Social Media Crawler
User-agent: LinkedInBot
Allow: /
Crawl-delay: 3

# WhatsApp Link Preview
User-agent: WhatsApp
Allow: /
Crawl-delay: 2

# Telegram Link Preview
User-agent: TelegramBot
Allow: /
Crawl-delay: 2

# Apple Search (Siri, Spotlight)
User-agent: Applebot
Allow: /
Crawl-delay: 2

# Microsoft Search (Cortana)
User-agent: MicrosoftPreview
Allow: /
Crawl-delay: 2

# =============================================================================
# HARMFUL AI BOTS AND SCRAPERS - BLOCKED
# =============================================================================

# OpenAI GPT Crawlers - Block to prevent unauthorized training
User-agent: GPTBot
Disallow: /

User-agent: ChatGPT-User
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: anthropic-ai
Disallow: /

User-agent: Claude-Web
Disallow: /

# Google AI Training Bots - Block unauthorized AI training
User-agent: Google-Extended
Disallow: /

User-agent: GoogleOther
Disallow: /

# Common Content Scrapers - Block aggressive scrapers
User-agent: SemrushBot
Disallow: /

User-agent: AhrefsBot
Disallow: /

User-agent: MJ12bot
Disallow: /

User-agent: DotBot
Disallow: /

User-agent: BLEXBot
Disallow: /

User-agent: DataForSeoBot
Disallow: /

# Malicious and Spam Bots - Block harmful traffic
User-agent: SiteBot
Disallow: /

User-agent: spbot
Disallow: /

User-agent: TwengaBot
Disallow: /

User-agent: dotbot
Disallow: /

User-agent: AhrefsBot
Disallow: /

User-agent: MegaIndex
Disallow: /

User-agent: BUbiNG
Disallow: /

User-agent: Cliqzbot
Disallow: /

User-agent: MojeekBot
Disallow: /

User-agent: PiplBot
Disallow: /

User-agent: woobot
Disallow: /

User-agent: ZoominfoBot
Disallow: /

# Email Harvesters and Spam Bots
User-agent: EmailCollector
Disallow: /

User-agent: EmailSiphon
Disallow: /

User-agent: WebBandit
Disallow: /

User-agent: EmailWolf
Disallow: /

User-agent: ExtractorPro
Disallow: /

User-agent: CopyRightCheck
Disallow: /

User-agent: Crescent
Disallow: /

User-agent: SiteSnagger
Disallow: /

User-agent: ProWebWalker
Disallow: /

User-agent: CheeseBot
Disallow: /

User-agent: LNSpiderguy
Disallow: /

User-agent: ia_archiver
Disallow: /

User-agent: ia_archiver/1.6
Disallow: /

User-agent: Wayback
Disallow: /

User-agent: Wget
Disallow: /

User-agent: WebStripper
Disallow: /

User-agent: WebCopier
Disallow: /

User-agent: WebReaper
Disallow: /

User-agent: WebSauger
Disallow: /

User-agent: Website\ eXtractor
Disallow: /

User-agent: WebsiteQuester
Disallow: /

User-agent: WebZIP
Disallow: /

User-agent: Teleport
Disallow: /

User-agent: TeleportPro
Disallow: /

User-agent: Microsoft\ URL\ Control
Disallow: /

User-agent: MIIxpc
Disallow: /

User-agent: Telesoft
Disallow: /

User-agent: Website\ Quester
Disallow: /

User-agent: moget/2.1
Disallow: /

User-agent: WebZip/4.0
Disallow: /

User-agent: WebStripper/2.03
Disallow: /

User-agent: WebSauger/5.0
Disallow: /

User-agent: WebCopier/v.2.2
Disallow: /

User-agent: NetAnts
Disallow: /

User-agent: Mister\ PiX
Disallow: /

User-agent: WebAuto/4.0
Disallow: /

User-agent: TheNomad/2.7
Disallow: /

User-agent: WWW-Collector-E/1.8
Disallow: /

User-agent: RMA
Disallow: /

User-agent: libWeb/clsHTTP
Disallow: /

User-agent: asterias
Disallow: /

User-agent: httplib
Disallow: /

User-agent: turingos
Disallow: /

User-agent: spanner
Disallow: /

User-agent: InfoNaviRobot
Disallow: /

User-agent: Harvest/1.5
Disallow: /

User-agent: Bullseye/1.0
Disallow: /

User-agent: Mozilla/4.0\ (compatible;\ BullsEye;\ Windows\ 95)
Disallow: /

User-agent: Crescent\ Internet\ ToolPak\ HTTP\ OLE\ Control\ v.1.0
Disallow: /

User-agent: CherryPickerSE/1.0
Disallow: /

User-agent: CherryPickerElite/1.0
Disallow: /

User-agent: WebBandit/3.50
Disallow: /

User-agent: NICErsPRO
Disallow: /

User-agent: Microsoft\ URL\ Control\ -\ 5.01.4511
Disallow: /

User-agent: DittoSpyder
Disallow: /

User-agent: Foobot
Disallow: /

User-agent: WebmasterWorldForumBot
Disallow: /

User-agent: SpankBot
Disallow: /

User-agent: BotALot
Disallow: /

User-agent: lwp-trivial/1.34
Disallow: /

User-agent: lwp-trivial
Disallow: /

User-agent: BunnySlippers
Disallow: /

User-agent: Microsoft\ URL\ Control\ -\ 6.00.8169
Disallow: /

User-agent: URLy\ Warning
Disallow: /

User-agent: Wget/1.6
Disallow: /

User-agent: Wget/1.5.3
Disallow: /

User-agent: Wget
Disallow: /

User-agent: LinkWalker/2.0
Disallow: /

User-agent: cosmos
Disallow: /

User-agent: moget
Disallow: /

User-agent: hloader
Disallow: /

User-agent: humanlinks
Disallow: /

User-agent: LinkextractorPro
Disallow: /

User-agent: Offline\ Explorer
Disallow: /

User-agent: Mata\ Hari
Disallow: /

User-agent: LexiBot
Disallow: /

User-agent: Web\ Image\ Collector
Disallow: /

User-agent: The\ Intraformant
Disallow: /

User-agent: True_Robot/1.0
Disallow: /

User-agent: True_Robot
Disallow: /

User-agent: BlowFish/1.0
Disallow: /

User-agent: JennyBot
Disallow: /

User-agent: MIIxpc/4.2
Disallow: /

User-agent: BuiltBotTough
Disallow: /

User-agent: ProPowerBot/2.14
Disallow: /

User-agent: BackDoorBot/1.0
Disallow: /

User-agent: toCrawl/UrlDispatcher
Disallow: /

User-agent: WebEnhancer
Disallow: /

User-agent: suzuran
Disallow: /

User-agent: VoidEYE
Disallow: /

User-agent: Cyclone
Disallow: /

User-agent: UtilMind\ HTTPGet
Disallow: /

User-agent: IRLbot/2.0
Disallow: /

User-agent: IRLbot/3.0
Disallow: /

User-agent: Blackwidow
Disallow: /

User-agent: Bot\ mailto:craftbot@yahoo.com
Disallow: /

User-agent: ChinaClaw
Disallow: /

User-agent: Custo
Disallow: /

User-agent: DISCo
Disallow: /

User-agent: Download\ Demon
Disallow: /

User-agent: eCatch
Disallow: /

User-agent: EirGrabber
Disallow: /

User-agent: EmailSiphon
Disallow: /

User-agent: EmailWolf
Disallow: /

User-agent: Express\ WebPictures
Disallow: /

User-agent: ExtractorPro
Disallow: /

User-agent: EyeNetIE
Disallow: /

User-agent: FlashGet
Disallow: /

User-agent: GetRight
Disallow: /

User-agent: GetWeb!
Disallow: /

User-agent: Go!Zilla
Disallow: /

User-agent: Go-Ahead-Got-It
Disallow: /

User-agent: GrabNet
Disallow: /

User-agent: Grafula
Disallow: /

User-agent: HMView
Disallow: /

User-agent: HTTrack
Disallow: /

User-agent: Image\ Stripper
Disallow: /

User-agent: Image\ Sucker
Disallow: /

User-agent: Indy\ Library
Disallow: /

User-agent: InterGET
Disallow: /

User-agent: Internet\ Ninja
Disallow: /

User-agent: JetCar
Disallow: /

User-agent: JOC\ Web\ Spider
Disallow: /

User-agent: larbin
Disallow: /

User-agent: LeechFTP
Disallow: /

User-agent: Mass\ Downloader
Disallow: /

User-agent: MIDown\ tool
Disallow: /

User-agent: Mister\ PiX
Disallow: /

User-agent: Navroad
Disallow: /

User-agent: NearSite
Disallow: /

User-agent: NetAnts
Disallow: /

User-agent: NetSpider
Disallow: /

User-agent: Net\ Vampire
Disallow: /

User-agent: NetZIP
Disallow: /

User-agent: Octopus
Disallow: /

User-agent: Offline\ Explorer
Disallow: /

User-agent: Offline\ Navigator
Disallow: /

User-agent: PageGrabber
Disallow: /

User-agent: Papa\ Foto
Disallow: /

User-agent: pavuk
Disallow: /

User-agent: pcBrowser
Disallow: /

User-agent: RealDownload
Disallow: /

User-agent: ReGet
Disallow: /

User-agent: SiteSnagger
Disallow: /

User-agent: SmartDownload
Disallow: /

User-agent: SuperBot
Disallow: /

User-agent: SuperHTTP
Disallow: /

User-agent: Surfbot
Disallow: /

User-agent: tAkeOut
Disallow: /

User-agent: Teleport\ Pro
Disallow: /

User-agent: VoidEYE
Disallow: /

User-agent: Web\ Image\ Collector
Disallow: /

User-agent: Web\ Sucker
Disallow: /

User-agent: WebAuto
Disallow: /

User-agent: WebCopier
Disallow: /

User-agent: WebFetch
Disallow: /

User-agent: WebGo\ IS
Disallow: /

User-agent: WebLeacher
Disallow: /

User-agent: WebReaper
Disallow: /

User-agent: WebSauger
Disallow: /

User-agent: Website\ eXtractor
Disallow: /

User-agent: Website\ Quester
Disallow: /

User-agent: WebStripper
Disallow: /

User-agent: WebWhacker
Disallow: /

User-agent: WebZIP
Disallow: /

User-agent: Widow
Disallow: /

User-agent: WWWOFFLE
Disallow: /

User-agent: Xaldon\ WebSpider
Disallow: /

User-agent: Zeus
Disallow: /

# =============================================================================
# DIRECTORY AND FILE RESTRICTIONS
# =============================================================================

# Default rules for all other bots not specifically mentioned
User-agent: *
# Allow access to important SEO and public files
Allow: /robots.txt
Allow: /sitemap.xml
Allow: /sitemap*.xml
Allow: /.well-known/
Allow: /favicon.ico
Allow: /apple-touch-icon*.png
Allow: /browserconfig.xml
Allow: /manifest.json
# Block access to sensitive directories and files
Disallow: /admin/
Disallow: /administrator/
Disallow: /wp-admin/
Disallow: /wp-includes/
Disallow: /wp-content/plugins/
Disallow: /wp-content/themes/
Disallow: /cgi-bin/
Disallow: /tmp/
Disallow: /temp/
Disallow: /cache/
Disallow: /logs/
Disallow: /log/
Disallow: /backup/
Disallow: /backups/
Disallow: /config/
Disallow: /configuration/
Disallow: /includes/
Disallow: /inc/
Disallow: /lib/
Disallow: /libraries/
Disallow: /vendor/
Disallow: /node_modules/
Disallow: /private/
Disallow: /secure/
Disallow: /hidden/
Disallow: /internal/
Disallow: /test/
Disallow: /tests/
Disallow: /testing/
Disallow: /dev/
Disallow: /development/
Disallow: /staging/
Disallow: /beta/
Disallow: /alpha/
Disallow: /demo/
Disallow: /sample/
Disallow: /examples/
# Block access to sensitive file types
Disallow: /*.sql$
Disallow: /*.db$
Disallow: /*.log$
Disallow: /*.bak$
Disallow: /*.old$
Disallow: /*.tmp$
Disallow: /*.conf$
Disallow: /*.config$
Disallow: /*.ini$
Disallow: /*.env$
Disallow: /*.key$
Disallow: /*.pem$
Disallow: /*.crt$
Disallow: /*.p12$
Disallow: /*.pfx$
Disallow: /*~$
Disallow: /*.swp$
Disallow: /*.DS_Store$
Disallow: /Thumbs.db$
# Block access to common CMS and framework files
Disallow: /wp-config.php
Disallow: /wp-config-sample.php
Disallow: /wp-settings.php
Disallow: /wp-load.php
Disallow: /wp-blog-header.php
Disallow: /xmlrpc.php
Disallow: /readme.html
Disallow: /readme.txt
Disallow: /license.txt
Disallow: /changelog.txt
Disallow: /install.php
Disallow: /upgrade.php
Disallow: /setup.php
Disallow: /config.php
Disallow: /configuration.php
Disallow: /settings.php
Disallow: /.htaccess
Disallow: /.htpasswd
Disallow: /web.config
Disallow: /composer.json
Disallow: /composer.lock
Disallow: /package.json
Disallow: /package-lock.json
Disallow: /yarn.lock
Disallow: /Gruntfile.js
Disallow: /gulpfile.js
Disallow: /webpack.config.js
# Block access to version control and development files
Disallow: /.git/
Disallow: /.svn/
Disallow: /.hg/
Disallow: /.bzr/
Disallow: /CVS/
Disallow: /.gitignore
Disallow: /.gitattributes
Disallow: /.editorconfig
Disallow: /.eslintrc
Disallow: /.jshintrc
Disallow: /.sass-cache/
Disallow: /.vscode/
Disallow: /.idea/
Disallow: /.sublime-project
Disallow: /.sublime-workspace
# Set crawl delay for general bots to prevent server overload
Crawl-delay: 5

# =============================================================================
# SITEMAP DECLARATIONS
# =============================================================================

# Main sitemap location for search engines
Sitemap: https://ledge-live.org/sitemap.xml

# Additional sitemaps (if applicable)
Sitemap: https://ledge-live.org/sitemap-pages.xml
Sitemap: https://ledge-live.org/sitemap-blog.xml
Sitemap: https://ledge-live.org/sitemap-images.xml

# =============================================================================
# ADDITIONAL NOTES
# =============================================================================

# This robots.txt file is designed to:
# 1. Allow all major search engines full access for optimal SEO
# 2. Block harmful AI bots and content scrapers
# 3. Protect sensitive directories and files
# 4. Prevent server overload with appropriate crawl delays
# 5. Provide clear sitemap locations for better indexing
#
# Last updated: January 2025
# Website: https://ledge-live.org/
# Contact: webmaster@ledge-live.org (for legitimate crawling requests)