The Robots Exclusion Protocol, or robots.txt protocol, is a convention that prevents cooperating web spiders and other web robots from accessing all or part of a website that is otherwise publicly viewable.
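For example, a site can publish a file like the following at `https://example.com/robots.txt` (the paths and sitemap URL are illustrative):

```
# Rules for all crawlers
User-agent: *
Disallow: /private/
Allow: /private/public-page.html
Crawl-delay: 10
Sitemap: https://example.com/sitemap.xml
```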
This project provides an easy-to-use class, implemented in C#, to work with robots.txt files.
- Loading robots.txt files from a URL or from the file content
- Easy-to-use, fluent interface
- Supports multiple User-Agents
- Supports different types of entries:
  - Disallow entries
  - Allow entries
  - Sitemap entries
  - Crawl-delay entries
- Supports comments
- Supports wildcards (both * and $, as sketched below)
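As a rough illustration of how wildcard rules can be evaluated (a minimal, self-contained sketch of the technique, not this library's actual implementation), a robots.txt path pattern can be translated into an anchored regular expression, where `*` matches any sequence of characters and a trailing `$` pins the pattern to the end of the URL path:

```csharp
using System;
using System.Text.RegularExpressions;

static class RobotsWildcardSketch
{
    // Converts a robots.txt path pattern into a regex and tests a URL path against it.
    static bool Matches(string pattern, string path)
    {
        // A trailing '$' means the pattern must match the entire path, not just a prefix.
        bool anchored = pattern.EndsWith("$");
        if (anchored) pattern = pattern.Substring(0, pattern.Length - 1);

        // Escape regex metacharacters, then re-enable '*' as "any sequence of characters".
        string regex = "^" + Regex.Escape(pattern).Replace(@"\*", ".*");
        if (anchored) regex += "$";

        return Regex.IsMatch(path, regex);
    }

    static void Main()
    {
        Console.WriteLine(Matches("/private/*", "/private/data.html")); // True
        Console.WriteLine(Matches("/*.php$", "/index.php"));            // True
        Console.WriteLine(Matches("/*.php$", "/index.php?x=1"));        // False: '$' anchors the end
    }
}
```

Without the `$` anchor, a pattern is treated as a prefix match, which is why `/private/*` and plain `/private/` block the same URLs; the anchor only matters for patterns like `/*.php$` that must end exactly where the pattern does.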