Below is a categorization of possible useful Wikipedia bots. None of these possibilities are necessarily recommended or implemented.
Examples of actual bots used on Wikipedia can be found at history of Wikipedia bots.
Editing bots
- Automatic importer-by-request: This bot imports entries from a public-domain or GFDL database, one manual request at a time. No user interface for browsing candidate entries has been implemented.
- Automatic importer: This bot imports batches of entries from a public-domain or GFDL database. If used, it is expected to produce Wikipedia entries that are as well-formed as possible. See the history of Wikipedia bots for examples.
- Other automatic tools and scripts: This includes spell checkers, wikifiers, etc. The possibilities are endless.
- Anti-vandalism: Finds pages that have been blanked or nearly blanked: if, by a weighted word dictionary, the new version is far less significant than the old version, the bot notes it on a maintenance page and reverts. Newer anti-vandalism bots are more extensive and have been able to automatically revert a large portion of the vandalism that occurs.
- Ban enforcement: Finds and reverts changes by suspicious new users, shared IPs, or hosting IPs (open ports) to pages targeted by sockpuppets, as a possible recurrence of a banned user editing from alternate IP addresses. Longer-established users can restore any such edits that do not appear to be by the banned user in question.
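The "weighted dictionary" blanking check described above can be sketched in a few lines. This is a hypothetical illustration, not any actual bot's code: it scores each revision by the total weight of its words (stopwords count less) and flags the edit when the new version retains only a small fraction of the old version's significance.

```python
import re

# Low-value words that contribute little to a page's significance.
# The list and weights here are illustrative assumptions.
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "it"}

def significance(text):
    """Weight a revision: content words count more than stopwords."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(0.1 if w in STOPWORDS else 1.0 for w in words)

def looks_blanked(old_text, new_text, ratio=0.1):
    """Flag an edit when the new version keeps less than `ratio` of the
    old version's significance (a blanked or nearly blanked page)."""
    old_score = significance(old_text)
    if old_score == 0:
        return False  # nothing significant to lose
    return significance(new_text) / old_score < ratio
```

A real bot would run this on recent changes and post flagged diffs to a maintenance page rather than reverting blindly.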
WikiProject tagging and auto-assessment bots
WikiProject tagging bots add WikiProject banner templates to talk pages on behalf of WikiProjects based on a set of instructions (for example, a list of categories).
Automatic assessment bots also tag specific project banners with the appropriate class where possible (e.g. stub, FA, FL). When visiting the talk page of an article that has been auto-assessed, you should remove the "|auto=???" parameter from the template and, if necessary, re-assess the article based on the WikiProject assessment scale.
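The core step such a tagging bot performs can be sketched as follows. The banner name, class value, and function are hypothetical illustrations of the general pattern: add the project banner to a talk page unless it is already there, and mark the machine-made assessment with an auto parameter so human editors can review it.

```python
def tag_talk_page(talk_text, banner="WikiProject Example", assessment="Stub"):
    """Prepend a project banner template unless the page already has it.

    `banner` and `assessment` are placeholder values; a real bot would
    take them from its WikiProject's instructions (e.g. a category list).
    """
    if "{{" + banner in talk_text:
        return talk_text  # already tagged; leave the page alone
    # |auto=yes flags the class as bot-assigned, pending human review
    template = "{{%s|class=%s|auto=yes}}\n" % (banner, assessment)
    return template + talk_text
```

The function is idempotent, so re-running the bot over the same talk pages does not stack duplicate banners.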
Non-editing bots
- Data miner
- A tool which attempts to use information extraction techniques to extract structured information from Wikipedia. If you want to do this, it is preferable to download a database dump and run the bot on your own server. You will get vastly better performance, and will not interfere with other Wikipedia users, or cause unneeded network traffic. The only disadvantage of this is that your copy of Wikipedia will not incorporate the most recent changes; but this should not be too big an issue for most information extraction applications. (You can always redownload and rerun the application at a later date.)
- Vandalism identifier
- Uses heuristics to search for possible uncorrected vandalism (surviving changes by known vandals, a curse-word list, edits similar to those made by other vandals, etc.).
- Copyright violation identifier
- Similar to vandalism identifier, compares chunks of text on new pages to what already exists on the internet; reports possible infringements to a page where human editors can review. One example of this type of bot would be CorenSearchBot.
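The text-comparison step a copyright violation identifier performs can be sketched with word-level shingling: break both texts into overlapping n-grams and measure their Jaccard overlap. This is a simplified illustration of the general technique, not CorenSearchBot's actual method, and the step of fetching candidate web pages is omitted.

```python
def shingles(text, n=5):
    """Return the set of overlapping n-word sequences in `text`."""
    words = text.lower().split()
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}

def jaccard_overlap(a, b, n=5):
    """Fraction of shared n-gram shingles between two texts (0..1).
    A high score suggests one text was copied from the other."""
    sa, sb = shingles(a, n), shingles(b, n)
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)
```

A bot using this would report pages scoring above some threshold against an external source to a page where human editors can review the possible infringement.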
See also
- Wikipedia:Bots/Status (this page will need updating), with explanation of what each bot does.
- Wikipedia:List of bots by number of edits
- Help:MakeBot, allows bureaucrats to grant and revoke bot status from user accounts.
- meta:bot
- m:Bot policy
- meta:Countervandalism Network/Bots
- m:Using_the_python_wikipediabot
- Wikipedia:Semi-bots (guideline proposal started 15 April 2006)