Erwin v1.1.0 -- A module-based IRC bot in Perl

Alex Brasetvik


Table of Contents
1. What is Erwin?
1.1. Just another IRC bot?
1.2. Features
2. User guide
2.1. Installation
2.1.1. Requirements
2.1.2. Installing from source
2.2. Configuration
2.2.1. Database setup
2.2.2. Configuring Erwin
2.2.3. Stop words
2.2.4. Bundled modules
2.2.5. Running Erwin
2.3. Using Erwin
2.3.1. Core
2.3.2. Users
2.3.3. Fact database
2.3.4. Paste bin
2.3.5. Kernel
2.3.6. URLAnnotate
2.4. Advanced configuration
2.4.1. Writing regexes

Chapter 1. What is Erwin?

1.1. Just another IRC bot?

Erwin is an IRC bot in Perl, built on POE and written with modularity and expandability in mind. Installing and configuring modules is easy compared to many other Perl bots, where you have to read the source and add your own code to the various event loops.

Adding additional functionality is as easy as writing a subroutine that returns something and configuring a regex to use that subroutine.

Further expandability can be achieved using "plug-ins". Plug-ins subclass "Erwin::Plugin", and can receive and take control over various events. For example, the pastebin plug-in needs to take control over every line a user sends privately, and a quiz module would need to get every message sent to a channel while a quiz is going on.


1.2. Features

Note: Additional modules may exist in Git that are not yet included in the distribution, as they are not considered stable.

  • Very easy to add regexes and write modules.

  • Advanced fact database.

  • List kernel releases on kernel.org.

  • Paste bin with support for multiple channels -- very useful for help channels.


Chapter 2. User guide

2.1. Installation

2.1.1. Requirements

Erwin is developed and tested on Linux, using Debian and Fedora. A database server is needed only if you are going to use the fact database; Erwin supports PostgreSQL and MySQL, and probably also MariaDB. Erwin uses the following Perl modules, all of which are available on CPAN and most likely in Debian:

  • POE

  • POE::Component::IRC

  • DBI (Required for the fact module)

  • DBD::Pg or DBD::mysql (Required for the fact module)

  • Getopt::Long

  • GDBM_File

  • MLDBM::Sync

  • Digest::MD5

  • Locale::gettext

  • Term::ReadKey

  • Log::LogLite


2.1.2. Installing from source

2.1.2.1. Install Perl modules

You have to install every module listed in the requirements, except those required only by Erwin modules you will not use.

The preferred way is to install them using your distribution's package system, such as rpm, urpmi, Portage, Ports or whatever.

If not, you can always use CPAN. Run perl -MCPAN -e 'install Foo::Bar' as root, where Foo::Bar is the name of the module you want to install.
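
To see which modules still need installing, a small check script can help. The following is a minimal sketch and not part of Erwin: the helper name missing_modules is made up for illustration, and the module list is copied from the requirements above (add DBD::Pg or DBD::mysql if you use the fact module).

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Module names copied from the requirements list; trim the list to match
# the Erwin modules you intend to use.
my @required = qw(
    POE POE::Component::IRC DBI Getopt::Long GDBM_File
    MLDBM::Sync Digest::MD5 Locale::gettext Term::ReadKey Log::LogLite
);

# Hypothetical helper: returns the names that cannot be loaded.
sub missing_modules {
    my @names = @_;
    my @missing;
    for my $name (@names) {
        # Turn Foo::Bar into Foo/Bar.pm, the form require expects for a path.
        (my $file = "$name.pm") =~ s{::}{/}g;
        push @missing, $name unless eval { require $file; 1 };
    }
    return @missing;
}

my @missing = missing_modules(@required);
if (@missing) {
    print "Missing: @missing\n";
}
else {
    print "All listed modules are installed.\n";
}
```

Each name it prints can then be installed through your package system or with the CPAN command shown above.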


2.1.2.2. Install Erwin

With the Perl modules installed, you are ready to run Erwin from the source directory.

However, if you want to install it globally so all users on the system can use Erwin, run make install in the top-level directory. (Note: there is no need to run make.)


2.2. Configuration

2.2.1. Database setup

The database schemas are located in the src/sql subdirectory, or in /usr/share/erwin.


2.2.1.1. Using PostgreSQL

PostgreSQL is the preferred database to use with Erwin.

Create a user and database and allow the user to use the database by editing pg_hba.conf. For detailed information, consult PostgreSQL's documentation.

Set up the table by running something like psql database username < erwin.pgsql.


2.2.1.2. Using MySQL

Create a user and database and allow the user to use the database with GRANT. For detailed information, consult MySQL's documentation.

Set up the table by running something like mysql -u username -p database < erwin.mysql.


2.2.2. Configuring Erwin

Two configuration files are included with Erwin. Do not edit config.defaults, as it will change in future releases; put your settings in conf.pl, which is a sample configuration file.

The configuration file is thoroughly commented and should be pretty straightforward to set up. Just make sure you do not make any syntax errors :-)


2.2.3. Stop words

Stop words are common words, such as "the", "in", etc., that you don't want your bot to store as keywords.

Databases for English and Norwegian stop words are included in the source distribution.


2.2.4. Bundled modules

This section explains the configuration of the bundled modules.

Most configuration variables can vary based on where the event that triggers them occurs. An example:


Core => {
	#Antiflood: Time in seconds to not write the same text the same place.
	Antiflood => 600, #Global
	privmsg => {
		Antiflood => 60, #Local
	},
	'#channel' => {
		Antiflood => 300, #Local
	},
}
		

Here, "Antiflood" will be 600 everywhere but in private messages and in "#channel", where it will be 60 and 300 respectively.
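
The fallback from local to global values can be pictured as a two-step lookup. The sketch below only illustrates that rule and is not Erwin's own code; the helper name setting_for and the %config variable are assumptions made for the example.

```perl
use strict;
use warnings;

my %config = (
    Core => {
        Antiflood  => 600,                   # Global
        privmsg    => { Antiflood => 60 },   # Local
        '#channel' => { Antiflood => 300 },  # Local
    },
);

# Hypothetical helper, not Erwin's actual code: prefer the value set for
# the place where the event occurred, otherwise fall back to the global one.
sub setting_for {
    my ($module, $name, $where) = @_;
    my $section = $config{$module} or return;
    my $local   = $section->{$where};
    return $local->{$name}
        if ref $local eq 'HASH' && exists $local->{$name};
    return $section->{$name};
}

print setting_for('Core', 'Antiflood', 'privmsg'),  "\n";  # 60
print setting_for('Core', 'Antiflood', '#channel'), "\n";  # 300
print setting_for('Core', 'Antiflood', '#other'),   "\n";  # 600
```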


2.2.4.1. Core

Core is a proxy module that sends various events to the right places, and sends events back in a sane manner.

Configuration variables for "Core":

Antiflood

Time in seconds during which the same text is not written to the same place again.

cron_interval

Time in seconds between every run of the cron-procedure -- which cleans up variables, etc. Reserved for future use.


2.2.4.2. Users

Users takes care of registering and authenticating users, logging out, changing passwords, adding users and granting them privileges.

Configuration variables for "Users":

Public

Whether or not registration is open to the public. If not, users can only be added through erwin --adduser or via this module's adduser command.


2.2.4.3. Facts

Facts takes care of adding, editing and deleting facts and pointers.

Note that you can define different data sources for different channels! :-)

Configuration variables for "Facts":

keyword_length

Maximum length of a keyword. Remember to update the database schema.

fact_length

Maximum length of a fact. Remember to update the database schema.

driver

Database driver. Either "Pg" for PostgreSQL or "mysql" for MySQL. (Mind the case.)

database

Name of the database to connect to.

username

Username to use when connecting.

password

Password to use when connecting.

table

Table the facts are stored in.
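
Put together, a "Facts" section might look like the fragment below. This is a sketch only: every value (database name, credentials, table names, lengths) is a placeholder, and the per-channel block assumes that "Facts" settings can be overridden the same way as in the "Core" example above.

```perl
Facts => {
	# Global data source. All values here are placeholders.
	driver         => 'Pg',       # or 'mysql' (mind the case)
	database       => 'erwin',
	username       => 'erwin',
	password       => 'secret',
	table          => 'facts',
	keyword_length => 64,         # keep in sync with the database schema
	fact_length    => 400,        # keep in sync with the database schema
	# Assumed per-channel override: a separate table for one channel.
	'#channel' => {
		table => 'facts_channel',
	},
},
```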


2.2.4.4. Paste

Paste is a module to make it easy to paste lots of command output or whatever without messing up the channel.

Note: Every channel needs to be configured explicitly:


{
	url_prefix => 'http://somedomain.com/channel/paste',
	path => '/var/www/somedomain.com/channel/paste',
}
			

Note: Erwin does not bother deleting old pastes, so you might want to add something like the following to your crontab:


0 0 * * * find ~/public_html/somechannel/ -iname \*.html -mtime +5 -exec rm {} \; #Delete pastes older than 5 days.
			

Configuration variables for "Paste"

template

The template to use.

expire

Maximum number of seconds between starting and closing a paste.

url_prefix

The URL prefix. Do not add a trailing slash ("/").

path

The directory in which to store the pastes.


2.2.4.5. Kernel

Kernel fetches the latest kernel versions and displays them as a comma-separated list.

Configuration variables for "Kernel":

URL

URL to fetch kernel versions from.


2.2.4.6. URLAnnotate

URLAnnotate fetches the title of URLs pasted into a channel.

Configuration variables for "URLAnnotate":

enabled

Boolean value. True enables URL annotating.

timeout

Time out after this many seconds. A slow page will block the bot until the timeout expires.

max_doc_size

Maximum size for document.


2.2.4.7. Nickspam

Nickspam warns people who are "nick spamming" and tells them how many times they've received the warning.

Later releases will include an easy way to see how many times a person has "spammed".

Configuration variables for "Nickspam":

expire

Time in seconds to store warning count.

#channel

If defined, the person will be informed that it's against #channel's rules.

#channel2

Ditto.

Note that the "expire"-value is global and may not be overridden in channels.


2.2.4.8. NoRepeat

NoRepeat warns people who repeat the exact same sentence.

Configuration variables for "NoRepeat":

time

Time in seconds during which a user isn't allowed to repeat the same sentence.

length

The length of the sentence must be longer than or equal to this value.


2.2.4.9. NoPaste

NoPaste warns people who seem to be pasting and tells them to use a paste bin instead.

Configuration variables for "NoPaste":

length

The length of the messages must be longer than or equal to this value to be counted.

count

The number of messages needed for it to be considered pasting.

interval

Time interval in seconds.
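
One way to read the three settings together: a user is considered to be pasting after sending "count" messages of at least "length" characters within "interval" seconds. The sketch below illustrates that reading and is not Erwin's implementation; the helper name looks_like_pasting, the %recent bookkeeping and the example values are all assumptions.

```perl
use strict;
use warnings;

# Values standing in for the NoPaste settings; the numbers are examples.
my %nopaste = (length => 40, count => 3, interval => 10);

my %recent;  # nick => timestamps of that nick's recent long messages

# Hypothetical helper: returns 1 when $nick appears to be pasting.
sub looks_like_pasting {
    my ($nick, $message, $now) = @_;
    return 0 if length($message) < $nopaste{length};

    # Keep only timestamps inside the interval, then add this message.
    my @times = grep { $now - $_ < $nopaste{interval} }
                @{ $recent{$nick} || [] };
    push @times, $now;
    $recent{$nick} = \@times;

    return @times >= $nopaste{count} ? 1 : 0;
}

my $line = 'x' x 50;
print looks_like_pasting('alice', $line, 100), "\n";  # 0
print looks_like_pasting('alice', $line, 101), "\n";  # 0
print looks_like_pasting('alice', $line, 102), "\n";  # 1
```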


2.2.5. Running Erwin

Before running, you probably want to add a user for yourself. Run erwin -c /path/to/configuration/file --adduser and follow the instructions. Give your user the "a"-flag.

Before running Erwin, or at least before reloading it, you should do a test run. Do this by running erwin -t. You may ignore invalid regexes for modules you don't intend to use.

Once Erwin is configured, run it by typing erwin -c /path/to/configuration/file.

Sample init files are included in the Debian packages.


2.3. Using Erwin

This chapter explains the use of the various modules, with the default regexes.


2.3.1. Core

2.3.1.1. Join and Part

These commands require the administrator flag.

/msg yourbot !join #channel tells your bot to join "#channel".

/msg yourbot !part #channel tells your bot to leave "#channel".


2.3.1.2. Rehash

This command requires the administrator flag.

/msg yourbot !rehash tells your bot to rehash configuration and modules.

Note: Rehashing does not currently pick up edited regexes or removed configuration variables.


2.3.2. Users

2.3.2.1. Log in

To use any functionality that requires permissions, you have to log in.

Log in by typing /msg yourbot !auth username password. !auth is the command. "username" and "password" are arguments. You have to log in every time you change your hostmask.


2.3.2.2. Register

If registering is publicly available, users may type /msg yourbot !register username password to get a user. This user may then be granted permissions.


2.3.2.3. Grant permissions

Administrator users may grant users permissions by typing /msg yourbot !grant username flags, where "flags" is e.g. "at" for both administrator and teacher access. You may not edit your own user.


2.3.2.4. Change password

Type /msg yourbot !passwd old new to change your password.


2.3.2.5. Delete users

Administrator users may delete other users, including other administrators. Type /msg yourbot !delete username t.

You may not delete yourself.


2.3.2.6. Log out

To make Erwin forget your hostmask, type /msg yourbot !logout.


2.3.3. Fact database

2.3.3.1. Adding facts

These commands require the "teacher"-flag.

yourbot: keyword is/are explanation

Saves the fact which may be accessed using "keyword". The keyword is everything between the bot's nick and the verb "is" or "are". The bot's nick will not be part of the fact.

keyword *is/*are explanation

Saves the fact which may be accessed using "keyword". The keyword is everything up to the verb "is" or "are". The fact will be saved without the asterisk (*).

This explanation has the /* keyword */ in the middle of the sentence

Saves the fact which may be accessed using "keyword". The fact will be saved without the "/* */"-s.

no, yourbot, keyword is something else

Overwrites the fact with that keyword.


2.3.3.2. Deleting facts

This command requires the "teacher"-flag.

yourbot: forget keyword

Deletes the fact.


2.3.3.3. Modifying facts

This command requires the "teacher"-flag.

keyword =~ s/old/new/

keyword =~ s/old/new/i

keyword =~ s/old/new/g

keyword =~ s/old/new/gi

Replaces "old" with "new".

Parameters:

  • i case-insensitive

  • g globally

  • gi globally, case-insensitive
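
The effect of the flags can be tried out with a few lines of Perl. This is a minimal sketch, not the code Erwin uses: the helper name modify_fact and the hash standing in for the database are made up, and treating "old" as a regex rather than plain text is an assumption.

```perl
use strict;
use warnings;

# Hypothetical helper: parse "keyword =~ s/old/new/flags" and apply it to
# the stored text. Returns the new text, or nothing if the command does
# not parse or the keyword is unknown.
sub modify_fact {
    my ($facts, $command) = @_;
    my ($keyword, $old, $new, $flags) =
        $command =~ m{^(.+?) =~ s/(.*?)/(.*?)/(g?i?)\z}
        or return;
    return unless exists $facts->{$keyword};

    my $pattern = $flags =~ /i/ ? qr/$old/i : qr/$old/;
    if ($flags =~ /g/) { $facts->{$keyword} =~ s/$pattern/$new/g }
    else               { $facts->{$keyword} =~ s/$pattern/$new/  }
    return $facts->{$keyword};
}

my %facts = (erwin => 'a bot, a Bot, a BOT');
print modify_fact(\%facts, 'erwin =~ s/bot/cat/'),   "\n";  # a cat, a Bot, a BOT
print modify_fact(\%facts, 'erwin =~ s/bot/dog/gi'), "\n";  # a cat, a dog, a dog
```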


2.3.3.4. Pointers

This command requires the "teacher"-flag.

pointer => fact

Creates a pointer. If a pointer is looked up or modified, the original is looked up and returned or modified instead.

Deleting a pointer does not delete the original, but deleting a fact deletes all pointers to it.

If a fact with the same keyword as a pointer is saved, the new fact overwrites the existing pointer.
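
The pointer rules above can be modelled with two hashes. This is only a sketch of the described behaviour, assuming in-memory storage; Erwin keeps facts in a database, and the helper names lookup, learn and forget are made up.

```perl
use strict;
use warnings;

my %facts    = (perl  => 'a programming language');
my %pointers = (camel => 'perl');   # pointer => keyword of the original fact

# Hypothetical helpers that mirror the rules above; not Erwin's storage code.
sub lookup {
    my ($key) = @_;
    $key = $pointers{$key} if exists $pointers{$key};   # dereference
    return $facts{$key};
}

sub learn {
    my ($key, $text) = @_;
    delete $pointers{$key};          # a new fact overwrites a pointer
    $facts{$key} = $text;
    return;
}

sub forget {
    my ($key) = @_;
    if (exists $pointers{$key}) {    # deleting a pointer keeps the original
        delete $pointers{$key};
        return;
    }
    delete $facts{$key};
    # Deleting a fact also removes every pointer to it.
    delete @pointers{ grep { $pointers{$_} eq $key } keys %pointers };
    return;
}

print lookup('camel'), "\n";         # a programming language
forget('perl');
my $state = exists $pointers{camel} ? 'kept' : 'gone';
print "$state\n";                    # gone
```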


2.3.3.5. Looking up keywords

  • literal keyword looks up the keyword "literally". Special variables aren't substituted, and pointers aren't dereferenced. Antiflood is overridden. This command requires the "teacher"-flag.

  • what is keyword? looks up the keyword and returns it if it exists.

  • yourbot: keyword(?) looks up the keyword and returns it if it exists.

  • keyword? looks up the keyword and returns it if it exists.


2.3.3.6. Redirecting keywords

If you want to tell others about some keyword, there are two ways to do it: within the channel, or in a private message. If multiple data sources are used, the keyword is looked up in the data source of the place where the command is sent.

  • tell nick about keyword sends a private message telling "nick" about "keyword".

  • yourbot: keyword, nick explains "keyword" to "nick" where it is typed. If "keyword" contains a "$who"-variable, that variable is substituted. If not, the fact is prefixed with "nick: ".


2.3.3.7. Special variables

The following is a list of special variables which may be used:

$WHO

Whoever is being talked to.

$X or $KEY

The keyword used to access the fact.

$MDAY, $MONTH, $MNAME

Month day, month, and month name respectively.

$YEAR, $HOUR, $MIN, $SEC

Year, hour, minute, and second respectively.

$RAND(X)

A random integer between 0 and X.

<reply>

Everything after <reply> is sent. (Nothing before).

<action>

Everything after <action> is sent as an action.

|||

Splits the string into alternatives and picks one of them randomly.
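
To make the substitutions concrete, here is a minimal sketch that expands a few of the variables. It is not Erwin's code: the helper name expand_fact and its return convention are made up, variable names are matched case-insensitively because this guide writes both "$WHO" and "$who", and "$RAND(X)" is assumed to include both 0 and X.

```perl
use strict;
use warnings;

# Hypothetical helper: expand some of the variables listed above.
# Returns (type, text), where type is 'action', 'reply' or 'plain'.
sub expand_fact {
    my ($fact, $who, $key) = @_;

    # "|||" separates alternatives; pick one at random.
    my @choices = split /\|\|\|/, $fact;
    return ('plain', '') unless @choices;
    my $text = $choices[ int rand @choices ];

    $text =~ s/\$WHO\b/$who/gi;
    $text =~ s/\$(?:X|KEY)\b/$key/gi;
    # Assumption: the range 0..X includes both ends.
    $text =~ s/\$RAND\((\d+)\)/int rand($1 + 1)/gie;

    return ('action', $1) if $text =~ /<action>\s*(.*)/;
    return ('reply',  $1) if $text =~ /<reply>\s*(.*)/;
    return ('plain', $text);
}

my ($type, $text) = expand_fact('<action>hands $WHO a $X', 'alice', 'cookie');
print "$type: $text\n";   # action: hands alice a cookie
```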


2.3.4. Paste bin

All this has to be done in private message.


2.3.4.1. Start a pasting session

Type /msg yourbot !paste some title that's not a question. The bot will send instructions to follow. It might ask which channel the URL should be announced in if you are on multiple channels with paste bins.


2.3.4.2. Paste

Read the instructions carefully and paste when the bot tells you to.

Be careful not to flood yourself off the server.


2.3.4.3. Announce the URL

When you are done pasting, type "!pasted", and the bot will generate an HTML-document and send a URL to the specified channel.


2.3.5. Kernel

!kernel fetches the latest kernel versions.


2.3.6. URLAnnotate

Type e.g. http://www.kernel.org, and your bot will output "[The Linux Kernel Archives]".


2.4. Advanced configuration

2.4.1. Writing regexes

A regular expression, or regex for short, is what makes Erwin do something when it receives text.

If you don't know how to write regexes, read the short tutorial at www.perldoc.com.

Erwin's list of regexes is an array of hashes. Each hash is a regex with its configuration. When a regex is triggered, one of the following happens:

  • A subroutine in a module is run, with backrefs as arguments.

  • Some plain text is written. The text may include the same special variables mentioned above.

  • A keyword is looked up.

The users which may trigger a regex can be configured by using flags.

Special configuration variables to subroutines may be specified.
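
As an illustration of the first case, the sketch below pairs a small subroutine with a regex entry that could trigger it. Everything in it is hypothetical: the Dice package, the Roll sub and the "!roll" command are made up, and the last few lines are only a stand-in for what the bot does on a match, assuming backrefs arrive as plain arguments and the returned string is what gets sent.

```perl
use strict;
use warnings;

# Hypothetical module: the package and sub names are made up for this example.
package Dice;

sub Roll {
    my ($sides) = @_;
    return 'A die needs at least one side.' if !$sides || $sides < 1;
    return 'Rolled ' . (1 + int rand $sides) . " on a d$sides.";
}

package main;

# A regex entry that could route "!roll 6" to the sub above.
my $entry = {
    regex => '^!roll (\d+)$',
    sub   => 'Dice::Roll',
};

# Stand-in for the bot: match the text, call the named sub with the backrefs.
if (my @backrefs = '!roll 6' =~ /$entry->{regex}/) {
    no strict 'refs';
    print &{ $entry->{sub} }(@backrefs), "\n";
}
```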


2.4.1.1. Special variables

The following variables may be used in regexes:

$NICK

The current nick of the bot.

$MYNICK

The current nick of the bot, as well as common delimiters, such as ",", ":", " ", etc.

$SOMENICK

Anything assumed to be a nick -- which is any valid nick with common delimiters.


2.4.1.2. Options

regex

The actual regex.

content

The content will be sent where the regex is triggered. $who is who triggered it.

The same special keywords that may occur in a fact are valid.

pointer

Pointer to a fact.

sub

The name of a subroutine.

flags

Flags needed to trigger the regex.

nohalt

Boolean. Default: false. If true, matching continues with the following regexes after this one is hit.

args

Arguments to a subroutine.

privmsg

Boolean. Default: false. The regex is only valid in private messages, if true.

public

Boolean. Default: false. The regex is only valid in public messages, if true.


2.4.1.3. Examples


{
	regex => 'i\'?m hungry',
	content => '<action>throws a cookie on $who.'
}

The bot will throw a cookie at anybody typing "im hungry" or "i'm hungry". (The "?"-character has a special meaning.)


{
	regex => '(msg|message) me',
	pointer => 'msg me', #Explains to $who how lame "msg me" is.
}

The bot will look up the keyword specified ("msg me") and send it to whoever writes anything including "msg me" or "message me".


{
	regex => '$MYNICK: (.*?) (=) (.*)',
	sub => 'Facts::Learn',
	flags => 't',
}

Another way to learn. (Notice the flags.)