An IRC Markov chain bot https://dagbot.tech

dagbot

An IRC Markov chain chatbot with a simple pluggable command system, written for Python 3.6. The legacy Python 2 version remains on its own branch but is, and will remain, unmaintained.

Intro

Markov bots are based on the simple idea of Markov chains. They usually require a substantially large corpus (a flat text file of known good phrases / sentences). One could call this the "brain" of dagbot. Without a large corpus, Markov bots will usually generate gibberish phrases that look nothing like what a human would say.
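
To make the idea concrete, here is a minimal sketch of a word-level Markov chain. It is an illustration only, not dagbot's actual code; the function names build_chain and generate and the two-sentence corpus are made up for this example.

```python
import random
from collections import defaultdict


def build_chain(sentences, chain_length=2):
    """Map each run of `chain_length` words to the words seen right after it."""
    chain = defaultdict(list)
    for sentence in sentences:
        words = sentence.split()
        for i in range(len(words) - chain_length):
            key = tuple(words[i:i + chain_length])
            chain[key].append(words[i + chain_length])
    return chain


def generate(chain, seed, max_words=20, rng=random):
    """Extend `seed` (a tuple of words) until no continuation is known."""
    words = list(seed)
    key_len = len(seed)
    while len(words) < max_words:
        followers = chain.get(tuple(words[-key_len:]))
        if not followers:
            break
        words.append(rng.choice(followers))
    return " ".join(words)


# Hypothetical toy corpus; a real brain file needs far more sentences.
corpus = ["the cat sat on the mat", "the cat ate the fish"]
chain = build_chain(corpus)
print(generate(chain, ("the", "cat")))
```

With a corpus this small the bot can only replay what it has seen, which is why a large brain file matters.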

Details

Commands have a very simple interface that tells the bot which keywords are triggers & which class should handle those keywords.
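
As a rough sketch of what such an interface could look like: the class names, the handle signature and the "!" prefix below are all assumptions made for illustration, not dagbot's real API.

```python
class Command:
    """Hypothetical base class: subclasses list their triggers and handle them."""
    triggers = ()

    def handle(self, user, channel, args):
        raise NotImplementedError


class EchoCommand(Command):
    """Made-up example command that repeats its arguments."""
    triggers = ("echo", "say")

    def handle(self, user, channel, args):
        return " ".join(args)


def build_registry(command_classes):
    """Map every trigger keyword to an instance of the class that handles it."""
    registry = {}
    for cls in command_classes:
        instance = cls()
        for trigger in cls.triggers:
            registry[trigger] = instance
    return registry


def dispatch(registry, user, channel, message, prefix="!"):
    """Route a message such as '!echo hi' to the command registered for 'echo'."""
    if not message.startswith(prefix):
        return None
    parts = message[len(prefix):].split()
    if not parts:
        return None
    command = registry.get(parts[0])
    if command is None:
        return None
    return command.handle(user, channel, parts[1:])


registry = build_registry([EchoCommand])
print(dispatch(registry, "alice", "#f1", "!echo hello world"))
```

Messages that do not match any trigger return None here, which would leave them free to be handled by the Markov side of the bot.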

All configuration data for the bot is defined & validated via jsonschema. The config_schema.json file defines the main configuration data that dagbot uses (IRC channels, response rate etc). It also serves as documentation of the configuration format & what each setting means.

The location of each command's configuration data is configurable, but by default it goes in the commands/config directory. Command config is totally separate from the main bot configuration and is fully customizable. At runtime, all commands are pulled in automatically from the commands directory.
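
One common way to pull in every module from a directory at runtime uses importlib from the standard library. The sketch below shows that general technique and is not necessarily how dagbot does it; load_command_modules and the TRIGGERS attribute are hypothetical names.

```python
import importlib.util
import pathlib
import tempfile


def load_command_modules(directory):
    """Import every public .py file in `directory`; return {module name: module}."""
    modules = {}
    for path in sorted(pathlib.Path(directory).glob("*.py")):
        if path.name.startswith("_"):
            continue  # skip __init__.py and private helpers
        spec = importlib.util.spec_from_file_location(path.stem, path)
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)
        modules[path.stem] = module
    return modules


# Demo against a throwaway directory holding one made-up command module.
with tempfile.TemporaryDirectory() as tmp:
    pathlib.Path(tmp, "countdown.py").write_text("TRIGGERS = ['countdown']\n")
    loaded = load_command_modules(tmp)

print(sorted(loaded))
```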

The logic for Markov responses is fairly simple. First, the pattern library is used to try to find interesting words in a phrase, such as its subject. If none can be found, we fall back to just picking the first n (Markov chain length) words of the phrase that was sent to the bot.
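
The fallback can be sketched as follows. The pattern library call is deliberately left out: find_interesting_words is a hypothetical stand-in for it, and pick_seed is a made-up name.

```python
def pick_seed(message, chain_length, find_interesting_words=None):
    """Choose the words used to start a Markov response to `message`.

    `find_interesting_words` is a hypothetical callable standing in for the
    pattern-based analysis; it should return a (possibly empty) list of words.
    """
    if find_interesting_words is not None:
        interesting = find_interesting_words(message)
        if interesting:
            return list(interesting)
    # Fallback: the first n words of the message, n being the chain length.
    return message.split()[:chain_length]


print(pick_seed("the quick brown fox", 2))
```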

The brain file is just a flat text file of sentences that have been seen before (said by a human). Dagbot records all messages said in the IRC channels it joins (unless a channel is configured not to be recorded). This means the brain file you use will be modified while the bot is running. At runtime, the text file is parsed into actual Markov chains, which are stored in a sqlite3-backed dictionary. The database is used to save memory, since holding the chains in Python's built-in dictionary consumes massive amounts of memory.
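
For readers curious what a sqlite3-backed dictionary can look like, here is a minimal sketch built on the standard library. It is not dagbot's implementation; the class name, the table layout and the JSON encoding of values are assumptions chosen for the example.

```python
import json
import sqlite3
from collections.abc import MutableMapping


class SqliteDict(MutableMapping):
    """Dict-like store that keeps its entries in sqlite instead of in RAM."""

    def __init__(self, path=":memory:"):
        # ":memory:" keeps the demo self-contained; pass a file path to
        # actually move the data out of process memory.
        self._db = sqlite3.connect(path)
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS chain "
            "(key TEXT PRIMARY KEY, value TEXT NOT NULL)"
        )

    def __getitem__(self, key):
        row = self._db.execute(
            "SELECT value FROM chain WHERE key = ?", (key,)
        ).fetchone()
        if row is None:
            raise KeyError(key)
        return json.loads(row[0])

    def __setitem__(self, key, value):
        self._db.execute(
            "INSERT OR REPLACE INTO chain (key, value) VALUES (?, ?)",
            (key, json.dumps(value)),
        )
        self._db.commit()

    def __delitem__(self, key):
        cursor = self._db.execute("DELETE FROM chain WHERE key = ?", (key,))
        if cursor.rowcount == 0:
            raise KeyError(key)
        self._db.commit()

    def __iter__(self):
        rows = self._db.execute("SELECT key FROM chain").fetchall()
        return (key for (key,) in rows)

    def __len__(self):
        return self._db.execute("SELECT COUNT(*) FROM chain").fetchone()[0]


store = SqliteDict()
store["the cat"] = ["sat", "ate"]
print(store["the cat"], len(store))
```

Each lookup costs a query, so this trades speed for a much smaller memory footprint.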

Dagbot depends on the libraries listed in requirements.txt.

Running the following should install all the dependencies:

pip install -r requirements.txt

See the pip documentation for more info if something breaks.

After you've tweaked the configuration file to your liking, you can start the bot with:

python sadface.py /path/to/config.json

Dagbot runs fine under PyPy instead of CPython. In fact, I fully recommend using PyPy!

Dagbot was originally derived from sadface, but it has since diverged & grown substantially beyond the original code.

You can find dagbot chatting away on a number of channels on Snoonet if you're interested in seeing how well it performs. The #f1 channel is especially familiar with dagbot thanks to the countdown command.

Credits