Infomer is Isomer?'s IRC bot. Infomer's claim to fame is being a better infobot than infobot; it does this by exploiting more knowledge about the English language and using it to learn. Infomer is written in LanguagePython?.
Infomer is currently in pieces while I rewrite it to be more modular and more easily extensible.
A lot of controversy arose around Infomer on #wlug, due to it being used for abuse, specifically:
In answer:
As an experiment I'm placing some of Infomer's code in the wiki for people to update. This code isn't executed directly; I have to cut and paste it into Infomer every so often, but I'm curious if/how people will update it.
The first chunk below preprocesses a line from IRC. Eg, "<Isomer> Isomer is cool and Isomer is funky" should be split up into "Isomer is cool" and "Isomer is funky" (with who being "Isomer" and the target being None). Punctuation in general should be stripped.
import re
import string

def preprocess(who, text):  # wrapper name is a guess; the paste began mid-stream
    # Check for an explicit target, eg "Infomer: you are a bot"
    target = None
    m = re.match(r"(?P<target>\w+): ([\w' ]+)", text)
    if m != None:
        target = m.group("target")
        text = m.group(2)
    # Now with the target out of the way, begin stripping prefixes and
    # conjunctions etc.
    delims = ["?", ".", "and", ","]
    pronouns = ["your"]
    for d in delims:
        text = string.replace(text, d, ".")
    sentences = string.split(text, ".")
    for s in sentences:
        words = string.split(s, " ")
        # assumes "prefixes" (defined further down the page) holds plain words here
        for p in prefixes:
            if p in words:
                words.remove(p)
        ntext = string.join(words, " ")
        if target != None:
            for p in pronouns:
                # string.replace returns a new string, so keep the result
                ntext = string.replace(ntext, p, target + "'s")
        parse_sentence(who, target, ntext)
The code should take a line like "<Isomer> Infomer: you are very stupid and you are a bot" and turn it into:
(who=Isomer, target=Infomer) "Infomer is very stupid"
(who=Isomer, target=Infomer) "Infomer is a bot"
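Putting that together, a rough trace (preprocess is just the name I gave the block above):

preprocess("Isomer", "Infomer: you are very stupid and you are a bot")
# matches the target "Infomer", splits the rest on "and", and ends up
# calling (give or take whitespace):
#   parse_sentence("Isomer", "Infomer", "you are very stupid")
#   parse_sentence("Isomer", "Infomer", "you are a bot")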
def parse_sentence(speaker, target, sentence):
    # target is who the sentence was aimed at.
    # replacements is a list of tuples. Each tuple is (matchlist, replacement).
    # It's ok to have a replacement that is further expanded by another rule.
    # Use lower-case since we map user text to lower-case for the comparison :)
    replacements = [
        # abbreviations - case sensitivity?
        (["you're"], ["you", "are"]),
        (["I'm"], ["I", "am"]),
        (["It's"], ["It", "is"]),
        (["it's"], ["it", "is"]),
        (["I", "am"], [speaker, "is"]),
        (["my"], [speaker + "'s"]),
        (["you"], [target]),  # catch-all - bad idea?
    ]
    unparsed_tokens = string.split(sentence)
    parsed_tokens = []
    while len(unparsed_tokens) > 0:  # assume len() is evaluated each time
        made_expansion = 0
        for pair in replacements:
            term_len = len(pair[0])
            if (len(unparsed_tokens) >= term_len and
                    map(string.lower, unparsed_tokens[:term_len]) ==
                    map(string.lower, pair[0])):
                # matched at the front: splice in the replacement and rescan
                unparsed_tokens[:term_len] = pair[1]
                made_expansion = 1
                break
        if made_expansion == 0:
            # we couldn't make any expansions at this point...
            parsed_tokens.append(unparsed_tokens.pop(0))
    parse_phrase(speaker, parsed_tokens)
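As the comment says, one rule's output can be re-matched by another. With speaker = "Isomer", the tokens of "I'm stupid" expand in two steps:

["I'm", "stupid"]           # (["I'm"], ["I", "am"]) fires
["I", "am", "stupid"]       # (["I", "am"], [speaker, "is"]) fires
["Isomer", "is", "stupid"]  # nothing matches; tokens drain into parsed_tokens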
parse_phrase takes a sentence and calls the database primitives on it.
def parse_phrase(speaker, text):
    # text is a list of tokens; "verbs" is a module-level list of known verbs
    first = len(text)
    first_verb = None
    for i in verbs:
        if i in text and text.index(i) < first:
            first = text.index(i)
            first_verb = i
    # Not a recognised statement
    if first_verb == None:
        return ""
    # Split into two halves and a verb, eg:
    #   Perry's hair is very cool -> (Perry's Hair, is, Very Cool)
    lhs = text[:first]
    verb = text[first]
    rhs = text[first + 1:]
    if " ".join(lhs).lower() in fake_lhs:
        return ""
    obj = Object(lhs)
    add_fact(obj, verb, parse_definition(verb, rhs))
    return ""
This function removes a prefix from a sentence:
prefixes = [["and"], ["also"], ["ah"], ["ahh"], ["anyway"], ["apparently"],
            ["although"], ["but"], ["bah"], ["besides"], ["no"], ["yes"], ["yeah"]]

def strip_prefix(text):  # name is a guess; text is a list of tokens
    flag = 1
    while flag == 1:
        flag = 0
        for p in prefixes:
            if text[:len(p)] == p:
                text = text[len(p):]
                flag = 1
    return text
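Eg, assuming (as the code above does) that a sentence is passed around as a list of tokens:

>>> strip_prefix(["anyway", "apparently", "Isomer", "is", "cool"])
['Isomer', 'is', 'cool']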
class Object:
    # An object to hold information about an, uh, object.
    def __init__(self, tokens):
        self.tokens = []
        token = ""
        # Join up words into tokens, splitting at each possessive, eg:
        #   Isomer's Left Foot's sole -> (Isomer, Left Foot, sole)
        for i in tokens:
            token = token + " " + i
            if i[-2:] == "'s":
                self.tokens.append(token.strip()[:-2])
                token = ""
        # This intentionally adds an empty token when token is ""
        self.tokens.append(token.strip())
    def __repr__(self):
        return `self.tokens`
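Eg:

>>> Object(string.split("Isomer's Left Foot's sole"))
['Isomer', 'Left Foot', 'sole']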
Figures out if this is a special type of information
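Presumably this is the parse_definition that parse_phrase calls; its body isn't on this page. A minimal sketch, assuming it recognises infobot-style <reply> markers (that convention is infobot's; whether Infomer copies it is a guess):

def parse_definition(verb, rhs):
    # rhs is a list of tokens; join it back up to inspect it
    text = string.join(rhs, " ").strip()
    if text[:7] == "<reply>":
        # a canned reply rather than a plain "X verb Y" fact (assumed form)
        return ("reply", text[7:].strip())
    return ("plain", text)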