Infomer is Isomer's IRC bot. Infomer's claim to fame is being a better infobot than infobot. It does this by exploiting more knowledge about the English language and using it to learn. Infomer is written in Python.
Infomer is currently in pieces as I try to rewrite it to be more modular and more easily extended.
A lot of controversy arose around Infomer on #wlug, due to it being used for abuse, specifically:
In answer:
As an experiment I'm placing some of infomer's code in the wiki for people to update. This code isn't executed directly; I have to cut and paste it into infomer every so often, but I'm curious if/how people will update it.
This function takes a line and splits it up into sentences, doing some munging along the way. It should also figure out if this is a directed message, and if so who it is directed to. For example:
<Isomer> Infomer: you suck
should figure out that the sentence "you suck" is directed at Infomer. It doesn't have to figure out what "you" refers to; parse_sentence does that. It should, however, split compound sentences up and convert them into multiple sentences. eg:
<Isomer> Isomer is cool, and is funky
should be split up into "Isomer is cool" "Isomer is funky" (with who being "Isomer" and the target being None). Punctuation in general should be stripped.
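The subject-carrying step ("is funky" becoming "Isomer is funky") is the non-obvious part, so here is a minimal standalone sketch of the behaviour described above. The function name and heuristics are mine, not Infomer's actual code, and it only handles the simple "X is Y, and is Z" shape:

```python
def split_sentences(line):
    # Turn clause delimiters into full stops, then split.
    for d in ["?", ",", "!"]:
        line = line.replace(d, ".")
    parts = [p.strip() for p in line.split(".") if p.strip()]
    out = []
    subject = None
    for p in parts:
        words = p.split()
        if words and words[0].lower() == "and":
            words = words[1:]
        if words and words[0].lower() == "is" and subject is not None:
            # The clause lost its subject; carry it over from the
            # previous clause.
            words = [subject] + words
        elif "is" in words and words.index("is") > 0:
            subject = words[words.index("is") - 1]
        out.append(" ".join(words))
    return out
```

So `split_sentences("Isomer is cool, and is funky")` yields the two sentences `"Isomer is cool"` and `"Isomer is funky"`.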
    def parse_line(who, text):
        tm = re.match(r"(?P<string>\w+)\: ([\w\' ]+)", text)
        ntext = ""
        first = 1
        if not tm:
            target = None
            line = text
        else:
            target = tm.group(1)
            line = tm.group(2)
        # Now with the target out of the way, begin stripping prefixes
        # and conjunctions etc.
        delims = ["?", ".", "and", ","]
        for d in delims:
            line = line.replace(d, ".")
        sentences = line.split(".")
        for s in sentences:
            words = s.split(" ")
            for p in prefixes:
                if p in words:
                    words.remove(p)
            if first == 1:
                first = 0
            ntext = " ".join(words)
            parse_sentence(who, target, ntext)
This function's job is to take a sentence and clean it up, replacing "you" with "Isomer is", removing little words, and rearranging any sentences to make more sense to the bot. For example this function should be able to take:
(who=Isomer,target=Infomer) "You are a very stupid bot"
and turn it into
(who=Isomer, target=Infomer) "Infomer is very stupid"
(who=Isomer, target=Infomer) "Infomer is a bot"

    def parse_sentence(speaker, target, sentence):
        # target is who a sentence was aimed at.
        # replacements is a list of tuples. Each tuple is (matchlist, replacement).
        # It's ok to have a replacement that is further expanded by another rule.
        # Match lists must be lower-case, since we lower-case the user's text
        # for the comparison :)
        replacements = [
            # abbreviations
            (["you're"], ["you", "are"]),
            (["i'm"], ["I", "am"]),
            (["it's"], ["it", "is"]),
            (["i", "am"], [speaker, "is"]),
            (["my"], [speaker + "'s"]),
        ]
        if target != None:
            replacements.extend([
                (["you", "are"], [target, "is"]),
                (["are", "you"], ["is", target]),
                (["your"], [target + "'s"]),
                (["you"], [target]),    # catch-all -- bad idea?
            ])
        unparsed_tokens = sentence.split()
        parsed_tokens = []
        while len(unparsed_tokens) > 0:
            made_expansion = 0
            for pair in replacements:
                term_len = len(pair[0])
                if len(unparsed_tokens) >= term_len and \
                   map(str.lower, unparsed_tokens[:term_len]) == pair[0]:
                    # replace the match with its replacement
                    unparsed_tokens = pair[1] + unparsed_tokens[term_len:]
                    made_expansion = 1
                    break
            if made_expansion == 0:
                # we couldn't make any expansions at this point...
                parsed_tokens.append(unparsed_tokens.pop(0))
        parse_phrase(speaker, parsed_tokens)
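The code above relies on Python 2's `map` returning a list. For anyone reading this on Python 3, the replacement loop can be sketched like this (the function name `expand_pronouns` is mine, and it omits some of the abbreviation rules for brevity):

```python
def expand_pronouns(tokens, speaker, target):
    # Each rule is (match, replacement); matches are lower-case because
    # the user's tokens are lower-cased for the comparison.
    replacements = [
        (["you're"], ["you", "are"]),
        (["i'm"], ["I", "am"]),
        (["i", "am"], [speaker, "is"]),
        (["my"], [speaker + "'s"]),
    ]
    if target is not None:
        replacements += [
            (["you", "are"], [target, "is"]),
            (["are", "you"], ["is", target]),
            (["your"], [target + "'s"]),
            (["you"], [target]),   # catch-all
        ]
    out = []
    while tokens:
        for match, repl in replacements:
            n = len(match)
            if [t.lower() for t in tokens[:n]] == match:
                tokens = repl + tokens[n:]
                break
        else:
            # no rule matched; this token is done
            out.append(tokens.pop(0))
    return out
```

For example `expand_pronouns("You are a stupid bot".split(), "Isomer", "Infomer")` gives `["Infomer", "is", "a", "stupid", "bot"]`.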
parse_phrase takes a sentence and calls the database primitives on it.
    def parse_phrase(who, text):
        for i in questions:
            if i == text[0].lower():
                obj = Object(text[2:])
                return get_fact(obj, text[1])
        first = len(text)
        first_verb = None
        for i in verbs:
            if i in text and text.index(i) < first:
                first = text.index(i)
                first_verb = i
        # Not a recognised statement
        if first_verb == None:
            return ""
        # split into two halves and a verb, eg:
        # Perry's hair is very cool -> (Perry's hair, is, very cool)
        lhs = text[:first]
        verb = text[first]
        rhs = text[first+1:]
        if " ".join(lhs).lower() in fake_lhs:
            return ""
        obj = Object(lhs)
        add_fact(obj, verb, parse_definition(verb, rhs))
        return ""
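The database primitives add_fact and get_fact that parse_phrase calls aren't shown on this page. As a guess at their shape (purely illustrative; keys here are strings rather than Object instances), a minimal fact store might look like:

```python
# Facts keyed by (object, verb); each entry is a list of definitions as
# returned by parse_definition, eg ("ISA", ["bot"]).
facts = {}

def add_fact(obj, verb, definition):
    # Append a definition to whatever we already know about (obj, verb).
    facts.setdefault((obj, verb), []).append(definition)

def get_fact(obj, verb):
    # Return everything known about (obj, verb), or [] if nothing is.
    return facts.get((obj, verb), [])
```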
This function strips leading filler-word prefixes from a sentence.
    def remove_prefix(text):
        prefixes = [
            ["and"], ["also"], ["ah"], ["ahh"], ["anyway"], ["apparently"],
            ["although"], ["but"], ["bah"], ["besides"], ["no"], ["yes"],
            ["yeah"],
        ]
        flag = 1
        while flag == 1:
            flag = 0
            for i in prefixes:
                if map(str.lower, text[:len(i)]) == i:
                    text = text[len(i):]
                    flag = 1
        return text
An object to hold information about an uh, object.
    # A class to hold an object's information
    class Object:
        def __init__(self, tokens):
            self.tokens = []
            token = ""
            # Join up words into tokens,
            # eg: Isomer's Left Foot's sole -> (Isomer, Left Foot, sole)
            for i in tokens:
                token = token + " " + i
                if len(token) > 2 and token[-2:] == "'s":
                    token = token[:-2]
                    self.tokens.append(token.strip())
                    token = ""
            # This intentionally adds an empty token when token is ""
            self.tokens.append(token.strip())

        def __repr__(self):
            return repr(self.tokens)
Figures out if this is a special type of information
    def parse_definition(verb, text):
        if text[0].lower() in ["a", "an", "the"]:
            return (ISA, text[1:])
        if " ".join(text[:2]).lower() == "known as":
            return (AKA, text[2:])
        if " ".join(text[:3]).lower() == "also known as":
            return (AKA, text[3:])
        return (NORMAL, text)
Parse:
Infomer is a very stupid bot
as
Infomer is very stupid
Infomer is a bot
requires knowledge of adverbs and the like. Adverbs can mostly be detected by checking for "ly" on the end of words.
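A rough sketch of both ideas, assuming the "ly" heuristic above and a simple article-adjectives-noun shape for the right-hand side (function names are mine, not Infomer's):

```python
def looks_like_adverb(word):
    # Crude heuristic: most adverbs end in "ly".
    return word.lower().endswith("ly")

def split_isa(subject, rhs):
    # rhs is the right-hand side of an "is a" fact, eg
    # ["a", "very", "stupid", "bot"]. Peel the adjective phrase off into
    # its own fact, leaving the bare "is a <noun>" fact.
    article, noun = rhs[0], rhs[-1]
    adjectives = rhs[1:-1]
    out = []
    if adjectives:
        out.append([subject, "is"] + adjectives)
    out.append([subject, "is", article, noun])
    return out
```

So `split_isa("Infomer", ["a", "very", "stupid", "bot"])` gives the two facts `Infomer is very stupid` and `Infomer is a bot`.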
Add:
tell nick that message
when nick next says something, say "who wanted me to tell you that message"