Infomer is Isomer?'s IRC bot. Infomer's claim to fame is being a better infobot than infobot; it does this by exploiting more knowledge about the English language and using it to learn. Infomer is written in LanguagePython?.
Infomer is currently in pieces while I rewrite it to be more modular and more easily extensible.
A lot of controversy arose around Infomer on #wlug, due to it being used for abuse, specifically:
In answer:
As an experiment I'm placing some of Infomer's code in the wiki for people to update. This code isn't executed directly; I have to cut and paste it into Infomer every so often, but I'm curious if/how people will update it.
The first chunk below preprocesses a line from IRC. Eg, "<Isomer> Isomer is cool and Isomer is funky" should be split up into "Isomer is cool" and "Isomer is funky" (with who being "Isomer" and the target being None). Punctuation in general should be stripped.
import re
import string

def preprocess(who, text):  # wrapper name is a guess; the paste began mid-stream
    # Check for an explicit target, eg "Infomer: you are a bot"
    target = None
    m = re.match(r"(?P<target>\w+): ([\w' ]+)", text)
    if m != None:
        target = m.group("target")
        text = m.group(2)
    # Now with the target out of the way, begin stripping prefixes and
    # conjunctions etc.
    delims = ["?", ".", "and", ","]
    pronouns = ["your"]
    for d in delims:
        text = string.replace(text, d, ".")
    sentences = string.split(text, ".")
    for s in sentences:
        words = string.split(s, " ")
        # assumes "prefixes" (defined further down the page) holds plain words here
        for p in prefixes:
            if p in words:
                words.remove(p)
        ntext = string.join(words, " ")
        if target != None:
            for p in pronouns:
                # string.replace returns a new string, so keep the result
                ntext = string.replace(ntext, p, target + "'s")
        parse_sentence(who, target, ntext)
The code should take a line like "<Isomer> Infomer: you are very stupid and you are a bot" and turn it into:
(who=Isomer, target=Infomer) "Infomer is very stupid"
(who=Isomer, target=Infomer) "Infomer is a bot"
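Putting that together, a rough trace (preprocess is just the name I gave the block above):

preprocess("Isomer", "Infomer: you are very stupid and you are a bot")
# matches the target "Infomer", splits the rest on "and", and ends up
# calling (give or take whitespace):
#   parse_sentence("Isomer", "Infomer", "you are very stupid")
#   parse_sentence("Isomer", "Infomer", "you are a bot")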
def parse_sentence(speaker, target, sentence):
    # target is who the sentence was aimed at.
    # replacements is a list of tuples. Each tuple is (matchlist, replacement).
    # It's ok to have a replacement that is further expanded by another rule.
    # Use lower-case since we map user text to lower-case for the comparison :)
    replacements = [
        # abbreviations - case sensitivity?
        (["you're"], ["you", "are"]),
        (["I'm"], ["I", "am"]),
        (["It's"], ["It", "is"]),
        (["it's"], ["it", "is"]),
        (["I", "am"], [speaker, "is"]),
        (["my"], [speaker + "'s"]),
        (["you"], [target]),  # catch-all - bad idea?
    ]
    unparsed_tokens = string.split(sentence)
    parsed_tokens = []
    while len(unparsed_tokens) > 0:  # assume len() is evaluated each time
        made_expansion = 0
        for pair in replacements:
            term_len = len(pair[0])
            if (len(unparsed_tokens) >= term_len and
                    map(string.lower, unparsed_tokens[:term_len]) ==
                    map(string.lower, pair[0])):
                # matched at the front: splice in the replacement and rescan
                unparsed_tokens[:term_len] = pair[1]
                made_expansion = 1
                break
        if made_expansion == 0:
            # we couldn't make any expansions at this point...
            parsed_tokens.append(unparsed_tokens.pop(0))
    parse_phrase(speaker, parsed_tokens)
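As the comment says, one rule's output can be re-matched by another. With speaker = "Isomer", the tokens of "I'm stupid" expand in two steps:

["I'm", "stupid"]           # (["I'm"], ["I", "am"]) fires
["I", "am", "stupid"]       # (["I", "am"], [speaker, "is"]) fires
["Isomer", "is", "stupid"]  # nothing matches; tokens drain into parsed_tokens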
parse_phrase takes a sentence and calls the database primitives on it.
def parse_phrase(speaker, text):
    # text is a list of tokens; "verbs" is a module-level list of known verbs
    first = len(text)
    first_verb = None
    for i in verbs:
        if i in text and text.index(i) < first:
            first = text.index(i)
            first_verb = i
    # Not a recognised statement
    if first_verb == None:
        return ""
    # Split into two halves and a verb, eg:
    #   Perry's hair is very cool -> (Perry's Hair, is, Very Cool)
    lhs = text[:first]
    verb = text[first]
    rhs = text[first + 1:]
    if " ".join(lhs).lower() in fake_lhs:
        return ""
    obj = Object(lhs)
    add_fact(obj, verb, parse_definition(verb, rhs))
    return ""
This function removes a prefix from a sentence:
prefixes = [["and"], ["also"], ["ah"], ["ahh"], ["anyway"], ["apparently"],
            ["although"], ["but"], ["bah"], ["besides"], ["no"], ["yes"], ["yeah"]]

def strip_prefix(text):  # name is a guess; text is a list of tokens
    flag = 1
    while flag == 1:
        flag = 0
        for p in prefixes:
            if text[:len(p)] == p:
                text = text[len(p):]
                flag = 1
    return text
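Eg, assuming (as the code above does) that a sentence is passed around as a list of tokens:

>>> strip_prefix(["anyway", "apparently", "Isomer", "is", "cool"])
['Isomer', 'is', 'cool']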
class Object:
    # An object to hold information about an, uh, object.
    def __init__(self, tokens):
        self.tokens = []
        token = ""
        # Join up words into tokens, splitting at each possessive, eg:
        #   Isomer's Left Foot's sole -> (Isomer, Left Foot, sole)
        for i in tokens:
            token = token + " " + i
            if i[-2:] == "'s":
                self.tokens.append(token.strip()[:-2])
                token = ""
        # This intentionally adds an empty token when token is ""
        self.tokens.append(token.strip())
    def __repr__(self):
        return `self.tokens`
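Eg:

>>> Object(string.split("Isomer's Left Foot's sole"))
['Isomer', 'Left Foot', 'sole']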
Figures out if this is a special type of information
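Presumably this is the parse_definition that parse_phrase calls; its body isn't on this page. A minimal sketch, assuming it recognises infobot-style <reply> markers (that convention is infobot's; whether Infomer copies it is a guess):

def parse_definition(verb, rhs):
    # rhs is a list of tokens; join it back up to inspect it
    text = string.join(rhs, " ").strip()
    if text[:7] == "<reply>":
        # a canned reply rather than a plain "X verb Y" fact (assumed form)
        return ("reply", text[7:].strip())
    return ("plain", text)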