Annotated edit history of Infomer version 9 showing authors affecting page license.
Rev Author # Line
4 PerryLorier 1 Infomer is [Isomer|PerryLorier]'s [IRC] bot. Infomer's claim to fame is being a better infobot than infobot. It does this by exploiting more knowledge about the English language and using it to learn. Infomer is written in [Python].
2
3 Infomer is currently in pieces as I try to rewrite it to be more modular and more easily extendable.
4
5 A lot of controversy arose from Infomer on #wlug, due to it being used for abuse, specifically:
6 # The fact that when the bot does say something, it can go on for pages and pages at a time
7 # The fact that the bot often interjects with random crud from time to time
8 # People teaching it silly facts and then getting delight as it recited them.
9
10 In answer:
11 # This is mostly for debugging and testing; it's very hard to see what a bot is learning if it will only repeat other facts. This has been removed from the new version of the bot, and new interfaces are being examined.
12 # I'm hoping to fix the bot interjecting random crud by making it smarter about what it replies and about when it should say something. For instance, when someone asks a question, wait a few seconds and reply only if no one else has spoken.
13 # Well, this obviously shows that people are enjoying playing with the bot; there's not much you can do in code to get around this.
14
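The "wait a few seconds, then reply only if no one else has spoken" idea could be sketched roughly as below. This is a Python 3 illustration only, not code from Infomer: the `ReplyGate` class, its method names, and the three-second default are all assumptions, and the caller is assumed to call `poll()` periodically from the bot's main loop.

```python
import time


class ReplyGate:
    """Hold a reply back until the channel has been quiet for `delay` seconds."""

    def __init__(self, delay=3.0, clock=time.monotonic):
        self.delay = delay      # assumed quiet period, in seconds
        self.clock = clock      # injectable so the gate can be tested
        self.pending = None     # (time the question was seen, reply) or None

    def on_question(self, reply):
        # Remember the reply instead of sending it straight away.
        self.pending = (self.clock(), reply)

    def on_other_message(self):
        # Someone else spoke in the meantime, so the bot stays quiet.
        self.pending = None

    def poll(self):
        # Return the reply once the quiet period has elapsed, else None.
        if self.pending is None:
            return None
        asked_at, reply = self.pending
        if self.clock() - asked_at >= self.delay:
            self.pending = None
            return reply
        return None
```

Passing the clock in as a parameter keeps the gate independent of real time, which makes it easy to exercise without sleeping.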
15 As an experiment I'm placing some of Infomer's code in the wiki for people to update. This code isn't executed directly; I have to cut and paste it into Infomer every so often, but I'm curious if/how people will update it.
16
17 !!Parse_line
18
19 This function takes a line and splits it up into sentences, doing some munging along the way. It should also figure out whether this is a directed message, and if so, who it is directed to. For example:
8 LawrenceDoliveiro 20 <pre>
4 PerryLorier 21 <Isomer> Infomer: you suck
8 LawrenceDoliveiro 22 </pre>
4 PerryLorier 23 should figure out that the sentence "you suck" is directed at Infomer. It doesn't have to figure out what "you" is; parse_sentence does that. It should, however, detect compound sentences and split them into multiple sentences, e.g.:
8 LawrenceDoliveiro 24 <pre>
4 PerryLorier 25 <Isomer> Isomer is cool, and is funky
8 LawrenceDoliveiro 26 </pre>
4 PerryLorier 27 should be split up into "Isomer is cool" and "Isomer is funky" (with who being "Isomer" and the target being None). Punctuation in general should be stripped.
28
8 LawrenceDoliveiro 29 <verbatim>
4 PerryLorier 30 def parse_line(who,text):
31     tm = re.match(r"(?P<string>\w+)\: ([\w\' ]+)", text)
32
33     ntext = ""
34     first = 1
35
36     if not tm:
37         target = None
38         line = text
39
40     else:
41         target = tm.group(1)
42         line = tm.group(2)
43
44     # Now with the target out of the way, begin stripping prefixes and conjunctions etc.
45
46     delims = ["?", ".", " and ", ","]
47
48     for d in delims:
49         line = line.replace(d, ".")
50
51     sentences = line.split(".")
52
53     for s in sentences:
54         if not s.strip(): continue  # skip empty fragments left by the split
55         words = s.split()
56
57         for p in prefixes:
58             if p in words:
59                 words.remove(p)
60
61         if first == 1:
62             first = 0
63
8 LawrenceDoliveiro 64         ntext = " ".join(words)
4 PerryLorier 65
8 LawrenceDoliveiro 66         parse_sentence(who, target, ntext)
67 </verbatim>
4 PerryLorier 68
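As written, parse_line splits on delimiters but does not copy the subject into a clause that lacks one, so "is funky" would be passed on without "Isomer". One possible way to do that is sketched below in Python 3. Everything here is an assumption made for illustration: `split_line`, `DIRECTED`, `SPLITTER` and the tiny `VERBS` set are hypothetical names, not part of Infomer.

```python
import re

# "Nick: rest of the line" marks a directed message.
DIRECTED = re.compile(r"^(?P<target>\w+):\s+(?P<rest>.*)$")
# Split on sentence punctuation, commas, and the whole word "and".
SPLITTER = re.compile(r"[.?!,]|\band\b")
# Assumed verb list, for illustration only.
VERBS = {"is", "are", "was", "were"}


def split_line(text):
    """Return (target, sentences) for one line of chat."""
    m = DIRECTED.match(text)
    if m:
        target, rest = m.group("target"), m.group("rest")
    else:
        target, rest = None, text

    sentences = []
    subject = []
    for chunk in SPLITTER.split(rest):
        words = chunk.split()
        if not words:
            continue  # nothing between two delimiters
        if words[0].lower() in VERBS and subject:
            # The clause starts with a verb: reuse the previous subject.
            words = subject + words
        else:
            # Remember everything before the first verb as the subject.
            for i, word in enumerate(words):
                if word.lower() in VERBS:
                    subject = words[:i]
                    break
        sentences.append(" ".join(words))
    return target, sentences
```

With this sketch, `split_line("Isomer is cool, and is funky")` gives `(None, ["Isomer is cool", "Isomer is funky"])`, and `split_line("Infomer: you suck")` gives `("Infomer", ["you suck"])`.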
8 LawrenceDoliveiro 69 !!Parse_sentence
4 PerryLorier 70 This function's job is to take a sentence and clean it up, replacing "you" with "Isomer is", removing little words, and rearranging any sentences to make more sense to the bot. For example this function should be able to take:
8 LawrenceDoliveiro 71 <pre>
4 PerryLorier 72 (who=Isomer,target=Infomer) "You are a very stupid bot"
8 LawrenceDoliveiro 73 </pre>
4 PerryLorier 74 and turn it into
8 LawrenceDoliveiro 75 <verbatim>
4 PerryLorier 76 (who=Isomer,target=Infomer) "Infomer is very stupid"
77 (who=Isomer,target=Infomer) "Infomer is a bot"
78
79 def parse_sentence(speaker, target, sentence):
80
81     # target is who a sentence was aimed at.
82
83     # replacements is a list of tuples. Each tuple is (matchlist), (replacement)
84     # It's ok to have a replacement that is further expanded by another rule.
85     # use lower-case since we map user text to lower-case for the comparison :)
86
87     replacements = [
88         # abbreviations (match lists must be lower-case, see above)
89         (["you're"], ["you", "are"]),
90         (["i'm"], ["I", "am"]),
91         # "It's" is covered by the lower-case "it's" rule below
92         (["it's"], ["it", "is"]),
93         (["i", "am"], [speaker, "is"]),
94         (["my"], [speaker + "'s"]),
95     ]
96
97     if target is not None:
98         replacements.extend(
99             [
100                 (["you", "are"], [target, "is"]),
101                 (["are", "you"], ["is", target]),
102                 (["your"], [target + "'s"]),
103                 ### bad idea? (["you"], [target]), # catch-all
104             ]
105         )
106     unparsed_tokens = sentence.split()
107     parsed_tokens = []
108
109     while len(unparsed_tokens) > 0: # assume len() is evaluated each time
110         for pair in replacements:
111             made_expansion = 0
112
113             term_len = len(pair[0])
114             if len(unparsed_tokens) >= term_len and \
115                [t.lower() for t in unparsed_tokens[:term_len]] == pair[0]:
116                 # replace match with replacement
117                 unparsed_tokens = pair[1] + unparsed_tokens[term_len:]
118                 made_expansion = 1
119                 break
120
121         if made_expansion == 0:
122             # we couldn't make any expansions at this point...
123             parsed_tokens.append( unparsed_tokens.pop(0) )
124
125
126     parse_phrase(speaker, parsed_tokens)
8 LawrenceDoliveiro 127 </verbatim>
4 PerryLorier 128
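The rule expansion in parse_sentence can be tried outside the bot with a small standalone version. The following is a Python 3 sketch under stated assumptions: `expand_tokens` is a hypothetical helper name, and the rules passed to it in the usage note are just samples in the same (matchlist, replacement) shape as above.

```python
def expand_tokens(tokens, replacements):
    """Apply (matchlist, replacement) rules at the front of the token list.

    Match lists are assumed to be lower-case; tokens are lower-cased only
    for the comparison, so unmatched words keep their original case.
    """
    pending = list(tokens)
    done = []
    while pending:
        for match, replacement in replacements:
            n = len(match)
            if [t.lower() for t in pending[:n]] == match:
                # Put the replacement back at the front so that another
                # rule can expand it further.
                pending = replacement + pending[n:]
                break
        else:
            # No rule matched here: the first token is final.
            done.append(pending.pop(0))
    return done
```

For example, with the rules `[(["i'm"], ["i", "am"]), (["i", "am"], ["Isomer", "is"])]`, the tokens of "I'm cool" expand to `["Isomer", "is", "cool"]`. A rule whose replacement matches its own match list would loop forever, so rules need to be written with that in mind.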
129 !!parse_phrase
130 parse_phrase takes a sentence and calls the database primitives on it.
131
8 LawrenceDoliveiro 132 <verbatim>
4 PerryLorier 133 def parse_phrase(who,text):
134     for i in questions:
135         if i==text[0].lower():
136             obj=Object(text[2:])
137             return get_fact(obj,text[1])
138
139     first=len(text)
140     first_verb=None
141     for i in verbs:
142         if i in text and text.index(i)<first:
143             first=text.index(i)
144             first_verb=i
145
146     # Not a recognised statement
147     if first_verb is None:
148         return ""
149
150     # split into two halves and a verb, eg:
151     # Perry's hair is very cool -> (Perry's Hair,is,Very Cool)
152     lhs = text[:first]
153     verb = text[first]
154     rhs = text[first+1:]
155     if " ".join(lhs).lower() in fake_lhs:
156         return ""
157
158     obj=Object(lhs)
159     add_fact(obj,verb,parse_definition(verb,rhs))
160
8 LawrenceDoliveiro 161     return ""
162 </verbatim>
4 PerryLorier 163
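The "two halves and a verb" split can be looked at on its own, without the database calls. This is a minimal Python 3 sketch; `split_at_first_verb` and the short `VERBS` tuple are hypothetical stand-ins for whatever verb list the bot actually uses.

```python
# Assumed verb list, for illustration only.
VERBS = ("is", "are", "was", "were", "has")


def split_at_first_verb(tokens, verbs=VERBS):
    """Return (lhs, verb, rhs) split at the first known verb, or None."""
    for i, tok in enumerate(tokens):
        if tok.lower() in verbs:
            return tokens[:i], tok, tokens[i + 1:]
    return None  # not a recognised statement
```

Scanning the tokens once, left to right, finds the earliest verb directly, rather than looking up each verb's index in turn.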
164 !!Misc functions
8 LawrenceDoliveiro 165 This function removes leading prefix words from a sentence.
166 <verbatim>
4 PerryLorier 167 def remove_prefix(text):
8 LawrenceDoliveiro 168     prefixes = [
169         ["and"], ["also"], ["ah"], ["ahh"], ["anyway"], ["apparently"],
170         ["although"], ["but"], ["bah"], ["besides"], ["no"], ["yes"],
171         ["yeah"] ]
172     flag=1
173     while flag==1:
174         flag=0
175         for i in prefixes:
176             if [t.lower() for t in text[:len(i)]]==i:
177                 text=text[len(i):]  # text is a token list: drop only the matched prefix
178                 flag=1
179     return text
180 </verbatim>
4 PerryLorier 181
182 An object to hold information about an uh, object.
183
8 LawrenceDoliveiro 184 <verbatim>
4 PerryLorier 185 # A class to hold an object's information
186 class Object:
187     def __init__(self,tokens):
188         self.tokens=[]
189         token=""
190         # Join up words into tokens
191         # eg: Isomer's Left Foot's sole -> (Isomer,Left Foot,sole)
192         for i in tokens:
193             token=token+" "+i
194             if len(token)>2 and token[-2:]=="'s":
195                 token=token[:-2]
196                 self.tokens.append(token.strip())
197                 token=""
198         # This intentionally adds empty token when it is ""
199         self.tokens.append(token.strip())
200
8 LawrenceDoliveiro 201     def __repr__(self):
202         return repr(self.tokens)
203 </verbatim>
4 PerryLorier 204
205 Figures out whether the right-hand side is a special type of information (an "is a" or "also known as" relation).
8 LawrenceDoliveiro 206 <verbatim>
4 PerryLorier 207 def parse_definition(verb,text):
8 LawrenceDoliveiro 208     if text[0].lower() in ["a","an","the"]:
209         return (ISA,text[1:])
210     if " ".join(text[:2]).lower()=="known as":
211         return (AKA,text[2:])
212     if " ".join(text[:3]).lower()=="also known as":
213         return (AKA,text[3:])
214     return (NORMAL,text)
215 </verbatim>
4 PerryLorier 216
217 !!TODO
218 Parse:
8 LawrenceDoliveiro 219 <pre>
4 PerryLorier 220 Infomer is a very stupid bot
8 LawrenceDoliveiro 221 </pre>
4 PerryLorier 222 as
8 LawrenceDoliveiro 223 <pre>
4 PerryLorier 224 Infomer is very stupid
225 Infomer is a bot
8 LawrenceDoliveiro 226 </pre>
4 PerryLorier 227 requires knowledge of adverbs 'n stuff. Adverbs can mostly be detected by checking for "ly" on the end of words.
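A rough Python 3 sketch of that split follows. Note that "very" itself does not end in "ly", so the sketch also checks a small list of intensifiers; that list, along with the names `looks_like_adverb` and `split_description`, is an assumption made for illustration and not something Infomer has.

```python
# Assumed list of common adverbs that do not end in "ly".
INTENSIFIERS = {"very", "quite", "rather", "really", "too"}


def looks_like_adverb(word):
    """Heuristic only: a known intensifier, or a longer word ending in "ly"."""
    w = word.lower()
    return w in INTENSIFIERS or (len(w) > 3 and w.endswith("ly"))


def split_description(subject, verb, rhs):
    """Split "<adverb> <adjective>" out of the right-hand side tokens."""
    # Stop two words before the end: an adverb needs a word to modify,
    # and at least one more word must remain for the second sentence.
    for i, word in enumerate(rhs[:-2]):
        if looks_like_adverb(word):
            quality = rhs[i:i + 2]        # the adverb and the word it modifies
            rest = rhs[:i] + rhs[i + 2:]  # what is left, taken as the noun phrase
            return [
                " ".join([subject, verb] + quality),
                " ".join([subject, verb] + rest),
            ]
    return [" ".join([subject, verb] + rhs)]
```

Called as `split_description("Infomer", "is", ["a", "very", "stupid", "bot"])`, this returns `["Infomer is very stupid", "Infomer is a bot"]`. The "ly" test has false positives (for example "family"), which matches the "mostly" caveat above.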
228
229 Add:
8 LawrenceDoliveiro 230 <pre>
4 PerryLorier 231 tell ''nick'' that ''message''
8 LawrenceDoliveiro 232 </pre>
4 PerryLorier 233 when ''nick'' next says something, say "''who'' wanted me to tell you that ''message''"
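One way the tell feature might work is sketched below in Python 3. `TellQueue`, `handle_line` and the `TELL` pattern are hypothetical names; the sketch assumes the line has already had any "Infomer:" prefix stripped and that nicks are compared case-insensitively.

```python
import re
from collections import defaultdict

# "tell <nick> that <message>", case-insensitive on the keywords.
TELL = re.compile(r"^tell\s+(?P<nick>\S+)\s+that\s+(?P<message>.+)$", re.IGNORECASE)


class TellQueue:
    def __init__(self):
        # lower-cased nick -> list of (sender, message) waiting for them
        self.pending = defaultdict(list)

    def handle_line(self, who, text):
        """Process one line from `who`; return the lines the bot should say."""
        # Deliver anything that was waiting for this speaker.
        out = ["%s wanted me to tell you that %s" % (sender, message)
               for sender, message in self.pending.pop(who.lower(), [])]
        # Then see whether this line queues a new message.
        m = TELL.match(text.strip())
        if m:
            self.pending[m.group("nick").lower()].append((who, m.group("message")))
        return out
```

Messages are kept in memory only, so they would be lost if the bot restarted; storing them alongside the learned facts would avoid that.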
