Merge lp:~bac/launchpad/bug-341935-captcha into lp:launchpad
- bug-341935-captcha
- Merge into devel
Status: | Merged |
---|---|
Approved by: | Brad Crittenden |
Approved revision: | no longer in the source branch. |
Merged at revision: | not available |
Proposed branch: | lp:~bac/launchpad/bug-341935-captcha |
Merge into: | lp:launchpad |
Diff against target: |
319 lines 7 files modified
lib/canonical/launchpad/browser/tests/registration.py (+2/-0) lib/canonical/launchpad/pagetests/standalone/xx-new-account-redirection-url.txt (+2/-0) lib/canonical/launchpad/templates/launchpad-login.pt (+23/-5) lib/canonical/launchpad/webapp/login.py (+47/-0) lib/lp/registry/stories/foaf/xx-createaccount.txt (+17/-4) lib/lp/registry/stories/foaf/xx-reg-with-existing-email.txt (+5/-1) lib/lp/testing/registration.py (+32/-0) |
To merge this branch: | bzr merge lp:~bac/launchpad/bug-341935-captcha |
Related bugs: |
Reviewer | Review Type | Date Requested | Status |
---|---|---|---|
Martin Albisetti (community) | ui | Approve | |
Michael Nelson (community) | ui* | Approve | |
Edwin Grubbs (community) | code ui* | Approve | |
Review via email: mp+13022@code.launchpad.net |
Commit message
Add a simple math-based captcha to the registration pages to thwart dumb bots.
Description of the change
Brad Crittenden (bac) wrote : | # |
Edwin Grubbs (edwin-grubbs) wrote : | # |
Hi Brad,
This looks good. I just have one minor comment.
merge-approved
-Edwin
>=== modified file 'lib/lp/
>--- lib/lp/
>+++ lib/lp/
>@@ -21,12 +21,34 @@
> The email address you provided isn't valid. Please verify it and try
> again.
>
>-Jane tries again, providing a valid email address and asking to create a new
>-account.
>+Next she enters a valid email address but enters a wrong answer for
>+the incredibly trivial match captcha. She gets an error message.
Did you intend "math captcha" instead of "match captcha"?
> >>> browser.
> ... '<email address hidden>')
>- >>> browser.
>+ >>> browser.
>+ >>> browser.
>+ >>> print_feedback_
>+ The answer to the simple math question was incorrect or missing.
>+ Please try again.
>+
>+Jane tries again, providing a the correct captcha answer. Her valid
>+email address from before has been retained in the form.
>+
>+ >>> import re
>+ >>> def get_captcha_
>+ ... expr = re.compile("What is (\d+ .{1} \d+)?")
>+ ... match = expr.search(
>+ ... if match:
>+ ... question = match.group(1)
>+ ... answer = eval(question)
>+ ... return str(answer)
>+ ... return ''
>+
>+ >>> browser.
>+ ... get_captcha_
>+ >>> browser.
>+ >>> print_feedback_
> >>> print extract_
> Registration mail sent
> Instructions on completing your registration have been sent to
>@@ -36,7 +58,7 @@
> Launchpad sends Jane an email message containing a token she must use to
> complete the registration process.
>
>- >>> import email, re
>+ >>> import email
> >>> from lp.services.mail import stub
> >>> from canonical.
> ... get_token_
>@@ -91,6 +113,8 @@
> >>> browser.open('http://
> >>> browser.
> ... '<email address hidden>')
>+ >>> browser.
>+ ... get_captcha_
> >>> browser.
>
> >>> from_addr, to_addrs, raw_msg = stub.test_
>
Michael Nelson (michael.nelson) wrote : | # |
Hi bac,
Thanks for getting this done! Looks very straight-forward.
I've noticed that many sites have at least a casual explanation for why they're asking people to answer a maths question. Something like: "We like to make sure that you're not a bot." as a description for the form field, but I'm assuming you've made the decision that there's no benefit?
Martin Albisetti (beuno) wrote : | # |
08:11 < beuno> noodles775, good morning
08:11 < noodles775> hi beuno
08:11 < beuno> noodles775, looking at the captcha review
08:12 < beuno> it's a great point about telling people why we ask that
08:12 < beuno> I wonder if we could have a (?) icon, that is clickable, and mrevell's help popup comes up
08:12 < beuno> so we don't have so much text on the page
08:12 < noodles775> Yeah, that'd be a good way to keep it minimal :)
08:13 < beuno> the other thing that came to mind was, maybe the field shouldn't be that length
08:13 < beuno> to make it distincctivly different than password
08:13 < beuno> if you don't read labels, it's easy to be fooled
08:16 < noodles775> yeah, true - I hadn't thought of that - mistaking it for a password field simply because it's following the email field.
08:16 < noodles775> I had a quick look at a few other sites (gmail signup and ubuntu forums) and both define a label like "Random question" or "Word
08:17 < beuno> right, you may also be able to get away with a very short description
08:17 < beuno> I think the key here is that it look visually different enough
08:17 < beuno> this is where "don't read the merge proposal" comes in useful :)
08:18 < noodles775> So, laying it out like the gmail signup:
08:19 < noodles775> hmm... not sure what we'd use for a label, but having the actual simple math question in the right section (ie. aligned with the inputs)
08:20 < beuno> yeap, that would work
08:21 < noodles775> perhaps the label could simply be 'Random question'.
08:21 < beuno> yeah, or "spam test"
08:21 < beuno> something that gets the message across
08:21 < noodles775> I'll play with it and do a diff.
08:21 < beuno> it's a great point that the label is not a good place to put the question
Michael Nelson (michael.nelson) wrote : | # |
> Hi bac,
>
> Thanks for getting this done! Looks very straight-forward.
>
> I've noticed that many sites have at least a casual explanation for why
> they're asking people to answer a maths question. Something like: "We like to
> make sure that you're not a bot." as a description for the form field, but I'm
> assuming you've made the decision that there's no benefit?
I was just chatting with Martin, and a few things came out of it:
1. It would be good to ensure that the captcha field is visually different (it's possible that some people won't read the label but will simply type their password in there...being so used to entering their email followed by a password to create accounts etc.)
2. It might be good to have some indication of why we're asking for a maths question (as above) - but it doesn't need to be prominent.
So along those lines, here's two thoughts - see what you think:
1. http://
This deals with (1) by giving the field a normal label and moving the question into the right-hand column (similar to what gmail does with their 'Word verification' when you try to register an account). That said, I don't particularly like the smaller input all on its own... so,
2. http://
shows another idea - modifying the question to simply 8 + 8 = and having the input inline there. This looks much simpler to me and is very hard to mistake as a password field ;). I'll see what you guys think.
Note, in the second screenshot, you'll see that I also added the explanation of the random question to the steps above the fields - this seems like a good place to put it (as it is a required step) and it keeps the form clean. Another thought Martin had was to have a small '?' help link next to the question that would pop-up an explanation.
Anyway, let me know what you think.
Brad Crittenden (bac) wrote : | # |
Thanks for the review and suggestions Michael and Martin.
Michael I like your second screenshot and have implemented it. A new screenshot is at http://
Martin Albisetti (beuno) : | # |
Preview Diff
1 | === modified file 'lib/canonical/launchpad/browser/tests/registration.py' |
2 | --- lib/canonical/launchpad/browser/tests/registration.py 2009-08-05 18:52:52 +0000 |
3 | +++ lib/canonical/launchpad/browser/tests/registration.py 2009-10-08 20:25:21 +0000 |
4 | @@ -8,6 +8,7 @@ |
5 | from canonical.launchpad.ftests import logout |
6 | from canonical.launchpad.testing.pages import setupBrowser |
7 | from canonical.launchpad.webapp import canonical_url |
8 | +from lp.testing.registration import set_captcha_answer |
9 | |
10 | |
11 | def start_registration_through_the_web(email): |
12 | @@ -20,6 +21,7 @@ |
13 | browser = setupBrowser() |
14 | browser.open('http://launchpad.dev/+login') |
15 | browser.getControl(name='loginpage_email', index=1).value = email |
16 | + set_captcha_answer(browser) |
17 | browser.getControl('Register').click() |
18 | return browser |
19 | |
20 | |
21 | === modified file 'lib/canonical/launchpad/pagetests/standalone/xx-new-account-redirection-url.txt' |
22 | --- lib/canonical/launchpad/pagetests/standalone/xx-new-account-redirection-url.txt 2009-05-12 01:39:29 +0000 |
23 | +++ lib/canonical/launchpad/pagetests/standalone/xx-new-account-redirection-url.txt 2009-10-08 20:25:21 +0000 |
24 | @@ -39,8 +39,10 @@ |
25 | If she now creates a new account, we'll certainly redirect her back to |
26 | whatever we had stored on the redirection_url hidden field. |
27 | |
28 | + >>> from lp.testing.registration import set_captcha_answer |
29 | >>> anon_browser.getControl('E-mail address:', index=1).value = ( |
30 | ... 'granny@canonical.com') |
31 | + >>> set_captcha_answer(anon_browser) |
32 | >>> anon_browser.getControl('Register').click() |
33 | >>> print anon_browser.contents |
34 | <... |
35 | |
36 | === modified file 'lib/canonical/launchpad/templates/launchpad-login.pt' |
37 | --- lib/canonical/launchpad/templates/launchpad-login.pt 2009-07-17 17:59:07 +0000 |
38 | +++ lib/canonical/launchpad/templates/launchpad-login.pt 2009-10-08 20:25:21 +0000 |
39 | @@ -154,10 +154,11 @@ |
40 | <!-- Preserve extra query parameters as hidden fields. --> |
41 | <tal:replacement tal:replace="structure view/preserve_query" /> |
42 | </form> |
43 | - <p><strong>Note:</strong> You must enable cookies in your browser to log into or register for Launchpad.</p> |
44 | + <p><strong>Note:</strong> You must enable cookies in your browser |
45 | + to log into or register for Launchpad.</p> |
46 | <!-- 5a. didn't try to register last time: --> |
47 | <h1 tal:condition="not: view/registration_error"> |
48 | - Not registered yet? |
49 | + Not registered yet? |
50 | </h1> |
51 | <!-- 5b. just tried to register and failed: --> |
52 | <tal:block condition="view/registration_error"> |
53 | @@ -168,19 +169,24 @@ |
54 | <!-- 5. might want to register: --> |
55 | <p> |
56 | Creating your Launchpad account is easy. Here's what to do:</p> |
57 | - |
58 | + |
59 | <ol class="subordinate"> |
60 | <li>Make sure cookies are enabled in your browser.</li> |
61 | - <li>Enter your e-mail address.</li> |
62 | + <li>Enter your e-mail address and answer our random question |
63 | + so we know that you're human. |
64 | + </li> |
65 | <li>Follow the URL in the confirmation e-mail that Launchpad sends and you're done!</li> |
66 | </ol> |
67 | - |
68 | + |
69 | |
70 | |
71 | <form name="join" method="POST"> |
72 | <input type="hidden" name="redirection_url" |
73 | tal:condition="not: request/form/redirection_url|nothing" |
74 | tal:attributes="value view/getRedirectionURL" /> |
75 | + <input type="hidden" |
76 | + tal:attributes="name view/captcha_hash; |
77 | + value view/get_captcha_hash" /> |
78 | <table> |
79 | <tr> |
80 | <th> |
81 | @@ -194,6 +200,18 @@ |
82 | </td> |
83 | </tr> |
84 | <tr> |
85 | + <th> |
86 | + <label>Random question:</label> |
87 | + </th> |
88 | + <td> |
89 | + <span id="problem" tal:content="view/captcha_problem" /> |
90 | + <input type="text" size="5" |
91 | + id="captcha_submission" value="" |
92 | + tal:attributes="name view/captcha_submission" |
93 | + /> |
94 | + </td> |
95 | + </tr> |
96 | + <tr> |
97 | <th></th> |
98 | <td> |
99 | <input |
100 | |
101 | === modified file 'lib/canonical/launchpad/webapp/login.py' |
102 | --- lib/canonical/launchpad/webapp/login.py 2009-07-31 19:56:48 +0000 |
103 | +++ lib/canonical/launchpad/webapp/login.py 2009-10-08 20:25:21 +0000 |
104 | @@ -8,6 +8,8 @@ |
105 | import cgi |
106 | import urllib |
107 | from datetime import datetime, timedelta |
108 | +import md5 |
109 | +import random |
110 | |
111 | from BeautifulSoup import UnicodeDammit |
112 | |
113 | @@ -18,6 +20,7 @@ |
114 | |
115 | from z3c.ptcompat import ViewPageTemplateFile |
116 | |
117 | +from canonical.cachedproperty import cachedproperty |
118 | from canonical.config import config |
119 | from canonical.launchpad import _ |
120 | from canonical.launchpad.interfaces.account import AccountStatus |
121 | @@ -167,6 +170,8 @@ |
122 | submit_registration = form_prefix + 'submit_registration' |
123 | input_email = form_prefix + 'email' |
124 | input_password = form_prefix + 'password' |
125 | + captcha_submission = form_prefix + 'captcha_submission' |
126 | + captcha_hash = form_prefix + 'captcha_hash' |
127 | |
128 | # Instance variables that represent the state of the form. |
129 | login_error = None |
130 | @@ -292,12 +297,20 @@ |
131 | redirection_url = redirection_url_list[0] |
132 | |
133 | self.email = request.form.get(self.input_email).strip() |
134 | + |
135 | if not valid_email(self.email): |
136 | self.registration_error = ( |
137 | "The email address you provided isn't valid. " |
138 | "Please verify it and try again.") |
139 | return |
140 | |
141 | + # Validate the user is human, more or less. |
142 | + if not self.validateCaptcha(): |
143 | + self.registration_error = ( |
144 | + "The answer to the simple math question was incorrect " |
145 | + "or missing. Please try again.") |
146 | + return |
147 | + |
148 | registered_email = getUtility(IEmailAddressSet).getByEmail(self.email) |
149 | if registered_email is not None: |
150 | person = registered_email.person |
151 | @@ -386,6 +399,40 @@ |
152 | L.append(html % (name, cgi.escape(value, quote=True))) |
153 | return '\n'.join(L) |
154 | |
155 | + def validateCaptcha(self): |
156 | + """Validate the submitted captcha value matches what we expect.""" |
157 | + expected = self.request.form.get(self.captcha_hash) |
158 | + submitted = self.request.form.get(self.captcha_submission) |
159 | + if expected is not None and submitted is not None: |
160 | + return md5.new(submitted).hexdigest() == expected |
161 | + return False |
162 | + |
163 | + @cachedproperty |
164 | + def captcha_answer(self): |
165 | + """Get the answer for the current captcha challenge. |
166 | + |
167 | + With each failed attempt a new challenge will be given. Our answer |
168 | + space is acknowledged to be ridiculously small but is chosen in the |
169 | + interest of ease-of-use. We're not trying to create an iron-clad |
170 | + challenge but only a minimal obstacle to dumb bots. |
171 | + """ |
172 | + return random.randint(10, 20) |
173 | + |
174 | + @property |
175 | + def get_captcha_hash(self): |
176 | + """Get the captcha hash. |
177 | + |
178 | + The hash is the value we put in the form for later comparison. |
179 | + """ |
180 | + return md5.new(str(self.captcha_answer)).hexdigest() |
181 | + |
182 | + @property |
183 | + def captcha_problem(self): |
184 | + """Create the captcha challenge.""" |
185 | + op1 = random.randint(1, self.captcha_answer) |
186 | + op2 = self.captcha_answer - op1 |
187 | + return '%d + %d =' % (op1, op2) |
188 | + |
189 | |
190 | def logInPrincipal(request, principal, email): |
191 | """Log the principal in. Password validation must be done in callsites.""" |
192 | |
193 | === modified file 'lib/lp/registry/stories/foaf/xx-createaccount.txt' |
194 | --- lib/lp/registry/stories/foaf/xx-createaccount.txt 2009-09-11 18:45:31 +0000 |
195 | +++ lib/lp/registry/stories/foaf/xx-createaccount.txt 2009-10-08 20:25:21 +0000 |
196 | @@ -21,12 +21,24 @@ |
197 | The email address you provided isn't valid. Please verify it and try |
198 | again. |
199 | |
200 | -Jane tries again, providing a valid email address and asking to create a new |
201 | -account. |
202 | +Next she enters a valid email address but enters a wrong answer for |
203 | +the incredibly trivial math captcha. She gets an error message. |
204 | |
205 | >>> browser.getControl(name='loginpage_email', index=1).value = ( |
206 | ... 'jane@example.com') |
207 | - >>> browser.getControl('Register').click() |
208 | + >>> browser.getControl(name='loginpage_captcha_submission').value = '-1' |
209 | + >>> browser.getControl('Register').click() |
210 | + >>> print_feedback_messages(browser.contents) |
211 | + The answer to the simple math question was incorrect or missing. |
212 | + Please try again. |
213 | + |
214 | +Jane tries again, providing a the correct captcha answer. Her valid |
215 | +email address from before has been retained in the form. |
216 | + |
217 | + >>> from lp.testing.registration import set_captcha_answer |
218 | + >>> set_captcha_answer(browser) |
219 | + >>> browser.getControl('Register').click() |
220 | + >>> print_feedback_messages(browser.contents) |
221 | >>> print extract_text(find_main_content(browser.contents)) |
222 | Registration mail sent |
223 | Instructions on completing your registration have been sent to |
224 | @@ -36,7 +48,7 @@ |
225 | Launchpad sends Jane an email message containing a token she must use to |
226 | complete the registration process. |
227 | |
228 | - >>> import email, re |
229 | + >>> import email |
230 | >>> from lp.services.mail import stub |
231 | >>> from canonical.launchpad.ftests.logintoken import ( |
232 | ... get_token_url_from_email) |
233 | @@ -91,6 +103,7 @@ |
234 | >>> browser.open('http://launchpad.dev/+login') |
235 | >>> browser.getControl(name='loginpage_email', index=1).value = ( |
236 | ... 'jperson@example.com') |
237 | + >>> set_captcha_answer(browser) |
238 | >>> browser.getControl('Register').click() |
239 | |
240 | >>> from_addr, to_addrs, raw_msg = stub.test_emails.pop() |
241 | |
242 | === modified file 'lib/lp/registry/stories/foaf/xx-reg-with-existing-email.txt' |
243 | --- lib/lp/registry/stories/foaf/xx-reg-with-existing-email.txt 2009-05-12 01:39:29 +0000 |
244 | +++ lib/lp/registry/stories/foaf/xx-reg-with-existing-email.txt 2009-10-08 20:25:21 +0000 |
245 | @@ -15,6 +15,8 @@ |
246 | |
247 | >>> browser.getControl('E-mail address:', index=1).value = ( |
248 | ... 'test@canonical.com') |
249 | + >>> from lp.testing.registration import set_captcha_answer |
250 | + >>> set_captcha_answer(browser) |
251 | >>> browser.getControl('Register').click() |
252 | |
253 | >>> for message in get_feedback_messages(browser.contents): |
254 | @@ -31,6 +33,7 @@ |
255 | on with the registration process. |
256 | |
257 | >>> browser.getControl('E-mail address:', index=1).value = 'mpo@iki.fi' |
258 | + >>> set_captcha_answer(browser) |
259 | >>> browser.getControl('Register').click() |
260 | |
261 | >>> print extract_text(find_tag_by_id(browser.contents, 'address')) |
262 | @@ -82,6 +85,7 @@ |
263 | |
264 | >>> anon_browser.getControl('E-mail address:', index=1).value = ( |
265 | ... 'christian.reis@ubuntulinux.com') |
266 | + >>> set_captcha_answer(anon_browser) |
267 | >>> anon_browser.getControl('Register').click() |
268 | |
269 | >>> from_addr, to_addr, msg = stub.test_emails.pop() |
270 | @@ -143,6 +147,7 @@ |
271 | |
272 | >>> browser.getControl('E-mail address:', index=1).value = ( |
273 | ... 'bad-user@canonical.com') |
274 | + >>> set_captcha_answer(browser) |
275 | >>> browser.getControl('Register').click() |
276 | |
277 | >>> print extract_text(find_main_content(browser.contents).p) |
278 | @@ -179,4 +184,3 @@ |
279 | <div ...>This profile cannot be claimed because the account is suspended. |
280 | Contact a <a href="mailto:feedback@launchpad.net?subject=SU...">Launchpad |
281 | admin</a> about this issue.</div> |
282 | - |
283 | |
284 | === added file 'lib/lp/testing/registration.py' |
285 | --- lib/lp/testing/registration.py 1970-01-01 00:00:00 +0000 |
286 | +++ lib/lp/testing/registration.py 2009-10-08 20:25:21 +0000 |
287 | @@ -0,0 +1,32 @@ |
288 | +# Copyright 2009 Canonical Ltd. This software is licensed under the |
289 | +# GNU Affero General Public License version 3 (see the file LICENSE). |
290 | + |
291 | +"""Helper functions dealing with registration in tests. |
292 | +""" |
293 | +__metaclass__ = type |
294 | + |
295 | +__all__ = [ |
296 | + 'get_captcha_answer', |
297 | + 'set_captcha_answer', |
298 | + ] |
299 | + |
300 | +import re |
301 | + |
302 | + |
303 | +def get_captcha_answer(contents): |
304 | + """Search the browser contents and get the captcha answer.""" |
305 | + expr = re.compile("(\d+ .{1} \d+) =") |
306 | + match = expr.search(contents) |
307 | + if match: |
308 | + question = match.group(1) |
309 | + answer = eval(question) |
310 | + return str(answer) |
311 | + return '' |
312 | + |
313 | + |
314 | +def set_captcha_answer(browser, answer=None): |
315 | + """Given a browser, set the login captcha with the correct answer.""" |
316 | + if answer is None: |
317 | + answer = get_captcha_answer(browser.contents) |
318 | + browser.getControl(name='loginpage_captcha_submission').value = ( |
319 | + answer) |
= Summary =
Bug 341935 addresses the need to do *something* to disrupt the ability for bots to
trick Launchpad into sending registration email to unsuspecting people. It was
agreed to do a simple, text-based math problem captcha as a first step to try to
defeat the bots. While it would be trivial for bots to defeat this captcha (see the
pagetest!) it is our belief that we are not a real target so no one would invest in a
custom solution for use against Launchpad.
== Proposed fix ==
Add a simple captcha into the form.
== Pre-implementation notes ==
Discussions with Barry and Curtis.
== Implementation details ==
The view for the login page is a mess. It's not a LaunchpadFormView so the additions
were done 'by hand'.
In order to make it a low bar the math question is simple addition with the answer
being in the range [10, 20].
Screenshot at http:// people. canonical. com/~bac/ captcha. png
== Tests ==
bin/test xx-createaccoun t.txt
== Demo and Q/A ==
https:/ /launchpad. dev/+login
= Launchpad lint =
Checking for conflicts. and issues in doctests and templates.
Running jslint, xmllint, pyflakes, and pylint.
Using normal rules.
Linting changed files: /launchpad/ templates/ launchpad- login.pt registry/ stories/ foaf/xx- createaccount. txt /launchpad/ webapp/ login.py
lib/canonical
lib/lp/
lib/canonical