Merge lp:~jamesh/storm/bug-242813 into lp:storm

Proposed by James Henstridge
Status: Merged
Merged at revision: not available
Proposed branch: lp:~jamesh/storm/bug-242813
Merge into: lp:storm
Diff against target: 132 lines
2 files modified
storm/expr.py (+10/-0)
tests/expr.py (+87/-0)
To merge this branch: bzr merge lp:~jamesh/storm/bug-242813
Reviewer Review Type Date Requested Status
Jamu Kakar (community) Approve
Review via email: mp+10508@code.launchpad.net
To post a comment you must log in.
Revision history for this message
James Henstridge (jamesh) wrote :

Flatten nested set expressions as they are built so that we are less likely to hit Python's recursion limit when compiling the expression to SQL.

This relies on the left associativity of all the set expressions. (i.e. that Union(Union(a, b), c) is equivalent to Union(a, b, c)).

Revision history for this message
Jamu Kakar (jkakar) wrote :

Nice and simple, +1.

review: Approve
Revision history for this message
GabrielGrant (gabrielgrant) wrote :

The tests pass, and this fixes a problem I've run into, so I'm eager to see this pushed into mainline.

A few questions:

Should there be a test to explicitly check that heterogeneous set expressions are not flattened? (ie exercising the negative branch of the second if statement)

Is it necessary to skip expressions containing an order_by? If the order_by is the same on each expr, shouldn't the result of the nested and un-nested expressions be equivalent? (I may very well be totally wrong about this, but I can't seem to cook up a counter-example at the moment)

It appears that in addition to addressing https://bugs.edge.launchpad.net/storm/+bug/242813 this patch also should fix some of https://bugs.edge.launchpad.net/storm/+bug/309276 (which is a problem I'm having), however I don't think it addresses it fully. The proposed solution doesn't tackle heterogeneous nested set queries, which will still fail on MySQL. http://bugs.mysql.com/bug.php?id=3347 http://bugs.mysql.com/bug.php?id=2435 and http://bugs.mysql.com/bug.php?id=7360 seem to imply that there is a problem with MySQL that basically makes some such nested queries impossible to express, however simpler ones like (SELECT ...) UNION (SELECT ...) INTERSECT (SELECT ...) are supported in MySQL, but (I believe) can't be achieved through Storm itself.

Could/should there be a more graceful failure for these cases than a MySQL parse error? (Sorry if this is too far off track. Perhaps this discussion would be better continued on the appropriate bug? I've subscribed myself)

Cheers!

Revision history for this message
James Henstridge (jamesh) wrote :

> The tests pass, and this fixes a problem I've run into, so I'm eager to see
> this pushed into mainline.
>
> A few questions:
>
> Should there be a test to explicitly check that heterogeneous set expressions
> are not flattened? (ie exercising the negative branch of the second if
> statement)

Yep. There should be a test for that.

> Is it necessary to skip expressions containing an order_by? If the order_by is
> the same on each expr, shouldn't the result of the nested and un-nested
> expressions be equivalent? (I may very well be totally wrong about this, but I
> can't seem to cook up a counter-example at the moment)

I guess I wrote the code that way to be on the safe side: except in the case of MySQL's broken query parser, this branch is mainly designed to improve performance.

That said, the test can probably go. If the outer set expression has an order, then the inner set expression's ordering doesn't matter. If the outer expression has no order, then there is no guarantee on the ordering so we should be able to ignore it in that case too.

I'll remove the check.

> It appears that in addition to addressing
> https://bugs.edge.launchpad.net/storm/+bug/242813 this patch also should fix
> some of https://bugs.edge.launchpad.net/storm/+bug/309276 (which is a problem
> I'm having), however I don't think it addresses it fully. The proposed
> solution doesn't tackle heterogeneous nested set queries, which will still
> fail on MySQL. http://bugs.mysql.com/bug.php?id=3347
> http://bugs.mysql.com/bug.php?id=2435 and
> http://bugs.mysql.com/bug.php?id=7360 seem to imply that there is a problem
> with MySQL that basically makes some such nested queries impossible to
> express, however simpler ones like (SELECT ...) UNION (SELECT ...) INTERSECT
> (SELECT ...) are supported in MySQL, but (I believe) can't be achieved through
> Storm itself.

It might be possible to fix up that sort of chaining by playing with the operator precedences in the expression compiler.

> Could/should there be a more graceful failure for these cases than a MySQL
> parse error? (Sorry if this is too far off track. Perhaps this discussion
> would be better continued on the appropriate bug? I've subscribed myself)

I don't think it is worth the trouble. Writing code to detect expressions that will cause the MySQL query parser trouble is probably as much work as adjusting our expression generators to make expressions that don't trigger bugs in the MySQL query parser.

lp:~jamesh/storm/bug-242813 updated
331. By James Henstridge

Collapse set expressions with order_by set, and add some extra tests.

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
=== modified file 'storm/expr.py'
--- storm/expr.py 2009-07-31 01:53:08 +0000
+++ storm/expr.py 2009-10-21 09:48:13 +0000
@@ -1099,6 +1099,16 @@
1099 self.order_by = kwargs.get("order_by", Undef)1099 self.order_by = kwargs.get("order_by", Undef)
1100 self.limit = kwargs.get("limit", Undef)1100 self.limit = kwargs.get("limit", Undef)
1101 self.offset = kwargs.get("offset", Undef)1101 self.offset = kwargs.get("offset", Undef)
1102 # If the first expression is of a compatible type, directly
1103 # include its sub expressions.
1104 if len(self.exprs) > 0:
1105 first = self.exprs[0]
1106 if (isinstance(first, self.__class__) and
1107 first.all == self.all and
1108 first.limit is Undef and
1109 first.offset is Undef):
1110 self.exprs = first.exprs + self.exprs[1:]
1111
11021112
1103@compile.when(SetExpr)1113@compile.when(SetExpr)
1104def compile_set_expr(compile, expr, state):1114def compile_set_expr(compile, expr, state):
11051115
=== modified file 'tests/expr.py'
--- tests/expr.py 2009-07-23 22:47:10 +0000
+++ tests/expr.py 2009-10-21 09:48:13 +0000
@@ -275,6 +275,35 @@
275 self.assertEquals(expr.limit, 1)275 self.assertEquals(expr.limit, 1)
276 self.assertEquals(expr.offset, 2)276 self.assertEquals(expr.offset, 2)
277277
278 def test_union_collapse(self):
279 expr = Union(Union(elem1, elem2), elem3)
280 self.assertEquals(expr.exprs, (elem1, elem2, elem3))
281
282 # Only first expression is collapsed.
283 expr = Union(elem1, Union(elem2, elem3))
284 self.assertEquals(expr.exprs[0], elem1)
285 self.assertTrue(isinstance(expr.exprs[1], Union))
286
287 # Don't collapse if all is different.
288 expr = Union(Union(elem1, elem2, all=True), elem3)
289 self.assertTrue(isinstance(expr.exprs[0], Union))
290 expr = Union(Union(elem1, elem2), elem3, all=True)
291 self.assertTrue(isinstance(expr.exprs[0], Union))
292 expr = Union(Union(elem1, elem2, all=True), elem3, all=True)
293 self.assertEquals(expr.exprs, (elem1, elem2, elem3))
294
295 # Don't collapse if limit or offset are set.
296 expr = Union(Union(elem1, elem2, limit=1), elem3)
297 self.assertTrue(isinstance(expr.exprs[0], Union))
298 expr = Union(Union(elem1, elem2, offset=3), elem3)
299 self.assertTrue(isinstance(expr.exprs[0], Union))
300
301 # Don't collapse other set expressions.
302 expr = Union(Except(elem1, elem2), elem3)
303 self.assertTrue(isinstance(expr.exprs[0], Except))
304 expr = Union(Intersect(elem1, elem2), elem3)
305 self.assertTrue(isinstance(expr.exprs[0], Intersect))
306
278 def test_except(self):307 def test_except(self):
279 expr = Except(elem1, elem2, elem3)308 expr = Except(elem1, elem2, elem3)
280 self.assertEquals(expr.exprs, (elem1, elem2, elem3))309 self.assertEquals(expr.exprs, (elem1, elem2, elem3))
@@ -287,6 +316,35 @@
287 self.assertEquals(expr.limit, 1)316 self.assertEquals(expr.limit, 1)
288 self.assertEquals(expr.offset, 2)317 self.assertEquals(expr.offset, 2)
289318
319 def test_except_collapse(self):
320 expr = Except(Except(elem1, elem2), elem3)
321 self.assertEquals(expr.exprs, (elem1, elem2, elem3))
322
323 # Only first expression is collapsed.
324 expr = Except(elem1, Except(elem2, elem3))
325 self.assertEquals(expr.exprs[0], elem1)
326 self.assertTrue(isinstance(expr.exprs[1], Except))
327
328 # Don't collapse if all is different.
329 expr = Except(Except(elem1, elem2, all=True), elem3)
330 self.assertTrue(isinstance(expr.exprs[0], Except))
331 expr = Except(Except(elem1, elem2), elem3, all=True)
332 self.assertTrue(isinstance(expr.exprs[0], Except))
333 expr = Except(Except(elem1, elem2, all=True), elem3, all=True)
334 self.assertEquals(expr.exprs, (elem1, elem2, elem3))
335
336 # Don't collapse if limit or offset are set.
337 expr = Except(Except(elem1, elem2, limit=1), elem3)
338 self.assertTrue(isinstance(expr.exprs[0], Except))
339 expr = Except(Except(elem1, elem2, offset=3), elem3)
340 self.assertTrue(isinstance(expr.exprs[0], Except))
341
342 # Don't collapse other set expressions.
343 expr = Except(Union(elem1, elem2), elem3)
344 self.assertTrue(isinstance(expr.exprs[0], Union))
345 expr = Except(Intersect(elem1, elem2), elem3)
346 self.assertTrue(isinstance(expr.exprs[0], Intersect))
347
290 def test_intersect(self):348 def test_intersect(self):
291 expr = Intersect(elem1, elem2, elem3)349 expr = Intersect(elem1, elem2, elem3)
292 self.assertEquals(expr.exprs, (elem1, elem2, elem3))350 self.assertEquals(expr.exprs, (elem1, elem2, elem3))
@@ -300,6 +358,35 @@
300 self.assertEquals(expr.limit, 1)358 self.assertEquals(expr.limit, 1)
301 self.assertEquals(expr.offset, 2)359 self.assertEquals(expr.offset, 2)
302360
361 def test_intersect_collapse(self):
362 expr = Intersect(Intersect(elem1, elem2), elem3)
363 self.assertEquals(expr.exprs, (elem1, elem2, elem3))
364
365 # Only first expression is collapsed.
366 expr = Intersect(elem1, Intersect(elem2, elem3))
367 self.assertEquals(expr.exprs[0], elem1)
368 self.assertTrue(isinstance(expr.exprs[1], Intersect))
369
370 # Don't collapse if all is different.
371 expr = Intersect(Intersect(elem1, elem2, all=True), elem3)
372 self.assertTrue(isinstance(expr.exprs[0], Intersect))
373 expr = Intersect(Intersect(elem1, elem2), elem3, all=True)
374 self.assertTrue(isinstance(expr.exprs[0], Intersect))
375 expr = Intersect(Intersect(elem1, elem2, all=True), elem3, all=True)
376 self.assertEquals(expr.exprs, (elem1, elem2, elem3))
377
378 # Don't collapse if limit or offset are set.
379 expr = Intersect(Intersect(elem1, elem2, limit=1), elem3)
380 self.assertTrue(isinstance(expr.exprs[0], Intersect))
381 expr = Intersect(Intersect(elem1, elem2, offset=3), elem3)
382 self.assertTrue(isinstance(expr.exprs[0], Intersect))
383
384 # Don't collapse other set expressions.
385 expr = Intersect(Union(elem1, elem2), elem3)
386 self.assertTrue(isinstance(expr.exprs[0], Union))
387 expr = Intersect(Except(elem1, elem2), elem3)
388 self.assertTrue(isinstance(expr.exprs[0], Except))
389
303 def test_auto_tables(self):390 def test_auto_tables(self):
304 expr = AutoTables(elem1, [elem2])391 expr = AutoTables(elem1, [elem2])
305 self.assertEquals(expr.expr, elem1)392 self.assertEquals(expr.expr, elem1)

Subscribers

People subscribed via source and target branches

to status/vote changes: