Merge lp:~james-w/lazr.restfulclient/improve-collection-fetch-optimisation into lp:lazr.restfulclient

Proposed by James Westby
Status: Work in progress
Proposed branch: lp:~james-w/lazr.restfulclient/improve-collection-fetch-optimisation
Merge into: lp:lazr.restfulclient
Diff against target: 39 lines (+12/-10)
1 file modified
src/lazr/restfulclient/resource.py (+12/-10)
To merge this branch: bzr merge lp:~james-w/lazr.restfulclient/improve-collection-fetch-optimisation
Reviewer Review Type Date Requested Status
Graham Binns (community) code Needs Information
Review via email: mp+16841@code.launchpad.net
Revision history for this message
James Westby (james-w) wrote :

Hi,

This change helps with fetching large collections. There is
an optimisation that sets a smaller desired page size when it
looks like a full page will not be needed, but the way it was
coded meant that the first fetch always requested a full page.

The change here means that the optimisation is more widely applied,
but critically for me it means you can actually control the size of
the fetches by looping over slices. Some API calls are liable to
time out, so requesting them in smaller chunks makes them usable,
and that's currently not possible. With this change you can iterate
over slices to control the page size.
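The slice-driven chunking described above can be sketched roughly as follows. This is a hypothetical usage pattern, not code from this branch: `fetch_chunked` and the chunk size are illustrative names, and a plain list stands in for a lazr.restfulclient collection, since both support the same slicing protocol.

```python
# Sketch of iterating a large collection in fixed-size slices so that
# each underlying request stays small. A plain list stands in for a
# remote lazr.restfulclient collection here.
def fetch_chunked(collection, chunk_size=50):
    """Yield entries, requesting at most chunk_size per slice."""
    start = 0
    while True:
        chunk = list(collection[start:start + chunk_size])
        if not chunk:
            break
        for entry in chunk:
            yield entry
        start += chunk_size

if __name__ == "__main__":
    fake_collection = list(range(120))  # stand-in for a remote collection
    assert list(fetch_chunked(fake_collection, 50)) == fake_collection
```

With the change proposed here, each slice request would carry the smaller `ws.size`, keeping individual calls under the timeout.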

Thanks,

James

Revision history for this message
James Westby (james-w) wrote :

Actually this doesn't allow you to limit the page size
all the time. Once you get beyond the first page it
makes a request with no size specified.

In order to do this properly we would need something like a
max-size request variable. Would that be feasible for LP?
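For illustration, the cap being asked about could work like this on the client side. This is a hedged sketch: `max_size` is a hypothetical parameter invented here; only the `ws.size` query variable exists in lazr.restfulclient today.

```python
# Hypothetical helper: choose the ws.size value for the next page
# request so it never exceeds a caller-supplied maximum. max_size is
# invented for illustration and is not a real request variable.
def next_page_size(more_needed, max_size):
    """Return the number of entries to request for the next page."""
    if more_needed > 0:
        return min(more_needed, max_size)
    return max_size

print(next_page_size(200, 50))  # a large remainder is capped at 50
print(next_page_size(10, 50))   # only what is still needed is requested
```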

Thanks,

James

Revision history for this message
Graham Binns (gmb) wrote :

Hi James,

Thanks for this branch. A couple of things:

 1. I don't see any tests for this change. Are there tests that already cover this functionality? If so, how will this change affect them? If not, can you add some?
 2. Formatting multi-line conditionals is always a bit weird. We usually wrap after the first 'and' or 'or'. I've written a patch that fixes this for this branch: http://pastebin.ubuntu.com/558064/

> Actually this doesn't allow you to limit the page size
> all the time. Once you get beyond the first page it
> makes a request with no size specified.
>
> In order to do this properly we would either need a
> max-size request variable, would that be feasible for
> LP?
>
 3. I don't really understand what you mean here (I'm still jetlagged, which doesn't help). Can you explain it further (possibly using words of no more than one syllable)?

review: Needs Information (code)

Unmerged revisions

87. By James Westby

Account for the case where there isn't a first page properly.

86. By James Westby

Make the optimisation in Collection._get_slice apply to more cases.

Collection._get_slice had an optimisation to fetch smaller pages if it
thought that a full page was unneeded. However, it only applied this on
the second and subsequent fetch. This change makes it apply to the first
fetch as well, where it is equally applicable.

Preview Diff

=== modified file 'src/lazr/restfulclient/resource.py'
--- src/lazr/restfulclient/resource.py 2009-12-17 15:03:38 +0000
+++ src/lazr/restfulclient/resource.py 2010-01-05 11:41:18 +0000
@@ -753,6 +753,17 @@
 
         # Iterate over pages until we have the correct number of entries.
         while more_needed > 0 and page_url is not None:
+            if (first_page_size is not None
+                and more_needed > 0 and more_needed < first_page_size):
+                # An optimization: it's likely that we need less than
+                # a full page of entries, because the number we need
+                # is less than the size of the first page we got.
+                # Instead of requesting a full-sized page, we'll
+                # request only the number of entries we think we'll
+                # need. If we're wrong, there's no problem; we'll just
+                # keep looping.
+                page_url = self._with_url_query_variable_set(
+                    page_url, 'ws.size', more_needed)
             representation = simplejson.loads(
                 unicode(self._root._browser.get(page_url)))
             current_page_entries = representation['entries']
@@ -766,16 +777,7 @@
                 break
             if first_page_size is None:
                 first_page_size = len(current_page_entries)
-            if more_needed > 0 and more_needed < first_page_size:
-                # An optimization: it's likely that we need less than
-                # a full page of entries, because the number we need
-                # is less than the size of the first page we got.
-                # Instead of requesting a full-sized page, we'll
-                # request only the number of entries we think we'll
-                # need. If we're wrong, there's no problem; we'll just
-                # keep looping.
-                page_url = self._with_url_query_variable_set(
-                    page_url, 'ws.size', more_needed)
+
 
         if slice.step is not None:
             entry_dicts = entry_dicts[::slice.step]

Subscribers

People subscribed via source and target branches