Merge lp:~jelmer/bzr/repo-size into lp:bzr

Proposed by Jelmer Vernooij
Status: Rejected
Rejected by: Jelmer Vernooij
Proposed branch: lp:~jelmer/bzr/repo-size
Merge into: lp:bzr
Diff against target: 136 lines (+55/-1)
7 files modified
NEWS (+3/-0)
bzrlib/remote.py (+10/-0)
bzrlib/repository.py (+6/-1)
bzrlib/smart/repository.py (+13/-0)
bzrlib/smart/request.py (+3/-0)
bzrlib/tests/per_repository/test_statistics.py (+10/-0)
bzrlib/tests/test_smart.py (+10/-0)
To merge this branch: bzr merge lp:~jelmer/bzr/repo-size
Reviewer Review Type Date Requested Status
Robert Collins (community) Needs Fixing
Review via email: mp+16582@code.launchpad.net
To post a comment you must log in.
Revision history for this message
Jelmer Vernooij (jelmer) wrote :

This patch adds a custom implementation of Repository.__len__ that
returns the number of revisions in the repository.

This is mainly convenient for foreign branch plugins, which do not
provide Repository.revisions. Currently Repository.gather_stats() relies
on len(Repository.revisions), so foreign branch plugins have to override
it.

Cheers,

Jelmer

Revision history for this message
Robert Collins (lifeless) wrote :

On Fri, 2009-12-25 at 15:36 +0000, Jelmer Vernooij wrote:
> Jelmer Vernooij has proposed merging lp:~jelmer/bzr/repo-size into lp:bzr.
>
> Requested reviews:
> bzr-core (bzr-core)
> Related bugs:
> #123653 "bzr info" depends on private Repository._revision_store
> https://bugs.launchpad.net/bugs/123653
>
>
> This patch adds a custom implementation of Repository.__len__ that
> returns the number of revisions in the repository.
>
> This is mainly convenient for foreign branch plugins, which do not
> provide Repository.revisions. Currently Repository.gather_stats() relies
> on len(Repository.revisions), so foreign branch plugins have to override
> it.

Generally we avoid giving objects that are not dicts or sequences a
__len__, and I think its awfully prone to confusion here too, so we
should not do that. I thought that this had been written down somewhere
- you may wish to briefly look for it in the dev docs and if not found
add a note to this effect: only make things look like containers if they
feel like them too. (that is that you won't be surprised as a user).

Rather, I suggest you put the revision count into the output of
gather_stats: its not guaranteed to be cheap for all repositories, so
there is every reason to make it optional.

 review needs-fixing

review: Needs Fixing
Revision history for this message
Jelmer Vernooij (jelmer) wrote :

On Fri, 2009-12-25 at 19:21 +0000, Robert Collins wrote:
> Review: Needs Fixing
> On Fri, 2009-12-25 at 15:36 +0000, Jelmer Vernooij wrote:
> > Jelmer Vernooij has proposed merging lp:~jelmer/bzr/repo-size into lp:bzr.
> >
> > Requested reviews:
> > bzr-core (bzr-core)
> > Related bugs:
> > #123653 "bzr info" depends on private Repository._revision_store
> > https://bugs.launchpad.net/bugs/123653
> >
> >
> > This patch adds a custom implementation of Repository.__len__ that
> > returns the number of revisions in the repository.
> >
> > This is mainly convenient for foreign branch plugins, which do not
> > provide Repository.revisions. Currently Repository.gather_stats() relies
> > on len(Repository.revisions), so foreign branch plugins have to override
> > it.
>
> Generally we avoid giving objects that are not dicts or sequences a
> __len__, and I think its awfully prone to confusion here too, so we
> should not do that. I thought that this had been written down somewhere
> - you may wish to briefly look for it in the dev docs and if not found
> add a note to this effect: only make things look like containers if they
> feel like them too. (that is that you won't be surprised as a user).
>
> Rather, I suggest you put the revision count into the output of
> gather_stats: its not guaranteed to be cheap for all repositories, so
> there is every reason to make it optional.
As I mentioned in my merge request, it's already part of gather_stats()
but I'd like to have it in a separate function so I can override it for
subclasses. Would it be acceptable if I just renamed __len__ to
_get_revision_count() ?

Cheers,

Jelmer

Revision history for this message
Robert Collins (lifeless) wrote :

On Sat, 2009-12-26 at 00:51 +0000, Jelmer Vernooij wrote:

> > Generally we avoid giving objects that are not dicts or sequences a
> > __len__, and I think its awfully prone to confusion here too, so we
> > should not do that. I thought that this had been written down somewhere
> > - you may wish to briefly look for it in the dev docs and if not found
> > add a note to this effect: only make things look like containers if they
> > feel like them too. (that is that you won't be surprised as a user).
> >
> > Rather, I suggest you put the revision count into the output of
> > gather_stats: its not guaranteed to be cheap for all repositories, so
> > there is every reason to make it optional.
> As I mentioned in my merge request, it's already part of gather_stats()
> but I'd like to have it in a separate function so I can override it for
> subclasses. Would it be acceptable if I just renamed __len__ to
> _get_revision_count() ?

Or you could just delete printing the number of revisions. I don't think
a method on repository is a particularly good idea; if anything a method
on repository.revisions. However, gather_stats is *intended* to be
overridden, trying to avoid overriding it is odd to me :).

Another alternative would be to make the check for whether to attempt to
calculate the revision count more sophisticated.

-Rob

Revision history for this message
Jelmer Vernooij (jelmer) wrote :

Le samedi 26 décembre 2009 à 19:51 +0000, Robert Collins a écrit :
> On Sat, 2009-12-26 at 00:51 +0000, Jelmer Vernooij wrote:
> > > Generally we avoid giving objects that are not dicts or sequences a
> > > __len__, and I think its awfully prone to confusion here too, so we
> > > should not do that. I thought that this had been written down somewhere
> > > - you may wish to briefly look for it in the dev docs and if not found
> > > add a note to this effect: only make things look like containers if they
> > > feel like them too. (that is that you won't be surprised as a user).
> > >
> > > Rather, I suggest you put the revision count into the output of
> > > gather_stats: its not guaranteed to be cheap for all repositories, so
> > > there is every reason to make it optional.
> > As I mentioned in my merge request, it's already part of gather_stats()
> > but I'd like to have it in a separate function so I can override it for
> > subclasses. Would it be acceptable if I just renamed __len__ to
> > _get_revision_count() ?
> Or you could just delete printing the number of revisions. I don't think
> a method on repository is a particularly good idea; if anything a method
> on repository.revisions.
Except the foreign repository implementations don't necessarily
have .revisions set to something sane.

> However, gather_stats is *intended* to be
> overridden, trying to avoid overriding it is odd to me :).
>
> Another alternative would be to make the check for whether to attempt to
> calculate the revision count more sophisticated.
So, the reason I am trying to avoid overriding it is because it is quite
a big function and I would like to avoid duplicating all of that code in
all of the foreign branch plugins.

I think printing the number of revisions in "bzr info" is actually quite
useful, I've used it on several occasions.

Cheers,

Jelmer

Revision history for this message
Martin Pool (mbp) wrote :

2009/12/26 Robert Collins <email address hidden>:
> Generally we avoid giving objects that are not dicts or sequences a
> __len__, and I think its awfully prone to confusion here too, so we
> should not do that. I thought that this had been written down somewhere
> - you may wish to briefly look for it in the dev docs and if not found
> add a note to this effect: only make things look like containers if they
> feel like them too. (that is that you won't be surprised as a user).

+1

--
Martin <http://launchpad.net/~mbp/>

Revision history for this message
Robert Collins (lifeless) wrote :

> > However, gather_stats is *intended* to be
> > overridden, trying to avoid overriding it is odd to me :).
> >
> > Another alternative would be to make the check for whether to attempt to
> > calculate the revision count more sophisticated.
> So, the reason I am trying to avoid overriding it is because it is quite
> a big function and I would like to avoid duplicating all of that code in
> all of the foreign branch plugins.
>
> I think printing the number of revisions in "bzr info" is actually quite
> useful, I've used it on several occasions.

So, its expensive for you because there isn't a size/len/area style function on .revisions? How about we add one there?

Revision history for this message
Jelmer Vernooij (jelmer) wrote :

Well, we don't have .revisions in the case of e.g. subversion, Repository.revisions is just set to None.

Revision history for this message
Robert Collins (lifeless) wrote :

I'm basically very hesitant to be adding more methods to Repository as a whole. Its overly big (still).

Revision history for this message
Jelmer Vernooij (jelmer) wrote :

> I'm basically very hesitant to be adding more methods to Repository as a
> whole. Its overly big (still).
Darn, that's a good point. I'll look at alternatives.

Revision history for this message
Vincent Ladeuil (vila) wrote :

>>>>> "Jelmer" == Jelmer Vernooij <email address hidden> writes:

    Jelmer> Well, we don't have .revisions in the case of e.g. subversion, Repository.revisions is just set to None.

Set it to [] instead ;-p

    Vincent

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'NEWS'
2--- NEWS 2009-12-25 13:33:03 +0000
3+++ NEWS 2009-12-25 15:36:19 +0000
4@@ -63,6 +63,9 @@
5 CamelCase. For the features that were more likely to be used, we added a
6 deprecation thunk, but not all. (John Arbash Meinel)
7
8+* ``Repository.__len__`` will now return the number of revisions in
9+ the repository. (#123653, Jelmer Vernooij)
10+
11 * ``WorkingTree.update`` implementations must now accept a ``revision``
12 parameter.
13
14
15=== modified file 'bzrlib/remote.py'
16--- bzrlib/remote.py 2009-12-15 20:32:34 +0000
17+++ bzrlib/remote.py 2009-12-25 15:36:19 +0000
18@@ -886,6 +886,16 @@
19 parents_provider = self._make_parents_provider(other_repository)
20 return graph.Graph(parents_provider)
21
22+ def __len__(self):
23+ """See Repository.__len__."""
24+ path = self.bzrdir._path_for_remote_call(self._client)
25+ response_tuple, response_handler = self._call_expecting_body(
26+ 'Repository.__len__', path)
27+ if response_tuple[0] != 'ok':
28+ raise errors.UnexpectedSmartServerResponse(response_tuple)
29+ body = response_handler.read_body_bytes()
30+ return int(body)
31+
32 def gather_stats(self, revid=None, committers=None):
33 """See Repository.gather_stats()."""
34 path = self.bzrdir._path_for_remote_call(self._client)
35
36=== modified file 'bzrlib/repository.py'
37--- bzrlib/repository.py 2009-12-17 10:01:25 +0000
38+++ bzrlib/repository.py 2009-12-25 15:36:19 +0000
39@@ -926,6 +926,11 @@
40 r'.* revision="(?P<revision_id>[^"]+)"'
41 )
42
43+ @needs_read_lock
44+ def __len__(self):
45+ """Determine the number of revisions in the repository."""
46+ return len(self.revisions.keys())
47+
48 def abort_write_group(self, suppress_errors=False):
49 """Commit the contents accrued within the current write group.
50
51@@ -1471,7 +1476,7 @@
52 # XXX: do we want to __define len__() ?
53 # Maybe the versionedfiles object should provide a different
54 # method to get the number of keys.
55- result['revisions'] = len(self.revisions.keys())
56+ result['revisions'] = len(self)
57 # result['size'] = t
58 return result
59
60
61=== modified file 'bzrlib/smart/repository.py'
62--- bzrlib/smart/repository.py 2009-09-03 00:33:35 +0000
63+++ bzrlib/smart/repository.py 2009-12-25 15:36:19 +0000
64@@ -328,6 +328,19 @@
65 return SuccessfulSmartServerResponse(('no', ))
66
67
68+class SmartServerRepositorySize(SmartServerRepositoryRequest):
69+
70+ def do_repository_request(self, repository):
71+ """Return the number of revisions in a repository.
72+
73+ :param repository: The repository to query in.
74+
75+ :return: A SmartServerResponse ('ok',), a encoded body containing
76+ the number of revisions.
77+ """
78+ return SuccessfulSmartServerResponse(('ok', ), "%d" % len(repository))
79+
80+
81 class SmartServerRepositoryGatherStats(SmartServerRepositoryRequest):
82
83 def do_repository_request(self, repository, revid, committers):
84
85=== modified file 'bzrlib/smart/request.py'
86--- bzrlib/smart/request.py 2009-12-21 17:00:29 +0000
87+++ bzrlib/smart/request.py 2009-12-25 15:36:19 +0000
88@@ -592,6 +592,9 @@
89 request_handlers.register_lazy('Repository.gather_stats',
90 'bzrlib.smart.repository',
91 'SmartServerRepositoryGatherStats')
92+request_handlers.register_lazy('Repository.__len__',
93+ 'bzrlib.smart.repository',
94+ 'SmartServerRepositorySize')
95 request_handlers.register_lazy('Repository.get_parent_map',
96 'bzrlib.smart.repository',
97 'SmartServerRepositoryGetParentMap')
98
99=== modified file 'bzrlib/tests/per_repository/test_statistics.py'
100--- bzrlib/tests/per_repository/test_statistics.py 2009-03-23 14:59:43 +0000
101+++ bzrlib/tests/per_repository/test_statistics.py 2009-12-25 15:36:19 +0000
102@@ -62,3 +62,13 @@
103 'revisions': 0
104 },
105 stats)
106+
107+ def test_len(self):
108+ tree = self.make_branch_and_memory_tree('.')
109+ self.assertEquals(0, len(tree.branch.repository))
110+ tree.lock_write()
111+ tree.add('')
112+ rev1 = tree.commit('first post', committer='person 1',
113+ timestamp=1170491381, timezone=0)
114+ tree.unlock()
115+ self.assertEquals(1, len(tree.branch.repository))
116
117=== modified file 'bzrlib/tests/test_smart.py'
118--- bzrlib/tests/test_smart.py 2009-10-29 00:16:50 +0000
119+++ bzrlib/tests/test_smart.py 2009-12-25 15:36:19 +0000
120@@ -1472,6 +1472,16 @@
121 rev_id_utf8, 'yes'))
122
123
124+class TestSmartServerRepositorySize(tests.TestCaseWithMemoryTransport):
125+
126+ def test_empty(self):
127+ backing = self.get_transport()
128+ request = smart.repository.SmartServerRepositorySize(backing)
129+ repository = self.make_repository('.')
130+ self.assertEqual(SmartServerResponse(('ok', ), "0"),
131+ request.execute(''))
132+
133+
134 class TestSmartServerRepositoryIsShared(tests.TestCaseWithMemoryTransport):
135
136 def test_is_shared(self):