Merge lp:~jameinel/bzr/2.1-categorize-requests-819604 into lp:bzr/2.1
Status: | Work in progress |
---|---|
Proposed branch: | lp:~jameinel/bzr/2.1-categorize-requests-819604 |
Merge into: | lp:bzr/2.1 |
Prerequisite: | lp:~jameinel/bzr/2.1-client-reconnect-819604 |
Diff against target: |
315 lines (+148/-95) 2 files modified
bzrlib/smart/request.py (+138/-95) bzrlib/tests/test_smart_request.py (+10/-0) |
To merge this branch: | bzr merge lp:~jameinel/bzr/2.1-categorize-requests-819604 |
Related bugs: |
Reviewer | Review Type | Date Requested | Status |
---|---|---|---|
Jelmer Vernooij (community) | code | Approve | |
Martin Pool | Approve | ||
Martin Packman (community) | Approve | ||
Review via email: mp+78708@code.launchpad.net |
Commit message
Bug #819604, categorize all smart Requests as to their safety for retrying them automatically.
Description of the change
1) I analyzed all of the requests that are currently registered as to whether they are safe or not to retry. I ended up with 4 categories:
# safe The request is strictly idempotent, calling it twice results in
# the same result as calling it one time. This includes all read-only
# requests, and write requests like 'put' where the end result is
# identical content.
# unsafe A request which is unsafe means that state is updated in a way that
# replaying that request results in a different state. For example
# 'append' writes more bytes to a given file. If append succeeds, it
# moves the file pointer.
# semi This is a request that isn't strictly idempotent, but doesn't
# result in corruption if it is retried. This is for things like
# 'lock' and 'unlock'. If you call lock, it updates the disk
# structure. If you fail to read the response, you won't be able to
# use the lock, because you don't have the lock token. Calling lock
# again will fail, because the lock is already taken. However, we
# can't tell if the server received our request or not. If it didn't,
# then retrying the request is fine, as it will actually do what we
# want. If it did, we will interrupt the current operation, but we
# are no worse off than interrupting the current operation because of
# a ConnectionReset.
# stream This is a request that takes a stream that cannot be restarted if
# consumed. This request is 'safe' in that if we determine the
# connection is closed before we consume the stream, we can try
# again.
2) The retry code will ultimately only care if something is 'unsafe' or not, so we could simplify this list if we wanted. Or we could extend it to split 'safe' into differentiating between 'read-only' and 'idempotent'.
One interesting potential from this classification, is that we could actually test it. Specifically, for each request, we set up a RemoteRepository in some sort of state. We could then create a test that:
a) for 'read-only', after the request is sent, there is no change on disk.
b) for 'idempotent' there is a change, however, sending the request a second time, there is no change.
c) for 'semi', there is some change on disk, sending the request a second time causes a client-side failure (exception is raised)
d) for 'stream', I'm not really sure
e) for 'unsafe' we could confirm that sending the request once results in a change, and sending it again, results in yet-another change.
However, the test suite setup is non-trivial, since you need specific arguments for every function. Which is an awful lot of scenarios to describe.
3) I decided to add this to registry, rather that putting it on the classes themselves. Either would be fine, as we only need to lookup one request at runtime that actually got ConnectionReset. The main reason I went this route is that I want to make sure a human looks at every request and determines its safety. Adding it as a class attribute tends to leave it as the default from inheritance, which makes it a little too easy to mark something safe when it isn't. And if you get it wrong, this is something that only causes corruption when you get a ConnectionReset at exactly the right time. So really hard to debug and diagnose.
4) This is one of those patches that will definitely need to be updated to merge it to newer versions of bzr, since it touches so many lines of code, and we are very likely to have added new RPCs since bzr 2.1.
I like this, it's good documentation.
Miscellaneous gripe, the word 'safe' is overused in naming things. It requires context to know what it really means, and that tends to be lost when if the terminology spreads to multiple places.