Bazaar

Code review comment for lp:~mbp/bzr/497274-http-405

497274-http-405
Merge into bzr.dev

Revision history for this message

Vincent Ladeuil (vila) wrote on 2009-12-21:

>>>>> "martin" == Martin Pool <email address hidden> writes:

Disclaimer: I wrote my patch in urgency without trying to
integrate it with yours, as such I tried to address the missing
parts. The were no tests and no urllib implementation. I stopped
there and didn't explain it as well as I should have, see below
for more comments.

<snip/>

martin> I'll merge that into mine and see how it looks.

That was the idea, I'm fine with whatever end result you come up
with.

<snip/>

>> I doubt users are really interested in the general case,
>> muttering may be enough though.

    martin> They might not be, but only giving the status code is
    martin> not very helpful if there is something mysterious
    martin> going on, like the server being overloaded. We tend
    martin> to get bug reports about this - though just getting
    martin> it into the trace should help us with that.

Yup. I was a bit uncomfortable with unhtml() but not that
much. My main concern is that its a bit brute force and in
presence of errors I like, as a dev, to have access to the raw
data. Users have others expectancies and I don't a clear answer
about what should be presented.

<snip/>

    martin> There are two types of possible duplication: across
    martin> different call paths (get vs post vs others) and
    martin> across different http client implementations. I can
    martin> see how different callers might want different
    martin> interpretations for some codes, but I don't see why
    martin> we would want different interpretations across urllib
    martin> and pycurl, unless it really is something where the
    martin> client library interferes with the result, as it may
    martin> for redirects.

So, thanks for bringing a fresh eye here. Historically I've never
attempt to unify the two implementations very hard based on the
assumption that pycurl will be deprecated (shudder).

But as you point above, both goals are a bit contradictory, you
have to care to leave errors escape the lower layer handling so
that the higher (common) one can work.

<snip/>

martin> So your code has, in handle_response:

    martin> elif code == 405:
    martin> # The server refused to handle the request, the data presumably
    martin> # contains details about the error and not useful data, but we leave
    martin> # that to the higher levels.
    martin> return data

    martin> which strongly looks like you're going to treat the
    martin> html text of the error response as a valid response
    martin> body, which is definitely not right. OK, apparently
    martin> this doesn't necessarily happen because the base
    martin> class send_http_smart_request translates it, but
    martin> there's still some risk of that.

Yup, that's why I said the implementation is incomplete. This was
the less intrusive fix but I was not happy with the risk.

    martin> If you pass around (http_status, body) too much then
    martin> multiple layers need to know how to interpret the
    martin> status code. Without understanding it, they can't
    martin> understand how to use the body data. For insta2nce,
    martin> send_http_smart_request assumes that only 200 means
    martin> success, but that's not quite absolutely true.

    martin> So I'd like to have one reasonably well defined place
    martin> that turns error-like responses into exceptions.
    martin> Then if some code wants to handle them specially, it
    martin> can always catch the exception. I think it's a bit
    martin> bad that send_http_smart_request needs to both check
    martin> the status and also catch exceptions.

Yes.

    martin> I guess I don't properly understand whether you think
    martin> my patch is actually wrong or just needs tests. I
    martin> will try to use your tests with mine and see where we
    martin> get to.

I hope I clarified that: except for 404, I'm fine with a common
error handling, may be I just haven't seen yet a good way to
implement that but I didn't spend much time on it either and I
trust you to find it ;)

And reading the other comments: I didn't test it myself against
google code, I just started with the tests (which I recognize may
look a bit like black magic).

>>>>> "martin" == Martin Pool <mbp@sourcefrog.net> writes:

<snip/>

martin> I'll merge that into mine and see how it looks.

That was the idea, I'm fine with whatever end result you come up
with.

<snip/>

>> I doubt users are really interested in the general case,
    >> muttering may be enough though.

martin> They might not be, but only giving the status code is
    martin> not very helpful if there is something mysterious
    martin> going on, like the server being overloaded.  We tend
    martin> to get bug reports about this - though just getting
    martin> it into the trace should help us with that.

<snip/>

martin> There are two types of possible duplication: across
    martin> different call paths (get vs post vs others) and
    martin> across different http client implementations.  I can
    martin> see how different callers might want different
    martin> interpretations for some codes, but I don't see why
    martin> we would want different interpretations across urllib
    martin> and pycurl, unless it really is something where the
    martin> client library interferes with the result, as it may
    martin> for redirects.

So, thanks for bringing a fresh eye here. Historically I've never
attempt to unify the two implementations very hard based on the
assumption that pycurl will be deprecated (shudder).

But as you point above, both goals are a bit contradictory, you
have to care to leave errors escape the lower layer handling so
that the higher (common) one can work.

<snip/>

martin> So your code has, in handle_response:

martin>     elif code == 405:
    martin>         # The server refused to handle the request, the data presumably
    martin>         # contains details about the error and not useful data, but we leave
    martin>         # that to the higher levels.
    martin>         return data

martin> which strongly looks like you're going to treat the
    martin> html text of the error response as a valid response
    martin> body, which is definitely not right.  OK, apparently
    martin> this doesn't necessarily happen because the base
    martin> class send_http_smart_request translates it, but
    martin> there's still some risk of that.

Yup, that's why I said the implementation is incomplete. This was
the less intrusive fix but I was not happy with the risk.

martin> If you pass around (http_status, body) too much then
    martin> multiple layers need to know how to interpret the
    martin> status code.  Without understanding it, they can't
    martin> understand how to use the body data.  For insta2nce,
    martin> send_http_smart_request assumes that only 200 means
    martin> success, but that's not quite absolutely true.

martin> So I'd like to have one reasonably well defined place
    martin> that turns error-like responses into exceptions.
    martin> Then if some code wants to handle them specially, it
    martin> can always catch the exception.  I think it's a bit
    martin> bad that send_http_smart_request needs to both check
    martin> the status and also catch exceptions.

Yes.

martin> I guess I don't properly understand whether you think
    martin> my patch is actually wrong or just needs tests.  I
    martin> will try to use your tests with mine and see where we
    martin> get to.

And reading the other comments: I didn't test it myself against
google code, I just started with the tests (which I recognize may
look a bit like black magic).

« Back to merge proposal