Bazaar

Code review comment for lp:~gz/bzr/transport_post_connect_hook

transport_post_connect_hook
Merge into bzr.dev

Revision history for this message

Vincent Ladeuil (vila) wrote on 2010-10-16:

>>>>> Martin [gz] <email address hidden> writes:

    > Just having a hook turned out to be insufficient, and results in
    > worse hangs than the old get_transport hack as can be seen from
    > the nearly four hour babune runtime last night:
    > <http://babune.ladeuil.net:24842/job/selftest-subset-windows/11/>

    > The problem is that RemoteTransport classes don't use
    > _set_connection and generally behave rather differently. Throwing
    > in a hook point in __init__ when a new medium is built gets us to
    > a leak-free 40 minute runtime:

> <http://babune.ladeuil.net:24842/job/selftest-subset-windows/13/>

Ok, so this validates the approach of calling disconnect for all
transports that have been able to connect to their server and ensures
all code paths are covered. As such, I'm tempted to accept the patch as
is waiting for a better solution in a followon.

From a design point of view, I think this outlines the divergence
between the smart transport and the others and came from the time where
we implemented connection sharing.

The smart transport has a medium object that implements
_ensure_connection() and disconnect() whereas the other transports use a
_SharedConnection object which is used a data container by the
transport, but the transport object implements connect_xxx() and
disconnect() while calling _set_connection() when the connection is
established (including reconnections).

So while the post_connect hook can be implemented for the transports
objects, it can't be properly implemented for smart transports where the
transport object is not available at the medium level when the
connection occur (in theory we could pass it done but I'm worried about
ref cycles there).

This hints as connection hook instead of a transport hook but this
doesn't play well with the transport objects who consider the connection
as an opaque attribute so far.

    > However the semantics are not quite correct there. Even if we
    > don't have to worry about remote reconnections (do we?) the
    > post_connect hook is happening before the real connection, and
    > potentially twice in the RemoteHTTPTransport case. Combined with
    > the existing confusion over how many times disconnect gets called,

I think the confusion comes the fact that disconnect() should be
implemented at the transport level since it is defined as closing the
connection even if there are other transports sharing this connection
(as opposed to closing the connection when the *last* transport using
the connection requires it).

In the test suite, all transports sharing a connection calls
disconnect() to ensure we don't get leaks but that's the expected
behaviour.

> it suggests this hook probably isn't sane enough to be generally
> useful.

Indeed, it lies for the smart transports as it's called *before* the
connection occurs.

> We do want the leak problem fixed asap though, so if anyone has
> any clever ideas...

One alternative would be to add a 'created' hook for transports which,
for the tests, will also call disconnect(). From your hack_transport
branch we know that calling disconnect() for all created transports is
enough to fix the leaks even if a bit agressive. This will replace the
'post_connect' hook for the smart transport.

The test can then call disconnect for both this 'created' hook for the
smart transports and the 'post_connect' hook for the other transports.
Calling the 'post_connect' hook for the other transports is still the
right thing to do since it needs to be called (and is in this proposal)
for the reconnections. I don't think smart transports handle
reconnections which we may want to fix too (but is out of scope for this
proposal).

This will address the test needs, but still leave a bug in the
'post_connect' hook for the smart transports.

I'd like to have spiv advice on how this could be adressed for the
smart medium/transport relationship and whether there is a way to
better unify the connection handling for *all* transports though.

There is also the client/medium disctinction in this area which is still
unclear to me but that's another subject.

>>>>> Martin [gz] <gzlist@googlemail.com> writes:

> <http://babune.ladeuil.net:24842/job/selftest-subset-windows/13/>

From a design point of view, I think this outlines the divergence
between the smart transport and the others and came from the time where
we implemented connection sharing.

This hints as connection hook instead of a transport hook but this
doesn't play well with the transport objects who consider the connection
as an opaque attribute so far.

In the test suite, all transports sharing a connection calls
disconnect() to ensure we don't get leaks but that's the expected
behaviour.

> it suggests this hook probably isn't sane enough to be generally
    > useful.

Indeed, it lies for the smart transports as it's called *before* the
connection occurs.

> We do want the leak problem fixed asap though, so if anyone has
    > any clever ideas...

This will address the test needs, but still leave a bug in the
'post_connect' hook for the smart transports.

I'd like to have spiv advice on how this could be adressed for the
smart medium/transport  relationship and whether there is a way to
better unify the connection handling for *all* transports though.

There is also the client/medium disctinction in this area which is still
unclear to me but that's another subject.

« Back to merge proposal