Merge lp:~vila/bzr/405745-http-hangs into lp:bzr

Proposed by Vincent Ladeuil
Status: Merged
Merged at revision: 5396
Proposed branch: lp:~vila/bzr/405745-http-hangs
Merge into: lp:bzr
Diff against target: 254 lines (+99/-68)
3 files modified
NEWS (+3/-0)
bzrlib/tests/http_server.py (+93/-66)
bzrlib/tests/test_http.py (+3/-2)
To merge this branch: bzr merge lp:~vila/bzr/405745-http-hangs
Reviewer Review Type Date Requested Status
Andrew Bennetts Needs Fixing
Review via email: mp+13050@code.launchpad.net

This proposal supersedes a proposal from 2009-10-07.

To post a comment you must log in.
Revision history for this message
Vincent Ladeuil (vila) wrote : Posted in a previous version of this proposal

Thanks to Cris Boylan that keep nagging me on the problem, this patch
addresses the http tests hanging on AIX.

The crux of the issue is that shutdown() can be implemented differently on various systems
and we were trying to handle the socket from two different threads, which is frowned upon on AIX.

This patch stops the server more "manually" but more importantly more cleanly by
connecting to the server (allowing it to wake up from the listen() call and terminate
cleanly).

The server thread can now be joined without trouble reducing the leaking tests from ~300 to ~200
(when running bzr selftest -s bt.test_http).

Note that I didn't add any test here.
I tried to modify test_http.TestHTTPServer.test_server_start_and_stop to check
that no threads were leaked, but that just pointed out that the threading functions
active_count() and enumerate() are very happy to disagree about the number of active threads
even when called one after the other (I can't demonstrate it but I experimented it in ~5-10%
of my test runs).

Other submissions will follow shortly to address more leaks (first http/1.1, then ftp).

Revision history for this message
Andrew Bennetts (spiv) wrote : Posted in a previous version of this proposal

Looks good to me. Nice to see large scary comments get removed!

review: Approve
Revision history for this message
Vincent Ladeuil (vila) wrote :
Download full text (8.1 KiB)

I had to merge some refactoring done for bug #392127 so that the 2.4 band-aids stay smaller.

I changed the server.tearDown() into server.shutdown() as that better matches the SocketServer.py methods names.

I'm not particularly happy with having to override server_bind() twice, but that will be solved
with #392127 where the inheritance scheme is a bit different.

I may need to make connect_socket() a bit more reusable but I also plan to make more
test_server bits reusable so that could wait too.

Here is the differential diff (but it's ~200 lines while the full one is ~250):
=== modified file 'bzrlib/tests/http_server.py'
--- bzrlib/tests/http_server.py 2009-10-07 10:24:04 +0000
+++ bzrlib/tests/http_server.py 2009-10-08 08:25:53 +0000
@@ -317,27 +317,70 @@
         # the tests cases.
         self.test_case_server = test_case_server
         self._home_dir = test_case_server._home_dir
-
- def tearDown(self):
- """Called to clean-up the server.
-
- Since the server may be (surely is, even) in a blocking listen, we
- shutdown its socket before closing it.
- """
- # The server is listening for a last connection, let's give it:
- try:
- fake_conn = socket.create_connection(self.server_address)
- fake_conn.close()
- except socket.error, e:
- # But ignore connection errors as the point is to unblock the
- # server thread, it may happen that it's not blocked or even not
- # started (when something went wrong during test case setup
- # leading to self.setUp() *not* being called but self.tearDown()
- # still being called)
- pass
-
-
-class TestingHTTPServer(SocketServer.TCPServer, TestingHTTPServerMixin):
+ self.serving = False
+ self.is_shut_down = threading.Event()
+
+ def serve(self):
+ self.serving = True
+ self.is_shut_down.clear()
+ while self.serving:
+ try:
+ # Really a connection but the python framework is generic and
+ # call them requests
+ self.handle_request()
+ except socket.timeout:
+ pass
+ except (socket.error, select.error), e:
+ if e[0] == errno.EBADF:
+ # Starting with python-2.6, handle_request may raise socket
+ # or select exceptions when the server is shut down as we
+ # do.
+ pass
+ else:
+ raise
+ # Let's close the listening socket
+ self.server_close()
+ self.is_shut_down.set()
+
+ def connect_socket(self):
+ msg = "getaddrinfo returns an empty list"
+ for res in socket.getaddrinfo(*self.server_address):
+ af, socktype, proto, canonname, sa = res
+ sock = None
+ try:
+ sock = socket.socket(af, socktype, proto)
+ sock.connect(sa)
+ return sock
+
+ except socket.error, msg:
+ if sock is not None:
+ sock.close()
+ raise socket.error, msg
+
+ def shutdown(self):...

Read more...

Revision history for this message
Andrew Bennetts (spiv) wrote :

+ def connect_socket(self):
+ msg = "getaddrinfo returns an empty list"
+ for res in socket.getaddrinfo(*self.server_address):
[...]
+ except socket.error, msg:
+ if sock is not None:
+ sock.close()
+ raise socket.error, msg
+

"except socket.error, msg" is bogus, especially when "msg" was a string earlier. It's also a bit arbitrary to that if no addresses work that this will raise an error from the last address, rather than the first, or some sort of aggregation.

I'd probably write this as:
    err = socket.error("getaddrinfo returned an empty list")
    for ...
        except socket.error, err:
            # 'err' is now the most recent error
            if sock is not None:
                sock.close()
    raise err

Otherwise this seems ok. bb:tweak.

review: Needs Fixing

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
=== modified file 'NEWS'
--- NEWS 2009-11-30 04:49:31 +0000
+++ NEWS 2009-11-30 09:20:31 +0000
@@ -552,6 +552,9 @@
552Testing552Testing
553*******553*******
554554
555* HTTP test servers will leak less threads (and sockets) and will not hang on
556 AIX anymore. (Vincent Ladeuil, #405745)
557
555* Passing ``--lsprof-tests -v`` to bzr selftest will cause lsprof output to558* Passing ``--lsprof-tests -v`` to bzr selftest will cause lsprof output to
556 be output for every test. Note that this is very verbose! (Robert Collins)559 be output for every test. Note that this is very verbose! (Robert Collins)
557560
558561
=== modified file 'bzrlib/tests/http_server.py'
--- bzrlib/tests/http_server.py 2009-11-11 07:23:59 +0000
+++ bzrlib/tests/http_server.py 2009-11-30 09:20:31 +0000
@@ -317,44 +317,71 @@
317 # the tests cases.317 # the tests cases.
318 self.test_case_server = test_case_server318 self.test_case_server = test_case_server
319 self._home_dir = test_case_server._home_dir319 self._home_dir = test_case_server._home_dir
320320 self.serving = False
321 def tearDown(self):321 self.is_shut_down = threading.Event()
322 """Called to clean-up the server.322
323323 def serve(self):
324 Since the server may be (surely is, even) in a blocking listen, we324 self.serving = True
325 shutdown its socket before closing it.325 self.is_shut_down.clear()
326 """326 while self.serving:
327 # Note that is this executed as part of the implicit tear down in the327 try:
328 # main thread while the server runs in its own thread. The clean way328 # Really a connection but the python framework is generic and
329 # to tear down the server is to instruct him to stop accepting329 # call them requests
330 # connections and wait for the current connection(s) to end330 self.handle_request()
331 # naturally. To end the connection naturally, the http transports331 except socket.timeout:
332 # should close their socket when they do not need to talk to the332 pass
333 # server anymore. This happens naturally during the garbage collection333 except (socket.error, select.error), e:
334 # phase of the test transport objetcs (the server clients), so we334 if e[0] == errno.EBADF:
335 # don't have to worry about them. So, for the server, we must tear335 # Starting with python-2.6, handle_request may raise socket
336 # down here, from the main thread, when the test have ended. Note336 # or select exceptions when the server is shut down as we
337 # that since the server is in a blocking operation and since python337 # do.
338 # use select internally, shutting down the socket is reliable and338 pass
339 # relatively clean.339 else:
340 try:340 raise
341 self.socket.shutdown(socket.SHUT_RDWR)341 # Let's close the listening socket
342 except socket.error, e:342 self.server_close()
343 # WSAENOTCONN (10057) 'Socket is not connected' is harmless on343 self.is_shut_down.set()
344 # windows (occurs before the first connection attempt344
345 # vila--20071230)345 def connect_socket(self):
346346 err = socket.error('getaddrinfo returns an empty list')
347 # 'Socket is not connected' can also occur on OSX, with a347 for res in socket.getaddrinfo(*self.server_address):
348 # "regular" ENOTCONN (when something went wrong during test case348 af, socktype, proto, canonname, sa = res
349 # setup leading to self.setUp() *not* being called but349 sock = None
350 # self.tearDown() still being called -- vila20081106350 try:
351 if not len(e.args) or e.args[0] not in (errno.ENOTCONN, 10057):351 sock = socket.socket(af, socktype, proto)
352 raise352 sock.connect(sa)
353 # Let the server properly close the socket353 return sock
354 self.server_close()354
355355 except socket.error, err:
356356 # 'err' is now the most recent error
357class TestingHTTPServer(SocketServer.TCPServer, TestingHTTPServerMixin):357 if sock is not None:
358 sock.close()
359 raise err
360
361 def shutdown(self):
362 """Stops the serve() loop.
363
364 Blocks until the loop has finished. This must be called while serve()
365 is running in another thread, or it will deadlock.
366 """
367 if not self.serving:
368 return
369 self.serving = False
370 # The server is listening for a last connection, let's give it:
371 try:
372 fake_conn = self.connect_socket()
373 fake_conn.close()
374 except socket.error, e:
375 # But ignore connection errors as the point is to unblock the
376 # server thread, it may happen that it's not blocked or even not
377 # started (when something went wrong during test case setup
378 # leading to self.setUp() *not* being called but self.tearDown()
379 # still being called)
380 pass
381 self.is_shut_down.wait()
382
383
384class TestingHTTPServer(TestingHTTPServerMixin, SocketServer.TCPServer):
358385
359 def __init__(self, server_address, request_handler_class,386 def __init__(self, server_address, request_handler_class,
360 test_case_server):387 test_case_server):
@@ -362,9 +389,15 @@
362 SocketServer.TCPServer.__init__(self, server_address,389 SocketServer.TCPServer.__init__(self, server_address,
363 request_handler_class)390 request_handler_class)
364391
365392 def server_bind(self):
366class TestingThreadingHTTPServer(SocketServer.ThreadingTCPServer,393 SocketServer.TCPServer.server_bind(self)
367 TestingHTTPServerMixin):394 if sys.version < (2, 5):
395 self.server_address = self.socket.getsockname()
396
397
398class TestingThreadingHTTPServer(TestingHTTPServerMixin,
399 SocketServer.ThreadingTCPServer,
400 ):
368 """A threading HTTP test server for HTTP 1.1.401 """A threading HTTP test server for HTTP 1.1.
369402
370 Since tests can initiate several concurrent connections to the same http403 Since tests can initiate several concurrent connections to the same http
@@ -399,6 +432,11 @@
399 else:432 else:
400 raise433 raise
401434
435 def server_bind(self):
436 SocketServer.ThreadingTCPServer.server_bind(self)
437 if sys.version < (2, 5):
438 self.server_address = self.socket.getsockname()
439
402440
403class HttpServer(transport.Server):441class HttpServer(transport.Server):
404 """A test server for http transports.442 """A test server for http transports.
@@ -462,18 +500,22 @@
462 raise httplib.UnknownProtocol(proto_vers)500 raise httplib.UnknownProtocol(proto_vers)
463 else:501 else:
464 self._httpd = self.create_httpd(serv_cls, rhandler)502 self._httpd = self.create_httpd(serv_cls, rhandler)
503<<<<<<< TREE
465 self.host, self.port = self._httpd.socket.getsockname()504 self.host, self.port = self._httpd.socket.getsockname()
505=======
506 # Ensure we get the right port
507 host, self.port = self._httpd.server_address
508>>>>>>> MERGE-SOURCE
466 return self._httpd509 return self._httpd
467510
468 def _http_start(self):511 def _http_start(self):
469 """Server thread main entry point. """512 """Server thread main entry point. """
470 self._http_running = False513 server = None
471 try:514 try:
472 try:515 try:
473 httpd = self._get_httpd()516 server = self._get_httpd()
474 self._http_base_url = '%s://%s:%s/' % (self._url_protocol,517 self._http_base_url = '%s://%s:%s/' % (self._url_protocol,
475 self.host, self.port)518 self.host, self.port)
476 self._http_running = True
477 except:519 except:
478 # Whatever goes wrong, we save the exception for the main520 # Whatever goes wrong, we save the exception for the main
479 # thread. Note that since we are running in a thread, no signal521 # thread. Note that since we are running in a thread, no signal
@@ -486,21 +528,8 @@
486528
487 # From now on, exceptions are taken care of by the529 # From now on, exceptions are taken care of by the
488 # SocketServer.BaseServer or the request handler.530 # SocketServer.BaseServer or the request handler.
489 while self._http_running:531 if server is not None:
490 try:532 server.serve()
491 # Really an HTTP connection but the python framework is generic
492 # and call them requests
493 httpd.handle_request()
494 except socket.timeout:
495 pass
496 except (socket.error, select.error), e:
497 if e[0] == errno.EBADF:
498 # Starting with python-2.6, handle_request may raise socket
499 # or select exceptions when the server is shut down (as we
500 # do).
501 pass
502 else:
503 raise
504533
505 def _get_remote_url(self, path):534 def _get_remote_url(self, path):
506 path_parts = path.split(os.path.sep)535 path_parts = path.split(os.path.sep)
@@ -527,10 +556,10 @@
527 """556 """
528 # XXX: TODO: make the server back onto vfs_server rather than local557 # XXX: TODO: make the server back onto vfs_server rather than local
529 # disk.558 # disk.
530 if not (backing_transport_server is None or \559 if not (backing_transport_server is None
531 isinstance(backing_transport_server, local.LocalURLServer)):560 or isinstance(backing_transport_server, local.LocalURLServer)):
532 raise AssertionError(561 raise AssertionError(
533 "HTTPServer currently assumes local transport, got %s" % \562 "HTTPServer currently assumes local transport, got %s" %
534 backing_transport_server)563 backing_transport_server)
535 self._home_dir = os.getcwdu()564 self._home_dir = os.getcwdu()
536 self._local_path_parts = self._home_dir.split(os.path.sep)565 self._local_path_parts = self._home_dir.split(os.path.sep)
@@ -556,11 +585,9 @@
556585
557 def tearDown(self):586 def tearDown(self):
558 """See bzrlib.transport.Server.tearDown."""587 """See bzrlib.transport.Server.tearDown."""
559 self._httpd.tearDown()
560 self._http_running = False588 self._http_running = False
561 # We don't need to 'self._http_thread.join()' here since the thread is589 self._httpd.shutdown()
562 # a daemonic one and will be garbage collected anyway. Joining just590 self._http_thread.join()
563 # slows us down for no added benefit.
564591
565 def get_url(self):592 def get_url(self):
566 """See bzrlib.transport.Server.get_url."""593 """See bzrlib.transport.Server.get_url."""
567594
=== modified file 'bzrlib/tests/test_http.py'
--- bzrlib/tests/test_http.py 2009-11-18 16:02:53 +0000
+++ bzrlib/tests/test_http.py 2009-11-30 09:20:31 +0000
@@ -325,10 +325,11 @@
325 server = http_server.HttpServer()325 server = http_server.HttpServer()
326 server.setUp()326 server.setUp()
327 try:327 try:
328 self.assertTrue(server._http_running)328 self.assertTrue(server._httpd is not None)
329 self.assertTrue(server._httpd.serving)
329 finally:330 finally:
330 server.tearDown()331 server.tearDown()
331 self.assertFalse(server._http_running)332 self.assertFalse(server._httpd.serving)
332333
333 def test_create_http_server_one_zero(self):334 def test_create_http_server_one_zero(self):
334 class RequestHandlerOneZero(http_server.TestingHTTPRequestHandler):335 class RequestHandlerOneZero(http_server.TestingHTTPRequestHandler):