Merge lp:~mandel/desktopcouch/batch_update into lp:desktopcouch

Proposed by Manuel de la Peña
Status: Merged
Approved by: Chad Miller
Approved revision: not available
Merged at revision: not available
Proposed branch: lp:~mandel/desktopcouch/batch_update
Merge into: lp:desktopcouch
Diff against target: 67 lines (+46/-0)
2 files modified
desktopcouch/records/server_base.py (+26/-0)
desktopcouch/records/tests/test_server.py (+20/-0)
To merge this branch: bzr merge lp:~mandel/desktopcouch/batch_update
Reviewer Review Type Date Requested Status
John O'Brien (community) Approve
Stuart Langridge (community) Approve
Eric Casteleijn (community) Approve
Review via email: mp+17482@code.launchpad.net

Commit message

Add put_records_batch(), a way to send several records at once.

To post a comment you must log in.
Revision history for this message
Manuel de la Peña (mandel) wrote :

The patch adds a put_records_batch method that puts a batch of docs in the db and a test that ensures that works as expected. The docs are added using a single request while the attachments are added one by one.

The result is a tuple of the form ``(success, docid, rev_or_exc)``, where ``success`` is a boolean indicating whether the update succeeded, ``docid`` is the ID of the document, and ``rev_or_exc`` is either the new document revision, or an exception instance (e.g. `ResourceConflict`) if the update failed.

Use case: Add a group of contacts to a category/tag

Concerns: If an attachment fails the doc are not rolled back. This same behavior if found in the single put_record method.

Revision history for this message
Eric Casteleijn (thisfred) wrote :

Looks good to me, and tests pass. Of course raising exceptions would be preferable to returning them, but in this case couchdb itself makes that hard to impossible.

review: Approve
Revision history for this message
Stuart Langridge (sil) wrote :
Download full text (19.5 KiB)

Man, I don't like returning an exception instance. After a long discussion on #python, though, I'm not sure there's a better way. Entire discussion pasted below (sorry, it's quite long).

<aquarius> If I've got a function which does multiple things, some of which may fail, but failure of some things doesn't mean that the others didn't succeed, what's the best way of signalling that failure? If I raise a ItPartiallyWorked exception, then I'll still need to pass back the successful results (from the successful actions) in that exception too, which seems a bit wrong
<lvh> aquarius, a warning sounds in order
<lvh> aquarius, just return a tuple that might have None in some bits
<lvh> also, the "warning" module
<aquarius> lvh, well, that's what we're doing at the moment: we return a list of results, where each result is either a successful result or an exception. But returning a list with exception instances in it seems weird. I mean, we could throw a warning *too* to alert people that it happened, of course.
<lvh> aquarius, you probably want to split the separate bits up into smaller functions that *do* do the right thing and raise an exception, but aren't taxed with the extra problem of still delivering partial results, like the composing function
<Alberth> aquarius: output a warning line to stderr ?
<aquarius> lvh, ah, I can't split it up into little bits because I'm calling an external non-Python thing which actually does all the actions :)
<lvh> aquarius, yeah no, don't use exception objects as sentinel values
<lvh> aquarius, use None instead, your users will hate you less
<aquarius> lvh, so I pass all the actions in one go to the external thing (CouchDB, in this case), and it gives me back a list of successes or failures.
<lvh> aquarius, does the absolute order matter? IE is the fact that something is at position 2 meaningful?
<aquarius> if I use None, though, there's no indication of *why* that action failed.
<lvh> aquarius, or is it just the relative position (2 comes before 3 and after 1)
<aquarius> lvh, absolute order does not matter (the list that comes back is *actually* a list of tuples (id_of_action, success_or_failure)
<Alberth> aquarius, lvh: you can also use a dict {'stepA': 'ok', 'stepB': 'foo failed'}
<lvh> if you raise an exception inside a generator and you only catch it outside of it, is the state lost?
<lvh> the logical thing to assume would be yes, let me give it a shot
<aquarius> Alberth, yes. that's what we do. What should "foo failed" be, though? I mean, it could just be a string, but then what I've done is reinvented string Exceptions :)
<lvh> aquarius, sentinel value
<lvh> aquarius, typically yourmodule.FAILURE or None
<lvh> of course, FAILURE = None # at module scope works fine
<aquarius> lvh, a sentinel value won't tell me what type of failure it was, though, it'll just tell me that it did fail. I'd like to know why
<Alberth> nice idea, a set of constant values could work
<Alberth> NO_FOO=1 ; NO_BAR =2
<aquarius> but that's just exceptions reinvented!
<lvh> aquarius, no, return values do not travel up the stack
<Alberth> exceptions are used to stop execution imho
<lvh> well, they do: but only once
<Alberth> that is n...

review: Needs Information
Revision history for this message
Manuel de la Peña (mandel) wrote :
Download full text (21.6 KiB)

I WOW that is indeed a long conversation (an instructive too), but there is a concern I have, if we were to fix this issue regarding the returning of exceptions we should first patch the update method of python-couchdb since it is the one really performing the job, the put_records_batch simple uses it and later adds the required attachments. From my point of view it is not worth to look at a solution at desktpcouch level but on python-couchdb.

I'm sure that the guy at python-couchdb will understand that returning instances of exceptions is not the best way, but will they want to depend on twisted? I personally dont think so, nevertheless we will have to talk with them and fix the real code that has the problem, if exceptions are created by python-couchdb we are simply making things worse.

I hope I make sense,

Manuel

> Man, I don't like returning an exception instance. After a long discussion on
> #python, though, I'm not sure there's a better way. Entire discussion pasted
> below (sorry, it's quite long).
>
> <aquarius> If I've got a function which does multiple things, some of
> which may fail, but failure of some things doesn't mean that the others didn't
> succeed, what's the best way of signalling that failure? If I raise a
> ItPartiallyWorked exception, then I'll still need to pass back the successful
> results (from the successful actions) in that exception too, which seems a bit
> wrong
> <lvh> aquarius, a warning sounds in order
> <lvh> aquarius, just return a tuple that might have None in some bits
> <lvh> also, the "warning" module
> <aquarius> lvh, well, that's what we're doing at the moment: we return a
> list of results, where each result is either a successful result or an
> exception. But returning a list with exception instances in it seems weird. I
> mean, we could throw a warning *too* to alert people that it happened, of
> course.
> <lvh> aquarius, you probably want to split the separate bits up into smaller
> functions that *do* do the right thing and raise an exception, but aren't
> taxed with the extra problem of still delivering partial results, like the
> composing function
> <Alberth> aquarius: output a warning line to stderr ?
> <aquarius> lvh, ah, I can't split it up into little bits because I'm
> calling an external non-Python thing which actually does all the actions :)
> <lvh> aquarius, yeah no, don't use exception objects as sentinel values
> <lvh> aquarius, use None instead, your users will hate you less
> <aquarius> lvh, so I pass all the actions in one go to the external thing
> (CouchDB, in this case), and it gives me back a list of successes or failures.
> <lvh> aquarius, does the absolute order matter? IE is the fact that
> something is at position 2 meaningful?
> <aquarius> if I use None, though, there's no indication of *why* that
> action failed.
> <lvh> aquarius, or is it just the relative position (2 comes before 3 and
> after 1)
> <aquarius> lvh, absolute order does not matter (the list that comes back
> is *actually* a list of tuples (id_of_action, success_or_failure)
> <Alberth> aquarius, lvh: you can also use a dict {'stepA': 'ok',
> '...

Revision history for this message
Manuel de la Peña (mandel) wrote :

I have been doing my little inquiries in python-couchdb, please read the following conversations regarding the matter:

Original discussion on the update interface: http://groups.google.com/group/couchdb-python/browse_thread/thread/df58036112c90040/e49d948a152c839b?lnk=gst&q=deferredlist#e49d948a152c839bfor

My conversation with python-couchdb group: http://groups.google.com/group/couchdb-python/browse_thread/thread/bf51ec2cd24ad154

I don't see any practical implementation of the method that does not return instances of different exceptions.

Revision history for this message
Stuart Langridge (sil) wrote :

OK. There seems to be no very good way of handling this, so we'll go with what you currently have.

review: Approve
Revision history for this message
John O'Brien (jdobrien) wrote :

looks good.

review: Approve
Revision history for this message
Chad Miller (cmiller) wrote :

Attempt to merge lp:~mandel/desktopcouch/batch_update into lp:desktopcouch failed due to merge conflicts:

text conflict in desktopcouch/records/server_base.py

Revision history for this message
Manuel de la Peña (mandel) wrote :

Should be ok now.

> Attempt to merge lp:~mandel/desktopcouch/batch_update into lp:desktopcouch
> failed due to merge conflicts:
>
> text conflict in desktopcouch/records/server_base.py

lp:~mandel/desktopcouch/batch_update updated
120. By Manuel de la Peña

Merged with current trunk

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== modified file 'desktopcouch/records/server_base.py'
2--- desktopcouch/records/server_base.py 2010-02-25 19:59:13 +0000
3+++ desktopcouch/records/server_base.py 2010-02-25 21:18:17 +0000
4@@ -215,6 +215,32 @@
5
6 return record.record_id
7
8+ def put_records_batch(self, batch):
9+ """Put a batch of records in back end storage."""
10+ # used to access fast the record
11+ records_hash = {}
12+ for record in batch:
13+ if not record.record_id:
14+ from uuid import uuid4 # Do not rely on couchdb to create an ID for us.
15+ record.record_id = uuid4().hex
16+ records_hash[record.record_id] = record
17+ # although with a single record we need to test for the revisison, with
18+ # a batch we do not, but we have to make sure that we did not get an error
19+ batch_put_result = self.db.update(batch)
20+ for current_tuple in batch_put_result:
21+ success, docid, rev_or_exc = current_tuple
22+ if success:
23+ record = records_hash[docid]
24+ # set the new rev
25+ record._data["_rev"] = rev_or_exc
26+ for attachment_name in record.list_attachments():
27+ data, content_type = record.attachment_data(attachment_name)
28+ self.db.put_attachment(
29+ {"_id":record.record_id, "_rev":record["_rev"]},
30+ data, attachment_name, content_type)
31+ # all success record have the blobs added we return result of update
32+ return batch_put_result
33+
34 def update_fields(self, record_id, fields, cached_record=None):
35 """Safely update a number of fields. 'fields' being a
36 dictionary with path_tuple: new_value for only the fields we
37
38=== modified file 'desktopcouch/records/tests/test_server.py'
39--- desktopcouch/records/tests/test_server.py 2010-02-13 22:06:45 +0000
40+++ desktopcouch/records/tests/test_server.py 2010-02-25 21:18:17 +0000
41@@ -95,6 +95,26 @@
42 self.assertEqual(
43 record['record_number'], retrieved_record['record_number'])
44
45+ def test_put_records_batch(self):
46+ """Test putting a batch of records."""
47+ # create records to be put in a batch
48+ batch = []
49+ for current in (1, 2, 3, 4, 5):
50+ record = Record({'record_number': current}, record_type="http://example.com/")
51+ if current % 2:
52+ # set bad data to get an exception
53+ record._data["_rev"] = "1-32323"
54+ batch.append(record)
55+ # put the batch and ensure that the records have been added
56+ batch_result = self.database.put_records_batch(batch)
57+ for current_tuple in batch_result:
58+ success, docid, rev_or_exc = current_tuple
59+ if success:
60+ self.assertTrue(self.database._server[self.dbname][docid])
61+ else:
62+ # make sure we do not have the record in the db
63+ self.assertFalse(self.database._server[self.dbname][docid])
64+
65 def test_delete_record(self):
66 """Test deletion of records."""
67 record = Record({'record_number': 0}, record_type="http://example.com/")

Subscribers

People subscribed via source and target branches