Webservice requests never use a slave database because last-write time is unknown

Bug #297052 reported by Stuart Bishop
Affects           Status   Importance  Assigned to  Milestone
Launchpad itself  Triaged  High        Unassigned

Bug Description

Webservice requests are stateless, and are forced to always use the master database to avoid problems caused by replication lag.

This is not scalable, especially when we consider that the webservice will be used for operations deemed too expensive to be implemented in the webapp. Webservice requests should use a slave database whenever possible, just like our web application does.

I can think of a few options to do this:

- We could require Launchpad API clients to support cookies, and use our existing session management.

- We could allow Launchpad API clients to generate and send their own session identifier, which we could hook into our existing session management.

- We could allow Launchpad API clients to send the 'last write time' with their requests. Launchpad can then issue queries on a slave when it knows all previous writes from the client have been replicated.
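The third option can be sketched as follows. This is only an illustration, not Launchpad code: `choose_store`, the `max_lag` bound on replication lag, and the `default` used when no last-write time is supplied are all assumptions made for the example.

```python
from datetime import datetime, timedelta, timezone


def choose_store(last_write, max_lag, now=None, default="master"):
    """Pick a database for a read, given the client's reported last write.

    last_write: timezone-aware datetime sent by the client, or None.
    max_lag:    timedelta we assume is an upper bound on replication lag.
    default:    store used when the client sent no last-write time.
    """
    if last_write is None:
        # Nothing is known about earlier writes, so fall back to the policy.
        return default
    if now is None:
        now = datetime.now(timezone.utc)
    # Once the write is older than the assumed lag, every slave should have it.
    return "slave" if now - last_write > max_lag else "master"
```

For example, with `max_lag=timedelta(seconds=5)`, a client whose last write was a minute ago would be served from a slave, while one that wrote a second ago would stay on the master.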

If the session identifier is deemed optional, we need to carefully consider the default behavior. If we continue the current policy of always using the master database, then we risk overloading this single point of failure.

Stuart Bishop (stub) wrote :

Another option would be to use the IP address of the client as a session token if none was passed explicitly. This won't be optimal though due to masking from NAT and web proxies.
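A rough sketch of that fallback, assuming a server-side record of recent writes keyed by session. The class name, the key format, and the `settle_seconds` window are hypothetical; none of this is taken from Launchpad's session code.

```python
class LastWriteTracker:
    """Remember when each session last wrote, keyed by token or by IP."""

    def __init__(self, settle_seconds):
        # Assumed time after a write before slaves are considered caught up.
        self.settle_seconds = settle_seconds
        self._last_write = {}

    @staticmethod
    def session_key(explicit_token, remote_addr):
        # Prefer the client's own token; otherwise fall back to its IP address.
        if explicit_token:
            return f"token:{explicit_token}"
        return f"ip:{remote_addr}"

    def record_write(self, key, now):
        self._last_write[key] = now

    def store_for_read(self, key, now):
        last = self._last_write.get(key)
        if last is not None and now - last < self.settle_seconds:
            return "master"  # a recent write from this key may not have replicated
        return "slave"
```

The NAT masking shows up directly: two clients behind the same address share one key, so a write by either sends both to the master until the window has passed.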

Stuart Bishop (stub) wrote :

Setting to high, as I think we need to at least agree on an approach before the Launchpad API beta period finishes.

Changed in launchpad-foundations:
importance: Undecided → High
status: New → Triaged
assignee: nobody → flacoste
Francis J. Lacoste (flacoste) wrote :

The volume on the API is very low; let's revisit as usage ramps up.

Changed in launchpad-foundations:
assignee: flacoste → nobody
importance: High → Medium
Leonard Richardson (leonardr) wrote :

I think using the IP address as a session token would work. It's okay if everyone behind a firewall uses a particular slave database for a couple of seconds, unless there's a *huge* number of people behind that firewall.

Here's a way of implementing the 'last write time' without requiring a bunch of logic on the client. When the client makes a write request, the server sends the last write time as a cookie. The client just needs to know how to handle cookies. (Not trivial, but a very general task.) If an incoming last-write-time is far enough in the past that all the slaves have caught up, the server can clear the cookie when it sends the response. The client can purge the cookie and continue on as before.

If a client doesn't support cookies, either they run the risk of having out-of-date data, or they always hit the master database. It depends on what we choose as the default.
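The cookie flow described above might look roughly like this on the server side. It is a sketch under stated assumptions: the cookie name `last_write`, the `max_lag` value, the set of write methods, and the `route_request` helper are all invented for illustration, and timestamps are plain floats in seconds.

```python
LAST_WRITE_COOKIE = "last_write"  # hypothetical cookie name
WRITE_METHODS = {"POST", "PUT", "PATCH", "DELETE"}


def route_request(method, cookies, now, max_lag, default_store="master"):
    """Return (store, cookie_updates) for one request.

    cookie_updates maps a cookie name to its new value; None means
    "tell the client to clear this cookie".
    """
    if method in WRITE_METHODS:
        # Writes always go to the master and stamp the client with the time.
        return "master", {LAST_WRITE_COOKIE: str(now)}

    raw = cookies.get(LAST_WRITE_COOKIE)
    if raw is None:
        # No cookie: the client has not written recently, or it does not
        # support cookies. The two cases look the same, so use the default.
        return default_store, {}

    try:
        last_write = float(raw)
    except ValueError:
        # Unreadable value: play safe on the master and clear the cookie.
        return "master", {LAST_WRITE_COOKIE: None}

    if now - last_write < max_lag:
        return "master", {}  # slaves may still be behind this write
    # All slaves are assumed to have caught up, so the cookie can go.
    return "slave", {LAST_WRITE_COOKIE: None}
```

Because the state lives in the cookie, the server keeps no session map, and a custom client can send an older or newer timestamp to steer itself toward a slave or the master.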

I think this is better than a generic session ID cookie because the information is stored on the client. We don't have to keep a big map of session IDs and worry about what happens if the map gets screwed up. Also, custom clients can hack the last write time if they need to for whatever reason.

Stuart Bishop (stub) wrote : Re: [Bug 297052] Re: Webservice requests should use a slave database when possible

On Wed, Nov 12, 2008 at 11:21 PM, Francis J. Lacoste
<email address hidden> wrote:

> The volume on the API is very low; let's revisit as usage ramps up.

Once usage ramps up, it is too late as we then have a high load from
clients that don't provide enough information for us to load balance
their queries.

--
Stuart Bishop <email address hidden>
http://www.stuartbishop.net/

Robert Collins (lifeless) wrote : Re: Webservice requests should use a slave database when possible

The API accounts for more than 50% of our render requests a day, so its use has clearly ramped up :) - 4M renders a day at the time of writing.

We have the Nonce, which a single client uses and updates; we could attach last-write data to that.

Changed in launchpad:
importance: Medium → High
summary: - Webservice requests should use a slave database when possible
+ Webservice requests never use a slave database because last-write time
+ is unknown
Francis J. Lacoste (flacoste) wrote :

Anonymous requests could also always be sent to a SlaveStore. I'm not sure how much of the API traffic they represent, though.
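
A minimal sketch of that rule, assuming the server can tell whether a request carries credentials. The function name, the `is_anonymous` flag, and the set of read methods are hypothetical and only illustrate the routing decision.

```python
READ_METHODS = {"GET", "HEAD"}


def store_for_request(method, is_anonymous):
    """Send anonymous reads to a slave; keep everything else on the master."""
    if is_anonymous and method in READ_METHODS:
        # No identified client, so there is no earlier write of its own
        # that this read would need to wait for.
        return "slave"
    # Authenticated requests and all writes keep the current behaviour.
    return "master"
```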
