Code review comment for lp:~edwin-grubbs/launchpad/bug-615654-jobqueue-cron-script

Gavin Panella (allenap) wrote:

> Thanks for the review. Unfortunately, abentley had already started reviewing
> it, but it was all done on IRC up to this point. I should have just listed him
> as the requested reviewer. Sorry for the duplicate effort. I'll try to answer
> your questions below.

No worries. It's good that Aaron also reviewed this. He spotted a lot
of stuff that I skimmed over; I didn't take as long on this review as
I ought to have done.

>
> > I know you're adding ProcessJobSourceGroups in here as well as the
> > membership stuff, but it still seems like a lot of work to do
> > something asynchronously. Moving something out of the app server takes
> > a lot of plumbing work, and it's not trivial plumbing work either.
> >
> > I'd really like a way to create, in the app server, something like a
> > persistent closure that can be run later, there or somewhere
> > else. When run it would have the same user and database access as in
> > the app server thread that spawned it. Creating a new job now seems to
> > involve creating tables, database users, job sources, and so on, and
> > it's getting in the way of the actual work that needs doing.
> >
> > I'm rambling on, and it's not a fault in this branch, but after doing
> > this recently, do you have any thoughts on that?
> >
> > Gavin.
>
>
> It seems that the reason we don't support a closure method for processing the
> jobs is that there are a lot of security restrictions placed on us. For
> example, each job class is required to have its own dbuser, so that the losas
> can figure out which job class is causing problems on the db. Of course, the
> job system seems much better suited for really slow processes, and it might be
> worthwhile to have a secondary solution that just moves the processing to
> after the request/response.

Yes, that sounds good. As suggested by Aaron, I'll add it to the
wishlist.
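
For the record, the "run it after the response" fallback you mention could
be as small as the sketch below. This is purely illustrative, not Launchpad
code; it assumes a hypothetical framework hook that calls run_deferred()
once the response has been sent, and the names defer/run_deferred are made
up for the example.

    import threading

    _deferred = threading.local()

    def defer(func, *args, **kwargs):
        """Queue a callable to run after the current response is sent."""
        if not hasattr(_deferred, 'queue'):
            _deferred.queue = []
        _deferred.queue.append((func, args, kwargs))

    def run_deferred():
        """Hypothetical post-response hook: run whatever the request queued."""
        queue = getattr(_deferred, 'queue', [])
        _deferred.queue = []
        for func, args, kwargs in queue:
            try:
                func(*args, **kwargs)
            except Exception:
                # Real code would log here; a failure must not affect the
                # already-sent response.
                pass

Of course that sidesteps the per-job dbuser requirement entirely, which is
exactly why it would only suit quick, low-risk work.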

>
> > [1]
> >
> > +    for job_source in selected_job_sources:
> > +        child = subprocess.Popen(child_args + [job_source])
> > +        children.append(child)
> > +    for child in children:
> > +        child.wait()
> >
> > Is selected_job_sources ever likely to be large? If not now, but it could
> > be one day, consider filing a bug to limit the number of concurrent
> > processes.
>
>
> Currently, each job source class is given its own process. If we run into a
> problem with too many concurrent jobs being processed, the job source classes
> can be reorganized into different crontab_groups, and then different crontab
> entries can process the jobs at different times. We are not really interested
> in a long-term solution for this, since the job system will most likely be
> moved to some kind of message queue, so the cron scripts for job processing
> will disappear.

Okay, fair enough.
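
Agreed that a cap isn't worth much engineering before the message-queue
move. If it ever does bite, something as small as the sketch below would
do; the names child_args, selected_job_sources and MAX_CHILDREN are just
illustrative, and this is not the branch's code:

    import subprocess

    MAX_CHILDREN = 4  # illustrative cap

    def run_with_cap(child_args, selected_job_sources):
        """Spawn one child per job source, but at most MAX_CHILDREN at once."""
        running = []
        for job_source in selected_job_sources:
            if len(running) >= MAX_CHILDREN:
                # Block on the oldest child before starting another one.
                running.pop(0).wait()
            running.append(subprocess.Popen(child_args + [job_source]))
        for child in running:
            child.wait()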

>
> > [2]
> >
> > +    for child in children:
> > +        child.wait()
> >
> > Is there any value in checking and doing something with the exit code
> > from the child?
>
>
> No, waiting for the children to finish is really only helpful for testing, so
> that all the output from the child processes can be gathered. The cron scripts
> log to files, and I am actually making the default behavior not to wait on the
> child processes, so that one really long-running child process does not prevent
> the other job sources from running the next time the crontab entry fires,
> because the parent process would still be holding its lock file.

Cool.
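
Just to spell out the trade-off for anyone reading along, I imagine it ends
up looking roughly like this hypothetical sketch (not the branch's actual
code), where waiting and exit-code reporting are opt-in and the default
returns straight away so the parent drops its lock file promptly:

    import subprocess
    import sys

    def spawn_children(child_args, job_sources, wait=False):
        """Start one child per job source; optionally wait and report failures."""
        children = [
            (job_source, subprocess.Popen(child_args + [job_source]))
            for job_source in job_sources
        ]
        if not wait:
            # Default: return at once so a slow child cannot keep the parent's
            # lock file held past the next cron invocation.
            return 0
        status = 0
        for job_source, child in children:
            if child.wait() != 0:
                sys.stderr.write(
                    '%s child exited with %d\n' % (job_source, child.returncode))
                status = 1
        return status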

>
> > [3]
> >
> > + packages=('_pythonpath', 'twisted', 'ampoule'), env=env)
> >
> > I just looked again at _pythonpath.py and now I can't see.
>
>
> Huh?

I just meant that the ugliness in _pythonpath.py made me temporarily
blind ;) Sorry for the confusion.
