-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Am 11.01.2010 12:29, Graham Binns schrieb:
> This branch does the initial infrastructure work for enabling us to calculate Bug heat.

Thank you, Graham, for getting this feature started!

> 
> It adds several classes which will be used by a cronscript to calculate the hotness values for all the bugs in Launchpad. It also exposes the hotness field of the Bug table via the IBug interface and sets up permissions for altering it appropriately.
> 

The branch looks really good and straightforward, so I forgive you the
short cover letter ... ;-) I do have a few comments, though, two of them
very strong. Therefore:

  review needs-fixing

1. Using "heat" consistently as the term and not use "hotness" will
avoid confusion. We already agreed on that on IRC. Could you please also
file a bug about renaming the db column?

2. Using constants for heat calculations. Whenever I see constant values
used in code, some alarm goes off inside of me. With a collection of
values like this, multiple alarms where ringing ... ;-) Please see my
suggestion below.

Apart from that, just a few minor things.

Again, thanks for your work.

Henning


> === modified file 'lib/lp/bugs/configure.zcml'
> +                    hotness"/>
> +                    setHotness"/>

See above.


> === added file 'lib/lp/bugs/doc/bug-heat.txt'
> +
> +A new bug will have a hotness of zero.
> +
> +    >>> from lp.testing.factory import LaunchpadObjectFactory
> +
> +    >>> factory = LaunchpadObjectFactory()

You can delete those two lines, "factory" is already defined in doc tests.

> +    >>> bug_owner = factory.makePerson()
> +    >>> bug = factory.makeBug(owner=bug_owner)
> +    >>> bug.hotness
> +    0
> +

[...]

> +The heat generated by duplicates is n*6, where n is the number of
> +duplicates that the bug has. With one duplicate,
> +_getHeatFromDuplicates() will return 6.

OK, this weighting value should definitely go into a constant of some
kind. I think a BugHeatConstants class in the interface file would be
appropriate.

class BugHeatConstants:
    """Constants needed to calculate bug heat."""

    DUPLICATES = 6

etc. See lib/lp/translations/interfaces/translations.py for an example.

Now, the question is what to do about the test. If you change the
weighting later on, do you want to update your test, too? In the
documentation and in the test code? You can decide on that but I think I
would do something like this:


Weighting parameters are found in BugHeatConstants.
The heat generated by duplicates is n*DUPLICATES, where n is the number of
duplicates that the bug has.

> +
> +    >>> dupe = factory.makeBug()
> +    >>> dupe.duplicateof = bug
> +
> +    >>> bug_heat_calculator._getHeatFromDuplicates()
> +    6

>>> heat = bug_heat_calculator._getHeatFromDuplicates()
>>> print heat == BugHeatConstants.DUPLICATES
True

> +
> +With 5 duplicates, _getHeatFromDuplicates() will return 30.

With 5 duplicates, the value will be 5 times as high.

> +
> +    >>> for i in range(4):
> +    ...     dupe = factory.makeBug()
> +    ...     dupe.duplicateof = bug
> +    >>> transaction.commit()
> +
> +    >>> bug_heat_calculator._getHeatFromDuplicates()
> +    30

>>> heat = bug_heat_calculator._getHeatFromDuplicates()
>>> print heat == 5 * BugHeatConstants.DUPLICATES
True


But as I said, I leave that up to you to decide.

This goes for all heat calculations, obviously, so I am not repeating
that here.

[...]

> +BugHeatCalculator.getBugHeat()
> +------------------------------
> +
> +BugHeatCalculator.getBugHeat() returns the total heat for a given bug as
> +the sum of the results of all _getHeatFrom*() methods.
> +
> +For our bug, the total heat will be 470:
> +
> + * From duplicates 30
> + * From subscribers 20
> + * From privacy 150
> + * From security 250
> + * From affected users 20

This might not work so clearly with my suggestion for the test. But
you'll think of something, I am sure. ;-P

[...]

> 
> === modified file 'lib/lp/bugs/interfaces/bug.py'

BugHeatConstants should go in here.

> --- lib/lp/bugs/interfaces/bug.py	2009-12-09 20:48:01 +0000
> +++ lib/lp/bugs/interfaces/bug.py	2010-01-11 11:29:16 +0000
> @@ -296,6 +296,10 @@
>              value_type=Reference(schema=IPerson),
>              readonly=True))
>  
> +    hotness = Int(
> +        title=_("The 'hotness' of the bug"),
> +        required=False, readonly=True)
> +

heat = ...

>      # Adding related BugMessages provides a hook for getting at
>      # BugMessage.visible when building bug comments.
>      bug_messages = Attribute('The bug messages related to this object.')
> @@ -732,6 +736,9 @@
>          if the user is the owner or an admin.
>          """
>  
> +    def setBugHotness(hotness):
> +        """Set the hotness for the bug."""

setBugHeat

> +
>  class InvalidDuplicateValue(Exception):
>      """A bug cannot be set as the duplicate of another."""
>      webservice_error(417)
> @@ -970,6 +977,19 @@
>          Otherwise, return False.
>          """
>  
> +    def dangerousGetAllBugs():
> +        """DO NOT CALL THIS METHOD.
> +
> +        This method exists solely to allow the bug heat script to grab
> +        all the bugs in the database - including private ones - and
> +        iterate over them. DO NOT USE IT UNLESS YOU KNOW WHAT YOU'RE
> +        DOING. AND IF YOU KNOW WHAT YOU'RE DOING YOU KNOW BETTER THAN TO
> +        USE THIS ANYWAY.
> +        """
> +        # XXX 2010-01-08 gmb bug=505850:
> +        #     Note, this method should go away when we have a proper
> +        #     permissions system for scripts.
> +

Yeah, it's really bad there is no better solution for this. Hope this
gets fixed soon.

>  
>  class InvalidBugTargetType(Exception):
>      """Bug target's type is not valid."""
> 
> === modified file 'lib/lp/bugs/model/bug.py'
> --- lib/lp/bugs/model/bug.py	2009-12-14 15:25:55 +0000
> +++ lib/lp/bugs/model/bug.py	2010-01-11 11:29:16 +0000
> @@ -254,6 +254,7 @@
>      message_count = IntCol(notNull=True, default=0)
>      users_affected_count = IntCol(notNull=True, default=0)
>      users_unaffected_count = IntCol(notNull=True, default=0)
> +    hotness = IntCol(notNull=True, default=0)

heat = IntCol(notNull=True, default=0, name="hotness") AFAIK

>  
>      @property
>      def comment_count(self):
> @@ -1425,6 +1426,10 @@
>  
>          return not subscriptions_from_dupes.is_empty()
>  
> +    def setHotness(self, hotness):
> +        """See `IBug`."""
> +        self.hotness = hotness
> +

self.heat = heat

>  
>  class BugSet:
>      """See BugSet."""
> @@ -1665,13 +1670,19 @@
>          return bugs
>  
>      def getByNumbers(self, bug_numbers):
> -        """see `IBugSet`."""
> +        """See `IBugSet`."""
>          if bug_numbers is None or len(bug_numbers) < 1:
>              return EmptyResultSet()
>          store = IStore(Bug)
>          result_set = store.find(Bug, In(Bug.id, bug_numbers))
>          return result_set.order_by('id')
>  
> +    def dangerousGetAllBugs(self):
> +        """See `IBugSet`."""
> +        store = IStore(Bug)
> +        result_set = store.find(Bug)
> +        return result_set.order_by('id')
> +
>  
>  class BugAffectsPerson(SQLBase):
>      """A bug is marked as affecting a user."""
> 
> === added file 'lib/lp/bugs/scripts/bugheat.py'
> --- lib/lp/bugs/scripts/bugheat.py	1970-01-01 00:00:00 +0000
> +++ lib/lp/bugs/scripts/bugheat.py	2010-01-11 11:29:16 +0000
> @@ -0,0 +1,153 @@
> +# Copyright 2010 Canonical Ltd.  This software is licensed under the
> +# GNU Affero General Public License version 3 (see the file LICENSE).
> +
> +"""The innards of the Bug Heat cronscript."""
> +
> +__metaclass__ = type
> +__all__ = []
> +
> +
> +from zope.component import getUtility
> +from zope.interface import implements
> +
> +from canonical.launchpad.interfaces.looptuner import ITunableLoop
> +from canonical.launchpad.utilities.looptuner import LoopTuner
> +
> +from lp.bugs.interfaces.bug import IBugSet
> +
> +class BugHeatCalculator:

I am surprised this goes into this file. True, it is used in the script
only atm, but it is not really specific to being used in scripts, is it?
But that is not a major point for me.

> +    """A class to calculate the heat for a bug."""
> +
> +    def __init__(self, bug):
> +        self.bug = bug
> +
> +    def _getHeatFromPrivacy(self):
> +        """Return the heat generated by the bug's `private` attribute."""
> +        if self.bug.private:
> +            return 150

return BugHeatConstants.PRIVATE

You get what I mean, I am sure... ;)

> +        else:
> +            return 0

return BugHeatConstants.COLD  :-) Maybe not....

[...]

> +
> +    def getBugHeat(self):
> +        """Return the total heat for the current bug."""
> +        heat_counts = [
> +            self._getHeatFromAffectedUsers(),
> +            self._getHeatFromDuplicates(),
> +            self._getHeatFromPrivacy(),
> +            self._getHeatFromSecurity(),
> +            self._getHeatFromSubscribers(),
> +            ]
> +
> +        total_heat = 0
> +        for count in heat_counts:
> +            total_heat += count
> +
> +        return total_heat

Err, what's wrong with "sum" here?
return sum(heat_counts)

Or even just one line:
        return sum([
            self._getHeatFromAffectedUsers(),
            self._getHeatFromDuplicates(),
            self._getHeatFromPrivacy(),
            self._getHeatFromSecurity(),
            self._getHeatFromSubscribers(),
            ])


> +
> +
> +class BugHeatTunableLoop:
> +    """An `ITunableLoop` implementation for bug heat calculations."""
> +
> +    implements(ITunableLoop)
> +
> +    total_updated = 0
> +
> +    def __init__(self, transaction, logger, offset=0):
> +        self.transaction = transaction
> +        self.logger = logger
> +        self.offset = offset
> +        self.total_updated = 0
> +
> +    def isDone(self):
> +        """See `ITunableLoop`."""
> +        # When the main loop has no more Bugs to process it sets
> +        # offset to None. Until then, it always has a numerical
> +        # value.
> +        return self.offset is None
> +
> +    def __call__(self, chunk_size):
> +        """Retrieve a batch of Bugs and update their heat.
> +
> +        See `ITunableLoop`.
> +        """
> +        # XXX 2010-01-08 gmb bug=198767:
> +        #     We cast chunk_size to an integer to ensure that we're not
> +        #     trying to slice using floats or anything similarly
> +        #     foolish. We shouldn't have to do this.
> +        chunk_size = int(chunk_size)
> +
> +        start = self.offset
> +        end = self.offset + chunk_size
> +
> +        self.transaction.begin()
> +        # XXX 2010-01-08 gmb bug=505850:
> +        #     This method call should be taken out and shot as soon as
> +        #     we have a proper permissions system for scripts.
> +        bugs = list(getUtility(IBugSet).dangerousGetAllBugs()[start:end])

We talked about this on IRC and you wanted to remove list().

> +
> +        self.offset = None
> +        if bugs:

Either use bugs.any() or test the length for zero.

number_of_bugs = len(bugs)
if number_of_bugs > 0:

> +            starting_id = bugs[0].id
> +            self.logger.info("Updating %i Bugs (starting id: %i)" %
> +                (len(bugs), starting_id))
> +
> +        for bug in bugs:
> +            # We set the starting point of the next batch to the Bug
> +            # id after the one we're looking at now. If there aren't any
> +            # bugs this loop will run for 0 iterations and start_id

What's start_id? ;-)

> +            # will remain set to None.
> +            start += 1
> +            self.offset = start
> +            self.logger.debug("Updating heat for bug %s" % bug.id)
> +            bug_heat_calculator = BugHeatCalculator(bug)
> +            heat = bug_heat_calculator.getBugHeat()
> +            bug.setHotness(heat)
> +            self.total_updated += 1

Seriously, I don't see the point of all those assignments in the loop
(start, self.offset, self.total_updated). start is not even used.
Probably copy-and-paste error.

Why not let the loop run and update the values at the end?

	if number_of_bugs == 0:
            # Terminate loop
            self.total_updated = self.offset
            self.offset = None
            return
        # Insert log output here ...
        for bug in bugs:
            self.logger.debug("Updating heat for bug %s" % bug.id)
            bug_heat_calculator = BugHeatCalculator(bug)
            heat = bug_heat_calculator.getBugHeat()
            bug.setHotness(heat)
        self.offset += number_of_bugs


> === modified file 'lib/lp/bugs/tests/test_doc.py'

Oh, good! I never knew about this file. Something learned for me, here! :-)

> --- lib/lp/bugs/tests/test_doc.py	2009-10-02 11:29:22 +0000
> +++ lib/lp/bugs/tests/test_doc.py	2010-01-11 11:29:16 +0000
> @@ -89,6 +89,12 @@
>          '../doc/cve-update.txt',
>          setUp=cveSetUp, tearDown=tearDown, layer=LaunchpadZopelessLayer
>          ),
> +    'bug-heat.txt': LayeredDocFileSuite(
> +        '../doc/bug-heat.txt',
> +        setUp=setUp,
> +        tearDown=tearDown,
> +        layer=LaunchpadZopelessLayer
> +        ),
>      'bugnotificationrecipients.txt-uploader': LayeredDocFileSuite(
>          '../doc/bugnotificationrecipients.txt',
>          setUp=uploaderBugsSetUp,
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAktLIf0ACgkQBT3oW1L17ijCJQCghcdPyrLjhFwYniS66wC1OLr4
TrUAoK30+Ia/fkpZ5hiMRbF/6rxUxn3N
=IFW0
-----END PGP SIGNATURE-----