Hi Henning, Thanks for tackling this. You've done a part of another card with this job, so good job there :) I feel it still needs a few improvements. See below. review needs-fixing > A large part of the diff is made up of the new property > "FileImporter.share_with_other_side" and its tests. The property is a > boolean that reflects the sharing policy for translations from the > file that is being imported. The policy depends on where the > translations are coming from and what they are being imported into and > on the permissions of the uploader. This was isolated to be able to > change this policy later. (And I admit it could have warranted an > extra branch.) Interesting approach. We'll need to extract this even further, but we've already got a separate card for that. You've just made that job simpler (though, it's not going to be identical, since we'll have to support the specifics of the web UI as well). > === modified file 'lib/lp/translations/doc/poimport-script.txt' > --- lib/lp/translations/doc/poimport-script.txt 2010-09-21 16:19:47 > +0000 > +++ lib/lp/translations/doc/poimport-script.txt 2010-09-30 08:53:45 +0000 > @@ -282,9 +282,9 @@ > > >>> from lp.services.worlddata.interfaces.language import ILanguageSet > >>> esperanto = getUtility(ILanguageSet).getLanguageByCode('eo') > - >>> foos = potemplate['Foo %s'].getLocalTranslationMessages( > + >>> suggestions = potemplate['Foo % > s'].getLocalTranslationMessages( > ... potemplate, esperanto) > - >>> sorted([foo.msgstr0.translation for foo in foos]) > + >>> sorted([sugg.msgstr0.translation for sugg in suggestions]) > [u'Bar', u'Bars'] I don't really like "sugg" much better than "foo" (though, "suggestions" instead of "foos" IS MUCH better :). How about "suggestion"? > === modified file 'lib/lp/translations/utilities/sanitize.py' > --- lib/lp/translations/utilities/sanitize.py 2010-08-16 09:01:35 +0000 > +++ lib/lp/translations/utilities/sanitize.py 2010-09-30 08:53:45 +0000 > @@ -181,6 +181,8 @@ > > # Expected plural forms should all exist and empty translations should > # be normalized to None. > + if pluralforms is None: > + pluralforms = 2 > for pluralform in range(pluralforms): > if pluralform not in sanitized_translations: > sanitized_translations[pluralform] = None What's the reasoning for this? Why not default to 1 instead (i.e. assume it's a message with no plurals at all)? > === modified file 'lib/lp/translations/utilities/tests/test_sanitize.py' > --- lib/lp/translations/utilities/tests/test_sanitize.py 2010-08-23 08:41:03 +0000 > +++ lib/lp/translations/utilities/tests/test_sanitize.py 2010-09-30 08:53:45 +0000 > @@ -257,6 +257,20 @@ > expected_sanitized, > sanitize_translations_from_webui(self.english, translations, 3)) > > + def test_sanitize_translations_pluralforms_none(self): > + # Some languages don't provide a plural form, so 2 is assumed. > + translations = { > + 0: u'Plural form 0 ', > + } > + expected_sanitized = { > + 0: u'Plural form 0', > + 1: None, > + } > + self.assertEqual( > + expected_sanitized, > + sanitize_translations_from_webui( > + self.english, translations, None)) > + > > def test_suite(): > return unittest.TestLoader().loadTestsFromName(__name__) > I am happy to see the above simple change tested as well: good job! > === modified file 'lib/lp/translations/utilities/tests/test_file_importer.py' ... > +class FileImporterSharingTest(TestCaseWithFactory): > + """Class test for the sharing operation of the FileImporter base class.""" > + layer = LaunchpadZopelessLayer > + > + UPSTREAM = 0 > + UBUNTU = 1 > + LANGCODE = 'eo' I know this is how we usually used to do stuff, but today we should be able to also make languages and not depend on existing ones in the sampledata. For instance, if you really want to keep using 'eo', I'd suggest you do factory.makeLanguage('eo@test'). > + POFILE = dedent("""\ > + msgid "" > + msgstr "" > + "PO-Revision-Date: 2005-05-03 20:41+0100\\n" > + "Last-Translator: FULL NAME \\n" > + "Content-Type: text/plain; charset=UTF-8\\n" > + "X-Launchpad-Export-Date: 2009-05-14 08:54+0000\\n" > + > + msgid "Thank You" > + msgstr "Dankon" > + """) > + > + def setUp(self): > + super(FileImporterSharingTest, self).setUp() > + # Create the upstream series and template with a translator. > + self.translator = self.factory.makeTranslator(self.LANGCODE) > + self.upstream_productseries = self.factory.makeProductSeries() > + self.upstream_productseries.product.translationgroup = ( > + self.translator.translationgroup) > + self.upstream_productseries.product.translationpermission = ( > + TranslationPermission.RESTRICTED) > + self.upstream_template = self.factory.makePOTemplate( > + productseries=self.upstream_productseries) > + > + def _makeEntry(self, side, from_upstream=False, uploader=None): How about "_makeImportEntry" just so it's slightly clearer? > + if side == self.UPSTREAM: > + potemplate = self.upstream_template > + else: > + # Create a template in a source package and link the source > + # package to the upstream series to enable sharing. > + ubuntu = getUtility(ILaunchpadCelebrities).ubuntu > + distroseries = self.factory.makeDistroSeries(distribution=ubuntu) > + ubuntu.translation_focus = distroseries > + sourcepackagename = self.factory.makeSourcePackageName() > + potemplate = self.factory.makePOTemplate( > + distroseries=distroseries, > + sourcepackagename=sourcepackagename, > + name=self.upstream_template.name) > + self.factory.makeSourcePackagePublishingHistory( > + sourcepackagename=sourcepackagename, > + distroseries=distroseries) Why is this necessary? Is it not possible to reach a sourcepackage without it? > + sourcepackage = distroseries.getSourcePackage(sourcepackagename) > + sourcepackage.setPackaging( > + self.upstream_productseries, self.factory.makePerson()) > + pofile = self.factory.makePOFile( > + self.LANGCODE, potemplate=potemplate, create_sharing=True) > + entry = self.factory.makeTranslationImportQueueEntry( > + potemplate=potemplate, from_upstream=from_upstream, > + uploader=uploader, content=self.POFILE) > + entry.potemplate = potemplate > + entry.pofile = pofile > + transaction.commit() It'd be nice to add a comment about why is a commit necessary. > + return entry > + > + def test_translator_persmissions(self): s/perSmissions/permissions/ > + # Sanity check that the translator has the right permissions but > + # others don't. > + pofile = self.factory.makePOFile( > + self.LANGCODE, potemplate=self.upstream_template) > + self.assertFalse( > + pofile.canEditTranslations(self.factory.makePerson())) > + self.assertTrue( > + pofile.canEditTranslations(self.translator.translator)) > + Do note that Jeroen has started on cleaning up these checks as well. It'd be good to point him at this code as well since you are probably going to land this first :) > + def test_templates_are_sharing(self): > + # Sharing between upstream and Ubuntu was set up correctly. > + entry = self._makeEntry(self.UBUNTU) > + subset = getUtility(IPOTemplateSet).getSharingSubset( > + distribution=entry.distroseries.distribution, > + sourcepackagename=entry.sourcepackagename) > + self.assertContentEqual( > + [entry.potemplate, self.upstream_template], > + list(subset.getSharingPOTemplates(entry.potemplate.name))) This looks a lot like a test for the code that's not really here, doesn't it? :) Or, is it basically a test for _makeEntry? Perhaps it should be marked as such (i.e. test_makeEntry_templates_are_sharing for the name). It is good that you want to have at least a basic sanity check for your helper method. > + def test_share_with_other_side_upstream(self): > + # An upstream queue entry will be shared with ubuntu. > + entry = self._makeEntry(self.UPSTREAM) > + importer = POFileImporter( > + entry, importers[TranslationFileFormat.PO], None) > + self.assertTrue( > + importer.share_with_other_side, > + "Upstream import should share with Ubuntu.") That's simple enough :) > + def test_share_with_other_side_ubuntu(self): > + # An ubuntu queue entry will not be shared with upstream. > + entry = self._makeEntry(self.UBUNTU) > + importer = POFileImporter( > + entry, importers[TranslationFileFormat.PO], None) > + self.assertFalse( > + importer.share_with_other_side, > + "Ubuntu import should not share with upstream.") As is this and the following tests: > + def test_share_with_other_side_ubuntu_from_package(self): ... > + def test_share_with_other_side_ubuntu_uploader_upstream_translator(self): ... Cool stuff. It's amazing how tests can be so simple for something this intricate. :) > === modified file 'lib/lp/translations/utilities/translation_import.py' > ... > > importers = { > @@ -445,7 +460,86 @@ > context=message.context)) > return potmsgset > > - def storeTranslationsInDatabase(self, message, potmsgset): > + @cachedproperty > + def share_with_other_side(self): > + """Returns True if translations should be shared with the other side. > + """ > + traits = getUtility( > + ITranslationSideTraitsSet).getForTemplate(self.potemplate) > + if traits.side == TranslationSide.UPSTREAM: > + return True > + # Check from_upstream. > + if self.translation_import_queue_entry.from_upstream: > + return True > + # Find the sharing POFile and check permissions. > + productseries = self.potemplate.distroseries.getSourcePackage( > + self.potemplate.sourcepackagename).productseries > + if productseries is None: > + return False This doesn't seem to be tested (i.e. when there's no link between a sourcepackage and productseries, we should return True if it's for UPSTREAM import, and return False for UBUNTU import: I know it doesn't make much sense to test for all of this with the code as-is, but as soon as we change the code, it will make more sense). > + upstream_template = getUtility(IPOTemplateSet).getSubset( > + productseries=productseries).getPOTemplateByName( > + self.potemplate.name) > + upstream_pofile = upstream_template.getPOFileByLang( > + self.pofile.language.code) When we move this code for use on views as well, we'll have to take more care with query count (I know it's a cached property, but at least these two could be a single query on the POTemplateSet). Not for this branch, though. > + @cachedproperty > + def translations_are_msgids(self): > + """Are symbolic msgids being used and these are the real ones?""" This sentence is a bit confusing. If I had no idea about what this is about, I'd have even less idea about it. :) > + def _storeCredits(self, potmsgset, credits): > + """Store credits but only from upstream.""" > + if not self.translation_import_queue_entry.from_upstream: > + return None > + return potmsgset.setCurrentTranslation( > + self.pofile, self.last_translator, credits, > + RosettaTranslationOrigin.SCM, self.share_with_other_side) This is actually very interesting, and I am happy to see that you extracted it out. It actually points out that the naming for "from_upstream" is not really perfect: it should probably be "from_maintainer" or "from_trusted" or something. I thought setCurrentTranslation didn't check for identical message first, but now I see that it does (I was going to warn about using it blindly). > + def _validateMessage(self, potmsgset, message, > + translations, message_data): > + """Validate the message and report success or failure.""" > + try: > + validate_translation( > + potmsgset.singular_text, potmsgset.plural_text, > + translations, potmsgset.flags) > + except GettextValidationError, e: > + self._addUpdateError(message_data, potmsgset, unicode(e)) > + message.validation_status = ( > + TranslationValidationStatus.UNKNOWNERROR) > + return False > + > + message.validation_status = TranslationValidationStatus.OK > + return True This looks very similar to what potmsgset._validate_translations does. It'd be nice if we actually shared this code as well. > + def _approveMessage(self, potmsgset, message, message_data): > + """Try to approve the message, return None on TranslationConflict.""" > + try: > + message.approve( > + self.pofile, self.last_translator, > + self.share_with_other_side, self.lock_timestamp) > + except TranslationConflict: > + self._addConflictError(message_data, potmsgset) > + if self.logger is not None: > + self.logger.info( > + "Conflicting updates on message %d." % potmsgset.id) > + # The message remains a suggestion. > + return None > + > + if self.translations_are_msgids: > + potmsgset.clearCachedSingularText() Should this not be in the "finally:" block? > + return message > + > + def storeTranslationsInDatabase(self, message_data, potmsgset): > """Try to store translations in the database. > > Perform check if a PO file is available and if the message has any > @@ -453,7 +547,8 @@ > is added to the list in self.errors but the translations are stored > anyway, marked as having an error. > > - :param message: The message for which translations will be stored. > + :param message_data: The message data for which translations will be > + stored. > :param potmsgset: The POTMsgSet that this message belongs to. > > :return: The updated translation_message entry or None, if no storing > @@ -464,54 +559,40 @@ > # store English strings in an IPOFile. > return None > > - if (not message.translations or > - set(message.translations) == set([u'']) or > - set(message.translations) == set([None])): > + no_translations = ( > + message_data.translations == [] or > + message_data.translations is None or > + not any(message_data.translations)) Oh, I just read up on any() :) It seems you can drop the comparison to the empty list as well. > + if no_translations: > # We don't have anything to import. > return None > > - try: > - # Do the actual import. > - translation_message = potmsgset.updateTranslation( > - self.pofile, self.last_translator, message.translations, > - self.translation_import_queue_entry.from_upstream, > - self.lock_timestamp, force_edition_rights=self.is_editor) > - except TranslationConflict: > - self._addConflictError(message, potmsgset) > - if self.logger is not None: > - self.logger.info( > - "Conflicting updates on message %d." % potmsgset.id) > - return None > - except GettextValidationError, e: > - # We got an error, so we submit the translation again but > - # this time asking to store it as a translation with > - # errors. > - > - # Add the pomsgset to the list of pomsgsets with errors. > - self._addUpdateError(message, potmsgset, unicode(e)) > - > - try: > - translation_message = potmsgset.updateTranslation( > - self.pofile, self.last_translator, message.translations, > - self.translation_import_queue_entry.from_upstream, > - self.lock_timestamp, ignore_errors=True, > - force_edition_rights=self.is_editor) > - except TranslationConflict: > - # A conflict on top of a validation error? Give up. > - # This message is cursed. > - if self.logger is not None: > - self.logger.info( > - "Conflicting updates; ignoring invalid message %d." % > - potmsgset.id) > - return None > - > - just_replaced_msgid = ( > - self.importer.uses_source_string_msgids and > - self.pofile.language.code == 'en') > - if just_replaced_msgid: > - potmsgset.clearCachedSingularText() > - > - return translation_message > + sanitized_translations = sanitize_translations_from_import( > + potmsgset.singular_text, message_data.translations, > + self.pofile.language.pluralforms) > + > + if potmsgset.is_translation_credit: > + # Translation credits cannot be added as suggestions. > + return self._storeCredits(potmsgset, sanitized_translations) Note that this might actually introduce a bug where we ignore bugs in software. For example, a translation credits message might have a c-format flag on, and a translation might have a percentage sign in, and we wouldn't bork out. Still, exported PO files would end up containing such a message with c-format flag and msgfmt couldn't compile them. With the new model, there will be no way to know if a translation credits message didn't validate, but is the actual one from upstream. Something to think about. >+ > + # The message is first stored as a suggestion and only made > + # current if it validates. > + new_message = potmsgset.submitSuggestion( > + self.pofile, self.last_translator, sanitized_translations) > + validation_ok = self._validateMessage( > + potmsgset, new_message, sanitized_translations, message_data) > + if validation_ok and self.is_editor: > + return self._approveMessage(potmsgset, new_message, message_data) > + > + if self.translation_import_queue_entry.from_upstream: > + # XXX: henninge 2010-09-21: Mixed models! > + # This is mimicking the old behavior to still mark these messages > + # as "imported". Will have to be removed when > + # getPOTMsgSetsWithErrors is updated to the new model. > + new_message.makeCurrentUpstream(True) getPOTMsgSetsWithErrors is not really part of our POFile:+filters, now that I think about it. So, I've added another card for that. Cheers, Danilo