Merge lp:~jtv/launchpad/validate-translations-file into lp:launchpad

Proposed by Jeroen T. Vermeulen
Status: Merged
Approved by: Brad Crittenden
Approved revision: not available
Merged at revision: not available
Proposed branch: lp:~jtv/launchpad/validate-translations-file
Merge into: lp:launchpad
Diff against target: 461 lines (+317/-10)
7 files modified
lib/lp/translations/scripts/tests/test-data/minimal.pot (+8/-0)
lib/lp/translations/scripts/tests/test_validate_translations_file.py (+131/-0)
lib/lp/translations/scripts/validate_translations_file.py (+130/-0)
lib/lp/translations/utilities/mozilla_xpi_importer.py (+2/-0)
lib/lp/translations/utilities/tests/test_xpi_manifest.py (+21/-10)
lib/lp/translations/utilities/xpi_manifest.py (+7/-0)
scripts/rosetta/validate-translations-file.py (+18/-0)
To merge this branch: bzr merge lp:~jtv/launchpad/validate-translations-file
Reviewer Review Type Date Requested Status
Brad Crittenden (community) code Approve
Review via email: mp+16866@code.launchpad.net

Commit message

Validator script for translations files. Also, a leading newline in an XPI manifest is now recognized as an error.

To post a comment you must log in.
Revision history for this message
Jeroen T. Vermeulen (jtv) wrote :

= Bug 503382 =

For Firefox in particular, in order to detect problems with broken translation files early and stay tightly coupled to upstream, the Ubuntu folks need to test translation files for syntax errors and such that would prevent them from importing into Launchpad.

In the case of Firefox (which uses XPI archives for translation, not gettext), a complication is that the upstream files are available as raw files that, on build, would go into an XPI archive. The main type of file to check is DTD files. We need to check these without building full XPI files, but the organization of files within XPI archives can be different from the directory hierarchies in the revision-controlled source tree.

This branch adds a script that parses translations files in a variety of formats. Only DTD and manifest files were requested, but adding a few formats was easy enough and may come in handy.

Note that this is not a full LaunchpadScript, although I did borrow a few snippets of setup code from there. It does not need locking, activity monitoring, etc. but standard options like -v can be useful.

One of the tests re-uses the XPI helpers that currently live in lp.translations.utilities.tests. Maybe those should be moved, though I'm not sure where.

Test with:
{{{
./bin/test -vv -t validate_translations_file
}}}

No lint. To Q/A, run the new script against a variety of files in these formats.

Jeroen

Revision history for this message
Brad Crittenden (bac) wrote :
Download full text (6.3 KiB)

Hi Jeroen,

Thanks for this branch and new tool. It looks good except for a few minor things. r=bac, merge-conditional

> === added file 'lib/lp/translations/scripts/tests/test_validate_translations_file.py'
> --- lib/lp/translations/scripts/tests/test_validate_translations_file.py 1970-01-01 00:00:00 +0000
> +++ lib/lp/translations/scripts/tests/test_validate_translations_file.py 2010-01-05 17:14:16 +0000
> @@ -0,0 +1,133 @@
> +#! /usr/bin/python2.5
> +#
> +# Copyright 2010 Canonical Ltd. This software is licensed under the
> +# GNU Affero General Public License version 3 (see the file LICENSE).

You win brownie points for getting the year right!

> +"""Test the validate-translations-file script."""
> +
> +import logging
> +import os.path
> +from textwrap import dedent
> +from unittest import TestCase, TestLoader
> +
> +from canonical.launchpad.ftests.script import run_script
> +
> +import lp.translations
> +
> +from lp.translations.scripts.validate_translations_file import (
> + UnknownFileType, ValidateTranslationsFile)
> +from lp.translations.utilities.tests.xpi_helpers import (
> + get_en_US_xpi_file_to_import)
> +
> +
> +class TestValidateTranslationsFile(TestCase):
> +
> + def _makeValidator(self, test_args=None):
> + """Produce a ValidateTranslationsFile."""
> + if test_args is None:
> + test_args = []
> + validator = ValidateTranslationsFile(test_args=test_args)
> + validator.logger.setLevel(logging.CRITICAL)
> + return validator
> +
> + def _strip(self, file_contents):
> + """Remove leading newlines & indentation from file_contents."""
> + return dedent(file_contents.strip())
> +
> + def _findTestData(self):
> + """Return base path to this test's test data."""
> + return os.path.join(
> + os.path.dirname(lp.translations.__file__),
> + 'scripts/tests/test-data')
> +
> + def test_validate_unknown(self):
> + # Unknown filename extensions result in UnknownFileType.
> + validator = self._makeValidator(['foo.bar'])

The argument list here is confusing. Does it serve any purpose? If
not please remove it.

> === added file 'lib/lp/translations/scripts/validate_translations_file.py'
> --- lib/lp/translations/scripts/validate_translations_file.py 1970-01-01 00:00:00 +0000
> +++ lib/lp/translations/scripts/validate_translations_file.py 2010-01-05 17:14:16 +0000
> @@ -0,0 +1,131 @@
> +# Copyright 2010 Canonical Ltd. This software is licensed under the
> +# GNU Affero General Public License version 3 (see the file LICENSE).
> +
> +__metaclass__ = type
> +
> +__all__ = [
> + 'UnknownFileType',
> + 'ValidateTranslationsFile',
> + ]
> +
> +from cStringIO import StringIO
> +import logging
> +from optparse import OptionParser
> +import os.path
> +
> +from canonical.launchpad import scripts
> +from lp.translations.utilities.gettext_po_parser import POParser
> +from lp.translations.utilities.mozilla_xpi_importer import (
> + DtdFile, MozillaZipImportParser)
> +from lp.translations.utilities.xpi_manifest import XpiManifest
> +
> +
> +class UnknownFileType(Exception):
> + """File's type is not recognized."""...

Read more...

review: Approve (code)
Revision history for this message
Jeroen T. Vermeulen (jtv) wrote :

Thank you. You were absolutely right about the confusing test_args; they were mostly meant to be illustrative here but I thought I'd need them at some point. Turns out I don't, and I just removed the entire mechanism. Simple is beautiful.

Preview Diff

[H/L] Next/Prev Comment, [J/K] Next/Prev File, [N/P] Next/Prev Hunk
1=== added directory 'lib/lp/translations/scripts/tests/test-data'
2=== added file 'lib/lp/translations/scripts/tests/test-data/minimal.pot'
3--- lib/lp/translations/scripts/tests/test-data/minimal.pot 1970-01-01 00:00:00 +0000
4+++ lib/lp/translations/scripts/tests/test-data/minimal.pot 2010-01-06 06:21:21 +0000
5@@ -0,0 +1,8 @@
6+msgid ""
7+msgstr ""
8+"MIME-Version: 1.0\n"
9+"Content-Type: text/plan; charset=UTF-8\n"
10+"Content-Transfer-Encoding: 8bit\n"
11+
12+msgid "A translatable string."
13+msgstr ""
14
15=== added file 'lib/lp/translations/scripts/tests/test_validate_translations_file.py'
16--- lib/lp/translations/scripts/tests/test_validate_translations_file.py 1970-01-01 00:00:00 +0000
17+++ lib/lp/translations/scripts/tests/test_validate_translations_file.py 2010-01-06 06:21:21 +0000
18@@ -0,0 +1,131 @@
19+#! /usr/bin/python2.5
20+#
21+# Copyright 2010 Canonical Ltd. This software is licensed under the
22+# GNU Affero General Public License version 3 (see the file LICENSE).
23+
24+"""Test the validate-translations-file script."""
25+
26+import logging
27+import os.path
28+from textwrap import dedent
29+from unittest import TestCase, TestLoader
30+
31+from canonical.launchpad.ftests.script import run_script
32+
33+import lp.translations
34+
35+from lp.translations.scripts.validate_translations_file import (
36+ UnknownFileType, ValidateTranslationsFile)
37+from lp.translations.utilities.tests.xpi_helpers import (
38+ get_en_US_xpi_file_to_import)
39+
40+
41+class TestValidateTranslationsFile(TestCase):
42+
43+ def _makeValidator(self):
44+ """Produce a ValidateTranslationsFile."""
45+ validator = ValidateTranslationsFile(test_args=[])
46+ validator.logger.setLevel(logging.CRITICAL)
47+ return validator
48+
49+ def _strip(self, file_contents):
50+ """Remove leading newlines & indentation from file_contents."""
51+ return dedent(file_contents.strip())
52+
53+ def _findTestData(self):
54+ """Return base path to this test's test data."""
55+ return os.path.join(
56+ os.path.dirname(lp.translations.__file__),
57+ 'scripts/tests/test-data')
58+
59+ def test_validate_unknown(self):
60+ # Unknown filename extensions result in UnknownFileType.
61+ validator = self._makeValidator()
62+ self.assertRaises(
63+ UnknownFileType, validator._validateContent, 'foo.bar', 'content')
64+
65+ def test_validate_dtd_good(self):
66+ validator = self._makeValidator()
67+ result = validator._validateContent(
68+ 'test.dtd', '<!ENTITY a.translatable.string "A string">\n')
69+ self.assertTrue(result)
70+
71+ def test_validate_dtd_bad(self):
72+ validator = self._makeValidator()
73+ result = validator._validateContent(
74+ 'test.dtd', '<!ENTIT etc.')
75+ self.assertFalse(result)
76+
77+ def test_validate_xpi_manifest_good(self):
78+ validator = self._makeValidator()
79+ result = validator._validateContent(
80+ 'chrome.manifest', 'locale foo nl jar:chrome/nl.jar!/foo/')
81+ self.assertTrue(result)
82+
83+ def test_validate_xpi_manifest_bad(self):
84+ # XPI manifests must not begin with newline.
85+ validator = self._makeValidator()
86+ result = validator._validateContent('chrome.manifest', '\nlocale')
87+ self.assertFalse(result)
88+
89+ def test_validate_po_good(self):
90+ validator = self._makeValidator()
91+ result = validator._validateContent('nl.po', self._strip(r"""
92+ msgid ""
93+ msgstr ""
94+ "MIME-Version: 1.0\n"
95+ "Content-Type: text/plan; charset=UTF-8\n"
96+ "Content-Transfer-Encoding: 8bit\n"
97+
98+ msgid "foo"
99+ msgstr "bar"
100+ """))
101+ self.assertTrue(result)
102+
103+ def test_validate_po_bad(self):
104+ validator = self._makeValidator()
105+ result = validator._validateContent('nl.po', self._strip("""
106+ msgid "no header here"
107+ msgstr "hier geen kopje"
108+ """))
109+ self.assertFalse(result)
110+
111+ def test_validate_pot_good(self):
112+ validator = self._makeValidator()
113+ result = validator._validateContent('test.pot', self._strip(r"""
114+ msgid ""
115+ msgstr ""
116+ "MIME-Version: 1.0\n"
117+ "Content-Type: text/plan; charset=UTF-8\n"
118+ "Content-Transfer-Encoding: 8bit\n"
119+
120+ msgid "foo"
121+ msgstr ""
122+ """))
123+ self.assertTrue(result)
124+
125+ def test_validate_pot_bad(self):
126+ validator = self._makeValidator()
127+ result = validator._validateContent('test.pot', 'garble')
128+ self.assertFalse(result)
129+
130+ def test_validate_xpi_good(self):
131+ validator = self._makeValidator()
132+ xpi_content = get_en_US_xpi_file_to_import('en-US').read()
133+ result = validator._validateContent('pl.xpi', xpi_content)
134+ self.assertTrue(result)
135+
136+ def test_validate_xpi_bad(self):
137+ validator = self._makeValidator()
138+ result = validator._validateContent('de.xpi', 'garble')
139+ self.assertFalse(result)
140+
141+ def test_script(self):
142+ test_input = os.path.join(self._findTestData(), 'minimal.pot')
143+ script = 'scripts/rosetta/validate-translations-file.py'
144+ result, out, err = run_script(script, [test_input])
145+ self.assertEqual(0, result)
146+
147+
148+def test_suite():
149+ return TestLoader().loadTestsFromName(__name__)
150
151=== added file 'lib/lp/translations/scripts/validate_translations_file.py'
152--- lib/lp/translations/scripts/validate_translations_file.py 1970-01-01 00:00:00 +0000
153+++ lib/lp/translations/scripts/validate_translations_file.py 2010-01-06 06:21:21 +0000
154@@ -0,0 +1,130 @@
155+# Copyright 2010 Canonical Ltd. This software is licensed under the
156+# GNU Affero General Public License version 3 (see the file LICENSE).
157+
158+__metaclass__ = type
159+
160+__all__ = [
161+ 'UnknownFileType',
162+ 'ValidateTranslationsFile',
163+ ]
164+
165+from cStringIO import StringIO
166+import logging
167+from optparse import OptionParser
168+import os.path
169+
170+from canonical.launchpad import scripts
171+from lp.translations.utilities.gettext_po_parser import POParser
172+from lp.translations.utilities.mozilla_xpi_importer import (
173+ DtdFile, MozillaZipImportParser)
174+from lp.translations.utilities.xpi_manifest import XpiManifest
175+
176+
177+class UnknownFileType(Exception):
178+ """File's type is not recognized."""
179+
180+
181+def validate_unknown_file_type(filename, content):
182+ """Fail validation: unknown file type."""
183+ raise UnknownFileType("Unrecognized file type for '%s'." % filename)
184+
185+
186+def validate_dtd(filename, content):
187+ """Validate XPI DTD file."""
188+ DtdFile(filename, filename, content)
189+
190+
191+def validate_po(filename, content):
192+ """Validate a gettext PO or POT file."""
193+ POParser().parse(content)
194+
195+
196+def validate_xpi(filename, content):
197+ """Validate an XPI archive."""
198+ MozillaZipImportParser(filename, StringIO(content))
199+
200+
201+def validate_xpi_manifest(filename, content):
202+ """Validate XPI manifest."""
203+ XpiManifest(content)
204+
205+
206+class ValidateTranslationsFile:
207+ """Parse translations files to see if they are well-formed."""
208+
209+ name = 'validate-translations-file'
210+
211+ validators = {
212+ 'dtd': validate_dtd,
213+ 'manifest': validate_xpi_manifest,
214+ 'po': validate_po,
215+ 'pot': validate_po,
216+ 'xpi': validate_xpi,
217+ }
218+
219+ def __init__(self, test_args=None):
220+ """Set up basic facilities, similar to `LaunchpadScript`."""
221+ self.parser = OptionParser()
222+ scripts.logger_options(self.parser, default=logging.INFO)
223+ self.options, self.args = self.parser.parse_args(args=test_args)
224+ self.logger = scripts.logger(self.options, self.name)
225+
226+ def main(self):
227+ """Validate file(s)."""
228+ failures = 0
229+ files = len(self.args)
230+ self.logger.info("Validating %d file(s)." % files)
231+
232+ for filename in self.args:
233+ if not self._readAndValidate(filename):
234+ failures += 1
235+
236+ if failures == 0:
237+ self.logger.info("OK.")
238+ elif failures > 1:
239+ self.logger.error("%d failures in %d files." % (failures, files))
240+ elif files > 1:
241+ self.logger.error("1 failure in %d files." % files)
242+ else:
243+ self.logger.error("Validation failed.")
244+
245+ if failures == 0:
246+ return 0
247+ else:
248+ return 1
249+
250+ def _pickValidator(self, filename):
251+ """Select the appropriate validator for a file."""
252+ base, ext = os.path.splitext(filename)
253+ if ext is not None and ext.startswith('.'):
254+ ext = ext[1:]
255+ return self.validators.get(ext, validate_unknown_file_type)
256+
257+ def _validateContent(self, filename, content):
258+ """Validate in-memory file contents.
259+
260+ :param filename: Name of this file.
261+ :param content: Contents of this file, as raw bytes.
262+ :return: Whether the file was parsed successfully.
263+ """
264+ validator = self._pickValidator(filename)
265+ try:
266+ validator(filename, content)
267+ except (SystemError, AssertionError):
268+ raise
269+ except UnknownFileType:
270+ raise
271+ except Exception, e:
272+ self.logger.warn("Failure in '%s': %s" % (filename, e))
273+ return False
274+
275+ return True
276+
277+ def _readAndValidate(self, filename):
278+ """Read given file and validate it.
279+
280+ :param filename: Name of a file to read.
281+ :return: Whether the file was parsed successfully.
282+ """
283+ content = file(filename).read()
284+ return self._validateContent(filename, content)
285
286=== modified file 'lib/lp/translations/utilities/mozilla_xpi_importer.py'
287--- lib/lp/translations/utilities/mozilla_xpi_importer.py 2009-10-14 18:43:26 +0000
288+++ lib/lp/translations/utilities/mozilla_xpi_importer.py 2010-01-06 06:21:21 +0000
289@@ -4,7 +4,9 @@
290 __metaclass__ = type
291
292 __all__ = [
293+ 'DtdFile',
294 'MozillaXpiImporter',
295+ 'MozillaZipImportParser',
296 ]
297
298 from cStringIO import StringIO
299
300=== modified file 'lib/lp/translations/utilities/tests/test_xpi_manifest.py'
301--- lib/lp/translations/utilities/tests/test_xpi_manifest.py 2009-07-17 00:26:05 +0000
302+++ lib/lp/translations/utilities/tests/test_xpi_manifest.py 2010-01-06 06:21:21 +0000
303@@ -9,6 +9,9 @@
304
305 from lp.translations.utilities.xpi_manifest import XpiManifest
306
307+from lp.translations.interfaces.translationimporter import (
308+ TranslationFormatSyntaxError)
309+
310
311 class XpiManifestTestCase(unittest.TestCase):
312 """Test `XpiManifest`."""
313@@ -38,7 +41,7 @@
314 There are no usable
315 locale lines
316 in this file.
317- """)
318+ """.lstrip())
319 self.assertEqual(len(manifest._locales), 0)
320 chrome_path, locale = manifest.getChromePathAndLocale('lines')
321 self.failIf(chrome_path is not None, "Empty manifest matched a path.")
322@@ -61,7 +64,7 @@
323 locale bar en-US bardir/
324 locale ixx en-US ixxdir/
325 locale gna en-US gnadir/
326- """)
327+ """.lstrip())
328 self.assertEqual(len(manifest._locales), 4)
329 self._checkSortOrder(manifest)
330 for dir in ['gna', 'bar', 'ixx', 'foo']:
331@@ -107,7 +110,7 @@
332 locale okay fr foodir/
333 locale overlong fr foordir/ etc. etc. etc.
334 locale incomplete fr
335- """)
336+ """.lstrip())
337 self.assertEqual(len(manifest._locales), 1)
338 chrome_path, locale = manifest.getChromePathAndLocale('foodir/x')
339 self.failIf(chrome_path is None, "Garbage lines messed up match.")
340@@ -119,7 +122,7 @@
341 manifest = XpiManifest("""
342 locale dup fy boppe
343 locale dup fy boppe
344- """)
345+ """.lstrip())
346 self.assertEqual(len(manifest._locales), 1)
347
348 def _checkLookup(self, manifest, path, chrome_path, locale):
349@@ -162,7 +165,7 @@
350 manifest = XpiManifest("""
351 locale short el /ploink/squit
352 locale long he /ploink/squittle
353- """)
354+ """.lstrip())
355 self._checkSortOrder(manifest)
356 self._checkLookup(manifest, 'ploink/squit/x', 'short/x', 'el')
357 self._checkLookup(manifest, '/ploink/squittle/x', 'long/x', 'he')
358@@ -175,7 +178,7 @@
359 locale foo2 ca a/b/
360 locale foo3 ca a/b/c/x1
361 locale foo4 ca a/b/c/x2
362- """)
363+ """.lstrip())
364 self._checkSortOrder(manifest)
365 self._checkLookup(manifest, 'a/bb', 'foo1/bb', 'ca')
366 self._checkLookup(manifest, 'a/bb/c', 'foo1/bb/c', 'ca')
367@@ -190,7 +193,7 @@
368 manifest = XpiManifest("""
369 locale foo en_GB jar:foo.jar!/dir/
370 locale bar id jar:bar.jar!/
371- """)
372+ """.lstrip())
373 self._checkSortOrder(manifest)
374 self._checkLookup(
375 manifest, 'jar:foo.jar!/dir/file', 'foo/file', 'en_GB')
376@@ -227,7 +230,7 @@
377 locale croatian hr jar:translations.jar!/hr/
378 locale docs sr jar:docs.jar!/sr/
379 locale docs hr jar:docs.jar!/hr/
380- """)
381+ """.lstrip())
382 self._checkSortOrder(manifest)
383 self._checkLookup(
384 manifest, 'jar:translations.jar!/sr/x', 'serbian/x', 'sr')
385@@ -242,7 +245,7 @@
386 locale x it jar:dir/x.jar!/subdir/y.jar!/
387 locale y it jar:dir/x.jar!/subdir/y.jar!/deep/
388 locale z it jar:dir/x.jar!/subdir/z.jar!/
389- """)
390+ """.lstrip())
391 self._checkSortOrder(manifest)
392 self._checkLookup(
393 manifest, 'jar:dir/x.jar!/subdir/y.jar!/foo', 'x/foo', 'it')
394@@ -291,10 +294,18 @@
395 locale browser en-US jar:locales/
396 locale browser en-US jar:locales/en-US.jar!/chrome/
397 locale browser en-US jar:locales/en-US.jar!/
398- """)
399+ """.lstrip())
400 path = manifest.findMatchingXpiPath('browser/gui/print.dtd', 'en-US')
401 self.assertEqual(path, "jar:locales/en-US.jar!/chrome/gui/print.dtd")
402
403+ def test_blank_line(self):
404+ # Manifests must not begin with newline.
405+ self.assertRaises(
406+ TranslationFormatSyntaxError,
407+ XpiManifest, """
408+ locale browser en-US jar:locales
409+ """)
410+
411
412 def test_suite():
413 return unittest.defaultTestLoader.loadTestsFromName(__name__)
414
415=== modified file 'lib/lp/translations/utilities/xpi_manifest.py'
416--- lib/lp/translations/utilities/xpi_manifest.py 2009-07-17 00:26:05 +0000
417+++ lib/lp/translations/utilities/xpi_manifest.py 2010-01-06 06:21:21 +0000
418@@ -9,6 +9,9 @@
419 import logging
420 import re
421
422+from lp.translations.interfaces.translationimporter import (
423+ TranslationFormatSyntaxError)
424+
425
426 def normalize_path(path):
427 """Normalize filesystem path within XPI file."""
428@@ -127,6 +130,10 @@
429
430 def __init__(self, content):
431 """Initialize: parse `content` as a manifest file."""
432+ if content.startswith('\n'):
433+ raise TranslationFormatSyntaxError(
434+ message="Manifest begins with newline.")
435+
436 locales = []
437 for line in content.splitlines():
438 words = line.split()
439
440=== added file 'scripts/rosetta/validate-translations-file.py'
441--- scripts/rosetta/validate-translations-file.py 1970-01-01 00:00:00 +0000
442+++ scripts/rosetta/validate-translations-file.py 2010-01-06 06:21:21 +0000
443@@ -0,0 +1,18 @@
444+#! /usr/bin/python2.5
445+# Copyright 2010 Canonical Ltd. This software is licensed under the
446+# GNU Affero General Public License version 3 (see the file LICENSE).
447+
448+# pylint: disable-msg=W0403
449+
450+__metaclass__ = type
451+
452+import _pythonpath
453+
454+import sys
455+
456+from lp.translations.scripts.validate_translations_file import (
457+ ValidateTranslationsFile)
458+
459+
460+if __name__ == "__main__":
461+ sys.exit(ValidateTranslationsFile().main())