Input: 2013 retrospective

It was a big year for Input. In 2012, we spent the last half rewriting Input. In 2013, it went through sec-review, had a bunch of things fixed and then we migrated to the new system.

Since then, we've been fixing bugs, reimplementing features that were lost and writing the scaffolding for the new set of User Advocacy dashboards and tools.

Let's look at some Bugzilla and git stats for the year:

Twas the year: 2013
===================


Bugzilla
========

Bugs created: 150

                willkg : 100
            cwwmozilla : 5
                fbraun : 4
               mgrimes : 4
               tdowner : 3
        stephen.donner : 3
           me+bugzilla : 2
        gasell+mozilla : 2
               mcooper : 2
                 glind : 2
             mozaakash : 1
        kdurant35rules : 1
            hitmanarky : 1
              kbrosnan : 1
        bob.silverberg : 1
              splewako : 1
              rrosario : 1
             mattbasta : 1
              educmale : 1
                feer56 : 1
                326374 : 1
               anthony : 1
        shopov.bogomil : 1
               peterbe : 1
                  l10n : 1
    chrismore.bugzilla : 1
                landis : 1
          dron.rathore : 1
                    rq : 1
             MattN+bmo : 1
          joshua-smith : 1
                cturra : 1
        swagat.kanungo : 1

Bugs resolved: 268

                willkg : 157
                       :    WONTFIX 50
                       :      FIXED 89
                       : WORKSFORME 8
                       :  DUPLICATE 9
                       :    INVALID 1
            cwwmozilla : 57
                       :      FIXED 1
                       :    WONTFIX 7
                       : WORKSFORME 29
                       :  DUPLICATE 1
                       :    INVALID 19
               mgrimes : 10
                       :      FIXED 1
                       :  DUPLICATE 1
                       : WORKSFORME 5
                       :    INVALID 3
        shopov.bogomil : 7
                       :    WONTFIX 1
                       : WORKSFORME 2
                       :    INVALID 1
                       :      FIXED 2
                       :  DUPLICATE 1
               mcooper : 6
                       :  DUPLICATE 1
                       :      FIXED 5
               mozilla : 5
                       :      FIXED 5
           me+bugzilla : 4
                       :    WONTFIX 1
                       :      FIXED 1
                       :  DUPLICATE 1
                       :    INVALID 1
             mozaakash : 2
                       : WORKSFORME 1
                       :    INVALID 1
        trifandreialin : 2
                       : WORKSFORME 2
              rrosario : 2
                       :      FIXED 2
          joshua-smith : 2
                       :      FIXED 1
                       :    INVALID 1
           aaron.train : 2
                       :    WONTFIX 1
                       :  DUPLICATE 1
        stephen.donner : 1
                       : INCOMPLETE 1
               emorley : 1
                       :      FIXED 1
               curtisk : 1
                       :    INVALID 1
               unghost : 1
                       : WORKSFORME 1
          rajul.iitkgp : 1
                       :      FIXED 1
             jruderman : 1
                       : INCOMPLETE 1
          chris.lonnen : 1
                       :      FIXED 1
             nigelbabu : 1
                       :      FIXED 1
              tofumatt : 1
                       :      FIXED 1
                cturra : 1
                       :      FIXED 1
               fwenzel : 1
                       :      FIXED 1
               mbrandt : 1
                       :      FIXED 1

            INCOMPLETE : 2
             DUPLICATE : 15
               INVALID : 28
            WORKSFORME : 48
               WONTFIX : 60
                 FIXED : 115


git
===

Total commits: 277

      Will Kahn-Greene :   251  (+51614, -16878, files 1132)
           Mike Cooper :    12  (+38545, -249, files 219)
        Brandon Burton :     5  (+21, -178, files 6)
         Ricky Rosario :     4  (+36, -19, files 6)
        Bob Silverberg :     2  (+11, -6, files 2)
                 Rajul :     1  (+3, -0, files 1)
          Joshua Smith :     1  (+10, -5, files 1)
               bogomil :     1  (+1, -1, files 1)


Total lines added:   90241
Total lines deleted: 17336
Total files changed: 1368

I want to highlight some interesting bits:

  1. We resolved more bugs than we created. That's partially due to us going through and closing out old bugs for the old Input that aren't relevant anymore.

  2. According to the Bugzilla and git data, there were 47 contributors to Input this year: 326374, Bob Silverberg, Brandon Burton, Joshua Smith, MattN+bmo, Mike Cooper, Rajul, Ricky Rosario, Will Kahn-Greene, aaron.train, anthony, bogomil, chris.lonnen, chrismore.bugzilla, cturra, curtisk, cwwmozilla, dron.rathore, educmale, emorley, fbraun, feer56, fwenzel, gasell+mozilla, glind, hitmanarky, jruderman, kbrosnan, kdurant35rules, l10n, landis, mattbasta, mbrandt, me+bugzilla, mgrimes, mozaakash, nigelbabu, peterbe, rajul.iitkgp, rq, splewako, stephen.donner, swagat.kanungo, tdowner, tofumatt, trifandreialin, and unghost.

    That doesn't include localizers, who do a ton of work translating the strings in the Input UI.

    That includes some of the folks who work on the input-tests repository, but possibly misses some.

  3. Most of the 47 contributors are not "core developers". That's cool, but I could be doing a better job here making it easier for non-core developers to contribute.

    We maintain a Get Involved page and we hang out on #input on irc.mozilla.org. We have an input-dev mailing list. If you want to work on Input, this is where it's at!

Those are the stats.

At a high-level, we accomplished the following:

  1. stood up a new Input code base

  2. the beginnings of spam identification and removal

  3. Input API for feedback submission

  4. Firefox OS feedback form

  5. infrastructure for an Analysts group with special privileges

  6. the beginnings of an Occurrence Comparison report dashboard

One thing I discovered in 2013q4 was that it's really hard to be the mostly-solo dev on a project like this. I'm lucky that I'm part of a larger team, so peer reviews for work I've done are possible and timely. However, I find I'm switching contexts between the technical details of what I'm working on now and the high-level details of a bunch of possible future tasks/projects. That's really hard to do day-to-day and still maintain development momentum. I have some thoughts on how to serialize my work so that I'm doing less context switching and can focus on individual things more deeply, which should produce better outcomes.

My goals for Input for 2014 are these:

  1. clean up the code base: there's still a bunch of weird stuff in there from the rapid development work we did in 2012

  2. reduce barriers to entry for new contributors: better documentation, fewer steps to get up and running, more bugs marked for mentoring, more outreach, ...

  3. build infrastructure that we can use for better User Advocacy tools: watched alerts, email notifications, dashboards, ...

  4. flesh out tests: we're really light on smoke tests and regression-catching tests

  5. work with Matt and Cheng to figure out where Input fits into the grand scheme of things; how can we make it a general-purpose feedback system? how can we handle non-Firefox products and initiatives?

Yay for 2013!

Update 7:08pm

My script only showed top tens which misses tons of people who did work. I redid the data and that increases the number of contributors from 16 to 47. Oops!

Update April 21st, 2015

LGuruprasad found a bug in the script that caused commits-by-author information to be wrong. Fixed the script and updated the stats!

Dennis Retrospective (2013)

Project

time:

3 months

impact:
  • fixed l10n-related HTTP 500 errors for SUMO and Input

  • paved road for MDN, AMO and other Mozilla sites to use the same strategy

Problem statement

When we deploy support.mozilla.org (SUMO) and Input [1], the deploy fetches the most recent localized .po files from SVN, compiles them to .mo files, and ships them with the site. Because SUMO supports many languages and the deploy pulls down the most recent changes, there's no way to effectively test the entire site for all languages before deploying to production. Because of that, users experience HTTP 500 server errors on pages that have bad strings.

When there are server errors, we get notified, write up a bug, and then have to go fix it and push the fix out as soon as possible. Fixing issues is difficult since we don't know most of the languages the site is translated in and our l10n community spans many timezones, so getting help can take many hours.

That was covered roughly in [bug 841412] for SUMO and [bug 875313] for Input.

On top of that, SUMO sends emails to SUMO users and if the localized strings are bad, then emails don't get sent. That's covered in [bug 850215].

Why are there problems in localizations? Judging by the strings we were seeing, we think we had a few issues:

  1. The localizer changes the formatting token.

    We're using gettext and Python and several different token formats. For example, %s, %(name)s, and {name}. If the localizer changes the structure of the token, that will cause a server error.

  2. The localizer translates the token.

    For example, the token is {name}, but the translated token is {nombre}.

  3. The localizer copies a string from somewhere else with different tokens.

    For example, the string has {url}, but the translated string has {helpurl}.
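To make those failure modes concrete, here is a minimal Python sketch. The strings and variable names are made up for illustration and are not taken from SUMO or Input; the two helper functions are hypothetical and only report which exception a bad translation would raise at render time.

```python
def try_format(template, **kwargs):
    """Return the formatted string, or the name of the exception raised."""
    try:
        return template.format(**kwargs)
    except (KeyError, IndexError, ValueError) as exc:
        return type(exc).__name__


def try_percent(template, values):
    """Same idea for %-style interpolation."""
    try:
        return template % values
    except (KeyError, TypeError, ValueError) as exc:
        return type(exc).__name__


# The untranslated string formats fine.
ok = try_format("Hello, {name}!", name="Will")

# Case 1: the token's structure changed (%(name)s lost its trailing "s").
broken = try_percent("Bonjour, %(name)!", {"name": "Will"})

# Case 2: the token itself was translated ({name} became {nombre}).
translated = try_format("¡Hola, {nombre}!", name="Will")

# Case 3: the token was copied from another string ({url} became {helpurl}).
copied = try_format("Visite {helpurl}", url="https://example.com")
```

In a web request, each of those exceptions would go unhandled and surface as an HTTP 500.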

Thus we're faced with:

  1. a series of problems with strings that happen semi-frequently

  2. server errors preventing users from seeing certain parts of SUMO

  3. we can't suss out these errors by testing the whole site for every locale before every prod deploy

  4. each server error is a priority 1 interrupt

SUMO and Input weren't the only sites that had this problem---it's a problem for MDN, AMO, and other sites, too.

Solution

I wondered if we could effectively lint .po files during deploy and only compile the .po files that didn't have problems. If we did that, then the problem would stop.

I wrote a localelinter.py script in SUMO [2] that let me experiment with writing a linter and tying it to the deploy process.

That went well. I wanted to use the same system for two different projects, and I suspected others would want to use it, too. Further, we had a poxx.py script that let us debug layout problems related to translated strings and I wanted to merge both of these into a new library.

I created Dennis, which lets you lint and fake-translate strings.

On July 29th, I released Dennis v0.3.3 which I felt was good enough for us and other people to use. We switched SUMO and Input to use it.

Our server errors from localization issues mostly ended. Periodically, we'd encounter a new kind of issue and would improve Dennis to catch it.

We made the deploy logs viewable so linting errors could be seen, bugs could be written, and errors could get fixed.

I talked about how we use it on Input and SUMO, along with the bash script that compiles only the .po files that lint successfully into .mo files. MDN, AMO, and other sites switched to using Dennis, which eliminated the errors for them, too.

Impact

This eliminated a frequent cause of HTTP 500 errors which caused downtime for the site, prevented users from getting Firefox support, and created frequent interruptions for site engineers.

Building it as a library allowed other Mozilla sites to use it eliminating the problem for them, too.

Alternatives

potools

The potools project had a linter, but it:

  1. didn't return a non-zero exit code when it encountered linting errors

  2. performed a quality-of-string kind of linting and didn't really know about variables

We couldn't go with this as is and it looked too difficult to improve to meet our needs.

pyvideo status: November 24th, 2013

What is pyvideo.org

pyvideo.org is an index of Python-related conference and user-group videos on the Internet. Saw a session you liked and want to share it? It's likely you can find it, watch it, and share it with pyvideo.org.

Status

A lot of stuff has happened since the last status report, but there are five things of note:

  1. Sheila is now a co-admin of pyvideo.org. She has been for a couple of months. I need to update the site to reflect this.

    I'm really psyched about this. It's a ton of work and I'm just not managing it well. Splitting the work should make it more manageable.

  2. Back in July, Sheila poked me about a tweet Jesse wrote suggesting Rackspace was interested in sponsoring Open Source projects. She contacted Jesse and set everything up.

    I'm psyched that Rackspace agreed to sponsor pyvideo.org by providing free hosting. Several months later, I moved pyvideo.org from where it was before to a vm at Rackspace.

    I'm really excited about this! It makes a bunch of problems that I was trying to figure out what to do about go away.

    Thank you, Rackspace!

    I need to update the site to reflect this.

  3. Sheila discovered that blip.tv was expiring a bunch of accounts that held conference videos and that those videos would go away. She and I scrambled to download all the files from blip and move them to Rackspace cloudfiles. It's about 600 videos and around 250gb of data.

    In the process of doing that, we saved videos for DjangoCon EU 2010, DjangoCon EU 2011 and PyGotham 2012. I added these to pyvideo.org today. These videos have pages that are stubs with no metadata. I've got that in my queue of things to fix.

    Also, the thumbnails for all the videos on blip.tv are on my laptop which isn't very helpful. I need to move those and update the videos in pyvideo.org.

    As a side note, if we didn't have hosting from Rackspace, we'd have been totally screwed. Thank you, Jesse Noller and Rackspace!

  4. I've been working on the richard codebase fixing architectural problems, reducing the complexities and trying to clean it up so it's in a better state. That work is almost done. When it is, I'll update pyvideo.org with the new site. At this rate, I think I can finish the work this year, but that assumes there aren't any more emergencies.

  5. I've been thinking about how to build a better communication channel for pyvideo.org so people can more easily follow what's going on so they can act on things they're interested in.

    pyvideo.org has a "site news" section. It's a pain in the ass to use and it's not syndicated anywhere and it's likely no one sees it.

    Blogging status reports like this on my blog is better, but I don't think my blog is very widely read. Making my blog more widely-read seems like a lot of work and I'm not sure I can do it effectively anyhow.

    So I've decided to ditch the "site news" section of pyvideo.org and switch to Twitter. I started a @PyvideoOrg account.

    I'll tweet site updates, calls for help and newly posted conferences. I'm tossing around tweeting new videos when they get posted, but videos tend to get posted in huge batches and getting > 40 tweets all at once is a total drag. I'll have to think about that some more.

    Follow @PyvideoOrg if you're interested! Also, feel free to tweet at that account.

    I need to update the site to reflect this.

Also, in my life things are pretty nuts. I have a new kid and juggling everything was impossible for a while. I think that should ease up now and I can spend more time on pyvideo.org going forward.

That's the state of things!

Also, thank you thank you thank you thank you Rackspace!

Dennis v0.3.10 released! Fixes, status subcommand and Zombie!

What is it?

Dennis is a Python command line utility (and library) for working with localization. It includes:

  • a linter for finding problems in strings in .po files

  • a statuser for seeing the high-level status of your .po files

  • a translator for strings in .po files

v0.3.10 released!

v0.3.8 fixed mismatched errors in plural strings. Thanks Mike!

v0.3.9 fixed two false positives in error detection.

v0.3.10 adds the status subcommand and the Zombie transform which, like the dubstep transform, is silly but fun.

[Image: SUMO ... in Zombie!]

45 out of 47 Djangonauts use the Zombie transform to make their site accessible to those who have departed. This could open up your app to millions of new users. Truth.

Dennis v0.3.7 released! Dubstep and Django!

What is it?

Dennis is a Python command line utility (and library) for working with localization. It includes:

  • a linter for finding problems in strings in .po files

  • a translator for strings in .po files

v0.3.7 released!

v0.3.6 fixed a goof where the linter was skipping errors. Oops.

v0.3.7 adds a dubstep translator (which is just plain silly, but awesome).

[Image: SUMO ... in Dubstep!]

Truth: 9 out of 10 experts agree SUMO is extra helpful in dubstep.

v0.3.7 also adds Django command shims to make it easier to use Dennis in your Django project.

Use these instructions to set up Dennis so you can use its commands with ./manage.py.

If you aren't using Dennis, yet, it's worth taking a look at. l10n tools are the best!

Dennis v0.3.5 released!

What is it?

Dennis is a Python command line utility (and library) for working with localization. It includes:

  • a translator for strings in .po files

  • a linter for finding problems in strings in .po files

v0.3.5 released!

v0.3.4 fixed an issue with the linter so it skips fuzzy strings.

v0.3.5 fixes the rules default for the linter so that it includes the malformed lint rules. It also adds detection of formatting tokens like {0] that don't end in a closing curly brace. This kicks up a ValueError in Python:

>>> '{0]'.format(1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: unmatched '{' in format
>>>
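As a rough illustration of how that kind of malformed token can be caught ahead of time, here is a minimal sketch using the standard library's string.Formatter. This is an assumption about one way to do it, not a description of how Dennis implements its malformed rule, and the is_malformed name is made up.

```python
from string import Formatter


def is_malformed(text):
    """Return True if str.format() would reject the brace syntax in text."""
    try:
        # parse() is lazy, so consume the iterator to force the syntax check.
        list(Formatter().parse(text))
    except ValueError:
        return True
    return False
```

Note this only checks brace syntax. It says nothing about whether the variable names in the translation match the ones in the source string.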

If you're using Dennis---especially to detect errors in .po files before you push them to production---you should upgrade.

Talk like a pirate day!

Tomorrow, September 19th, is Talk like a Pirate Day. Dennis can help you celebrate with its built-in Pirate translator which works on .po files, but also works on any input from command line arguments or stdin.

Translate your HTML pages:

(cat < "$1" | dennis-cmd translate --pipeline=html,pirate -) > "pirate_$1"

Translate all your git commit messages with this hooks/commit-msg:

#!/bin/bash

# Pipe the contents of the commit message file through dennis to
# a temp file, then copy it back.
(cat < "$1" | dennis-cmd translate - > "$1.tmp") && mv "$1.tmp" "$1"

# We always exit 0 even if the dennis-cmd fails. If the dennis-cmd
# fails, you get your original commit message. No one likes it when
# shenanigans break your stuff for realz.
exit 0;

If you forget about this blog post, these two recipes are in the recipes section of the documentation. If you have other recipes, I'd love to hear about them!

Also, the Pirate! translator can always be improved. If there are improvements you want to make, please submit a pull request!

ElasticUtils v0.8 and v0.8.1 released!

What is it?

ElasticUtils is a Python library for building and executing Elasticsearch searches.

v0.8 and v0.8.1 released!

I missed the announcement for v0.8, so I'll cover both v0.8 and v0.8.1 here.

Roughly:

  • ElasticUtils now requires at least pyelasticsearch 0.6

  • adds range query and filter

  • adds S.filter_raw

  • changes the Indexable.index arguments dropping force_insert and picking up overwrite_existing

For the complete list of what's new, see What's new in Version 0.8.1.

Many thanks to everyone who helped out: Jannis Leidel, Rob Hudson and Grégoire Vigneron.

If you have any questions, let us know! We hang out on #elasticutils on irc.mozilla.org.

Switching to South

tl;dr

We just landed the bits that switch us from Schematic---the migration system we were using---to South. This is my account of that journey in case it helps others.

Context

Kitsune is the Django project that runs Mozilla Support. The project was started many years ago. For as long as I've worked on Kitsune, we used a migration system called Schematic.

Schematic has the nicety of being very very raw. You can do anything: raw SQL, raw shell, raw Python, raw fish---whatevs. This was nice because we could write whatever we wanted.

Schematic is a total pain in the ass because it hasn't been touched in 2 years, doesn't work well with recent versions of MySQL or MariaDB and makes it really difficult to write migrations that continue to work over time. Further, there's no way to do backwards migrations in Schematic even if you wanted to. We were constantly getting bit by these issues.

The switch

Switching from Schematic to South turned out to be pretty easy. I did it Monday afternoon. I essentially followed these steps:

  1. Added South as a dependency for our Django project.

  2. Initialized South migrations for all the apps we use in Kitsune which was a whole bunch of:

    $ ./manage.py schemamigration <appname> --initial

  3. Wrote a last Schematic migration that adds all the South bookkeeping which entailed dumping the output of this to a file:

    $ mysqldump <database> south_migrationhistory

    and then editing that file by hand.

    That creates the south_migrationhistory table and populates it with the bookkeeping for the initial migrations for the apps initialized in step 2.

  4. Added ./manage.py migrate to our deploy script.

  5. Do a happy dance!
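For illustration, the bookkeeping rows from step 3 can be sketched as generated SQL. This is a hypothetical alternative to hand-editing the mysqldump output, not what the Kitsune migration contains. It assumes the south_migrationhistory table has app_name, migration and applied columns and that each initial migration is named 0001_initial; the app names in the usage example are made up.

```python
from datetime import datetime


def bookkeeping_sql(app_names, applied):
    """Build one INSERT marking each app's initial migration as applied.

    Assumes app_names are trusted identifiers, so no SQL escaping is done.
    """
    stamp = applied.strftime("%Y-%m-%d %H:%M:%S")
    rows = ",\n".join(
        "    ('{}', '0001_initial', '{}')".format(app, stamp)
        for app in app_names
    )
    return (
        "INSERT INTO south_migrationhistory (app_name, migration, applied)\n"
        "VALUES\n" + rows + ";"
    )


# Hypothetical app names, for illustration only.
sql = bookkeeping_sql(["wiki", "questions"], datetime(2013, 9, 9, 12, 0, 0))
```

With rows like these in place, South treats the existing tables as already migrated and only runs migrations added afterwards.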

The relevant commits for Kitsune are here:

That's pretty much it.

Update: September 11, 2013

This was with Django 1.4.7 and South 0.8.2. If you're using different versions, you may experience different things.

Update: September 13, 2013

In Schematic, one thing we would do after creating new models is add a content type and the permissions.

This walks through setting that up automatically with South and post_migrate:

http://devwithpassion.com/felipe/south-django-permissions/

Dennis v0.3.3 released!

What is it?

Dennis is a Python command line utility (and library) for working with localization. It includes:

  • a translator for strings in .po files

  • a linter for finding problems in strings in .po files

v0.3.3 released!

This is the first blog-post-announced release. I think Dennis is good enough for wider use. I've been using it for development work on both kitsune (which drives Support) and fjord (which drives Input) with great success.

Why Dennis?

It fills two basic needs I had:

  1. translate .po files so I can find problems during development related to localized strings, layout issues, unicode support, etc

  2. lint translated .po files so that errors in translated strings don't make it to production where they cause fires, make users angry and make me very sad and tired

There's another project called Translate Toolkit that you could use for item 1, but it doesn't have Pirate! and I like my pipeline architecture since it's more "pluggable" (whatever that means). Plus, it didn't have a linter that covered my specific issues, nor did it return a non-zero exit status, so I couldn't use it for selective compiling.

Therefore I decided to write my own tool to meet my needs.

The ultra-basics

Install

$ pip install dennis
$ pip install blessings  # Optional for prettier output

Linting

Lint a single .po file for problems including mismatched/malformed Python variables in translated strings:

$ dennis-cmd lint locale/fr/LC_MESSAGES/messages.po

Produces output like this:

(dennis) saturn ~/mozilla/fjord> dennis-cmd lint locale/fr/LC_MESSAGES/messages.po
dennis-cmd version 0.3.4.dev
>>> Working on: /home/willkg/mozilla/fjord/locale/fr/LC_MESSAGES/messages.po
Error: mismatched: invalid variables: {count}
msgid: Most Recent Message
msgstr[0]: Les {count} derniers messages

Error: mismatched: invalid variables: {count}
msgid: Most Recent Message
msgid_plural: Last %(count)s Messages
msgstr[1]: Les {count} derniers messages

Warning: mismatched: missing variables: %(count)s
msgid: Most Recent Message
msgid_plural: Last %(count)s Messages
msgstr[1]: Les {count} derniers messages

Error: mismatched: invalid variables: {count}
msgid: {0} similar messages
msgstr: Les {count} derniers messages

Warning: mismatched: missing variables: {0}
msgid: {0} similar messages
msgstr: Les {count} derniers messages

Totals
  Warnings:     2
  Errors:       3

If you have blessings installed, it'll colorize that output.
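The "invalid variables" and "missing variables" messages boil down to a set comparison between the tokens in the source string and the tokens in the translation. Here is a minimal sketch of that idea. It is not Dennis's actual code: the regular expression is a deliberately small assumption that only covers %s, %(name)s and {name} tokens, and the function names are made up.

```python
import re

# Matches %(name)s / %(name)d, bare %s / %d, and {name} / {0} / {} tokens.
# Real format strings allow more variations than this pattern handles.
TOKEN_RE = re.compile(r"%\([A-Za-z0-9_]+\)[sd]|%[sd]|\{[A-Za-z0-9_]*\}")


def find_tokens(text):
    """Return the set of formatting tokens found in text."""
    return set(TOKEN_RE.findall(text))


def compare(msgid, msgstr):
    """Return (invalid, missing) token sets for one translated string."""
    expected = find_tokens(msgid)
    actual = find_tokens(msgstr)
    invalid = actual - expected  # tokens only in the translation
    missing = expected - actual  # tokens only in the source string
    return invalid, missing


invalid, missing = compare(
    "{0} similar messages", "Les {count} derniers messages"
)
```

For that last pair, invalid holds {count} and missing holds {0}, which lines up with the error and the warning in the output above.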

You can also lint a directory structure of .po files:

$ dennis-cmd lint --errorsonly locale/

We use this to compile only the error-free .po files to .mo files and tell us which .po files have problems so we can fix them. This prevents HTTP 500 errors and inaccessible pages due to maltranslated strings on our sites.
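A selective compile step along those lines can be sketched in Python. This is an illustration under stated assumptions rather than the script used on SUMO or Input: it assumes dennis-cmd lint exits with a non-zero status when it finds errors, it uses gettext's msgfmt to compile, and the lint_ok, compile_mo and compile_clean names are made up.

```python
import subprocess
from pathlib import Path


def lint_ok(po_path):
    """Assumes the linter signals errors with a non-zero exit status."""
    cmd = ["dennis-cmd", "lint", "--errorsonly", str(po_path)]
    return subprocess.run(cmd).returncode == 0


def compile_mo(po_path):
    """Compile a .po file to a .mo file next to it using msgfmt."""
    mo_path = po_path.with_suffix(".mo")
    subprocess.run(["msgfmt", "-o", str(mo_path), str(po_path)], check=True)


def compile_clean(locale_dir, lint=lint_ok, compile_=compile_mo):
    """Compile only the .po files that lint cleanly; return the rejects."""
    rejected = []
    for po_path in sorted(Path(locale_dir).rglob("*.po")):
        if lint(po_path):
            compile_(po_path)
        else:
            rejected.append(po_path)
    return rejected
```

The lint and compile steps are passed in as callables so the loop can be exercised without either tool installed. A locale that fails linting simply keeps whatever .mo file it had before, or none at all.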

Translating

You can translate a .po file in place into Pirate! to help find l10n issues in your code:

$ dennis-cmd translate --pipeline=html,pirate \
    locale/xx/LC_MESSAGES/messages.po

This takes into account that the strings have HTML in them that should be ignored when translating. It uses a pipeline architecture where the output of one transform is fed as input to the next, so you can string them along and get shouty extra-pirate with anglequotes:

$ dennis-cmd translate --pipeline=html,pirate,pirate,shouty,anglequote \
    locale/xx/LC_MESSAGES/messages.po
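The pipeline idea itself is small enough to sketch in a few lines of Python. The transforms below are toy stand-ins with made-up behavior, not Dennis's implementations, and run_pipeline is a hypothetical name; the point is only that each transform takes text and returns text, so they chain in whatever order the comma-separated spec lists them.

```python
from functools import reduce


def shouty(text):
    """Toy transform: uppercase everything."""
    return text.upper()


def anglequote(text):
    """Toy transform: wrap the text in angle quotes."""
    return "\u00ab" + text + "\u00bb"


def exclaim(text):
    """Toy transform: append an exclamation mark."""
    return text + "!"


TRANSFORMS = {"shouty": shouty, "anglequote": anglequote, "exclaim": exclaim}


def run_pipeline(spec, text):
    """Apply each named transform in order, feeding output to the next."""
    steps = [TRANSFORMS[name] for name in spec.split(",")]
    return reduce(lambda acc, step: step(acc), steps, text)


result = run_pipeline("exclaim,shouty,anglequote", "hello")
```

Because every step has the same text-in, text-out shape, repeating a name in the spec just applies that transform again.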

Summary

That's the gist of it. In the Dennis documentation is a list of Dennis recipes covering linting, translating, etc.

Yay for Dennis!

Switched Geeksphone to nightly channel

Got my Geeksphone Peak yesterday in the mail, but hadn't had a chance to open the package until today (chaotic life stuff).

I'm running Debian Testing and had done some Gaia/FirefoxOS work in late 2012. I had my development environment still set up including adb and fastboot.

I wanted to switch the phone to the nightly channel which is what I was using (or thought I was using) with my Unagi. Mostly I followed Updating and Tweaking your Firefox OS Developer Preview phone/Geeksphone.

After doing that, I moved my SIM card over as well as the Micro SD card I had in my Unagi phone. The Micro SD card has all my media on it---not my user data. I thought about moving my user data over, but I've had the Unagi so long that I figured my user data is probably in a funky state anyhow and better to start over.

Couple of thoughts:

  1. The nightly build is version 2.0.0-prerelease.

  2. The phone is a lot bigger than the Unagi---it'll take a bit of getting used to.

  3. Most things worked fine. I even got a marketing message from T-Mobile minutes after I inserted my SIM card and got on the network.

There are a few issues (the camera is not working at all). I need to spend some time with Bugzilla to file those with useful data. Regardless, the phone is working pretty well and I'm probably going to switch to it as my primary phone soon. That might give me some incentive to finish rewriting magic10ball.