Dev (old posts, page 23)

ElasticUtils v0.9 released!

What is it?

ElasticUtils is a Python library for building and executing Elasticsearch searches.

See the Quickstart for more details.

v0.9 released!

This is a big release, but there are some compromises in it that I'm not wildly excited about. Things like Elasticsearch 1.0 support didn't make the cut. I'm really sorry about that---we're working on it.

This release has a lot of changes in it. Roughly:

  • dropped pyelasticsearch for elasticsearch-py (Thank you Honza!)
  • fixed S.all() so it does what Django does, which should let you use an S in place of a QuerySet in some cases
  • new FacetResult class (Thank you James!)
  • S.facet() can take a size keyword
  • cleaned up ESTestCase
  • SearchResults now has facet data in the facets property
  • etc.
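
As a rough illustration of what a size on a facet means at the Elasticsearch level, here is a minimal sketch of the request body for a terms facet, using the facets API that Elasticsearch had at the time. The helper name terms_facet_body and the field name "product" are made up for this example, and this is not ElasticUtils code; presumably S.facet() builds something similar internally.

```python
def terms_facet_body(field, size=None):
    """Build the `facets` section of a search body for one terms facet.

    Hypothetical helper for illustration; not part of ElasticUtils.
    """
    terms = {"field": field}
    if size is not None:
        # `size` caps how many terms Elasticsearch returns for the facet.
        terms["size"] = size
    return {"facets": {field: {"terms": terms}}}


# "product" is an example field name, not one from the release notes.
body = terms_facet_body("product", size=20)
```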

For the complete list of what's new, see What's new in Version 0.9.

Many thanks to everyone who helped out: Alexey Kotlyarov, David Lundgren, Honza Král, James Reynolds, Jannis Leidel, Juan Ignacio Catalano, Kevin Stone, Mathieu Pillard, Mihnea Dobrescu-Balaur, nearlyfreeapps, Ricky Cook, Rob Hudson, William Tisäter and Will Kahn-Greene.

We're going to be sprinting on ElasticUtils 0.10 at PyCon US in Montreal in mid-April. If you're interested, come find me!

If you have any questions, let us know! We hang out on #elasticutils on irc.mozilla.org.

Dennis v0.3.10 released! Fixes, status subcommand and Zombie!

What is it?

Dennis is a Python command line utility (and library) for working with localization. It includes:

  • a linter for finding problems in strings in .po files
  • a statuser for seeing the high-level status of your .po files
  • a translator for strings in .po files

v0.3.10 released!

v0.3.8 fixed mismatched errors in plural strings. Thanks Mike!

v0.3.9 fixed two false positives in error detection.

v0.3.10 adds the status subcommand and the Zombie transform which, like the dubstep transform, is silly but fun.

/images/sumo_zombie1.thumbnail.png

SUMO ... in Zombie!

45 out of 47 Djangonauts use the Zombie transform to make their site accessible to those who have departed. This could open up your app to millions of new users. Truth.

Dennis v0.3.7 released! Dubstep and Django!

What is it?

Dennis is a Python command line utility (and library) for working with localization. It includes:

  • a linter for finding problems in strings in .po files
  • a translator for strings in .po files

v0.3.7 released!

v0.3.6 fixed a goof where the linter was skipping errors. Oops.

v0.3.7 adds a dubstep translator (which is just plain silly, but awesome).

/images/sumo_dubstep1.thumbnail.png

SUMO ... in Dubstep!

Truth: 9 out of 10 experts agree SUMO is extra helpful in dubstep.

v0.3.7 also adds Django command shims to make it easier to use Dennis in your Django project.

Use these instructions to set up Dennis so you can use its commands with ./manage.py.

If you aren't using Dennis yet, it's worth taking a look. l10n tools are the best!

Dennis v0.3.5 released!

What is it?

Dennis is a Python command line utility (and library) for working with localization. It includes:

  • a translator for strings in .po files
  • a linter for finding problems in strings in .po files

v0.3.5 released!

0.3.4 fixed an issue with the linter so it skips fuzzy strings.

0.3.5 fixes the rules default for the linter so that it includes the malformed lint rules. It also adds detection of malformed formatting tokens like {0], which doesn't end in a curly brace. A token like that kicks up a ValueError in Python:

>>> '{0]'.format(1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: unmatched '{' in format
>>>
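
If you want to run a similar check on your own strings, here is a minimal sketch that leans on Python's own format-string parser. The function name has_malformed_format_tokens is made up for this example, and this is not how Dennis implements its lint rule; it only shows the general idea of catching the ValueError before production does.

```python
from string import Formatter


def has_malformed_format_tokens(text):
    """Return True if str.format() would reject `text` as malformed."""
    try:
        # Formatter.parse() is lazy, so consume it to force the parsing.
        list(Formatter().parse(text))
    except ValueError:
        return True
    return False
```

Calling has_malformed_format_tokens('{0]') returns True, while a well-formed string like '{0} similar messages' returns False.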

If you're using Dennis---especially to detect errors in .po files before you push them to production---you should upgrade.

Talk like a pirate day!

Tomorrow, September 19th, is Talk like a Pirate Day. Dennis can help you celebrate with its built-in Pirate translator which works on .po files, but also works on any input from command line arguments or stdin.

Translate your HTML pages:

(cat < "$1" | dennis-cmd translate --pipeline=html,pirate -) > "pirate_$1"

Translate all your git commit messages with this hooks/commit-msg:

#!/bin/bash

# Pipe the contents of the commit message file through dennis to
# a temp file, then copy it back.
(cat < "$1" | dennis-cmd translate - > "$1.tmp") && mv "$1.tmp" "$1"

# We always exit 0 even if the dennis-cmd fails. If the dennis-cmd
# fails, you get your original commit message. No one likes it when
# shenanigans break your stuff for realz.
exit 0;
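
The same "never break the commit" idea can be sketched in Python, in case you'd rather write the hook that way. The function name rewrite_commit_message is made up for this example, and the command is passed in as a list, so nothing here is Dennis-specific.

```python
import subprocess


def rewrite_commit_message(path, command):
    """Pipe the file at `path` through `command`; keep the original on failure.

    Returns True if the file was rewritten, False if it was left alone.
    """
    with open(path, "rb") as fp:
        original = fp.read()
    try:
        result = subprocess.run(
            command,
            input=original,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        # Command missing or failed: the original message stays untouched.
        return False
    with open(path, "wb") as fp:
        fp.write(result.stdout)
    return True


# In a hooks/commit-msg script this would be called roughly as:
#   rewrite_commit_message(sys.argv[1], ["dennis-cmd", "translate", "-"])
```

Because the output is held in memory and only written back on success, there is no temp file to clean up if the command fails.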

If you forget about this blog post, these two recipes are in the recipes section of the documentation. If you have other recipes, I'd love to hear about them!

Also, the Pirate! translator can always be improved. If there are improvements you want to make, please submit a pull request!

ElasticUtils v0.8 and v0.8.1 released!

What is it?

ElasticUtils is a Python library for building and executing Elasticsearch searches.

v0.8 and v0.8.1 released!

I missed the announcement for v0.8, so I'll cover both v0.8 and v0.8.1 here.

Roughly:

  • ElasticUtils now requires at least pyelasticsearch 0.6
  • adds range query and filter
  • adds S.filter_raw
  • changes the Indexable.index arguments dropping force_insert and picking up overwrite_existing

For the complete list of what's new, see What's new in Version 0.8.1.

Many thanks to everyone who helped out: Jannis Leidel, Rob Hudson and Grégoire Vigneron.

If you have any questions, let us know! We hang out on #elasticutils on irc.mozilla.org.

Switching to South

tl;dr

We just landed the bits that switch us from Schematic---the migration system we were using---to South. This is my account of that journey in case it helps others.

Context

Kitsune is the Django project that runs Mozilla Support. The project was started many years ago. For as long as I've worked on Kitsune, we used a migration system called Schematic.

Schematic has the nicety of being very very raw. You can do anything: raw SQL, raw shell, raw Python, raw fish---whatevs. This was nice because we could write whatever we wanted.

Schematic is a total pain in the ass because it hasn't been touched in 2 years, doesn't work well with recent versions of MySQL or MariaDB and makes it really difficult to write migrations that continue to work over time. Further, there's no way to do backwards migrations in Schematic even if you wanted to. We were constantly getting bit by these issues.

The switch

Switching from Schematic to South turned out to be pretty easy. I did it Monday afternoon. I essentially followed these steps:

  1. Added South as a dependency for our Django project.

  2. Initialized South migrations for all the apps we use in Kitsune which was a whole bunch of:

    $ ./manage.py schemamigration <appname> --initial
    
  3. Wrote a last Schematic migration that adds all the South bookkeeping which entailed dumping the output of this to a file:

    $ mysqldump <database> south_migrationhistory
    

    and then editing that file by hand.

    That creates the south_migrationhistory table and populates it with the bookkeeping for the initial commits for the apps initialized in step 2.

  4. Added ./manage.py migrate to our deploy script.

  5. Do a happy dance!
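
For step 3, I dumped and hand-edited the table, but the bookkeeping rows could also be generated. Here is a minimal sketch of that alternative, assuming the south_migrationhistory columns are app_name, migration and applied, and that every initial migration is named 0001_initial. The function name and the app names in the usage note are only examples.

```python
import re
from datetime import datetime

# Only plain identifiers are accepted so values are safe to inline in SQL.
SAFE_NAME = re.compile(r"^[A-Za-z0-9_]+$")


def south_history_inserts(app_names, migration="0001_initial", applied=None):
    """Return INSERT statements marking each app's migration as applied."""
    if not SAFE_NAME.match(migration):
        raise ValueError("unexpected migration name: %r" % migration)
    stamp = (applied or datetime.now()).strftime("%Y-%m-%d %H:%M:%S")
    statements = []
    for app in app_names:
        if not SAFE_NAME.match(app):
            raise ValueError("unexpected app name: %r" % app)
        statements.append(
            "INSERT INTO south_migrationhistory (app_name, migration, applied) "
            "VALUES ('%s', '%s', '%s');" % (app, migration, stamp)
        )
    return statements
```

Something like south_history_inserts(["wiki", "questions"]) gives one statement per app, which you would still want to read over before putting it in a migration.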

The relevant commits for Kitsune are here:

That's pretty much it.

Update: September 11, 2013

This was with Django 1.4.7 and South 0.8.2. If you're using different versions, you may experience different things.

Update: September 13, 2013

In Schematic, one thing we would do after creating new models is add a content type and the permissions.

This walks through setting that up automatically with South and post_migrate:

http://devwithpassion.com/felipe/south-django-permissions/

Dennis v0.3.3 released!

What is it?

Dennis is a Python command line utility (and library) for working with localization. It includes:

  • a translator for strings in .po files
  • a linter for finding problems in strings in .po files

v0.3.3 released!

This is the first blog-post-announced release. I think Dennis is good enough for wider use. I've been using it for development work on both kitsune (which drives Support) and fjord (which drives Input) with great success.

Why Dennis?

It fills two basic needs I had:

  1. translate .po files so I can find problems during development related to localized strings, layout issues, unicode support, etc
  2. lint translated .po files so that errors in translated strings don't make it to production where they cause fires, make users angry and make me very sad and tired

There's another project called Translate Toolkit that you could use for item 1, but it doesn't have Pirate! and I like my pipeline architecture since it's more "pluggable" (whatever that means). Plus it didn't have a linter that covered my specific issues, and it doesn't return a non-zero exit status, so I couldn't use it for selective compiling.

Therefore I decided to write my own tool to meet my needs.

The ultra-basics

Install

$ pip install dennis
$ pip install blessings  # Optional for prettier output

Linting

Lint a single .po file for problems including mismatched/malformed Python variables in translated strings:

$ dennis-cmd lint locale/fr/LC_MESSAGES/messages.po

Produces output like this:

(dennis) saturn ~/mozilla/fjord> dennis-cmd lint locale/fr/LC_MESSAGES/messages.po
dennis-cmd version 0.3.4.dev
>>> Working on: /home/willkg/mozilla/fjord/locale/fr/LC_MESSAGES/messages.po
Error: mismatched: invalid variables: {count}
msgid: Most Recent Message
msgstr[0]: Les {count} derniers messages

Error: mismatched: invalid variables: {count}
msgid: Most Recent Message
msgid_plural: Last %(count)s Messages
msgstr[1]: Les {count} derniers messages

Warning: mismatched: missing variables: %(count)s
msgid: Most Recent Message
msgid_plural: Last %(count)s Messages
msgstr[1]: Les {count} derniers messages

Error: mismatched: invalid variables: {count}
msgid: {0} similar messages
msgstr: Les {count} derniers messages

Warning: mismatched: missing variables: {0}
msgid: {0} similar messages
msgstr: Les {count} derniers messages

Totals
  Warnings:     2
  Errors:       3

If you have blessings installed, it'll colorize that output.
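
To show the idea behind the "invalid variables" and "missing variables" messages above, here is a minimal sketch that compares the variables in a msgid with the ones in its msgstr. The regex and the name compare_variables are made up for this example and only cover %(name)s-style and {name}-style variables; Dennis's real rules handle more cases.

```python
import re

# Matches Python-style variables such as %(count)s, {0}, and {count}.
VARIABLE_RE = re.compile(r"%\([A-Za-z0-9_]+\)[sd]|\{[A-Za-z0-9_]*\}")


def compare_variables(msgid, msgstr):
    """Return (invalid, missing) sets of variables for a translated string."""
    source = set(VARIABLE_RE.findall(msgid))
    translated = set(VARIABLE_RE.findall(msgstr))
    invalid = translated - source  # in the translation but not the source
    missing = source - translated  # in the source but dropped in translation
    return invalid, missing


invalid, missing = compare_variables(
    "Last %(count)s Messages", "Les {count} derniers messages"
)
```

Here invalid holds {count} and missing holds %(count)s, which lines up with the error and warning reported for msgstr[1] in the output above.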

You can also lint a directory structure of .po files:

$ dennis-cmd lint --errorsonly locale/

I use this to compile only the error-free .po files to .mo files and tell us which .po files have problems so we can fix them.
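
The selective compiling boils down to checking the linter's exit status per file. Here is a minimal sketch of that, assuming the lint command exits 0 when a file has no errors and non-zero otherwise. The function name split_by_lint is made up for this example, and the lint command is passed in as a list so the sketch doesn't depend on any particular tool being installed.

```python
import subprocess


def split_by_lint(po_paths, lint_command):
    """Run `lint_command <path>` per file; return (clean, broken) path lists."""
    clean, broken = [], []
    for path in po_paths:
        # Assumption: the linter exits 0 when it found no errors.
        returncode = subprocess.call(lint_command + [path])
        if returncode == 0:
            clean.append(path)
        else:
            broken.append(path)
    return clean, broken


# Intended use (not run here):
#   clean, broken = split_by_lint(paths, ["dennis-cmd", "lint", "--errorsonly"])
#   Then compile only the files in `clean` to .mo and report the ones in `broken`.
```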

Translating

You can translate a .po file in place into Pirate! to help find l10n issues in your code:

$ dennis-cmd translate --pipeline=html,pirate \
    locale/xx/LC_MESSAGES/messages.po

This takes into account that the strings have HTML in them that should be ignored when translating. It uses a pipeline architecture where the output of one transform is fed as input to the next, so you can string them along and get shouty extra-pirate with anglequotes:

$ dennis-cmd translate --pipeline=html,pirate,pirate,shouty,anglequote \
    locale/xx/LC_MESSAGES/messages.po
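
The pipeline idea itself is tiny. Here is a minimal sketch with two toy transforms standing in for the real ones; the function names and the TRANSFORMS table are made up for this example and are not Dennis's actual code.

```python
def shouty(text):
    """Toy transform: uppercase everything."""
    return text.upper()


def anglequote(text):
    """Toy transform: wrap the text in angle quotes."""
    return "«" + text + "»"


TRANSFORMS = {"shouty": shouty, "anglequote": anglequote}


def run_pipeline(spec, text):
    """Apply the comma-separated transforms in `spec` to `text`, left to right."""
    for name in spec.split(","):
        # The output of each transform becomes the input of the next one.
        text = TRANSFORMS[name](text)
    return text


result = run_pipeline("shouty,shouty,anglequote", "Hello")
```

Repeating a transform in the spec just applies it again, which is how pirate,pirate gets you extra-pirate.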

Summary

That's the gist of it. The Dennis documentation has a list of recipes covering linting, translating, etc.

Yay for Dennis!

Switched Geeksphone to nightly channel

Got my Geeksphone Peak yesterday in the mail, but hadn't had a chance to open the package until today (crazy life stuff).

I'm running Debian Testing and had done some Gaia/FirefoxOS work in late 2012. I had my development environment still set up including adb and fastboot.

I wanted to switch the phone to the nightly channel which is what I was using (or thought I was using) with my Unagi. Mostly I followed Updating and Tweaking your Firefox OS Developer Preview phone/Geeksphone.

After doing that, I moved my SIM card over as well as the Micro SD card I had in my Unagi phone. The Micro SD card has all my media on it---not my user data. I thought about moving my user data over, but I've had the Unagi so long that I figured my user data is probably in a funky state anyhow and better to start over.

Couple of thoughts:

  1. The nightly build is version 2.0.0-prerelease.
  2. The phone is a lot bigger than the Unagi---it'll take a bit of getting used to.
  3. Most things worked fine. I even got a marketing message from T-Mobile minutes after I inserted my SIM card and got on the network.

There are a few issues (the camera is not working at all). I need to spend some time with Bugzilla to file those with useful data. Regardless, the phone is working pretty well and I'm probably going to switch to it as my primary phone soon. That might give me some incentive to finish rewriting magic10ball.

ElasticUtils v0.7 released!

What is it?

ElasticUtils is a Python library for building and executing Elasticsearch searches.

v0.7 released!

Turns out I haven't announced an ElasticUtils release since August 2012. Why? Partially because up until now, I always had deep-seated problems with ElasticUtils and wasn't excited about announcing yet another version with things I disliked in it.

I feel really good about v0.7 for a variety of reasons. Let me tell you some of them:

  1. We switched from pyes to pyelasticsearch. I'm really happy with this.

  2. There was a monumental effort to fix sharp edges in the API, generalize bits that needed generalizing, improve the quality of the software, improve the test suite, improve the docs, ...

    Doing a git diff --stat tells me:

    65 files changed, 6164 insertions(+), 2716 deletions(-)
    

    That's a lot of change for a small project like this.

If you're using ElasticUtils, I highly encourage you to update to v0.7. We're using it on Input and Support already.

For the complete list of what's new, see What's new in Version 0.7.

Many thanks to everyone who helped out: Erik Rose, Jannis Leidel, Rob Hudson, Steve Ivy, Will Kahn-Greene (oh, that's me!), Chris McDonald, Ricky Rosario, James Socol, Giorgos Logiotatidis, Mike Cooper, Grégoire Vigneron, Chris Sinchok and Brandon Adams.

If you have any questions, let us know! We hang out on #elasticutils on irc.mozilla.org.

My thoughts on Elasticsearch: Part 1: indexing

Summary

I just finished up an overhaul of ElasticUtils and then an overhaul of the search infrastructure for support.mozilla.org. During that period of time, I thought about extending the ElasticUtils documentation to include things I discovered while working on these projects. Then I decided that this information is time-sensitive---it's probably good now, but might not be in a year. Maintaining it in the ElasticUtils docs seemed like more work than it was worth.

Thus I decided to write a series of blog posts.

This one covers indexing. Later ones will cover mappings, searching and other things.

It's also long, rambling and contains code. The rest is after the break.
