ElasticUtils v0.7 released!

What is it?

ElasticUtils is a Python library for building and executing Elasticsearch searches.

v0.7 released!

Turns out I haven't announced an ElasticUtils release since August 2012. Why? Partially because up until now, I always had deep-seated problems with ElasticUtils and wasn't excited about announcing yet another version with things I disliked in it.

I feel really good about v0.7 for a variety of reasons. Let me tell you some of them:

  1. We switched from pyes to pyelasticsearch. I'm really happy with this.

  2. There was a monumental effort to fix sharp edges in the API, generalize bits that needed generalizing, improve the quality of the software, improve the test suite, improve the docs, ...

    Doing a git diff --stat tells me:

    65 files changed, 6164 insertions(+), 2716 deletions(-)

    That's a lot of change for a small project like this.

If you're using ElasticUtils, I highly encourage you to update to v0.7. We're using it on Input and Support already.

For the complete list of what's new, see What's new in Version 0.7.

Many thanks to everyone who helped out: Erik Rose, Jannis Leidel, Rob Hudson, Steve Ivy, Will Kahn-Greene (oh, that's me!), Chris McDonald, Ricky Rosario, James Socol, Giorgos Logiotatidis, Mike Cooper, Grégoire Vigneron, Chris Sinchok and Brandon Adams.

If you have any questions, let us know! We hang out on #elasticutils on irc.mozilla.org.

Proposal: LDAP password resets as a unit of measure

Backstory

Every 3 months, we at Mozilla have to reset our LDAP passwords. The system helpfully sends the first reminder 2 weeks before your password expires, then the second reminder 1 week before your password expires and the last reminder 2 days before your password expires.

Sometimes time passes by faster than you know and you end up with a Locked out of LDAP account.

The 3 month LDAP password reset is such a large part of our lives that I propose it become a standard unit of measure for elapsed time.

Usage

Used in casual conversation:

Pat: Hi!

Jordan: Hi!

Pat: I haven't seen you before. How long have you been at
Mozilla?

Jordan: I've been here for 6 LDAP password resets.

Pat: Oh, weird. I've been here for 7. Good to meet you!
Would you like a banana?

Jordan: Would I ever!

Used in casual conversation on IRC:

<patbot> anyone use less?
<corycory> i only use sass. it's the best.
* riledupriley has quit (Quit: riledupriley)
<patbot> :(
<hugbot> (patbot)
* r1cky has joined #casualconversationexample
<r1cky> morning!
<nigelb> r1cky: hai!
<nigelb> Ah, it's nearly mfbt.
<mtjordan> sure. been using it for 3 ldap password resets.
<mtjordan> patbot: why do you ask?

Used in Bugzilla comments:

Jordan [:jordan]  1 day ago       Comment 0 [reply] [-]

Readonly mode causes the site to ISE.

Pat [:pat]  1 day ago             Comment 1 [reply] [-]

I looked into it. Turns out we haven't used readonly
mode in at least 4 LDAP password resets.

I think we just need to add a fake authentication
module. Easy peasy.

Used when joining a new group:

From: Pat
To: some-group@mozilla.org
Subject: Welcome Jordan to some-group!

Hi all!

I'd like to welcome Jordan to some-group! Jordan brings
expertise that is invaluable. I'm excited! Yay!

Jordan: Tell us about yourself!

Pat

From: Jordan
To: some-group@mozilla.org
Subject: Re: Welcome Jordan to some-group!

Hi!

I'm excited to join some-group! Hopefully I bring something
useful to the table.

I've been at Mozilla for 7 LDAP password resets, I like
top-posting and I make a mean cold brew coffee.

Looking forward to my first meeting!

Jordan


On Blah blah blah at blah blah blah, Pat wrote:
> Hi all!
>
> I'd like to welcome Jordan to some-group! Jordan brings
> expertise that is invaluable. I'm excited! Yay!
>
> Jordan: Tell us about yourself!
>
> Pat

Used in an email to everyone@ about departing:

Dear everyone!

It is with sadness that I tell you I'm leaving as of next
Friday. As you know, I've been with Mozilla for 32 LDAP
password resets and frankly, I'm totally out of usable
Sherlock Holmes story titles, so I'm off to new challenges.

I will miss you all.

Update: Potch suggested just using "LDAPs". Used in a sentence: "I've been here for 6 LDAPs." I like that.

My thoughts on Elasticsearch: Part 1: indexing

Summary

I just finished up an overhaul of ElasticUtils and then an overhaul of the search infrastructure for support.mozilla.org. During that period of time, I thought about extending the ElasticUtils documentation to include things I discovered while working on these projects. Then I decided that this information is temporal---it's probably good now, but might not be in a year. Maintaining it in the ElasticUtils docs seemed like more work than it was worth.

Thus I decided to write a series of blog posts.

This one covers indexing. Later ones will cover mappings, searching and other things.

It's also long, rambling and contains code. The rest is after the break.

pyvideo status: April 3rd, 2013

What is pyvideo.org

pyvideo.org is an index of Python-related conference and user-group videos on the Internet. Saw a session you liked and want to share it? It's likely you can find it, watch it, and share it with pyvideo.org.

Status

  • Videos for PyCon US 2013 are still going up. There are 115 posted and live now. There are around 30 that are waiting for presenters to look at the metadata and tell Carl whether the metadata is good or not. More on that later.

  • Several new people submitted patches to richard! Several of the patches were fixes to broken things they saw on pyvideo.org. I've applied the fixes to the site directly, but have been waiting on making any non-critical updates to the site until after things have cooled off. I think I'll do a site update in the next week or so.

  • PyData 2013 was recorded. When videos are posted, they'll be in the PyData category. I don't know what the posting schedule is.

  • I was contacted a couple of times by the inimitable Montréal Python to post their videos. They're going to test out steve, the tool I've been writing for the last 6 months to make it possible for other folks to generate the video metadata needed by pyvideo.org.

    I eagerly look forward to their progress and to their videos getting on the site.

    If it works out well, I'll blog more about steve and look for volunteers to use steve to generate the video metadata for the ever increasing backlog.

  • Several people are gittip'ing me. It's not a lot of money, but that and the many emails I've gotten over the last few weeks about the site have been really great. I work on pyvideo.org in my free time, of which I don't have a lot. It's nice to know that prioritizing pyvideo.org work over other things helps you.

That's the gist of things!

Most of the PyCon US 2013 videos that aren't live are waiting for presenters to tell Carl at NextDayVideo (carl at nextdayvideo dot com) whether the metadata is good.

  • If you see your name on this list and you've told Carl the metadata is fine already, please send him a friendly reminder.

  • If you see your name on this list and you haven't told Carl anything, please send him a "yes, this is great!" or the list of things you need corrected.

  • If you see a friend on this list, tell your friend to do one of the above.

I'll update this list as I'm aware of changes. However, I don't work for NextDayVideo, so it's entirely possible my list is not current and/or there are errors. If so, please let me know.

Here's the list (last updated 2013-04-12 7:13am -0400):

  • Digital signal processing through speech, hearing, and Python -- Mel Chua

  • Faster Python Programs through Optimization -- Mike Müller

  • Python beyond the CPU -- Andy Terrel, Travis Oliphant, Mark Florisson

  • Code to Cloud in under 45 minutes -- John Wetherill

  • A Gentle Introduction to Computer Vision -- Katherine Scott, Anthony Oliver

  • Documenting Your Project in Sphinx -- Brandon Rhodes

  • Contribute with me! Getting started with open source development -- Jessica McKellar

  • Intermediate Twisted: Test-Driven Networking Software -- Itamar Turner-Trauring

  • Gittip: Inspiring Generosity -- Chad Whitacre

  • The Magic of Metaprogramming -- Jeff Rush

  • You can be a speaker at PyCon! -- Anna Ravenscroft

  • sys._current_frames(): Take real-time x-rays of your software for fun and performance -- Leonardo Rochael

  • Planning and Tending the Garden: The Future of Early Childhood Python Education -- Kurt Grandis

  • powerful pyramid features -- Carlos de la Guardia

  • Python for Robotics and Hardware Control -- Jonathan Foote

  • Copyright and You -- Frank Siler

  • Chef: Automating web application infrastructure -- Kate Heddleston

  • Numba: A Dynamic Python compiler for Science -- Travis Oliphant, Siu Kwan Lam, Mark Florisson

  • Integrating Jython with Java -- Jim Baker, Shashank Bharadwaj

  • Iteration & Generators: the Python Way -- Luciano Ramalho

  • ApplePy: An Apple ][ emulator in Python -- James Tauber

  • Distributed Coordination with Python -- Ben Bangert

  • Become a logging expert in 30 minutes -- Gavin M. Roy

  • PyNES: Python programming for Nintendo 8 bits -- Guto Maia

  • Purely Python Imaging with Pymaging -- Jonas Obrist

  • Namespaces in Python -- Eric Snow

These are all set now:

  • IPython in-depth: high-productivity interactive and parallel python -- Fernando Perez, Brian Granger, Min RK

  • Pyramid for Humans -- Paul Everitt

  • Learn Python Through Public Data Hacking -- David Beazley

  • Rethinking Errors: Learning from Scala and Go -- Bruce Eckel

ElasticUtils sprint at PyCon US 2013

What is it?

ElasticUtils is a Python library for building and executing ElasticSearch searches.

PyCon US 2013 sprint

I was only at the sprints for a single day. Rob and I spent some time working on elasticutils. Several good things came out of that:

  1. Rob wrote up an elasticutils Django middleware which throws a 501 or 503 page if an unhandled pyelasticsearch or requests exception is raised

  2. I fixed the Django tasks, added a test, and updated the documentation

  3. I cleaned up the Django ElasticSearchTestCase class

  4. I spent a bunch of time thinking about queries, syntax and functionality
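To give a feel for the idea behind the middleware in item 1, here's a minimal sketch written as plain WSGI middleware rather than Django middleware, so it runs with nothing but the standard library. It is not the code Rob wrote: `SearchBackendError`, `SearchErrorMiddleware` and the response text are hypothetical names made up for this example, and it only covers the 503 case.

```python
class SearchBackendError(Exception):
    """Hypothetical stand-in for an error raised by the search client."""


class SearchErrorMiddleware:
    """WSGI middleware that turns unhandled search errors into a 503 page.

    Assumption: only exceptions raised while the wrapped app is called are
    caught. Errors raised later, while a lazy response body is iterated,
    are not handled by this sketch.
    """

    def __init__(self, app, handled=(SearchBackendError,)):
        self.app = app
        self.handled = tuple(handled)

    def __call__(self, environ, start_response):
        try:
            return self.app(environ, start_response)
        except self.handled:
            body = b"Search is temporarily unavailable."
            start_response(
                "503 Service Unavailable",
                [
                    ("Content-Type", "text/plain"),
                    ("Content-Length", str(len(body))),
                ],
            )
            return [body]


def broken_app(environ, start_response):
    """Toy application that always fails the way a dead search node might."""
    raise SearchBackendError("connection refused")


wrapped_app = SearchErrorMiddleware(broken_app)
```

In a real project you'd pass the client library's own exception classes in `handled` instead of the stand-in.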

Someone on IRC asked when the next version of elasticutils will go out. I have no schedule right now, but I think it's important to let the code get used by projects that don't mind being bleeding edge and bake for a bit. The code in master tip right now is 0.7.dev and the big change since 0.6 is that we switched from pyes to pyelasticsearch. That's a big change---the more baking it does, the better.

Having said that, a release depends mostly on how much free time I have in the near future. I'm about to lose all free time for a bit, so my guess is that we won't see a 0.7 release until this summer unless there's a compelling reason to push one out.

In the meantime, I'm actively maintaining the v0.5 and v0.6 branches. I'd like to stop maintaining the v0.5 branch, but need to get Mozillians and AMO off of it first.

If you have any questions, let us know! We hang out on #elasticutils on irc.mozilla.org.

Adding Persona authentication to richard

tl;dr

This is a post covering my first-time experience with integrating Persona authentication into my Django project named richard. I briefly cover why I did it, what I used, and list the commits I did the work in as an example of how it can be done. I hope this helps others implement it on their sites.

why

A month ago, I added Persona authentication support to richard. This allowed me to use Persona authentication for pyvideo.org. I did this for several reasons:

  1. I wanted to try it out and see how well it worked on a small Django site (tl;dr works great---I'll use this on all my sites)

  2. I wanted people to authenticate with an email-based identity rather than a social network based identity

  3. I wanted to allow people to create accounts on pyvideo.org, but didn't want to deal with the responsibility of protecting things like passwords

So that's where I'm coming from.

how

I used django-browserid, which gives you some JavaScript and a few template tags that make it easy to incorporate Persona authentication into a Django app.

It took about 15 minutes to get it working. I've made some minor edits to the code since then and updated to v0.8 of django-browserid. All told, I think I've spent a couple of hours on Persona implementation.

In the process of doing that work, I hit a few minor issues, created some pull requests, helped with other pull requests and became one of the maintainers. Yay!

Here are the commits I did the work in. I figured the diffs might help you implement similar things on your sites:

That last commit updates to django-browserid master tip to pick up a fix to login failures if BROWSERID_CREATE_USER is False. That fix will be released in v0.8.1 soon.

further reading

The Mozilla Persona site explains why Persona exists and has a Developer FAQ.

The django-browserid docs are pretty good and walk through setting it up, advanced usage, and troubleshooting. I encourage you to read through them in full---it'll give you a better understanding of the pieces.

Dan Callahan did a talk at PyCon US 2013 on Persona. That's worth watching. It covers why Mozilla built it, how it works, and why it's important that it works that way. He also demos integrating it into sites and talks about using Persona authentication alongside other authentication methods.

If you're interested in adding Persona authentication to your Django site and need help, let me know.

Django Eadred v0.2 released! Django app for generating sample data.

Django Eadred gives you some scaffolding for generating sample data to make it easier for new contributors to get up and running quickly, bootstrapping required database data, and generating large amounts of random data for testing graphs and things like that.

For v0.2, I added some helper methods for generating names, email addresses, sentences and paragraphs. It's definitely the case that these helpers won't handle all use cases, but I think they'll help specific ones.

There are no backwards-compatibility problems with v0.1.

To update, do:

pip install -U eadred

SUMO: Now ... in pirate!

A while back, I wrote a post about poxx.py. It covered a script I based on Ned Batchelder's poxx.py script and overhauled to provide a faux "Swedish Chef" translation of Miro strings, which let me test localizations of the application.

The transform from English to "Swedish Chef" had the following four important properties:

  1. the output is vaguely readable

  2. the output is longer than the input which helps us find UI issues

  3. the output is clearly distinguished from English which helps us find strings that aren't getting translated

  4. the output is mildly amusing which is sometimes important in dark times
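As a rough illustration of those four properties, here's a minimal sketch of a test-locale transform. It is not the actual poxx.py code: the word list, the `to_pirate` name and the " Arrr!" suffix are all made up for this example.

```python
import re

# Hypothetical substitutions. Every replacement is at least as long as the
# word it replaces, so together with the suffix the output is always longer
# than the input.
PIRATE_WORDS = {
    "hello": "ahoy there",
    "my": "me",
    "is": "be",
    "the": "th'",
    "yes": "aye aye",
    "friend": "shipmate",
}

WORD = re.compile(r"[A-Za-z]+")


def pirate_word(match):
    """Swap one matched word, keeping a leading capital letter."""
    word = match.group(0)
    replacement = PIRATE_WORDS.get(word.lower())
    if replacement is None:
        # Unknown words pass through, which keeps the output readable.
        return word
    if word[0].isupper():
        return replacement.capitalize()
    return replacement


def to_pirate(text):
    """Return a faux-pirate version of text.

    The suffix makes every translated string longer than the original and
    easy to tell apart from an untranslated English string.
    """
    return WORD.sub(pirate_word, text) + " Arrr!"


print(to_pirate("Hello, my friend"))
```

Any English string that shows up in the UI without the suffix is a string that isn't going through translation.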

Back in August, I made some changes and pulled it into Fjord. This helped us suss out localization issues on a new site. However, I wasn't really happy with it. Amongst other things, it always felt like "Swedish Chef" was culturally insensitive.

A couple of weeks ago, I overhauled poxx.py again. This time, PIRATE! It continues to have the four properties I think are important for a test locale.

We're using it now for SUMO development. It's the grog to your Jolly Roger:

[Image: SUMO -- In Pirate!]

We're using this script on both SUMO and Fjord now. You can use it for your site, too! The code is at https://github.com/mozilla/kitsune/tree/master/scripts/.

If you see any problems with it, toss me a message in a bottle.

pyvideo status: February 3rd, 2013

What is pyvideo.org

pyvideo.org is an index of Python-related conference and user-group videos on the Internet. Saw a session you liked and want to share it? It's likely you can find it, watch it, and share it with pyvideo.org.

Status

  • Videos for PyCon AU 2012 are posted.

    That's probably the last conference I'm going to do on my own. More about that later.

  • I've made some big changes to richard. For one, formatted fields use Markdown instead of HTML now (yay!). I've improved the API. I've made a lot of layout tweaks and user interface improvements.

  • I pushed out steve v0.1 and then promptly made a bunch of fixes, tweaks and changes. So I need to do a new release soon. steve is the utility people can use to generate conference data for pyvideo.org. See the commandline chapter for details.

I've been working on getting steve and richard to the point where I'm neither doing all the work nor am I the bottleneck for work being done.

I still need to write up a blog post on how to use steve to generate JSON files for pyvideo.org. That will make it possible for anyone to add conference video.

I'm working on changing richard to allow other people to edit video metadata. It'll continue to be curated, but this will make it possible for other people to help. There are about 1600 videos, the repository continues to grow, and I'm just one man. I have some of this worked out on paper, but it needs to be implemented.

That's the current push. I'm hoping to have a lot of this done for PyCon 2013.

Mozilla: 1 year review

This was my first full year at Mozilla and it was intense. I essentially worked on four projects: SUMO, Input, ElasticUtils and Gaia. This blog post talks about the first two which are worked on by the James' Rifles SUMINPUT Megalosaur team.

We accomplished a lot on SUMO this year. I spent a couple of hours last week throwing together a rough "year in review" script that looked at Bugzilla and git and crunched some numbers:

Twas the year: 2012
===================


Bugzilla
========

Bugs created: 938

              rrosario : 201
               a.topal : 188
                willkg : 108
           scoobidiver : 51
               igarcia : 41
                mverdi : 36
      swarnavasengupta : 30
                 james : 29
                  bram : 19
            tobbi.bugs : 17

Bugs resolved: 1025

              rrosario : 335
                       : WORKSFORME 18
                       :    INVALID 16
                       :  DUPLICATE 23
                       :    WONTFIX 7
                       :      FIXED 263
                       : INCOMPLETE 8
               a.topal : 182
                       : WORKSFORME 36
                       :    INVALID 41
                       :  DUPLICATE 11
                       :    WONTFIX 70
                       :      FIXED 21
                       : INCOMPLETE 3
                willkg : 131
                       :  DUPLICATE 6
                       :      FIXED 110
                       : WORKSFORME 2
                       :    WONTFIX 11
                       :    INVALID 2
                rdalal : 84
                       :      FIXED 84
                 james : 51
                       : WORKSFORME 6
                       :    INVALID 5
                       :  DUPLICATE 3
                       :    WONTFIX 15
                       :      FIXED 14
                       : INCOMPLETE 8
               mcooper : 37
                       :      FIXED 36
                       :    INVALID 1
            tobbi.bugs : 29
                       :      FIXED 29
             tgavankar : 28
                       :    WONTFIX 1
                       : WORKSFORME 1
                       :      FIXED 26
           scoobidiver : 28
                       :      FIXED 4
                       :  DUPLICATE 4
                       : WORKSFORME 11
                       :    WONTFIX 3
                       :    INVALID 6
               bmo2010 : 13
                       :      FIXED 1
                       :  DUPLICATE 3
                       : WORKSFORME 3
                       :    INVALID 6

            INCOMPLETE : 21
             DUPLICATE : 61
            WORKSFORME : 82
               INVALID : 91
               WONTFIX : 117
                 FIXED : 653

git
===

Total commits: 916

         Ricky Rosario : 430
      Will Kahn-Greene : 192
           Rehan Dalal : 98
           Mike Cooper : 44
             Erik Rose : 34
                 Tobbi : 29
        Tanay Gavankar : 23
           Kadir Topal : 11
             Tim Watts : 10
         Berker Peksag : 9
           James Socol : 7
            Victor Neo : 6
      Cesar Carruitero : 5
           David Lilly : 4
                  Ibai : 3
        Isac Lagerblad : 2
                 icaaq : 1
           TylerDowner : 1
              browning : 1
         ricky rosario : 1
    Anatoli Papirovski : 1
     Clauber Stipkovic : 1
          Jason Thomas : 1
                atopal : 1
      Florin Strugariu : 1

There are some interesting bits in there:

  1. Ricky does a lot of work! Holy cow!

  2. There were 23 people who contributed code to Kitsune (the SUMO codebase) this year. Of those, about half are volunteer contributors.

    Compare that with 2011, when we had 19 people who contributed to the code base and fewer than half were volunteer contributors.

  3. We resolved more bugs than we created in 2012. We did that in 2011 as well, so that's two years in a row. I've never seen that happen before on a project I work on.
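For the curious, the git half of a script like that can be quite small. Here's a minimal sketch, not the actual script: the function names, the date range and the column width are assumptions for this example. It counts author names exactly as git reports them, which is why names that differ only in case (like "Ricky Rosario" and "ricky rosario" above) end up as separate entries.

```python
import subprocess
from collections import Counter


def authors_from_git(repo_path):
    """Return one author name per commit made in 2012.

    Assumes git is installed and repo_path is a git checkout.
    """
    output = subprocess.check_output(
        ["git", "log", "--since=2012-01-01", "--until=2013-01-01",
         "--format=%an"],
        cwd=repo_path,
        text=True,
    )
    return [line for line in output.splitlines() if line]


def count_commits(authors):
    """Count commits per author name, most prolific first."""
    return Counter(authors).most_common()


def format_report(counts, width=22):
    """Format the counts with right-aligned names, like the report above."""
    lines = ["Total commits: %d" % sum(n for _, n in counts), ""]
    for name, n in counts:
        lines.append("%s : %d" % (name.rjust(width), n))
    return "\n".join(lines)


# Tiny made-up sample so the sketch runs without a repository.
sample_authors = ["Ricky", "Will", "Ricky", "ricky"]
print(format_report(count_commits(sample_authors)))
```

To run it against a real checkout, pass the result of `authors_from_git("/path/to/repo")` to `count_commits` instead of the sample list.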

The codebase is pretty different now than it was at the beginning of the year. I helped with the following semi-massive overhauls:

  1. The push for more metrics and dashboards to view the numbers.

  2. The switch from Sphinx to ElasticSearch.

  3. The new Information Architecture which affected browsing and searching across the site.

  4. The site redesign which covered both the desktop and mobile versions of the site.

  5. The upgrade to Django 1.4.

  6. The switch from arecibo to sentry.

  7. The push to switch from fixtures to model makers for all our tests.

  8. The switch from weekly deployments on Tuesdays to deploying whenever we want. Continuous deployment is fantastic.

  9. Started switching the whole site from Webtrends to Google Analytics. I saw Ricky write up a bunch of bugs to finish up that work, so I'll say it's in progress.

  10. During the redesign, Rehan redid all the CSS and switched us to use LESS.

  11. I spun off some code I wrote for richard, then ported to Fjord, then improved into a project called django-eadred. That makes it a lot easier to generate sample data for a variety of purposes like new contributors, bootstrapping, and large random data sets.

On top of that, we did a lot of work on the documentation and making it easier to get to a working Kitsune development environment. We switched to a sprint-based work flow using Scrumbugz. We also nixed our daily checkin conference call for an IRC-based checkin system that we wrote called Standup.

It's been a big year.

For Input, it was a bigger year. We decided to abandon the old Input codebase (omfg yay) in favor of rewriting it from the ground up. The rewrite took a couple of months and has been sitting around since then waiting for a security review. In the meantime, we (actually, Mike) fixed a bunch of issues with the old site code because that's what's currently in production.

Rewriting Input wouldn't have taken so long except that we did a lot of work fixing bugs in external libraries and updating Playdoh. That work definitely cut into our schedule, but it benefitted a bunch of other groups/people/sites, so that's good.

That's the gist of the year: it was a lot of work, but we accomplished a ton.

w00t for 2012!