Will's blog

purpose: Will Kahn-Greene's blog of Miro, PyBlosxom, Python, GNU/Linux, random content, and other projects mixed in there ad hoc, half-baked, and with a twist of lemon

Wed, 12 Jun 2013

ElasticUtils v0.7 released!

What is it?

ElasticUtils is a Python library for building and executing Elasticsearch searches.

v0.7 released!

Turns out I haven't announced an ElasticUtils release since August 2012. Why? Partially because up until now, I always had deep-seated problems with ElasticUtils and wasn't excited about announcing yet another version with things I disliked in it.

I feel really good about v0.7 for a variety of reasons. Let me tell you some of them:

  1. We switched from pyes to pyelasticsearch. I'm really happy with this.

  2. There was a monumental effort to fix sharp edges in the API, generalize bits that needed generalizing, improve the quality of the software, improve the test suite, improve the docs, ...

    Doing a git diff --stat tells me:

    65 files changed, 6164 insertions(+), 2716 deletions(-)
    

    That's a lot of change for a small project like this.

If you're using ElasticUtils, I highly encourage you to update to v0.7. We're using it on Input and Support already.

For the complete list of what's new, see What's new in Version 0.7.

Many thanks to everyone who helped out: Erik Rose, Jannis Leidel, Rob Hudson, Steve Ivy, Will Kahn-Greene (oh, that's me!), Chris McDonald, Ricky Rosario, James Socol, Giorgos Logiotatidis, Mike Cooper, Grégoire Vigneron, Chris Sinchok and Brandon Adams.

If you have any questions, let us know! We hang out on #elasticutils on irc.mozilla.org.

Fri, 10 May 2013

My thoughts on Elasticsearch: Part 1: indexing

Summary

I just finished up an overhaul of ElasticUtils and then an overhaul of the search infrastructure for support.mozilla.org. During that period of time, I thought about extending the ElasticUtils documentation to include things I discovered while working on these projects. Then I decided that this information is temporal---it's probably good now, but might not be in a year. Maintaining it in the ElasticUtils docs seemed like more work than it was worth.

Thus I decided to write a series of blog posts.

This one covers indexing. Later ones will cover mappings, searching and other things.

It's also long, rambling and contains code. The rest is after the break.

Wed, 20 Mar 2013

ElasticUtils sprint at PyCon US 2013

What is it?

ElasticUtils is a Python library for building and executing ElasticSearch searches.

PyCon US 2013 sprint

I was only at the sprints for a single day. Rob and I spent some time working on elasticutils. Several good things came out of that:

  1. Rob wrote up an elasticutils Django middleware which throws a 501 or 503 page if an unhandled pyelasticsearch or requests exception is raised
  2. I fixed the Django tasks, added a test, and updated the documentation
  3. I cleaned up the Django ElasticSearchTestCase class
  4. I spent a bunch of time thinking about queries, syntax and functionality

Someone on IRC asked when the next version of elasticutils will go out. I have no schedule right now, but I think it's important to let the code get used by projects that don't mind being bleeding edge and bake for a bit. The code in master tip right now is 0.7.dev and the big change since 0.6 is that we switched from pyes to pyelasticsearch. That's a big change---the more baking it does, the better.

Having said that, a release depends mostly on how much free time I have in the near future. I'm about to lose all free time for a bit, so my guess is that we won't see a 0.7 release until this summer unless there's a compelling reason to push one out.

In the meantime, I'm actively maintaining the v0.5 and v0.6 branches. I'd like to stop maintaining the v0.5 branch, but need to get Mozillians and AMO off of it first.

If you have any questions, let us know! We hang out on #elasticutils on irc.mozilla.org.

Tue, 19 Mar 2013

Adding Persona authentication to richard

tl;dr

This post covers my first experience integrating Persona authentication into my Django project named richard. I briefly cover why I did it, what I used, and list the commits I did the work in as an example of how it can be done. I hope this helps others implement it on their sites.

why

A month ago, I added Persona authentication support to richard. This allowed me to use Persona authentication for pyvideo.org. I did this for several reasons:

  1. I wanted to try it out and see how well it worked on a small Django site (tl;dr works great---I'll use this on all my sites)
  2. I wanted people to authenticate with an email-based identity rather than a social network based identity
  3. I wanted to allow people to create accounts on pyvideo.org, but didn't want to deal with the responsibility of protecting things like passwords

So that's where I'm coming from.

how

I used django-browserid which gives you some JavaScript and a few template tags that make it easy to incorporate Persona authentication into a Django app.

It took about 15 minutes to get it working. I've made some minor edits to the code since then and updated to v0.8 of django-browserid. All told, I think I've spent a couple of hours on Persona implementation.

In the process of doing that work, I hit a few minor issues, created some pull requests, helped with other pull requests and became one of the maintainers. Yay!

Here are the commits I did the work in. I figured the diffs might help you implement similar things on your sites:

That last commit updates to django-browserid master tip to pick up a fix to login failures if BROWSERID_CREATE_USER is False. That fix will be released in v0.8.1 soon.

further reading

The Mozilla Persona site helps understand why it exists and has a Developer FAQ.

The django-browserid docs are pretty good and walk through setting it up, advanced usage, and troubleshooting. I encourage you to read through them in full---it'll give you a better understanding of the pieces.

Dan Callahan did a talk at PyCon US 2013 on Persona. That's worth watching. It covers why Mozilla built it, how it works, and why it's important that it works that way. He also demos integrating it into sites and talks about using Persona authentication alongside other authentication methods.

If you're interested in adding Persona authentication to your Django site and need help, let me know.

Sat, 16 Feb 2013

Django Eadred v0.2 released! Django app for generating sample data.

Django Eadred gives you some scaffolding for generating sample data to make it easier for new contributors to get up and running quickly, bootstrapping required database data, and generating large amounts of random data for testing graphs and things like that.

For v0.2, I added some helper methods for generating names, email addresses, sentences and paragraphs. It's definitely the case that these helpers won't handle all use cases, but I think they'll help specific ones.

There are no backwards-compatibility problems with v0.1.

To update, do:

pip install -U eadred

Tue, 16 Oct 2012

Django Eadred v0.1 released! Django app for generating sample data.

I work on a few projects that had a need for generating sample data to make it easier for new contributors to get up and running quickly with little effort. These projects are fairly data-driven---they're kind of useless without data.

To satisfy that need, we wrote an app in richard to generate sample data across all the other apps in the project. Then I rewrote it for Input.

Then we had a hankering for it in SUMO, plus I thought it made sense to turn it into its own app. So I spun it out into its own project.

Thus django-eadred was born.

Generally, it allows you to define a sampledata.py module with a generate_sampledata function that takes command line options to generate sample data for any app you want to generate sample data for.

You can use it to define different ways of generating sample data specified by the command line.

You can use it to generate random data, non-random data, initial data, data for contributors, sample data for large data sets, fixture data, etc.
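
To make that a little more concrete, here's a rough sketch of what a sampledata.py module might look like. This is not copied from the eadred docs. I'm assuming that generate_sampledata receives the command line options as a dict-like object; the "count" and "seed" options and the VIDEOS list (which stands in for a Django model so the sketch runs without a project) are made up for illustration.

```python
# sampledata.py: a minimal sketch under the assumptions stated above.
import random

# Hypothetical stand-in for a Django model table. In a real app you
# would create and save model instances instead.
VIDEOS = []


def generate_sampledata(options):
    """Generate sample videos and return how many were created.

    Assumption: ``options`` is dict-like. The "count" and "seed" keys
    are hypothetical option names, not part of eadred itself.
    """
    count = int(options.get("count", 5))
    # Seed the generator so repeated runs produce the same data.
    rng = random.Random(options.get("seed", 0))
    for i in range(count):
        VIDEOS.append({
            "title": "Sample video %d" % i,
            "duration": rng.randint(60, 3600),  # seconds
        })
    return count
```

Passing different options is how one module can cover different cases, for example a small data set for new contributors and a large one for testing graphs.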

Check out django-eadred.readthedocs.org for use cases, documentation and project details.

Thu, 02 Aug 2012

ElasticUtils v0.4 released!

What is it?

ElasticUtils is a Python library for building and executing ElasticSearch searches.

v0.4 released!

I released v0.4 a couple of days ago. This release adds new functionality, fixes some issues, adds more tests, and includes improved documentation.

On top of that, we removed the requirement for Django and moved the Django-aiding components into elasticutils.contrib.django. I personally like this because it makes it much easier to write test scripts to see how things react.

For the complete list of what's new, see What's new in Version 0.4.

If you have any questions, let us know! We hang out on #elasticutils on irc.mozilla.org.

Thu, 21 Jun 2012

Pyblosxom has moved

I'm in the process of passing off maintenance to a few people who expressed interest on the pyblosxom-devel mailing list. Towards that goal, I've moved pyblosxom to a new github repository under the pyblosxom organization. Thus it's now at:

https://github.com/pyblosxom/pyblosxom

I then forked it, so you can continue looking at my fork if you so desire, but I probably won't be doing anything with it going forward.

Fri, 08 Jun 2012

Me and Pyblosxom

I've been working on Pyblosxom since January 9th, 2003. The initial commit from Wari is on November 24th, 2002. I discovered that (according to the commits), I was the second person to commit to the codebase. I didn't know that.

In January of 2012, I started a hiatus from Pyblosxom. We had done a ton of overhaul work for 1.4 and 1.5 and I was tired and wanted to work on other things.

I've decided it's time to end my reign as maintainer of Pyblosxom. I sent an email to the pyblosxom-devel list saying as much. Further, I contended that maybe it's time for the project to end altogether.

Yes, I suggested maybe it's time to end the project. The reasons are two-fold:

  1. The code has a ton of technical debt. There are a lot of plugins that need a lot of help. There's a lot of squirrely code. I've done a poor job of fixing the "Where can I find plugins?" and "Where can I find pretty flavours?" problems. Those are big problems and potentially require a lot of maintenance.
  2. It's really hard to fix architectural problems with Pyblosxom without changing the scope of the project. I contend it's way easier at this point to just start a new one.

Anyhow, I'm a little sad and I'll have to figure out what to do with my blog, but I think it's been a long time coming and it feels good to put it to rest.

So, two things:

  1. If you're interested in maintaining Pyblosxom, hop on the pyblosxom-devel mailing list and say hi!
  2. I cut a lot of my teeth on this project. Most of the current problems are my fault. If you're a current user of Pyblosxom, thank you for using something I spent a ton of time on and cared very much about and thank you for your patience as I figured a lot of things out.

On to new horizons!

Fri, 18 May 2012

elasticutils status -- May 18th, 2012

A few months ago, I "took over" maintenance of elasticutils. We use it in SUMO as the API for building search queries with elasticsearch.

One of the first things I did was spend some time figuring out whether we should keep working on elasticutils at all. django-haystack also provides a django-ish API for working with elasticsearch. Why have two libraries that at a high level do the same thing?

The thing is that they're not exactly the same. django-haystack is really great and supports a variety of backends for search, elasticsearch being one of them. Right now, it only has support for elasticsearch in 2.0 which is in either an alpha or beta state now (their web-site could use some updates). However, because it supports a bunch of backends, it only supports functionality that works across all of them.

elasticutils, on the other hand, is elasticsearch-specific. As elasticsearch adds functionality, we can, too. That's the compelling reason to keep working on this library. However, django-haystack has some awesome ideas that we'd like to implement in elasticutils, too. This will fix some sharp edges in elasticutils, but also make it much easier for projects to switch from one to the other.

Currently, elasticutils only handles the query side of things. django-haystack handles that, but also has an API for defining mappings, indexing, and all the other things you need with a search system.

Thus, Rob Hudson and I are going to embrace and extend elasticutils to:

  1. fix the current situation where it seems every elasticutils user is actually using their own branch with additional functionality in it (ew!)
  2. implement the rest of the things you need with a search system
  3. document the things we've learned while working with elasticutils because at a minimum, it seems most of the Mozilla projects that use elasticutils bumped into, spent time on, and solved the same problems---that's a huge waste of time and a failure on my part

One of the things users of a library need is for the library to be a mature project with releases, tagged versions, documentation, tests, stability, reliability, reproducibility, communication, community and all that. Thus, I'm also going to spend some time to turn this into a real project. Towards that end, I created #elasticutils on irc.mozilla.org where we'll talk dirty elasticutils stuff. If we end up with more people pitching in, we'll create a mailing list. But for now, IRC will do.

My next step is to spend a little time cleaning up what's in the master branch, then tag and release a baseline version.

After that, I'm going to spend time identifying, thinking about and merging in the divergent functionality in the various branches while Rob works on continuing his imperative mapping work.

I think in a couple of months, we'll be in a better place and that'll make it easier for Mozilla projects and anyone else who wants to use elasticutils to use and contribute to it.

If you're a user of elasticutils, please come hang out with us! Let us know how we can better help you.

Copyright 1996 to 2013, Will Guaraldi Kahn-Greene, under the Creative Commons BY-SA 3.0 license

Will's Blog by William Kahn-Greene is licensed under a Creative Commons Attribution-ShareAlike 3.0 Unported License.