Betahouse -- coworking environment

I'm working for Participatory Culture Foundation (w00t!) and they're based in Worcester, MA, so I've been telecommuting. A friend of mine helped start a co-working place a few months back and a couple of weeks ago I joined as a part-time renter. Long story short, they wrote about us in the Boston Globe. The article is a little funny in that it makes it sound like some big fancy thing, but really it's a couple of rooms with desks where a bunch of people work. I wasn't there when Carolyn came in and interviewed people--I was on my way to Oregon for a family reunion.

At Betahouse, I've been sitting at a desk in what I've decided to dub the Python room. It's myself, Nate Aune and another person whose name escapes me who also works for Jazkarta.

It's been exciting and they have a much faster internet connection than I do at home. Can't beat that.

PyPI, Cheesecake, and such

I'm really late to the PyPI party. Both of the projects I've worked on over the years are in PyPI, but neither is up to date and neither supports easy_install.

Figured I'd update them now. Three things I bumped into while working on Lyntin:

  1. the PyPI tutorial is pretty good and walks you through what you need to know,
  2. the list of classifiers is here, and
  3. the Cheesecake folks have scores for all the PyPI modules here

That last one is kind of interesting. You can see the scores and the Cheesecake log output.

New job--PCF

It's been a crazy few months... I worked like crazy the last two weeks of grad school, got my Masters, got married, went on a honey moon (or toffee moon depending on where you live), pushed out a new version of PyBlosxom, and ... got a job! I'll be a developer on the Democracy Player (soon to be Miro) at Participatory Culture.

I'm really psyched about this. Hopefully it nicely weds my Open Source/Free Software development hobbies with employment. Hopefully it nicely weds the way I like doing software development with the way I have to do it for work. Regardless, it's nice to be working in one major language (Python) rather than two (Java and Python). Also, there's a lot of really neat technology that PCF is using that I haven't worked with before--so it'll definitely be a growing experience.

I start next Monday (July 16th). Because I'll be doing Python development for employment, it'll be a lot easier to focus on Python-oriented things and blog more about Python the language, tools, methodologies and all that, too.

Playing with category feeds

It occurred to me that adding a link to category feeds was pretty trivial, so I figured I'd give it a try. It's a bit cluttered to look at (I'm talking about the Categories section on the right hand side there), but ... it's functional.

In case anyone was wondering what my category properties looked like, they're like this:

# category plugin properties
py["category_start"] = ""
py["category_begin"] = ""
py["category_item"] = r'%(indent)s<a href="%(base_url)s/' + \
   r'%(fullcategory_urlencoded)sindex.%(flavour)s">%(category)s</a>' + \
   r' [<a href="%(base_url)s/%(fullcategory_urlencoded)sindex.xml">atom</a>]' + \
   r' (%(count)d)<br />'
py["category_end"] = ""
py["category_finish"] = ""
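
To see what the category_item template turns into, here's a minimal sketch that fills it in by hand. The values in the dict (base URL, category path, count) are made up for illustration, and this only mimics the %-style substitution; it isn't how the plugin itself gets called.

```python
# Same template string as in the config above.
category_item = (
    r'%(indent)s<a href="%(base_url)s/'
    r'%(fullcategory_urlencoded)sindex.%(flavour)s">%(category)s</a>'
    r' [<a href="%(base_url)s/%(fullcategory_urlencoded)sindex.xml">atom</a>]'
    r' (%(count)d)<br />'
)

# Hypothetical values; the real plugin supplies these for each category.
values = {
    "indent": "",
    "base_url": "http://example.com/blog",
    "fullcategory_urlencoded": "dev/python/",
    "flavour": "html",
    "category": "python",
    "count": 12,
}

line = category_item % values
print(line)
```

Each category ends up with two links: the regular one in the current flavour and a second one pointing at index.xml for the feed.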

PyBlosxom 1.4 released

It's been 17 months or so since the last PyBlosxom release which for a small project is probably too long a period of time. Real life is a harsh mistress sometimes.

Changes:

  • bunch of bug-fixes
  • converted the documentation from docbook to restructured text
  • updates to the documentation
  • code overhaul
  • added the beginnings of unit/functional testing using nose
  • better WSGI support
  • Paste support

w00t!

GPL 3 released

I went to the FSF GPL 3 release event and it was really interesting. There's been a lot of discussion and heated exchanges over the GPL 3, but I was in grad school and wasn't really paying much attention. Listening to the discussions at the event about GPL version 3 was really enlightening.

Personally, I'm excited about the license. It allays some fears I've always had as a developer. Going forward, I'll be putting the majority of my work under the GPL version 3 license.

PyBlosxom status: 06/20/2007

I quietly released an RC2 for PyBlosxom 1.4 today after merging in Yury's patch for Paste support and merging in Steven's work for WSGI support. It's really awesome to be able to do paster serve blog.ini and have things work. At a minimum, it'll be a lot easier to test PyBlosxom--no mucking about with web-server configurations needed anymore.

We've finished moving all the documentation from docbook to reST format. It still needs a lot of work, but we're making measurable progress which is really cool. Also, the documentation is easier to work with and maybe that reduces the amount of energy it takes for people to help out.

PyBlosxom 1.4 should be released soon. Very soon. Will it be perfect? No, but it's a huge milestone for the project. That's pretty exciting in the grand scheme of things.

06/24/2007 - Fixed "server" to "serve". I keep typing paster server ... which is wrong.

Missed another Python Meetup!

Bah--that's the second Python Cambridge Meetup I've missed out of two. The first one was three days before I was to be married. I wanted to go but it wasn't in the cards. The second one was tonight, but I was confused and thought it was tomorrow night. Fiddlesticks!

Using exiftool and Python to fix photos (edit: to order them)

S and I decided to get a wedding photographer in addition to allowing our guests to take as many photos as they wanted of all aspects of our wedding (except when we were getting dressed [and ... undressed]). There were a few reasons for this, one of which was the several horror stories we've heard about people's digital media dying and causing them to lose all pictures of their wedding. Ick.

The problem is that there are a fajillion pictures and it's really hard to order them into a single consistent timeline. The wedding photographer we had [1] had four cameras and took some 800 pictures. My dad took another 100 or so. Other people took a bunch, too. Right now I'm working with 1200+ pictures all of which are pretty big (between 5 MB and 10 MB each). It's not feasible to tweak them all by hand to order them. I didn't want to leave them unordered--my soul shudders at that thought. I needed a way to do batch processing to reorder pictures from a bunch of cameras into a nice timeline.

More after the break...

First thing I did was put all the pictures that had no EXIF information to the side. There weren't many of them and I don't see any way to batch process them.

We're now left with 1000+ pictures all of which have EXIF information that we can work with. Interesting properties of the problem:

  1. all of the pictures have DateTimeOriginal and SerialNumber headers in the EXIF data
  2. pictures from a single camera have a consistent timeline in the sense that if picture 2 from camera A comes before picture 3 from camera A by 14 seconds, then that's what happened
  3. the times of all the pictures from camera A are off by a constant offset from the times of all the pictures from camera B

This is all pretty obvious and there's nothing exciting here, but it does reduce the problem to picking a camera as a baseline and then figuring out how far off in seconds and minutes the other cameras are from the baseline. That's pretty easy.
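
As a rough sketch of that arithmetic: if you can find one moment that both cameras captured, the difference between the two timestamps is the offset for that camera. The timestamps below are made up, camera_offset is a hypothetical helper, and it assumes the other camera's clock runs ahead of the baseline.

```python
import datetime


def camera_offset(baseline_time, other_time):
    """Return (minutes, seconds) that the other camera is ahead of the baseline.

    Assumes other_time >= baseline_time for the same real-world moment.
    """
    delta = other_time - baseline_time
    return divmod(int(delta.total_seconds()), 60)


# Made-up timestamps for the same moment as recorded by two cameras.
baseline = datetime.datetime(2007, 5, 26, 15, 30, 0)
other = datetime.datetime(2007, 5, 26, 15, 31, 11)

offset = camera_offset(baseline, other)
print(offset)
```

Subtracting that (minutes, seconds) pair from every timestamp on the other camera puts its pictures on the baseline's timeline.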

First thing I do is get exiftool and run it like this to rename all the files according to their timestamp and serialnumber:

  exiftool '-FileName

Then I copied all the files into a thumbs directory and in the thumbs directory I used mogrify which comes with ImageMagick to create thumbnails:

  for i in *.jpg; do mogrify -quality 65 -geometry 150 "$i"; done

Then I did this to figure out all the serial numbers of the cameras that took these photos:

  exiftool -SerialNumber *.jpg | sort -u

Then I wrote a Python script (any language will do) to build an index of the images using timeline offsets:

import os, time, datetime

# serial number -> (offset(minute, second), color)
OFFSETS = { "1020415017": ((0, 37), "#ff5555"), # red
            "1420918126": ((1, 11), "#ff55ff"), # purple
            "1621009923": ((1, 9), "#55ff55"),  # green
            "620306618": ((0, 0), "#5555ff") }  # blue (baseline)

def getinfo(fn):
   t = fn.split("-", 1)[1]
   t = t.split("_", 1)

   cam = t[1]
   cam = cam.split(".")[0]

   t = t[0]

   print(fn, t)

   t = time.strptime(t, "%H%M%S")

   # hardcoded year, month, and day
   t = datetime.datetime(2007, 5, 26, t[3], t[4], t[5])

   offset = OFFSETS[cam][0]

   # subtract this camera's (minute, second) offset to land on the baseline timeline
   t = t - datetime.timedelta(minutes=offset[0], seconds=offset[1])
   return (t, cam, fn)

def build():
   files = os.listdir(os.getcwd())
   files = [f for f in files if f.endswith(".jpg")]

   pics = [ getinfo(fn) for fn in files ]

   pics.sort()

   out = open("index.html", "w")
   out.write("<html><body><table border=\"1\">\n")

   for t, cam, fn in pics:
      out.write("""
<tr><td>
  <table>
    <tr><td bgcolor="#aaaaaa">name</td><td>%s</td></tr>
    <tr><td bgcolor="#aaaaaa">camera</td><td bgcolor="%s">%s</td></tr>
    <tr><td bgcolor="#aaaaaa">datestamp</td><td>%s</td></tr>
  </table>
</td><td><img src="%s"></td></tr>""" % (fn, OFFSETS[cam][1], cam, repr(t), fn))

   out.write("</table></body></html>")
   out.close()

if __name__ == "__main__":
   build()

Note that I color-code the cameras. I find this makes it really easy to eyeball the timeline without trying to distinguish between similar-looking serial numbers.

I run that in my thumbs directory and it builds an index.html page that I can look at with a web-browser. The index.html has the offsets factored in. I look through the pictures and tweak the offsets until all the cameras are consistent with one timeline. Once I have a final set of offsets, I go through the pictures for each camera and (very carefully) do this:

  exiftool "-AllDates-=0:0:0 0:M:S" *SN.jpg
                               ^ ^   ^^

replacing:

  • M with the minute offset,
  • S with the second offset, and
  • SN with the camera serial number

Then you do another pass at renaming files and the files should then be in the same consistent timeline and in alphabetical order by filename.
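
If typing the shift command once per camera feels error-prone, something like the following could print the command lines for you to review. It's a sketch only: FINAL_OFFSETS and shift_commands are hypothetical names, the offsets are just the example values from the script above, and nothing is executed. It also assumes each camera's filenames end in its serial number, as in the *SN.jpg pattern.

```python
# serial number -> (minutes, seconds), same shape as the offsets in the script
FINAL_OFFSETS = {
    "1020415017": (0, 37),
    "1420918126": (1, 11),
    "1621009923": (1, 9),
    "620306618": (0, 0),  # baseline, nothing to shift
}


def shift_commands(offsets):
    """Build one exiftool command line per camera that needs shifting."""
    commands = []
    for serial, (minutes, seconds) in sorted(offsets.items()):
        if (minutes, seconds) == (0, 0):
            continue  # the baseline camera stays as it is
        commands.append(
            'exiftool "-AllDates-=0:0:0 0:%d:%d" *%s.jpg' % (minutes, seconds, serial)
        )
    return commands


commands = shift_commands(FINAL_OFFSETS)
for cmd in commands:
    print(cmd)
```

Printing rather than running keeps the "very carefully" part in your hands: you can eyeball each line before pasting it into a shell.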

After I worked out the process it took a couple of hours. We had the advantage of having a couple of points during the wedding where a lot of photographs were taken and it was obvious as to what order they needed to be in.

[1] - http://www.jillgoldman.com/ -- Jill is awesome.

06/21/2007 - Changed the title to something more appropriate. I was thinking "fix" because I was modifying the EXIF metadata for each photo to put them in the correct order, but Konquest makes a good point. I also fixed one of the command lines.

06/22/2007 - bockris said in the reddit.com comments:
This is a good idea. I've had to fix the EXIF data my photos before because I changed the batteries and didn't reset the date but I've never tried to sync up multiple cameras after the fact.

If I'm ever in a similar situation as the OP, I think I have everyone take a picture of a clock with a second hand at some period during the event. That would let you easily get the time difference among all cameras.

Using register allocation algorithms for determining table layout

S and I decided to assign tables for our guests during the wedding reception. There were a bunch of good reasons for doing this which I'm not going to go into here. However, assigning 100 or so guests to 10 or so tables while maximizing "goodness" and minimizing "badness" isn't trivial to do on paper by hand. At some point during wedding planning, I decided we could do table layout with a modified register allocation algorithm. This is a quick summary of translating register allocation into table layout along with some commentary on how this works nicely and also where it doesn't quite work.

More after the jump....

Register allocation is typically done using graph coloring plus some additional heuristics that sidestep the NP-complete graph-coloring problem with a linear-time algorithm that gives a good approximation. Andrew Appel talks about register allocation in chapter 11 of his compilers book, Modern Compiler Implementation in ML. It involves a graph of nodes representing temps connected by edges that represent interference between the temps. There's also a list of move-related nodes; nodes related by move-edges are coalesced if possible, so that the connected temps are colored with the same register.

In table layout, there are groups of people who can't sit at the same table with other groups of people. This could be due to family issues, differences in politics, past history, ... Additionally, there are groups of people you want to sit at the same table with other groups if possible.

Let's do some substitution. We'll treat tables as registers, people as temps, edges between people who can't sit at the same table as interference edges, and edges between people who would be good sitting together as move-related edges. With that, we have mapped the table layout problem onto the register allocation problem.
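
To make the mapping concrete, here's a tiny sketch of the data involved. All the guest names and edges are made up, and is_valid and satisfied_moves are hypothetical helpers; this only checks a finished assignment, it doesn't do any allocation.

```python
# People are the "temps"; table numbers are the "registers" (colors).
guests = ["Ann", "Bob", "Cat", "Dan"]

# Interference edges: pairs who must not share a table.
interference = {frozenset(("Ann", "Dan"))}

# Move-related edges: pairs who should share a table if possible.
move_related = {frozenset(("Ann", "Bob")), frozenset(("Cat", "Dan"))}


def is_valid(assignment, interference):
    """True if no interfering pair has been given the same table."""
    return all(
        assignment[a] != assignment[b]
        for a, b in (tuple(edge) for edge in interference)
    )


def satisfied_moves(assignment, move_related):
    """Count move-related pairs that ended up at the same table."""
    return sum(
        1
        for a, b in (tuple(edge) for edge in move_related)
        if assignment[a] == assignment[b]
    )


assignment = {"Ann": 0, "Bob": 0, "Cat": 1, "Dan": 1}
print(is_valid(assignment, interference), satisfied_moves(assignment, move_related))
```

A valid layout is one where is_valid holds; a good layout is a valid one that also satisfies as many move-related edges as it can.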

I'm not going to go into the details of register allocation--Appel talks about the George and Appel algorithm from "Iterated register coalescing" (1996) for 25 pages and it'd be hard to summarize that into a blog entry.

There are a few things that don't translate well from register allocation to table layout, like spilling. In register allocation, if you can't get it to work, you spill a temp into memory and then start over. I suppose you could uninvite particularly ornery people; that's one possible mapping for spilled temps. Another possibility is that you look for the least-worst pairing and remove that edge from the graph. A third possibility is to break up a larger table into two smaller tables giving you an extra color to work with.

Speaking of tables, there's one big difference between tables and registers: a register can be assigned to an unlimited number of temps so long as they don't interfere with one another. A table has a limited number of seats. So if you were to use a register allocation algorithm, you have the additional constraint that a limited number of people can be assigned to each table. If the register allocation implementation that you use is deterministic, this constraint could cause your situation to be unsolvable without backtracking. It would be a good idea to introduce a random element that causes assignments to occur in different orders between iterations.
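
Here's a minimal sketch of the seat limit plus the random element. To be clear, this is plain greedy first-fit coloring with reshuffled retries, not the George and Appel algorithm; assign_tables and the sample guests are made up for illustration.

```python
import random


def assign_tables(guests, interference, num_tables, seats, tries=100, seed=0):
    """Greedy coloring with a seat limit; reshuffle the guest order on failure.

    interference maps each guest to the set of guests they cannot sit with.
    Returns {guest: table} or None if no attempt succeeded.
    """
    rng = random.Random(seed)
    order = list(guests)
    for _ in range(tries):
        rng.shuffle(order)
        assignment = {}
        load = [0] * num_tables
        for guest in order:
            # Tables already holding someone this guest conflicts with.
            blocked = {
                assignment[g] for g in interference.get(guest, ()) if g in assignment
            }
            open_tables = [
                t for t in range(num_tables) if t not in blocked and load[t] < seats
            ]
            if not open_tables:
                break  # dead end; retry with a different order
            assignment[guest] = open_tables[0]
            load[open_tables[0]] += 1
        else:
            return assignment  # every guest was seated
    return None


guests = ["A", "B", "C", "D", "E", "F"]
conflicts = {"A": {"B"}, "B": {"A"}, "C": {"D"}, "D": {"C"}}
layout = assign_tables(guests, conflicts, num_tables=3, seats=2)
print(layout)
```

With exactly as many seats as guests, some orderings dead-end (the last guest's only free seat is next to someone they conflict with), which is where the reshuffle earns its keep.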

The move-related edges (which represent people who should be sitting together) could be prioritized and that priority could be used for selecting moves to coalesce. For example, you might want to keep couples together and perhaps families, too. Perhaps you want to put all the children at a single table that you can put close to the bathroom.
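
A rough sketch of prioritized coalescing might look like this. It's a simplification: it merges groups in priority order and skips any merge that would overflow a table or join an interfering pair, which is not the conservative coalescing test from the paper. The coalesce function, the priorities, and the names are all made up.

```python
def coalesce(guests, wishes, interference, seats):
    """Merge guests into groups, highest-priority wishes first.

    wishes is a list of (priority, a, b); interference is a set of
    frozenset pairs. Returns a list of groups (each a sorted list).
    """
    group = {g: {g} for g in guests}  # guest -> the set shared by its group
    for _, a, b in sorted(wishes, key=lambda w: w[0], reverse=True):
        ga, gb = group[a], group[b]
        if ga is gb:
            continue  # already together
        if len(ga) + len(gb) > seats:
            continue  # merged group would not fit at one table
        if any(frozenset((x, y)) in interference for x in ga for y in gb):
            continue  # merged group would contain an interfering pair
        merged = ga | gb
        for g in merged:
            group[g] = merged

    # Collect each distinct group once, in guest order.
    seen, result = set(), []
    for g in guests:
        if id(group[g]) not in seen:
            seen.add(id(group[g]))
            result.append(sorted(group[g]))
    return result


guests = ["Ann", "Bob", "Kid1", "Kid2", "Dan"]
wishes = [(10, "Ann", "Bob"), (5, "Kid1", "Kid2"), (1, "Bob", "Dan")]
interference = {frozenset(("Ann", "Dan"))}

groups = coalesce(guests, wishes, interference, seats=4)
print(groups)
```

Each resulting group can then be treated as a single node when assigning tables, so couples and families stay together.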

Theoretically, this sounds like it would work pretty well for many cases, at a minimum returning a layout that can be tinkered with. We never used this method--by the time I had finished grad school, S was already past most of the table layout issues.

It's possible the constraint on people per table handicaps the register allocation algorithm to such a degree that it would be better to use a more general purpose constraint satisfaction problem solver instead. This requires further study.