Friday, April 20, 2007

A persistent dict class

One big part of the website is importing events, along with the artist names associated with them. However, sometimes a band is referred to in a slightly different way -- not enough to be significant to a person, but enough to fool a computer into thinking it's a new band. The way I initially fixed this was an artist_name_fixup dict in the importing module. The dict contained a mapping from the incorrect name to the correct one. So, for example, "Static X" mapped to "Static-X" (the latter is the actual band name, according to their website).
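As a rough illustration of that first approach, here is a minimal sketch. The helper name fix_artist_name is hypothetical (the original importer's lookup code isn't shown), and only the "Static X" entry comes from the post:

```python
# Module-level mapping from a known-wrong spelling to the canonical name.
artist_name_fixup = {
    "Static X": "Static-X",
}


def fix_artist_name(name):
    """Return the canonical spelling if one is known, else the name unchanged.

    Hypothetical helper for illustration only.
    """
    return artist_name_fixup.get(name, name)
```

Names without an entry pass through untouched, so the importer can call the helper on every artist name it sees.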

But this meant that every time a new fixup was needed, the source file containing the dict needed to be changed. OK, so put it in the database. Well, we check this dict a LOT, so it would be nice to not hit the DB every time.

The persistent dict solves this by saving added entries in the database, but also keeping the dict in-memory. First, we define the fixup DB table in our model:
from sqlobject import SQLObject, UnicodeCol

class ArtistNameFixup(SQLObject):
    name = UnicodeCol(alternateID=True)
    value = UnicodeCol()
Then, the new class (in our util.py module):
from sqlobject import SQLObjectNotFound

class PersistentDict(dict):
    """A dict that writes every change through to an SQLObject table."""

    def __init__(self, model):
        super(PersistentDict, self).__init__()
        self.model = model
        # Load existing rows without triggering a database write.
        for row in model.select():
            super(PersistentDict, self).__setitem__(row.name, row.value)

    def __setitem__(self, name, value):
        # Update the row if it exists, otherwise create it.
        try:
            r = self.model.byName(name)
            r.value = value
        except SQLObjectNotFound:
            self.model(name=name, value=value)
        super(PersistentDict, self).__setitem__(name, value)

    def __delitem__(self, name):
        self.model.byName(name).destroySelf()
        super(PersistentDict, self).__delitem__(name)

    def update(self, other_dict):
        # Route each entry through __setitem__ so the table stays in sync.
        for name, value in other_dict.items():
            self[name] = value
Finally in the import module, we add:
import util
artist_fixup_dict = util.PersistentDict(ArtistNameFixup)
When we start the TG application, artist_fixup_dict is populated from the database. Subsequent additions and deletions are reflected both in the in-memory dict and in the db table, so if the app is reloaded, it won't lose anything.
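One thing worth checking with this pattern: in CPython, built-in dict methods such as setdefault() and pop() do not call an overridden __setitem__ or __delitem__. The sketch below uses a hypothetical stand-in class, RecordingDict, that records calls in a list instead of touching a database. It only shows which operations reach the overrides and is not part of the code above:

```python
class RecordingDict(dict):
    """Hypothetical stand-in: records which writes reach the overrides."""

    def __init__(self):
        super(RecordingDict, self).__init__()
        self.db_writes = []  # stands in for the database calls

    def __setitem__(self, name, value):
        self.db_writes.append(("set", name))
        super(RecordingDict, self).__setitem__(name, value)

    def __delitem__(self, name):
        self.db_writes.append(("delete", name))
        super(RecordingDict, self).__delitem__(name)


d = RecordingDict()
d["Static X"] = "Static-X"       # goes through the overridden __setitem__
d.setdefault("Band A", "Band-A")  # built-in method: override is not called
d.pop("Static X")                 # built-in method: override is not called
```

After these three calls only the first one has been recorded, so a dict used this way would drift from its table. If the importer only ever uses d[name] = value, del d[name], and update(), this doesn't matter. Otherwise those methods would need overrides too.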

Note that the PersistentDict code makes assumptions about the SQLObject model it is initialized with -- namely, that the table has "name" and "value" fields. With a little more code, these could be made configurable, but I don't need that flexibility, so I haven't added it.
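For the curious, here is one way the configurable version might look. This is a sketch under assumptions, not something I run in production: the key_attr and value_attr parameters are made up for illustration, and it looks rows up with SQLObject's selectBy() rather than the generated byName(). To keep the example self-contained, FakeAlias is a hypothetical in-memory stand-in that mimics only the few model methods the dict uses:

```python
class PersistentDict(dict):
    """Write-through dict with configurable model attribute names (sketch)."""

    def __init__(self, model, key_attr="name", value_attr="value"):
        super(PersistentDict, self).__init__()
        self.model = model
        self.key_attr = key_attr
        self.value_attr = value_attr
        for row in model.select():
            super(PersistentDict, self).__setitem__(
                getattr(row, key_attr), getattr(row, value_attr))

    def _find(self, key):
        # Look rows up by the configured key column.
        return list(self.model.selectBy(**{self.key_attr: key}))

    def __setitem__(self, key, value):
        rows = self._find(key)
        if rows:
            setattr(rows[0], self.value_attr, value)
        else:
            self.model(**{self.key_attr: key, self.value_attr: value})
        super(PersistentDict, self).__setitem__(key, value)

    def __delitem__(self, key):
        for row in self._find(key):
            row.destroySelf()
        super(PersistentDict, self).__delitem__(key)


class FakeAlias(object):
    """Hypothetical in-memory stand-in for an SQLObject model (demo only)."""
    _table = []

    def __init__(self, **fields):
        self.__dict__.update(fields)
        FakeAlias._table.append(self)

    @classmethod
    def select(cls):
        return list(cls._table)

    @classmethod
    def selectBy(cls, **criteria):
        return [r for r in cls._table
                if all(getattr(r, k) == v for k, v in criteria.items())]

    def destroySelf(self):
        FakeAlias._table.remove(self)


aliases = PersistentDict(FakeAlias, key_attr="alias", value_attr="canonical")
aliases["Static X"] = "Static-X"
# A second instance simulates an app reload: it reads back what was stored.
reloaded = PersistentDict(FakeAlias, key_attr="alias", value_attr="canonical")
```

With a real model, the key column would presumably still want a uniqueness constraint (as alternateID=True gives "name" above), since this sketch simply updates the first matching row.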
