PCL -> Clojure, Chapter 3

This article is part of a series describing a port of the samples from Practical Common Lisp (PCL) to Clojure. You will probably want to read the intro first.

This article covers Chapter 3, Practical: A Simple Database.

Defining a database

The code examples begin with a simple function for creating records. Here is a Clojure approach:

  (defstruct cd :title :artist :rating :ripped)

A cd has four defined slots: title, artist, rating, and ripped. (Note that this is pretty different from the approach taken in PCL. Peter is introducing language features one at a time, and hasn't covered structs at this point in the book.)

Next, I need a way to keep an in-memory database of cd information. In PCL, this is done by changing a mutable global. Since Clojure uses immutable data structures, I will take a different approach, and pass the db object (a collection of some kind) to every function.

  (defn add-records [db & cd] (into db cd))

add-records lets us add an arbitrary number of cds to an existing db collection. The use of into allows us to defer the choice of collection type.
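To see what deferring the collection type buys, here is a small sketch. It uses plain maps in place of cd structs (an assumption made only to keep the example short) and passes the same records to add-records with two different starting collections:

```clojure
;; Same definition as in the article.
(defn add-records [db & cd] (into db cd))

;; Plain maps stand in for cd structs here (an assumption for brevity).
(def roses {:title "Roses" :artist "Kathy Mattea"})
(def fly   {:title "Fly"   :artist "Dixie Chicks"})

;; A vector keeps insertion order and allows duplicates.
(def as-vector (add-records [] roses fly roses))

;; A set silently drops the duplicate record.
(def as-set (add-records #{} roses fly roses))

(println (count as-vector)) ; prints 3
(println (count as-set))    ; prints 2
```

The caller picks the semantics (ordered with duplicates, or unique) simply by choosing what to pass as db.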

I also want an init-db function to populate initial values for the collection. Again I will avoid mutable data and have init-db return the database.

  (defn init-db []
    (add-records #{}
                 (struct cd "Roses" "Kathy Mattea" 7 true)
                 (struct cd "Fly" "Dixie Chicks" 8 true)
                 (struct cd "Home" "Dixie Chicks" 9 true)))

The #{} is a set literal.

Console output

Next, dump-db can print a summary of the database to the console. Again, since there is no mutable data, I will pass the database to dump-db:

  (defn dump-db [db]
    (doseq [cd db]
      (doseq [[key value] cd]
        (print (format "%10s: %s\n" (name key) value)))
      (println)))                               

The calls to doseq are loops: the outer doseq loops over all cds in the db, and the inner one loops over each key/value pair in a cd record. The call to format wraps a Java call to String.format. The output from dump-db looks like this, when called on (init-db):

       title: Roses
      artist: Kathy Mattea
      rating: 7
      ripped: true

       title: Fly
      artist: Dixie Chicks
      rating: 8
      ripped: true

       title: Home
      artist: Dixie Chicks
      rating: 9
      ripped: true

Console input

Reading in a new record is a little more tricky. I need a generic input function, plus number validation (for the rating field) and yes/no validation (for the ripped field). First, I will write a prompt-read function that displays a prompt and then waits for input:

  (defn prompt-read [prompt]
    (print (format "%s: " prompt))
    (flush)
    (read-line))

Now I can write prompt-for-cd, which will prompt for each of the four fields, and then combine them into a cd:

  (defn prompt-for-cd []
    (struct 
     cd 
     (prompt-read "Title")
     (prompt-read "Artist")
     (parse-integer (prompt-read "Rating"))
     (y-or-n-p "Ripped [y/n]")))

The prompt-for-cd function cannot compile, because it depends on two other functions that I haven't written yet. First, parse-integer:

  (defn parse-integer [str]
    (try (Integer/parseInt str) 
         (catch NumberFormatException nfe 0)))

This function demonstrates two bits of Java interop: the Integer/parseInt syntactic sugar for invoking static methods, and the try/catch special form. The resulting function will coerce any non-numeric input to zero.
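A few sample calls make the coercion concrete. This is only an illustrative sketch; the inputs are made up:

```clojure
;; Same definition as in the article.
(defn parse-integer [str]
  (try (Integer/parseInt str)
       (catch NumberFormatException nfe 0)))

(println (parse-integer "7"))     ; prints 7
(println (parse-integer "-3"))    ; prints -3
(println (parse-integer "seven")) ; prints 0
(println (parse-integer ""))      ; prints 0
(println (parse-integer " 7"))    ; prints 0, because parseInt does not trim whitespace
```

Note the trade-off: a rating of 0 is indistinguishable from a typo, which is acceptable for a toy database but worth revisiting in anything more serious.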

Common Lisp has a built-in y-or-n-p for yes/no input, but Clojure doesn't. So I will write one:

  (defn y-or-n-p [prompt]
    (= "y"
       (loop []
         (or 
            (re-matches #"[yn]" (.toLowerCase (prompt-read prompt)))
            (recur)))))

The call to recur will repeat the loop until the user enters "y" or "n". Since re-matches returns the match, there is no need to capture the input in a local variable.
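One way to try the loop without typing at a console is to rebind standard input with with-in-str. The sketch below feeds y-or-n-p one invalid answer followed by a valid one; the input string is invented for the example:

```clojure
;; Same definitions as in the article.
(defn prompt-read [prompt]
  (print (format "%s: " prompt))
  (flush)
  (read-line))

(defn y-or-n-p [prompt]
  (= "y"
     (loop []
       (or
        (re-matches #"[yn]" (.toLowerCase (prompt-read prompt)))
        (recur)))))

;; with-in-str makes read-line consume the given string instead of the console.
;; "maybe" is rejected, so the loop prompts again and then accepts "Y".
(def answer
  (with-in-str "maybe\nY\n"
    (y-or-n-p "Ripped [y/n]")))

(println answer) ; prints true, after the two prompts
```

One caveat: if the input runs out before a valid answer arrives, read-line returns nil and the .toLowerCase call throws, so this version assumes an interactive user who eventually answers.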

File load and save

It would be nice to save and reload the database from a file. Since the database is just Clojure data, I can use Clojure's built-in support for serializing objects: pr-str and read-string. Combine these with the beautifully-named spit and slurp, and file I/O is done:

  (use 'clojure.contrib.duck-streams)
  (defn save-db [db filename]
    (spit filename (pr-str db)))

  (defn load-db [filename] 
    (read-string (slurp filename)))

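For spit you will need the clojure-contrib library on your classpath. (Since Clojure 1.2, spit is part of clojure.core, so on newer versions the use line above should be omitted.)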
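Here is a quick round-trip check, sketched under two assumptions: Clojure 1.2 or later (where spit and slurp are both in clojure.core, so no use line is needed), and plain maps standing in for cd structs. The temp-file prefix and suffix are arbitrary.

```clojure
;; Same definitions as in the article, minus the clojure-contrib dependency.
(defn save-db [db filename]
  (spit filename (pr-str db)))

(defn load-db [filename]
  (read-string (slurp filename)))

;; Plain maps stand in for cd structs (an assumption for brevity).
(def db #{{:title "Fly"  :artist "Dixie Chicks" :rating 8 :ripped true}
          {:title "Home" :artist "Dixie Chicks" :rating 9 :ripped true}})

;; A throwaway temp file; the "cds" prefix and ".db" suffix are arbitrary.
(def path (.getPath (java.io.File/createTempFile "cds" ".db")))

(save-db db path)
(println (= db (load-db path))) ; prints true
```

Because read-string uses the full Clojure reader, it is best reserved for files you wrote yourself; for untrusted input, clojure.edn/read-string is the more cautious choice.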

Querying the database

Now, the fun begins. I would like to have a mini-DSL for querying the database. For example, I might say:

  (filter (artist-selector "Dixie Chicks") (init-db))

This should return all items in the database where the artist is "Dixie Chicks". For this to work, artist-selector needs to be a higher-order function, i.e. a function that returns another function:

  (defn artist-selector [artist]
    #(= (:artist %) artist))

There are two interesting bits of syntactic sugar here.

  • The #(...) creates an anonymous function, where % represents the first argument. For example, #(inc %) is the same as (fn [n] (inc n)).
  • The call (:artist %) uses the keyword :artist in function position to look up the corresponding value in %.
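To make the "function that returns a function" idea concrete, the sketch below builds a selector once and then reuses it. The sample data is a vector of plain maps, which is an assumption for brevity, and dixie? is a name invented for this example:

```clojure
;; Same definition as in the article.
(defn artist-selector [artist]
  #(= (:artist %) artist))

;; Plain maps in a vector stand in for the cd database (an assumption).
(def db [{:title "Roses" :artist "Kathy Mattea"}
         {:title "Fly"   :artist "Dixie Chicks"}
         {:title "Home"  :artist "Dixie Chicks"}])

;; artist-selector returns a predicate that closes over the artist name.
(def dixie? (artist-selector "Dixie Chicks"))

(println (dixie? (first db)))             ; prints false
(println (map :title (filter dixie? db))) ; prints (Fly Home)
```

The last line also uses a keyword in function position, this time as the function passed to map.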

A more general query

artist-selector is cool, but it could be much more general. How about a where function that creates a test for any number of criteria? A general form would look like:

  (filter (where {:artist "Dixie Chicks"}) (init-db))

But now I can add multiple criteria. How about all the Dixie Chicks albums that I rated 8?

  (filter (where {:artist "Dixie Chicks" :rating 8}) (init-db))

Here's where:

  (defn where [criteria]
    (fn [m]
      (loop [criteria criteria] 
        (let [[k,v] (first criteria)]
            (or (not k)
                (and (= (k m) v) (recur (rest criteria))))))))

This is a little more complex than artist-selector:

  • The loop processes the criteria, one key/value pair at a time.
  • The let uses a destructuring bind to pull the key and value into k and v.
  • The and joins together multiple criteria.
  • The recur advances to the next criterion.
  • The (or (not k) ... allows the recursion to terminate when no criteria remain.

Notice that the variable name is now m instead of cd. While writing a CD database, I have accidentally produced a completely general purpose where function. Oops. :-)

I wrote where this way to explicitly demonstrate implementing sequence operations with recur. You will often find that Clojure has already defined the sequence operation you need. In this case, where checks that every criterion holds for a given record. Sure enough, there is an every? that will simplify things:

  (defn simpler-where [criteria]
    (fn [m]
      (every? (fn [[k v]] (= (k m) v)) criteria)))
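As a sanity check that the two versions agree, the sketch below runs both against one record with matching, non-matching, and empty criteria. The record is a plain map and the criteria are made up for the example:

```clojure
;; Both definitions as given in the article.
(defn where [criteria]
  (fn [m]
    (loop [criteria criteria]
      (let [[k v] (first criteria)]
        (or (not k)
            (and (= (k m) v) (recur (rest criteria))))))))

(defn simpler-where [criteria]
  (fn [m]
    (every? (fn [[k v]] (= (k m) v)) criteria)))

;; A plain map stands in for a cd struct (an assumption for brevity).
(def fly {:title "Fly" :artist "Dixie Chicks" :rating 8 :ripped true})

;; All criteria match.
(println ((where {:artist "Dixie Chicks" :rating 8}) fly))         ; prints true
(println ((simpler-where {:artist "Dixie Chicks" :rating 8}) fly)) ; prints true

;; One criterion fails, so the whole test fails.
(println ((where {:artist "Dixie Chicks" :rating 9}) fly))         ; prints false
(println ((simpler-where {:artist "Dixie Chicks" :rating 9}) fly)) ; prints false

;; Empty criteria match everything.
(println ((where {}) fly))         ; prints true
(println ((simpler-where {}) fly)) ; prints true
```

The empty-criteria case is worth noticing: both versions treat "no conditions" as "match all", which is the usual convention for every?.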

Updating the database

A nice feature of functional design is composability. We have used where to perform lookup queries with filter, but the same code can be used for update operations. For example, maybe I want to set the rating for all Dixie Chicks albums to 10:

  (update (some-db) (where {:artist "Dixie Chicks"}) {:rating 10})

update is simply:

  (defn update [db criteria updates]
    (into (empty db)
          (map (fn [m]
                 (if (criteria m) (merge m updates) m))
               db)))

  • Each cd in the db gets mapped into either itself, or itself+updates, depending on whether the criteria match.
  • The (into (empty db) ... gives back a collection of the same type as db. This will work for our set, but also means that the update function is generalized for other collection types.
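The sketch below exercises the same logic on a set and on a vector. Two things here are my assumptions, not the article's: the function is renamed update-db (a hypothetical name) because Clojure 1.7 and later ship a clojure.core/update that a user-defined update would shadow, and plain maps stand in for cd structs.

```clojure
;; The every?-based where from earlier in the article.
(defn where [criteria]
  (fn [m] (every? (fn [[k v]] (= (k m) v)) criteria)))

;; Same body as the article's update; only the name differs (hypothetical),
;; to avoid shadowing clojure.core/update on Clojure 1.7+.
(defn update-db [db criteria updates]
  (into (empty db)
        (map (fn [m] (if (criteria m) (merge m updates) m)) db)))

(def roses {:title "Roses" :artist "Kathy Mattea" :rating 7})
(def fly   {:title "Fly"   :artist "Dixie Chicks" :rating 8})

(def db #{roses fly})
(def updated (update-db db (where {:artist "Dixie Chicks"}) {:rating 10}))

(println (contains? updated (assoc fly :rating 10))) ; prints true
(println (contains? updated roses))                  ; prints true, non-matches pass through
(println (contains? db fly))                         ; prints true, the original db is untouched

;; The same call on a vector returns a vector, in the original order.
(println (vector? (update-db [roses fly] (where {:rating 7}) {:ripped true}))) ; prints true
```

Since nothing is mutated, "updating" really means building a new collection, and the old one remains available for as long as something still refers to it.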

Wrapping up

Look back at where we have been. That's a lot of functionality for several dozen lines of code! The music database now supports

  • console-based input and output
  • file-based persistence
  • generic exact-match queries
  • arbitrary updates to matches

Clojure's unique features helped:

  • The map literal {:key value} made creating the query DSL simple.
  • Java interop was easy and unobtrusive.
  • Immutable programming wasn't so hard, after all. No state mutated in the code above, and yet nothing seemed too alien, did it?

Notes

The sample code is available at http://github.com/stuarthalloway/practical-cl-clojure.

Revision history

  • 2008/09/16: initial version
  • same day! : removed clojure.contrib.string format (format now in Clojure core)
  • 2008/09/17: batch of changes suggested by Rich Hickey. Most of these changes can be summarized as "It doesn't have to be a list!" See the git commit for details.
  • 2008/10/03: added link out to clojure-contrib, based on feedback from Hans Hübner.
  • 2008/10/28: fixed parameter reversal in update example, based on feedback from Chanwoo Yoo.
  • 2008/11/24: updated to the new uniform binding syntax.
  • 2009/10/15: fixed load/save bug by switching to pr-str, based on feedback from Perttu.