Blog Posts tagged with: clojure

Nov 21 2008

Clojure Wins Again

Steve Yegge's most recent post takes a right-angle turn about a third of the way through, and begins a comparison of Emacs Lisp and JavaScript.

And the winner is ... Clojure!

OK, Steve didn't say that. What he did do was call out things he liked about JavaScript and Emacs Lisp.

For JavaScript:

  • momentum
  • (namespace) encapsulation
  • delegation (polymorphism?)
  • properties (by Steve's definition)
  • serialize to source

For Emacs Lisp:

  • Macros
  • S-Expressions

I first picked up Clojure looking for many of the same things that Steve wants. I found them. Clojure can do all the things on both lists above. (Serialize to source isn't formal yet, but check the mailing list. And of course, you will have to judge "momentum" for yourself.)

The scary thing is that Clojure wins the language war before you even learn about its signature features. When I started exploring Clojure, I quickly realized it had everything I wanted, which could be summarized as "Lisp that really embraces the Java platform."

Then Clojure changed the definition of what I wanted. Now I also want

If you have half an hour, watch a compelling vision of what software development will look like in 2010.

Nov 05 2008

Clojure Beta Book Available

The Clojure Beta book is now available. Here's the Table of Contents. (Chapters with an asterisk are included in this beta.)

  • Preface*
  • Getting Started*
  • Exploring Clojure*
  • Working with Java*
  • Unifying Data with Sequences*
  • Functional Programming
  • Concurrency*
  • Macros
  • Multimethods
  • Third-Party Libraries
  • Case Study

Because this is a Beta book, and Clojure is continuing to evolve, there will be errata. Please let me know any problems you find, and I will address them in the next Beta.

Other Clojure resources

Oct 10 2008

Concurrent Programming with Clojure

Clojure is a dynamic language for the Java Virtual Machine with several powerful features for building concurrent applications. In this talk you will learn about:

  • Functional programming. Clojure's immutable, persistent data structures encourage side-effect-free programming that scales easily across multiple processor cores.
  • Software Transactional Memory (STM). STM provides a mechanism for managing references and updates across threads that is easier to use and less error-prone than lock-based concurrency.
  • Agents. Agents provide a thread-safe mechanism for asynchronous, uncoordinated updates.
  • Atoms. Atoms provide for synchronous, uncoordinated updates.
  • Dynamic Vars. Dynamic Vars support thread-local state.
  • Direct access to Java. Clojure calls Java directly, and can emit the same byte code that a handcrafted Java program would. Where it makes sense to do so, you can easily access the java.util.concurrent library.
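
To make the list above a little more concrete, here is a minimal sketch of each reference type. It assumes a current Clojure release (1.3 or later, where rebindable vars need ^:dynamic), and every name in it (hits, checking, savings, transfer!, audit-log, *user*) is made up for illustration rather than taken from the talk.

```clojure
;; Atom: synchronous, uncoordinated update of a single value.
(def hits (atom 0))
(swap! hits inc)

;; Refs + STM: coordinated updates. Both alters commit together or not at all.
(def checking (ref 100))
(def savings (ref 0))
(defn transfer! [from to amount]
  (dosync
    (alter from - amount)
    (alter to + amount)))
(transfer! checking savings 30)

;; Agent: asynchronous, uncoordinated update applied on another thread.
(def audit-log (agent []))
(send audit-log conj "transferred 30")
(await audit-log) ; block until the queued action has run

;; Dynamic var: a thread-local rebinding, visible only inside `binding`.
(def ^:dynamic *user* "nobody")
(defn current-user [] *user*)
(def seen-inside (binding [*user* "alice"] (current-user)))

;; In a script, release the agent thread pools so the JVM can exit.
(shutdown-agents)
```

The transfer is the interesting case: because both alters happen inside one dosync, no other thread can observe the money missing from both accounts.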

Sep 25 2008

PCL -> Clojure, Chapter 17

This article is part of a series describing a port of the samples from Practical Common Lisp (PCL) to Clojure. You will probably want to read the intro first.

This article covers Chapter 17, Object Reorientation: Classes.

Creating structs

Common Lisp defines classes with defclass. In Clojure, I can define structs with defstruct:

  (defstruct bank-account :customer-name :balance)

The bank-account struct has two basis keys: customer-name and balance. I can specify values for these keys, in the order they were declared, using struct:

  user=> (struct bank-account "John Doe" 1000)
  {:customer-name "John Doe", :balance 1000}

With struct, all the basis keys are optional:

  user=> (struct bank-account)
  {:customer-name nil, :balance nil}

If you prefer named parameters, you can use struct-map instead of struct:

  user=> (struct-map bank-account :balance 10)
  {:customer-name nil, :balance 10}

Very important: structs are still maps. I can specify additional keys that are not part of the basis:

  user=> (struct-map bank-account  :balance 10
                                       :customer-name "Jane Doe"
                                       :status :gold)
  {:customer-name "Jane Doe", :balance 10, :status :gold}

Accessing structs

The examples below assume an example-account:

  (def example-account (struct bank-account "Example Customer" 1000))

Pedants call get to access a structure value:

  user=> (get example-account :customer-name)
  "Example Customer"

But that's way too much effort. Structures are functions of their keys:

  user=> (example-account :customer-name)
  "Example Customer"

If the struct keys are keywords, I can go the other way. Keywords are functions of structs:

  user=> (:customer-name example-account)
  "Example Customer"

Other than keywords, what else can be a structure key? Ah, sweet immutability. Since Clojure data structures are immutable, any of them can function as keys.

I can use assoc and dissoc to get a new map with a key added or removed:

  user=> (assoc example-account :status :elite)
  {:customer-name "Example Customer", :balance 1000, :status :elite}

  user=> (dissoc {:a 1 :b 2} :a)
  {:b 2}

But I can't dissoc from example-account because you can never remove a basis key:

  user=> (dissoc example-account :customer-name)
  java.lang.Exception: Can't remove struct key
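
Here is a small sketch that gathers the three lookup styles in one place and tries a non-keyword key. It repeats the struct definition so that it runs on its own; the [:audit 2008] key and the name tagged are invented for the example.

```clojure
(defstruct bank-account :customer-name :balance)
(def example-account (struct bank-account "Example Customer" 1000))

;; Three equivalent lookups.
(def via-get     (get example-account :customer-name))
(def via-struct  (example-account :customer-name))   ; struct as a function of its key
(def via-keyword (:customer-name example-account))   ; keyword as a function of the struct

;; Any immutable value can be a key, including a vector.
(def tagged (assoc example-account [:audit 2008] "clean"))
(def audit-status (get tagged [:audit 2008]))

;; assoc returned a new map; the original account is untouched.
(def original-has-audit? (contains? example-account [:audit 2008]))
```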

Defaults and validation

Since structs are also maps, default values are easy: just merge them. The example below doesn't even use a struct. (Often duck typing is good enough.)

  (def account-defaults {:balance 0})
  (defn create-account [options]
    (merge account-defaults options))

If I want to validate fields, I can just write a validation function. Here is a validation that simply requires non-false values:

  (defn validate-account [account]
    (or (every? account [:customer-name :balance])
        (throw (IllegalArgumentException. "Not a valid account"))))
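
A quick sketch of the two functions working together. It repeats both definitions so that it is self-contained; jane and rejected? are names I made up for the example.

```clojure
(def account-defaults {:balance 0})

(defn create-account [options]
  (merge account-defaults options))

(defn validate-account [account]
  (or (every? account [:customer-name :balance])
      (throw (IllegalArgumentException. "Not a valid account"))))

;; The default :balance is merged in; the caller's keys win on conflict.
(def jane (create-account {:customer-name "Jane Doe"}))

;; A missing :customer-name looks up as nil, so validation throws.
(def rejected?
  (try
    (validate-account (create-account {}))
    false
    (catch IllegalArgumentException _ true)))
```

Note that this validation treats false and nil alike, so a field that legitimately holds false would be rejected too.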

Of course, if I wanted to create tons of different structs with similar validations, I could build some helpers. Macros + metadata would be one way to go.

Wrapping up

Clojure's structs fill some of the same roles as Common Lisp's classes. The examples above show how to create and access structs, and how to add default values and validation.

That said, Clojure's structs are not classes. They do not offer inheritance, polymorphism, etc. In Clojure, those kinds of jobs are handled by the incredibly flexible defmulti (see the previous article for details, especially the references at the end).

Notes

Sep 25 2008

PCL -> Clojure, Chapter 16

This article is part of a series describing a port of the samples from Practical Common Lisp (PCL) to Clojure. You will probably want to read the intro first.

This article covers Chapter 16, Object Reorientation: Generic Functions.

defmulti

In Common Lisp, a generic function defines an abstract operation and a parameter list. In Clojure, a multimethod takes a similar role:

  (defmulti draw :shape)

The multimethod's name is draw, and :shape is a dispatch function used to select the actual concrete implementation. (Remember that keywords like :shape are also lookup functions.) Now, I can create one or more methods:

  (defmethod draw :square [shape] "TBD: draw a square")
  (defmethod draw :circle [shape] "TBD: draw a circle")

The first method will draw things with a :shape of :square, and the second method will draw things with a :shape of :circle:

  user=> (draw {:shape :square, :length 10})
  "TBD: draw a square"
  user=> (draw {:shape :circle, :radius 8})
  "TBD: draw a circle"  

The draw multimethod is emulating single dispatch on type, if you think of an object's :shape value as its type. But the multimethod mechanism is more general.

A more complete example

Let's say that I need to implement account withdrawals. Different kinds of accounts will have different rules:

  • Bank accounts are simple accounts. Withdrawals will work if there is enough money available.
  • Checking accounts attach an overdraft account which can be used to cover large withdrawals.

The multimethod for withdraw could look like this:

  (defmulti withdraw :account-type)

The bank account implementation will do a simple withdraw.

  (defmethod withdraw :bank [account amount]
    (raw-withdraw account amount))

PCL uses Common Lisp's method combination to share implementation code between the different account types. Clojure's dispatch is much more general, so a general method combination mechanism is not appropriate. I am taking a different approach, pulling the shared code into a helper function raw-withdraw:

  (defn raw-withdraw [account amount]
    (when (< (:balance account) amount)
      (throw (IllegalArgumentException. "Account overdrawn")))
    (assoc account :balance (- (:balance account) amount)))

The withdrawal differs from the original PCL implementation in one other way. The original code mutated the account. Since mutation is a no-no, I am instead returning a new account object, associng in the changed balance. In the example below, I am using a let just to show that the original account is unchanged.

  (let [original-state {:account-type :bank :balance 100}
        updated-state (withdraw original-state 50)]
    (println original-state updated-state)) 

  {:balance 100, :account-type :bank} {:balance 50, :account-type :bank}

The checking account is a little more complex. First, I have to shuttle money in from the overdraft account (if necessary), then raw-withdraw as before:

  (defmethod withdraw :checking [account amount]
    (let [over-account (account :overdraft-account)
          over-amount (- amount (:balance account))
          withdrawal-account
          (if (> over-amount 0)
            (merge account
                   {:overdraft-account (withdraw over-account over-amount)
                    :balance amount})
            account)]
      (raw-withdraw withdrawal-account amount)))

Again, all the objects are immutable. The merge function returns a new account object (possibly with an overdraft), and the raw-withdraw returns another object:

  (let [overdraft {:account-type :checking, :balance 1000}
        original-state {:account-type :checking
                        :balance 100
                        :overdraft-account overdraft}
        updated-state (withdraw original-state 500)]
    (println original-state)
    (println updated-state))

  {:overdraft-account {:balance 1000, :account-type :checking}, 
   :balance 100, 
   :account-type :checking}
  {:overdraft-account {:balance 600, :account-type :checking}, 
   :balance 0, 
   :account-type :checking}

Dispatching on more than one parameter

In languages like Java, methods are polymorphic on their first (implicit) parameter. Because multimethods dispatch on arbitrary functions, they can be polymorphic on all of their parameters.

For example, a music library might implement a beat method that is polymorphic on both the drum and the stick:

  (defmulti beat (fn [d s] [(:drum d)(:stick s)]))
  (defmethod beat [:snare-drum :brush] [drum stick] "snare drum and brush")
  (defmethod beat [:snare-drum :soft-mallet] [drum stick] "snare drum and soft mallet")

The first beat method matches only snare drum + brush, etc.:

  user=> (beat {:drum :snare-drum} {:stick :brush})
  "snare drum and brush"
  user=> (beat {:drum :snare-drum} {:stick :soft-mallet})
  "snare drum and soft mallet"

If no methods match the dispatch value, Clojure throws an exception:

  user=> (beat {:drum :bongo} {:stick :none})

    java.lang.IllegalArgumentException: No method for dispatch value
    ... stack trace elided ...

Or, you can define a :default that will match if no other dispatch value matches:

  (defmethod beat :default [drum stick] "default value, if you want one")

  user=> (beat {:drum :bongo} {:stick :none})
    "default value, if you want one"

Wrapping up

The PCL chapter demonstrates dispatch based on one or more arguments to a function, and those examples are duplicated above. There are many other things you might do with defmulti, but since they are not covered in PCL I will declare them out of scope here, and point you to some other reading:

  • Clojure objects have metadata, so you could dispatch based on metadata values instead of data values. See mac's post on the mailing list for an example.
  • Dispatch can be based on the state of an object, rather than on some kind of type tag. This lets you treat a rectangle with equal width and height as a square, even if it was created as a rectangle. See my article on dispatch in the Java.next series for an example.
  • Clojure's defmulti allows you to create multiple taxonomies dynamically, and trivially dispatch based on isa relationships in a taxonomy. See Rich's mailing list post introducing this feature.
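
To give a taste of the last two bullets without leaving this page, here is a minimal sketch. It is my own illustration, not the code from the linked posts, and the names describe and classify are invented for it.

```clojure
;; Taxonomy-based dispatch: declare that ::square is a kind of ::rectangle.
(derive ::square ::rectangle)

(defmulti describe :shape)
(defmethod describe ::rectangle [s] "some kind of rectangle")
(defmethod describe ::circle [s] "a circle")

;; There is no ::square method, so dispatch falls back on the isa? relationship.
(def square-description (describe {:shape ::square}))
(def circle-description (describe {:shape ::circle}))

;; State-based dispatch: the dispatch function computes a value from the data,
;; so a rectangle with equal sides is treated as a square.
(defmulti classify
  (fn [{:keys [width height]}]
    (if (= width height) :square :rectangle)))
(defmethod classify :square [r] "square")
(defmethod classify :rectangle [r] "rectangle")
```

The point of the second half is that nothing in the data says "square"; the classification is derived at call time.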

Notes

Revision history

  • 2008/09/25: initial version
  • 2008/12/09: fixed withdraw erratum. Thanks Dean Ferreyra.

Sep 24 2008

PCL -> Clojure, Chapter 9

This article is part of a series describing a port of the samples from Practical Common Lisp (PCL) to Clojure. You will probably want to read the intro first.

This article covers Chapter 9, Practical: Building a Unit Test Framework.

Tests and reports

To build a minimal testing library, I need nothing more than tests and results. To keep reporting as simple as possible, I will start with console output. The report-result function tests a result, and prints pass or FAIL, plus a form with supporting detail:

  (defn report-result [result form]
    (println (format "%s: %s" (if result "pass" "FAIL") (pr-str form))))

Now any function can be a test. The detail message can often be the same form that caused the error, so I will pass the same form twice: once for evaluation, and again (quoted!) for use in the detail message:

  (defn test-+ []
    (report-result (= (+ 1 2) 3) '(= (+ 1 2) 3))
    (report-result (= (+ 1 2 3) 6) '(= (+ 1 2 3) 6))
    (report-result (= (+ -1 -3) -4) '(= (+ -1 -3) -4)))

The console output for test-+ looks like this:

  user=> (test-+)
  pass: (= (+ 1 2) 3)
  pass: (= (+ 1 2 3) 6)
  pass: (= (+ -1 -3) -4)

Inferring the detail message

The fact that I want to pass the same form twice, but with different evaluation semantics, just screams macro. Sure enough, I can clean up the code with a macro:

  (defmacro check [form]
    `(report-result ~form '~form))

The macro expands the form twice, once for evaluation and once quoted for the detail message. Now I can replace calls to report-result with simpler calls to check:

  (defn test-* []
    (check (= (* 1 2) 3))
    (check (= (* 1 2 3) 6))
    (check (= (* -1 -3) -4)))

Hmm. The calls to check are cleaner than the calls to report-result in the earlier example, but the check itself still looks repetitive. Solution: a better check macro that can handle multiple forms:

  (defmacro check [& forms]
    `(do
       ~@(map (fn [f] `(report-result ~f '~f))  forms)))

The quoting and unquoting is a little more complex--play around with macroexpand-1 to see how it works.
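
If you want a starting point for that experiment, here is a sketch that expands a two-form call. The definitions are repeated so it runs on its own. The namespace prefix in the expansion depends on where you define report-result, so I have written it as NS in the comment.

```clojure
(defn report-result [result form]
  (println (format "%s: %s" (if result "pass" "FAIL") (pr-str form))))

(defmacro check [& forms]
  `(do
     ~@(map (fn [f] `(report-result ~f '~f)) forms)))

;; Expand without evaluating. The result has the shape
;;   (do (NS/report-result (= 1 1) (quote (= 1 1)))
;;       (NS/report-result (= 2 3) (quote (= 2 3))))
(def expansion (macroexpand-1 '(check (= 1 1) (= 2 3))))
(println expansion)
```

Each form shows up twice in the expansion: once bare, where it will be evaluated, and once wrapped in quote, where it survives as data for the report.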

With the better check in place, test functions are quite simple:

  (defn test-rem []
    (check (= (rem 10 3) 1)
           (= (rem 6 2) 0)
           (= (rem 7 4) 3)))

Aggregating results

So far I have tests and console output. Next, I need some way to aggregate a set of checks into a single, top-level "checks passed" or "checks failed".

I would like to simply and together all the individual checks, but that does not quite work. As in many languages, Clojure's and short-circuits and stops evaluating when it encounters a logical false. That's no good here: Even if one test fails, I still want all the tests to run.

Since it is a question of optional evaluation, a macro is appropriate. The combine-results macro works like and, but it always evaluates all the forms:

  (defmacro combine-results [& forms]
    `(every? identity (list ~@forms)))
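
To see the difference from and, here is a small sketch that records which forms actually ran. The atom ran and the helper noting are names I invented for the demonstration.

```clojure
(defmacro combine-results [& forms]
  `(every? identity (list ~@forms)))

;; Record which checks were actually evaluated.
(def ran (atom []))
(defn noting [label result]
  (swap! ran conj label)
  result)

;; `and` stops at the first false, so :b never runs.
(def and-result (and (noting :a false) (noting :b true)))
(def ran-with-and @ran)

;; combine-results evaluates every form before combining them.
(reset! ran [])
(def combined-result (combine-results (noting :a false) (noting :b true)))
(def ran-with-combine @ran)
```

Both expressions return false, but only combine-results gave the second check a chance to run and report.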

Now check can use combine-results instead of do.

  (defmacro check [& forms]
    `(combine-results
      ~@(map (fn [f] `(report-result ~f '~f)) forms)))

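All existing functionality still works, and now I can see a useful return value from a test. (This relies on report-result returning its result argument, as the revised report-result in the next section does. A report-result that ends with println returns nil, and combine-results would then always return false. The output below is from a test-* that contains two passing checks.)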

  user=> (test-*)
  pass: (= (* 2 4) 8)
  pass: (= (* 3 3) 9)
  true

Capturing test names

Tests ought to have names. In fact, tests ought to support multiple names. You can imagine a test detail report saying:

Check math->addition->associative passed: ...

Where associative is the name of a check, addition is the name of a function, and math is the name of another function that called addition.

First, I need a variable to store a sequence of names:

  (def *test-name* [])

Printing the variable as part of a result is easy:

  (defn report-result [result form]
    (println (format "%s: %s %s"
                     (if result "pass" "fail")
                     (pr-str *test-name*)
                     (pr-str form)))
    result)

Now for the hard part: populating the collection of names. For this, I will introduce a deftest macro:

  (defmacro deftest [name & forms]
    `(defn ~name []
       (binding [*test-name* (conj *test-name* (str '~name))]
         ~@forms)))

The macro expansion performed by deftest is nothing new: deftest turns around and defns a new function named name. The interesting part is the call to binding, which rebinds *test-name* to a new collection built from the old *test-name* plus the name of the current test.

The new binding of *test-name* is visible anywhere inside the dynamic scope of the binding form. The dynamic scope includes any function calls made inside the binding, and their function calls, and so on ad infinitum ... or until another binding performs the same trick again. This gives exactly the semantics we want:

  • The dynamic scope allows callers to influence callees without having to pass *test-name* as an argument all over the place. Nested functions "remember" a stack of their callers' names through *test-name*.
  • The unwinding of the dynamic scope protects readers of *test-name* outside a binding. Code after the binding will never see the values *test-name* takes during the binding.
  • Dynamic bindings are thread-local (and therefore thread-safe).
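
The first two points can be seen in a stripped-down sketch with no macros at all. One assumption to flag: Clojure 1.3 and later only allow binding on vars marked ^:dynamic, so the def below carries that metadata. The functions innermost, middle, and outer are invented for the example.

```clojure
;; Clojure 1.3+ requires ^:dynamic before a var can be rebound with `binding`.
(def ^:dynamic *test-name* [])

;; Reads the var without ever receiving it as an argument.
(defn innermost [] *test-name*)

(defn middle []
  (binding [*test-name* (conj *test-name* "middle")]
    (innermost)))

(defn outer []
  (binding [*test-name* (conj *test-name* "outer")]
    (middle)))

(def seen-inside (outer))     ; the stack of callers, outermost first
(def seen-after *test-name*)  ; back to the root value once the bindings unwind
```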

With deftest in place, I can define a hierarchy of nested tests:

  (deftest test-*
    (check (= (* 2 4) 8)
           (= (* 3 3) 9)))

  (deftest test-math
    ; TODO: test rest of math
    (test-*))

  (deftest test-all-of-nature
    ; TODO: test rest of nature
    (test-math))

Calling test-all-of-nature will demonstrate multiple levels of nested name in a test report:

  user=> (test-all-of-nature)
  pass: ["test-all-of-nature" "test-math" "test-*"] (= (* 2 4) 8)
  pass: ["test-all-of-nature" "test-math" "test-*"] (= (* 3 3) 9)
  true                         

From here, better formatting of the console message is just mopping up.

Wrapping up

When I first read Practical Common Lisp, this was my favorite chapter. The testing library evolves quickly and naturally to a substantial feature set. (In case you didn't keep count, the entire "framework" is less than twenty lines of code.)

Try implementing the unit-testing example in your language of choice. Don't just implement the finished design. Work through each of the iterations described above:

  1. tests and results
  2. inferring the detail message
  3. aggregating results
  4. capturing test names

I would love to hear about your results, and I will link to them here.

Notes

Sep 24 2008

Java.next Overview

As we reach the middle of our second decade of Java experience, the community has learned a lot about software development. Many of our best ideas on how to use a Java Virtual Machine (JVM) are now being baked into more advanced languages for the JVM. These languages tend to provide two significant advantages:

  • They reduce the amount of ceremony in your code, allowing you to focus on the essence of the problem you are solving.
  • They enable some degree of functional programming style. Think of it as a dash of verb-oriented programming to spice up your noun-oriented programming.

I have picked four "Java.next" languages to demonstrate these concepts: Clojure, Groovy, JRuby, and Scala. I have written a series of articles and conference talks describing how these languages can make teams more productive.

This page is the top-level table of contents for Java.next, and I will update the links below as new articles and talks become available.

Articles on Java.next

Conference talks on Java.next

Seeing a talk

If you are interested in hearing me speak on Java.next, check the event schedule, or contact Relevance (info@thinkrelevance.com) to schedule an event near you.

Sep 23 2008

PCL -> Clojure, Chapter 8

This article is part of a series describing a port of the samples from Practical Common Lisp (PCL) to Clojure. You will probably want to read the intro first.

This article covers Chapter 8, Macros: Defining Your Own.

Rolling your own

Lisp macros gain their power by controlling argument evaluation. In a normal Lisp function call, all arguments are evaluated before the function is invoked. Consider this call to function foo:

  (foo a b)

Arguments a and b are evaluated, and then passed to function foo. If foo were a macro, however, all bets would be off. Then foo's arguments might be evaluated in bizarre orders, or not at all.

This may seem a little crazy until you consider a simple if:

  (if monday (wake-up) (sleep))

if cannot possibly be a normal Lisp function. If it were, you would always both wake-up and sleep, regardless of the value of monday.
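
You can watch this happen with a small experiment. The sketch below is my own illustration: fn-if and macro-if are invented names, and wake-up and sleep are stand-ins that record the fact that they were called.

```clojure
(def calls (atom []))
(defn wake-up [] (swap! calls conj :wake-up) :awake)
(defn sleep   [] (swap! calls conj :sleep)   :asleep)

;; As a function: both branch arguments are evaluated before fn-if runs.
(defn fn-if [test then else]
  (if test then else))

(def fn-result (fn-if true (wake-up) (sleep)))
(def calls-after-fn @calls)

;; As a macro: the branches arrive unevaluated, and only one ends up being run.
(defmacro macro-if [test then else]
  `(cond ~test ~then :else ~else))

(reset! calls [])
(def macro-result (macro-if true (wake-up) (sleep)))
(def calls-after-macro @calls)
```

Both versions return :awake, but the function version also went to sleep along the way.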

As the if example suggests, control flow is an obvious use case for macros. PCL demonstrates custom macros by defining a new control flow macro named do-primes.

Preparing for do-primes

In order to implement do-primes, I will need a primeness test. For clarity, I will divide this into two functions. First, a simple helper to detect factors.

  (defn divides? [candidate-divisor dividend]
    (zero? (rem dividend candidate-divisor)))

Now I can tell when one number divides another:

  user=> (divides? 7 42)
  true
  user=> (divides? 11 42)
  false

A prime is simply a number greater than one with no divisors other than one and itself. I am a busy guy, so I won't check all the natural numbers, only those from two up to the square root of the number being tested. Here is a simple primeness test:

  ; yes, I know there are faster ways.  
  (defn prime? [num]
    (when (> num 1)
      (every? (fn [x] (not (divides? x num)))
        (range 2 (inc (int (Math/sqrt num)))))))

Sequences of primes

My eventual objective is to call do-primes like this:

  (do-primes i 100 200 
    (print (format "%d " i)))

where i is the loop variable and runs over the primes from 100 to 200. Because Clojure has nice support for infinite sequences, I find it easier to begin by thinking in terms of the pure math. So, here is a function that returns the sequence of primes starting from a number:

  (defn primes-from [number]
    (filter prime? (iterate inc number)))

(iterate inc number) returns an infinite sequence starting with number and then incrementing by one for each subsequent element. The filter then whittles this down to numbers that are prime.

This sequence is infinite, so don't try to view it from the console. Take your primes a few at a time:

  user=> (take 5 (primes-from 1000))
  (1009 1013 1019 1021 1031)

Now I need a simple helper that begins with primes-from, but cuts off the sequence at a chosen end:

  (defn primes-in-range [start end]
    (for [x (primes-from start) :while (<= x end)] x))

The for is a list comprehension. It takes all the (primes-from start), but only while those numbers are still less than or equal to end.
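
If the :while clause feels unfamiliar, the same cutoff can be written with take-while. This is an alternative sketch, not the version from the sample repository; the helper definitions are repeated so it runs by itself.

```clojure
(defn divides? [candidate-divisor dividend]
  (zero? (rem dividend candidate-divisor)))

(defn prime? [num]
  (when (> num 1)
    (every? (fn [x] (not (divides? x num)))
            (range 2 (inc (int (Math/sqrt num)))))))

(defn primes-from [number]
  (filter prime? (iterate inc number)))

;; Same result as the for/:while version: stop at the first prime past end.
(defn primes-in-range [start end]
  (take-while #(<= % end) (primes-from start)))

(def primes-100-150 (primes-in-range 100 150))
(println primes-100-150)
```

Either way, laziness is what makes it work: the infinite sequence is only realized up to the first prime that exceeds end.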

do-primes

Now I am finally ready to write the macro do-primes:

  (defmacro do-primes [var start end & body]
    `(doseq [~var (primes-in-range ~start ~end)] ~@body))

Macros work in two steps: expansion followed by normal Lisp evaluation. The expansion phase is like a template substitution, but with the full power of Lisp at your disposal.

In the definition of do-primes above, the syntax-quote (`) identifies the static part of the template:

  • For symbols, syntax-quote resolves the name to a fully qualified symbol (with some exceptions we don't need to worry about in this example).
  • For lists, syntax-quote will recursively syntax-quote the contained forms.

The unquote (~) and splicing-unquote (~@) provide the dynamic part of the template by exempting their forms from syntax quoting rules.
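
The three pieces of punctuation are easier to tell apart outside a macro. Here is a minimal sketch with two throwaway vars, x and xs. It assumes a current Clojure, where core symbols qualify as clojure.core/... rather than the older clojure/... prefix.

```clojure
(def x 1)
(def xs [2 3])

;; Syntax-quote alone: symbols are qualified, nothing is evaluated.
(def quoted `(list x xs))

;; Unquote drops in each value as a single element.
(def unquoted `(list ~x ~xs))

;; Splicing unquote drops in the elements of xs, without the surrounding brackets.
(def spliced `(list ~x ~@xs))

(println quoted)
(println unquoted)
(println spliced)
```

The difference between the last two is the "too many parens" problem: ~xs keeps the collection nested, ~@xs flattens it into the enclosing form.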

Your reaction at this point should be "That's a lot of ugly punctuation." Fear not, macroexpand-1 will ease the pain. macroexpand-1 will show you how Clojure expands the macro, without executing the expanded result. This gives you a chance to experiment with the rules for quoting and unquoting. Here is an example:

  user=> (macroexpand-1 '(do-primes i 1 10 (print i)))
  (clojure/doseq [i (pcl.chap_08/primes-in-range 1 10)] (print i))

Looking back at the definition of do-primes, here is what happened:

  • doseq expanded to the fully-qualified clojure/doseq. (I haven't covered namespaces yet, but the clojure namespace contains most of the Clojure core.)
  • i, 1, and 10 are direct expansions from the macro call.
  • primes-in-range is one of the helper functions I wrote earlier. In the sample repository, I have placed this in the pcl.chap_08 namespace, hence the expansion.
  • body contains a list of things I want to do with my primes, specifically ((print i)). That is almost what I need, except a few too many parens. The "splice" part of splicing unquote gets rid of the extra parens, splicing the list into the template. This is exactly what I need to match the doseq signature.

Now I can do-primes:

  user=> (do-primes i 100 150 
    (print (format "%d " i)))
  101 103 107 109 113 127 131 137 139 149

Wrapping up

The easiest way to write a macro is to work backwards. Write the form that you want the macro to expand into, and then test interactively with macroexpand-1 until you have a macro that expands correctly.

Macros are hard, and I have skipped some of the building blocks here. Check out the chapter in PCL.


Notes

The sample code is available at http://github.com/stuarthalloway/practical-cl-clojure.

Revision history

Sep 22 2008

PCL -> Clojure, Chapter 11

This article is part of a series describing a port of the samples from Practical Common Lisp (PCL) to Clojure. You will probably want to read the intro first.

This article covers Chapter 11, Collections.

Sequence basics

PCL describes a group of basic collection functions: count, find, position, remove, and substitute. Clojure supports count for a variety of list-like types:

  user=> (count (quote (1 2 3)))
  3
  user=> (count [1 2 3])
  3
  user=> (count #{1 2 3})
  3
  user=> (count "characters")
  10                          

These types, and any others that implement a basic first/rest protocol, are called sequences in Clojure. A sequence is logically a list, but may be implemented using other data structures.

In addition to generic sequence functions, some sequences have specific functions unique to their underlying data structure. Clojure defines find for maps to return the matching key/value pair:

  user=> (find {:lname "Doe", :fname "John"} :fname)                   
  [:fname "John"]

Or, you could just place the map itself in function position, and get back the matching value for a key:

  user=> ({:lname "Doe", :fname "John"} :fname)
  "John"

The Clojure core does not define find for other collection types. But the implementation is a one-liner using some. For example, to ask if a collection contains the number 2:

  user=> (some #(= % 2) [1 2 3])
  true

Clojure-contrib wraps the some idiom into a function named includes?.

The rest of the "basic" functions have similar stories: The Clojure core tends to support them directly where they are efficient (constant time) operations. Where they would take longer (e.g. linear time), the operations can be written as one-liners atop higher-order functions.
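
As an illustration of the one-liner claim, here is how position and substitute might look. Both are hypothetical helpers written for this article, not functions from the Clojure core, and the sketch assumes a Clojure recent enough to have keep-indexed.

```clojure
;; position: index of the first element equal to item, or nil if absent.
;; (Hypothetical helper, not part of clojure.core.)
(defn position [item coll]
  (first (keep-indexed (fn [i x] (when (= x item) i)) coll)))

;; substitute: replace every occurrence of old with new.
;; (Hypothetical helper, not part of clojure.core.)
(defn substitute [new old coll]
  (map #(if (= % old) new %) coll))

(println (position :c [:a :b :c :d]))
(println (substitute 10 1 [1 2 1 3]))
```

Both walk the collection, which is exactly why the core leaves them out: the linear cost is visible in the code you write.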

Higher-order functions

CL includes higher order versions of the basic functions described above. These higher-order versions take an additional parameter, which is a function that acts as a filter. Here are some examples.

First, a collection of days for the examples to work against:

  ; for re-split
  (use 'clojure.contrib.str-utils)
  (def days (re-split #" " "Sun Mon Tues Wed Thurs Fri Sat"))

Now I can find the days that start with "S":

  user=> (filter #(.startsWith % "S") days)
  ("Sun" "Sat")

Or simply count the days that start with "S":

  user=> (count (filter #(.startsWith % "S") days)) 
  2

In an immutable world, remove is the opposite of find. I can get a collection with all "S" days removed by reversing the previous filter with complement:

  user=> (filter (complement #(.startsWith % "S")) days)
  ("Mon" "Tues" "Wed" "Thurs" "Fri")

To replace all "S" days with "Weekend!" I can use map:

  user=> (map #(if (.startsWith % "S") "Weekend!" %) days)
  ("Weekend!" "Mon" "Tues" "Wed" "Thurs" "Fri" "Weekend!")
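
If clojure.contrib.str-utils is not available in your Clojure, the same examples can be sketched with the clojure.string namespace. This version assumes Clojure 1.8 or later (for starts-with?), and s-day? is a helper name I made up. It also uses remove, which does the job of filter plus complement.

```clojure
(require '[clojure.string :as str])

;; clojure.string/split takes the string first, then the regex, and returns a vector.
(def days (str/split "Sun Mon Tues Wed Thurs Fri Sat" #" "))

(defn s-day? [day] (str/starts-with? day "S"))

(def s-days     (filter s-day? days))
(def non-s-days (remove s-day? days))
(def relabeled  (map #(if (s-day? %) "Weekend!" %) days))

(println s-days)
(println non-s-days)
(println relabeled)
```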

Sorting

Sorting is easy:

  user=> (sort days)
  ("Fri" "Mon" "Sat" "Sun" "Thurs" "Tues" "Wed")

Sorting by criteria is also easy:

  user=> (sort-by #(.length %) days)
  ("Sun" "Mon" "Wed" "Fri" "Sat" "Tues" "Thurs")

Combining sequences

The concat function concatenates sequences.

  user=> (concat [1 2 3] [4 5 6])
  (1 2 3 4 5 6)

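Note that the resulting sequence is lazy. So, concat can return without walking each input sequence. In other words, the (take 5 ...) below does not have to wait (forever!) for all the powers of 2 to be generated. (The example assumes that powers-of-2 is already defined as an infinite lazy sequence 1, 2, 4, and so on.)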

  user=> (take 5 (concat (quote (1/4 1/2)) powers-of-2))
  (1/4 1/2 1 2 4)
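
The example does not show where powers-of-2 comes from. One plausible definition, which is my assumption rather than the original code, builds it with iterate:

```clojure
;; An assumed definition: the infinite lazy sequence 1, 2, 4, 8, ...
(def powers-of-2 (iterate #(* 2 %) 1))

;; concat returns immediately; only the five requested elements are realized.
(def first-five (take 5 (concat '(1/4 1/2) powers-of-2)))
(println first-five)
```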

What if one of the sequences passed to concat blows up instead of returning a sequence?

  user=> (take 2 (concat '(1 2 3) (throw (Error. "Not a sequence"))))
  java.lang.Error: Not a sequence

Here concat fails because it is an ordinary function: its arguments are evaluated before it is called, and evaluating the second one throws. As it happens, I have an even lazier option than concat. The lazy-cat macro does not even evaluate an argument until it is forced to do so:

  user=> (take 2 (lazy-cat '(1 2 3) (throw (Error. "Not a sequence"))))
  (1 2)

Lazy sequences have many uses, but take some getting used to. One mistake to avoid is trying to inspect a lazy infinite sequence from the REPL. The REPL tries to print the entire sequence, which will take forever (literally). Hence the (take 2 ...) wrappers above.

Subsequences

It is often interesting to take subsequences from the beginning, middle, or end of a collection. Clojure supports this in a general way with take and drop. You have already seen take, which returns the first part of a collection:

  user=> (take 2 days)
  ("Sun" "Mon")                                      

For the end of a collection, I can use drop:

  user=> (drop 2 days)
  ("Tues" "Wed" "Thurs" "Fri" "Sat")

For the middle of a collection, I can use take and drop together:

  user=> (take 5 (drop 1 days))
  ("Mon" "Tues" "Wed" "Thurs" "Fri")

The take-nth function takes only every nth item of a collection. To demonstrate take-nth, I will begin by defining a lazy collection of the natural-numbers:

  (def natural-numbers (iterate inc 1))

The call to iterate produces a collection that starts with 1 and generates subsequent members by calling inc. You can verify that these are the natural numbers by taking a few of them.

  user=> (take 10 natural-numbers)
  (1 2 3 4 5 6 7 8 9 10)

Now I can write an intuitive definition for the even and odd numbers in terms of the natural numbers:

  (def odd-numbers (take-nth 2 natural-numbers))
  (def even-numbers (take-nth 2 (drop 1 natural-numbers)))
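
As with the natural numbers, a few elements are enough to check the definitions. The sketch below repeats them so that it is self-contained.

```clojure
(def natural-numbers (iterate inc 1))
(def odd-numbers (take-nth 2 natural-numbers))
(def even-numbers (take-nth 2 (drop 1 natural-numbers)))

;; All three sequences are infinite, so always take a finite prefix.
(def first-odds  (take 5 odd-numbers))
(def first-evens (take 5 even-numbers))
(println first-odds first-evens)
```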

Predicates

Clojure provides a number of functions that test boolean predicates, including every?, not-any?, not-every?, and some. Here are a few examples, using the days collection defined above.

Does every day start with "S"?

  user=> (every? #(.startsWith % "S") days)
  false

Is there some day that starts with "M"?

  user=> (some #(.startsWith % "M") days)
  true

Map and reduce

map takes a function and one or more sequences. It returns a new sequence which is the result of applying the function to the item(s) in each sequence. So, to take the product of numbers from two sequences:

  user=> (map * '(1 2 3 4 5) '(10 9 8 7 6))
  (10 18 24 28 30)

If I want to control the type of collection returned, I can use into:

  user=> (into [] (map * '(1 2 3 4 5) '(10 9 8 7 6)))
  [10 18 24 28 30]
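
A few more variations, sketched with illustrative names: map over a single sequence, map over sequences of unequal length (it stops when the shortest one runs out), and into with a different target collection.

```clojure
;; One sequence: the function receives one argument at a time.
(def squares (map #(* % %) [1 2 3 4]))
;; => (1 4 9 16)

;; Unequal lengths: map stops at the end of the shortest sequence.
(def sums (map + [1 2 3] [10 20]))
;; => (11 22)

;; into accepts any target collection, for example a set.
(def as-set (into #{} (map inc [1 1 2])))
;; => #{2 3}
```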

reduce walks down a collection, applying a function f of two arguments to the first two elements, then applying f to the result of that call and the next element, and so on. This is very useful for operations that process a sequence and return a single value. For example, I can sum a sequence:

  user=> (reduce + [1 2 3 4 5])
  15

Or find the max value of a sequence:

  user=> (reduce max [1 2 3 4 5])
  5
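
To watch reduce at work, it can help to look at the intermediate results. The sketch below assumes a Clojure recent enough to include reductions in clojure.core, and also shows the optional initial value that reduce accepts. The var names are mine.

```clojure
;; reductions returns every intermediate result that reduce would compute.
(def running-sums (reductions + [1 2 3 4 5]))
;; => (1 3 6 10 15)

;; With an initial value, f is first applied to that value and the first element.
(def total-from-100 (reduce + 100 [1 2 3 4 5]))
;; => 115
```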

Maps

Maps (hash tables in CL) can be iterated just like any other sequence type, bearing in mind that the function you pass in should expect a key/value pair. Given the following map of names to scores:

  user=> (def scores {:john 18, :jane 21, :jim 14})
  #'user/scores

I can find all the people who scored above 15:

  user=> (filter (fn [[k,v]] (> v 15)) scores)
  ([:jane 21] [:john 18])

Notice how the destructuring bind ([[k,v]]) makes it easy to bind k and v separately, without introducing a temporary variable pair that I don't really need.
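
Since filter hands back a sequence of key/value pairs, a couple of follow-up steps are common. This is a sketch with names of my own choosing: pour the pairs back into a map with into, or pull out just the keys.

```clojure
(def scores {:john 18, :jane 21, :jim 14})

;; Rebuild a map from the filtered key/value pairs.
(def high-scorers
  (into {} (filter (fn [[k v]] (> v 15)) scores)))
;; => a map containing :john 18 and :jane 21

;; Keep only the names; _ marks the half of the pair I am ignoring.
(def high-names
  (map key (filter (fn [[_ v]] (> v 15)) scores)))

;; vals gives the values as a sequence, ready for reduce.
(def total (reduce + (vals scores)))
;; => 53
```

The order of entries in a hash map is not something to rely on, so treat the result of high-names as a set of names rather than an ordered list.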

Wrapping up

Lisp excels at processing lists. Clojure offers similar capabilities, but generalized to sequences, which can be lists, vectors, maps, sets, or other list-like collections.

Clojure's support for lazy collections allows a different style for collection processing that I will continue to explore in later articles in this series.


Notes

Revision history

  • 2008/09/22: initial version
  • 2008/10/15: removed var-quoting per Douglas's comment. Thanks.
  • 2008/12/09: fixed filter erratum. Thanks Dean Ferreyra.

Sep 19 2008

Comments

PCL -> Clojure, Chapter 10

This article is part of a series describing a port of the samples from Practical Common Lisp (PCL) to Clojure. You will probably want to read the intro first.

This article covers Chapter 10, Numbers, Characters, and Strings.

Lisp with Java guts

Because Clojure is Java under the covers, you can always use Java's support for numbers, characters, and strings. The Java interop syntax is clean and simple, so it is idiomatic to call Java directly, rather than write wrappers to make code look more Lisp-like. A few examples follow:

  user=> (Math/pow 3 3)
  27.0

  user=> (.compareTo "a" "b")
  -1

  user=> (Character/toLowerCase \A)
  \a
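
The same three interop forms cover most day-to-day needs: (.method object args) for instance methods, (Class/method args) for static methods, and Class/FIELD for static fields. A short sketch, with var names that are only illustrative:

```clojure
;; Instance method call on a java.lang.String.
(def shout (.toUpperCase "hello"))
;; => "HELLO"

;; Static method call on java.lang.Math.
(def root (Math/sqrt 16))
;; => 4.0

;; Static method call that takes a character.
(def seven-is-digit (Character/isDigit \7))
;; => true

;; Static field access.
(def max-int Integer/MAX_VALUE)
;; => 2147483647
```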

Numbers are numbers

In Clojure, as in most Lisps, numbers are numbers. They don't do irritating things like overflow:

  user=> (* 1000000 1000000 1000000)
  1000000000000000000

Also, integer division is exact:

  user=> (/ 10 3)
  10/3
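
When I do want truncating division or a floating-point approximation, I have to ask for it. A minimal sketch, with names chosen for illustration:

```clojure
;; Division of integers yields a ratio when the result is not whole.
(def third (/ 10 3))
(def third-is-ratio (ratio? third))
;; => true

;; When the division is exact, the result is a plain integer.
(def whole (/ 10 5))
;; => 2

;; quot and rem give truncating division and the remainder.
(def truncated (quot 10 3))
;; => 3
(def remainder (rem 10 3))
;; => 1

;; double converts the ratio to a floating-point approximation.
(def approx (double third))
```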

Under the covers, Clojure's numeric representation switches Java types as necessary to do the right thing.

  user=> (class (* 1000 1000))
  java.lang.Integer
  user=> (class (* 1000000 1000000))
  java.lang.Long
  user=> (class (* 1000000 1000000 1000000))
  java.math.BigInteger
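
The classes printed above are what the Clojure build of the time reported. As far as I know, releases from 1.3 onward default to long arithmetic that throws on overflow instead of promoting, and offer promoting variants such as *' for the old behavior. A sketch under that assumption, with var names of my own:

```clojure
;; Assumption: Clojure 1.3 or later, where * on longs checks for overflow.
(def overflow-result
  (try
    (* Long/MAX_VALUE 2)
    (catch ArithmeticException _ :overflow)))
;; => :overflow

;; The quoted operator *' promotes to an arbitrary-precision integer instead.
(def promoted (*' Long/MAX_VALUE 2))
;; => 18446744073709551614N

(def promoted-class (class promoted))
;; => clojure.lang.BigInt
```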

Study the API docs

Clojure does a good job of balancing the purity of math (Lisp) and the practical reality of efficient representation (Java's primitives). But you still have to know your way around. Some things are wrapped in Lisp, and some things aren't. For example, numbers support the mathematical comparison operators, but strings use Java's compareTo:

  user=> (< 1 2)
  true
  user=> (< "a" "b")
  java.lang.ClassCastException: java.lang.String
  user=> (.compareTo "a" "b")
  -1
  user=> (.compareTo 1 2)
  -1
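
This is a case where checking the API docs pays off: clojure.core has a compare function that works on numbers and strings alike, and sort uses it by default. A small sketch, with illustrative var names:

```clojure
;; compare returns a negative number, zero, or a positive number.
(def num-order (compare 1 2))
(def str-order (compare "a" "b"))
(def same (compare "a" "a"))
;; => 0

;; sort relies on compare unless given another comparator.
(def sorted-names (sort ["jim" "jane" "john"]))
;; => ("jane" "jim" "john")
```

Only the sign of the result is meaningful, so test it with neg?, zero?, or pos? rather than comparing against -1 or 1.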

If you aren't sure whether there is a Lispy wrapper for some functionality, you can check the API docs, or just try it in the REPL.

Wrapping up

For numbers, characters, and strings, Clojure provides some of the trappings a Lisp programmer would expect, e.g. exact integer division. But under the covers, it's all Java. If you don't find what you need in the Clojure API, drop to Java using the interop syntax.


Notes

Popular Tags