Long hours driving the open road grant me that rare luxury: time to think about everything and nothing. I love freely associating ideas as the miles roll by. Over a brief vacation this past week, a few ideas collided in my head (somewhere near Story City, IA, if I recall correctly) and stuck together in an amusing way. I'd like to share that with you now---let's have some fun.
I was thinking about mutability, immutability, and systems design. I love immutability in the small. Pure functions over immutable arguments are so much easier to work with than squirmy, stateful goop.
When we use a wider lens, though, it's much less clear how to build full-scale systems that have the same nice properties. For instance, think about a retail system. A shopping cart is just inherently mutable, isn't it? You add stuff, remove stuff, change quantities---those are all mutations, right?
Turing's Shopping Cart
As it turns out, you can make an immutable shopping cart! The key idea is that the cart doesn't change on its own. Instead, every time you want a new cart you build it by applying a function to the old cart and some inputs. The result is a series of immutable values where each one is a valid cart at a point in time.
To see how this works, let's do some mental calisthenics with pure numbers. We start with a non-negative integer N. We define two functions:
A(N, I, Q) -> N' where I is a non-negative integer and Q is a strictly positive integer. R(N', I) -> N
We also have a few identities. (These identities aren't essential, but they help illustrate the properties of our functions.)
A(N, I, 0) = N R(A(N, I, Q), I) = N A(A(N, I1, Q1), I2, Q2) = A(A(N, I2, Q2), I1, Q1) (commutative) A(A(N, I, Q1), I, Q2) = A(N, I, Q1 + Q2) (distributive over Q)
I haven't yet said how A and R actually work, but we'll do that through a Turing-style coding trick. We can invent a coding system like this:
A(0, I, Q) = QQQIIIIIIIIIIII A(QQQIIIIIIIIIIII, J, R) = RRRJJJJJJJJJJJJQQQIIIIIIIIIIII
In other words, Q is zero-padded to three digits and I is zero-padded
to twelve digits (or 14, or 101... however many digits you desire.)
A essentially concatenates to the left, except for the
distributive property. That property gives us a different coding:
A(QQQIIIIIIIIIIII, I, R) = SSSIIIIIIIIIIII, where S = Q + R
R function almost does the reverse:
R(SSSIIIIIIIIIIII, I) = 0
R isn't quite the direct inverse of
A. It doesn't take a Q parameter. That's because I didn't want to put any conditional logic to deal with zero. Because
Q can only be positive,
A can only result in a positive number. There is no way to decrement or subtract so there's no way to get a negative quantity.
Some concrete examples might help:
A(0, 12345, 1) = 001000000012345 A(001000000012345, 12345, 1) = 002000000012345 A(002000000012345, 9876543, 3) = 00300009876543002000000012345 R(00300009876543002000000012345, 12345) = 00300009876543
It's evident that these functions are pure, right? All we are doing is manipulating numbers. There is no place for state to hide here!
I'm going to turn these numbers into shopping carts by whipping off the thin disguise:
Iis an item ID
Qis the quantity desired
Ameans to add an item
Rmeans to remove an item
So, the state of a shopping cart is just a number. You can interpret that number as a string of decimal digits, or as a single integer somewhere on the number line, or as a coded vector of [Quantity, Item] pairs. Manipulating the cart is really just finding and referring to a new number. The space of numbers (possible carts) is fixed, eternal, and infinite. (It is not, however, continuous! You can't pick an arbitrary number and expect it to be a cart.)
A cart is a value, with all the nice value semantics that offers: I can store it, retrieve it, pass it around, and cache it. The cart retains its integrity.
Notice that we don't need anything like a
CART table with related
CART_LINE_ITEMS or some such. Instead, we just have a number. That
number is sort of like an address for a point in the space of all
possible shopping carts. We can think about building an algebra of
shopping carts in that space, and even finding theorems about carts
that exist, but cannot be proven to exist.
We still have one remaining question, though. How would a store using this coding keep track of my specific cart? This gets to the question of identity versus state. The concept of "my cart" speaks to identity. I have a cart, but the state of that cart (the number) can change. Because the state is just a number, the simplest way to assign identity would just be a numeric field on my account record. Really, though, anything that can hold a number would be equally valid.
Is this a useful way to implement shopping carts? A long string of digits? Any data architects reading this post have probably thrown up on my shoes by now. Encoding multiple values into a single field is a cardinal sin for relational database design. There are some other practical limits, too:
- Some databases can't handle unlimited-length fields.
- The simplistic coding scheme here imposes a limit on quantities and a limit on the number of items that can be represented.
- A more subtle restriction is that this coding limits us to using a single catalog. All item identifiers must come from the same space of items. We could, of course, get around this by extending our coding system to include the namespace from which the item identifier is drawn.
Transmitting cart representations as long strings of digits may get unwieldy. We could use other coding schemes to increase the entropy-per-bit carried in the string. One way to improve that information density would be to use alphabetic characters instead of just numerals. We could also use lossless compression. Compression and decompression are pure functions, so they could be applied to representations without any semantic change.
I will also contend that there is no difference whatsoever between this purely numeric code and one which looks like:
[Q1 I1] [Q2 I2] ...
You can convince yourself of that by extending our coding scheme to include these delimiters. If we say that the "[" character is represented by 1111 and "]" is 9999, then the cart with 2 of item 12345 and 3 of item 987654 would be represented by the number:
I'll admit that is a mouthful. We should probably choose an easier coding. In fact, we can just choose ASCII or UTF-8 and say that "[" is 91 and "]" is 93.
It is worthwhile to think about problems in different ways. Our systems can gain many benefits by separating identity from state, and replacing squirmy state with a series of immutable values. Whether you adopt this notion of an "algebraic shopping cart" or not, I hope you've had fun contemplating another way to represent a familiar structure. At the very least, I hope you've had fun with the idea of an infinite vector space of virtual shopping carts.