Mar 11 2014Comments
Mar 03 2014Comments
Want to meet a Cognitect in person? Here's where you can find us during the month of March:
San Francisco, CA 3/21-3/23
Intro to Clojure, ClojureScript & Datomic Training
Trainers: Luke VanderHart, Stuart Sierra
San Francisco, CA 3/24-3/26
Organizers: Lynn Grogan, Alex Miller
Speakers: Timothy Baldridge, Stuart Sierra
Attending: Kim Foster, Justin Gehtland, Stu Halloway, Rich Hickey, Luke VanderHart, Alex Warr
Feb 25 2014Comments
Feb 17 2014Comments
I've been revising my DevOps at Five talk for GOTO Chicago. In it, I talk about the CAMS model for DevOps. The 'C' in CAMS stands for Culture, and it's always a hard one to define. Yes, we do say that DevOps is about culture not tools. But what kind of culture? I can say everything is a culture. Agile development is a culture. But so is waterfall development, or cowboy coding.
Every time we try to define the culture, we end up using words like communication, collaboration, and feedback. The problem is that those words don't differentiate DevOps from ITIL, TQM, or any other operational process since Deming.
Your culture is really defined by the things that would be unthinkable to say.
In that light, I'd like to offer some statements that you will never hear in a DevOps culture.
- That's not my job.
- (From a developer) Well, the tests all pass, so the problem must be in Operations.
- (From operations) I did everything right, it must be a code problem.
- Oh, production is down? I didn't know.
- I built the production environment by hand. There's no reason to automate it because we're only doing it once.
- How close is QA to production? I don't know.
- You need a new environment? Put in a service request and we'll see.
- Your service request was approved. You should have a new machine in a few weeks.
- Why do you need another terabyte of storage? You're already using a terabyte!
- Our DBAs won't let you install that server. We're a SQL Server shop.
- I don't know what our slowest query is.
- I don't know how many cores our production environment has.
I could go on, but I think you get the idea.
How many statements like these do you hear around your company?
Feb 11 2014Comments
Jan 31 2014Comments
Want to meet a Cognitect in person? Here's where you can find us during the month of February:
Jan 30 2014Comments
Jan 14 2014Comments
Dec 30 2013Comments
Dec 23 2013Comments
Ralph Waldo Emerson famously observed:
A foolish consistency is the hobgoblin of little minds.
Given that, I think that Mr. Emerson would be happy to know that "consistency"—the word—is not consistent. There are two separate technical meanings of "consistency," two meanings that technical people tend to conflate.
Consistency as Predicate
The older definition of the word comes from database researchers, especially including Jim Gray. He gave us locking, transactions, distributed transactions, consistency, and the ever-popular "ACID," all back when database was still spelled as "data base." I think we can regard him as authoritative.
In this definition, "consistency" is defined as a boolean function—a predicate—that takes the whole database as an input. When it returns "true", then the database is in a good state. To keep it that way, the server checks every transaction to see if it would make the predicate return false. If so, the server must reject that transaction.
We can regard this definition as having an "interior" perspective. That is, only the database server itself is able to apply the consistency predicate. Most databases do not have any way for an observer to get the complete value of the database at a point in time.
The idea of a consistency predicate also clarifies the relationship between consistency, atomicity, and isolation. The database server is temporarily permitted to break the consistency predicate while applying a transaction. (As an aside, I think that paper anticipates Datomic by 31 years.) However, it does not allow any outside observer to perceive the intermediate, unacceptable state. To external observers, the change appears atomic. Transactions do not commute, so the server also ensures that the effects of the transactions appear to obey some partial ordering. (Please note the very careful wording on that last sentence.) That's isolation.
Computationally, it is impractical to apply a general predicate over the entire database at the end of every transaction. Implementations use weaker forms of enforcement, typically syntactic constraints (i.e., CHAR(20) NOT NULL) and referential integrity. This moves from the notion of a consistency predicate to a lesser guarantee.
As an aside, using this weaker form of the predicate is one reason why we get data cleansing problems like negative ages and Little Bobby Tables. It's because we can't make a big table that has every possible human name, so we apply a weaker rule that says "any sequence of alphanumeric characters could be a name."
Consistency as History
There is a completely separate definition of "consistency" that comes from the distributed systems world. This definition of consistency has more to do with applying the same transactions in the same order at every node in a system. This is the meaning of "consistency" which is used in the CAP Theorem. In that paper, consistency means, "there must exist a total order on all operations such that each operation looks as if it were completed at a single instant." This definition of consistency is about the history of operations and their causality.
That kind of consistency—strict linearizability—is a very high bar. There are many ways to weaken it and exploit "loopholes". CRDTs also avoid the need for linearizability by requiring all operations to be commutative. With CRDTs, as long as every node has a permutation of the ordering of operations, they will still report the same resulting value.
One form of the word "consistency" talks about the state of the system at a given point in time. The other talks about the history of operations that produced the current state. Are they fully substitutable? No.
For instance, suppose the global consistency predicate says, "The atomic object V holds an integer." There are two operations on V:
dec which increment and decrement its value by one. The starting value is 0 and the representation can grow arbitrarily. Given that definition, we can have "consistency as predicate" without "consistency as history." That's because we can reorder those operations at will and still have the same value at the end. That means our predicate holds no matter what the history says. Two nodes that see a different history of operations are inconsistent by the second definition, but consistent by the first.
So it turns out that "consistency (predicate)" and "consistency (history)" are two distinct ideas that happen to share a word. It is always an error to substitute the distributed systems definition of "consistency" for the C in ACID.
The C in CAP is not the C in ACID.