Paul Snively home
A proof is a repeatable experiment in persuasion. — Jim Horning

100 Proof: Proof 1

Bradford Cross, co-founder and head of research for FlightCaster, has decided to do something very interesting: the 100 proof project. Since I am currently on a mathematical (re-)education kick myself and have tinkered around quite a bit with the Coq proof assistant in the context of programming language formalization, I've encouraged Bradford to use it in his developments. Bradford has agreed, so here's the beginning of the collaboration. It's very exciting!

MathJax and Theorem 1

Bradford's inaugural post in the series mulls over some issues in representation, terminology, "pencil and paper" proof process, and choice of tools that I'm going to completely ignore here, so please do follow the link above to his post. I am going to rely heavily on standard mathematical logical notation when discussing theorems and proofs generally, and Coq's excellent coqdoc system for embedding the proof scripts themselves. Since, unlike Bradford, I'm not a conkeror user and don't expect many people to be (sorry, Bradford) I've gone ahead and authored this page using the wonderful MathJax system. That means that you should see Bradford's first theorem in standard mathematical logical notation:

On Notation

One of my goals in this endeavor is to help reduce the mystique and intimidation that's unfortunately attached to the standard mathematical logical notation. I will be the first person to admit that for such a tiny, simple theorem, it's overkill. As our endeavors grow, however, it will hopefully become apparent that there is value in the directness of the mapping from syntax to semantics, the concision, and the precision that has evolved in the notation over the centuries.

For right now, all you need to know is that the funny upside-down "A" is pronounced "for all" and called the "universal quantifier." So the theorem begins "For all values of x and y in Z..." "Z" is one of the so-called "classical number domains," and to most people it's probably the most familiar one: the integers. Why the integers are called "Z" is outside the scope of this blog post.

Continuing on, "x > 3" and "y < 2" are hopefully still very familiar, but the vaguely tipi-looking symbol might not be. That's the standard notation for "and" in mathematical logic. It's tempting to say "whee, they saved two whole characters!" but the English word "and" actually does bring along some connotations that we really don't want just because we're thinking about what we formally call "conjunction" and "and" usually captures our intuition well enough for human language.

Finally, the double-lined right arrow symbol means "implies" or "yields."

Proof 1 in Coq

With that background, I can present the Coq proof script for Bradford's first theorem. Here it is:

"Require Import ZArith Psatz." reflects the fact that Coq has a sophisticated module system. In this instance, we're indicating that we expect to use integer arithmetic and "Positivstellensatz"—literally, "positivity theorem"—tactics in our proof script. "Open Scope Z_scope." tells Coq that we wish to use definitions in the scope defined by ZArith by default. If we didn't do that, we would be using Coq's default scope, which is over the domain N, the natural numbers.

Finally, we can state our theorem in Coq: "Theorem proof_1 : forall x y : Z, x > 3 -> y < 2 -> x * x - 2 * y > 5." As you can see, it's almost a literal transliteration of the formal statement above. We have the universal quantification of x and y in the integers, the familiar inequalities, "*" meaning multiplication, "-" subtraction, and so on. What's likely to seem a little weird at first is that the "" has been replaced with an arrow, making the Coq theorem seem to read "for all integers x and y, x > 3 implies y < 2 implies x^2 - 2y > 5." What's interesting is that you can, in fact, read the theorem that way! Another reasonable interpretation of it is "for all integers x and y, if x > 3 then if y < 2 then x^2 - 2y > 5." This latter interpretation might be closest to the formal statement with its "," but it's still a change. Why?

The Curry-Howard Isomorphism

The answer relates to something Bradford mentioned in his post: the Curry-Howard Isomorphism. The slogan usually associated with the Curry-Howard Isomorphism is "types are theorems and programs are proofs." A lot of people tend to treat the word "isomorphism" as if it's just a pretentious way of saying "analogy," but it's actually a very precise notion, and Coq is based on it. It's important to note that the theorems and proofs covered by the isomorphism are in what's called "Constructivist" or "Intuitionistic" logic. Constructivists are all from Missouri, the "show me" state: there's no law of the excluded middle. You have a proof if and only if, given a proof of the premises, you can provide an algorithm (program, proof) to "construct" the conclusion from them. The most obvious consequence of the isomorphism, then, is that the arrow of logical implication and the arrow type of functional programming are the same thing. So you can read the theorem as a function type as in Standard ML, OCaml, Haskell, Scala... that takes an "x > 3" and a "y < 2", and returns a "x^2 - 2y > 5."

Wait, what?

Dependent Types

Aren't "x > 3," "y < 2," and "x^2 - 2y > 5" expressions, not types?

In Coq, they can be both. In all statically-typed languages, there are two basic constructs that we work with: types and terms. We say that types and terms can "index" each other in various ways to give us the features that we've come to expect in our programming languages. Here's a table showing those intersections and what they're called in various programming languages:

indexed by:
termtype
termfunctionsinheritance, overloading
type?templates, generics

Terms indexed by other terms are so dirt common they have their own name: functions. The reason that might sound a little funny is just that we're accustomed to thinking of "indexing" as something that you do to data structures. PLT (Programming Language Theory) geeks like myself, though, have deeply internalized Lisp's great insight that there's no difference between programs and data, so to us, a function is just a term indexed by another term. This is even starting to show up in the mainstream, as I discussed in 100 Proof: Proof 1 It's easy to forget that the word "function" wasn't even used this way until Leibniz used it in a letter in the 17th century.

Terms indexed by types have been around for a long time now, too, in one form or another. Object-oriented programmers will recognize them in overriden methods; C++ programmers will recognize them in overloaded operators; CLOS programmers will recognize them in generic functions; and so on.

Types indexed by other types—"templates" or "generics"—are still considered somewhat esoteric in the mainstream, mostly, I think, due to the bizarre limitations in the type language in C++ and the fact that generics were retrofitted, with some design pain, onto Java quite late in its life. The concept, "parametric polymorphism," had been implemented well, however, since Standard ML in 1975. Thankfully, even with their limitations, templates and generics are bringing the concept to a broader audience, and better implementations such as Scala's can only help in this regard.

This brings us to the "?" entry, types indexed by terms, or "dependent types." Straightforward examples of the concept include things like the "string with length > 0" type, or "array of size 4 x 5" type. As we're seeing here, however, the concept is extremely general, and thanks to the Curry-Howard Isomorphism, extremely powerful.

Finally we can discuss the proof script proper: "intros. psatz Z 2. Qed." "intros" is a tactic that "introduces" as many things from the current proof goal as hypotheses as possible. If you were to try this out in Coq, via coqtop or an IDE such as Proof General, you'd see that your hypotheses include that x and y are integers, that x > 3, and that y < 2, leaving your goal as "x * x - 2 * y > 5." An interesting implication of this is that universal quantification and logical implication are, in some fundamental way, "the same thing" such that "intros" eliminates them both from the goal and introduces them as assumptions.

"psatz Z 2" is the tactic that does the work. Kind of. It actually requires that you have csdp installed. If you do, then after you use this tactic, Coq will say "Proof completed." "Qed" is, in a sense, the icing on the cake. It's how you acknowledge that you're happy with the proof, and its effect is to make the theorem visible in the scope so that, e.g. it can be used to prove still other things. In our case it's just an excuse to use the old Latin acronym.

Is That All There Is? (With Apologies to Peggy Lee)

Yes and no.

Yes: that really is the entire proof script. You can now have as much confidence as you've ever had in anything in your life that the theorem is true.

No: you might wonder how I can say that you can have as much confidence as you've ever had in anything in your life that the theorem is true. After all, we've relied heavily on a piece of software to prove it, and software has bugs all the time. Where does the confidence come from?

Part of the answer revolves around that "csdp" program that you need to have installed. The psatz tactic, which takes a domain and a search-depth as its parameters, packages up the hypotheses and proof goal and hands them to this semi-definite programming (the "sdp" in "csdp") procedure to search for a proof. This is the part that Bradford is referring to when he says "let's not prove arithmetic operations on inequalities." csdp does that heavy lifting for us here and returns the result to Coq, which dutifully tells us that the proof is completed.

Can I Get a Witness? (With Apologies to The Drifters)

The key point, though, is that csdp isn't treated as what's called an "oracle" that just returns "true" or "false." Instead, csdp returns what's called a "proof witness." The psatz tactic, which translated our hypotheses and goal into a form that csdp could digest, also translates the proof witness back into a Coq proof term, which is then checked by Coq's proof checker. The secret ingredient, then, is that, although constructing a proof term automatically might be brutally difficult, taking a lot of CPU time and even external specialized (semi)decision procedures, checking a proof term in Coq's core logic is so simple that it can be done with a checker that's a few hundred lines of OCaml.

The de Bruijn Criterion

A proof checker of a few hundred lines of OCaml can be audited by any competent OCaml programmer. Theorem provers with checkers small and simple enough to be realistically human-audited are said to satisify the "de Bruijn criterion," named after Dutch mathematician Niklaus de Bruijn, who developed the AUTOMATH theorem prover in 1974. The point is that, instead of having to audit every proof developed with the theorem prover, which may not even be feasible, you can audit the checker once (or, for that matter, as many times as you like, with as many eyeballs as you can persuade to help). Once you're satisfied with the logic it implements and that it implements it faithfully, you're done. If you wonder why a human being should trust the logic that the checker implements, please refer to the quote at the top of the page. :-)

What if OCaml Has a Bug?

You could run afoul of a bug in OCaml, strictly speaking. That's one reason why there's ongoing work to formalize OCaml and build certified compilers. In the meantime, OCaml is a very mature system that has had a lot of code developed with it for over 15 years, so its quality tends to be assured by the other known effective methodology: standing the test of time.

Finis

That brings us to the end of my first installment in the 100 Proof project. I'd like to thank Bradford for encouraging me to collaborate with him on it, the Coq developers for such a wonderful tool, the C-CoRN repository for tackling a lot of proofs that I know Bradford and I will be making use of later in the series, and you for reading. Until next time!

Comments

blog comments powered by Disqus
© 2003-2009 Paul Snively
Fork me on GitHub