A proof is a repeatable experiment in persuasion. — Jim Horning
100 Proof: Proof 1
Bradford Cross, co-founder and head
of research for
FlightCaster, has decided
to do something very interesting: the
100 proof
project. Since I am currently on a mathematical (re-)education kick myself and
have tinkered around quite a bit with the
Coq
proof assistant in the context of programming language formalization, I've
encouraged Bradford to use it in his developments. Bradford has agreed, so
here's the beginning of the collaboration. It's very exciting!
MathJax and Theorem 1
Bradford's inaugural post in the series mulls over some issues in representation,
terminology, "pencil and paper" proof process, and choice of tools that I'm
going to completely ignore here, so please do follow the link above to his post.
I am going to rely heavily on standard mathematical logical notation when
discussing theorems and proofs generally, and Coq's excellent coqdoc system
for embedding the proof scripts themselves. Since, unlike Bradford, I'm not a
conkeror user and don't expect many people to be (sorry, Bradford) I've gone
ahead and authored this page using the wonderful
MathJax
system. That means that you should see Bradford's first theorem in standard
mathematical logical notation:
On Notation
One of my goals in this endeavor is to help reduce the mystique and
intimidation that's unfortunately attached to the standard mathematical
logical notation. I will be the first person to admit that for such a tiny,
simple theorem, it's overkill. As our endeavors grow, however, it will
hopefully become apparent that there is value in the directness of the
mapping from syntax to semantics, the concision, and the precision that has
evolved in the notation over the centuries.
For right now, all you need to know is that the funny upside-down "A" is
pronounced "for all" and called the "universal quantifier." So the theorem
begins "For all values of x and y in Z..." "Z" is one of the so-called
"classical number domains," and to most people it's probably the most familiar
one: the integers. Why the integers are called "Z" is outside the scope of
this blog post.
Continuing on, "x > 3" and "y < 2" are hopefully still very familiar, but the
vaguely tipi-looking symbol might not be. That's the standard notation for "and"
in mathematical logic. It's tempting to say "whee, they saved two whole
characters!" but the English word "and" actually does bring along some
connotations that we really don't want just because we're thinking about what
we formally call "conjunction" and "and" usually captures our intuition well
enough for human language.
Finally, the double-lined right arrow symbol means "implies" or "yields."
Proof 1 in Coq
With that background, I can present the Coq proof script for Bradford's first
theorem. Here it is:
"Require Import ZArith Psatz." reflects the fact that Coq has a sophisticated module system.
In this instance, we're indicating that we expect to use integer arithmetic and
"Positivstellensatz"—literally, "positivity theorem"—tactics in our
proof script. "Open Scope Z_scope." tells Coq that we wish to use definitions in
the scope defined by ZArith by default. If we didn't do that, we would be using
Coq's default scope, which is over the domain N, the natural numbers.
Finally, we can state our theorem in Coq:
"Theorem proof_1 : forall x y : Z, x > 3 -> y < 2 -> x * x - 2 * y > 5." As you
can see, it's almost a literal transliteration of the formal statement above.
We have the universal quantification of x and y in the integers, the familiar
inequalities, "*" meaning multiplication, "-" subtraction, and so on. What's
likely to seem a little weird at first is that the "" has been replaced with
an arrow, making the Coq theorem seem to read "for all integers x and y, x > 3
implies y < 2 implies x^2 - 2y > 5." What's interesting is that you can, in fact,
read the theorem that way! Another reasonable interpretation of it is "for all
integers x and y, if x > 3 then if y < 2 then x^2 - 2y > 5." This latter
interpretation might be closest to the formal statement with its "," but
it's still a change. Why?
The Curry-Howard Isomorphism
The answer relates to something Bradford mentioned in his post: the
Curry-Howard Isomorphism. The slogan usually associated with the Curry-Howard
Isomorphism is "types are theorems and programs are proofs." A lot of people
tend to treat the word "isomorphism" as if it's just a pretentious way of
saying "analogy," but it's actually a very precise notion, and Coq is based on
it. It's important to note that the theorems and proofs covered by the
isomorphism are in what's called "Constructivist" or "Intuitionistic" logic.
Constructivists are all from Missouri, the "show me" state: there's no law of
the excluded middle. You have a proof if and only if, given a proof of the
premises, you can provide an algorithm (program, proof) to "construct" the
conclusion from them. The most obvious consequence of the isomorphism, then,
is that the arrow of logical implication and the arrow type of functional
programming are the same thing. So you can read the theorem as a function type
as in Standard ML, OCaml, Haskell, Scala... that takes an "x > 3" and a
"y < 2", and returns a "x^2 - 2y > 5."
Wait, what?
Dependent Types
Aren't "x > 3," "y < 2," and "x^2 - 2y > 5" expressions, not types?
In Coq, they can be both. In all statically-typed languages, there are two
basic constructs that we work with: types and terms. We say that types and
terms can "index" each other in various ways to give us the features that we've
come to expect in our programming languages. Here's a table showing those
intersections and what they're called in various programming languages:
| indexed by: |
| term | type |
| term | functions | inheritance, overloading |
| type | ? | templates, generics |
Terms indexed by other terms are so dirt common they have their own name: functions.
The reason that might sound a little funny is just that we're accustomed to thinking
of "indexing" as something that you do to data structures. PLT (Programming
Language Theory) geeks like myself, though, have deeply internalized Lisp's great
insight that there's no difference between programs and data, so to us, a function
is just a term indexed by another term. This is even starting to show up in the
mainstream, as I discussed in
100 Proof: Proof 1
It's easy to forget that the word "function" wasn't even used this way until
Leibniz used it in a letter in the 17th century.
Terms indexed by types have been around for a long time now, too, in one form
or another. Object-oriented programmers will recognize them in overriden methods;
C++ programmers will recognize them in overloaded operators; CLOS programmers
will recognize them in generic functions; and so on.
Types indexed by other types—"templates" or "generics"—are still
considered somewhat esoteric in the mainstream, mostly, I think, due to the
bizarre limitations in the type language in C++ and the fact that generics were
retrofitted, with some design pain, onto Java quite late in its life. The
concept, "parametric polymorphism," had been implemented well, however, since
Standard ML in 1975. Thankfully, even with their limitations, templates and
generics are bringing the concept to a broader audience, and better implementations
such as Scala's can only help in this regard.
This brings us to the "?" entry, types indexed by terms, or "dependent types."
Straightforward examples of the concept include things like the "string with
length > 0" type, or "array of size 4 x 5" type. As we're seeing here, however,
the concept is extremely general, and thanks to the Curry-Howard Isomorphism,
extremely powerful.
Finally we can discuss the proof script proper: "intros. psatz Z 2. Qed."
"intros" is a tactic that "introduces" as many things from the current proof
goal as hypotheses as possible. If you were to try this out in Coq, via coqtop
or an IDE such as
Proof General,
you'd see that your hypotheses include that x and y are integers, that x > 3,
and that y < 2, leaving your goal as "x * x - 2 * y > 5." An interesting
implication of this is that universal quantification and logical implication
are, in some fundamental way, "the same thing" such that "intros" eliminates
them both from the goal and introduces them as assumptions.
"psatz Z 2" is the tactic that does the work. Kind of. It actually requires
that you have
csdp installed.
If you do, then after you use this tactic, Coq will say "Proof completed."
"Qed" is, in a sense, the icing on the cake. It's how you acknowledge that
you're happy with the proof, and its effect is to make the theorem
visible in the scope so that, e.g. it can be used to prove still other things.
In our case it's just an excuse to use the old Latin acronym.
Is That All There Is? (With Apologies to Peggy Lee)
Yes and no.
Yes: that really is the entire proof script. You can now have as much
confidence as you've ever had in anything in your life that the theorem is
true.
No: you might wonder how I can say that you can have as much confidence as
you've ever had in anything in your life that the theorem is true. After all,
we've relied heavily on a piece of software to prove it, and software has bugs
all the time. Where does the confidence come from?
Part of the answer revolves around that "csdp" program that you need to have
installed. The psatz tactic, which takes a domain and a search-depth as its
parameters, packages up the hypotheses and proof goal and hands them to this
semi-definite programming (the "sdp" in "csdp") procedure to search for a proof.
This is the part that Bradford is referring to when he says
"let's not prove arithmetic operations on inequalities." csdp does that heavy
lifting for us here and returns the result to Coq, which dutifully tells us
that the proof is completed.
Can I Get a Witness? (With Apologies to The Drifters)
The key point, though, is that csdp isn't treated as what's called an "oracle"
that just returns "true" or "false." Instead, csdp returns what's called a
"proof witness." The psatz tactic, which translated our hypotheses and goal
into a form that csdp could digest, also translates the proof witness back
into a Coq proof term,
which is then checked by Coq's proof checker.
The secret ingredient, then, is that, although
constructing a proof
term automatically might be brutally difficult, taking a lot of CPU time and
even external specialized (semi)decision procedures,
checking a proof
term in Coq's core logic is so simple that it can be done with a checker that's
a few hundred lines of OCaml.
The de Bruijn Criterion
A proof checker of a few hundred lines of OCaml can be audited by any competent
OCaml programmer. Theorem provers with checkers small and simple enough to be
realistically human-audited are said to satisify the "de Bruijn criterion,"
named after Dutch mathematician Niklaus de Bruijn, who developed the AUTOMATH
theorem prover in 1974. The point is that, instead of having to audit every
proof developed with the theorem prover, which may not even be feasible,
you can audit the checker once (or, for that matter, as many times as you like,
with as many eyeballs as you can persuade to help). Once you're satisfied with
the logic it implements and that it implements it faithfully, you're done. If
you wonder why a human being should trust the logic that the checker implements,
please refer to the quote at the top of the page. :-)
What if OCaml Has a Bug?
You could run afoul of a bug in OCaml, strictly speaking. That's one reason why
there's ongoing work to
formalize OCaml
and build
certified compilers. In the
meantime, OCaml is a very mature system that has had a lot of code developed
with it for over 15 years, so its quality tends to be assured by the other
known effective methodology: standing the test of time.
Finis
That brings us to the end of my first installment in the 100 Proof project. I'd
like to thank Bradford for encouraging me to collaborate with him on it, the Coq
developers for such a wonderful tool, the C-CoRN repository for tackling a lot
of proofs that I know Bradford and I will be making use of later in the series,
and you for reading. Until next time!
Comments
blog comments powered by Disqus