Proofs

Reasons to be concerned

Do you think that proofs are correct if they're published in refereed venues? Or if people rely on the proofs for real-world applications, such as cryptography? For various counterexamples, see my list of proof errors.

If you see people claiming that proof errors are rare, try asking "How rare? 20% of proofs are wrong? 40% of proofs are wrong? Where's the evidence? How often do you think proofs have been carefully checked? How often do you think the errors that have been found have been announced? What fraction of those announcements have come to your attention?"

I have a paper giving reasons to believe that it is feasible to convincingly determine the actual error rate of a pool of proof papers. I've started executing this plan for a pool that interests me, namely recent papers on algorithms to attack cryptographic systems.

Catching proof errors

There are various systems for computers to check every step in a proof. Examples, along with links to cautionary notes about past soundness bugs (bugs allowing false statements to be "proven") for some of the systems:

Coq (new name: Rocq). There is a long list of Coq bugs labeled as "critical".
HOL4.
HOL Light.
Isabelle/HOL. A soundness bug was fixed in 2006. Another soundness bug was fixed in 2016.
Lean. A soundness bug was fixed in 2021. A soundness bug was fixed in 2023. Another soundness bug was fixed in 2023. A soundness bug was fixed in 2025.
Metamath.
Mizar. Some soundness bugs were discovered in 2023.

One way to see what theorem statements look like in these systems is to follow links from Freek Wiedijk's page "Formalizing 100 theorems".

See my report "Papers with computer-checked proofs" for four examples of papers where I posted computer-checked proofs before paper submission (one of those papers isn't even online yet), taking a few weeks per paper. The report describes in considerable detail where the time is going, and in particular explains why looking at line counts for computer-checked proofs is misleading.

For one paper, I wrote computer-checked proofs in Lean over a period of 15 days, and then computer-checked proofs of the same theorems in HOL Light over a period of 12 days. The report looks at how various components of the work differed between Lean and HOL Light. I used HOL Light for the other papers.

I've spent much more time over the years manually checking proof steps, although that's also covering many more papers.

The rest of this page has notes on my work checking proofs. Most of my computer-checked proofs are mentioned here, but only a small fraction of my manually checked proofs are mentioned here.

Transcendence of pi etc.

I wrote proofs in HOL Light of the transcendence of e and pi. I started on 2 November 2024 and posted the proofs on 22 November 2024. I came back to this in December 2024 to prove a more general theorem due to Lindemann (often miscredited to Weierstrass); I posted that proof on 21 December 2024.

For the core of the proof, I followed a strategy from a 5-page 1990 paper "An alternative proof of the Lindemann-Weierstrass theorem" by Beukers, Bezivin, and Robba. For some necessary facts about symmetrization of weighted sums of exponentials, I streamlined a strategy from a 12-page 1952 paper "Analytic proof of the Lindemann theorem" by Steinberg and Redheffer. The proofs that correspond to what I explained to HOL Light occupy 3.5 pages in the 1990 paper, plus about 5 pages in the 1952 paper, plus various background facts about polynomials and (formal) power series, such as the fact that the derivative of a rational function can't have an order-1 pole.

The goppadecoding paper

Daniel J. Bernstein, "Understanding binary-Goppa decoding", ID 8561e73dab75d01a6dd6bf542594ddac03cdbe6e:

Theorem 2.1.2: Appendix C.6 matches this up to interpolator_main in HOL Light and interpolator_main in Lean.
Theorem 3.1.2: Appendix C.6 matches this up to small_approximant_exists in HOL Light and small_approximant_exists in Lean.
Theorem 3.1.3: Appendix C.6 matches this up to approximant_best in HOL Light and approximant_best in Lean.
Theorem 4.1.2: Appendix C.6 matches this up to interpolation_with_errors in HOL Light and interpolation_with_errors in Lean.
Theorem 4.1.3: Appendix C.6 matches this up to checking_interpolation_with_errors in HOL Light and checking_interpolation_with_errors in Lean.
Theorem 5.1.2: Appendix C.6 matches this up to goppa_decoding in HOL Light and goppa_decoding in Lean.
Theorem 5.1.3: Appendix C.6 matches this up to goppa_checking in HOL Light and goppa_checking in Lean.
Theorem 5.4.1: Appendix C.6 matches this up to goppa_parity in HOL Light and goppa_parity in Lean.
Theorem 6.1.1: Appendix C.6 matches this up to goppa_squaring in HOL Light and goppa_squaring in Lean.
Theorem 7.1: Appendix C.6 matches this up to goppa_checking_2 in HOL Light and goppa_checking_2 in Lean.

I started writing Lean proofs on 12 July 2023 and posted them on 26 July 2023. I started writing HOL Light proofs on 31 July 2023 and posted them on 11 August 2023.

Computing gcd

Paper not posted yet. Main theorem says that if 0 ≤ g ≤ f ≤ 2^b and 9437b+1 ≤ 4096m then m divsteps starting from (1/2,f,g) produce g_m = 0. See https://github.com/jrh13/hol-light/tree/master/Divstep for a proof checked by HOL Light.

I started writing the computer-checked proof of this theorem on 23 March 2023, finished on 6 April 2023, and (after adding a few minor tweaks) posted the proof on 16 April 2023. I had done some work before that on techniques that, given any particular m, generate computer-checked proofs for m; I started writing such proofs on 25 January 2021, posted examples on 31 January 2021, and came back to those techniques on 18 March 2023.

The latticeasymp paper

Daniel J. Bernstein, "Asymptotics for the standard block size in primal lattice attacks: second order, formally verified", ID 22882984e77307356f7598b7ac1c434d8f31613f:

Theorem 1.2.1(1): Appendix A.3 traces how this is a special case of forward_main, which has a proof checked by HOL Light.
Theorem 1.2.1(2): Appendix A.3 traces how this is a special case of converse_main, which has a proof checked by HOL Light.

I started writing the computer-checked proof on 21 February 2023 and finished on 15 March 2023. I then added Appendix A to the paper, and posted everything on 17 March 2023.

The controlbits paper

Daniel J. Bernstein, "Verified fast formulas for control bits for permutation networks", ID 782136154c6ea5bb93257e2a5f667a579df7a48e:

Theorem 2.2: cycle_shift in HOL Light.
Theorem 2.4: fastcyclemin_part2powcycle in HOL Light.
Theorem 2.5: fastcyclemin_works in HOL Light.
Theorem 3.1: xor1_lteven and xor1_div2 in HOL Light.
Theorem 3.3: xif_fixesge and xif_involution in HOL Light.
Theorem 4.2: XbackXforth_fixesge and XbackXforth_bijection in HOL Light.
Theorem 4.3: XbackXforth_powpow_xor1 and XbackXforth_powpow_avoids_xor1 and XbackXforth_cycle_size in HOL Light.
Theorem 4.4: XbackXforth_cyclemin_xor1 in HOL Light.
Theorem 5.2: firstcontrol_0 in HOL Light.
Theorem 5.3: firstcontrol_parity in HOL Light.
Theorem 5.4: lastcontrol_first in HOL Light.
Theorem 5.5: middleperm_fixesge and middleperm_bijection and middleperm_parity in HOL Light.

I started writing the computer-checked proofs on 9 September 2020, and posted the paper with computer-checked proofs on 23 September 2020.

The smallheight paper

Daniel J. Bernstein, "Reducing lattice bases to find small-height values of univariate polynomials", ID 82f82c041b7e2bdce94a5e1f94511773:

Theorem 2.1, version 2006.05.31: Proof manually checked 2006.05.31. Version 2007.07.27: Equivalence to version 2006.05.31 manually checked 2007.07.27.
Theorem 2.2, version 2006.05.31: Proof manually checked 2006.05.31. Version 2007.07.27: Equivalence to version 2006.05.31 manually checked 2007.07.27.
Theorem 2.3, version 2006.05.31: Proof manually checked 2006.05.31. Version 2007.07.27: Equivalence to version 2006.05.31 manually checked 2007.07.27.
Theorem 2.4, version 2006.05.31: Proof manually checked 2006.05.31. Version 2007.07.27: Equivalence to version 2006.05.31 manually checked 2007.07.27.
Theorem 2.5, version 2006.05.31: Proof manually checked 2006.05.31. Version 2007.07.27: Equivalence to version 2006.05.31 manually checked 2007.07.27.
Theorem 2.6, version 2006.05.31: Proof manually checked 2006.05.31. Version 2007.07.27: Equivalence to version 2006.05.31 manually checked 2007.07.27.
Theorem 3.1, version 2006.05.31: Proof manually checked 2006.05.31. Version 2007.07.27: Equivalence to version 2006.05.31 manually checked 2007.07.27.
Theorem 3.2, version 2006.05.31: Proof manually checked 2006.05.31. Version 2007.07.27: Equivalence to version 2006.05.31 manually checked 2007.07.27.
Theorem 3.3, version 2006.05.31: Proof manually checked 2006.05.31. Version 2007.07.27: Equivalence to version 2006.05.31 manually checked 2007.07.27.
Theorem 3.4, version 2006.05.31: Proof manually checked 2006.05.31. Version 2007.07.27: Equivalence to version 2006.05.31 manually checked 2007.07.27.
Theorem 3.5, version 2006.05.31: Proof manually checked 2006.05.31. Version 2007.07.27: Equivalence to version 2006.05.31 manually checked 2007.07.27.
Theorem 3.6, version 2006.05.31: Proof manually checked 2006.05.31. Version 2007.07.27: Equivalence to version 2006.05.31 manually checked 2007.07.27.
Theorem 3.7, version 2006.05.31: Proof manually checked 2006.05.31. Version 2007.07.27: Equivalence to version 2006.05.31 manually checked 2007.07.27.
Theorem 3.8, version 2006.05.31: Proof manually checked 2006.05.31. Version 2007.07.27: Equivalence to version 2006.05.31 manually checked 2007.07.27.
Theorem 4.1, version 2006.05.31: Proof manually checked 2006.05.31. Version 2007.07.27: Equivalence to version 2006.05.31 manually checked 2007.07.27.
Theorem 4.2, version 2006.05.31: Proof manually checked 2006.05.31. Version 2007.07.27: Equivalence to version 2006.05.31 manually checked 2007.07.27.
Theorem 4.3, version 2006.05.31: Proof manually checked 2006.05.31. Version 2007.07.27: Equivalence to version 2006.05.31 manually checked 2007.07.27.
Theorem 4.4, version 2006.05.31: Proof manually checked 2006.05.31. Version 2007.07.27: Equivalence to version 2006.05.31 manually checked 2007.07.27.
Theorem 5.1, version 2006.05.31: Proof manually checked 2006.05.31. Version 2007.07.27: Equivalence to version 2006.05.31 manually checked 2007.07.27.
Theorem 5.2, version 2006.05.31: Proof manually checked 2006.05.31. Version 2007.07.27: Equivalence to version 2006.05.31 manually checked 2007.07.27.
Theorem 5.3, version 2006.05.31: Proof manually checked 2006.05.31. Version 2007.07.27: Equivalence to version 2006.05.31 manually checked 2007.07.27.
Theorem 5.4, version 2006.05.31: Proof manually checked 2006.05.31. Version 2007.07.27: Equivalence to version 2006.05.31 manually checked 2007.07.27.
Theorem 6.1, version 2005.10.09 = 2006.05.31: Proof manually checked 2005.10.09. Version 2007.07.27: Equivalence to version 2006.05.31 manually checked 2007.07.27.
Theorem 6.2, version 2005.10.09 = 2006.05.31: Proof manually checked 2005.10.09. Version 2007.07.27: Equivalence to version 2006.05.31 manually checked 2007.07.27.
Theorem 6.3, version 2005.10.09 = 2006.05.31: Proof manually checked 2005.10.09. Version 2007.07.27: Equivalence to version 2006.05.31 manually checked 2007.07.27.
Theorem 6.4, version 2005.10.09 = 2006.05.31: Proof manually checked 2005.10.09. Version 2007.07.27: Equivalence to version 2006.05.31 manually checked 2007.07.27.
Theorem 7.1, version 2006.05.31: Proof manually checked 2006.05.31. Version 2007.07.27: Equivalence to version 2006.05.31 manually checked 2007.07.27.
Theorem 7.2, version 2006.05.31: Proof manually checked 2006.05.31. Version 2007.07.27: Equivalence to version 2006.05.31 manually checked 2007.07.27.
Theorem 7.3, version 2006.05.31: Proof manually checked 2006.05.31. Version 2007.07.27: Equivalence to version 2006.05.31 manually checked 2007.07.27.
Theorem 7.4, version 2006.05.31: Proof manually checked 2006.05.31. Version 2007.07.27: Equivalence to version 2006.05.31 manually checked 2007.07.27.

The curve25519 paper

Daniel J. Bernstein, "Curve25519: new Diffie-Hellman speed records", ID 4230efdfa673480fc079449d90f322c0:

Primality of p = 2^255 - 19: Pocklington-style proof manually checked 2006.02.09, using primes 127, 353, 75707, 8574133, 1919519569386763, 75445702479781427272750846543864801, 74058212732561358302231226437062788676166966415465897661863160754340907, with gp for underlying calculations.
Non-squareness of 2 and 486662^2-4 in F_p: Checked 2006.02.09 by gp.
Primality of p1 = 2^252 + 27742317777372353535851937790883648493: Pocklington-style proof manually checked 2006.02.09, using primes 531581, 1224481, 1257559732178653, 4434155615661930479, 172054593956031949258510691, 19757330305831588566944191468367130476339, 276602624281642239937218680557139826668747, with gp for underlying calculations.
Base point (9,...) has order p1: Checked 2006.02.09 by gp with code e=ellinit([0,486662,0,1,0]); u=Mod(9,p); b=[u,sqrt(u^3+486662*u^2+u)]; ellpow(e,b,p1)==[0].
Primality of p2 = (p+1)/2 - 2 p1 = 2^253 - 55484635554744707071703875581767296995: Pocklington-style proof manually checked 2006.02.09, using primes 743104567, 1013266244677, 203852586375664218368381551393371968928013, 104719073621178708975837602950775180438320278101, 27413359092552162435694767700453926735143482401279781, with gp for underlying calculations.
(p+1-8p1)^2-4p is divisible by 8312956054562778877481: Checked 2006.02.09 by gp.
Order of p mod p1 is (p1-1)/6: Checked 2006.02.09 by gp.
Order of p mod p2 is (p2-1)/1: Checked 2006.02.09 by gp.
Doubling formulas in version 2006.02.09 produce lambda^2 - A - 2x = (x^2-1)^2/4y^2 and y''^2 = x''^3 + A x''^2 + x'': Verified 2006.02.09.
Addition formulas in version 2006.02.09 produce y''^2 = x''^3 + A x''^2 + x'': Verified 2006.02.09.
Theorem 2.1, version 2006.02.09: Proof manually checked 2006.02.09.
Theorem B.1, version 2006.02.09: Proof manually checked 2006.02.09.
Theorem B.2, version 2006.02.09: Proof manually checked 2006.02.09.

Some other verification work on this topic: 2017 Russinoff gave an ACL2-checked proof that an elliptic curve is a group under the traditional addition law.