D. J. Bernstein
Proofs
Proof errors
Do you think that proofs are correct if they're published in refereed venues?
Or if people rely on the proofs for real-world applications, such as cryptography?
Here are some random examples of reasons to be concerned:
-
2003 Gen Nakamura, Gunther Uhlmann:
retraction of a theorem published in 1994 in Inventiones Mathematicae.
"Unfortunately, we have not been able to prove the global result stated in [NU1]."
-
2007
Semyon Alesker:
retraction of a theorem published in 1999 in Annals of Mathematics.
"We do not know if Theorem A is true".
-
2008 Daniel J. Bernstein:
pointing out an error in a 1999 proof published by Kaoru Kurosawa and Wakaha Ogata.
The history here illustrates how broken the pre-publication reviewing process is.
-
2008 Theodore P. Hill:
pointing out various counterexamples to a 2006 paper published by Steven J. Brams, Michael A. Jones, and Christian Klamler.
This time the
history illustrates
how broken the post-publication correction process is.
-
2009 Daniel Biss and Benson Farb:
retraction of a theorem published in 2005 in Inventiones Mathematicae;
"should be considered an open problem".
-
2012 Gwern Branwen (also updated after 2012):
various error examples, references, and analysis.
-
2014 Vladimir Voevodsky:
"A technical argument by a trusted author,
which is hard to check and looks similar to arguments known to be correct, is hardly ever checked in detail."
Various supporting examples of proof errors.
-
2015 Daniel J. Bernstein:
pointing out an error in a 2002 proof published by Steven Galbraith, John Malone-Lee, and Nigel Smart.
-
2017 Daniel Biss:
retraction of a 2002 paper.
-
2018 Akiko Inoue and Kazuhiko Minematsu:
pointing out an error in a 2004 proof published by Phillip Rogaway.
"We present practical attacks against OCB2, an ISO-standard authenticated
encryption (AE) scheme."
-
2018 Daniel J. Bernstein and Edoardo Persichetti:
"This appendix gives counterexamples to [20, Theorem 3.6] and [20, Theorem 3.5]"
under the "plausible assumption that there exists some efficient correct deterministic PKE with OW-Passive security".
"[20]" is a 2017 paper published by Dennis Hofheinz, Kathrin Hövelmanns, and Eike Kiltz;
that paper was updated online in 2021 to include an erratum.
-
2019 Neal Koblitz and Alfred Menezes
(also updated after 2019):
survey of many known proof flaws in cryptography, often outright errors.
-
2019 Leslie Lamport:
"George Bergman ... showed me the 51 reviews he had by then written in Math Reviews.
Based on those reviews, I judged that exactly 1/3 of the papers he reviewed contained an error [found by Bergman],
which I define to mean an incorrect statement in a proof or result that the author believed to be correct."
See also 2024 version.
-
2021
Richard P. Brent:
surveying several flaws in the literature, including some proof errors.
-
2022
Mario Berta, Fernando G. S. L. Brandão, Gilad Gour, Ludovico Lami, Martin B. Plenio, Bartosz Regula, and Marco Tomamichel:
pointing out an error in a 2010 proof published by Brandão and Plenio.
"Hence, the main achievability result of Brandão & Plenio is not known to hold. This puts into question a number of established results in the literature".
-
2024 Peng Wang, Shuping Mao, Ruozhou Xu, Jiwu Jing, and Yuewu Wang,
and independently
2024 Amit Singh Bhati, Michiel Verbauwhede, and Elena Andreeva:
pointing out an error in a 2007 proof published by David A. McGrew and Scott R. Fluhrer.
"We also discuss the impact of these vulnerabilities on XCB-based applications, such as disk encryption, nonce-based encryption, deterministic authenticated
encryption and robust authenticated encryption, highlighting the risks due to XCB's failure to achieve STPRP security."
-
2024 Daniel Litt
summarizing how an attempted proof of a conjecture
ran into one wrong published paper after another.
Wikipedia also
lists
some published proofs subsequently discovered to have gaps;
often the claimed theorem ended up being disproven.
But that list is misleading: it focuses primarily on old examples,
so it can create or confirm an impression that the problem has disappeared.
Another source of old examples is Section 6 of a 1972 survey by
Philip J. Davis.
If you see people claiming that proof errors are
rare,
try asking "How rare?
20% of proofs are wrong?
40% of proofs are wrong?
Where's the evidence?
How often do you think proofs have been carefully checked?
How often do you think the errors that have been found have been announced?
What fraction of those announcements have come to your attention?"
Catching proof errors
There are various systems for computers to check every step in a proof.
Examples: Coq (new name:
Rocq),
HOL4,
HOL Light,
Isabelle/HOL,
Lean,
Metamath,
Mizar.
One way to see what theorem statements look like in these systems
is to follow links from Freek Wiedijk's page
"Formalizing 100 theorems".
See my
report
"Papers with computer-checked proofs"
for four examples of papers where I posted computer-checked proofs before paper submission
(one of those papers isn't even online yet),
taking a few weeks per paper.
The report describes in considerable detail where the time is going,
and in particular explains why looking at line counts for computer-checked proofs is misleading.
For one paper,
I wrote computer-checked proofs in Lean over a period of 15 days,
and then computer-checked proofs of the same theorems in HOL Light over a period of 12 days.
The report looks at how various components of the work differed between Lean and HOL Light.
I used HOL Light for the other papers.
I've spent much more time over the years manually checking proof steps,
although that's also covering many more papers.
The rest of this page has notes on my work checking proofs.
Most of my computer-checked proofs are mentioned here,
but only a small fraction of my manually checked proofs are mentioned here.
Transcendence of pi etc.
I wrote proofs in HOL Light of the transcendence of e and pi.
I started on 2 November 2024 and posted the proofs on 22 November 2024.
I came back to this in December 2024
to prove a more general theorem due to Lindemann (often miscredited to Weierstrass);
I posted that
proof on 21 December 2024.
For the core of the proof,
I followed a strategy from a 5-page 1990 paper "An alternative proof of the Lindemann-Weierstrass theorem" by Beukers, Bezivin, and Robba.
For some necessary facts about symmetrization of weighted sums of exponentials,
I streamlined a strategy from a 12-page 1952 paper "Analytic proof of the Lindemann theorem" by Steinberg and Redheffer.
The proofs that correspond to what I explained to HOL Light
occupy 3.5 pages in the 1990 paper,
plus about 5 pages in the 1952 paper,
plus various background facts
about polynomials and (formal) power series,
such as the fact that the derivative of a rational function can't have an order-1 pole.
Daniel J. Bernstein,
"Understanding binary-Goppa decoding",
ID 8561e73dab75d01a6dd6bf542594ddac03cdbe6e:
-
Theorem 2.1.2:
Appendix C.6 matches this up to
interpolator_main in HOL Light and
interpolator_main in Lean.
-
Theorem 3.1.2:
Appendix C.6 matches this up to
small_approximant_exists in HOL Light and
small_approximant_exists in Lean.
-
Theorem 3.1.3:
Appendix C.6 matches this up to
approximant_best in HOL Light and
approximant_best in Lean.
-
Theorem 4.1.2:
Appendix C.6 matches this up to
interpolation_with_errors in HOL Light and
interpolation_with_errors in Lean.
-
Theorem 4.1.3:
Appendix C.6 matches this up to
checking_interpolation_with_errors in HOL Light and
checking_interpolation_with_errors in Lean.
-
Theorem 5.1.2:
Appendix C.6 matches this up to
goppa_decoding in HOL Light and
goppa_decoding in Lean.
-
Theorem 5.1.3:
Appendix C.6 matches this up to
goppa_checking in HOL Light and
goppa_checking in Lean.
-
Theorem 5.4.1:
Appendix C.6 matches this up to
goppa_parity in HOL Light and
goppa_parity in Lean.
-
Theorem 6.1.1:
Appendix C.6 matches this up to
goppa_squaring in HOL Light and
goppa_squaring in Lean.
-
Theorem 7.1:
Appendix C.6 matches this up to
goppa_checking_2 in HOL Light and
goppa_checking_2 in Lean.
I started writing Lean proofs on 12 July 2023 and posted them on 26 July 2023.
I started writing HOL Light proofs on 31 July 2023 and posted them on 11 August 2023.
Computing gcd
Paper not posted yet.
Main theorem says that if 0 ≤ g ≤ f ≤ 2^b
and 9437b+1 ≤ 4096m then m
divsteps
starting from (1/2,f,g)
produce g_m = 0.
See
https://github.com/jrh13/hol-light/tree/master/Divstep
for a proof checked by HOL Light.
I started writing the computer-checked proof of this theorem on 23 March 2023,
finished on 6 April 2023,
and (after adding a few minor tweaks) posted the proof on 16 April 2023.
I had done some work before that on techniques that, given any particular m,
generate computer-checked proofs for m;
I started writing such proofs on 25 January 2021, posted examples on 31 January 2021,
and came back to those techniques on 18 March 2023.
Daniel J. Bernstein,
"Asymptotics for the standard block size in primal lattice attacks: second order, formally verified",
ID 22882984e77307356f7598b7ac1c434d8f31613f:
- Theorem 1.2.1(1):
Appendix A.3 traces how this is a special case of
forward_main,
which has a proof checked by HOL Light.
- Theorem 1.2.1(2):
Appendix A.3 traces how this is a special case of
converse_main,
which has a proof checked by HOL Light.
I started writing the computer-checked proof on 21 February 2023
and finished on 15 March 2023.
I then added Appendix A to the paper,
and posted everything on 17 March 2023.
Daniel J. Bernstein,
"Verified fast formulas for control bits for permutation networks",
ID 782136154c6ea5bb93257e2a5f667a579df7a48e:
-
Theorem 2.2:
cycle_shift in HOL Light.
-
Theorem 2.4:
fastcyclemin_part2powcycle in HOL Light.
-
Theorem 2.5:
fastcyclemin_works in HOL Light.
-
Theorem 3.1:
xor1_lteven and xor1_div2 in HOL Light.
-
Theorem 3.3:
xif_fixesge and xif_involution in HOL Light.
-
Theorem 4.2:
XbackXforth_fixesge and XbackXforth_bijection in HOL Light.
-
Theorem 4.3:
XbackXforth_powpow_xor1 and XbackXforth_powpow_avoids_xor1 and XbackXforth_cycle_size in HOL Light.
-
Theorem 4.4:
XbackXforth_cyclemin_xor1 in HOL Light.
-
Theorem 5.2:
firstcontrol_0 in HOL Light.
-
Theorem 5.3:
firstcontrol_parity in HOL Light.
-
Theorem 5.4:
lastcontrol_first in HOL Light.
-
Theorem 5.5:
middleperm_fixesge and middleperm_bijection and middleperm_parity in HOL Light.
I started writing the computer-checked proofs on 9 September 2020,
and posted the paper with computer-checked proofs on 23 September 2020.
Daniel J. Bernstein,
"Reducing lattice bases to find small-height values of univariate polynomials",
ID 82f82c041b7e2bdce94a5e1f94511773:
- Theorem 2.1, version 2006.05.31: Proof manually checked 2006.05.31.
Version 2007.07.27: Equivalence to version 2006.05.31 manually checked 2007.07.27.
- Theorem 2.2, version 2006.05.31: Proof manually checked 2006.05.31.
Version 2007.07.27: Equivalence to version 2006.05.31 manually checked 2007.07.27.
- Theorem 2.3, version 2006.05.31: Proof manually checked 2006.05.31.
Version 2007.07.27: Equivalence to version 2006.05.31 manually checked 2007.07.27.
- Theorem 2.4, version 2006.05.31: Proof manually checked 2006.05.31.
Version 2007.07.27: Equivalence to version 2006.05.31 manually checked 2007.07.27.
- Theorem 2.5, version 2006.05.31: Proof manually checked 2006.05.31.
Version 2007.07.27: Equivalence to version 2006.05.31 manually checked 2007.07.27.
Relies on
LLL lemma outside the paper:
if m is a positive integer,
and L is a lattice of rank m,
then L has a nonzero vector of length at most 2^((m-1)/2) (det L)^(1/m).
This lemma has the virtue of being polynomial-time constructive;
the "cost" theorems in my paper inherit this virtue.
(On the other hand, the virtue is stated only in the accompanying text,
not as part of the theorems.)
- Theorem 2.6, version 2006.05.31: Proof manually checked 2006.05.31.
Version 2007.07.27: Equivalence to version 2006.05.31 manually checked 2007.07.27.
- Theorem 3.1, version 2006.05.31: Proof manually checked 2006.05.31.
Version 2007.07.27: Equivalence to version 2006.05.31 manually checked 2007.07.27.
- Theorem 3.2, version 2006.05.31: Proof manually checked 2006.05.31.
Version 2007.07.27: Equivalence to version 2006.05.31 manually checked 2007.07.27.
- Theorem 3.3, version 2006.05.31: Proof manually checked 2006.05.31.
Version 2007.07.27: Equivalence to version 2006.05.31 manually checked 2007.07.27.
- Theorem 3.4, version 2006.05.31: Proof manually checked 2006.05.31.
Version 2007.07.27: Equivalence to version 2006.05.31 manually checked 2007.07.27.
- Theorem 3.5, version 2006.05.31: Proof manually checked 2006.05.31.
Version 2007.07.27: Equivalence to version 2006.05.31 manually checked 2007.07.27.
- Theorem 3.6, version 2006.05.31: Proof manually checked 2006.05.31.
Version 2007.07.27: Equivalence to version 2006.05.31 manually checked 2007.07.27.
- Theorem 3.7, version 2006.05.31: Proof manually checked 2006.05.31.
Version 2007.07.27: Equivalence to version 2006.05.31 manually checked 2007.07.27.
- Theorem 3.8, version 2006.05.31: Proof manually checked 2006.05.31.
Version 2007.07.27: Equivalence to version 2006.05.31 manually checked 2007.07.27.
- Theorem 4.1, version 2006.05.31: Proof manually checked 2006.05.31.
Version 2007.07.27: Equivalence to version 2006.05.31 manually checked 2007.07.27.
- Theorem 4.2, version 2006.05.31: Proof manually checked 2006.05.31.
Version 2007.07.27: Equivalence to version 2006.05.31 manually checked 2007.07.27.
- Theorem 4.3, version 2006.05.31: Proof manually checked 2006.05.31.
Version 2007.07.27: Equivalence to version 2006.05.31 manually checked 2007.07.27.
- Theorem 4.4, version 2006.05.31: Proof manually checked 2006.05.31.
Version 2007.07.27: Equivalence to version 2006.05.31 manually checked 2007.07.27.
- Theorem 5.1, version 2006.05.31: Proof manually checked 2006.05.31.
Version 2007.07.27: Equivalence to version 2006.05.31 manually checked 2007.07.27.
- Theorem 5.2, version 2006.05.31: Proof manually checked 2006.05.31.
Version 2007.07.27: Equivalence to version 2006.05.31 manually checked 2007.07.27.
- Theorem 5.3, version 2006.05.31: Proof manually checked 2006.05.31.
Version 2007.07.27: Equivalence to version 2006.05.31 manually checked 2007.07.27.
- Theorem 5.4, version 2006.05.31: Proof manually checked 2006.05.31.
Version 2007.07.27: Equivalence to version 2006.05.31 manually checked 2007.07.27.
- Theorem 6.1, version 2005.10.09 = 2006.05.31: Proof manually checked 2005.10.09.
Version 2007.07.27: Equivalence to version 2006.05.31 manually checked 2007.07.27.
- Theorem 6.2, version 2005.10.09 = 2006.05.31: Proof manually checked 2005.10.09.
Version 2007.07.27: Equivalence to version 2006.05.31 manually checked 2007.07.27.
- Theorem 6.3, version 2005.10.09 = 2006.05.31: Proof manually checked 2005.10.09.
Version 2007.07.27: Equivalence to version 2006.05.31 manually checked 2007.07.27.
- Theorem 6.4, version 2005.10.09 = 2006.05.31: Proof manually checked 2005.10.09.
Version 2007.07.27: Equivalence to version 2006.05.31 manually checked 2007.07.27.
- Theorem 7.1, version 2006.05.31: Proof manually checked 2006.05.31.
Version 2007.07.27: Equivalence to version 2006.05.31 manually checked 2007.07.27.
- Theorem 7.2, version 2006.05.31: Proof manually checked 2006.05.31.
Version 2007.07.27: Equivalence to version 2006.05.31 manually checked 2007.07.27.
- Theorem 7.3, version 2006.05.31: Proof manually checked 2006.05.31.
Version 2007.07.27: Equivalence to version 2006.05.31 manually checked 2007.07.27.
- Theorem 7.4, version 2006.05.31: Proof manually checked 2006.05.31.
Version 2007.07.27: Equivalence to version 2006.05.31 manually checked 2007.07.27.
Daniel J. Bernstein,
"Curve25519: new Diffie-Hellman speed records",
ID 4230efdfa673480fc079449d90f322c0:
- Primality of p = 2^255 - 19:
Pocklington-style proof manually checked 2006.02.09,
using primes
127,
353,
75707,
8574133,
1919519569386763,
75445702479781427272750846543864801,
74058212732561358302231226437062788676166966415465897661863160754340907,
with gp for underlying calculations.
- Non-squareness of 2 and 486662^2-4 in F_p:
Checked 2006.02.09 by gp.
- Primality of p1 = 2^252 + 27742317777372353535851937790883648493:
Pocklington-style proof manually checked 2006.02.09,
using primes
531581,
1224481,
1257559732178653,
4434155615661930479,
172054593956031949258510691,
19757330305831588566944191468367130476339,
276602624281642239937218680557139826668747,
with gp for underlying calculations.
- Base point (9,...) has order p1:
Checked 2006.02.09 by gp with code
e=ellinit([0,486662,0,1,0]);
u=Mod(9,p);
b=[u,sqrt(u^3+486662*u^2+u)];
ellpow(e,b,p1)==[0].
- Primality of
p2 = (p+1)/2 - 2 p1 = 2^253 - 55484635554744707071703875581767296995:
Pocklington-style proof manually checked 2006.02.09,
using primes
743104567,
1013266244677,
203852586375664218368381551393371968928013,
104719073621178708975837602950775180438320278101,
27413359092552162435694767700453926735143482401279781,
with gp for underlying calculations.
- (p+1-8p1)^2-4p is divisible by 8312956054562778877481:
Checked 2006.02.09 by gp.
- Order of p mod p1 is (p1-1)/6:
Checked 2006.02.09 by gp.
- Order of p mod p2 is (p2-1)/1:
Checked 2006.02.09 by gp.
- Doubling formulas in version 2006.02.09 produce
lambda^2 - A - 2x = (x^2-1)^2/4y^2 and
y''^2 = x''^3 + A x''^2 + x'':
Verified 2006.02.09.
- Addition formulas in version 2006.02.09 produce
y''^2 = x''^3 + A x''^2 + x'':
Verified 2006.02.09.
- Theorem 2.1, version 2006.02.09:
Proof manually checked 2006.02.09.
- Theorem B.1, version 2006.02.09:
Proof manually checked 2006.02.09.
- Theorem B.2, version 2006.02.09:
Proof manually checked 2006.02.09.
Some other verification work on this topic:
2017 Russinoff
gave an ACL2-checked proof that an elliptic curve is a group under the traditional addition law.