Coquelicot

2011-2014 research project

Coquelicot is a 3-year research project funded by the Fondation de Coopération Scientifique «Campus Paris-Saclay» and Digiteo.

People involved:

Motivations

Processors and software are ubiquitous in our everyday environment. Nevertheless, there is no guarantee that they are fit, or even safe, for their usage. It is therefore becoming critical to ensure they are behaving as expected. This can be done in two steps: precisely specifying what they are supposed to perform, then verifying that the implementation matches this specification. Being able to ensure the safety and security of software is one of the main research theme of the DIM LSC and Digiteo.

There are numerous ways of verifying the adequateness between a specification and an application, e.g. testing. The highest level of guarantee is achieved by formal methods and in particular formal proofs. Recent examples have shown that they have now become sufficiently practical so that they can be applied to achieve a high level of trust.

Formal proofs have a high level of entry and cannot be used out of the blue. They require tools, formalisms, libraries of theorems, and expertise. Moreover, a poor design of one of the basic blocks might hinder every development built upon it. This project is interested in the basic block corresponding to real numbers and analysis, and their definitions, theorems, and automation. Indeed, this block is the foundation needed to specify and prove any program performing numerical computations.

Numerical computations are usually based on the IEEE-754 floating-point arithmetic, available on all generic processors and increasingly on embedded ones. This arithmetic is efficient and is a good approximation of real arithmetic. Yet it is just an approximation: each operation will incur a round-off error due to the finite precision of the floating-point formats, thus causing computed values to drift off the expected results. Moreover, these formats are bounded and overflows and underflows may cause exceptional behaviors.

It is natural and usual to specify software behavior with respect to ideal mathematical values, as if they were computed with an infinite precision. A simple example is the implementation of a mathematical function, e.g. the exponential. Any useful specification will have to relate the computed results to the mathematical values. Moreover, proving the correctness of the implementation will require to manipulate various mathematical properties, not only about floating-point numbers but also of real analysis, e.g. bounding the distance between the exponential and the polynomial approximation that implements it.

Even such a simple use case already requires an extended formalization of real numbers, and practical use cases (e.g. optimal control) will require a huge framework (definitions, notations, theorems, automations, user-friendly interactions, and so on). The goal of this project is to design such a framework. While we are focusing on the subset required for program verification, the framework should ultimately be usable for any development related to real numbers, e.g. purely mathematical proofs of real analysis.

A Coq library for real numbers, issues and expectations

Our preferred formal system is the Coq proof assistant, developed and used by several research teams located in Île-de-France. It provides a high-order logic suitable for proving all kind of programs and theorems. Its standard library ships a formalization of real numbers, but it has several shortcomings.

The first version of the Coq standard library was developed by M. Mayero 1997 (Coq V6.2). The library was implemented to formalize and to prove the Three Gaps Theorem, which is in fact a geometrical result. At that time, there were few provers which included real number libraries. The HOL library, developed by John Harrison was the most important one and a reference for the Coq one. (Note that HOL is not based on the same logic than Coq). The Coq real standard library was later extended by Olivier Desmettre, in 2001.

Great changes in the internal logic of Coq (predicativity of Set) have forced some changes to the initial choices. But that is all. For the past 10 years, Coq has kept evolving, while the library has not. Moreover, in its present state, the library is sometimes considered unsatisfying enough to deserve a complete duplication in a large development that only requires a basic theory of reals. This is for instance the case for the formalization of the proof of the Four Color Theorem. To model the real plane embedding the colored maps, Georges Gonthier has built his own elementary formalization of real numbers as Dedekind cuts.

This mandates to rebuild the library to bring it up to date, while preserving its main features, that is to say, standard analysis, classical theorems, and its ability to prove programs manipulating floating-point numbers. The initial choices will pervade into all the later parts of the library and will thus have a big impact on its usability. These choices depend on both the logic of proof assistants and the kind of theorems or programs that we wish to prove (classical or intuitionist?). We also wish to keep the usual mathematical aspects (total function for the division?), etc. This is mainly a work of conception and research, dealing with Coq's logic. The main drawback is the risk of introducing contradictions in the theory.

While our immediate goal is to ease the process of formally verifying numerical applications, a much longer aim is to provide a state of the art library to support large scale formalization projects requiring real analysis and computer algebra. We plan to benchmark our choices by addressing the needs of undergoing works like the ones led in the Formath European project, in the FOST ANR action, and on computer algebra examples. The Formath project indeed comprises in particular the formalization of the correctness proof of the Cylindrical Algebraic Decomposition Algorithm which is a real algebraic geometry sophisticated algorithm. An other important computer algebra case study will be the mechanization of the proofs exposed by the Dynamical Dictionary of Mathematical Functions developed at the INRIA Microsoft Research Joint Centre. Another axis of the Formath project we plan to examine is the work package devoted to the formal study of the theory of ordinary differential equations. Finally, the FOST ANR project is interested in formally verifying programs that numerically solve some particular partial differential equations, from both the floating-point side and the mathematical side, which involves complicated reasoning about real analysis.

FOST and DDMF are two examples of projects developed by several research teams located in Île-de-France that would benefit from a revamped formalization of real numbers. They are not the only ones. More generally, a modern library for the real numbers will improve the usefulness and the visibility of Coq, and hence its adoption in the wide. As mentioned before, real numbers are the foundation for any formal verification of numerical programs. Providing a suitable library will ease such verifications and improve the safety of software as a consequence. As shown by recent qualification processes like DO-178C for avionics, the usage of formal proofs during software development will be a major challenge of the years to come, and the tools have to be ready for it.

One may wonder why we are tackling the process of renovating the Coq library for real numbers rather than delegating this work to some other research team, e.g. the Pi.R2 INRIA project-team which develops the proof assistant itself. First, we have no assurance that the work would be achieved in a timely fashion, while such a library is more and more needed. Second, and probably the most important reason for taking over this work is that these teams have never been users of the real numbers and thus would be bad judges of the inadequacies of the current library. All the partners of the present project, however, are everyday users of real number theories; they also have contacts with non-experts that are looking for such tools but are currently put off by their current complexity.

Goal: New Reals for Coq

As explained in the previous section, a modern formalization of the real numbers is sorely needed to ease the verification of numerical applications, especially those involving advanced mathematics. The target system is the Coq proof assistant. From both points of view of program verification and standard analysis, the choices of the C-Corn library do not suit our needs. Note that, while the objective is the formalization of real numbers, its extension to complex numbers should be envisioned, as it would simplify some parts of the library (e.g. elementary functions).

This may seem like an engineering task, but it is not. Several theoretical points have to be investigated. Indeed, the design decisions have a huge impact on the usability and expressiveness (and even soundness!), as seen in the various existing formalizations. Here is a nonexhaustive list of the topics that will have to be researched (bibliography, previous experiments, etc): relations between classical logic and calculus of constructions, partiality of functions (e.g. division), computability of closed expressions, primitive equality, and so on. Each decision will have to be strongly motivated and trade-offs weighted, due to their controversial nature.

More importantly, the library has to be designed so that it is both practical for the end user (software developers, mathematicians, and so on) and maintainable and extensible on the long run. It requires a deep reflection on the infrastructure of the library: notations, implicit terms (automatic inference), theorem naming conventions. The SSReflect experiment has shown that some advanced concepts would be impossible to wield without the proper infrastructure and its ideas might be worth reusing.

A comprehensive library on real numbers is out of the scope of a three-year PhD thesis. Moreover, some parts of it could be directly extracted from mathematical literature and thus they would not lead to any scientific novelty. So the primary goal of this thesis is to devise sane foundations, from which the whole library can later be built upon (possibly through postdoc and engineer positions). To be able to assess the results of this work, the thesis will focus on a specific part of real analysis that can be put to immediate use. It should offer suitable tools for defining and proving properties on elementary functions. It should also pave the way to a full formal verification of the results from the Dynamic Dictionary of Mathematical Functions.