Table of contents
Table of contents
It all started with mathematics – rigorous thinking, science, technology. Today’s world is maths‑driven. Despite recent advances in deep learning, the way mathematics is done today is still much the same as it was 100 years ago. Isn’t it time for a change?
Introduction
Mathematics is at the core of science and technology. However the growing amount of mathematical research makes it impossible for non‑experts to fully use the developments made in pure mathematics. Research has become more complicated and more interdependent. Moreover it is often impossible to verify correctness for non‑experts – knowledge is accepted as knowledge by a small group of experts (e.g. the problem with accepting Mochizuki’s proof of abc‑conjecture – it is not understandable for other experts). Fig. 1: The graph on the left shows the growing number of submissions to arXiv – an Internet repository for scientific research. In 2012, mathematics accounted for approx. 20,000 submissions annually.Automated reasoning
To address the issue mentioned above, researchers try to automate or semi‑automate:- Producing mathematics
- Verifying existing mathematics
- Take a mathematical work (e.g. Feit‑Thompson theorem or proof of Kepler’s conjecture)
- Rewrite it in Coq, Mizar or another Interactive Theorem Prover (language/program which understands logic behind mathematics and is able to check its correctness)
- Verify
- Hammers and tactics (methods for automatic reasoning over large libraries)
- Machine learning and deep learning
Building a dictionary
We need to automate translation of human‑written mathematical works in LaTeX to Coq/Mizar. We view it as an NLP problem of creating a dictionary between two languages. How can we build such a dictionary? We could build upon existing syntactic parsers (e.g. TensorFlow’s SyntaxNet) and enhance them with Types and variables, which we explain in an example: Consider the sentence “Let $G$ be a group” . Then “G” is a variable of Type “group”. Once we have such a dictionary with at least some basic accuracy we can use it to translate LaTeX into Coq/Mizar sentence by sentence. Nevertheless we still need a good source of mathematics! Here is what we propose in the DeepAlgebra program: Algebraic geometry is one of the pillars of modern mathematical research, which is rapidly developing and has a solid foundation (Grothendieck’s EGA/SGA, The Stacks Project). It is “abstract” hence easier to verify for computers than analytical parts of mathematics. The Stacks Project is an open multi‑collaboration on foundations of algebraic geometry starting from scratch (category theory and algebra) up to the current research. It has a well‑organized structure (an easy‑to‑manage dependency graph) and is verified thoroughly for correctness. The Stacks Project now consists of:- 547,156 lines of code
- 16,738 tags (57 inactive tags)
- 2,691 sections
- 99 chapters
- 5,712 pages
- 162 slogans
- Statements (also in LaTeX)
- Data for graphs