General information

Title
Extensional Paramodulation for Higher-Order Logic and its Effective Implementation Leo-III
Author
Alexander Steen
Year
2018, submitted on June 7th, 2018
Supervisor
Prof. Dr. habil. Christoph Benzmüller, Freie Universität Berlin
Second examiner
 
Prof. Dr. Geoff Sutcliffe, University of Miami

Abstract [top]

In this thesis, the theoretical foundations and the practical components for implementing an effective automated theorem proving system for higher-order logic are presented. A primary focus of this thesis is the provision of evidence that a paramodulation-based proof calculus can effectively be employed for performant equational reasoning in Extensional Type Theory (higher-order logic). To that end, a sound and complete paramodulation calculus for extensional higher-order logic with Henkin semantics is presented. The completeness proof hereby unifies and simplifies existing abstract consistency techniques for a formulation of higher-order logic that is based on primitive equality as sole logical connective.

In the practically motivated main part of this thesis, the design and architecture of the new higher-order theorem prover Leo-III is presented. Leo-III in based on the discussed paramodulation calculus and implements additional, practically motivated inference rules including equational simplification routines such as heuristic rewriting and support for reasoning with choice. The system encompasses a flexible mechanism for asynchronous cooperation with first-order reasoning systems, a powerful proof search procedure and a sophisticated and efficient set of underlying data structures. Additionally, pragmatic and practically significant features of Leo-III are discussed, including its native support for polymorphic higher-order logic and reasoning in higher-order quantified modal logics. An evaluation on a heterogeneous set of benchmark problems confirms that Leo-III is one of the most effective and versatile higher-order automated reasoning systems to date.

The Leo-III Theorem Prover [top]

-- Supplemental material to the Leo-III system --

General information

The Leo-III automated theorem prover is a theorem proving system for classical higher-order logic with choice that I implemented as one of the main developers in the context of my PhD thesis. Leo-III supports all common TPTP dialects, including THF, TFF and FOF as well as their rank-1 polymorphic derivatives. It is based on a paramodulation calculus with ordering constraints and, in tradition of its predecessor LEO-II, heavily relies on cooperation with external (mostly first-order) theorem provers for increased performance. Nevertheless, Leo-III can also be used as a stand-alone prover without employing any external cooperation.

Getting Leo-III

A current (pre-built) release of Leo-III 1.2 can be downloaded from GitHub under https://github.com/leoprover/Leo-III/releases/download/v1.2/leo3.jar. Note that this binary was built on a Debian-based system and might not work for all Linux derivatives. If the pre-build does not work, consider building Leo-III from source. Its quite simple and only takes a minute or two (see below).

Building Leo-III

The following requirements (dependencies) are not managed by the SBT build tool and hence need to be present at the system:

  • Java JDK 1.8
  • make (any reasonably current version)
  • gcc (any reasonably current version)
  • Scala Build Tool (SBT) >= 1.x

Leo-III uses SBT for building the Scala sources. SBT will download an appropriate version of Scala (and further dependencies) automatically. The actual build process in invoked by make. Proceed as follows to build Leo-III from source:

  1. Download the source distribution and unpack the archive
    > wget https://github.com/leoprover/Leo-III/archive/v1.2.tar.gz
    > tar -xvzf v1.2.tar.gz
  2. Step into the newly created directory and run make:
    > cd Leo-III-1.2/
    > make
    The building process might take some time, depending on your computer.
  3. If no error occurred, you should find a bin directory at top-level:
    > cd bin/
    > ls
      leo3 leo3.jar
    where leo3.jar is the executable jar of Leo-III. The leo3 file is a bash script short-cut calling java -jar leo3.jar with further technical parameters.
    Note that leo3 assumes that the jar file resides in the same directory as the script itself.
  4. (Optional) Install (i.e. copy) the Leo-III binaries to a dedicated location using
    > make install
    The default install destination is $HOME/bin. This will install the jar as well as the leo3 executable there. The install destination can be modified using the DESTDIR modifier.

Using Leo-III

Leo-III requires the Java 1.8 Runtime (JRE) for execution. Leo-III works on any common OS (including Windows, Mac OS and linux derivatives).

External cooperation so-far only works on Linux and Mac systems. If Windows is used, it might be possible to try running Leo-III using Cygwin or similar for exploiting external cooperation.

Leo-III is invoked via command-line (assuming the leo3 executable is in $PATH):

> leo3
Leo III -- A Higher-Order Theorem Prover.
Christoph Benzmuller, Alexander Steen, Max Wisniewski and others.

Usage: leo3 problem [option ...]

Options:
[...]
The release of Leo-III contains several test problems, including a polymorphic THF formulation of the surjective Cantor theorem located at ./src/test/resources/th1/sur\_cantor\_th1.p. The problem statement reads as follows:
thf(sur_cantor, conjecture, ( ! [T: $tType]: (~ ( ? [F: T > (T > $o)] : (
                                   ! [Y: T > $o] :
                                    ? [X: T] : (
                                      (F @ X) = Y
                                    )
                                 ) )))).
Leo-III can now be invoked for proving this conjecture. The -p option enables the output of a proof certificate (the output was formatted using Sutcliffe's TPTP4X tool):
> ./leo3 ./src/test/resources/th1/sur_cantor_th1.p -p
% Time passed: 2490ms
% Effective reasoning time: 1619ms
% Axioms used in derivation (0): 
% No. of inferences in proof: 14
% No. of processed clauses: 9
% No. of generated clauses: 60
[...]
% SZS status Theorem for ./src/test/resources/th1/sur_cantor_th1.p : 2490 ms resp. 1619 ms w/o parsing
% SZS output start CNFRefutation for ./src/test/resources/th1/sur_cantor_th1.p
...
% SZS output end CNFRefutation for ./src/test/resources/th1/sur_cantor_th1.p
The line starting with ''% SZS status Theorem'' confirms that the conjecture is indeed a theorem and the contents between ''% SZS output start'' and ''% SZS output end'' are the proof certificate for this claim.

Application examples [top]

-- Supplemental material to Chapter 5 --

This section contains the discussed application examples of Chapter 5.

The application examples discussed in the thesis are splitted into three categories: (1) Higher-Order Reasoning, containing TH0 problems from a diverse range of applications; (2) Polymorphic Higher-Order Reasoning, containing exemplary TH1 problems; and (3) Modal Logic Reasoning, containing exemplary problems from modal logic in the modal THF syntax presented in the thesis.

This section presents the problems, Leo-III's proof and the GDV proof verification output (if applicable). Note that the proof output of Leo-III was formatted with Geoff Sutcliffe's tptp4X tool.

Most of the example problems originate from the TPTP problem library, unless stated otherwise. A link to the problem source is given for each example problem.

An archive containing all discussed problem statements, the respective proof output of Leo-III and the GDV proof verification protocol output (if existent) is available for download: [zip archive]

Higher-Order Reasoning

E1: Surjective Cantor Theorem

Original problem source: SET557^1
Problem rating: 0.25 (v7.0.0)

Problem statement [show/hide]

Proof by Leo-III [show/hide]

GDV verification output [show/hide]

E2: Injective Cantor Theorem

Original problem source: SYO037^1
Problem rating: 1.00 (since v3.7.0)

Problem statement [show/hide]

Proof by Leo-III [show/hide]

GDV verification output [show/hide]

E3: Pigeonhole Principle

Original problem source: MSC007^1.003.004
Problem rating: 1.00 (since v6.3.0)

Problem statement [show/hide]

Proof by Leo-III [show/hide]

GDV verification output [show/hide]

E4: Function Approximation using if-then-else

Original problem source: NUN025^1
Problem rating: 1.00 (since v6.4.0)

Problem statement [show/hide]

Proof by Leo-III [show/hide]

GDV verification output [show/hide]

E5: Existence of Choice

Original problem source: SYN997^1
Problem rating: 0.14 (v7.0.0)

Problem statement [show/hide]

Proof by Leo-III [show/hide]

GDV verification output [show/hide]

E6: Special Choice Function

Original problem source: SYO548^1
Problem rating: 0.57 (since v7.0.0)

Problem statement [show/hide]

Proof by Leo-III [show/hide]

GDV verification output [show/hide]

E7: Swapping Function

Original problem source: SYO519^1
Problem rating: 1.00 (since v4.1.0)

Problem statement [show/hide]

Proof by Leo-III [show/hide]

GDV verification output [show/hide]

E8: Function Specification

Original problem source: This is a variant of NUN025^1 as discussed in E4 that is introduced in the thesis. It effectively removes the axiomatization of an if-then-else operator.
Problem rating: N/A (not a TPTP problem)

Problem statement [show/hide]

Proof by Leo-III [show/hide]

GDV verification output [show/hide]

Polymorphic Higher-Order Reasoning

E9: Polymorphic Surjective Cantor Theorem

Original problem source: This is a polymorphic variant of SET557^1
that has been introduced in the thesis as example E9. Problem rating: N/A (not a TPTP problem)

Problem statement [show/hide]

Proof by Leo-III [show/hide]

GDV verification output: The proof could not be verified by GDV.

E10: Finiteness of Booleans

Original problem source: SEV485^1
Problem rating: N/A (not rated yet)

Problem statement [show/hide]

Proof by Leo-III [show/hide]

GDV verification output [show/hide]

Modal Logic Reasoning

E11: Barcan Formula

Original problem source: SYM001+1 from the QMLTP. Adapted to modal THF syntax.
Problem rating: N/A (quantification domain not supported by QMLTP)

Problem statement (modal THF) [show/hide]

Problem statement (embedded) [show/hide]

Proof by Leo-III [show/hide]

GDV verification output: The proof could not be verified by GDV.

E12: Converse Barcan Formula

Original problem source: SYM002+1 from the QMLTP. Adapted to modal THF syntax.
Problem rating: 0.25 (QMLTP v1.1)

Problem statement (modal THF) [show/hide]

Problem statement (embedded) [show/hide]

Proof by Leo-III [show/hide]

GDV verification output: The proof could not be verified by GDV.

Evaluation [top]

-- Supplemental material to Chapter 6 --

In the thesis, three different benchmark data sets were used:

  • TPTP TH0 (2463 problems): The set of all monomorphic higher-order (TH0) problems from the TPTP library v7.0.0 that are annotated as theorems.
  • TPTP TH1 (442 problems): The subset of all 666 rank-1 polymorphic higher-order (TH1) problems from the TPTP library v7.0.0 that are annotated as theorems and do not contain arithmetic.
  • QMLTP (580 problems): The subset of all mono-modal benchmarks from the QMLTP library (version 1.1).

The exact benchmark problems are available for download:
» [TPTP TH0, zip archive]
» [TPTP TH1, zip archive]
» [QMLTP, in modal THF syntax, zip archive]

The measurements were taken on the StarExec compute cluster. For the evaluation, Leo-III, LEO-II, Satallax (3.0), Satallax (3.2), Zipperposition, Isabelle/HOL and MleanCoP (partially also with different parameter settings) were used. All primary measurement results (the measurement data on which the evaluation of chapter 6 is based) is available as [zip archive].