(************************************************************************* * Copyright (C) * 2019-2023 The University of Exeter * 2018-2023 The University of Paris-Saclay * 2018 The University of Sheffield * * License: * This program can be redistributed and/or modified under the terms * of the 2-clause BSD-style license. * * SPDX-License-Identifier: BSD-2-Clause *************************************************************************) (*<*) theory "M_02_Background" imports "M_01_Introduction" begin (*>*) chapter*[background::text_section]\ Background\ section*[bgrnd1::introduction]\The Isabelle System Architecture\ figure*[architecture::figure,relative_width="95",file_src="''figures/isabelle-architecture.pdf''"]\ The system architecture of Isabelle (left-hand side) and the asynchronous communication between the Isabelle system and the IDE (right-hand side). \ text*[bg::introduction]\ While Isabelle is widely perceived as an interactive theorem prover for HOL (Higher-order Logic)~@{cite "nipkow.ea:isabelle:2002"}, we would like to emphasize the view that Isabelle is far more than that: it is the \<^emph>\Eclipse of Formal Methods Tools\. This refers to the ``\<^emph>\generic system framework of Isabelle/Isar underlying recent versions of Isabelle. Among other things, Isabelle provides an infrastructure for Isabelle plug-ins, comprising extensible state components and extensible syntax that can be bound to SML programs. Thus, the Isabelle architecture may be understood as an extension and refinement of the traditional `LCF approach', with explicit infrastructure for building derivative systems.\''~@{cite "wenzel.ea:building:2007"} The current system framework offers moreover the following features: \<^item> a build management grouping components into pre-compiled sessions, \<^item> a prover IDE (PIDE) framework~@{cite "wenzel:asynchronous:2014"} with various front-ends, \<^item> documentation-generation, \<^item> code generators for various target languages, \<^item> an extensible front-end language Isabelle/Isar, and, \<^item> last but not least, an LCF style, generic theorem prover kernel as the most prominent and deeply integrated system component. \ text\ The Isabelle system architecture shown in @{figure \architecture\} comes with many layers, with Standard ML (SML) at the bottom layer as implementation language. The architecture actually foresees a \<^emph>\Nano-Kernel\ (our terminology) which resides in the SML structure \<^boxed_sml>\Context\. This structure provides a kind of container called \<^emph>\context\ providing an identity, an ancestor-list as well as typed, user-defined state for plugins such as \<^isadof>. On top of the latter, the LCF-Kernel, tactics, automated proof procedures as well as specific support for higher specification constructs were built. \<^footnote>\We use the term \<^emph>\plugin\ for a collection of HOL-definitions, SML and Scala code in order to distinguish it from the official Isabelle term \<^emph>\component\ which implies a particular format and support by the Isabelle build system.\ \ section*[dof::introduction]\The Document Model Required by \<^dof>\ text\ In this section, we explain the assumed document model underlying our Document Ontology Framework (\<^dof>) in general. In particular we discuss the concepts \<^emph>\integrated document\\<^bindex>\integrated document\, \<^emph>\sub-document\\<^bindex>\sub-document\, \<^emph>\document-element\\<^bindex>\document-element\, and \<^emph>\semantic macros\\<^bindex>\semantic macros\ occurring inside document-elements. This type of document structure is quite common for scripts interactively evaluated in an incremental fashion. Furthermore, we assume a bracketing mechanism that unambiguously allows to separate different syntactic fragments and that can be nested. In the case of Isabelle, these are the guillemot symbols \\...\\, which represent the begin and end of a \<^emph>\cartouche\\<^bindex>\cartouche\.\ (*<*) declare_reference*[docModGenConcr::figure] (*>*) text\ The Isabelle Framework is based on a \<^emph>\document-centric view\\<^bindex>\document-centric view\ of a document, treating the input in its integrality as sets of (user-programmable) \<^emph>\document element\ that may mutually depend on and link to each other; A \<^emph>\document\ in our sense is what is configured in a set of \<^verbatim>\ROOT\- and \<^verbatim>\ROOTS\-files. Isabelle assumes a hierarchical document model\<^index>\document model\, \<^ie>, an \<^emph>\integrated\ document consists of a hierarchy of \<^emph>\sub-documents\ (files); dependencies are restricted to be acyclic at this level (c.f. @{figure (unchecked) "docModGenConcr"}). Document parts can have different document types in order to capture documentations consisting of documentation, models, proofs, code of various forms and other technical artifacts. We call the main sub-document type, for historical reasons, \<^emph>\theory\-files. A theory file\<^bindex>\theory!file\ consists of a \<^emph>\header\\<^bindex>\header\, a \<^emph>\context definition\\<^index>\context\, and a body consisting of a sequence of document elements called \<^emph>\command\s (see @{figure (unchecked) "docModGenConcr"}(left)). Even the header consists of a sequence of commands used for introductory text elements not depending on any context. The context-definition contains an \<^boxed_theory_text>\import\ and a \<^boxed_theory_text>\keyword\ section, for example: @{boxed_theory_text [display]\ theory Example \\Name of the 'theory'\ imports \\Declaration of 'theory' dependencies\ Main \\Imports a library called 'Main'\ keywords \\Registration of keywords defined locally\ requirement \\A command for describing requirements\ \} where \<^boxed_theory_text>\Example\ is the abstract name of the text-file, \<^boxed_theory_text>\Main\ refers to an imported theory (recall that the import relation must be acyclic) and \<^boxed_theory_text>\keywords\ are used to separate commands from each other. \ text*[docModGenConcr::float, main_caption="\A Representation of a Document Model.\"] \ @{fig_content (width=45, caption="Schematic Representation.") "figures/doc-mod-generic.pdf" }\<^hfill>@{fig_content (width=45, caption="The Isar Instance.") "figures/doc-mod-isar.pdf"} \ text\The body of a theory file consists of a sequence of \<^emph>\commands\ that must be introduced by a command keyword such as \<^boxed_theory_text>\requirement\ above. Command keywords may mark the begin of a text that is parsed by a command-specific parser; the end of the command-span is defined by the next keyword. Commands were used to define definitions, lemmas, code and text-elements (see @{float "docModGenConcr"}(right)). \ text\ A simple text-element \<^index>\text-element\ may look like this: @{boxed_theory_text [display]\ text\ This is a simple text.\\} ... so it is a command \<^theory_text>\text\ followed by an argument (here in \\ ... \\ parenthesis) which contains characters. While \<^theory_text>\text\-elements play a major role in this manual --- document generation is the main use-case of \<^dof> in its current stage --- it is important to note that there are actually three families of ``ontology aware'' document elements with analogous syntax to standard ones. The difference is a bracket with meta-data of the form: @{theory_text [display,indent=5, margin=70] \ text*[label::classid, attr\<^sub>1=E\<^sub>1, ... attr\<^sub>n=E\<^sub>n]\ some semi-formal text \ ML*[label::classid, attr\<^sub>1=E\<^sub>1, ... attr\<^sub>n=E\<^sub>n]\ some SML code \ value*[label::classid, attr\<^sub>1=E\<^sub>1, ... attr\<^sub>n=E\<^sub>n]\ some annotated \-term \ \} Other instances of document elements belonging to these families are, for example, the freeform \<^theory_text>\Definition*\ and \<^theory_text>\Lemma*\ as well as their formal counterparts \<^theory_text>\definition*\ and \<^theory_text>\lemma*\, which allow in addition to their standard Isabelle functionality the creation and management of ontology-generated meta-data associated to them (cf. @{text_section (unchecked) "writing_doc"}). Depending on the family, we will speak about \<^emph>\(formal) text-contexts\,\<^bindex>\formal text-contexts\ \<^emph>\(ML) code-contexts\\<^bindex>\code-contexts\ and \<^emph>\term-contexts\\<^bindex>\term-contexts\ if we refer to sub-elements inside the \\...\\ cartouches of these command families. Text- code- or term-contexts may contain a special form comment ???, that may be considered as a "semantic macro" or a machine-checked annotation: the so-called antiquotations\<^bindex>\antiquotation\. Its general syntactic format reads as follows: @{boxed_theory_text [display]\ @{antiquotation_name (args) [more_args] \sub-context\ }\} The sub-context may be different from the surrounding one; therefore, it is possible to switch from a text-context to a term-context, for example. Therefore, antiquotations allow the nesting of cartouches, albeit not all combinations are actually supported. \<^footnote>\In the literature, this concept has been referred to \Cascade-Syntax\ and was used in the Centaur-system and is existing in some limited form in some Emacs-implementations these days. \ Isabelle comes with a number of built-in antiquotations for text- and code-contexts; a detailed overview can be found in @{cite "wenzel:isabelle-isar:2020"}. \<^dof> reuses this general infrastructure but \<^emph>\generates\ its own families of antiquotations from ontologies.\ text\ An example for a text-element \<^index>\text-element\ using built-in antiquotations may look like this: @{boxed_theory_text [display]\ text\ According to the \<^emph>\reflexivity\ axiom @{thm refl}, we obtain in \ for @{term "fac 5"} the result @{value "fac 5"}.\\} ... so it is a command \<^theory_text>\text\ followed by an argument (here in \\ ... \\ parenthesis) which contains characters and a special notation for semantic macros \<^bindex>\semantic macros\ (here \<^theory_text>\@{term "fac 5"}\). The above text element is represented in the final document (\<^eg>, a PDF) by: @{boxed_pdf [display] \According to the $\emph{reflexivity}$ axiom $\mathrm{x = x}$, we obtain in $\Gamma$ for $\operatorname{fac} \text{\textrm{5}}$ the result $\text{\textrm{120}}$.\ }\ text\Antiquotations seen as semantic macros are partial functions of type \logical_context \ text\; since they can use the system state, they can perform all sorts of specific checks or evaluations (type-checks, executions of code-elements, references to text-elements or proven theorems such as \refl\, which is the reference to the axiom of reflexivity). Therefore, semantic macros can establish \<^emph>\formal content\ inside informal content; they can be type-checked before being displayed and can be used for calculations before being typeset. They represent the device for linking formal with the informal content. \ text*[docModOnto::float, main_caption="\Documents conform to Ontologies.\"] \ @{fig_content (width=47, caption="A Document with Ontological Annotations.") "figures/doc-mod-DOF.pdf" }\<^hfill>@{fig_content (width=47, caption="Ontological References.") "figures/doc-mod-onto-docinst.pdf"} \ text\Since Isabelle's commands are freely programmable, it is possible to implement \<^dof> as an extension of the system. In particular, the ontology language of \<^dof> provides an Ontology Definition Language (ODL)\<^bindex>\ODL\ that \<^emph>\generates\ anti-quotations and the infrastructure to check and evaluate them. This allows for checking an annotated document with respect to a given ontology, which may be specific for a given domain-specific universe of discourse (see @{float "docModOnto"}). ODL will be described in @{text_section (unchecked) "isadof_tour"} in more detail.\ section*[bgrnd21::introduction]\Implementability of the Document Model in other ITP's\ text\ Batch-mode checkers for \<^dof> can be implemented in all systems of the LCF-style prover family, \<^ie>, systems with a type-checked \term\, and abstract \thm\-type for theorems (protected by a kernel). This includes, \<^eg>, ProofPower, HOL4, HOL-light, Isabelle, or Coq and its derivatives. \<^dof> is, however, designed for fast interaction in an IDE. If a user wants to benefit from this experience, only Isabelle and Coq have the necessary infrastructure of asynchronous proof-processing and support by a PIDE~@{cite "wenzel:asynchronous:2014" and "wenzel:system:2014" and "barras.ea:pervasive:2013" and "faithfull.ea:coqoon:2018"} which in many features over-accomplishes the required features of \<^dof>. \ figure*["fig:dof-ide",relative_width="95",file_src="''figures/cicm2018-combined.png''"]\ The \<^isadof> IDE (left) and the corresponding PDF (right), showing the first page of~@{cite "brucker.ea:isabelle-ontologies:2018"}.\ text\ We call the present implementation of \<^dof> on the Isabelle platform \<^isadof> . @{figure "fig:dof-ide"} shows a screen-shot of an introductory paper on \<^isadof>~@{cite "brucker.ea:isabelle-ontologies:2018"}: the \<^isadof> PIDE can be seen on the left, while the generated presentation in PDF is shown on the right. Isabelle provides, beyond the features required for \<^dof>, a lot of additional benefits. Besides UTF8-support for characters used in text-elements, Isabelle offers built-in already a mechanism for user-programmable antiquotations \<^index>\antiquotations\ which we use to implement semantic macros \<^index>\semantic macros\ in \<^isadof> (We will actually use these two terms as synonym in the context of \<^isadof>). Moreover, \<^isadof> allows for the asynchronous evaluation and checking of the document content~@{cite "wenzel:asynchronous:2014" and "wenzel:system:2014" and "barras.ea:pervasive:2013"} and is dynamically extensible. Its PIDE provides a \<^emph>\continuous build, continuous check\ functionality, syntax highlighting, and auto-completion. It also provides infrastructure for displaying meta-information (\<^eg>, binding and type annotation) as pop-ups, while hovering over sub-expressions. A fine-grained dependency analysis allows the processing of individual parts of theory files asynchronously, allowing Isabelle to interactively process large (hundreds of theory files) documents. Isabelle can group sub-documents into sessions, \<^ie>, sub-graphs of the document-structure that can be ``pre-compiled'' and loaded instantaneously, \<^ie>, without re-processing, which is an important means to scale up. \ (*<*) end (*>*)