Section 5.1.

This commit is contained in:
Achim D. Brucker 2019-08-11 16:52:56 +01:00
parent 4f242c06ba
commit 02377de0d2
1 changed files with 211 additions and 152 deletions

View File

@ -7,182 +7,241 @@ begin
chapter*[isadof_developers::text_section]\<open>Extending \isadof\<close>
text\<open>
In this section, we introduce our framework, called \isadof. \isadof
is based on several design-decisions:
\begin{compactitem}
\item the entire \isadof is conceived as ``pure add-on'', \ie, we
deliberately resign on the possibility to modify Isabelle itself
(thus accepting a minor loss in performance and some additional
complexity in the documentation generation process)
\item we decided to identify the ontology types with the Isabelle/HOL
type system, \ie, we reuse the same infrastructure for parsers and
type-checkers, but use the types on the meta-level of the document
structure
\item we decided to make the markup-generation by own means in order
to adapt it as good as possible to the needs of tracking the linking
in documents.
\end{compactitem}
In this chapter, we describe the basic implementation aspects of \isadof, which is based on
the following design-decisions:
\<^item> the entire \isadof is a ``pure add-on,'' \ie, we deliberately resign on the possibility to
modify Isabelle itself.
\<^item> we made a small exception to this rule: the \isadof package modifies in its installation
about 10 lines in the \LaTeX-generator (\path{src/patches/thy_output.ML}).
\<^item> we decided to make the markup-generation by itself to adapt it as well as possible to the
needs of tracking the linking in documents.
\<^item> \isadof is deeply integrated into the Isabelle's IDE (PIDE) to give immediate feedback during
editing and other forms of document evolution.
\<close>
text\<open>
Semantic macros, as required by our document model, are called \<^emph>\<open>document antiquotations\<close>
in the Isabelle literature~@{cite "wenzel:isabelle-isar:2017"}. While Isabelle's code-antiquotations
are an old concept going back to Lisp and having found via SML and OCaml their ways into modern
proof systems, special annotation syntax inside documentation comments have their roots in
documentation generators such as Javadoc. Their use, however, as a mechanism to embed
machine-checked \<^emph>\<open>formal content\<close> is usually very limited and also lacks
IDE support.
\<close>
section\<open>\isadof: A User-Defined Plugin in Isabelle/Isar\<close>
text\<open>
A plugin in Isabelle starts with defining the local data and registering it in the framework. As
mentioned before, contexts are structures with independent cells/compartments having three
primitives \inlinesml+init+, \inlinesml+extend+ and \inlinesml+merge+. Technically this is done by
instantiating a functor \inlinesml+Generic_Data+, and the following fairly typical code-fragment
is drawn from \isadof:
subsection*["sec:plugins"::technical]\<open>Writing \isadof as User-Defined Plugin in Isabelle/Isar\<close>
text\<open>
Writing an own plugin in Isabelle starts with defining the local data
and registering it in the framework. As mentioned before, contexts
are structures with independent cells/compartments having three
primitives \inlinesml+init+, \inlinesml+extend+ and
\inlinesml+merge+. Technically this is done by instantiating a functor
\inlinesml+Generic_Data+, and the following fairly typical
code-fragment is already drawn from \isadof:
\begin{sml}
structure Data = Generic_Data
( type T = docobj_tab * docclass_tab
val empty = (initial_docobj_tab, initial_docclass_tab)
val extend = I
fun merge((d1,c1),(d2,c2)) = (merge_docobj_tab (d1, d2),
merge_docclass_tab(c1,c2))
);
structure Data = Generic_Data
( type T = docobj_tab * docclass_tab * ...
val empty = (initial_docobj_tab, initial_docclass_tab, ...)
val extend = I
fun merge((d1,c1,...),(d2,c2,...)) = (merge_docobj_tab (d1,d2,...),
merge_docclass_tab(c1,c2,...))
);
\end{sml}
with some tables \inlinesml+docobj_tab+ and \inlinesml+docclass_tab+
(not explained here) capturing the specific data of the application in
mind, \ie, a table of document classes and another one capturing the
document class instances.
\enlargethispage{2\baselineskip}
\<close>
text\<open>
All the text samples shown here have to be in the context of an SML
file or in an \inlineisar|ML{* ... *}| command inside a \verb|.thy|
file, which has the effect to type-check and compile these text
elements by the underlying SML compiler.
\<close>
text\<open>
Operations are organized in a model-view-controller scheme, where
Isabelle/Isar provides the controller part. A typical model operation
has the type:\<close>
text\<open>
where the table \inlinesml+docobj_tab+ manages document classes and \inlinesml+docclass_tab+ the
environment for class definitions (inducing the inheritance relation). Other tables capture, \eg,
the class invariants, inner-syntax antiquotations. Operations follow the MVC-pattern, where
Isabelle/Isar provides the controller part. A typical model operation has the type:
\begin{sml}
val opn :: <args_type> -> Context.generic -> Context.generic
val opn :: <args_type> -> Context.generic -> Context.generic
\end{sml}
\ie, it represents a transformation on contexts. For example, the
operation of declaring a local reference in the context is presented
as follows:
representing a transformation on system contexts. For example, the operation of declaring a local
reference in the context is presented as follows:
\begin{sml}
fun declare_object_local oid ctxt =
let fun decl {tab,maxano} = {tab=Symtab.update_new(oid,NONE) tab,
maxano=maxano}
in (Data.map(apfst decl)(ctxt)
handle Symtab.DUP _ =>
error("multiple declaration of document reference"))
end
\end{sml}
\<close>
text\<open>
where \inlineisar+Data.map+ is the update function resulting from the
above functor instantiation. This code fragment uses operations from
a library structure \inlinesml+Symtab+ that were used to update the
appropriate table for document objects in the plugin-local
state. Possible exceptions to the update operation were mapped to a
system-global error reporting function.
\<close>
text\<open>
Finally, the view-aspects where handled by an API for
parsing-combinators. The library structure \inlinesml+Scan+ provides,
for example, the operators:
\begin{sml}
op || : ('a -> 'b) * ('a -> 'b) -> 'a -> 'b
op -- : ('a -> 'b * 'c) * ('c -> 'd * 'e) -> 'a -> ('b * 'd) * 'e
op |-- : ('a -> 'b * 'c) * ('c -> 'd * 'e) -> 'a -> 'd * 'e
op --| : ('a -> 'b * 'c) * ('c -> 'd * 'e) -> 'a -> 'b * 'e
op >> : ('a -> 'b * 'c) * ('b -> 'd) -> 'a -> 'd * 'c
op option : ('a -> 'b * 'a) -> 'a -> 'b option * 'a
op repeat : ('a -> 'b * 'a) -> 'a -> 'b list * 'a
fun declare_object_local oid ctxt =
let fun decl {tab,maxano} = {tab=Symtab.update_new(oid,NONE) tab,
maxano=maxano}
in (Data.map(apfst decl)(ctxt)
handle Symtab.DUP _ =>
error("multiple declaration of document reference"))
end
\end{sml}
for alternative, sequence, sequence-ignore-left,
sequence-ignore-right, and piping, as well as combinators for option
and repeat. The parsing combinator technology arrived meanwhile even
in mainstream languages such as Java or Scala and is nowadays
sufficiently efficient implemented to replace conventional Lex-Yacc
technology for most applications. It has the advantage to be
smoothlessly integrated into standard programs and allows for dynamic
grammar extensions. There is a more high-level structure
\inlinesml{Parse} providing specific combinators for the
command-language Isar:
where \inlineisar+Data.map+ is the update function resulting from the instantiation of the
functor \inlinesml|Generic_Data|. This code fragment uses operations from a library structure
\inlinesml+Symtab+ that were used to update the appropriate table for document objects in
the plugin-local state. Possible exceptions to the update operation were mapped to a system-global
error reporting function.
Finally, the view-aspects were handled by an API for parsing-combinators. The library structure
\inlinesml+Scan+ provides the operators:
\begin{sml}
op || : ('a -> 'b) * ('a -> 'b) -> 'a -> 'b
op -- : ('a -> 'b * 'c) * ('c -> 'd * 'e) -> 'a -> ('b * 'd) * 'e
op >> : ('a -> 'b * 'c) * ('b -> 'd) -> 'a -> 'd * 'c
op option : ('a -> 'b * 'a) -> 'a -> 'b option * 'a
op repeat : ('a -> 'b * 'a) -> 'a -> 'b list * 'a
\end{sml}
for alternative, sequence, and piping, as well as combinators for option and repeat. Parsing
combinators have the advantage that they can be smoothlessly integrated into standard programs,
and they enable the dynamic extension of the grammar. There is a more high-level structure
\inlinesml{Parse} providing specific combinators for the command-language Isar:
\begin{sml}[mathescape=false]
val attribute = Parse.position Parse.name
-- Scan.optional (Parse.$$$ "=" |-- Parse.!!! Parse.name) "";
val reference = Parse.position Parse.name
-- Scan.option (Parse.$$$ "::" |-- Parse.!!!
(Parse.position Parse.name));
val attributes = (Parse.$$$ "[" |-- (reference
-- (Scan.optional(Parse.$$$ ","
|-- (Parse.enum "," attribute))) []))
--| Parse.$$$ "]"
val attribute = Parse.position Parse.name
-- Scan.optional(Parse.$$$ "=" |-- Parse.!!! Parse.name)"";
val reference = Parse.position Parse.name
-- Scan.option (Parse.$$$ "::" |-- Parse.!!!
(Parse.position Parse.name));
val attributes =(Parse.$$$ "[" |-- (reference
-- (Scan.optional(Parse.$$$ ","
|--(Parse.enum ","attribute)))[]))--| Parse.$$$ "]"
\end{sml}
``Model'' and ``View'' parts were combined to ``parsers'' which were
registered in the interpreter toplevel of the Isabelle/Isar framework:
The ``model'' \inlineisar+declare_reference_opn+ and ``new'' \inlineisar+attributes+ parts were
combined via the piping operator and registered in the Isar toplevel:
\begin{sml}
val _ = Outer_Syntax.command @ {command_keyword "declare_reference"}
fun declare_reference_opn (((oid,_),_),_) =
(Toplevel.theory (DOF_core.declare_object_global oid))
val _ = Outer_Syntax.command <@>{command_keyword "declare_reference"}
"declare document reference"
(attributes >> (fn (((oid,pos),cid),doc_attrs) =>
(Toplevel.theory (DOF_core.declare_object_global oid))));
(attributes >> declare_reference_opn);
\end{sml}
\<close>
text\<open>
Altogether, this gives the \emph{extension} of the Isar syntax
allowing to parse and interpret the new \emph{command} in a subsequent
\verb+.thy+ file:
Altogether, this gives the extension of Isabelle/HOL with Isar syntax and semantics for the
new \emph{command}:
\begin{isar}
declare_reference [lal::requirement, alpha="main", beta=42]
\end{isar}
where we ignore the semantics at the moment. The construction also
generates implicitely some markup information; for example, when
hovering over the \inlineisar|declare_reference| command in a
front-end supporting PIDE, a popup window with the text: ``declare
document reference'' will appear.
The construction also generates implicitly some markup information; for example, when hovering over
the \inlineisar|declare_reference| command in the IDE, a popup window with the text:
``declare document reference'' will appear.
\<close>
subsection*["sec:prog_anti"::technical]\<open>Programming Text Antiquotations\<close>
section\<open>Programming Antiquotations\<close>
text\<open>
As mentioned in the introduction, Isabelle/Isar is configured with a
number of standard commands to annotate formal definitions and proofs
with text---the latter is used in particular to generate PDF and HTML
documents with internal hypertext-references. Inside these text
commands, a number of predefined antiquotations can be inserted which
were checked and decorated with markup during editing.
\<close>
text\<open>
Moreover, there is an API for user-defined antiquotations. It follows
the lines of the MVC style system extensions presented in the previous
section. An excerpt of the table defining some antiquotations can be
found in \verb+thy_output.ML+ of the Isabelle sources and give the
basic idea:
The definition and registration of text antiquotations and ML-antiquotations is
similar in principle: based on a number of combinators, new
user-defined antiquotation syntax and semantics can be added to the
system that works on the internal plugin-data freely. For example, in
\begin{sml}
val _ = Theory.setup
(basic_entity @ {binding term} (stp -- Args.term) pretty_term_style #>
basic_entity @ {binding prop} (stp -- Args.prop) pretty_term_style #>
... )
val _ = Theory.setup(
Thy_Output.antiquotation <@>{binding docitem}
docitem_antiq_parser
(docitem_antiq_gen default_cid) #>
ML_Antiquotation.inline <@>{binding docitem_value}
ML_antiq_docitem_value)
\end{sml}
where \inlinesml+stp+ (=\inlinesml+Term_Style.parse+),
\inlinesml+Args.term+ and \inlinesml+Args.prop+ are parsing
combinators from higher Isar-API's (that actually do the type checking
in the surrounding HOL context) and \inlinesml+pretty_term_style+ an
operation pretty-printing the parsed term for one of the possible
targets HTML or \LaTeX{} (converted to \verb+.pdf+ in a
post-compilation process). The construct \inlinesml+@ {binding term}+
decorates the keyword ``term'' with positioning markup (allowing
navigating to this defining place in \verb+thy_output.ML+ by
hyperlinking) and \inlinesml+Theory.setup+ the operator that
registers the entire parser/checker into the Isar framework.
\<close>
text\<open>
Together, this establishes the syntax and semantics of, for example,
the antiquotation:
the text antiquotation \inlineisar+docitem+ is declared and bounded to a parser
for the argument syntax and the overall semantics. This code defines a generic
antiquotation to be used in text elements such as
\begin{isar}
text{* @{term "fac 5"} *}
text\<Open>as defined in <@>{docitem \<open>d1\<close>} ...\<Close>
\end{isar}
inside the text command. A key contribution of this paper is that such
SML code is generated \emph{automatically} from an \isadof ontology
definition introduced in the subsequent section.
The subsequent registration \inlineisar+docitem_value+ binds code to a
ML-antiquotation usable in an ML context for user-defined extensions; it permits
the access to the current ``value'' of document element, \ie; a term with the
entire update history.
It is possible to generate antiquotations \emph{dynamically}, as a
consequence of a class definition in ODL. The processing of the
ODL class \inlineisar+d$$efinition+ also \emph{generates} a text
antiquotation \inlineisar+<@>{definition \<open>d1\<close>}+, which works
similar to \inlineisar+<@>{docitem \<open>d1\<close>}+ except for an
additional type-check that assures that \inlineisar+d1+ is a
reference to a definition. These type-checks support the subclass
hierarchy.
\<close>
section\<open>Implementing Second-level Type-Checking\<close>
text\<open>
On expressions for attribute values, for which we chose to use HOL syntax
to avoid that users need to learn another syntax, we implemented
an own pass over type-checked terms. Stored in the late-binding table
\inlineisar+ISA_transformer_tab+, we register for each inner-syntax-annotation (ISA's),
a function of type
\begin{sml}
theory -> term * typ * Position.T -> term option
\end{sml}
Executed in a second pass of term parsing, ISA's may just return
\inlineisar+None+. This is adequate for ISA's just performing some checking in
the logical context \inlineisar+theory+; ISA's of this kind report errors
by exceptions. In contrast, \emph{transforming} ISA's will yield a term; this
is adequate, for example, by replacing a string-reference to some term denoted by it.
This late-binding table is also used to generate standard
inner-syntax-antiquotations from a \inlineisar+doc_class+.
\<close>
section\<open>Programming Class Invariants\<close>
text\<open>
For the moment, there is no high-level syntax for the definition of
class invariants. A formulation, in SML, of the first class-invariant
in \autoref{sec:class_inv} is straight-forward:
\begin{sml}
fun check_result_inv oid {is_monitor:bool} ctxt =
let val kind = compute_attr_access ctxt "kind" oid <@>{here} <@>{here}
val prop = compute_attr_access ctxt "property" oid <@>{here} <@>{here}
val tS = HOLogic.dest_list prop
in case kind_term of
@{term "proof"} => if not(null tS) then true
else error("class result invariant violation")
| _ => false
end
val _ = Theory.setup (DOF_core.update_class_invariant
"tiny_cert.result" check_result_inv)
\end{sml}
The \inlinesml{setup}-command (last line) registers the
\inlineisar+check_result_inv+ function into the \isadof kernel, which
activates any creation or modification of an instance of
\inlineisar+result+. We cannot replace \inlineisar+compute_attr_access+
by the corresponding antiquotation
\inlineisar+<@>{docitem_value kind::oid}+, since \inlineisar+oid+ is
bound to a variable here and can therefore not be statically expanded.
Isabelle's code generator can in principle generate class invariant code
from a high-level syntax. Since class-invariant checking can result in an
efficiency problem ---they are checked on any edit--- and since invariant
programming involves a deeper understanding of ontology modeling and the
\isadof implementation, we backed off from using this technique so far.
\<close>
section\<open>Implementing Monitors\<close>
text\<open>
Since monitor-clauses have a regular expression syntax, it is natural
to implement them as deterministic automata. These are stored in the
\inlineisar+docobj_tab+ for monitor-objects in the \isadof component.
We implemented the functions:
\begin{sml}
val enabled : automaton -> env -> cid list
val next : automaton -> env -> cid -> automaton
\end{sml}
where \inlineisar+env+ is basically a map between internal automaton
states and class-id's (\inlineisar+cid+'s). An automaton is said to be
\emph{enabled} for a class-id, iff it either occurs in its accept-set
or its reject-set (see \autoref{sec:monitors}). During top-down
document validation, whenever a text-element is encountered, it is
checked if a monitor is \emph{enabled} for this class; in this case,
the \inlineisar+next+-operation is executed. The transformed automaton
recognizing the rest-language is stored in \inlineisar+docobj_tab+ if
possible; otherwise, if \inlineisar+next+ fails, an error is
reported. The automata implementation is, in large parts, generated
from a formalization of functional automata~\cite{Functional-Automata-AFP}.
\<close>
(*<*)
end
(*>*)