Section 5.1.

2019-08-11 16:52:56 +01:00 · 2019-08-11 16:52:56 +01:00 · 02377de0d2
parent 4f242c06ba
commit 02377de0d2
1 changed files with 211 additions and 152 deletions
--- a/examples/technical_report/Isabelle_DOF-Manual/05_Implementation.thy
+++ b/examples/technical_report/Isabelle_DOF-Manual/05_Implementation.thy
@ -7,182 +7,241 @@ begin

 chapter*[isadof_developers::text_section]\<open>Extending \isadof\<close>
 text\<open>
-In this section, we introduce our framework, called \isadof. \isadof
-is based on several design-decisions:
-\begin{compactitem}
-\item the entire \isadof is conceived as ``pure add-on'', \ie, we
-  deliberately resign on the possibility to modify Isabelle itself
-  (thus accepting a minor loss in performance and some additional
-  complexity in the documentation generation process)
-\item we decided to identify the ontology types with the Isabelle/HOL
-  type system, \ie, we reuse the same infrastructure for parsers and
-  type-checkers, but use the types on the meta-level of the document
-  structure
-\item we decided to make the markup-generation by own means in order
-  to adapt it as good as possible to the needs of tracking the linking
-  in documents.
-\end{compactitem}
+  In this chapter, we describe the basic implementation aspects of \isadof, which is based on 
+  the following design-decisions:
+  \<^item> the entire \isadof is a ``pure add-on,'' \ie, we deliberately resign on the possibility to 
+    modify Isabelle itself.
+  \<^item> we made a small exception to this rule: the \isadof package modifies in its installation 
+    about 10 lines in the \LaTeX-generator (\path{src/patches/thy_output.ML}).
+  \<^item> we decided to make the markup-generation by itself to adapt it as well as possible to the 
+    needs of tracking the linking in documents.
+  \<^item> \isadof is deeply integrated into the Isabelle's IDE (PIDE) to give immediate feedback during 
+    editing and other forms of document evolution.
+\<close>
+text\<open>
+  Semantic macros, as required by our document model, are called \<^emph>\<open>document antiquotations\<close>
+  in the Isabelle literature~@{cite "wenzel:isabelle-isar:2017"}. While Isabelle's code-antiquotations 
+  are an old concept going back to Lisp and having found via SML and OCaml their ways into modern 
+  proof systems, special annotation syntax inside documentation comments have their roots in 
+  documentation generators such as Javadoc. Their use, however, as a mechanism to embed 
+  machine-checked \<^emph>\<open>formal content\<close> is usually very limited and also lacks 
+  IDE support.
 \<close>

+section\<open>\isadof: A User-Defined Plugin in Isabelle/Isar\<close>
+text\<open>
+  A plugin in Isabelle starts with defining the local data and registering it in the framework. As 
+  mentioned before, contexts are structures with independent cells/compartments having three
+  primitives \inlinesml+init+, \inlinesml+extend+ and \inlinesml+merge+. Technically this is done by 
+  instantiating a functor \inlinesml+Generic_Data+, and the following fairly typical code-fragment 
+  is drawn from \isadof:

-subsection*["sec:plugins"::technical]\<open>Writing \isadof as User-Defined Plugin in Isabelle/Isar\<close>
-text\<open>
-Writing an own plugin in Isabelle starts with defining the local data
-and registering it in the framework.  As mentioned before, contexts
-are structures with independent cells/compartments having three
-primitives \inlinesml+init+, \inlinesml+extend+ and
-\inlinesml+merge+. Technically this is done by instantiating a functor
-\inlinesml+Generic_Data+, and the following fairly typical
-code-fragment is already drawn from \isadof:
 \begin{sml}
-  structure Data = Generic_Data
-  (  type T = docobj_tab * docclass_tab 
-     val empty  = (initial_docobj_tab, initial_docclass_tab) 
-     val extend = I
-     fun merge((d1,c1),(d2,c2))  = (merge_docobj_tab (d1, d2), 
-                                    merge_docclass_tab(c1,c2))
-  );
+structure Data = Generic_Data
+(  type T = docobj_tab * docclass_tab * ...
+   val empty  = (initial_docobj_tab, initial_docclass_tab, ...) 
+   val extend = I
+   fun merge((d1,c1,...),(d2,c2,...)) = (merge_docobj_tab  (d1,d2,...), 
+                                         merge_docclass_tab(c1,c2,...))
+);
 \end{sml}
-with some tables \inlinesml+docobj_tab+ and \inlinesml+docclass_tab+
-(not explained here) capturing the specific data of the application in
-mind, \ie, a table of document classes and another one capturing the
-document class instances.
-\enlargethispage{2\baselineskip}
-\<close>
-text\<open>
-All the text samples shown here have to be in the context of an SML
-file or in an \inlineisar|ML{* ... *}| command inside a \verb|.thy|
-file, which has the effect to type-check and compile these text
-elements by the underlying SML compiler.
-\<close>
-text\<open>
-Operations are organized in a model-view-controller scheme, where
-Isabelle/Isar provides the controller part. A typical model operation
-has the type:\<close>
-text\<open>
+  where the table \inlinesml+docobj_tab+ manages document classes and \inlinesml+docclass_tab+ the 
+  environment for class definitions (inducing the inheritance relation). Other tables capture, \eg, 
+  the class invariants, inner-syntax antiquotations. Operations follow the MVC-pattern, where 
+  Isabelle/Isar provides the controller part. A typical model operation has the type:
+
 \begin{sml}
-  val opn :: <args_type> -> Context.generic -> Context.generic
+val opn :: <args_type> -> Context.generic -> Context.generic
 \end{sml}
-\ie, it represents a transformation on contexts. For example, the
-operation of declaring a local reference in the context is presented
-as follows:
+  representing a transformation on system contexts. For example, the operation of declaring a local 
+  reference in the context is presented as follows:
+
 \begin{sml}
-  fun declare_object_local oid ctxt  =
-  let fun decl {tab,maxano} = {tab=Symtab.update_new(oid,NONE) tab,
-                               maxano=maxano}
-  in  (Data.map(apfst decl)(ctxt)
-    handle Symtab.DUP _ =>
-                error("multiple declaration of document reference"))
-  end
-\end{sml} 
-\<close>
-text\<open>
-where \inlineisar+Data.map+ is the update function resulting from the
-above functor instantiation.  This code fragment uses operations from
-a library structure \inlinesml+Symtab+ that were used to update the
-appropriate table for document objects in the plugin-local
-state. Possible exceptions to the update operation were mapped to a
-system-global error reporting function.
-\<close>
-text\<open>
-Finally, the view-aspects where handled by an API for
-parsing-combinators. The library structure \inlinesml+Scan+ provides,
-for example, the operators:
-\begin{sml}
-  op ||     : ('a -> 'b) * ('a -> 'b) -> 'a -> 'b
-  op --     : ('a -> 'b * 'c) * ('c -> 'd * 'e) -> 'a -> ('b * 'd) * 'e
-  op |--    : ('a -> 'b * 'c) * ('c -> 'd * 'e) -> 'a -> 'd * 'e
-  op --|    : ('a -> 'b * 'c) * ('c -> 'd * 'e) -> 'a -> 'b * 'e
-  op >>     : ('a -> 'b * 'c) * ('b -> 'd) -> 'a -> 'd * 'c
-  op option : ('a -> 'b * 'a) -> 'a -> 'b option * 'a
-  op repeat : ('a -> 'b * 'a) -> 'a -> 'b list * 'a
+fun declare_object_local oid ctxt  =
+let fun decl {tab,maxano} = {tab=Symtab.update_new(oid,NONE) tab,
+                             maxano=maxano}
+in  (Data.map(apfst decl)(ctxt)
+  handle Symtab.DUP _ =>
+              error("multiple declaration of document reference"))
+end
 \end{sml}                                            
-for alternative, sequence, sequence-ignore-left,
-sequence-ignore-right, and piping, as well as combinators for option
-and repeat. The parsing combinator technology arrived meanwhile even
-in mainstream languages such as Java or Scala and is nowadays
-sufficiently efficient implemented to replace conventional Lex-Yacc
-technology for most applications. It has the advantage to be
-smoothlessly integrated into standard programs and allows for dynamic
-grammar extensions.  There is a more high-level structure
-\inlinesml{Parse} providing specific combinators for the
-command-language Isar:
+  where \inlineisar+Data.map+ is the update function resulting from the instantiation of the 
+  functor \inlinesml|Generic_Data|. This code fragment uses operations from a library structure 
+  \inlinesml+Symtab+ that were used to update the appropriate table for document objects in
+  the plugin-local state. Possible exceptions to the update operation were mapped to a system-global 
+  error reporting function.
+
+  Finally, the view-aspects were handled by an API for parsing-combinators. The library structure 
+  \inlinesml+Scan+ provides the operators:
+
+\begin{sml}
+op ||     : ('a -> 'b) * ('a -> 'b) -> 'a -> 'b
+op --     : ('a -> 'b * 'c) * ('c -> 'd * 'e) -> 'a -> ('b * 'd) * 'e
+op >>     : ('a -> 'b * 'c) * ('b -> 'd) -> 'a -> 'd * 'c
+op option : ('a -> 'b * 'a) -> 'a -> 'b option * 'a
+op repeat : ('a -> 'b * 'a) -> 'a -> 'b list * 'a 
+\end{sml}                                            
+  for alternative, sequence, and piping, as well as combinators for option and repeat. Parsing 
+  combinators have the advantage that they can be smoothlessly integrated into standard programs, 
+  and they enable the dynamic extension of the grammar. There is a more high-level structure 
+  \inlinesml{Parse} providing specific combinators for the command-language Isar:
+
 \begin{sml}[mathescape=false]
-  val attribute = Parse.position Parse.name 
-        -- Scan.optional (Parse.$$$ "=" |-- Parse.!!! Parse.name) "";
-  val reference = Parse.position Parse.name 
-        -- Scan.option (Parse.$$$ "::" |-- Parse.!!!
-                         (Parse.position Parse.name));
-  val attributes =  (Parse.$$$ "[" |-- (reference 
-                     -- (Scan.optional(Parse.$$$ ","
-                         |-- (Parse.enum "," attribute))) []))
-                                  --| Parse.$$$ "]"              
+val attribute = Parse.position Parse.name 
+              -- Scan.optional(Parse.$$$ "=" |-- Parse.!!! Parse.name)"";
+val reference = Parse.position Parse.name 
+              -- Scan.option (Parse.$$$ "::" |-- Parse.!!!
+                              (Parse.position Parse.name));
+val attributes =(Parse.$$$ "[" |-- (reference 
+               -- (Scan.optional(Parse.$$$ ","
+                   |--(Parse.enum ","attribute)))[]))--| Parse.$$$ "]"              
 \end{sml}                                            
-``Model'' and ``View'' parts were combined to ``parsers'' which were
-registered in the interpreter toplevel of the Isabelle/Isar framework:
+
+  The ``model'' \inlineisar+declare_reference_opn+ and ``new'' \inlineisar+attributes+ parts were 
+  combined via the piping operator and registered in the Isar toplevel:
+
 \begin{sml}
-  val _ = Outer_Syntax.command @ {command_keyword "declare_reference"}
+  fun declare_reference_opn (((oid,_),_),_) =
+                 (Toplevel.theory (DOF_core.declare_object_global oid))
+  val _ = Outer_Syntax.command <@>{command_keyword "declare_reference"}
          "declare document reference"
-          (attributes >> (fn (((oid,pos),cid),doc_attrs) =>  
-             (Toplevel.theory (DOF_core.declare_object_global oid))));
+          (attributes >> declare_reference_opn);
 \end{sml}
-\<close>
-text\<open>
-Altogether, this gives the \emph{extension} of the Isar syntax
-allowing to parse and interpret the new \emph{command} in a subsequent
-\verb+.thy+ file:
+
+  Altogether, this gives the extension of Isabelle/HOL with Isar syntax and semantics for the 
+  new \emph{command}:
+
 \begin{isar}
 declare_reference [lal::requirement, alpha="main", beta=42]
 \end{isar}
-where we ignore the semantics at the moment. The construction also
-generates implicitely some markup information; for example, when
-hovering over the \inlineisar|declare_reference| command in a
-front-end supporting PIDE, a popup window with the text: ``declare
-document reference'' will appear.
+
+The construction also generates implicitly some markup information; for example, when hovering over 
+the \inlineisar|declare_reference| command in the IDE, a popup window with the text: 
+``declare document reference'' will appear.
 \<close>

-subsection*["sec:prog_anti"::technical]\<open>Programming Text Antiquotations\<close>
+section\<open>Programming Antiquotations\<close>
 text\<open>
-As mentioned in the introduction, Isabelle/Isar is configured with a
-number of standard commands to annotate formal definitions and proofs
-with text---the latter is used in particular to generate PDF and HTML
-documents with internal hypertext-references. Inside these text
-commands, a number of predefined antiquotations can be inserted which
-were checked and decorated with markup during editing. 
-\<close>
-text\<open>
-Moreover, there is an API for user-defined antiquotations.  It follows
-the lines of the MVC style system extensions presented in the previous
-section.  An excerpt of the table defining some antiquotations can be
-found in \verb+thy_output.ML+ of the Isabelle sources and give the
-basic idea:
+The definition and registration of text antiquotations and ML-antiquotations is
+similar in principle: based on a number of combinators, new
+user-defined antiquotation syntax and semantics can be added to the
+system that works on the internal plugin-data freely. For example, in
+
 \begin{sml}
-  val _ = Theory.setup
-  (basic_entity @ {binding term} (stp -- Args.term) pretty_term_style #>
-   basic_entity @ {binding prop} (stp -- Args.prop) pretty_term_style #>
-    ... )
+val _ = Theory.setup(
+          Thy_Output.antiquotation <@>{binding docitem} 
+                                   docitem_antiq_parser 
+                                   (docitem_antiq_gen default_cid) #>
+          ML_Antiquotation.inline  <@>{binding docitem_value} 
+                                   ML_antiq_docitem_value)
 \end{sml}
-where \inlinesml+stp+ (=\inlinesml+Term_Style.parse+),
-\inlinesml+Args.term+ and \inlinesml+Args.prop+ are parsing
-combinators from higher Isar-API's (that actually do the type checking
-in the surrounding HOL context) and \inlinesml+pretty_term_style+ an
-operation pretty-printing the parsed term for one of the possible
-targets HTML or \LaTeX{} (converted to \verb+.pdf+ in a
-post-compilation process). The construct \inlinesml+@ {binding term}+
-decorates the keyword ``term'' with positioning markup (allowing 
-navigating to this defining place in \verb+thy_output.ML+ by
-hyperlinking) and \inlinesml+Theory.setup+ the operator that
-registers the entire parser/checker into the Isar framework.
-\<close>
-text\<open>
-Together, this establishes the syntax and semantics of, for example,
-the antiquotation:
+
+the text antiquotation \inlineisar+docitem+ is declared and bounded to a parser
+for the argument syntax and the overall semantics. This code defines a generic
+antiquotation to be used in text elements such as
+
 \begin{isar}
-text{* @{term "fac 5"} *}
+text\<Open>as defined in <@>{docitem \<open>d1\<close>} ...\<Close>
 \end{isar}
-inside the text command. A key contribution of this paper is that such
-SML code is generated \emph{automatically} from an \isadof ontology
-definition introduced in the subsequent section.
+
+The subsequent registration \inlineisar+docitem_value+ binds code to a
+ML-antiquotation usable in an ML context for user-defined extensions; it permits
+the access to the current ``value'' of document element, \ie; a term with the
+entire update history.
+
+It is possible to generate antiquotations \emph{dynamically}, as a
+consequence of a class definition in ODL. The processing of the
+ODL class \inlineisar+d$$efinition+ also \emph{generates} a text
+antiquotation \inlineisar+<@>{definition \<open>d1\<close>}+, which works
+similar to \inlineisar+<@>{docitem \<open>d1\<close>}+ except for an
+additional type-check that assures that \inlineisar+d1+ is a
+reference to a definition. These type-checks support the subclass
+hierarchy.
 \<close>

+section\<open>Implementing Second-level Type-Checking\<close>
+text\<open>
+On expressions for attribute values, for which we chose to use HOL syntax
+to avoid that users need to learn another syntax, we implemented
+an own pass over type-checked terms. Stored in the late-binding table
+\inlineisar+ISA_transformer_tab+, we register for each inner-syntax-annotation (ISA's),
+a function of type
+
+\begin{sml}
+   theory -> term * typ * Position.T -> term option
+\end{sml}
+
+Executed in a second pass of term parsing, ISA's may just return
+\inlineisar+None+. This is adequate for ISA's just performing some checking in
+the logical context \inlineisar+theory+; ISA's of this kind report errors 
+by exceptions. In contrast, \emph{transforming} ISA's will yield a term; this
+is adequate, for example, by replacing a string-reference to some term denoted by it.
+This late-binding table is also used to generate standard
+inner-syntax-antiquotations from a \inlineisar+doc_class+.
+\<close>
+
+section\<open>Programming Class Invariants\<close>
+text\<open>
+For the moment, there is no high-level syntax for the definition of
+class invariants. A formulation, in SML, of the first class-invariant
+in \autoref{sec:class_inv} is straight-forward:
+
+\begin{sml}
+fun check_result_inv oid {is_monitor:bool} ctxt =
+  let val kind = compute_attr_access ctxt "kind" oid <@>{here} <@>{here} 
+      val prop = compute_attr_access ctxt "property" oid <@>{here} <@>{here} 
+      val tS = HOLogic.dest_list prop
+  in  case kind_term of
+       @{term "proof"} => if not(null tS) then true
+                          else error("class result invariant violation") 
+      | _ => false
+  end
+ val _ = Theory.setup (DOF_core.update_class_invariant 
+                                "tiny_cert.result" check_result_inv)
+\end{sml}
+
+The \inlinesml{setup}-command (last line) registers the
+\inlineisar+check_result_inv+ function into the \isadof kernel, which
+activates any creation or modification of an instance of
+\inlineisar+result+.  We cannot replace \inlineisar+compute_attr_access+ 
+by the corresponding antiquotation
+\inlineisar+<@>{docitem_value kind::oid}+, since \inlineisar+oid+ is
+bound to a variable here and can therefore not be statically expanded.
+
+Isabelle's code generator can in principle generate class invariant code 
+from a high-level syntax. Since class-invariant checking can result in an
+efficiency problem ---they are checked on any edit--- and since invariant 
+programming involves a deeper understanding of ontology modeling and the
+\isadof implementation, we backed off from using this technique so far.
+\<close>
+
+section\<open>Implementing Monitors\<close>
+text\<open>
+Since monitor-clauses have a regular expression syntax, it is natural
+to implement them as deterministic automata. These are stored in the 
+\inlineisar+docobj_tab+ for monitor-objects in the \isadof component. 
+We implemented the functions:
+
+\begin{sml}
+   val  enabled : automaton -> env -> cid list
+   val  next    : automaton -> env -> cid -> automaton
+\end{sml}
+
+where \inlineisar+env+ is basically a map between internal automaton
+states and class-id's (\inlineisar+cid+'s). An automaton is said to be
+\emph{enabled} for a class-id, iff it either occurs in its accept-set
+or its reject-set (see \autoref{sec:monitors}). During top-down
+document validation, whenever a text-element is encountered, it is
+checked if a monitor is \emph{enabled} for this class; in this case,
+the \inlineisar+next+-operation is executed. The transformed automaton
+recognizing the rest-language is stored in \inlineisar+docobj_tab+ if
+possible; otherwise, if \inlineisar+next+ fails, an error is
+reported. The automata implementation is, in large parts, generated
+from a formalization of functional automata~\cite{Functional-Automata-AFP}.
+
+\<close>
+
+
+(*<*)
 end
+(*>*)