- experiments with document ontologies

- substantially more input on "MyCommentedIsabelle"
  (should this be another TR example?)
This commit is contained in:
Burkhart Wolff 2018-11-07 06:00:01 +01:00
parent c67f39c236
commit 3e6dc80445
4 changed files with 260 additions and 100 deletions

View File

@ -1,24 +1,64 @@
chapter \<open>A More Or Less Structured File with my Personal, Eclectic Comments
on the Isabelle 2017 Infrastructure \<close>
text" Covering Parsers, Pretty-Printers and Presentation, and Kernel. "
text\<open>This is a programming-tutorial, complementary to the "The Isabelle Cookbook" in
\url{https://nms.kcl.ac.uk/christian.urban/Cookbook/}.\<close>
theory MyCommentedIsabelle
imports Main
begin
section{* SML and Fundamental SML libraries *}
(* General combinators (in Pure/General/basics.ML)*)
subsection "Text, Antiquotations, and the Isabelle/Pure bootstrap"
text\<open>Isabelle is written in SML, the "Standard Meta-Language", which is
an impure functional programming language allowing, in principle, variables and side-effects: \<close>
ML\<open> val X = Unsynchronized.ref 0;
X:= !X + 1;
X\<close>
text\<open>However, since Isabelle is a platform involving parallel execution and concurrent computing, and,
as an interactive environment, involves backtracking / re-evaluation as a consequence of user
interaction, imperative programming is discouraged and almost never used in the entire Isabelle
code-base. Isabelle rather implements its own top-level for an extensible language, Isar, in which
domain-specific components can be developed and integrated into the system on the fly. Thus, the
Isabelle code-base consists mainly of SML and \<^verbatim>\<open>.thy\<close>-files containing mixtures of
Isar commands and SML. Furthermore, a rich text annotation language is provided for a specific
style of literate programming. \<close>
text\<open>Text enriched by "formal content", i.e. machine-checked, typed references
to all sorts of entities in the form of \<^emph>\<open>antiquotations\<close>, allows for coping with
sources that evolve rapidly and are developed by distributed teams,
as is typical in open-source developments.
A paradigmatic example of antiquotations in text and program snippets follows:
\<close>
text \<open> @{footnote \<open>sdf\<close>}, @{file "$ISABELLE_HOME/src/Pure/ROOT.ML"}\<close>
ML\<open> val x = @{thm refl}\<close>
text\<open>In a way, literate specification with a maximum of formal content is a way to
ensure "Agile Development", at least with respect to its objectives, albeit not
by its popular methods and processes like SCRUM.
A maximum of formal content inside the text documentation also ensures the
consistency of the present text with the underlying Isabelle version.\<close>
text\<open>It is instructive to study the fundamental boot order of the Isabelle system;
it is written in the Isar format and gives an idea of the global module dependencies:
@{file "$ISABELLE_HOME/src/Pure/ROOT.ML"}. Loading this file
(for example by control-clicking or command-clicking the hyperlink in the
antiquotation in Isabelle/jEdit) allows the Isabelle IDE
to support hyperlinking \<^emph>\<open>inside\<close> the Isabelle sources.\<close>
subsection "Elements of the SML library";
text\<open>Elements of the @{file "$ISABELLE_HOME/src/Pure/General/basics.ML"} SML library
are the basic exceptions. Note that exceptions should be caught individually; uncaught exceptions,
except those generated by the specific "error" function, are discouraged in Isabelle
source programming since they might produce races. Finally, a number of commonly
used "squigglish" combinators is listed:
\<close>
ML{*
(*
exception Bind
@ -49,17 +89,45 @@ I: 'a -> 'a;
K: 'a -> 'b -> 'a
*}
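text\<open>A minimal sketch of these combinators in action; the values below are
illustrative only:\<close>
ML\<open>
val five   = 2 |> (fn x => x + 3);                      (* reverse application *)
val pair   = ("x", 0) |-> (fn s => fn n => (s, n + 1)); (* uncurried reverse application *)
val seven  = ((fn x => x * 2) #> (fn x => x + 1)) 3;    (* forward composition *)
val answer = I 42;                                      (* identity *)
val one    = K 1 "ignored";                             (* constant function *)
\<close>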
text\<open>Some basic examples of the programming style using these combinators can be found in
"The Isabelle Cookbook", section 2.3.\<close>
text\<open>An omnipresent data-structure in the Isabelle SML sources is the table,
implemented in @{file "$ISABELLE_HOME/src/Pure/General/table.ML"}. These
generic tables come in an efficient purely functional implementation using
balanced 2-3 trees. Key operations are:\<close>
ML\<open>
signature TABLE_reduced =
sig
type key
type 'a table
exception DUP of key
exception SAME
exception UNDEF of key
val empty: 'a table
val dest: 'a table -> (key * 'a) list
val keys: 'a table -> key list
val lookup_key: 'a table -> key -> (key * 'a) option
val lookup: 'a table -> key -> 'a option
val defined: 'a table -> key -> bool
val update: key * 'a -> 'a table -> 'a table
(* ... *)
end
\<close>
text\<open>... where \<^verbatim>\<open>key\<close> is usually just a synonym for string.\<close>
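text\<open>A minimal usage sketch, assuming the standard \<^verbatim>\<open>Symtab\<close> instantiation
of the table functor for string keys:\<close>
ML\<open>
val tab = Symtab.empty
          |> Symtab.update ("one", 1)
          |> Symtab.update ("two", 2);
val res  = Symtab.lookup tab "two";       (* SOME 2 *)
val keys = Symtab.keys tab;               (* ["one", "two"] *)
val b    = Symtab.defined tab "three";    (* false *)
\<close>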
section {* The Nano-Kernel: Contexts, (Theory)-Contexts, (Proof)-Contexts *}
text\<open> What I call the 'Nano-Kernel' in Isabelle can also be seen as an acyclic theory graph.
The meat of it can be found in the file @{file "$ISABELLE_HOME/src/Pure/context.ML"}.
My notion is a bit criticisable since this component, which provides the type of @{ML_type theory}
and @{ML_type proof_context}, is not that Nano after all.
However, these types are pretty empty place-holders at that level, and the content of
@{file "$ISABELLE_HOME/src/Pure/theory.ML"} is registered much later.
The sources themselves mention it as "Fundamental Structure".
In principle, theories and proof contexts could be REGISTERED as user data inside contexts.
The chosen specialization is therefore an acceptable meddling of the abstraction "Nano-Kernel"
and its application context: Isabelle.
Makarius himself says about this structure:
@ -68,7 +136,7 @@ Makarius himself says about this structure:
monotonic development graph and history support. Generic proof
contexts with arbitrarily typed data."
In my words: a context is essentially a container with
\<^item> an id
\<^item> a list of parents (so: the graph structure)
\<^item> a time stamp and
@ -78,9 +146,13 @@ A context is essentially a container with
A context comes in the form of three 'flavours':
\<^item> theories: @{ML_type theory}'s, containing a syntax and axioms, but also
  user-defined data and configuration information,
\<^item> proof contexts: @{ML_type Proof.context}, containing theories but also
  additional information if Isar goes into proof mode --- in general a richer
  structure than theories, coping also with fixes, facts, and goals in order
  to support the structured Isar proof style,
\<^item> generic contexts: @{ML_type Context.generic}, i.e. the sum of both.
All contexts have to be seen as mutable; so there are usually transformations defined on them,
which are possible as long as a particular protocol (begin_thy - end_thy etc.) is respected.
@ -90,7 +162,7 @@ a context.
\<close>
subsection\<open> Mechanism 1 : Core Interface. \<close>
text\<open>To be found in @{file "$ISABELLE_HOME/src/Pure/context.ML"}:\<close>
ML{*
Context.parents_of: theory -> theory list;
@ -101,9 +173,6 @@ Context.proof_of : Context.generic -> Proof.context;
Context.certificate_theory_id : Context.certificate -> Context.theory_id;
Context.theory_name : theory -> string;
Context.map_theory: (theory -> theory) -> Context.generic -> Context.generic;
(* Theory.map_thy; *)
open Context; (* etc *)
*}
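text\<open>A minimal sketch of these operations applied to the current theory context;
the resulting values depend on the ambient theory, of course:\<close>
ML\<open>
val thy     = @{theory};                             (* the theory at this position *)
val parents = Context.parents_of thy;                (* immediate ancestors *)
val name    = Context.theory_name thy;               (* here: "MyCommentedIsabelle" *)
val ctxt    = Context.proof_of (Context.Theory thy); (* coerce into a proof context *)
\<close>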
@ -125,33 +194,27 @@ structure Data = Generic_Data
val merge = merge
);
Data.get : generic -> Data.T;
Data.put : Data.T -> generic -> generic;
Data.map : (Data.T -> Data.T) -> generic -> generic;
*}
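text\<open>A minimal sketch of declaring one's own data slot; the structure name
\<^verbatim>\<open>Counter\<close> and its content are hypothetical, for illustration only:\<close>
ML\<open>
structure Counter = Generic_Data
(
  type T = int                       (* the type of the data slot *)
  val empty = 0                      (* its initial value *)
  val extend = I                     (* taken at theory checkpoints *)
  fun merge (i, j) = Int.max (i, j)  (* used when ancestor theories are merged *)
);
val incr = Counter.map (fn i => i + 1);
val n = Counter.get (Context.Theory @{theory});  (* 0, unless incremented before *)
\<close>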
section\<open> The LCF-Kernel: terms, types, theories, proof_contexts, thms \<close>
text\<open>The classical LCF-style \<^emph>\<open>kernel\<close> is about
\<^enum> Types and terms of a typed \<lambda>-Calculus including constant symbols,
free variables, \<lambda>-Binder and application,
\<^enum> An infrastructure to define types and terms, a \<^emph>\<open>signature\<close>,
  that also assigns types to constant symbols,
\<^enum> An abstract type of \<^emph>\<open>theorem\<close> and logical operations on them
\<^enum> (Isabelle specific): a notion of \<^emph>\<open>theory\<close>, i.e. a container
providing a signature and set (list) of theorems.
\<close>
ML{*
(*
signature CONTEXT =
sig
include BASIC_CONTEXT
(*theory context*)
type theory_id
val theory_id: theory -> theory_id
val timing: bool Unsynchronized.ref
val parents_of: theory -> theory list
val ancestors_of: theory -> theory list
val theory_id_name: theory_id -> string
val theory_name: theory -> string
val PureN: string
...
end
*)
*}
subsection{* Terms and Types *}
text \<open>A basic data-structure of the kernel is @{file "$ISABELLE_HOME/src/Pure/term.ML"} \<close>
ML{* open Term;
(*
type indexname = string * int
@ -241,7 +304,7 @@ val true = Sign.typ_instance @{theory} (ty', generalize_typ ty)
text\<open>... or more general variants thereof that are parameterized by the indexes for schematic
type variables instead of assuming just @{ML "0"}. \<close>
text\<open> Example:\<close>
ML\<open>val t = generalize_term @{term "[]"}\<close>
text
@ -296,6 +359,79 @@ Sign.syn_of : theory -> Syntax.syntax;
Sign.of_sort : theory -> typ * sort -> bool ;
\<close>
subsection{* Thm's and the Core Kernel *}
text\<open>
The basic constructors and operations on theorems are defined in @{file "$ISABELLE_HOME/src/Pure/thm.ML"};
a set of derived (Pure) inferences can be found in @{file "$ISABELLE_HOME/src/Pure/drule.ML"}.
They reflect the Pure logic depicted in a number of presentations such as
M. Wenzel, \<^emph>\<open>Parallel Proof Checking in Isabelle/Isar\<close>, PLMMS 2009, or ...
The main types provided by structure \<^verbatim>\<open>Thm\<close> are certified types @{ML_type ctyp},
certified terms @{ML_type cterm}, @{ML_type thm}, as well as conversions @{ML_type conv}.
\<close>
ML\<open>
signature BASIC_THM =
sig
type ctyp
type cterm
exception CTERM of string * cterm list
type thm
type conv = cterm -> thm
exception THM of string * int * thm list
end;
\<close>
text\<open>Certification of types and terms on the kernel-level is done by the generators:\<close>
ML\<open>
Thm.global_ctyp_of: theory -> typ -> ctyp;
Thm.ctyp_of: Proof.context -> typ -> ctyp;
Thm.global_cterm_of: theory -> term -> cterm;
Thm.cterm_of: Proof.context -> term -> cterm;
\<close>
text\<open>... which perform type-checking in the given theory context in order to make a type
or term "admissible" for the kernel.\<close>
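text\<open>A minimal sketch of certifying a type and a term in the current context:\<close>
ML\<open>
val cT = Thm.ctyp_of  @{context} @{typ "bool \<Rightarrow> bool"};
val ct = Thm.cterm_of @{context} @{term "A \<longrightarrow> A"};
(* ill-typed input is rejected with an exception during certification *)
\<close>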
text\<open>Besides a number of destructors on @{ML_type thm}'s, this true LCF-Kernel
provides the fundamental inference rules of Isabelle/Pure. \<close>
ML\<open>
(*inference rules*)
Thm.assume: cterm -> thm;
Thm.implies_intr: cterm -> thm -> thm;
Thm.implies_elim: thm -> thm -> thm;
Thm.forall_intr: cterm -> thm -> thm;
Thm.forall_elim: cterm -> thm -> thm;
Thm.reflexive: cterm -> thm;
Thm.symmetric: thm -> thm;
Thm.transitive: thm -> thm -> thm;
Thm.trivial: cterm -> thm;
Thm.generalize: string list * string list -> int -> thm -> thm;
Thm.instantiate: ((indexname*sort)*ctyp)list * ((indexname*typ)*cterm) list -> thm -> thm;
\<close>
text\<open>... where @{ML "Thm.trivial"} produces the elementary tautologies of the form @{prop "A \<Longrightarrow> A"},
an operation used to start a backward-style proof.\<close>
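text\<open>A minimal sketch: deriving @{prop "A \<Longrightarrow> A"} from the primitive rules by hand,
which is what @{ML "Thm.trivial"} yields in one step:\<close>
ML\<open>
val A   = Thm.cterm_of @{context} @{prop "A"};
val thm = Thm.implies_intr A (Thm.assume A);   (* "A \<Longrightarrow> A" *)
\<close>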
text\<open>The elementary conversions are:\<close>
ML\<open>
Thm.beta_conversion: bool -> conv;
Thm.eta_conversion: conv;
Thm.eta_long_conversion: conv;
\<close>
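text\<open>A minimal sketch of a conversion at work, producing a meta-equality theorem:\<close>
ML\<open>
val redex = Thm.cterm_of @{context} @{term "(\<lambda>x::nat. x) 2"};
val eq    = Thm.beta_conversion true redex;   (* "(\<lambda>x. x) 2 \<equiv> 2" *)
\<close>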
text\<open>On the level of @{ML_structure "Drule"}, a number of higher-level operations are established,
which are in part accessible via a number of forward-reasoning notations on the Isar level.\<close>
ML\<open>
op RSN: thm * (int * thm) -> thm;
op RS: thm * thm -> thm;
op RLN: thm list * (int * thm list) -> thm list;
op RL: thm list * thm list -> thm list;
op MRS: thm list * thm -> thm;
op OF: thm * thm list -> thm;
op COMP: thm * thm -> thm;
\<close>
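text\<open>A small sketch of forward resolution, available on the Isar level via
\<^verbatim>\<open>OF\<close> and \<^verbatim>\<open>THEN\<close>; the HOL facts \<^verbatim>\<open>conjI\<close> and \<^verbatim>\<open>conjunct1\<close>
are available here since this theory imports Main:\<close>
ML\<open>
val thm1 = @{thm conjI};       (* ?P \<Longrightarrow> ?Q \<Longrightarrow> ?P \<and> ?Q *)
val thm2 = @{thm conjunct1};   (* ?P \<and> ?Q \<Longrightarrow> ?P *)
val res  = thm1 RS thm2;       (* ?P \<Longrightarrow> ?Q \<Longrightarrow> ?P *)
\<close>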
subsection{* Theories *}
@ -357,10 +493,8 @@ val _ =
uncheck_terms = uncheck_terms});
*)
text{* Thus, Syntax_Phases does the actual work, including
markup generation and generation of reports.
Look at: *}
(*
fun check_typs ctxt raw_tys =
@ -497,7 +631,16 @@ ML{*
("l", {offset=26, id=-2769}), ("}", {offset=27, id=-2769})]
*)*}
subsection {*Scanning and combinator parsing. *}
text\<open>Scanning and combinator parsing are used on two levels:
\<^enum> outer syntax, that is, the syntax in which Isar commands are written, and
\<^enum> inner syntax, that is, the syntax in which lambda-terms, and in particular HOL terms, are written.
\<close>
text\<open>A constant problem for newcomers is this distinction. It makes it necessary
to use the " ... " quotes when switching to inner syntax, except when the expected
inner-syntax term consists of just a single literal.
\<close>
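text\<open>For instance, on the Isar level:\<close>
term "rev [a, b]"  (* a compound inner-syntax term: quotes required *)
term x             (* a single literal: the quotes may be omitted *)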
ML\<open>
Scan.peek : ('a -> 'b -> 'c * 'd) -> 'a * 'b -> 'c * ('a * 'd);
Scan.optional: ('a -> 'b * 'a) -> 'b -> 'a -> 'b * 'a;
@ -542,34 +685,46 @@ Parse.position: 'a parser -> ('a * Position.T) parser;
Parse.position Args.cartouche_input;
\<close>
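text\<open>A minimal sketch of the combinator style, scanning a list of symbols
(the input is illustrative):\<close>
ML\<open>
val scan_a = Scan.one (fn s : string => s = "a");
val (hits, rest) = Scan.repeat scan_a (Symbol.explode "aab");
(* hits = ["a", "a"], rest = ["b"] *)
\<close>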
text{* Inner syntax parsing combinators for elementary Isabelle lexemes *}
ML\<open>
Syntax.parse_sort : Proof.context -> string -> sort;
Syntax.parse_typ : Proof.context -> string -> typ;
Syntax.parse_term : Proof.context -> string -> term;
Syntax.parse_prop : Proof.context -> string -> term;
Syntax.check_term : Proof.context -> term -> term;
Syntax.check_props: Proof.context -> term list -> term list;
Syntax.uncheck_sort: Proof.context -> sort -> sort;
Syntax.uncheck_typs: Proof.context -> typ list -> typ list;
Syntax.uncheck_terms: Proof.context -> term list -> term list;\<close>
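text\<open>A minimal sketch of the two-stage discipline: parsing yields a raw pre-term,
which is type-checked in a separate step:\<close>
ML\<open>
val raw = Syntax.parse_term @{context} "rev [a, b]";  (* pre-term, types unresolved *)
val t   = Syntax.check_term @{context} raw;           (* type inference happens here *)
\<close>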
text\<open>In contrast to mere parsing, the following operators also provide type-checking
and internal reporting to PIDE --- see below. I did not find a mechanism to address
the internal serial-numbers used for the PIDE protocol; however, rumour has it
that such a thing exists. The variants \<^verbatim>\<open>_global\<close> work on theories instead of
@{ML_type "Proof.context"}s.\<close>
ML\<open>
Syntax.read_sort: Proof.context -> string -> sort;
Syntax.read_typ : Proof.context -> string -> typ;
Syntax.read_term: Proof.context -> string -> term;
Syntax.read_typs: Proof.context -> string list -> typ list;
Syntax.read_sort_global: theory -> string -> sort;
Syntax.read_typ_global: theory -> string -> typ;
Syntax.read_term_global: theory -> string -> term;
Syntax.read_prop_global: theory -> string -> term;
\<close>
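text\<open>A minimal sketch: @{ML "Syntax.read_term"} combines parsing and checking
in one step (an HOL context via \<^verbatim>\<open>Main\<close> is assumed):\<close>
ML\<open>
val t = Syntax.read_term @{context} "rev [a, b] = [b, a]";
\<close>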
text \<open>The following operations are concerned with the conversion into pretty-print
documents and, from there, the generation of (non-layouted) strings:\<close>
ML\<open>
Syntax.pretty_term:Proof.context -> term -> Pretty.T;
Syntax.pretty_typ:Proof.context -> typ -> Pretty.T;
Syntax.pretty_sort:Proof.context -> sort -> Pretty.T;
Syntax.pretty_classrel: Proof.context -> class list -> Pretty.T;
Syntax.pretty_arity: Proof.context -> arity -> Pretty.T;
Syntax.string_of_term: Proof.context -> term -> string;
Syntax.string_of_typ: Proof.context -> typ -> string;
Syntax.lookup_const : Syntax.syntax -> string -> string option;
\<close>
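text\<open>A minimal sketch of pretty-printing a term back into a string:\<close>
ML\<open>
val s = Syntax.string_of_term @{context} @{term "map f [a, b]"};
(* yields the surface-syntax string "map f [a, b]" *)
\<close>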
ML{*
fun read_terms ctxt =

View File

@ -94,11 +94,14 @@ export_code zero one Suc Int.nat nat_of_integer int_of_integer (* for debuggi
rexp2na na2da enabled (* low-level automata interface *)
NA.accepts DA.accepts
in SML module_name RegExpChecker
file "RegExpChecker.sml" (* writing it to a file *)
(* potentially susceptible to race conditions ... *)
SML_file "RegExpChecker.sml" (* reads and evaluates the generated file
in the SML toplevel *)
SML_export \<open>structure RegExpChecker = RegExpChecker\<close> (* copies from SML toplevel into
Isabelle/ML toplevel *)
section\<open>The Abstract Interface For Monitor Expressions\<close>
text\<open>Here comes the catch: the reflection of the HOL-Automata module into an SML module
@ -107,12 +110,6 @@ of the deterministic automata ...\<close>
ML\<open>
(*
use "RegExpChecker.sml";
*)
structure RegExpInterface : sig
type automaton
type env

View File

@ -44,8 +44,8 @@ on top of Isabelle. \isadof allows for conventional typesetting
for enforcing a certain document structure, and discuss ontology-specific IDE support.
\<close>
section*[intro::introduction_title]\<open> Introduction \<close>
text*[introtext::introduction_elem]\<open>
The linking of the \<^emph>\<open>formal\<close> to the \<^emph>\<open>informal\<close> is perhaps the
most pervasive challenge in the digitization of knowledge and its
propagation. This challenge incites numerous research efforts
@ -94,7 +94,7 @@ declare_reference*[ontomod::text_section]
declare_reference*[ontopide::text_section]
declare_reference*[conclusion::text_section]
(*>*)
text*[plan::introduction_elem]\<open> The plan of the paper is as follows: we start by introducing the underlying
Isabelle system (@{docitem_ref (unchecked) \<open>bgrnd\<close>}), followed by presenting the
essentials of \isadof and its ontology language (@{docitem_ref (unchecked) \<open>isadof\<close>}).
Then follows @{docitem_ref (unchecked) \<open>ontomod\<close>}, where we present three application
@ -104,7 +104,7 @@ conclusions and discuss related work in @{docitem_ref (unchecked) \<open>conclu
section*[bgrnd::text_section,main_author="Some(@{docitem ''adb''}::author)"]
\<open> Background: The Isabelle System \<close>
text*[background::introduction_elem]\<open>
While Isabelle is widely perceived as an interactive theorem prover
for HOL (Higher-order Logic)~@{cite "nipkow.ea:isabelle:2002"}, we
would like to emphasize the view that Isabelle is far more than that:
@ -133,7 +133,7 @@ figure*[architecture::figure,relative_width="100",src="''figures/isabelle-archit
asynchronous communication between the Isabelle system and
the IDE (right-hand side). \<close>
text*[blug::introduction_elem]\<open> The Isabelle system architecture shown in @{docitem_ref \<open>architecture\<close>}
comes with many layers, with Standard ML (SML) at the bottom layer as implementation
language. The architecture actually foresees a \<^emph>\<open>Nano-Kernel\<close> (our terminology) which
resides in the SML structure \texttt{Context}. This structure provides a kind of container called

View File

@ -24,11 +24,17 @@ doc_class abstract =
doc_class text_section =
main_author :: "author option" <= None
fixme_list :: "string list" <= "[]"
doc_class introduction = text_section +
comment :: string
claims :: "thm list"
doc_class introduction_title = introduction +
fixme_list :: "string list" <= "[]"
doc_class introduction_elem = introduction +
fixme_list :: "string list" <= "[]"
doc_class technical = text_section +
definition_list :: "string list" <= "[]"
formal_results :: "thm list"
@ -81,8 +87,10 @@ doc_class article =
\<lbrakk>subtitle\<rbrakk> ~~
\<lbrace>author\<rbrace>\<^sup>+ ~~
abstract ~~
introduction_title ~~
\<lbrace>introduction_elem\<rbrace>\<^sup>+ ~~
\<lbrace>technical || example\<rbrace>\<^sup>+ ~~
\<lbrakk>introduction\<rbrakk> ~~
conclusion ~~
bibliography)"