Many little things
Isabelle_DOF/Isabelle_DOF/master This commit looks good Details

This commit is contained in:
Burkhart Wolff 2019-07-23 09:41:15 +02:00
parent 68f865557b
commit bf4255dbf9
2 changed files with 143 additions and 136 deletions

View File

@ -216,21 +216,18 @@ text\<open>
subsection*["odl-design"::example]\<open>An Ontology Example\<close>
text\<open>
text\<open>We illustrate the design of \dof by modeling a small ontology
that can be used for writing formal specifications that, \eg, could
build the basis for an ontology for certification documents used in
processes such as Common Criteria~@{cite "cc:cc-part3:2006"} or CENELEC
50128~@{cite "bsi:50128:2014"}.@{footnote \<open>The \isadof distribution
contains an ontology for writing documents for a certification
according to CENELEC 50128.\<close>} Moreover, in examples of certification
documents, we refer to a controller of a steam boiler that is
inspired by the famous steam boiler formalization
challenge~@{cite "abrial:steam-boiler:1996"}.\<close>
We illustrate the design of \dof by modeling a small ontology
that can be used for writing formal specifications that, \eg, could
build the basis for an ontology for certification documents used in
processes such as Common Criteria~@{cite "cc:cc-part3:2006"} or CENELEC
50128~@{cite "bsi:50128:2014"}.@{footnote \<open>The \isadof distribution
contains an ontology for writing documents for a certification
according to CENELEC 50128.\<close>} Moreover, in examples of certification
documents, we refer to a controller of a steam boiler that is
inspired by the famous steam boiler formalization
challenge~@{cite "abrial:steam-boiler:1996"}.
\<close>
text\<open>
\begin{isar}[float,mathescape,label={lst:doc},caption={An example ontology modeling
simple certification documents, including scientific papers such
as~@{cite "brucker.ea:isabelle-ontologies:2018"}; also recall
@ -273,107 +270,103 @@ doc_class "conclusion" = text_section +
\<close>
text\<open>
\autoref{lst:doc} shows an example ontology for mathematical papers
(an extended version of this ontology was used for writing
@{cite "brucker.ea:isabelle-ontologies:2018"}, also recall
\autoref{fig:dof-ide}). The commands \inlineisar+datatype+ (modeling
fixed enumerations) and \inlineisar+type_synonym+ (defining type
synonyms) are standard mechanisms in HOL systems. Since ODL is an
add-on, we have to quote sometimes constant symbols (\eg,
\inlineisar+"proof"+) to avoid confusion with predefined keywords. ODL
admits overriding (such as \inlineisar+authored_by+ in the document
class \inlineisar+introduction+), where it is set to another library
constant, but no overloading. All \inlineisar+text_section+ elements
have an optional \inlineisar+level+ attribute, which will be used in
the output generation for the decision if the context is a section
header and its level (\eg, chapter, section, subsection). While
\<open>within\<close> an inheritance hierarchy overloading is prohibited,
attributes may be re-declared freely in independent parts (as is the
case for \inlineisar+property+).
\<close>
\autoref{lst:doc} shows an example ontology for mathematical papers
(an extended version of this ontology was used for writing
@{cite "brucker.ea:isabelle-ontologies:2018"}, also recall
\autoref{fig:dof-ide}). The commands \inlineisar+datatype+ (modeling
fixed enumerations) and \inlineisar+type_synonym+ (defining type
synonyms) are standard mechanisms in HOL systems. Since ODL is an
add-on, we have to quote sometimes constant symbols (\eg,
\inlineisar+"proof"+) to avoid confusion with predefined keywords. ODL
admits overriding (such as \inlineisar+authored_by+ in the document
class \inlineisar+introduction+), where it is set to another library
constant, but no overloading. All \inlineisar+text_section+ elements
have an optional \inlineisar+level+ attribute, which will be used in
the output generation for the decision if the context is a section
header and its level (\eg, chapter, section, subsection). While
\<open>within\<close> an inheritance hierarchy overloading is prohibited,
attributes may be re-declared freely in independent parts (as is the
case for \inlineisar+property+).\<close>
subsection*["sec:advanced"::technical]\<open>Advanced ODL Concepts\<close>
subsubsection\<open>Meta-types as Types\<close>
text\<open>To express the dependencies between text elements to the formal
entities, \eg, \inlinesml+term+ ($\lambda$-term), \inlinesml+typ+, or
\inlinesml+thm+, we represent the types of the implementation language
\<^emph>\<open>inside\<close> the HOL type system. We do, however, not reflect the data of
these types. They are just declared abstract types,
``inhabited'' by special constant symbols carrying strings, for
example of the format \inlineisar+<AT>{thm <string>}+. When HOL
expressions were used to denote values of \inlineisar+doc_class+
instance attributes, this requires additional checks after
conventional type-checking that this string represents actually a
defined entity in the context of the system state
\inlineisar+\<theta>+. For example, the \inlineisar+establish+
attribute in the previous section is the power of the ODL: here, we model
a relation between \<^emph>\<open>claims\<close> and \<^emph>\<open>results\<close> which may be a
formal, machine-check theorem of type \inlinesml+thm+ denoted by, for
example: \inlineisar+property = "[<AT>{thm ''system_is_safe''}]"+ in a
system context \inlineisar+\<theta>+ where this theorem is
established. Similarly, attribute values like
\inlineisar+property = "<AT>{term \<Open>A \<leftrightarrow> B\<Close>}"+
require that the HOL-string \inlineisar+A \<leftrightarrow> B+ is
again type-checked and represents indeed a formula in $\theta$. Another instance of
this process, which we call \<open>second-level type-checking\<close>, are term-constants
generated from the ontology such as
\inlineisar+<AT>{definition <string>}+. For the latter, the argument string
must be checked that it represents a reference to a text-element
having the type \inlineisar+definition+ according to the ontology in @{example "odl-design"}.\<close>
entities, \eg, \inlinesml+term+ ($\lambda$-term), \inlinesml+typ+, or
\inlinesml+thm+, we represent the types of the implementation language
\<^emph>\<open>inside\<close> the HOL type system. We do, however, not reflect the data of
these types. They are just declared abstract types,
``inhabited'' by special constant symbols carrying strings, for
example of the format \inlineisar+<AT>{thm <string>}+. When HOL
expressions were used to denote values of \inlineisar+doc_class+
instance attributes, this requires additional checks after
conventional type-checking that this string represents actually a
defined entity in the context of the system state
\inlineisar+\<theta>+. For example, the \inlineisar+establish+
attribute in the previous section is the power of the ODL: here, we model
a relation between \<^emph>\<open>claims\<close> and \<^emph>\<open>results\<close> which may be a
formal, machine-check theorem of type \inlinesml+thm+ denoted by, for
example: \inlineisar+property = "[<AT>{thm ''system_is_safe''}]"+ in a
system context \inlineisar+\<theta>+ where this theorem is
established. Similarly, attribute values like
\inlineisar+property = "<AT>{term \<Open>A \<leftrightarrow> B\<Close>}"+
require that the HOL-string \inlineisar+A \<leftrightarrow> B+ is
again type-checked and represents indeed a formula in $\theta$. Another instance of
this process, which we call \<open>second-level type-checking\<close>, are term-constants
generated from the ontology such as
\inlineisar+<AT>{definition <string>}+. For the latter, the argument string
must be checked that it represents a reference to a text-element
having the type \inlineisar+definition+ according to the ontology in @{example "odl-design"}.\<close>
subsubsection*["sec:monitors"::technical]\<open>ODL Monitors\<close>
text\<open>
We call a document class with an accept-clause a
\<^emph>\<open>monitor\<close>. Syntactically, an accept-clause contains a regular
expression over class identifiers. We can extend our
\inlineisar+tiny_cert+ ontology with the following example:
\begin{isar}
doc_class article =
style_id :: string <= "''CENELEC50128''"
accepts "(title ~~ \<lbrace>author\<rbrace>\<bsup>+\<esup> ~~ abstract ~~ \<lbrace>introduction\<rbrace>\<bsup>+\<esup> ~~
\<lbrace>technical || example\<rbrace>\<bsup>+\<esup> ~~ \<lbrace>conclusion\<rbrace>\<bsup>+\<esup>)"
\end{isar}
Semantically, monitors introduce a behavioral element into ODL:
\begin{isar}
open_monitor*[this::article] (* begin of scope of monitor "this" *)
...
close_monitor*[this] (* end of scope of monitor "this" *)
\end{isar}
Inside the scope of a monitor, all instances of classes mentioned in its accept-clause (the
\<^emph>\<open>accept-set\<close>) have to appear in the order specified by the
regular expression; instances not covered by an accept-set may freely
occur. Monitors may additionally contain a reject-clause with a list
of class-ids (the reject-list). This allows specifying ranges of
admissible instances along the class hierarchy:
\begin{compactitem}
\item a superclass in the reject-list and a subclass in the
accept-expression forbids instances superior to the subclass, and
\item a subclass $S$ in the reject-list and a superclass $T$ in the
accept-list allows instances of superclasses of $T$ to occur freely,
instances of $T$ to occur in the specified order and forbids
instances of $S$.
\end{compactitem}
Monitored document sections can be nested and overlap; thus, it is
possible to combine the effect of different monitors. For example, it
would be possible to refine the \inlineisar+example+ section by its own
monitor and enforce a particular structure in the presentation of
examples.
Monitors manage an implicit attribute \inlineisar+trace+ containing
the list of ``observed'' text element instances belonging to the
accept-set. Together with the concept of ODL class invariants, it is
possible to specify properties of a sequence of instances occurring in
the document section. For example, it is possible to express that in
the sub-list of \inlineisar+introduction+-elements, the first has an
\inlineisar+introduction+ element with a \inlineisar+level+ strictly
smaller than the others. Thus, an introduction is forced to have a
header delimiting the borders of its representation. Class invariants
on monitors allow for specifying structural properties on document
sections.
\<close>
text\<open>We call a document class with an accept-clause a
\<^emph>\<open>monitor\<close>. Syntactically, an accept-clause contains a regular
expression over class identifiers. We can extend our
\inlineisar+tiny_cert+ ontology with the following example:
\begin{isar}
doc_class article =
style_id :: string <= "''CENELEC50128''"
accepts "(title ~~ \<lbrace>author\<rbrace>\<bsup>+\<esup> ~~ abstract ~~ \<lbrace>introduction\<rbrace>\<bsup>+\<esup> ~~
\<lbrace>technical || example\<rbrace>\<bsup>+\<esup> ~~ \<lbrace>conclusion\<rbrace>\<bsup>+\<esup>)"
\end{isar}
Semantically, monitors introduce a behavioral element into ODL:
\begin{isar}
open_monitor*[this::article] (* begin of scope of monitor "this" *)
...
close_monitor*[this] (* end of scope of monitor "this" *)
\end{isar}
Inside the scope of a monitor, all instances of classes mentioned in its accept-clause (the
\<^emph>\<open>accept-set\<close>) have to appear in the order specified by the
regular expression; instances not covered by an accept-set may freely
occur. Monitors may additionally contain a reject-clause with a list
of class-ids (the reject-list). This allows specifying ranges of
admissible instances along the class hierarchy:
\begin{compactitem}
\item a superclass in the reject-list and a subclass in the
accept-expression forbids instances superior to the subclass, and
\item a subclass $S$ in the reject-list and a superclass $T$ in the
accept-list allows instances of superclasses of $T$ to occur freely,
instances of $T$ to occur in the specified order and forbids
instances of $S$.
\end{compactitem}
Monitored document sections can be nested and overlap; thus, it is
possible to combine the effect of different monitors. For example, it
would be possible to refine the \inlineisar+example+ section by its own
monitor and enforce a particular structure in the presentation of
examples.
Monitors manage an implicit attribute \inlineisar+trace+ containing
the list of ``observed'' text element instances belonging to the
accept-set. Together with the concept of ODL class invariants, it is
possible to specify properties of a sequence of instances occurring in
the document section. For example, it is possible to express that in
the sub-list of \inlineisar+introduction+-elements, the first has an
\inlineisar+introduction+ element with a \inlineisar+level+ strictly
smaller than the others. Thus, an introduction is forced to have a
header delimiting the borders of its representation. Class invariants
on monitors allow for specifying structural properties on document
sections.\<close>
subsubsection*["sec:class_inv"::technical]\<open>ODL Class Invariants\<close>
@ -557,9 +550,9 @@ referencing it, although the actual text-element will occur later in
the document.
\<close>
text \<open>
\subsection{Editing Documents with Ontology Meta-Data: Inner Syntax}
We continue our running example as follows:
subsection\<open>Editing Documents with Ontology Meta-Data: Inner Syntax\<close>
text\<open>We continue our running example as follows:
\begin{isar}[mathescape]
text*[d1::definition]\<Open>
We define the water level <AT>{term "level"} of a system state
@ -591,7 +584,6 @@ and \<^emph>\<open>results\<close> required in the \inlineisar|establish|-relati
required in a \inlineisar|summary|.
\<close>
subsection*["core-manual1"::technical]\<open>Annotatable Test-Elements: Syntax\<close>
text\<open>The syntax of toplevel \isadof commands reads as follows:
@ -627,6 +619,8 @@ text\<open>The syntax of toplevel \isadof commands reads as follows:
subsection*["commonlib"::technical]\<open>Common Ontology Library (COL)\<close>
subsection*["core-manual2"::technical]\<open>Examples\<close>

View File

@ -13,43 +13,56 @@ IDE deeply integrated into the Isabelle/Isar Framework. The two most distinguish
\<^item> \isadof is supported by the Isabelle/PIDE framework; thus, the advantages of an IDE for
text-exploration (which is the type of this link? To which text element does this link refer?
Which are the syntactic alternatives here?) were available during editing
instead of a post-hoc validation process.
\<close>
instead of a post-hoc validation process.\<close>
text\<open> Of course, a conventional batch-process also exists which can be used
for the validation of large document bases in a conventional continuous build process.
This combination of formal and semi-informal elements, as well as a systematic enforcement
of the coherence to a document ontology of the latter, is, as we believe, novel and offers
a unique potential for the semantic treatment of scientific texts and technical documentations. \<close>
for the validation of large document bases in a conventional continuous build process.
This combination of formal and semi-informal elements, as well as a systematic enforcement
of the coherence to a document ontology of the latter, is, as we believe, novel and offers
a unique potential for the semantic treatment of scientific texts and technical documentations. \<close>
text\<open>
To our knowledge, this is the first ontology-driven framework for editing mathematical and technical
documents that focuses particularly on documents mixing formal and informal content---a type of
documents that is very common in technical certification processes. We see mainly one area of
related works: IDEs and text editors that support editing and checking of documents based on an
ontology. There is a large group of ontology editors (\eg, Prot{\'e}g{\'e}~@{cite "protege"},
Fluent Editor~@{cite "cognitum"}, NeOn~@{cite "neon"}, or OWLGrEd~@{cite "owlgred"}). With them,
we share the support for defining ontologies as well as auto-completion when editing documents
based on an ontology. While our ontology definitions are currently based on a textual definition,
widely used ontology editors (\eg, OWLGrEd~@{cite "owlgred"}) also support graphical notations.
This could be added to \isadof in the future. A unique feature of \isadof is the deep integration
of formal and informal text parts. The only other work in this area we are aware of is
rOntorium~@{cite "rontorium"}, a plugin for Prot{\'e}g{\'e} that integrates
R~@{cite "adler:r:2010"} into an ontology environment. Here, the main motivation behind this
integration is to allow for statistically analyze ontological documents. Thus, this is
complementary to our work. \<close>
text\<open> To our knowledge, this is the first ontology-driven framework for editing mathematical and
technical documents that focuses particularly on documents mixing formal and informal content---a
type of documents that is very common in technical certification processes. We see mainly one
area of related works: IDEs and text editors that support editing and checking of documents based
on an ontology. There is a large group of ontology editors (\eg, Prot{\'e}g{\'e}~@{cite "protege"},
Fluent Editor~@{cite "cognitum"}, NeOn~@{cite "neon"}, or OWLGrEd~@{cite "owlgred"}). With them,
we share the support for defining ontologies as well as auto-completion when editing documents
based on an ontology. While our ontology definitions are currently based on a textual definition,
widely used ontology editors (\eg, OWLGrEd~@{cite "owlgred"}) also support graphical notations.
This could be added to \isadof in the future. A unique feature of \isadof is the deep integration
of formal and informal text parts. The only other work in this area we are aware of is
rOntorium~@{cite "rontorium"}, a plugin for Prot{\'e}g{\'e} that integrates
R~@{cite "adler:r:2010"} into an ontology environment. Here, the main motivation behind this
integration is to allow for statistically analyze ontological documents. Thus, this is
complementary to our work. \<close>
text\<open> \isadof in its present form has a number of technical short-comings as well
as potentials not yet explored. On the long list of the short-comings is the
fact that strings inside HOL-terms do not support, for example, Unicode.
For the moment, \isadof is conceived as an
fact that tables are at present not supported, and a test-execution environment
for external test-executions is missing. For the moment, \isadof is conceived as an
add-on for Isabelle/HOL; a much deeper integration of \isadof into Isabelle
could increase both performance and uniformity. Finally, different target
presentation (such as HTML) would be highly desirable in particular for the
math exam scenarios. And last but not least, it would be desirable that PIDE
itself is ``ontology-aware'' and can, for example, use meta-information
to control read- and write accesses of \<^emph>\<open>parts\<close> of documents.
\<close>
to control read- and write accesses of \<^emph>\<open>parts\<close> of documents. An important
other direction to explore is \<^emph>\<open>out-of-order\<close> presentation, \ie{} a systematic
approach to present a document in another than the usual linear \<^emph>\<open>definition-before-use\<close>
order.\<close>
subsubsection\<open>Availability.\<close>
text\<open> The implementation of the framework is available at
@{url \<open>https://git.logicalhacking.com/Isabelle_DOF/Isabelle_DOF/\<close>}.
\isadof is licensed under a 2-clause BSD license (SPDX-License-Identifier: BSD-2-Clause).\<close>
subsubsection\<open>Acknowledgments.\<close>
text\<open>This work has been partially supported by IRT SystemX, Paris-Saclay, France, and
therefore granted with public funds of the Program ``Investissements d'Avenir.''\<close>
(*<*)
end