Updated introduction and started content restructuring.

This commit is contained in:
Achim D. Brucker 2019-08-02 11:12:16 +01:00
parent 7c6a214e98
commit 13a6384cfe
9 changed files with 161 additions and 245 deletions

View File

@ -6,80 +6,65 @@ begin
chapter*[intro::introduction]\<open> Introduction \<close>
text*[introtext::introduction]\<open>
The linking of the \<^emph>\<open>formal\<close> to the \<^emph>\<open>informal\<close> is perhaps the
most pervasive challenge in the digitization of knowledge and its
propagation. This challenge incites numerous research efforts
summarized under the labels ``semantic web'', ``data mining'', or any
form of advanced ``semantic'' text processing. A key role in
structuring this linking play \<^emph>\<open>document ontologies\<close> (also called
\<^emph>\<open>vocabulary\<close> in the semantic web community~@{cite "w3c:ontologies:2015"}),
\ie, a machine-readable form of the structure of documents as well as
the document discourse.
The linking of the \<^emph>\<open>formal\<close> to the \<^emph>\<open>informal\<close> is perhaps the most pervasive challenge in the
digitization of knowledge and its propagation. This challenge incites numerous research efforts
summarized under the labels ``semantic web,'' ``data mining,'' or any form of advanced ``semantic''
text processing. A key role in structuring this linking play \<^emph>\<open>document ontologies\<close> (also called
\<^emph>\<open>vocabulary\<close> in the semantic web community~@{cite "w3c:ontologies:2015"}), \ie, a machine-readable
form of the structure of documents as well as the document discourse.
Such ontologies can be used for the scientific discourse within scholarly articles, mathematical
libraries, and in the engineering discourse of standardized software certification
documents~@{cite "boulanger:cenelec-50128:2015" and "cc:cc-part3:2006"}: certification documents
have to follow a structure. In practice, large groups of developers have to produce a substantial
set of documents where the consistency is notoriously difficult to maintain. In particular,
certifications are centered around the \emph{traceability} of requirements throughout the entire
set of documents. While technical solutions for the traceability problem exists (most notably:
DOORS~\cite{doors}), they are weak in the treatment of formal entities (such as formulas and their
logical contexts).
Such ontologies can be used for the scientific discourse within scholarly
articles, mathematical libraries, and in the engineering discourse
of standardized software certification
documents~@{cite "boulanger:cenelec-50128:2015" and "cc:cc-part3:2006"}.
Further applications are the domain-specific discourse in juridical texts or medical reports.
In general, an ontology is a formal explicit description of \<^emph>\<open>concepts\<close>
in a domain of discourse (called \<^emph>\<open>classes\<close>), properties of each concept
describing \<^emph>\<open>attributes\<close> of the concept, as well as \<^emph>\<open>links\<close> between
them. A particular link between concepts is the \<^emph>\<open>is-a\<close> relation declaring
In general, an ontology is a formal explicit description of \<^emph>\<open>concepts\<close> in a domain of discourse
(called \<^emph>\<open>classes\<close>), properties of each concept describing \<^emph>\<open>attributes\<close> of the concept, as well
as \<^emph>\<open>links\<close> between them. A particular link between concepts is the \<^emph>\<open>is-a\<close> relation declaring
the instances of a subclass to be instances of the super-class.
The main objective of this paper is to present \isadof, a novel
framework to \<^emph>\<open>model\<close> typed ontologies and to \<^emph>\<open>enforce\<close> them during
document evolution. Based on Isabelle infrastructures, ontologies may refer to
types, terms, proven theorems, code, or established assertions.
Based on a novel adaption of the Isabelle IDE, a document is checked to be
\<^emph>\<open>conform\<close> to a particular ontology---\isadof is designed to give fast user-feedback
\<^emph>\<open>during the capture of content\<close>. This is particularly valuable in case of document
evolution, where the \<^emph>\<open>coherence\<close> between the formal and the informal parts of the
\isadof is a novel framework, extending of Isabelle/HOL, to \<^emph>\<open>model\<close> typed ontologies and to
\<^emph>\<open>enforce\<close> them during document evolution. Based on Isabelle infrastructures, ontologies may refer
to types, terms, proven theorems, code, or established assertions. Based on a novel adaption of the
Isabelle IDE, a document is checked to be \<^emph>\<open>conform\<close> to a particular ontology---\isadof is designed
to give fast user-feedback \<^emph>\<open>during the capture of content\<close>. This is particularly valuable in case
of document evolution, where the \<^emph>\<open>coherence\<close> between the formal and the informal parts of the
content can be mechanically checked.
To avoid any misunderstanding: \isadof is \<^emph>\<open>not a theory in HOL\<close>
on ontologies and operations to track and trace links in texts,
it is an \<^emph>\<open>environment to write structured text\<close> which \<^emph>\<open>may contain\<close>
Isabelle/HOL definitions and proofs like mathematical articles, tech-reports and
scientific papers---as the present one, which is written in \isadof
itself. \isadof is a plugin into the Isabelle/Isar
framework in the style of~@{cite "wenzel.ea:building:2007"}.\<close>
text\<open>This manual adresses at three different types of users:
\<^enum> users that just want to edit a core document, be it for a paper or a technical report,
using a given ontology,
\<^enum> users that want to develop ontologies and/or modify the generated PDF-presentations,
\<^enum> users that want to add text-elements or new features to \isadof.
This manual gives priority to the former two groups; users with an interest in \isadof implementation
might find complementary information in @{cite "brucker.wolff19:isadof-design-impl:2019"}.
To avoid any misunderstanding: \isadof is \<^emph>\<open>not a theory in HOL\<close> on ontologies and operations to
track and trace links in texts, it is an \<^emph>\<open>environment to write structured text\<close> which
\<^emph>\<open>may contain\<close> Isabelle/HOL definitions and proofs like mathematical articles, tech-reports and
scientific papers---as the present one, which is written in \isadof itself. \isadof is a plugin
into the Isabelle/Isar framework in the style of~@{cite "wenzel.ea:building:2007"}.\<close>
subsubsection\<open>How to Read This Manual\<close>
(*<*)
declare_reference*[background::text_section]
declare_reference*[isadof_tour::text_section]
declare_reference*[isadof_ontologies::text_section]
declare_reference*[isadof_developers::text_section]
(*>*)
text\<open>
This manual can be read in different ways, depending on what you want to accomplish. We see three
different main user groups:
\<^enum> \<^emph>\<open>\isadof users\<close>, \ie, users that just want to edit a core document, be it for a paper or a
technical report, using a given ontology. These users should focus on
@{docitem_ref (unchecked) \<open>isadof_tour\<close>} and, depending on their knowledge of Isabelle/HOL, also
@{docitem_ref (unchecked) \<open>background\<close>}.
\<^enum> \<^emph>\<open>Ontology developers\<close>, \ie, users that want to develop new ontologies or modify existing
document ontologies. These users should, after having gained a acquaintance as a user, focus
on @{docitem_ref (unchecked) \<open>isadof_ontologies\<close>}.
\<^enum> \<^emph>\<open>\isadof developers\<close>, \ie, users that want to extend or modify \isadof, \eg, by adding new
text-elements. These users should read @{docitem_ref (unchecked) \<open>isadof_developers\<close>}
\<close>
(* declaring the forward references used in the subsequent section *)
(*<*)
declare_reference*[bgrnd::text_section]
declare_reference*[isadof::text_section]
declare_reference*[casestudies::text_section]
declare_reference*[latex::text_section]
declare_reference*[conclusion::text_section]
(*>*)
text*[plan::introduction]\<open> The plan of the paper is follows: we start by introducing the underlying
Isabelle system (@{docitem_ref (unchecked) \<open>bgrnd\<close>}) followed guided tour or tutorial
adressing the needs of the first intended user group.
It follows the chapter @{docitem_ref (unchecked) \<open>isadof\<close>} for the
first user group needing a more systematic introduction as well as the
second user group with essentials of \isadof and its ontology language
(@{docitem_ref (unchecked) \<open>isadof\<close>}).
It follows @{docitem_ref (unchecked) \<open>latex\<close>}, where we necessary bits on the LaTeX
generation and ways to adapt it to particular purposes.
Finally, we draw
conclusions and discuss related work in @{docitem_ref (unchecked) \<open>conclusion\<close>}. \<close>
section\<open>Typographical Conventions\<close>
subsubsection\<open>Typographical Conventions\<close>
text\<open>
We acknowledge that understanding \isadof and its implementation in all details requires
to separate multiple technological layers or languages. To help the reader with this, we
@ -103,7 +88,8 @@ text\<open>
\end{ltx}
\<^item> For shell scripts and interative shell sessions, we use a grey background:
\begin{bash}
> ls
achim@logicalhacking:~/Isabelle_DOF$ ls
CHANGELOG.md CITATION examples install LICENSE README.md ROOTS src
\end{bash}
\<close>

View File

@ -4,10 +4,9 @@ theory "02_Background"
begin
(*>*)
chapter*[bgrnd::text_section,main_author="Some(@{docitem ''adb''}::author)"]
\<open> Background: The Isabelle System \<close>
chapter*[background::text_section]\<open> Background: The Isabelle System \<close>
section*[bgrnd1::introduction]\<open>Document Processing in Isabelle\<close>
text*[background::introduction]\<open>
text*[bg::introduction]\<open>
While Isabelle @{cite "nipkow.ea:isabelle:2002"} is widely perceived as an interactive theorem prover
for HOL (Higher-order Logic)~@{cite "nipkow.ea:isabelle:2002"}, we
would like to emphasize the view that Isabelle is far more than that:

View File

@ -4,7 +4,7 @@ theory "03_GuidedTour"
begin
(*>*)
chapter*[casestudies::example,main_author="Some(@{docitem ''adb''}::author)"]\<open> \isadof: A Guided Tour \<close>
chapter*[isadof_tour::text_section]\<open>\isadof: A Guided Tour\<close>
text\<open> In this section, we will use the \isadof document ontology language
for three different application scenarios: for scholarly papers, for mathematical

View File

@ -5,9 +5,7 @@ theory "04_RefMan"
begin
(*>*)
chapter*[isadof::technical,main_author="Some(@{docitem ''bu''}::author)"]
\<open> \isadof : Syntax and Semantics\<close>
chapter*[isadof_ontologies::text_section]\<open>Developing Ontologies and Document Representations\<close>
text\<open> \isadof is embedded in the underlying generic
document model of Isabelle as described in @{docitem "sec:background"}.
@ -670,6 +668,98 @@ assert*[aaa::F] "0 < (4::int)"
subsection*["core-manual2"::technical]\<open>Examples\<close>
section*[latex1::technical]\<open>How to adapt \isadof to a new document style file\<close>
text\<open>Current \isadof comes with a number of setups for the PDF generation, notably
Springer LNCS, lipics, ..., as well as setupqs of technical reports this the present one.
The question arises, what to do if \isadof has to be accoustomed to a new LaTeX style
file configuration in order to generate PDF.
\<close>
text\<open>
In theory, this is simple - in practice, the efforts required
depends mostly on two factors:
\<^item> how compatible is the required LaTeX class with Isabelle's LateX
setup in general (e.g., does it work with an antique version
of \<open>comment.sty\<close> required by Isabelle)
\<^item> how much magic does the required LaTeX class wrt the title page
information (author, affiliation).
\<^item> which ontologies should be supported. While most ontologies are
generic, some only support a very specific subset of LaTeX
templates (classes). For example, \<^theory_text>\<open>scholarly_paper\<close> does currently
\<^emph>\<open>only\<close> support \<open>llncs\<close>, \<open>rartcl\<close>, and \<open>lipics\<close>.
The general process as such is straight-forward:
\<^enum> From the supported LaTeX classes, try to find the one that is
most similar to the one that you want to support. For the sake
of the this example, let's assume this is llncs
\<^enum> Use the template for this class (llncs) as starting point, i.e.,
\<^verbatim>\<open>cp document-generator/document-template/root-lncs.tex
document-generator/document-template/root-eptcs.tex\<close>
The mandatory naming scheme for the templates is
\<^verbatim>\<open>root-<TEMPLATE_NAME>.tex\<close>
That is to say that \<^verbatim>\<open><TEMPLATE_NAME>\<close> should be the name of the
new LaTeX class (all lowercase preferred) that will be used
in config files and also will be shown in the help text
shown by executing
\<^verbatim>\<open>isabelle DOF_mkroot -h\<close>
\<^enum> Edit the new template as necessary (using the provided
example from the target class as reference):
\<^verbatim>\<open>vim document-generator/document-template/root-foo.tex\<close>
and do the needful.
\<^enum> Install the new template:
\<^verbatim>\<open>./install\<close>
(If you have a working installation of the required AFP entries
and the Isabelle/DOF patch, you can use \<^verbatim>\<open>./install -s\<close>
which will \<^emph>\<open>ONLY\<close> install the \<^verbatim>\<open>LaTeX styles/templates\<close>, see
\<^verbatim>\<open>./install -h)\<close>
\<^enum> Now the new template should be available, you can check this with
\<^verbatim>\<open>isabelle DOF_mkroot -h\<close>
\<^enum> Create an "tiny/empty" Isabelle project using the ontology "core"
and test your template. If everything works, celebrate. If it does
not work, understand what you need to change and continue
with step 3.
(of course, if the new class is not available in TeXLive, you need
to add them locally to the documents folder of your Isabelle
project and, as usual, it needs to be registered in your ROOTS
file)
\<^enum> If the ontologies that should be used together with this LaTeX
class do not require specific adaptions (e.g., CENELEC 50128),
everything should work. If one of the required ontology LaTeX
setups only supports a subset of LaTeX classes (e.g., \<^theory_text>\<open>scholarly_paper\<close>
due to the different setups for authors/affiliations blurb),
you need to develop support for you new class in the ontology
specific LaTeX styles, e.g.,
\<open>document-generator/latex/DOF-scholarly_paper.sty\<close>
(mandatory naming convention: \<open>DOF-<ONTOLOGY_NAME>.sty\<close>)
\<^enum> After updating the ontology style (or styles), you need
to install them (again, you might want to use ./install -s)
and test your setup with a paper configuration using
your new LaTeX template and the required styles. In case
of errors, go back to step 7.
\<close>
(*<*)
end
(*>*)

View File

@ -1,12 +1,11 @@
(*<*)
theory "04_IsaDofImpl"
imports "02_Background"
theory "05_Implementation"
imports "04_RefMan"
begin
(*>*)
chapter*[impl1::introduction,main_author="Some(@{docitem ''adb''}::author)"]
\<open>Isabelle Ontology Framework \isadof\<close>
chapter*[isadof_developers::text_section]\<open>Extending \isadof\<close>
text\<open>
In this section, we introduce our framework, called \isadof. \isadof
is based on several design-decisions:
@ -108,7 +107,7 @@ smoothlessly integrated into standard programs and allows for dynamic
grammar extensions. There is a more high-level structure
\inlinesml{Parse} providing specific combinators for the
command-language Isar:
\begin{sml}
\begin{sml}[mathescape=false]
val attribute = Parse.position Parse.name
-- Scan.optional (Parse.$$$ "=" |-- Parse.!!! Parse.name) "";
val reference = Parse.position Parse.name
@ -186,4 +185,4 @@ SML code is generated \emph{automatically} from an \isadof ontology
definition introduced in the subsequent section.
\<close>
end
end

View File

@ -1,107 +0,0 @@
(*<*)
theory "05_IsaDofLaTeX"
imports "04_RefMan"
begin
(*>*)
chapter*[latex::technical,main_author="Some(@{docitem ''adb''}::author)"]\<open> PDF Generation with \isadof \<close>
section*[latex1::technical]\<open>How to adapt \isadof to a new document style file\<close>
text\<open>Current \isadof comes with a number of setups for the PDF generation, notably
Springer LNCS, lipics, ..., as well as setupqs of technical reports this the present one.
The question arises, what to do if \isadof has to be accoustomed to a new LaTeX style
file configuration in order to generate PDF.
\<close>
text\<open>
In theory, this is simple - in practice, the efforts required
depends mostly on two factors:
\<^item> how compatible is the required LaTeX class with Isabelle's LateX
setup in general (e.g., does it work with an antique version
of \<open>comment.sty\<close> required by Isabelle)
\<^item> how much magic does the required LaTeX class wrt the title page
information (author, affiliation).
\<^item> which ontologies should be supported. While most ontologies are
generic, some only support a very specific subset of LaTeX
templates (classes). For example, \<^theory_text>\<open>scholarly_paper\<close> does currently
\<^emph>\<open>only\<close> support \<open>llncs\<close>, \<open>rartcl\<close>, and \<open>lipics\<close>.
The general process as such is straight-forward:
\<^enum> From the supported LaTeX classes, try to find the one that is
most similar to the one that you want to support. For the sake
of the this example, let's assume this is llncs
\<^enum> Use the template for this class (llncs) as starting point, i.e.,
\<^verbatim>\<open>cp document-generator/document-template/root-lncs.tex
document-generator/document-template/root-eptcs.tex\<close>
The mandatory naming scheme for the templates is
\<^verbatim>\<open>root-<TEMPLATE_NAME>.tex\<close>
That is to say that \<^verbatim>\<open><TEMPLATE_NAME>\<close> should be the name of the
new LaTeX class (all lowercase preferred) that will be used
in config files and also will be shown in the help text
shown by executing
\<^verbatim>\<open>isabelle DOF_mkroot -h\<close>
\<^enum> Edit the new template as necessary (using the provided
example from the target class as reference):
\<^verbatim>\<open>vim document-generator/document-template/root-foo.tex\<close>
and do the needful.
\<^enum> Install the new template:
\<^verbatim>\<open>./install\<close>
(If you have a working installation of the required AFP entries
and the Isabelle/DOF patch, you can use \<^verbatim>\<open>./install -s\<close>
which will \<^emph>\<open>ONLY\<close> install the \<^verbatim>\<open>LaTeX styles/templates\<close>, see
\<^verbatim>\<open>./install -h)\<close>
\<^enum> Now the new template should be available, you can check this with
\<^verbatim>\<open>isabelle DOF_mkroot -h\<close>
\<^enum> Create an "tiny/empty" Isabelle project using the ontology "core"
and test your template. If everything works, celebrate. If it does
not work, understand what you need to change and continue
with step 3.
(of course, if the new class is not available in TeXLive, you need
to add them locally to the documents folder of your Isabelle
project and, as usual, it needs to be registered in your ROOTS
file)
\<^enum> If the ontologies that should be used together with this LaTeX
class do not require specific adaptions (e.g., CENELEC 50128),
everything should work. If one of the required ontology LaTeX
setups only supports a subset of LaTeX classes (e.g., \<^theory_text>\<open>scholarly_paper\<close>
due to the different setups for authors/affiliations blurb),
you need to develop support for you new class in the ontology
specific LaTeX styles, e.g.,
\<open>document-generator/latex/DOF-scholarly_paper.sty\<close>
(mandatory naming convention: \<open>DOF-<ONTOLOGY_NAME>.sty\<close>)
\<^enum> After updating the ontology style (or styles), you need
to install them (again, you might want to use ./install -s)
and test your setup with a paper configuration using
your new LaTeX template and the required styles. In case
of errors, go back to step 7.
\<close>
(*<*)
end
(*>*)

View File

@ -1,58 +0,0 @@
(*<*)
theory "06_Conclusion"
imports "05_IsaDofLaTeX"
begin
(*>*)
chapter*[conclusion::conclusion]\<open> Conclusion and Related Work\<close>
text\<open> We have demonstrated the use of \isadof, a novel ontology modeling and enforcement
IDE deeply integrated into the Isabelle/Isar Framework. The two most distinguishing features are
\<^item> \isadof and its ontology language are a strongly typed language that allows
for referring (albeit not reasoning) to entities of Isabelle/HOL, most notably types, terms,
and (formally proven) theorems, and
\<^item> \isadof is supported by the Isabelle/PIDE framework; thus, the advantages of an IDE for
text-exploration (which is the type of this link? To which text element does this link refer?
Which are the syntactic alternatives here?) were available during editing
instead of a post-hoc validation process.\<close>
text\<open> Of course, a conventional batch-process also exists which can be used
for the validation of large document bases in a conventional continuous build process.
This combination of formal and semi-informal elements, as well as a systematic enforcement
of the coherence to a document ontology of the latter, is, as we believe, novel and offers
a unique potential for the semantic treatment of scientific texts and technical documentations. \<close>
text\<open> To our knowledge, this is the first ontology-driven framework for editing mathematical and
technical documents that focuses particularly on documents mixing formal and informal content---a
type of documents that is very common in technical certification processes. We see mainly one
area of related works: IDEs and text editors that support editing and checking of documents based
on an ontology. There is a large group of ontology editors (\eg, Prot{\'e}g{\'e}~@{cite "protege"},
Fluent Editor~@{cite "cognitum"}, NeOn~@{cite "neon"}, or OWLGrEd~@{cite "owlgred"}). With them,
we share the support for defining ontologies as well as auto-completion when editing documents
based on an ontology. While our ontology definitions are currently based on a textual definition,
widely used ontology editors (\eg, OWLGrEd~@{cite "owlgred"}) also support graphical notations.
This could be added to \isadof in the future. A unique feature of \isadof is the deep integration
of formal and informal text parts. The only other work in this area we are aware of is
rOntorium~@{cite "rontorium"}, a plugin for Prot{\'e}g{\'e} that integrates
R~@{cite "adler:r:2010"} into an ontology environment. Here, the main motivation behind this
integration is to allow for statistically analyze ontological documents. Thus, this is
complementary to our work. \<close>
text\<open> \isadof in its present form has a number of technical short-comings as well
as potentials not yet explored. On the long list of the short-comings is the
fact that tables are at present not supported, and a test-execution environment
for external test-executions is missing. For the moment, \isadof is conceived as an
add-on for Isabelle/HOL; a much deeper integration of \isadof into Isabelle
could increase both performance and uniformity. Finally, different target
presentation (such as HTML) would be highly desirable in particular for the
math exam scenarios. And last but not least, it would be desirable that PIDE
itself is ``ontology-aware'' and can, for example, use meta-information
to control read- and write accesses of \<^emph>\<open>parts\<close> of documents. An important
other direction to explore is \<^emph>\<open>out-of-order\<close> presentation, \ie{} a systematic
approach to present a document in another than the usual linear \<^emph>\<open>definition-before-use\<close>
order.\<close>
(*<*)
end
(*>*)

View File

@ -1,6 +1,6 @@
(*<*)
theory "Isabelle_DOF-Manual"
imports "06_Conclusion"
imports "05_Implementation"
begin
(*<*)

View File

@ -474,3 +474,10 @@
organisations and division of responsibility involved in
the development, deployment and maintenanceactivities.}
}
@Misc{ doors,
title = {IBM Engineering Requirements Management DOORS Family},
note = {\url{https://www.ibm.com/us-en/marketplace/requirements-management}},
year = 2019
}