This commit is contained in:
Achim D. Brucker 2019-01-08 15:54:38 +00:00
parent 5503dafbce
commit 773b20b918
1 changed files with 0 additions and 177 deletions

View File

@ -1,177 +0,0 @@
(*<*)
theory ConceptsAndImpl
imports "../../../ontologies/scholarly_paper"
begin
open_monitor*[this::article]
(*>*)
declare[[strict_monitor_checking=false]]
title*[tit::title]\<open>A Document Ontology Framework in Isabelle\<close>
subtitle*[stit::subtitle]\<open>Design and Implementation\<close>
text*[adb:: author,
email="''a.brucker@sheffield.ac.uk''",
orcid="''0000-0002-6355-1200''",
affiliation="''The University of Sheffield, Sheffield, UK''"]\<open>Achim D. Brucker\<close>
text*[bu::author,
email = "''wolff@lri.fr''",
affiliation = "''Universit\\'e Paris-Sud, Paris, France''"]\<open>Burkhart Wolff\<close>
text*[abs::abstract,
keywordlist="[''Semantic Web'',''Document Ontology'',''Formal Document Development'',''Isabelle/DOF'']"]\<open>
We present an extension of the Isabelle/Isar framework allowing both the
\<^emph>\<open>modeling\<close> of document ontologies as well as the their *\<open>enforcement\<close> inside theory documents
by a smooth integration into Isabelle's IDE. The resulting extension \<^verbatim>\<open>Isa_DOF\<close> provides a
strongly typed ontology definition language allowing to annotate a document element (or ``corpus'')
with meta-information that is validated during document development and maintenance.
Ontology definitions provide \<^emph>\<open>concepts\<close>, \<^emph>\<open>is-a\<close> relations and \<^emph>\<open>links\<close> between them,
as well as \<^emph>\<open>F-links\<close> from concepts to formal content such as types, terms and theorems.
Documents referring to an ontology are edited, validated and proof-checked within
Isabelle/HOL. Particular emphasis is put on a deep integration
into the Isabelle IDE to give immediate ontological feedback to the
developer of documents containing entities such as text, models,
formal proofs, and code.
The IDE animates \<^emph>\<open>links\<close> and \<^emph>\<open>F-links\<close> via hyper-references, and
controls the document structure by checking a kind of behavioural specification.
Sufficiently annotated, large documents are easier to be developed collaboratively
by continuously validating the \<^emph>\<open>coherence\<close> between formal and informal parts, and
the impact of changes can be better tracked automatically.
\<close>
(* Industrial Application ? *)
(* Support of document ontologies is provided for immediate
user-feedback when editing large documents with formal and
semi-formal content, be it for mathematical articles, exercises, or,
\eg, deliverables in a certified software engineering
process. *)
(*
@inproceedings{DBLP:conf/mkm/BlanchetteHMN15,
author = {Jasmin Christian Blanchette and
Maximilian P. L. Haslbeck and
Daniel Matichuk and
Tobias Nipkow},
title = {Mining the Archive of Formal Proofs},
booktitle = {Intelligent Computer Mathematics - International Conference, {CICM}
2015, Washington, DC, USA, July 13-17, 2015, Proceedings},
pages = {3--17},
year = {2015},
crossref = {DBLP:conf/mkm/2015},
url = {https://doi.org/10.1007/978-3-319-20615-8\_1},
doi = {10.1007/978-3-319-20615-8\_1},
timestamp = {Fri, 02 Nov 2018 09:40:47 +0100},
biburl = {https://dblp.org/rec/bib/conf/mkm/BlanchetteHMN15},
bibsource = {dblp computer science bibliography, https://dblp.org}
}
*)
section*[intro::introduction]\<open> Introduction \<close>
text*[introtext::introduction]\<open>
With the maturation and growing power of interactive proof systems, the body of formalized
mathematics and engineering is dramatically increasing. The Isabelle Archive of Formal Proof (AFP),
for example, created in 2004, counted in 2015 the number of 215 articles
@{footnote \<open>... where "articles" are theory developments underlying a submission process
similar to the mizar journal \cite{}\<close>}, whereas the count stood at 413 only three years later.
An in-depth empirical analysis shows that both complexity and size increased accordingly
@{cite "DBLP:conf/mkm/BlanchetteHMN15"}. Since the entire AFP is part of the Isabelle regression
test suite and therefore maintained for the different releases, this body of knowledge is also
available in high technical consistency.
This raises a wider interest in the application of advanced "semantic Web" structuring, query
and mining techniques. Compared to other scientific disciplines like biology or medicine,
where large data-bases of genomes or scientific publications have been organized along
ontologies @{cite "..." and "..." } enabling queries, for example, there are still a number of
technical and social challenges to overcome in order to leverage this techniques in the
field of interactive theorem proving. \fixme{avoid ontology here.}
One of the main use of ontologies is annotation. Let us consider a set of entities available in
a given corpus. These entities may be sentences or paragraphs in a document, figures, tables,
definitions or lemmas in a document, etc. By annotation, we denote the link that may exist between
an ontology concept and a document element of the considered corpus.
The annotation process consists in defining and running a set of rules leading to the production of
annotations. This process may be completely automated, semi-automatic with user validation or completely
interactive.
\<close>
text\<open>
IDEA FOR RELATED WORK: a table with conceptual properties of ontology languages
Feature | Ontolingua | DAML+OIL | RDFS | OWL | PLIB | XML | Isa_ODL
------------------------------------------------------------------------
granularity | Word | | | | | Character | sentence (word)
relationships | | sentence - concept, is_A, (class-class)
strong typing | | on attributes and classes
links | | links , F-links
algebraic operators | | sets, lists, relations, HOL.
Constraints | | ML, executable HOL
CWA vs OWA | | OWA
Context Modeling | | Context import
Inheritance | | SINGLE
Instantiation | | MULTIPLE
\<close>
text*[intro_old::introduction]\<open>
The linking of the \<^emph>\<open>formal\<close> to the \<^emph>\<open>informal\<close> is perhaps the
most pervasive challenge in the digitization of knowledge and its
propagation. This challenge incites numerous research efforts
summarized under the labels ``semantic web'', ``data mining'', or any
form of advanced ``semantic'' text processing. A key role in
structuring this linking play \<^emph>\<open>document ontologies\<close> (also called
\<^emph>\<open>vocabulary\<close> in the semantic web community~@{cite "w3c:ontologies:2015"}),
\ie, a machine-readable form of the structure of documents as well as
the document discourse.
Such ontologies can be used for the scientific discourse within scholarly
articles, mathematical libraries, and in the engineering discourse
of standardized software certification
documents~@{cite "boulanger:cenelec-50128:2015" and "cc:cc-part3:2006"}.
Further applications are the domain-specific discourse in juridical texts or medical reports.
In general, an ontology is a formal explicit description of \<^emph>\<open>concepts\<close>
in a domain of discourse (called \<^emph>\<open>classes\<close>), properties of each concept
describing \<^emph>\<open>attributes\<close> of the concept, as well as \<^emph>\<open>links\<close> between
them. A particular link between concepts is the \<^emph>\<open>is-a\<close> relation declaring
the instances of a subclass to be instances of the super-class.
The main objective of this paper is to present \isadof, a novel
framework to \<^emph>\<open>model\<close> typed ontologies and to \<^emph>\<open>enforce\<close> them during
document evolution. Based on Isabelle infrastructures, ontologies may refer to
types, terms, proven theorems, code, or established assertions.
Based on a novel adaption of the Isabelle IDE, a document is checked to be
\<^emph>\<open>conform\<close> to a particular ontology---\isadof is designed to give fast user-feedback
\<^emph>\<open>during the capture of content\<close>. This is particularly valuable in case of document
changes, where the \<^emph>\<open>coherence\<close> between the formal and the informal parts of the
content can be mechanically checked.
To avoid any misunderstanding: \isadof is \<^emph>\<open>not a theory in HOL\<close>
on ontologies and operations to track and trace links in texts,
it is an \<^emph>\<open>environment to write structured text\<close> which \<^emph>\<open>may contain\<close>
Isabelle/HOL definitions and proofs like mathematical articles, tech-reports and
scientific papers---as the present one, which is written in \isadof
itself. \isadof is a plugin into the Isabelle/Isar
framework in the style of~@{cite "wenzel.ea:building:2007"}.
\<close>
(* declaring the forward references used in the subsequent section *)
(*<*)
declare_reference*[bgrnd::text_section]
declare_reference*[isadof::text_section]
declare_reference*[ontomod::text_section]
declare_reference*[ontopide::text_section]
declare_reference*[conclusion::text_section]
(*>*)
text*[plan::introduction]\<open> The plan of the paper is follows: we start by introducing the underlying
Isabelel sytem (@{docitem (unchecked) \<open>bgrnd\<close>}) followed by presenting the
essentials of \isadof and its ontology language (@{docitem (unchecked) \<open>isadof\<close>}).
It follows @{docitem (unchecked) \<open>ontomod\<close>}, where we present three application
scenarios from the point of view of the ontology modeling. In @{docitem_ref (unchecked) \<open>ontopide\<close>}
we discuss the user-interaction generated from the ontological definitions. Finally, we draw
conclusions and discuss related work in @{docitem_ref (unchecked) \<open>conclusion\<close>}. \<close>
section*[bgrnd::text_section,main_author="Some(@{docitem ''adb''}::author)"]
\<open> Background: The Isabelle System \<close>