\documentclass[10pt,DIV16,a4paper,abstract=true,twoside=semi,openright] {scrreprt} \usepackage[USenglish]{babel} \usepackage[numbers, sort&compress]{natbib} \usepackage{isabelle,isabellesym} \usepackage{booktabs} \usepackage{paralist} \usepackage{graphicx} \usepackage{amssymb} \usepackage{xspace} \usepackage{xcolor} \usepackage{listings} \lstloadlanguages{HTML} \usepackage[]{mathtools} \usepackage[pdfpagelabels, pageanchor=false, plainpages=false]{hyperref} \lstdefinestyle{html}{language=XML, basicstyle=\ttfamily, commentstyle=\itshape, keywordstyle=\color{blue}, ndkeywordstyle=\color{blue}, } \lstdefinestyle{displayhtml}{style=html, floatplacement={tbp}, captionpos=b, framexleftmargin=0pt, basicstyle=\ttfamily\scriptsize, backgroundcolor=\color{black!2}, frame=lines, } \lstnewenvironment{html}[1][]{\lstset{style=displayhtml, #1}}{} \def\inlinehtml{\lstinline[style=html, columns=fullflexible]} \pagestyle{headings} \isabellestyle{default} \setcounter{tocdepth}{1} \newcommand{\ie}{i.\,e.\xspace} \newcommand{\eg}{e.\,g.\xspace} \newcommand{\thy}{\isabellecontext} \renewcommand{\isamarkupsection}[1]{% \begingroup% \def\isacharunderscore{\textunderscore}% \section{#1 (\thy)}% \endgroup% } \title{Core SC DOM\\\medskip \Large A Formal Model of the Document Object Model for Safe Components}% \author{% \href{https://www.brucker.ch/}{Achim~D.~Brucker}\footnotemark[1] \and \href{https://www.michael-herzberg.de/}{Michael Herzberg}\footnotemark[2] } \publishers{ \footnotemark[1]~Department of Computer Science, University of Exeter, Exeter, UK\texorpdfstring{\\}{, } \texttt{a.brucker@exeter.ac.uk}\\[2em] % \footnotemark[2]~ Department of Computer Science, The University of Sheffield, Sheffield, UK\texorpdfstring{\\}{, } \texttt{msherzberg1@sheffield.ac.uk} } \begin{document} \maketitle \begin{abstract} \begin{quote} In this AFP entry, we formalize the core of the \emph{Safely Composable Document Object Model} (SC DOM). The SC DOM improve the standard DOM by strengthening the tree boundaries set by shadow roots: in the SC DOM, the shadow root is a sub-class of the document class (instead of a base class). This modifications also results in changes to some API methods (e.g., getOwnerDocument) to return the nearest shadow root rather than the document root. As a result, many API methods that, when called on a node inside a shadow tree, would previously ``break out'' and return or modify nodes that are possibly outside the shadow tree, now stay within its boundaries. This change in behavior makes programs that operate on shadow trees more predictable for the developer and allows them to make more assumptions about other code accessing the DOM. \bigskip \noindent{\textbf{Keywords:}} Document Object Model, DOM, SC DOM, Safely Composable DOM, Formal Semantics, Isabelle/HOL \end{quote} \end{abstract} \tableofcontents \cleardoublepage \chapter{Introduction} In a world in which more and more applications are offered as services on the internet, web browsers start to take on a similarly central role in our daily IT infrastructure as operating systems. Thus, web browsers should be developed as rigidly and formally as operating systems. While formal methods are a well-established technique in the development of operating systems (see, \eg,~\citet{klein:operating:2009} for an overview of formal verification of operating systems), there are few proposals for improving the development of web browsers using formal approaches~\cite{gardner.ea:dom:2008,raad.ea:dom:2016,jang.ea:establishing:2012,bohannon.ea:featherweight:2010}. As a first step towards a verified client-side web application stack, we model and formally verify the Document Object Model (DOM) in Isabelle/HOL\@. The DOM~\cite{whatwg:dom:2017,w3c:dom:2015} is \emph{the} central data structure of all modern web browsers. At its core, the Document Object Model (DOM), defines a tree-like data structure for representing documents in general and HTML documents in particular. Thus, the correctness of a DOM implementation is crucial for ensuring that a web browser displays web pages correctly. Moreover, the DOM is the core data structure underlying client-side JavaScript programs, \ie, client-side JavaScript programs are mostly programs that read, write, and update the DOM. In more detail, we formalize the core core of the \emph{Safely Composable Document Object Model} (SC DOM) a shallow embedding~\cite{joyce.ea:higher:1994} in Isabelle/HOL\@. Our formalization is based on a typed data model for the \emph{node-tree}, \ie, a data structure for representing XML-like documents in a tree structure. Furthermore, we formalize a typed heap for storing (partial) node-trees together with the necessary consistency constraints. Finally, we formalize the operations (as described in the DOM standard~\cite{whatwg:dom:2017}) on this heap that allow manipulating node-trees. Our machine-checked formalization of the DOM node tree~\cite{whatwg:dom:2017} has the following desirable properties: \begin{itemize} \item It provides a \emph{consistency guarantee.} Since all definitions in our formal semantics are conservative and all rules are derived, the logical consistency of the DOM node-tree is reduced to the consistency of HOL. \item It serves as a \emph{technical basis for a proof system.} Based on the derived rules and specific setup of proof tactics over node-trees, our formalization provides a generic proof environment for the verification of programs manipulating node-trees. \item It is \emph{executable}, which allows to validate its compliance to the standard by evaluating the compliance test suite on the formal model and \item It is \emph{extensible} in the sense of~\cite{brucker.ea:extensible:2008-b,brucker:interactive:2007}, \ie, properties proven over the core DOM do not need to be re-proven for object-oriented extensions such as the HTML document model. \end{itemize} The rest of this document is automatically generated from the formalization in Isabelle/HOL, i.e., all content is checked by Isabelle.\footnote{For a brief overview of the work, we refer the reader to~\cite{brucker.ea:core-dom:2018,herzberg:web-components:2020}.} The structure follows the theory dependencies (see \autoref{fig:session-graph}): we start with introducing the technical preliminaries of our formalization (\autoref{cha:perliminaries}). Next, we introduce the concepts of pointers (\autoref{cha:pointers}) and classes (\autoref{cha:classes}), i.e., the core object-oriented datatypes of the DOM. On top of this data model, we define the functional behavior of the DOM classes, i.e., their methods (\autoref{cha:monads}). In \autoref{cha:dom}, we introduce the formalization of the functionality of the core DOM, i.e., the \emph{main entry point for users} that want to use this AFP entry. Finally, we formalize the relevant compliance test cases in \autoref{cha:tests}. \paragraph{Important Note:} This document describes the formalization of the \emph{Safely Composable Document Object Model} (SC DOM), which deviated in one important aspect from the official DOM standard: in the SC DOM, the shadow root is a sub-class of the document class (instead of a base class). This modification results in a stronger notion of web components that provide improved safety properties for the composition of web components. While the SC DOM still passes the compliance test suite as provided by the authors of the DOM standard, its data model is different. We refer readers interested in a formalisation of the standard compliant DOM to the AFP entry ``Core\_DOM''~\cite{brucker.ea:afp-core-dom:2018}. \begin{figure} \centering \includegraphics[width=.8\textwidth]{session_graph} \caption{The Dependency Graph of the Isabelle Theories.\label{fig:session-graph}} \end{figure} \clearpage \chapter{Preliminaries} \label{cha:perliminaries} In this chapter, we introduce the technical preliminaries of our formalization of the core DOM, namely a mechanism for hiding type variables and the heap error monad. \input{Hiding_Type_Variables} \input{Heap_Error_Monad} \chapter{References and Pointers} \label{cha:pointers} In this chapter, we introduce a generic type for object-oriented references and typed pointers for each class type defined in the DOM standard. \input{Ref} \input{ObjectPointer} \input{NodePointer} \input{ElementPointer} brucker.ea:afp-core-dom:2018 \input{CharacterDataPointer} \input{DocumentPointer} \input{ShadowRootPointer} \chapter{Classes} \label{cha:classes} In this chapter, we introduce the classes of our DOM model. The definition of the class types follows closely the one of the pointer types. Instead of datatypes, we use records for our classes. a generic type for object-oriented references and typed pointers for each class type defined in the DOM standard. \input{BaseClass} \input{ObjectClass} \input{NodeClass} \input{ElementClass} \input{CharacterDataClass} \input{DocumentClass} \chapter{Monadic Object Constructors and Accessors} \label{cha:monads} In this chapter, we introduce the moandic method definitions for the classes of our DOM formalization. Again the overall structure follows the same structure as for the class types and the pointer types. \input{BaseMonad} \input{ObjectMonad} \input{NodeMonad} \input{ElementMonad} \input{CharacterDataMonad} \input{DocumentMonad} \chapter{The Core SC DOM} \label{cha:dom} In this chapter, we introduce the formalization of the core DOM, i.e., the most important algorithms for querying or modifying the DOM, as defined in the standard. For more details, we refer the reader to \cite{brucker.ea:core-dom:2018}. \input{Core_DOM_Basic_Datatypes} \input{Core_DOM_Functions} \input{Core_DOM_Heap_WF} \input{Core_DOM} \chapter{Test Suite} \label{cha:tests} In this chapter, we present the formalized compliance test cases for the core DOM. As our formalization is executable, we can (symbolically) execute the test cases on top of our model. Executing these test cases successfully shows that our model is compliant to the official DOM standard. As future work, we plan to generate test cases from our formal model (e.g., using~\cite{brucker.ea:interactive:2005,brucker.ea:theorem-prover:2012}) to improve the quality of the official compliance test suite. For more details on the relation of test and proof in the context of web standards, we refer the reader to \cite{brucker.ea:standard-compliance-testing:2018}. \input{Core_DOM_BaseTest} \input{Document_adoptNode} \input{Document_getElementById} \input{Node_insertBefore} \input{Node_removeChild} \input{Core_DOM_Tests} {\small \bibliographystyle{abbrvnat} \bibliography{root} } \end{document} %%% Local Variables: %%% mode: latex %%% TeX-master: t %%% End: