diff --git a/thys/Bicategory/document/root.tex b/thys/Bicategory/document/root.tex --- a/thys/Bicategory/document/root.tex +++ b/thys/Bicategory/document/root.tex @@ -1,354 +1,356 @@ +\RequirePackage{luatex85} + \documentclass[11pt,notitlepage,a4paper]{report} \usepackage{isabelle,isabellesym,eufrak} \usepackage[english]{babel} % For graphics files \usepackage[pdftex]{graphicx} % this should be the last package used \usepackage{pdfsetup} % urls in roman style, theory text in math-similar italics \urlstyle{rm} \isabellestyle{it} % XYPic package, for drawing commutative diagrams. \input{xy} \xyoption{curve} \xyoption{arrow} \xyoption{matrix} \xyoption{2cell} \xyoption{line} \UseAllTwocells % Even though I stayed within the default boundary in the JEdit buffer, % some proof lines wrap around in the PDF document. To minimize this, % increase the text width a bit from the default. \addtolength\textwidth{60pt} \addtolength\oddsidemargin{-30pt} \addtolength\evensidemargin{-30pt} \begin{document} \title{Bicategories} \author{Eugene W. Stark\\[\medskipamount] Department of Computer Science\\ Stony Brook University\\ Stony Brook, New York 11794 USA} \maketitle \begin{abstract} Taking as a starting point the author's previous work (\cite{Category3-AFP} \cite{MonoidalCategory-AFP}) on developing aspects of category theory in Isabelle/HOL, this article gives a compatible formalization of the notion of ``bicategory'' and develops a framework within which formal proofs of facts about bicategories can be given. The framework includes a number of basic results, including the Coherence Theorem, the Strictness Theorem, pseudofunctors and biequivalence, and facts about internal equivalences and adjunctions in a bicategory. As a driving application and demonstration of the utility of the framework, it is used to give a formal proof of a theorem, due to Carboni, Kasangian, and Street \cite{carboni-et-al}, that characterizes up to biequivalence the bicategories of spans in a category with pullbacks. The formalization effort necessitated the filling-in of many details that were not evident from the brief presentation in the original paper, as well as identifying a few minor corrections along the way. \end{abstract} \tableofcontents \phantomsection \addcontentsline{toc}{chapter}{Introduction} \chapter*{Introduction} Bicategories, introduced by B\'{e}nabou \cite{benabou}, are a generalization of categories in which the sets of arrows between pairs of objects (\emph{i.e.}~the ``hom-sets'') themselves have the structure of categories. In a typical formulation, the definition of bicategories involves three separate kinds of entities: \emph{objects} (or \emph{$0$-cells}), \emph{arrows} (or \emph{$1$-cells}), and morphisms between arrows (or \emph{$2$-cells}). There are two kinds of composition: \emph{vertical} composition, which composes $2$-cells within a single hom-category, and \emph{horizontal} composition, which composes $2$-cells in ``adjacent'' hom-categories ${\rm hom}(A, B)$ and ${\rm hom}(B, C)$. Horizontal composition is required to be functorial with respect to vertical composition; the identification of a $1$-cell with the corresponding identity $2$-cell then leads to the ability to horizontally compose $1$-cells with $2$-cells (\emph{i.e.}~``whiskering'') and to horizontally compose $1$-cells with each other. Each hom-category ${\rm hom}(A, A)$ is further equipped with an \emph{identity} $1$-cell ${\rm id}_A$, which serves as a unit for horizontal composition. 
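To fix notation for the discussion that follows, write $\cdot$ for vertical composition and $\star$ for horizontal composition (these symbols are chosen here purely for exposition and need not match those used in the formal text). Functoriality of horizontal composition with respect to vertical composition then amounts to the familiar \emph{interchange} equations: for $2$-cells $\alpha, \alpha'$ in ${\rm hom}(A, B)$ and $\beta, \beta'$ in ${\rm hom}(B, C)$ such that the vertical composites $\alpha' \cdot \alpha$ and $\beta' \cdot \beta$ are defined,
\[
  (\beta' \cdot \beta) \star (\alpha' \cdot \alpha) \;=\; (\beta' \star \alpha') \cdot (\beta \star \alpha),
  \qquad\qquad
  {\rm id}_g \star {\rm id}_f \;=\; {\rm id}_{g \star f},
\]
and identifying a $1$-cell with its identity $2$-cell yields the whiskerings $g \star \alpha = {\rm id}_g \star \alpha$ and $\beta \star f = \beta \star {\rm id}_f$ mentioned above.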
In a \emph{strict} bicategory, also known as a \emph{$2$-category}, the usual unit and associativity laws for horizontal composition are required to hold exactly, or (as it is said) ``on the nose''. In a general bicategory, these laws are only required to hold ``weakly''; that is, up to a collection of (vertical) isomorphisms that satisfy certain \emph{coherence conditions}. A bicategory, all of whose hom-categories are discrete, is essentially an ordinary category. A bicategory with just one object amounts to a monoidal category whose tensor is given by horizontal composition. Alternatively, we may think of bicategories as a generalization of monoidal categories in which the tensor is permitted to be a partial operation, in analogy to the way in which ordinary categories can be considered as a generalization of monoids. A standard example of a bicategory is \textbf{Cat}, the bicategory whose $0$-cells are categories, whose $1$-cells are functors, and whose $2$-cells are natural transformations. This is in fact a $2$-category; however, as two categories that are related by an equivalence of categories have the same ``categorical'' properties, it is often more sensible to consider constructions on categories as given up to equivalence, rather than up to isomorphism, and this leads to considering \textbf{Cat} as a bicategory and using bicategorical constructions rather than as a $2$-category and using $2$-categorical ones. This is one reason for the importance of bicategories: as Street \cite{street-fibrations-ii} remarks, ``In recent years it has become even more obvious that, although the fundamental constructions of set theory are categorical, the fundamental constructions of category theory are bicategorical.'' An alternative reason for studying bicategories, which is more aligned with my own personal interests and forms a major reason why I chose to pursue the present project, is that they provide an elegant framework for theories of generalized relations, as has been shown by Carboni, Walters, Street, and others \cite{carboni-et-al} \cite{cartesian-bicategories-i} \cite{cartesian-bicategories-ii} \cite{carboni-partial-maps}. Indeed, the category of sets and relations becomes a bicategory by taking the inclusions between relations as $2$-cells and thereby becomes an exemplar of the notion bicategory of relations which itself is a specialization of the notion of cartesian bicategory \cite{cartesian-bicategories-i} \cite{cartesian-bicategories-ii}. In the study of the semantics of programming languages containing nondeterministic or concurrent constructs, it is natural to consider the meaning of a program in such a language as some kind of relation between inputs and outputs. Ordinary relations can be used for this purpose in simple situations, but they fail to be adequate for the study of higher-order nondeterministic programs or for concurrent programs that engage in interaction with their environment, so some sort of notion of generalized relation is needed. One is therefore led to try to identify some kind of bicategories of generalized relations as framework suitable for defining the semantics of such programs. One expects these to be instances of cartesian bicategories. I attempted for a long time to try to develop a semantic framework for a certain class of interactive concurrent programs along the lines outlined above, but ultimately failed to obtain the kind of comprehensive understanding that I was seeking. 
The basic idea was to try to regard a program as denoting a kind of generalized machine, expressed as some sort of bimodule or two-sided fibration ({\em cf.}~\cite{street-fibrations-i} \cite{street-fibrations-ii}), to be represented as a certain kind of span in an underlying category of ``maps'', which would correspond to the meanings of deterministic programs. A difficulty with trying to formulate any kind of theory like this is that there quickly gets to be a lot of data and a lot of properties to keep track of, and it was certainly more than I could handle. For example, bicategories have objects, $1$-cells, and $2$-cells, as well as domains, codomains, composition and identities for both the horizontal and vertical structure. In addition, there are unit and associativity isomorphisms for the weak horizontal composition, as well as their associated coherence conditions. Cartesian bicategories are symmetric monoidal bicategories, which means that there is an additional tensor product, which comes with another set of canonical isomorphisms and coherence conditions. Still more canonical morphisms and coherence conditions are associated with the cartesian structure. Even worse, in order to give a proper account of the computational ideas I was hoping to capture, the underlying category of maps would at least have to be regarded as an ordered category, if not a more general $2$-category or bicategory, so the situation starts to become truly daunting. With so much data and so many properties, it is unusual in the literature to find proofs written out in anything approaching complete detail. To the extent that proofs are given, they often involve additional assumptions made purely for convenience and presentational clarity, such as assuming that the bicategories under consideration are strict when actually they are not, and then discharging these assumptions by appeals to informal arguments such as ``the result holds in the general case because we can always replace a non-strict bicategory by an equivalent strict one.'' This is perhaps fine if you happen to have finely honed insight, but in my case I am always left wondering if something important hasn't been missed or glossed over, and I don't trust very much my own ability to avoid gross errors if I were to work at the same level of detail as the proofs that I see in the literature. So my real motivation for the present project was to try to see whether a proof assistant would actually be useful in carrying out fully formalized, machine-checkable proofs of some kind of interesting facts about bicategories. I also hoped in the process to develop a better understanding of some concepts that I knew that I hadn't understood very well. The project described in the present article is divided into two main parts. The first part, which comprises Chapter 1, seeks to develop a formalization of the notion of bicategory using Isabelle/HOL and to prove various facts about bicategories that are required for a subsequent application. Additional goals here are: (1) to be able to make as much use as possible of the formalizations previously created for categories \cite{Category3-AFP} and monoidal categories \cite{MonoidalCategory-AFP}; (2) to create a plausibly useful framework for future extension; and (3) to better understand some subtleties involved in the definition of bicategory. In this chapter, we give an HOL formalization of bicategories that makes use of and extends the formalization of categories given in \cite{Category3-AFP}. 
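For later reference, the unit and associativity isomorphisms mentioned above, together with their coherence conditions, take the following standard form (the names $\mathsf{a}$, $\mathsf{l}$, $\mathsf{r}$ are the conventional ones and are not intended to match identifiers used in the formal development). For $1$-cells $f : A \to B$, $g : B \to C$, $h : C \to D$, $k : D \to E$ there are invertible $2$-cells
\[
  \mathsf{a}_{h,g,f} : (h \star g) \star f \Rightarrow h \star (g \star f),
  \qquad
  \mathsf{l}_f : {\rm id}_B \star f \Rightarrow f,
  \qquad
  \mathsf{r}_f : f \star {\rm id}_A \Rightarrow f,
\]
subject to the pentagon and triangle conditions
\[
  \mathsf{a}_{k,h,g \star f} \cdot \mathsf{a}_{k \star h,g,f}
  \;=\;
  (k \star \mathsf{a}_{h,g,f}) \cdot \mathsf{a}_{k,h \star g,f} \cdot (\mathsf{a}_{k,h,g} \star f),
  \qquad
  (g \star \mathsf{l}_f) \cdot \mathsf{a}_{g,{\rm id}_B,f} \;=\; \mathsf{r}_g \star f .
\]
The Coherence Theorem asserts, roughly, that any two $2$-cells with the same domain and codomain that are built up from these components using horizontal and vertical composition are equal.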
In that previous work, categories were formalized in an ``object-free'' style in terms of a suitably defined associative partial binary operation of composition on a single type. Elements of the type that behave as units for the composition were called ``identities'' and the ``arrows'' were identified as the elements of the type that are composable both on the left and on the right with identities. The identities composable in this way with an arrow were then shown to be uniquely determined, which permitted domain and codomain functions to be defined. This formalization of categories is economical in terms of basic data (only a single partial binary operation is required), but perhaps more importantly, functors and natural transformations need not be defined as structured objects, but instead can be taken to be ordinary functions between types that suitably preserve arrows and composition. In order to carry forward unchanged the framework developed for categories, for the formalization of bicategories we take as a jumping-off point the somewhat offbeat view of a bicategory as a single global category under vertical composition (the arrows are the $2$-cells), which is then equipped with an additional partial binary operation of horizontal composition. This point of view corresponds to thinking of bicategories as generalizations of monoidal categories in which the tensor is allowed to be a partial operation. In a direct generalization of the approach taken for categories, we then show that certain \emph{weak units} with respect to the horizontal composition play the role of $0$-cells (the identities with respect to vertical composition play the role of $1$-cells) and that we can define the \emph{sources} and \emph{targets} of an arrow -as the sets of weak units horizontally composable on the right and on the left with it. +as the sets of weak units horizontally composable on the right and on the left with it. We then define a notion of weak associativity for the horizontal composition and arrive at the definition of a \emph{prebicategory}, which consists of a (vertical) category equipped with an associative weak (horizontal) composition, subject to the additional assumption that every vertical arrow has a nonempty set of sources and targets with respect to the horizontal composition. We then show that, to obtain from a prebicategory a structure that satisfies a more traditional-looking definition of a bicategory, all that is necessary is to choose arbitrarily a particular representative source and target for each arrow. Moreover, every bicategory determines a prebicategory by simply forgetting the chosen sources and targets. This development clarifies that an \emph{a priori} assignment of source and target objects for each $2$-cell is merely a convenience, rather than an element essential to the notion of bicategory. Additional highlights of Chapter 1 are as follows: \begin{itemize} \item As a result of having formalized bicategories essentially as ``monoidal categories with partial tensor'', we are able to generalize to bicategories, in a mostly straightforward way, the proof of the Coherence Theorem we previously gave for monoidal categories in \cite{MonoidalCategory-AFP}. We then develop some machinery that enables us to apply the Coherence Theorem to shortcut certain kinds of reasoning involving canonical isomorphisms. 
% \item Using the syntactic setup developed for the proof of the Coherence Theorem, we also give a proof of the Strictness Theorem, which states that every bicategory is biequivalent to a $2$-category, its so-called ``strictification''. % \item We define the notions of internal equivalence and internal adjunction in a bicategory and prove a number of basic facts about these notions, including composition of equivalences and adjunctions, and that every equivalence can be refined to an adjoint equivalence. % \item We formalize the notion of a pseudofunctor between bicategories, generalizing the notion of a monoidal functor between monoidal categories and we show that pseudofunctors preserve internal equivalences and adjunctions. % \item We define a sub-class of pseudofunctors which we call \emph{equivalence pseudofunctors}. Equivalence pseudofunctors are intended to coincide with those pseudofunctors that can be extended to an equivalence of bicategories, but we do not attempt to give an independent definition equivalence of bicategories in the present development. Instead, we establish various properties of equivalence pseudofunctors to provide some confidence that the notion has been formalized correctly. Besides establishing various preservation results, we prove that, given an equivalence pseudofunctor, we may obtain one in the converse direction. For the rest of this article we use the property of two bicategories being connected by an equivalence pseudofunctor as a surrogate for the property of biequivalence, leaving for future work a more proper formulation of equivalence of bicategories and a full verification of the relationship of this notion with equivalence pseudofunctors. \end{itemize} The second part of the project, presented in Chapter 2, is to demonstrate the utility of the framework by giving a formalized proof of a nontrivial theorem about bicategories. For this part, I chose to tackle a theorem of Carboni, Kasangian, and Street (\cite{carboni-et-al}, ``CKS'' for short) which gives axioms that characterize up to equivalence those bicategories whose $1$-cells are spans of arrows in an underlying category with pullbacks and whose $2$-cells are arrows of spans. The original paper is very short (nine pages in total) and the result I planned to formalize (Theorem 4) was given on the sixth page. I thought I had basically understood this result and that the formalization would not take very long to accomplish, but I definitely underestimated both my prior understanding of the result and the amount of auxiliary material that it would be necessary to formalize before I could complete the main proof. Eventually I did complete the formalization, and in the process filled in what seemed to me to be significant omissions in Carboni, Kasangian, and Street's presentation, as well as correcting some errors of a minor nature. Highlights of Chapter 2 are the following: \begin{itemize} \item A formalization of the notion of a category with chosen pullbacks, a proof that this formalization is in agreement with the general definition of limits we gave previously in \cite{Category3-AFP}, and the development of some basic properties of a category with pullbacks. % \item A construction, given a category $C$ with chosen pullbacks, of the ``span bicategory'' ${\rm Span}(C)$, whose objects are those of the given category, whose $1$-cells are spans of arrows of $C$, and whose $2$-cells are arrows of spans. 
We characterize the maps (\emph{i.e.}~the left adjoints) in ${\rm Span}(C)$ as exactly those spans whose ``input leg'' is invertible.
%
\item A formalization of the notion of \emph{tabulation} of a $1$-cell in a bicategory and a development of some of its properties. Tabulations are a kind of bicategorical limit introduced by CKS, which can be used to define a kind of biuniversal way of factoring a $1$-cell up to isomorphism as the horizontal composition of a map and the adjoint of a map.
%
\item A formalization of \emph{bicategories of spans}, which are bicategories that satisfy three axioms introduced in CKS. We give a formal proof of CKS Theorem 4, which characterizes the bicategories of spans as those bicategories that are biequivalent to a bicategory ${\rm Span}(C)$ for some category $C$ with pullbacks. One direction of the proof shows that if $C$ is a category with pullbacks, then ${\rm Span}(C)$ satisfies the axioms for a bicategory of spans. Moreover, we show that the notion ``bicategory of spans'' is preserved under equivalence of bicategories, so that in fact any bicategory biequivalent to one of the form ${\rm Span}(C)$ is a bicategory of spans. Conversely, we show that if $B$ is a bicategory of spans, then $B$ is biequivalent to ${\rm Span}({\rm Maps}(B))$, where ${\rm Maps}(B)$ is the so-called \emph{classifying category} of the maps in $B$, which has as objects those of $B$ and as arrows the isomorphism classes of maps in $B$. In order to formalize the proof of this result, it was necessary to develop a number of details not mentioned by CKS, including ways of composing tabulations vertically and horizontally, and spelling out a way to choose pullbacks in ${\rm Maps}(B)$ so that the tupling of arrows of ${\rm Maps}(B)$ obtained using the chosen pullbacks agrees with that obtained through horizontal composition of tabulations. These details were required in order to give the definition of the compositor for an equivalence pseudofunctor ${\rm SPN}$ from $B$ to ${\rm Span}({\rm Maps}(B))$ and establish the necessary coherence conditions.
\end{itemize}
In the end, I think it can be concluded that Isabelle/HOL can be used with benefit to formalize proofs about bicategories. It is certainly very helpful for keeping track of the data involved and the proof obligations required. For example, in the formalization given here, a total of 99 separate subgoals are involved in proving that a given set of data constitutes a bicategory (only 7 subgoals are required for an ordinary category) and another 29 subgoals must be proved in order to establish a pseudofunctor between two bicategories (only 5 additional subgoals are required for an ordinary functor), but the proof assistant assumes the burden of keeping track of these proof obligations and presenting them to the human user in a structured, understandable fashion. On the other hand, some of the results proved here still required some lengthy equational ``diagram chases'' for which the proof assistant (at least so far) didn't provide that much help (aside from checking their correctness). An exception to this was in the case of equational reasoning about expressions constructed purely of canonical isomorphisms, which our formulation of the Coherence Theorem permitted to be carried out automatically by the simplifier. It seems likely, though, that there is still room for more general procedures to be developed in order to allow other currently lengthy chains of equational reasoning to be carried out automatically.
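For readers unfamiliar with span bicategories, the basic constructions referred to in the highlights above can be sketched as follows (standard notation, not that of the formal text). A $1$-cell of ${\rm Span}(C)$ from $A$ to $B$ is a span $A \stackrel{f_0}{\longleftarrow} F \stackrel{f_1}{\longrightarrow} B$ of arrows of $C$, and a $2$-cell to a parallel span $A \stackrel{f'_0}{\longleftarrow} F' \stackrel{f'_1}{\longrightarrow} B$ is an arrow $u : F \to F'$ of $C$ such that $f'_0 u = f_0$ and $f'_1 u = f_1$. Horizontal composition of $1$-cells uses a chosen pullback: the composite of
\[
  A \stackrel{f_0}{\longleftarrow} F \stackrel{f_1}{\longrightarrow} B
  \qquad\mbox{and}\qquad
  B \stackrel{g_0}{\longleftarrow} G \stackrel{g_1}{\longrightarrow} C
\]
is
\[
  A \stackrel{f_0 \pi_0}{\longleftarrow} F \times_B G \stackrel{g_1 \pi_1}{\longrightarrow} C,
\]
where $\pi_0$ and $\pi_1$ are the projections of a chosen pullback of $f_1$ and $g_0$ (so that $f_1 \pi_0 = g_0 \pi_1$). Because a pullback is determined only up to isomorphism, this composition is associative and unital only up to canonical isomorphism, which is why ${\rm Span}(C)$ is a bicategory rather than a $2$-category.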
\medskip\par\noindent {\bf Revision Notes} The original version of this document dates from January, 2020. The current version of this document incorporates revisions made in early to mid-2020. A number of the changes consisted of minor improvements and speedups. Aside from these main change was that theory ``category with pullbacks'' was moved to \cite{Category3-AFP}, where it more logically belongs. \phantomsection \addcontentsline{toc}{chapter}{Preliminaries} \chapter*{Preliminaries} \input{IsomorphismClass.tex} \chapter{Bicategories} \input{Prebicategory.tex} \input{Bicategory.tex} \input{Coherence.tex} \input{CanonicalIsos.tex} \input{Subbicategory.tex} \input{InternalEquivalence.tex} \input{Pseudofunctor.tex} \input{Strictness.tex} \input{InternalAdjunction.tex} \chapter{Bicategories of Spans} \input{SpanBicategory.tex} \input{Tabulation.tex} \input{BicategoryOfSpans.tex} \phantomsection \addcontentsline{toc}{chapter}{Bibliography} \bibliographystyle{abbrv} \bibliography{root} \end{document} diff --git a/thys/InfPathElimination/document/root.tex b/thys/InfPathElimination/document/root.tex --- a/thys/InfPathElimination/document/root.tex +++ b/thys/InfPathElimination/document/root.tex @@ -1,129 +1,129 @@ %\documentclass[11pt,a4paper]{article} -\documentclass[11pt, USenglish, pdftex]{article} +\documentclass[11pt, USenglish]{article} \usepackage{isabelle,isabellesym} \usepackage{hyphenat} \usepackage{authblk} \usepackage[final]{graphicx} \usepackage{url} % further packages required for unusual symbols (see also % isabellesym.sty), use only when needed %\usepackage{amssymb} %for \, \, \, \, \, \, %\, \, \, \, \, %\, \, \ %\usepackage{eurosym} %for \ %\usepackage[only,bigsqcap]{stmaryrd} %for \ %\usepackage{eufrak} %for \ ... \, \ ... \ (also included in amssymb) %\usepackage{textcomp} %for \, \, \, \, \, %\ % this should be the last package used \usepackage{pdfsetup} % urls in roman style, theory text in math-similar italics \urlstyle{rm} \isabellestyle{it} % for uniform font size %\renewcommand{\isastyle}{\isastyleminor} \begin{document} -\title{\huge Infeasible Paths Elimination by Symbolic Execution Techniques:\\ +\title{\huge Infeasible Paths Elimination by Symbolic Execution Techniques:\\ \large Proof of Correctness and Preservation of Paths} \author{% %\href{https://www.lri.fr/~aissat/}{ Romain Aissat %} and %\href{https://www.lri.fr/~fv/}{ Fr\'ed\'eric Voisin %} and %\href{https://www.lri.fr/~wolff/}{ Burkhart Wolff %} } \affil{% LRI, Univ Paris-Sud, CNRS, CentraleSup\'elec,\\Universit\'e Paris-Saclay, France\\ -\href{mailto:"Romain Aissat"}{aissat@lri.fr}, +\href{mailto:"Romain Aissat"}{aissat@lri.fr}, \href{mailto:"Burkhart Wolff"}{wolff@lri.fr} } \maketitle \begin{abstract} TRACER~\cite{DBLP:conf/cav/JaffarMNS12} is a tool for verifying safety properties of sequential C programs. TRACER attempts at building a finite -symbolic execution graph which over\hyp{}approximates the +symbolic execution graph which over\hyp{}approximates the set of all concrete reachable states and the set of feasible paths. - -We present an abstract framework for TRACER and similar -CEGAR\hyp{}like systems~\cite{DBLP:journals/sttt/BeyerHJM07,DBLP:conf/tacas/ClarkeKSY05,DBLP:conf/cav/IvancicYGGSA05,DBLP:conf/pldi/GrebenshchikovLPR12,McMillan2006}. + +We present an abstract framework for TRACER and similar +CEGAR\hyp{}like systems~\cite{DBLP:journals/sttt/BeyerHJM07,DBLP:conf/tacas/ClarkeKSY05,DBLP:conf/cav/IvancicYGGSA05,DBLP:conf/pldi/GrebenshchikovLPR12,McMillan2006}. 
The framework provides -1) a graph\hyp{}transformation based method for reducing the feasible paths in +1) a graph\hyp{}transformation based method for reducing the feasible paths in control\hyp{}flow graphs, -2) a model for symbolic execution, subsumption, predicate abstraction and +2) a model for symbolic execution, subsumption, predicate abstraction and invariant generation. In this framework we formally prove two key properties: correct construction of the symbolic states and preservation of feasible paths. The framework focuses on core operations, leaving to concrete prototypes to ``fit in'' heuristics for combining them. The accompanying paper (published in ITP 2016) can be found at \url{https://www.lri.fr/~wolff/papers/conf/2016-itp-InfPathsNSE.pdf}, also appeared in\cite{AissatVW2016}. \bigskip -\noindent{\textbf{Keywords: TRACER, CEGAR, Symbolic Executions, Feasible Paths, +\noindent{\textbf{Keywords: TRACER, CEGAR, Symbolic Executions, Feasible Paths, Control-Flow Graphs, Graph Transformation}} \end{abstract} \newpage - + \tableofcontents \newpage % sane default for proof documents \parindent 0pt\parskip 0.5ex \input{intro} \newpage % generated text of all theories \input{session} \newpage \input{summary} %\newpage % optional bibliography %\bibliographystyle{abbrv} \bibliographystyle{IEEEtran} \bibliography{root} \end{document} %%% Local Variables: %%% mode: latex %%% TeX-master: t %%% End: diff --git a/thys/Possibilistic_Noninterference/document/root.tex b/thys/Possibilistic_Noninterference/document/root.tex --- a/thys/Possibilistic_Noninterference/document/root.tex +++ b/thys/Possibilistic_Noninterference/document/root.tex @@ -1,62 +1,64 @@ +\RequirePackage{luatex85} + \documentclass[11pt,a4paper]{article} \usepackage{isabelle,isabellesym} %\usepackage{graphicx} -%\usepackage{alltt} +%\usepackage{alltt} %\usepackage{url} %\usepackage{hyperref} %\usepackage{mathptmx} %\include{myCommands} %\include{tikzlib} \usepackage[all]{xy} %\usepackage{cite} \usepackage[only,bigsqcap]{stmaryrd} % this should be the last package used \usepackage{pdfsetup} % urls in roman style, theory text in math-similar italics \urlstyle{rm} \isabellestyle{it} \begin{document} \title{Possibilistic Noninterference\thanks{Supported by the DFG project Ni 491/13--1 (part of the DFG priority program RS3) and the DFG RTG 1480.} } \author{Andrei Popescu \hspace*{10ex} Johannes H\"{o}lzl} -%\institute{1: Technische Universit\"{a}t M\"{u}nchen -% \hspace*{2ex} 2: Institute of Mathematics Simion Stoilow, Romania +%\institute{1: Technische Universit\"{a}t M\"{u}nchen +% \hspace*{2ex} 2: Institute of Mathematics Simion Stoilow, Romania % } \maketitle \begin{abstract} -We formalize a wide variety of Volpano/Smith-style -noninterference notions for a while language with parallel composition. -We systematize and classify these notions according to compositionality w.r.t.~the language constructs. -Compositionality yields sound syntactic criteria (a.k.a.~type systems) in a uniform way. +We formalize a wide variety of Volpano/Smith-style +noninterference notions for a while language with parallel composition. +We systematize and classify these notions according to compositionality w.r.t.~the language constructs. +Compositionality yields sound syntactic criteria (a.k.a.~type systems) in a uniform way. 
\end{abstract} \tableofcontents % sane default for proof documents \parindent 0pt\parskip 0.5ex \input{intro} % generated text of all theories \input{session} % optional bibliography \bibliographystyle{abbrv} \bibliography{root} \end{document} %%% Local Variables: %%% mode: latex %%% TeX-master: t %%% End: diff --git a/thys/Store_Buffer_Reduction/document/root.tex b/thys/Store_Buffer_Reduction/document/root.tex --- a/thys/Store_Buffer_Reduction/document/root.tex +++ b/thys/Store_Buffer_Reduction/document/root.tex @@ -1,723 +1,725 @@ +\RequirePackage{luatex85} + %\documentclass[11pt,a4paper]{article} \documentclass[11pt]{llncs} \pdfoutput=1 \usepackage[utf8]{inputenc} % replace by the encoding you are using \usepackage{geometry} \geometry{ a4paper, % or letterpaper textwidth=15cm, % llncs has 12.2cm textheight=24cm, % llncs has 19.3cm heightrounded, % integer number of lines hratio=1:1, % horizontally centered vratio=2:3, % not vertically centered } \usepackage{amsmath} \usepackage{graphicx} \usepackage{mathpartir} \usepackage{float} \usepackage{cite} % produce nice graphics using tikz and pgf %\usepackage{tikz} %\usetikzlibrary{snakes} \usepackage{xspace} \newcommand{\cf}{cf.\@\xspace} \newcommand{\eg}{e.g.,\xspace} \newcommand{\ie}{i.e.,\xspace} \newcommand{\Sound}{sound} \usepackage{isabelle,isabellesym} \usepackage{amssymb} % We redefine it use the ragged2e package for \RaggedRight which makes better % use of hyphenation in Latex2e compared to standard \raggedright \usepackage{ragged2e} % {tabularx} provides an expanding column specifier X \usepackage{tabularx} % todo not loading tabularx seems to break compilation \usepackage{booktabs} \usepackage{longtable} \usepackage{amsmath} \usepackage{stmaryrd} \usepackage{subfig} \usepackage[draft]{fixme} % provides {inparaenum} for enumerations inside a paragraph \usepackage[defblank]{paralist} % make roman numerals default for {inparaenum}: \let\oldinparaenum=\inparaenum \def\inparaenum{\oldinparaenum[(i)]} %inference rules \usepackage{mathpartir} %for greek letters in our Isabelle setup \usepackage[greek, english]{babel} \usepackage{type1cm} \usepackage{float} \usepackage{tikz} \usetikzlibrary{calc} \usepackage{pdfsetup} %fix robustness issue in Isabelle 2007 \DeclareRobustCommand{\isascriptstyle}{\def\isamath##1{##1}\def\isatext##1{\mbox{\isastylescript##1}}} \newcommand{\isabellestylesf}{% \isabellestyleit% \renewcommand{\isastyle}{\small\sf}% \renewcommand{\isastyleminor}{\sf}% \renewcommand{\isastylescript}{\scriptsize}% \renewcommand{\isascriptstyle}{\def\isamath####1{####1}\def\isatext####1{\mbox{\isastylescript####1}}\def\isagreek####1{\foreignlanguage{greek}{\mbox{\isastylescript####1}}}}% \renewcommand{\isacharprime}{\ensuremath{\mskip2mu{'}\mskip-2mu}}% \DeclareRobustCommand{\isactrlsub}[1]{{\isascriptstyle${}\mathsf{\sb{\vphantom{gb}##1}}$}}% \DeclareRobustCommand{\isactrlsup}[1]{{\isascriptstyle${}\mathsf{\sp{\vphantom{gb}##1}}$}}% \DeclareRobustCommand{\isactrlisub}[1]{{\isascriptstyle${}\mathsf{\sb{\vphantom{gb}##1}}$}}% \DeclareRobustCommand{\isactrlisup}[1]{{\isascriptstyle${}\mathsf{\sp{\vphantom{gb}##1}}$}}% \DeclareRobustCommand{\isactrlbsub}{\bgroup\isascriptstyle\begin{math}{}\mathsf\bgroup\sb\bgroup}% \DeclareRobustCommand{\isactrlesub}{\egroup\egroup\end{math}\egroup}% \DeclareRobustCommand{\isactrlbsup}{\bgroup\isascriptstyle\begin{math}{}\mathsf\bgroup\sp\bgroup}% \DeclareRobustCommand{\isactrlesup}{\egroup\egroup\end{math}\egroup}% \renewcommand{\isamarkupchapter}[1]{\isastyletext\chapter{##1}} 
\renewcommand{\isamarkupsection}[1]{\isastyletext\section{##1}} \renewcommand{\isamarkupsubsection}[1]{\isastyletext\subsection{##1}} \renewcommand{\isamarkupsubsubsection}[1]{\isastyletext\subsubsection{##1}} %\renewcommand{\isamarkupsect}[1]{\isastyletext\section{##1}} %\renewcommand{\isamarkupsubsect}[1]{\isastyletext\subsection{##1}} %\renewcommand{\isamarkupsubsubsect}[1]{\isastyletext\subsubsection{##1}} %we use babel for greek letters to easily obtain different fontshapes; \mbox helps in math mode \newcommand{\isagreek}[1]{\foreignlanguage{greek}{\mbox{##1}}} \renewcommand{\isasymalpha}{\isagreek{a}} \renewcommand{\isasymbeta}{\isagreek{b}} \renewcommand{\isasymgamma}{\isagreek{g}} \renewcommand{\isasymdelta}{\isagreek{d}} \renewcommand{\isasymepsilon}{\isagreek{e}} \renewcommand{\isasymzeta}{\isagreek{z}} \renewcommand{\isasymeta}{\isagreek{h}} \renewcommand{\isasymtheta}{\isagreek{j}} \renewcommand{\isasymiota}{\isagreek{i}} \renewcommand{\isasymkappa}{\isagreek{k}} \renewcommand{\isasymlambda}{\isamath{\lambda}} \renewcommand{\isasymmu}{\isagreek{m}} \renewcommand{\isasymnu}{\isagreek{n}} \renewcommand{\isasymxi}{\isagreek{x}} \renewcommand{\isasympi}{\isagreek{p}} \renewcommand{\isasymrho}{\isagreek{r}} \renewcommand{\isasymsigma}{\isagreek{sv}} \renewcommand{\isasymtau}{\isagreek{t}} \renewcommand{\isasymupsilon}{\isagreek{u}} \renewcommand{\isasymphi}{\isagreek{f}} \renewcommand{\isasymchi}{\isagreek{q}} \renewcommand{\isasympsi}{\isagreek{y}} \renewcommand{\isasymomega}{\isagreek{w}} \renewcommand{\isasymGamma}{\isagreek{G}} \renewcommand{\isasymDelta}{\isagreek{D}} \renewcommand{\isasymTheta}{\isagreek{J}} \renewcommand{\isasymLambda}{\isagreek{L}} \renewcommand{\isasymXi}{\isagreek{X}} \renewcommand{\isasymPi}{\isagreek{P}} \renewcommand{\isasymSigma}{\isagreek{Sv}} \renewcommand{\isasymUpsilon}{\isagreek{U}} \renewcommand{\isasymPhi}{\isagreek{F}} \renewcommand{\isasymPsi}{\isagreek{Y}} \renewcommand{\isasymOmega}{\isagreek{W}} } \isabellestyle{sf} % todo compared to the original isabelle.sty, we have omitted some vertical % spacing and forced \par's -- check if we really want this in the end... % % The problems we tried to fix with the current solution is that `unmotivated' % vertical space appeared between alternations of "(*<*)...(*>*)" and "text {* % ... *}", which is not acceptable and must otherwise be fixed manually by % reordering the material in the theories. 
\renewcommand{\isabeginpar}{} \renewcommand{\isaendpar}{} \makeatletter \renewenvironment{isapar}{\parindent\isa@parindent\parskip\isa@parskip\isabeginpar}{\isaendpar} \makeatother \DeclareRobustCommand\ensuretext[1]{\ifmmode\text{#1}\else{#1}\fi} \newcommand{\freefnt}[1]{\textsl{\rmfamily#1}} \newcommand{\boundfnt}[1]{{\textsl{\sffamily#1}}} \newcommand{\constructorfnt}[1]{\textsc{#1}} \newcommand{\holkeywordfnt}[1]{\texttt{#1}} \newcommand{\tfreeify}[1]{\ensuretext{\freefnt{#1}}} \newcommand{\freeify}[1]{\ensuretext{\freefnt{#1}}} \newcommand{\boundify}[1]{\ensuretext{\boundfnt{#1}}} \newcommand{\constructor}[1]{\ensuretext{\constructorfnt{#1}}} \newcommand{\holkeyword}[1]{\ensuretext{\holkeywordfnt{#1}}} \newcommand{\isasymllceil}{\isamath{\llceil}} \newcommand{\isasymrrceil}{\isamath{\rrceil}} \renewcommand{\isasymturnstile}{\isamath{\,\vdash}} \renewcommand{\isasymTurnstile}{\isamath{\,\models}} \newcommand{\isastring}[1]{``#1''} \newcommand{\accessor}[1]{\isa{the}$_{\textsc{#1}}$} \newcommand{\isaclike}[1]{\texttt{#1}} \renewcommand{\isasymvv}{\mbox{\isastyleminor\isastylescript v+1}} % include pdfcolor when using pdflatex %\ifpdf % \input pdfcolor.tex %\fi \newcommand{\listty}[1]{% \ensuremath{\mathit{\id{#1} \ \id{list} } } } \newcommand{\texth}{\mathcode`\-=`\-\relax} \newcommand{\id}[1]{% \ensuremath{\mathit{\texth#1}}} \newcommand{\co}[1]{% \ensuremath{\mathsf{\texth#1}}} \newcommand{\rf}[1]{% \ensuremath{\mathsf{#1}}} \newcommand{\Some}[1]{% \ensuremath{\mathit{\left\lfloor #1 \right\rfloor} } } \newcommand{\cons}[1]{% \ensuremath{\mathsf{#1}}} % TODO: improve typesetting of formulas. use Isabelle's document generation? if so remove the following macro definitions % some macro's \DeclareRobustCommand{\listlength}[1]{\left|#1\right|} % Isabelle: length \DeclareRobustCommand{\MemConfLM}{\ensuremath{\id{lm}}\xspace} \DeclareRobustCommand{\recursiondepth}[1]{\listlength{#1.\MemConfLM}} \DeclareRobustCommand{\heapbase}{\ensuremath{\id{abase}_\text{heap}}\xspace} % Isabelle: heap_base, heap_base_word \DeclareRobustCommand{\maxheapsize}{\ensuremath{\id{asize}_\text{heap}^\text{max}}\xspace} % Isabelle: heap_size_max \DeclareRobustCommand{\maxheap}{\ensuremath{\heapbase+\maxheapsize}\xspace} % Isabelle: max_heap \DeclareRobustCommand{\intwdasnat}{\ensuremath{\id{i2n}}\xspace} % Isabelle: intwd_as_nat \DeclareRobustCommand{\gprs}{\ensuremath{\id{gpr}}\xspace} \DeclareRobustCommand{\heaptopreg}{\ensuremath{r_\text{htop}}\xspace} % Isabelle: heaptop_reg \DeclareRobustCommand{\lastframereg}{\ensuremath{r_\text{lframe}}\xspace} % Isabelle: last_frame_reg \DeclareRobustCommand{\sbasereg}{\ensuremath{r_\text{sbase}}\xspace} % Isabelle: sbase_reg \DeclareRobustCommand{\sbasebubble}{\ensuremath{\id{bubble}_\text{code}}\xspace} % Isabelle: sbase_bubble \DeclareRobustCommand{\computesbase}{\ensuremath{\id{abase}_\text{gm}}\xspace} % Isabelle: compute_sbase \DeclareRobustCommand{\programbase}{\ensuremath{\id{progbase}}\xspace} % Isabelle: program_basee, program_base_word \DeclareRobustCommand{\stackframebubble}{\ensuremath{\id{bubble}_\text{gm}}\xspace} % Isabelle: stack_frame_bubble \DeclareRobustCommand{\abaselocalframe}{\ensuremath{\id{abase}_\text{lm}}\xspace} % Isabelle: abase_local_frame \DeclareRobustCommand{\tenv}{\ensuremath{\id{te}}\xspace} \DeclareRobustCommand{\pt}{\ensuremath{\id{ft}}\xspace} \DeclareRobustCommand{\extractsymbolconf}{\ensuremath{\id{sc}}\xspace} % Isabelle: extract_symbolconf \DeclareRobustCommand{\stackstart}{\abaselocalframe} % Isabelle: stack_start 
\DeclareRobustCommand{\gmsymbols}{\ensuremath{\id{gst}}\xspace} % Isabelle: gm_symbols %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% \def\listith{\id{!}} \def\implies {\ensuremath{\Rightarrow}} \def\ibool {\id{bool}} \def\inat{\Nat} \def\iff {\textbf{if }} \def\then {\textbf{then}} \def\elseif {\textbf{else}} \def\case{\id{\textbf{case}}} \def\of{\id{\textbf{ of }}} \def\indef{\id{\textbf{in }}} \def\letdef{\id{\textbf{let }}} \def\undef{\id{undef}} \def\funnames{\id{fun_n}} \def\cc {\id{c}} \def\sczero {\id{c0}} \def\confc {\ensuremath{\mathit{C_{\rm{co}}}}} \def\deltac {\ensuremath{\mathit{\delta_\sczero}}} \newcommand{\Def}[1]{\emph{#1}} \sloppy % The following is enclosed to allow easy detection of differences in % ascii coding. % Upper-case A B C D E F G H I J K L M N O P Q R S T U V W X Y Z % Lower-case a b c d e f g h i j k l m n o p q r s t u v w x y z % Digits 0 1 2 3 4 5 6 7 8 9 % Exclamation ! Double quote " Hash (number) # % Dollar $ Percent % Ampersand & % Acute accent ' Left paren ( Right paren ) % Asterisk * Plus + Comma , % Minus - Point . Solidus / % Colon : Semicolon ; Less than < % Equals =3D Greater than > Question mark ? % At @ Left bracket [ Backslash \ % Right bracket ] Circumflex ^ Underscore _ % Grave accent ` Left brace { Vertical bar | % Right brace } Tilde ~ \newcommand{\Nat}{{\mathbb N}} \newcommand{\Real}{{\mathbb R}} \def\lastname{Schirmer} \pagestyle{plain} \setcounter{tocdepth}{3} \begin{document} %\begin{frontmatter} \title{A Reduction Theorem for Store Buffers} \author{Ernie Cohen\inst{1}, Norbert Schirmer\inst{2}\fnmsep\thanks{Work funded by the German Federal Ministry of Education and Research (BMBF) in the framework of the Verisoft XT project under grant 01 IS 07 008.}} \institute{Microsoft Corp., Redmond, WA, USA \and German Research Center for Artificial Intelligence (DFKI) Saarbr\"ucken, Germany\\ \email{ecohen@amazon.com}, \email{norbert.schirmer@web.de}} \maketitle \begin{abstract} When verifying a concurrent program, it is usual to assume that memory is sequentially consistent. However, most modern multiprocessors depend on store buffering for efficiency, and provide native sequential consistency only at a substantial performance penalty. To regain sequential consistency, a programmer has to follow an appropriate programming discipline. However, na\"ive disciplines, such as protecting all shared accesses with locks, are not flexible enough for building high-performance multiprocessor software. We present a new discipline for concurrent programming under TSO (total store order, with store buffer forwarding). It does not depend on concurrency primitives, such as locks. Instead, threads use ghost operations to acquire and release ownership of memory addresses. A thread can write to an address only if no other thread owns it, and can read from an address only if it owns it or it is shared and the thread has flushed its store buffer since it last wrote to an address it did -not own. This discipline covers both coarse-grained concurrency (where -data is protected by locks) as well as fine-grained concurrency (where +not own. This discipline covers both coarse-grained concurrency (where +data is protected by locks) as well as fine-grained concurrency (where atomic operations race to memory). We formalize this discipline in Isabelle/HOL, and prove that if every execution of a program in a system without store buffers follows the discipline, then every execution of the program with store buffers is sequentially consistent. 
Thus, we can show sequential consistency under TSO by ordinary assertional reasoning about the program, without having to consider store buffers at all. \end{abstract} \tableofcontents %\begin{keyword} %Pervasive formal verification, systems verification, software verification, theorem proving %\end{keyword} %\end{frontmatter} \section{Introduction \label{sec:introduction}} When verifying a shared-memory concurrent program, it is usual to assume that each memory operation works directly on a shared memory state, a model sometimes called \Def{atomic} memory. A memory implementation that provides this abstraction for programs that communicate only through shared memory is said to be \Def{sequentially consistent}. Concurrent algorithms in the computing literature tacitly assume sequential consistency, as do most application programmers. However, modern computing platforms typically do not guarantee sequential consistency for arbitrary programs, for two reasons. First, optimizing compilers are typically incorrect unless the program is appropriately annotated to indicate which program locations might be concurrently accessed by other threads; this issue is addressed only cursorily in this report. Second, modern processors buffer stores of retired instructions. To make such buffering transparent to single-processor programs, subsequent reads of the processor read from these buffers in preference to the cache. (Otherwise, a program could write a new value to an address but later read an older value.) However, in a multiprocessor system, processors do not snoop the store buffers of other processors, so a store is visible to the storing processor before it is visible to other processors. This can result in executions that are not sequentially consistent. The simplest example illustrating such an inconsistency is the following program, consisting of two threads T0 and T1, where \texttt{x} and \texttt{y} are shared memory variables (initially 0) and \texttt{r0} and \texttt{r1} are registers: % \begin{center} \begin{minipage}{6cm} \begin{multicols}{3} T0 \begin{verbatim} x = 1; r0 = y; \end{verbatim} -\columnbreak +\columnbreak T1 \begin{verbatim} y = 1; r1 = x; \end{verbatim} -\columnbreak +\columnbreak \end{multicols} \end{minipage} \end{center} % In a sequentially consistent execution, it is impossible for both \texttt{r0} and \texttt{r1} to be assigned $0$. This is because the assignments to \texttt{x} and \texttt{y} must be executed in some order; if \texttt{x} (resp. \texttt{y}) is assigned first, then \texttt{r1} (resp. \texttt{r0}) will be set to $1$. However, in the presence of store buffers, the assignments to \texttt{r0} and \texttt{r1} might be performed while the writes to \texttt{x} and \texttt{y} are still in their respective store buffers, resulting in both \texttt{r0} and \texttt{r1} being assigned $0$. One way to cope with store buffers is make them an explicit part of the programming model. However, this is a substantial programming concession. First, because store buffers are FIFO, it ratchets up the complexity of program reasoning considerably; for example, the reachability problem for a finite set of concurrent finite-state programs over a finite set of finite-valued locations is in PSPACE without store buffers, but undecidable (even for two threads) with store buffers. Second, because writes from function calls might still be buffered when a function returns, making the store buffers explicit would break modular program reasoning. 
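To make concrete what ``making the store buffer an explicit part of the programming model'' would amount to, the following C-style sketch models one thread's FIFO store buffer (this is purely illustrative; the identifiers are invented for this example, and the formal model in this report is stated in Isabelle/HOL, not in C):
\begin{verbatim}
#define SB_SIZE 16

typedef struct { int addr; int val; } entry;

typedef struct {
  entry buf[SB_SIZE];
  int   len;              /* number of buffered writes, oldest first */
} store_buffer;

int memory[1024];         /* shared memory, visible to all threads */

void sb_write(store_buffer *sb, int addr, int val)
{                         /* a store is buffered, not yet visible */
  sb->buf[sb->len].addr = addr;
  sb->buf[sb->len].val  = val;
  sb->len++;
}

int sb_read(store_buffer *sb, int addr)
{                         /* forwarding: newest buffered write wins */
  for (int i = sb->len - 1; i >= 0; i--)
    if (sb->buf[i].addr == addr) return sb->buf[i].val;
  return memory[addr];    /* otherwise fall through to shared memory */
}

void sb_flush(store_buffer *sb)
{                         /* drain in FIFO order; writes become visible */
  for (int i = 0; i < sb->len; i++)
    memory[sb->buf[i].addr] = sb->buf[i].val;
  sb->len = 0;
}
\end{verbatim}
In this model the two threads of the example above can each buffer their write, then read the other thread's variable directly from \texttt{memory} and obtain 0, which is exactly the outcome that is impossible under sequential consistency.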
In practice, the usual remedy for store buffering is adherence to a programming discipline that provides sequential consistency for a suitable class of architectures. In this report, we describe and prove the correctness of such a discipline suitable for the memory model provided by existing x86/x64 machines, where each write emerging from a store buffer hits a global cache visible to all processors. Because each processor sees the same global ordering of writes, this model is sometimes called \Def{total store order} (TSO)\cite{Adve:Computer-29-12-66}\footnote{Before 2008, Intel \cite{IntelWhitePaper} and AMD \cite{AMD:AMD64A2006-ALL} both put forward a weaker memory model in which writes to different memory addresses may be seen in different orders on different processors, but respecting causal ordering. However, current implementations satisfy the stronger conditions described in this report and are also compliant with the latest revisions of the Intel specifications \cite{Intel:IIA2006-ALL}. According to Owens et al. \cite{Owens:TPHOL09-?} AMD is also planning a similar adaptation of their manuals.} The concurrency discipline most familiar to concurrent programs is one where each variable is protected by a lock, and a thread must hold the corresponding lock to access the variable. (It is possible to generalize this to allow shared locks, as well as variants such as split semaphores.) Such lock-based techniques are typically referred to as \Def{coarse-grained} concurrency control, and suffice for most concurrent application programming. However, these techniques do not suffice for low-level system programming (\eg the construction of OS kernels), for several reasons. First, in kernel programming efficiency is paramount, and atomic memory operations are more efficient for many problems. Second, lock-free concurrency control can sometimes guarantee stronger correctness (\eg wait-free algorithms can provide bounds on execution time). Third, kernel programming requires taking into account the implicit concurrency of concurrent hardware activities (\eg a hardware TLB racing to use page tables while the kernel is trying to access them), and hardware cannot be forced to follow a locking discipline. A more refined concurrency control discipline, one that is much closer to expert practice, is to classify memory addresses as lock-protected or shared. Lock-protected addresses are used in the usual way, but shared addresses can be accessed using atomic operations provided by hardware (e.g., on x86 class architectures, most reads and writes are atomic\footnote{This atomicity isn't guaranteed for certain memory types, or for operations that cross a cache line.}). The main restriction on these accesses is that if a processor does a shared write and a subsequent shared read (possibly from a different address), the processor must flush the store buffer somewhere in between. For example, in the example above, both \texttt{x} and \texttt{y} would be shared addresses, so each processor would have to flush its store buffer between its first and second operations. However, even this discipline is not very satisfactory. First, we would need even more rules to allow locks to be created or destroyed, or to change memory between shared and protected, and so on. Second, there are many interesting concurrency control primitives, and many algorithms, that allow a thread to obtain exclusive ownership of a -memory address; why should we treat locking as special? +memory address; why should we treat locking as special? 
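As an illustration of this flushing requirement, here is the introductory example again, rendered with C11 atomics (this is only a stand-in for the report's notion of shared accesses; the formal development does not use C11, and on x86/x64 the fence would in practice be an \texttt{mfence} or an interlocked instruction):
\begin{verbatim}
#include <stdatomic.h>

atomic_int x = 0, y = 0;   /* shared variables     */
int r0, r1;                /* per-thread registers */

void thread0(void)
{
  atomic_store_explicit(&x, 1, memory_order_relaxed);  /* shared write       */
  atomic_thread_fence(memory_order_seq_cst);           /* flush store buffer */
  r0 = atomic_load_explicit(&y, memory_order_relaxed); /* shared read        */
}

void thread1(void)
{
  atomic_store_explicit(&y, 1, memory_order_relaxed);
  atomic_thread_fence(memory_order_seq_cst);
  r1 = atomic_load_explicit(&x, memory_order_relaxed);
}
\end{verbatim}
With the two fences in place the outcome in which both \texttt{r0} and \texttt{r1} end up \texttt{0} is excluded; deleting either fence makes it possible again, which is the store-buffer anomaly described in the introduction.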
In this report, we consider a much more general and powerful discipline that also guarantees sequential consistency. The basic rule for shared addresses is similar to the discipline above, but there are no locking primitives. Instead, we treat \Def{ownership} as fundamental. The difference is that ownership is manipulated by -nonblocking ghost updates, rather than an operation like locking that +nonblocking ghost updates, rather than an operation like locking that have runtime overhead. Informally the rules of the discipline are as follows: \begin{itemize} \item In any state, each memory address is either \Def{shared} or \Def{unshared}. Each memory address is also either \Def{owned} by a unique thread or \Def{unowned}. Every unowned address must be - shared. Each address is also either read-only or read-write. + shared. Each address is also either read-only or read-write. Every read-only address is unowned. \item A thread can (autonomously) acquire ownership of an unowned address, or release ownership of a address that it owns. It can also change whether an address it owns is shared or not. Upon release of an address it can mark it as read-only. \item Each memory access is marked as \Def{volatile} or - \Def{non-volatile}. + \Def{non-volatile}. \item A thread can perform a write if it is \Def{\Sound}. It can perform a read if it is sound and \Def{clean}. \item A non-volatile write is \Sound\ if the thread owns the address - and the address is unshared. + and the address is unshared. \item A non-volatile read is \Sound\ if the thread owns the address or the address is read-only. \item A volatile write is \Sound\ if no other thread owns the address and the address is not marked as read-only. \item A volatile read is \Sound\ if the address is shared or the thread owns it. \item A volatile read is clean if the store buffer has been flushed since the last volatile write. Moreover, every non-volatile read is clean. \item For interlocked operations (like compare and swap), which have the side effect of the store buffer getting flushed, the rules for volatile accesses apply. \end{itemize} Note first that these conditions are not thread-local, because some actions are allowed only when an address is unowned, marked read-only, or not marked read-only. A thread can ascertain such conditions only through system-wide invariants, respected by all threads, along with data it reads. By imposing suitable global invariants, various thread-local disciplines (such as one where addresses are protected by locks, conditional critical reasons, or monitors) can be derived as lemmas by ordinary program reasoning, without need for meta-theory. Second, note that these rules can be checked in the context of a concurrent program without store buffers, by introducing ghost state to keep track of ownership and sharing and whether the thread has performed a volatile write since the last flush. Our main result is that if a program obeys the rules above, then the program is sequentially consistent when executed on a TSO machine. Consider our first example program. If we choose to leave both \texttt{x} and \texttt{y} unowned (and hence shared), then all accesses must be volatile. This would force each thread to flush the store buffer between their first and second operations. In practice, on an x86/x64 machine, this would be done by making the writes interlocked, which flushes store buffers as a side effect. 
Whichever thread flushes its store buffer second is guaranteed to see the write of the other thread, making the execution violating sequential consistency impossible. However, couldn't the first thread try to take ownership of \texttt{x} before writing it, so that its write could be non-volatile? The answer is that it could, but then the second thread would be unable to read \texttt{x} volatile (or take ownership of \texttt{x} and read it non-volatile), because we would be unable to prove that \texttt{x} is unowned at that point. In other words, a thread can take ownership of an address only if it is not racing to do so. Ultimately, the races allowed by the discipline involve volatile access to a shared address, which brings us back to locks. A spinlock is typically implemented with an interlocked read-modify-write on an address (the interlocking providing the required flushing of the store buffer). If the locking succeeds, we can prove (using for example a ghost variable giving the ID of the thread taking the lock) that no other thread holds the lock, and can therefore safely take ownership of an address ``protected'' by the lock (using the global invariant that only the lock owner can own the protected address). Thus, our discipline subsumes the better-known disciplines governing coarse-grained concurrency control. -To summarize, our motivations for using ownership as our core notion of a +To summarize, our motivations for using ownership as our core notion of a practical programming discipline are the following: \begin{enumerate} \item the distinction between global (volatile) and local (non-volatile) accesses is a practical requirement to reduce the performance penalty due to necessary flushes and to allow important compiler optimizations (such as moving a local write ahead of a global read), \item coarse-grained concurrency control like locking is nothing special but only a derived concept which is used for ownership transfer (any other concurrency control that guarantees exclusive access is also fine), and -\item we want that the conditions to check for the programming discipline can be discharged by ordinary state-based -program reasoning on a sequentially consistent memory model +\item we want that the conditions to check for the programming discipline can be discharged by ordinary state-based +program reasoning on a sequentially consistent memory model (without having to talk about histories or complete executions). -\end{enumerate} +\end{enumerate} \paragraph{Overview} In Section \ref{sec:preliminaries} we introduce preliminaries of Isabelle/HOL, the theorem prover in which we mechanized our work. In Section \ref{sec:discipline} we informally describe the programming discipline and basic ideas of the formalization, which is detailed in -Section \ref{sec:formalization} where we introduce the formal models and the +Section \ref{sec:formalization} where we introduce the formal models and the reduction theorem. In Section \ref{sec:buildingblocks} we give some details of important building blocks for the proof of the reduction theorem. -To illustrate the connection between a programming language semantics and our reduction theorem, we instantiate our +To illustrate the connection between a programming language semantics and our reduction theorem, we instantiate our framework with a simple semantics for a parallel WHILE language in Section \ref{sec:pimp}. Finally we conclude in Section \ref{sec:conclusion}. 
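Before turning to the formal development, it may help to see the rules above written down as executable checks. The following C sketch keeps the ownership and sharing information as ghost state (the arrays and names are invented for this illustration; the Isabelle model is structured differently) and asserts the ``sound'' and ``clean'' conditions of the discipline before each access:
\begin{verbatim}
#include <assert.h>
#include <stdbool.h>

#define ADDRS   1024
#define THREADS 8
#define NONE    (-1)

/* Ghost state only: it records the discipline, it does not
   affect the actual execution of the program.              */
static int  owner[ADDRS];    /* owning thread, or NONE           */
static bool shrd[ADDRS];     /* shared?  unowned => shared       */
static bool ronly[ADDRS];    /* read-only?  read-only => unowned */
static bool dirty[THREADS];  /* volatile write since last flush? */

static bool write_sound(int t, int a, bool vol)
{ /* volatile: no other owner and not read-only;
     non-volatile: owned by t and unshared        */
  return vol ? (owner[a] == NONE || owner[a] == t) && !ronly[a]
             : owner[a] == t && !shrd[a];
}

static bool read_sound(int t, int a, bool vol)
{ /* volatile: shared or owned by t;
     non-volatile: owned by t or read-only        */
  return vol ? shrd[a] || owner[a] == t
             : owner[a] == t || ronly[a];
}

void ghost_write(int t, int a, bool vol)
{
  assert(write_sound(t, a, vol));
  if (vol) dirty[t] = true;  /* later volatile reads need a flush */
}

void ghost_read(int t, int a, bool vol)
{
  assert(read_sound(t, a, vol));
  if (vol) assert(!dirty[t]); /* "clean": flushed since last volatile write */
}

void ghost_flush(int t)       /* fence or interlocked operation */
{
  dirty[t] = false;
}
\end{verbatim}
Acquiring and releasing ownership, or changing the shared and read-only marks, would simply update this ghost state under the side conditions listed above. Because all of it is ghost state, the assertions can be discharged by ordinary assertional reasoning against the sequentially consistent semantics, which is precisely the kind of argument that the reduction theorem then transfers to the machine with store buffers.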
\input{Preliminaries.tex} %\input{thy/document/Text.tex} \input{Text.tex} \section{Conclusion \label{sec:conclusion}} We have presented a practical and flexible programming discipline for concurrent programs that ensures sequential consistency on TSO machines, such as present x64 architectures. Our approach covers a wide variety of concurrency control, covering locking, data races, single writer multiple readers, read only and thread local portions of memory. We minimize the need for store buffer flushes to optimize the usage of the hardware. Our theorem is not coupled to a specific logical framework like separation logic but is based on more fundamental arguments, namely the adherence to the programming discipline -which can be discharged within any program logic using the standard sequential +which can be discharged within any program logic using the standard sequential consistent memory model, without any of the complications of TSO. \paragraph{Related work.} \emph{Disclaimer.} This contribution presents the state of our work from 2010 \cite{Cohen:ITP2010-}. -Finally, 8 years later, we made the AFP submission for Isabelle2018. +Finally, 8 years later, we made the AFP submission for Isabelle2018. This related work paragraph does not thoroughly cover publications that came up in the meantime. - + A categorization of various weak memory models is presented in \cite{Adve:Computer-29-12-66}. It is compatible with the recent revisions of the Intel manuals \cite{Intel:IIA2006-ALL} and the revised x86 model presented in \cite{Owens:TPHOL09-?}. The state of the art in formal verification of concurrent programs is still based on a sequentially consistent memory model. To justify this on a weak memory model often a quite drastic approach is chosen, allowing only coarse-grained concurrency usually implemented by locking. Thereby data races are ruled out completely and there are results that data race free programs can be considered as sequentially consistent for example for the Java memory model \cite{DBLP:conf/ecoop/SevcikA08,DBLP:conf/tphol/AspinallS07} or the x86 memory model\cite{Owens:TPHOL09-?}. Ridge \cite{conf/tphol/Ridge07} considers weak memory and data-races and verifies Peterson's mutual exclusion algorithm. He ensures sequentially consistency by flushing after every write to shared memory. % -Burckhardt and Musuvathi\cite{Sober} describe an execution monitor that +Burckhardt and Musuvathi\cite{Sober} describe an execution monitor that efficiently checks whether a sequentially consistent TSO execution has a single-step extension that is not sequentially consistent. Like our approach, it avoids having to consider the store buffers as an explicit part of the state. However, their condition requires maintaining in ghost state enough history information to determine causality between events, which means maintaining a vector clock (which is itself unbounded) for each memory address. Moreover, causality (being essentially graph reachability) is already not first-order, and hence unsuitable for -many types of program verification. +many types of program verification. % Closely related to our work is the draft of Owens~\cite{Owens-draft} which also investigates on the conditions for sequential consistent reasoning within TSO. The notion of a \emph{triangular-race} free trace is established to exactly characterize the traces on a TSO machine that are still sequentially consistent. 
A triangular race occurs between a read and a write of two different threads to
the same address, when the reader still has some outstanding writes in the
store buffer. To avoid the triangular race, the reader has to flush the store
buffer before reading.
-This is essentially the same condition that our framework enforces, if we limit
- every address to be unowned and every access to be volatile.
+This is essentially the same condition that our framework enforces, if we limit
+ every address to be unowned and every access to be volatile.
We regard this limitation as too strong for practical programs, where non-volatile accesses
-(without any flushes) to temporarily local portions of memory (e.g. lock protected data) is common practice.
+(without any flushes) to temporarily local portions of memory (e.g.\ lock-protected data) are common practice.
This is our core motivation for introducing the ownership-based programming discipline.
%
We are aware of two extensions of our work that were published in the meantime.
Chen \textit{et al}.~\cite{chen-2014} also take the effects of the MMU into
account and generalize our reduction theorem to handle programs that edit page
tables. Oberhauser~\cite{Oberhauser-2016} improves on the flushing policy to
also take non-triangular races into account and facilitates an alternative
proof approach.

\paragraph{Limitations.}
There is a class of important programs that are not sequentially consistent
but nevertheless correct.
-First consider a simple spinlock implementation with a volatile lock \texttt{l}, where \texttt{l == 0}
+First, consider a simple spinlock implementation with a volatile lock \texttt{l}, where \texttt{l == 0}
indicates that the lock is not taken. The following code acquires the lock:
\begin{verbatim}
  while(!interlocked_test_and_set(l));
\end{verbatim}
-and with the assignment \texttt{l = 0} we can release the lock again.
-Within our framework address \texttt{l} can be considered \emph{unowned} (and hence shared) and every access to it
+With the assignment \texttt{l = 0} we can release the lock again.
+Within our framework, the address \texttt{l} can be considered \emph{unowned} (and hence shared) and every access to it
is \emph{volatile}. We do not have to transfer ownership of the lock \texttt{l} itself but of the objects it protects.
-As acquiring the lock is an expensive interlocked oprations anyway there are no additional restrictions from our framework.
-The interesting point is the release of the lock via the volatile write \texttt{l=0}.
-This leaves the dirty bit set, and hence our programming discipline requires a flushing instruction before the
-next volatile read.
-If \texttt{l} is the only volatile variable this is fine, since the next operation will be a lock acquire again which is interlocked and thus flushes the store buffer.
-So there is no need for an additonal fence.
-But in general this is not the case and we would have to insert a fence after the lock release to make the dirty bit clean again and to stay sequentially consistent. However, can we live without the fence? For the correctness of the mutal-exclusion algorithm we can, but we leave the domain of sequential consistent reasoning.
+As acquiring the lock is an expensive interlocked operation anyway, there are no additional restrictions from our framework.
+The interesting point is the release of the lock via the volatile write \texttt{l = 0}.
+This leaves the dirty bit set, and hence our programming discipline requires a flushing instruction before the
+next volatile read.
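+For instance, a release that literally follows the discipline would look as
+follows (a sketch only; \texttt{fence()} stands for any instruction that
+drains the store buffer, such as an \texttt{mfence} or another interlocked
+operation):
+\begin{verbatim}
+  l = 0;     // volatile release write; the dirty bit is now set
+  fence();   // drain the store buffer before the next volatile read
+\end{verbatim}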
+If \texttt{l} is the only volatile variable this is fine even without the fence, since the next volatile operation will be a lock acquire again, which is interlocked and thus flushes the store buffer.
+So there is no need for an additional fence.
+But in general this is not the case, and we would have to insert a fence after the lock release to make the dirty bit clean again and to stay sequentially consistent. However, can we live without the fence? For the correctness of the mutual-exclusion algorithm we can, but we leave the domain of sequentially consistent reasoning.
The intuitive reason for correctness is that the threads waiting for the lock
do no harm while waiting. They only take action once they see the lock being
zero again, which is when the lock release has made its way out of the store
buffer.

Another typical example is the following simplified form of
-barrier synchronization: each processor has a flag that it writes (with ordinarry volatile writes without any flushing)
+barrier synchronization: each processor has a flag that it writes (with ordinary volatile writes without any flushing)
and other processors read, and each processor waits for all processors to set
their flags before continuing past the barrier. This is not sequentially
consistent -- each processor might see its own flag set
-and later see all other flags clear -- but it is still correct.
+and later see all other flags clear -- but it is still correct.

-Common for these examples is that there is only a single writer to an address,
+Common to these examples is that there is only a single writer to an address,
and the values written are monotonic in a sense that allows the readers to draw the correct conclusion
-when they observe a certain value. This pattern is named
+when they observe a certain value. This pattern is named
\emph{Publication Idiom} in Owens' work~\cite{Owens-draft}.

\paragraph{Future work.}
The first direction of future work is to try to deal with the limitations
-of sequential consistency described above and try to come up with a more general reduction
-theorem that can also handle non sequential consistent code portions that follow some
+of sequential consistency described above and to come up with a more general reduction
+theorem that can also handle code portions that are not sequentially consistent but follow some
monotonicity rules.

Another direction of future work is to take compiler optimization into account.
Our volatile accesses correspond roughly to volatile memory accesses within a C
program. An optimizing compiler is free to convert any sequence of non-volatile
accesses into a (sequentially semantically equivalent) sequence of accesses. As
long as execution is sequentially consistent, equivalence of these programs
(\eg with respect to final states of executions that end with volatile
operations) follows immediately by reduction. However, some compilers are a
little more lenient in their optimizations, and allow operations on certain
local variables to move across volatile operations. In the context of C (where
pointers to stack variables can be passed around), the notion of ``locality''
is somewhat tricky, and makes essential use of C forbidding (semantically)
address arithmetic across memory objects.

\section*{Acknowledgements}
We thank Mark Hillebrand for discussions and feedback on this work and
extensive comments on this report.

\appendix

\section{Appendix}

After the explanatory text in the main body of the document, we now show the
plain theory files.
\input{ReduceStoreBuffer.tex} \input{ReduceStoreBufferSimulation.tex} \input{PIMP.tex} \bibliographystyle{plain} \bibliography{root} %\pagebreak %\appendix %\input{thy/document/Appendix.tex} \end{document} diff --git a/thys/pGCL/document/root.tex b/thys/pGCL/document/root.tex --- a/thys/pGCL/document/root.tex +++ b/thys/pGCL/document/root.tex @@ -1,107 +1,109 @@ +\RequirePackage{luatex85} + \documentclass[11pt,a4paper]{book} \usepackage{isabelle,isabellesym} \usepackage[english]{babel} \usepackage{natbib} \usepackage{times} \usepackage{amsmath} \usepackage{amssymb} \usepackage[only,bigsqcap]{stmaryrd} \usepackage[T1]{fontenc} \usepackage[utf8]{inputenc} \usepackage[all]{xy} \usepackage{lmodern} \usepackage[pdftex,colorlinks=true,linkcolor=blue]{hyperref} % urls in roman style, theory text in math-similar italics \urlstyle{rm} \isabellestyle{it} \begin{document} \renewcommand{\chapterautorefname}{Chapter} \renewcommand{\sectionautorefname}{Section} \renewcommand{\subsectionautorefname}{Section} \renewcommand{\subsubsectionautorefname}{Section} \renewcommand{\appendixautorefname}{Appendix} \renewcommand{\Hfootnoteautorefname}{Footnote} \newcommand{\lemmaautorefname}{Lemma} \newcommand{\definitionautorefname}{Definition} \frontmatter \title{pGCL for Isabelle} \author{David Cock} \maketitle \tableofcontents % sane default for proof documents \parindent 0pt\parskip 0.5ex \mainmatter \chapter{Overview} pGCL is both a programming language and a specification language that incorporates both probabilistic and nondeterministic choice, in a unified manner. Program verification is by \emph{refinement} or \emph{annotation} (or both), using either Hoare triples, or weakest-precondition entailment, in the style of GCL \citep{Dijkstra_75}. This document is divided into three parts: \autoref{c:intro} gives a tutorial-style introduction to pGCL, and demonstrates the tools provided by the package; \autoref{c:semantics} covers the development of the semantic interpretation: \emph{expectation transformers}; and \autoref{c:language} covers the formalisation of the language primitives, the associated \emph{healthiness} results, and the tools for structured and automated reasoning. This second part follows the technical development of the pGCL theory package, in detail. It is not a great place to start learning pGCL. For that, see either the tutorial or \citet{McIver_M_04}. This formalisation was first presented (as an overview) in \citet{Cock_12}. The language has previously been formalised in HOL4 by \citet{Hurd_05}. Two substantial results using this package were presented in \citet{Cock_13}, \citet{Cock_14} and \citet{Cock_14a}. \chapter{Introduction to pGCL} \label{c:intro} \input{Primitives} \input{LoopExamples} \input{Monty} \chapter{Semantic Structures} \label{c:semantics} \input{Expectations} \input{Transformers} \input{Induction} \chapter{The pGCL Language} \label{c:language} \input{Embedding} \input{Healthiness} \input{Continuity} \input{LoopInduction} \input{Sublinearity} \input{Determinism} \input{WellDefined} \input{Loops} \input{Algebra} \input{StructuredReasoning} \input{Termination} \input{Automation} \backmatter \chapter{Additional Material} \label{c:additional} \input{Misc} \bibliographystyle{plainnat} \bibliography{root} \end{document} %%% Local Variables: %%% mode: latex %%% TeX-master: t %%% End: