diff --git a/src/Doc/Codegen/Computations.thy b/src/Doc/Codegen/Computations.thy --- a/src/Doc/Codegen/Computations.thy +++ b/src/Doc/Codegen/Computations.thy @@ -1,544 +1,545 @@ theory Computations imports Setup "HOL-Library.Code_Target_Int" begin section \Computations \label{sec:computations}\ subsection \Prelude -- The \code\ antiquotation \label{sec:code_antiq}\ text \ The @{ML_antiquotation_def code} antiquotation allows to include constants from generated code directly into ML system code, as in the following toy example: \ datatype %quote form = T | F | And form form | Or form form (*<*) (*>*) ML %quote \ fun eval_form @{code T} = true | eval_form @{code F} = false | eval_form (@{code And} (p, q)) = eval_form p andalso eval_form q | eval_form (@{code Or} (p, q)) = eval_form p orelse eval_form q; \ text \ \noindent The antiquotation @{ML_antiquotation code} takes the name of a constant as argument; the required code is generated transparently and the corresponding constant names are inserted for the given antiquotations. This technique allows to use pattern matching on constructors stemming from compiled datatypes. Note that the @{ML_antiquotation code} antiquotation may not refer to constants which carry adaptations; here you have to refer to the corresponding adapted code directly. \ subsection \The concept of computations\ text \ Computations embody the simple idea that for each monomorphic Isabelle/HOL term of type \\\ by virtue of code generation there exists an corresponding ML type \T\ and a morphism \\ :: \ \ T\ satisfying \\ (t\<^sub>1 \ t\<^sub>2) = \ t\<^sub>1 \ \ t\<^sub>2\, with \\\ denoting term application. For a given Isabelle/HOL type \\\, parts of \\\ can be implemented by a corresponding ML function \\\<^sub>\ :: term \ T\. How? \<^descr> \Let input be a constant C :: \.\ \\ Then \\\<^sub>\ C = f\<^sub>C\ with \f\<^sub>C\ being the image of \C\ under code generation. \<^descr> \Let input be an application (t\<^sub>1 \ t\<^sub>2) :: \.\ \\ Then \\\<^sub>\ (t\<^sub>1 \ t\<^sub>2) = \\<^sub>\ t\<^sub>1 (\\<^sub>\ t\<^sub>2)\. \noindent Using these trivial properties, each monomorphic constant \C : \<^vec>\\<^sub>n \ \\ yields the following equations: \ text %quote \ \\\<^bsub>(\\<^sub>1 \ \\<^sub>2 \ \ \ \\<^sub>n \ \)\<^esub> C = f\<^sub>C\ \\ \\\<^bsub>(\\<^sub>2 \ \ \ \\<^sub>n \ \)\<^esub> (C \ t\<^sub>1) = f\<^sub>C (\\<^bsub>\\<^sub>1\<^esub> t\<^sub>1)\ \\ \\\ \\ \\\<^bsub>\\<^esub> (C \ t\<^sub>1 \ \ \ t\<^sub>n) = f\<^sub>C (\\<^bsub>\\<^sub>1\<^esub> t\<^sub>1) \ (\\<^bsub>\\<^sub>n\<^esub> t\<^sub>n)\ \ text \ \noindent Hence a computation is characterized as follows: \<^item> Let \input constants\ denote a set of monomorphic constants. \<^item> Let \\\ denote a monomorphic type and \'ml\ be a schematic placeholder for its corresponding type in ML under code generation. \<^item> Then the corresponding computation is an ML function of type \<^ML_type>\Proof.context -> term -> 'ml\ partially implementing the morphism \\ :: \ \ T\ for all \<^emph>\input terms\ consisting only of input constants and applications. \noindent The charming idea is that all required code is automatically generated by the code generator for givens input constants and types; that code is directly inserted into the Isabelle/ML runtime system by means of antiquotations. \ subsection \The \computation\ antiquotation\ text \ The following example illustrates its basic usage: \ ML %quote \ local fun int_of_nat @{code "0 :: nat"} = 0 | int_of_nat (@{code Suc} n) = int_of_nat n + 1; in val comp_nat = @{computation nat terms: "plus :: nat \ nat \ nat" "times :: nat \ nat \ nat" "sum_list :: nat list \ nat" "prod_list :: nat list \ nat" datatypes: nat "nat list"} (fn post => post o HOLogic.mk_nat o int_of_nat o the); end \ text \ \<^item> Antiquotations occurring in the same ML block always refer to the same transparently generated code; particularly, they share the same transparently generated datatype declarations. \<^item> The type of a computation is specified as first argument. \<^item> Input constants are specified the following ways: \<^item> Each term following \terms:\ specifies all constants it contains as input constants. \<^item> Each type following \datatypes:\ specifies all constructors of the corresponding code datatype as input constants. Note that this does not increase expressiveness but succinctness for datatypes with many constructors. Abstract type constructors are skipped silently. \<^item> The code generated by a \computation\ antiquotation takes a functional argument which describes how to conclude the computation. What's the rationale behind this? \<^item> There is no automated way to generate a reconstruction function from the resulting ML type to a Isabelle term -- this is in the responsibility of the implementor. One possible approach for robust term reconstruction is the \code\ antiquotation. \<^item> Both statically specified input constants and dynamically provided input terms are subject to preprocessing. Likewise the result is supposed to be subject to postprocessing; the implementor is expected to take care for this explicitly. \<^item> Computations follow the partiality principle (cf.~\secref{sec:partiality_principle}): failures due to pattern matching or unspecified functions are interpreted as partiality; therefore resulting ML values are optional. Hence the functional argument accepts the following parameters \<^item> A postprocessor function \<^ML_type>\term -> term\. \<^item> The resulting value as optional argument. The functional argument is supposed to compose the final result from these in a suitable manner. \noindent A simple application: \ ML_val %quote \ comp_nat \<^context> \<^term>\sum_list [Suc 0, Suc (Suc 0)] * Suc (Suc 0)\ \ text \ \noindent A single ML block may contain an arbitrary number of computation antiquotations. These share the \<^emph>\same\ set of possible input constants. In other words, it does not matter in which antiquotation constants are specified; in the following example, \<^emph>\all\ constants are specified by the first antiquotation once and for all: \ ML %quote \ local fun int_of_nat @{code "0 :: nat"} = 0 | int_of_nat (@{code Suc} n) = int_of_nat n + 1; in val comp_nat = @{computation nat terms: "plus :: nat \ nat \ nat" "times :: nat \ nat \ nat" "sum_list :: nat list \ nat" "prod_list :: nat list \ nat" "replicate :: nat \ nat \ nat list" datatypes: nat "nat list"} (fn post => post o HOLogic.mk_nat o int_of_nat o the); val comp_nat_list = @{computation "nat list"} (fn post => post o HOLogic.mk_list \<^typ>\nat\ o map (HOLogic.mk_nat o int_of_nat) o the); end \ subsection \Pitfalls when specifying input constants \label{sec:input_constants_pitfalls}\ text \ \<^descr> \Complete type coverage.\ Specified input constants must be \<^emph>\complete\ in the sense that for each required type \\\ there is at least one corresponding input constant which can actually \<^emph>\construct\ a concrete value of type \\\, potentially requiring more types recursively; otherwise the system of equations cannot be generated properly. Hence such incomplete input constants sets are rejected immediately. \<^descr> \Unsuitful right hand sides.\ The generated code for a computation must compile in the strict ML runtime environment. This imposes the technical restriction that each compiled input constant \f\<^sub>C\ on the right hand side of a generated equations must compile without throwing an exception. That rules out pathological examples like @{term [source] "undefined :: nat"} as input constants, as well as abstract constructors (cf. \secref{sec:invariant}). \<^descr> \Preprocessing.\ For consistency, input constants are subject to preprocessing; however, the overall approach requires to operate on constants \C\ and their respective compiled images \f\<^sub>C\.\footnote{Technical restrictions of the implementation enforce this, although those could be lifted in the future.} This is a problem whenever preprocessing maps an input constant to a non-constant. To circumvent these situations, the computation machinery has a dedicated preprocessor which is applied \<^emph>\before\ the regular preprocessing, both to input constants as well as input terms. This can be used to map problematic constants to other ones that are not subject to regular preprocessing. Rewrite rules are added using attribute @{attribute code_computation_unfold}. There should rarely be a need to use this beyond the few default setups in HOL, which deal with literals (see also \secref{sec:literal_input}). \ subsection \Pitfalls concerning input terms\ text \ \<^descr> \No polymorphims.\ Input terms must be monomorphic: compilation to ML requires dedicated choice of monomorphic types. \<^descr> \No abstractions.\ Only constants and applications are admissible as input; abstractions are not possible. In theory, the compilation schema could be extended to cover abstractions also, but this would increase the trusted code base. If abstractions are required, they can always be eliminated by a dedicated preprocessing step, e.g.~using combinatorial logic. \<^descr> \Potential interfering of the preprocessor.\ As described in \secref{sec:input_constants_pitfalls} regular preprocessing can have a disruptive effect for input constants. The same also applies to input terms; however the same measures to circumvent that problem for input constants apply to input terms also. \ subsection \Computations using the \computation_conv\ antiquotation\ text \ Computations are a device to implement fast proof procedures. Then results of a computation are often assumed to be trustable and thus wrapped in oracles (see @{cite "isabelle-isar-ref"}), as in the following example:\footnote{ The technical details how numerals are dealt with are given later in \secref{sec:literal_input}.} \ ML %quote \ local - fun raw_dvd (b, ct) = Thm.mk_binop \<^cterm>\Pure.eq :: bool \ bool \ prop\ - ct (if b then \<^cterm>\True\ else \<^cterm>\False\); + fun raw_dvd (b, ct) = + \<^instantiate>\x = ct and y = \if b then \<^cterm>\True\ else \<^cterm>\False\\ + in cterm "x \ y" for x y :: bool\; val (_, dvd_oracle) = Context.>>> (Context.map_theory_result (Thm.add_oracle (\<^binding>\dvd\, raw_dvd))); in val conv_dvd = @{computation_conv bool terms: "Rings.dvd :: int \ int \ bool" "plus :: int \ int \ int" "minus :: int \ int \ int" "times :: int \ int \ int" "0 :: int" "1 :: int" "2 :: int" "3 :: int" "-1 :: int" } (K (curry dvd_oracle)) end \ text \ \<^item> Antiquotation @{ML_antiquotation computation_conv} basically yields a conversion of type \<^ML_type>\Proof.context -> cterm -> thm\ (see further @{cite "isabelle-implementation"}). \<^item> The antiquotation expects one functional argument to bridge the \qt{untrusted gap}; this can be done e.g.~using an oracle. Since that oracle will only yield \qt{valid} results in the context of that particular computation, the implementor must make sure that it does not leave the local ML scope; in this example, this is achieved using an explicit \local\ ML block. The very presence of the oracle in the code acknowledges that each computation requires explicit thinking before it can be considered trustworthy! \<^item> Postprocessing just operates as further conversion after normalization. \<^item> Partiality makes the whole conversion fall back to reflexivity. \ (*<*) (*>*) ML_val %quote \ conv_dvd \<^context> \<^cterm>\7 dvd ( 62437867527846782 :: int)\; conv_dvd \<^context> \<^cterm>\7 dvd (-62437867527846783 :: int)\; \ text \ \noindent An oracle is not the only way to construct a valid theorem. A computation result can be used to construct an appropriate certificate: \ lemma %quote conv_div_cert: "(Code_Target_Int.positive r * Code_Target_Int.positive s) div Code_Target_Int.positive s \ (numeral r :: int)" (is "?lhs \ ?rhs") proof (rule eq_reflection) have "?lhs = numeral r * numeral s div numeral s" by simp also have "numeral r * numeral s div numeral s = ?rhs" by (rule nonzero_mult_div_cancel_right) simp finally show "?lhs = ?rhs" . qed lemma %quote positive_mult: "Code_Target_Int.positive r * Code_Target_Int.positive s = Code_Target_Int.positive (r * s)" by simp ML %quote \ local fun integer_of_int (@{code int_of_integer} k) = k val cterm_of_int = Thm.cterm_of \<^context> o HOLogic.mk_numeral o integer_of_int; val divisor = Thm.dest_arg o Thm.dest_arg; val evaluate_simps = map mk_eq @{thms arith_simps positive_mult}; fun evaluate ctxt = Simplifier.rewrite_rule ctxt evaluate_simps; fun revaluate ctxt k ct = @{thm conv_div_cert} |> Thm.instantiate' [] [SOME (cterm_of_int k), SOME (divisor ct)] |> evaluate ctxt; in val conv_div = @{computation_conv int terms: "divide :: int \ int \ int" "0 :: int" "1 :: int" "2 :: int" "3 :: int" } revaluate end \ ML_val %quote \ conv_div \<^context> \<^cterm>\46782454343499999992777742432342242323423425 div (7 :: int)\ \ text \ \noindent The example is intentionally kept simple: only positive integers are covered, only remainder-free divisions are feasible, and the input term is expected to have a particular shape. This exhibits the idea more clearly: the result of the computation is used as a mere hint how to instantiate @{fact conv_div_cert}, from which the required theorem is obtained by performing multiplication using the simplifier; hence that theorem is built of proper inferences, with no oracles involved. \ subsection \Computations using the \computation_check\ antiquotation\ text \ The \computation_check\ antiquotation is convenient if only a positive checking of propositions is desired, because then the result type is fixed (\<^typ>\prop\) and all the technical matter concerning postprocessing and oracles is done in the framework once and for all: \ ML %quote \ val check_nat = @{computation_check terms: Trueprop "less :: nat \ nat \ bool" "plus :: nat \ nat \ nat" "times :: nat \ nat \ nat" "0 :: nat" Suc } \ text \ \noindent The HOL judgement \<^term>\Trueprop\ embeds an expression of type \<^typ>\bool\ into \<^typ>\prop\. \ ML_val %quote \ check_nat \<^context> \<^cprop>\less (Suc (Suc 0)) (Suc (Suc (Suc 0)))\ \ text \ \noindent Note that such computations can only \<^emph>\check\ for \<^typ>\prop\s to hold but not \<^emph>\decide\. \ subsection \Some practical hints\ subsubsection \Inspecting generated code\ text \ The antiquotations for computations attempt to produce meaningful error messages. If these are not sufficient, it might by useful to inspect the generated code, which can be achieved using \ declare %quote [[ML_source_trace]] (*<*) declare %quote [[ML_source_trace = false]] (*>*) subsubsection \Literals as input constants \label{sec:literal_input}\ text \ Literals (characters, numerals) in computations cannot be processed naively: the compilation pattern for computations fails whenever target-language literals are involved; since various common code generator setups (see \secref{sec:common_adaptation}) implement \<^typ>\nat\ and \<^typ>\int\ by target-language literals, this problem manifests whenever numeric types are involved. In practice, this is circumvented with a dedicated preprocessor setup for literals (see also \secref{sec:input_constants_pitfalls}). The following examples illustrate the pattern how to specify input constants when literals are involved, without going into too much detail: \ paragraph \An example for \<^typ>\nat\\ ML %quote \ val check_nat = @{computation_check terms: Trueprop "even :: nat \ bool" "plus :: nat \ nat \ nat" "0 :: nat" Suc "1 :: nat" "2 :: nat" "3 :: nat" } \ ML_val %quote \ check_nat \<^context> \<^cprop>\even (Suc 0 + 1 + 2 + 3 + 4 + 5)\ \ paragraph \An example for \<^typ>\int\\ ML %quote \ val check_int = @{computation_check terms: Trueprop "even :: int \ bool" "plus :: int \ int \ int" "0 :: int" "1 :: int" "2 :: int" "3 :: int" "-1 :: int" } \ ML_val %quote \ check_int \<^context> \<^cprop>\even ((0::int) + 1 + 2 + 3 + -1 + -2 + -3)\ \ paragraph \An example for \<^typ>\String.literal\\ definition %quote is_cap_letter :: "String.literal \ bool" where "is_cap_letter s \ (case String.asciis_of_literal s of [] \ False | k # _ \ 65 \ k \ k \ 90)" (*<*) (*>*) ML %quote \ val check_literal = @{computation_check terms: Trueprop is_cap_letter datatypes: bool String.literal } \ ML_val %quote \ check_literal \<^context> \<^cprop>\is_cap_letter (STR ''Q'')\ \ subsubsection \Preprocessing HOL terms into evaluable shape\ text \ When integrating decision procedures developed inside HOL into HOL itself, a common approach is to use a deep embedding where operators etc. are represented by datatypes in Isabelle/HOL. Then it is necessary to turn generic shallowly embedded statements into that particular deep embedding (\qt{reification}). One option is to hardcode using code antiquotations (see \secref{sec:code_antiq}). Another option is to use pre-existing infrastructure in HOL: \<^ML>\Reification.conv\ and \<^ML>\Reification.tac\. A simplistic example: \ datatype %quote form_ord = T | F | Less nat nat | And form_ord form_ord | Or form_ord form_ord | Neg form_ord primrec %quote interp :: "form_ord \ 'a::order list \ bool" where "interp T vs \ True" | "interp F vs \ False" | "interp (Less i j) vs \ vs ! i < vs ! j" | "interp (And f1 f2) vs \ interp f1 vs \ interp f2 vs" | "interp (Or f1 f2) vs \ interp f1 vs \ interp f2 vs" | "interp (Neg f) vs \ \ interp f vs" text \ \noindent The datatype \<^type>\form_ord\ represents formulae whose semantics is given by \<^const>\interp\. Note that values are represented by variable indices (\<^typ>\nat\) whose concrete values are given in list \<^term>\vs\. \ ML %quote (*<*) \\ lemma "thm": fixes x y z :: "'a::order" shows "x < y \ x < z \ interp (And (Less (Suc 0) (Suc (Suc 0))) (Less (Suc 0) 0)) [z, x, y]" ML_prf %quote (*>*) \val thm = Reification.conv \<^context> @{thms interp.simps} \<^cterm>\x < y \ x < z\\ (*<*) by (tactic \ALLGOALS (resolve_tac \<^context> [thm])\) (*>*) text \ \noindent By virtue of @{fact interp.simps}, \<^ML>\Reification.conv\ provides a conversion which, for this concrete example, yields @{thm thm [no_vars]}. Note that the argument to \<^const>\interp\ does not contain any free variables and can thus be evaluated using evaluation. A less meager example can be found in the AFP, session \Regular-Sets\, theory \Regexp_Method\. \ end diff --git a/src/HOL/ex/Sorting_Algorithms_Examples.thy b/src/HOL/ex/Sorting_Algorithms_Examples.thy --- a/src/HOL/ex/Sorting_Algorithms_Examples.thy +++ b/src/HOL/ex/Sorting_Algorithms_Examples.thy @@ -1,64 +1,65 @@ (* Title: HOL/ex/Sorting_Algorithms_Examples.thy Author: Florian Haftmann, TU Muenchen *) theory Sorting_Algorithms_Examples imports Main "HOL-Library.Sorting_Algorithms" begin subsection \Evaluation examples\ definition int_abs_reversed :: "int comparator" where "int_abs_reversed = key abs (reversed default)" definition example_1 :: "int list" where "example_1 = [65, 1705, -2322, 734, 4, (-17::int)]" definition example_2 :: "int list" where "example_2 = [-3000..3000]" ML \ local val term_of_int_list = HOLogic.mk_list \<^typ>\int\ o map (HOLogic.mk_number \<^typ>\int\ o @{code integer_of_int}); - fun raw_sort (ctxt, ct, ks) = Thm.mk_binop \<^cterm>\Pure.eq :: int list \ int list \ prop\ - ct (Thm.cterm_of ctxt (term_of_int_list ks)); + fun raw_sort (ctxt, ct, ks) = + \<^instantiate>\x = ct and y = \Thm.cterm_of ctxt (term_of_int_list ks)\ + in cterm "x \ y" for x y :: "int list"\; val (_, sort_oracle) = Context.>>> (Context.map_theory_result (Thm.add_oracle (\<^binding>\sort\, raw_sort))); in val sort_int_abs_reversed_conv = @{computation_conv "int list" terms: int_abs_reversed "sort :: int comparator \ _" "quicksort :: int comparator \ _" "mergesort :: int comparator \ _" example_1 example_2 } (fn ctxt => fn ct => fn ks => sort_oracle (ctxt, ks, ct)) end \ declare [[code_timing]] ML_val \sort_int_abs_reversed_conv \<^context> \<^cterm>\sort int_abs_reversed example_1\\ ML_val \sort_int_abs_reversed_conv \<^context> \<^cterm>\quicksort int_abs_reversed example_1\\ ML_val \sort_int_abs_reversed_conv \<^context> \<^cterm>\mergesort int_abs_reversed example_1\\ ML_val \sort_int_abs_reversed_conv \<^context> \<^cterm>\sort int_abs_reversed example_2\\ ML_val \sort_int_abs_reversed_conv \<^context> \<^cterm>\quicksort int_abs_reversed example_2\\ ML_val \sort_int_abs_reversed_conv \<^context> \<^cterm>\mergesort int_abs_reversed example_2\\ end