diff --git a/src/Doc/System/Sessions.thy b/src/Doc/System/Sessions.thy --- a/src/Doc/System/Sessions.thy +++ b/src/Doc/System/Sessions.thy @@ -1,1013 +1,1016 @@ (*:maxLineLen=78:*) theory Sessions imports Base begin chapter \Isabelle sessions and build management \label{ch:session}\ text \ An Isabelle \<^emph>\session\ consists of a collection of related theories that may be associated with formal documents (\chref{ch:present}). There is also a notion of \<^emph>\persistent heap\ image to capture the state of a session, similar to object-code in compiled programming languages. Thus the concept of session resembles that of a ``project'' in common IDE environments, but the specific name emphasizes the connection to interactive theorem proving: the session wraps up the results of user interaction with the prover in a persistent form.
Application sessions are built on a given parent session, which may be built recursively on other parents. Following this path in the hierarchy eventually leads to some major object-logic session like \HOL\, which itself is based on \Pure\ as the common root of all sessions.
Processing sessions may take considerable time. Isabelle build management helps to organize this efficiently. This includes support for parallel build jobs, in addition to the multithreaded theory and proof checking that is already provided by the prover process itself. \ section \Session ROOT specifications \label{sec:session-root}\ text \ Session specifications reside in files called \<^verbatim>\ROOT\ within certain directories, such as the home locations of registered Isabelle components or additional project directories given by the user.
The ROOT file format follows the lexical conventions of the \<^emph>\outer syntax\ of Isabelle/Isar, see also \<^cite>\"isabelle-isar-ref"\. This defines common forms like identifiers, names, quoted strings, verbatim text, nested comments etc. The grammar for @{syntax chapter_def}, @{syntax chapter_entry} and @{syntax session_entry} is given as a syntax diagram below. Each ROOT file may contain multiple specifications like this. Chapters help to organize browser info (\secref{sec:info}), but have no formal meaning. The default chapter is ``\Unsorted\''. Chapter definitions, which are optional, make it possible to associate additional information.
Isabelle/jEdit \<^cite>\"isabelle-jedit"\ includes a simple editing mode \<^verbatim>\isabelle-root\ for session ROOT files, which is enabled by default for any file of that name. \<^rail>\ @{syntax_def chapter_def}: @'chapter_definition' @{syntax name} \ groups? description? ; @{syntax_def chapter_entry}: @'chapter' @{syntax name} ; @{syntax_def session_entry}: @'session' @{syntax system_name} groups? dir? '=' \ (@{syntax system_name} '+')? description? options? \ sessions? directories? (theories*) \ (document_theories?) (document_files*) \ (export_files*) (export_classpath?) ; groups: '(' (@{syntax name} +) ')' ; dir: @'in' @{syntax embedded} ; description: @'description' @{syntax text} ; options: @'options' opts ; opts: '[' ( (@{syntax name} '=' value | @{syntax name}) + ',' ) ']' ; value: @{syntax name} | @{syntax real} ; sessions: @'sessions' (@{syntax system_name}+) ; directories: @'directories' (dir+) ; theories: @'theories' opts? (theory_entry+) ; theory_entry: @{syntax system_name} ('(' @'global' ')')? ; document_theories: @'document_theories' (@{syntax name}+) ; document_files: @'document_files' ('(' dir ')')? (@{syntax embedded}+) ; export_files: @'export_files' ('(' dir ')')? ('[' nat ']')?
\ (@{syntax embedded}+) ; export_classpath: @'export_classpath' (@{syntax embedded}*) \
\<^descr> \isakeyword{chapter{\isacharunderscorekeyword}definition}~\A (groups)\ associates a collection of groups with chapter \A\. All sessions that belong to this chapter will automatically become members of these groups.
\<^descr> \isakeyword{session}~\A = B + body\ defines a new session \A\ based on parent session \B\, with its content given in \body\ (imported sessions and theories). Note that a parent (like \HOL\) is mandatory in practical applications: only Isabelle/Pure can bootstrap itself from nothing. All such session specifications together describe a hierarchy (graph) of sessions, with globally unique names. The new session name \A\ should be sufficiently long and descriptive to stand on its own in a potentially large library.
\<^descr> \isakeyword{session}~\A (groups)\ indicates a collection of groups where the new session is a member. Group names are uninterpreted and merely follow certain conventions. For example, the Isabelle distribution tags some important sessions with the group name ``\main\''. Other projects may invent their own conventions, but this requires some care to avoid clashes within this unchecked name space.
\<^descr> \isakeyword{session}~\A\~\isakeyword{in}~\dir\ specifies an explicit directory for this session; by default this is the current directory of the \<^verbatim>\ROOT\ file. All theory files are located relative to the session directory. The prover process is run with this directory as its current working directory.
\<^descr> \isakeyword{description}~\text\ is a free-form description for this session (or chapter), e.g. for presentation purposes.
\<^descr> \isakeyword{options}~\[x = a, y = b, z]\ defines separate options (\secref{sec:system-options}) that are used when processing this session, but \<^emph>\without\ propagation to child sessions. Note that \z\ abbreviates \z = true\ for Boolean options.
\<^descr> \isakeyword{sessions}~\names\ specifies sessions that are \<^emph>\imported\ into the current name space of theories. This makes it possible to refer to a theory \A\ from session \B\ by the qualified name \B.A\ --- although it is loaded again into the current ML process, which is in contrast to a theory that is already present in the \<^emph>\parent\ session. Theories that are imported from other sessions are excluded from the current session document.
\<^descr> \isakeyword{directories}~\dirs\ specifies additional directories for import of theory files via \isakeyword{theories} within \<^verbatim>\ROOT\ or \<^theory_text>\imports\ within a theory; \dirs\ are relative to the main session directory (cf.\ \isakeyword{session} \dots \isakeyword{in}~\dir\). These directories need to be exclusively assigned to a unique session, without implicit sharing of file-system locations.
\<^descr> \isakeyword{theories}~\options names\ specifies a block of theories that are processed within an environment that is augmented by the given options, in addition to the global session options given before. Any number of blocks of \isakeyword{theories} may be given. Options are only active for each \isakeyword{theories} block separately.
A theory name that is followed by \(\\isakeyword{global}\)\ is treated literally in other session specifications or theory imports --- the normal situation is to qualify theory names by the session name; this ensures globally unique names in big session graphs.
Global theories are usually the entry points to major logic sessions: \Pure\, \Main\, \Complex_Main\, \HOLCF\, \IFOL\, \FOL\, \ZF\, \ZFC\ etc. Regular Isabelle applications should not claim any global theory names.
\<^descr> \isakeyword{document_theories}~\names\ specifies theories from other sessions that should be included in the generated document source directory. These theories need to be explicit imports in the current session, or implicit imports from the underlying hierarchy of parent sessions. The generated \<^verbatim>\session.tex\ file is not affected: the session's {\LaTeX} setup needs to \<^verbatim>\\input{\\\\\<^verbatim>\}\ generated \<^verbatim>\.tex\ files separately.
\<^descr> \isakeyword{document_files}~\(\\isakeyword{in}~\base_dir) files\ lists source files for document preparation, typically \<^verbatim>\.tex\ and \<^verbatim>\.sty\ for {\LaTeX}. Only these explicitly given files are copied from the base directory to the document output directory, before formal document processing is started (see also \secref{sec:tool-document}). The local path structure of the \files\ is preserved, which makes it possible to reconstruct the original directory hierarchy of \base_dir\. The default \base_dir\ is \<^verbatim>\document\ within the session root directory.
\<^descr> \isakeyword{export_files}~\(\\isakeyword{in}~\target_dir) [number] patterns\ specifies theory exports that may get written to the file-system, e.g. via @{tool_ref build} with option \<^verbatim>\-e\ (\secref{sec:tool-build}). The \target_dir\ specification is relative to the session root directory; its default is \<^verbatim>\export\. Exports are selected via \patterns\ as in @{tool_ref export} (\secref{sec:tool-export}). The number given in brackets (default: 0) specifies the prefix of elements that should be removed from each name: it makes it possible to reduce the resulting directory hierarchy, at the risk of overwriting files due to loss of uniqueness.
\<^descr> \isakeyword{export_classpath}~\patterns\ specifies export artifacts that should be included into the local Java/Scala classpath of this session context. This is only relevant for tools that allow dynamic loading of service classes (\secref{sec:scala-build}), while most other Isabelle/Scala tools require global configuration during system startup. An empty list of \patterns\ defaults to \<^verbatim>\"*:classpath/*.jar"\, which fits the naming convention of JAR modules produced by the Isabelle/Isar command \<^theory_text>\scala_build_generated_files\ \<^cite>\"isabelle-isar-ref"\. \
subsubsection \Examples\ text \ See \<^file>\~~/src/HOL/ROOT\ for a diversity of practically relevant situations, although it uses relatively complex quasi-hierarchic naming conventions like \<^verbatim>\HOL-SPARK\, \<^verbatim>\HOL-SPARK-Examples\. An alternative is to use unqualified names that are relatively long and descriptive, as in the Archive of Formal Proofs (\<^url>\https://isa-afp.org\), for example. \
section \System build options \label{sec:system-options}\ text \ See \<^file>\~~/etc/options\ for the main defaults provided by the Isabelle distribution. Isabelle/jEdit \<^cite>\"isabelle-jedit"\ includes a simple editing mode \<^verbatim>\isabelle-options\ for this file format.
The following options are particularly relevant to build Isabelle sessions, in particular with document preparation (\chref{ch:present}).
\<^item> @{system_option_def "browser_info"} controls output of HTML browser info, see also \secref{sec:info}.
\<^item> @{system_option_def "document"} controls document output for a particular session or theory; \<^verbatim>\document=pdf\ or \<^verbatim>\document=true\ means enabled, \<^verbatim>\document=""\ or \<^verbatim>\document=false\ means disabled (especially for particular theories).
\<^item> @{system_option_def "document_output"} specifies an alternative directory for generated output of the document preparation system; the default is within the @{setting "ISABELLE_BROWSER_INFO"} hierarchy as explained in \secref{sec:info}. See also @{tool mkroot}, which generates a default configuration with output readily available to the author of the document.
\<^item> @{system_option_def "document_echo"} informs about document file names during session presentation.
\<^item> @{system_option_def "document_variants"} specifies document variants as a colon-separated list of \name=tags\ entries. The default name is \<^verbatim>\document\, without additional tags.
Tags are specified as a comma-separated list of modifier/name pairs and tell {\LaTeX} how to interpret certain Isabelle command regions: ``\<^verbatim>\+\\foo\'' (or just ``\foo\'') means to keep, ``\<^verbatim>\-\\foo\'' to drop, and ``\<^verbatim>\/\\foo\'' to fold text tagged as \foo\. The built-in default is equivalent to the tag specification ``\<^verbatim>\+document,+theory,+proof,+ML,+visible,-invisible,+important,+unimportant\''; see also the {\LaTeX} macros \<^verbatim>\\isakeeptag\, \<^verbatim>\\isadroptag\, and \<^verbatim>\\isafoldtag\, in \<^file>\~~/lib/texinputs/isabelle.sty\.
In contrast, \<^verbatim>\document_variants=document:outline=/proof,/ML\ indicates two documents: the one called \<^verbatim>\document\ with default tags, and the other called \<^verbatim>\outline\ where proofs and ML sections are folded.
Document variant names are just a matter of convention. It is also possible to use different document variant names (without tags) for different document root entries, see also \secref{sec:tool-document}.
\<^item> @{system_option_def "document_tags"} specifies alternative command tags as a comma-separated list of items: either ``\command\\<^verbatim>\%\\tag\'' for a specific command, or ``\<^verbatim>\%\\tag\'' as default for all other commands. This is occasionally useful to control the global visibility of commands via session options (e.g.\ in \<^verbatim>\ROOT\).
\<^item> @{system_option_def "document_comment_latex"} enables regular {\LaTeX} \<^verbatim>\comment.sty\, instead of the historic version for plain {\TeX} (default). The latter is much faster, but in conflict with {\LaTeX} classes like Dagstuhl LIPIcs\<^footnote>\\<^url>\https://github.com/dagstuhl-publishing/styles\\.
\<^item> @{system_option_def "document_bibliography"} explicitly enables the use of \<^verbatim>\bibtex\; the default is to check for the presence of \<^verbatim>\root.bib\, but it could have a different name.
\<^item> @{system_option_def "document_heading_prefix"} specifies a prefix for the {\LaTeX} macro names generated from Isar commands like \<^theory_text>\chapter\, \<^theory_text>\section\ etc. The default is \<^verbatim>\isamarkup\, e.g. \<^theory_text>\section\ becomes \<^verbatim>\\isamarkupsection\.
\<^item> @{system_option_def "threads"} determines the number of worker threads for parallel checking of theories and proofs. The default \0\ means that a sensible maximum value is determined by the underlying hardware.
For machines with many cores or with hyperthreading, this sometimes requires manual adjustment (on the command-line or within personal settings or preferences, not within a session \<^verbatim>\ROOT\).
\<^item> @{system_option_def "condition"} specifies a comma-separated list of process environment variables (or Isabelle settings) that are required for the subsequent theories to be processed. Conditions are considered ``true'' if the corresponding environment value is defined and non-empty.
\<^item> @{system_option_def "timeout"} and @{system_option_def "timeout_scale"} specify a real wall-clock timeout for the session as a whole: the two values are multiplied and taken as the number of seconds. Typically, @{system_option "timeout"} is given for individual sessions, and @{system_option "timeout_scale"} as a global adjustment to overall hardware performance.
The timer is controlled outside the ML process by the JVM that runs Isabelle/Scala. Thus it is relatively reliable in canceling processes that get out of control, even if there is a deadlock without CPU time usage.
\<^item> @{system_option_def "profiling"} specifies a mode for global ML profiling. Possible values are the empty string (disabled), \<^verbatim>\time\ for \<^ML>\profile_time\ and \<^verbatim>\allocations\ for \<^ML>\profile_allocations\. Results appear near the bottom of the session log file.
\<^item> @{system_option_def system_log} specifies an optional log file for low-level messages produced by \<^ML>\Output.system_message\ in Isabelle/ML; the standard value ``\<^verbatim>\-\'' refers to console progress of the build job.
\<^item> @{system_option_def "system_heaps"} determines the directories for session heap images: \<^path>\$ISABELLE_HEAPS\ is the user directory and \<^path>\$ISABELLE_HEAPS_SYSTEM\ the system directory (usually within the Isabelle application). For \<^verbatim>\system_heaps=false\, heaps are stored in the user directory and may be loaded from both directories. For \<^verbatim>\system_heaps=true\, store and load happen only in the system directory.
The @{tool_def options} tool prints Isabelle system options. Its command-line usage is: @{verbatim [display] \Usage: isabelle options [OPTIONS] [MORE_OPTIONS ...] Options are: -b include $ISABELLE_BUILD_OPTIONS -g OPTION get value of OPTION -l list options -t TAGS restrict list to given tags (comma-separated) -x FILE export options to FILE in YXML format Report Isabelle system options, augmented by MORE_OPTIONS given as arguments NAME=VAL or NAME.\}
The command-line arguments provide additional system options of the form \name\\<^verbatim>\=\\value\ or \name\ for Boolean options.
Option \<^verbatim>\-b\ augments the implicit environment of system options by the ones of @{setting ISABELLE_BUILD_OPTIONS}, cf.\ \secref{sec:tool-build}.
Option \<^verbatim>\-g\ prints the value of the given option. Option \<^verbatim>\-l\ lists all options with their declaration and current value. Option \<^verbatim>\-t\ restricts the listing to given tags (as comma-separated list), e.g. \<^verbatim>\-t build,document\.
Option \<^verbatim>\-x\ specifies a file to export the result in YXML format, instead of printing it in human-readable form. \
section \Invoking the build process \label{sec:tool-build}\ text \ The @{tool_def build} tool invokes the build process for Isabelle sessions. It manages dependencies between sessions, related sources of theories and auxiliary files, and target heap images.
Accordingly, it runs instances of the prover process with optional document preparation. Its command-line usage is:\<^footnote>\Isabelle/Scala provides the same functionality via \<^scala_method>\isabelle.Build.build\.\ @{verbatim [display] \Usage: isabelle build [OPTIONS] [SESSIONS ...] Options are: -B NAME include session NAME and all descendants -D DIR include session directory and select its sessions -N cyclic shuffling of NUMA CPU nodes (performance tuning) -P DIR enable HTML/PDF presentation in directory (":" for default) -R refer to requirements of selected sessions -S soft build: only observe changes of sources, not heap images -X NAME exclude sessions from group NAME and all descendants -a select all sessions -b build heap images -c clean build -d DIR include session directory -e export files from session specification into file-system -f fresh build -g NAME select session group NAME -j INT maximum number of parallel jobs (default 1) -k KEYWORD check theory sources for conflicts with proposed keywords -l list session source files -n no build -- take existing session build databases -o OPTION override Isabelle system OPTION (via NAME=VAL or NAME) -v verbose -x NAME exclude session NAME and all descendants - Build and manage Isabelle sessions, depending on implicit settings: + Build and manage Isabelle sessions: ML heaps, session databases, documents. - ISABELLE_TOOL_JAVA_OPTIONS="..." - ISABELLE_BUILD_OPTIONS="..." + Notable system options: see "isabelle options -l -t build" - ML_PLATFORM="..." - ML_HOME="..." - ML_SYSTEM="..." - ML_OPTIONS="..."\} + Notable system settings: + ISABELLE_TOOL_JAVA_OPTIONS="..." + ISABELLE_BUILD_OPTIONS="..." + + ML_PLATFORM="..." + ML_HOME="..." + ML_SYSTEM="..." + ML_OPTIONS="..."\}
\<^medskip> Isabelle sessions are defined via session ROOT files as described in \secref{sec:session-root}. The totality of sessions is determined by collecting such specifications from all Isabelle component directories (\secref{sec:components}), augmented by more directories given via options \<^verbatim>\-d\~\DIR\ on the command line. Each such directory may contain a session \<^verbatim>\ROOT\ file with several session specifications.
Any session root directory may refer recursively to further directories of the same kind, by listing them in a catalog file \<^verbatim>\ROOTS\ line-by-line. This helps to organize large collections of session specifications, or to make \<^verbatim>\-d\ command-line options persistent (e.g.\ in \<^verbatim>\$ISABELLE_HOME_USER/ROOTS\).
\<^medskip> The subset of sessions to be managed is determined via individual \SESSIONS\ given as command-line arguments, or session groups that are given via one or more options \<^verbatim>\-g\~\NAME\. Option \<^verbatim>\-a\ selects all sessions. The build tool takes session dependencies into account: the set of selected sessions is completed by including all ancestors.
\<^medskip> One or more options \<^verbatim>\-B\~\NAME\ specify base sessions to be included (all descendants wrt.\ the session parent or import graph).
\<^medskip> One or more options \<^verbatim>\-x\~\NAME\ specify sessions to be excluded (all descendants wrt.\ the session parent or import graph). Option \<^verbatim>\-X\ is analogous to this, but excluded sessions are specified by session group membership.
\<^medskip> Option \<^verbatim>\-R\ reverses the selection in the sense that it refers to its requirements: all ancestor sessions excluding the original selection.
This makes it possible to prepare the stage for some build process with different options, before running the main build itself (without option \<^verbatim>\-R\).
\<^medskip> Option \<^verbatim>\-D\ is similar to \<^verbatim>\-d\, but selects all sessions that are defined in the given directories.
\<^medskip> Option \<^verbatim>\-S\ indicates a ``soft build'': the selection is restricted to those sessions that have changed sources (according to actually imported theories). The status of heap images is ignored.
\<^medskip> The build process depends on additional options (\secref{sec:system-options}) that are passed to the prover eventually. The settings variable @{setting_ref ISABELLE_BUILD_OPTIONS} makes it possible to provide additional defaults, e.g.\ \<^verbatim>\ISABELLE_BUILD_OPTIONS="document=pdf threads=4"\. Moreover, the environment of system build options may be augmented on the command line via \<^verbatim>\-o\~\name\\<^verbatim>\=\\value\ or \<^verbatim>\-o\~\name\, which abbreviates \<^verbatim>\-o\~\name\\<^verbatim>\=true\ for Boolean or string options. Multiple occurrences of \<^verbatim>\-o\ on the command-line are applied in the given order.
\<^medskip> Option \<^verbatim>\-P\ enables PDF/HTML presentation in the given directory, where ``\<^verbatim>\-P:\'' refers to the default @{setting_ref ISABELLE_BROWSER_INFO} (or @{setting_ref ISABELLE_BROWSER_INFO_SYSTEM}). This applies only to explicitly selected sessions; note that option \<^verbatim>\-R\ makes it possible to select all requirements separately.
\<^medskip> Option \<^verbatim>\-b\ ensures that heap images are produced for all selected sessions. By default, images are only saved for inner nodes of the hierarchy of sessions, as required for other sessions to continue later on.
\<^medskip> Option \<^verbatim>\-c\ cleans the selected sessions (all descendants wrt.\ the session parent or import graph) before performing the specified build operation.
\<^medskip> Option \<^verbatim>\-e\ executes the \isakeyword{export_files} directives from the ROOT specification of all explicitly selected sessions: the status of the session build database needs to be OK, but the session could have been built earlier. Using \isakeyword{export_files}, a session may serve as abstract interface for add-on build artefacts, but these are only materialized on explicit request: without option \<^verbatim>\-e\ there is no effect on the physical file-system yet.
\<^medskip> Option \<^verbatim>\-f\ forces a fresh build of all selected sessions and their requirements.
\<^medskip> Option \<^verbatim>\-n\ omits the actual build process after the preparatory stage (including optional cleanup). The overall return code always reflects the status of the set of selected sessions. Add-on builds (like presentation) are run for successful sessions, i.e.\ already finished ones.
\<^medskip> Option \<^verbatim>\-j\ specifies the maximum number of parallel build jobs (prover processes). Each prover process is subject to a separate limit of parallel worker threads, cf.\ system option @{system_option_ref threads}.
\<^medskip> Option \<^verbatim>\-N\ enables cyclic shuffling of NUMA CPU nodes. This may help performance tuning on Linux servers with separate CPU/memory modules.
\<^medskip> Option \<^verbatim>\-v\ increases the general level of verbosity.
\<^medskip> Option \<^verbatim>\-l\ lists the source files that contribute to a session.
\<^medskip> Option \<^verbatim>\-k\ specifies a newly proposed keyword for outer syntax.
It is possible to use option \<^verbatim>\-k\ repeatedly to check multiple keywords. The theory sources are checked for conflicts wrt.\ this hypothetical change of syntax, e.g.\ to reveal occurrences of identifiers that need to be quoted. \
subsubsection \Examples\ text \ Build a specific logic image: @{verbatim [display] \ isabelle build -b HOLCF\}
\<^smallskip> Build the main group of logic images: @{verbatim [display] \ isabelle build -b -g main\}
\<^smallskip> Build all descendants (and requirements) of \<^verbatim>\FOL\ and \<^verbatim>\ZF\: @{verbatim [display] \ isabelle build -B FOL -B ZF\}
\<^smallskip> Build all sessions where sources have changed (ignoring heaps): @{verbatim [display] \ isabelle build -a -S\}
\<^smallskip> Provide a general overview of the status of all Isabelle sessions, without building anything: @{verbatim [display] \ isabelle build -a -n -v\}
\<^smallskip> Build all sessions with HTML browser info and PDF document preparation: @{verbatim [display] \ isabelle build -a -o browser_info -o document\}
\<^smallskip> Build all sessions with a maximum of 8 parallel prover processes and 4 worker threads each (on a machine with many cores): @{verbatim [display] \ isabelle build -a -j8 -o threads=4\}
\<^smallskip> Build some session images with cleanup of their descendants, while retaining their ancestry: @{verbatim [display] \ isabelle build -b -c HOL-Library HOL-Algebra\}
\<^smallskip> HTML/PDF presentation for sessions that happen to be properly built already, without rebuilding anything except the missing browser info: @{verbatim [display] \ isabelle build -a -n -o browser_info\}
\<^smallskip> Clean all sessions without building anything: @{verbatim [display] \ isabelle build -a -n -c\}
\<^smallskip> Build all sessions from some other directory hierarchy, according to the settings variable \<^verbatim>\AFP\ that happens to be defined inside the Isabelle environment: @{verbatim [display] \ isabelle build -D '$AFP'\}
\<^smallskip> Inform about the status of all sessions required for AFP, without building anything yet: @{verbatim [display] \ isabelle build -D '$AFP' -R -v -n\} \
section \Print messages from session build database \label{sec:tool-log}\ text \ The @{tool_def "build_log"} tool prints prover messages from the build database of the given session. Its command-line usage is: @{verbatim [display] \Usage: isabelle build_log [OPTIONS] [SESSIONS ...] Options are: -H REGEX filter messages by matching against head -M REGEX filter messages by matching against body -T NAME restrict to given theories (multiple options possible) -U output Unicode symbols -m MARGIN margin for pretty printing (default: 76.0) -o OPTION override Isabelle system OPTION (via NAME=VAL or NAME) -v print all messages, including information etc. Print messages from the session build database of the given sessions, without any checks against current sources nor session structure: results from old sessions or failed builds can be printed as well. Multiple options -H and -M are conjunctive: all given patterns need to match. Patterns match any substring, but ^ or $ may be used to match the start or end explicitly.\}
The specified session databases are taken as is, without formal checking against current sources: There is \<^emph>\no\ implicit build process involved, so it is possible to retrieve error messages from a failed session as well.
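\<^medskip> For example, the stored error messages of a failed session build can be retrieved afterwards, without any rebuild; the session name \<^verbatim>\My_Session\ is hypothetical: @{verbatim [display] \ isabelle build_log -H "Error" My_Session\}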
The order of messages follows the source positions of source files; thus the result is mostly deterministic, independent of the somewhat erratic evaluation of parallel processing.
\<^medskip> Option \<^verbatim>\-o\ makes it possible to change system options, as in @{tool build} (\secref{sec:tool-build}). This may affect the storage space for the build database, notably via @{system_option system_heaps}, or @{system_option build_database_server} and its relatives.
\<^medskip> Option \<^verbatim>\-T\ restricts output to given theories: multiple entries are possible by repeating this option on the command-line. The default is to refer to \<^emph>\all\ theories used in the original session build process.
\<^medskip> Options \<^verbatim>\-m\ and \<^verbatim>\-U\ modify pretty printing and output of Isabelle symbols. The default is for an old-fashioned ASCII terminal at 80 characters per line (76 + 4 characters to prefix warnings or errors).
\<^medskip> Option \<^verbatim>\-v\ prints all messages from the session database that are normally inlined into the source text, including information messages etc.
\<^medskip> Options \<^verbatim>\-H\ and \<^verbatim>\-M\ filter messages according to their header or body content, respectively. The header follows a very basic format that makes it easy to match message kinds (e.g. \<^verbatim>\Warning\ or \<^verbatim>\Error\) and file names (e.g. \<^verbatim>\src/HOL/Nat.thy\). The body is usually pretty-printed, but for matching it is treated like one long line: blocks are ignored and breaks are turned into plain spaces (according to their formal width). The syntax for patterns follows regular expressions of the Java platform.\<^footnote>\\<^url>\https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/util/regex/Pattern.html\\ \
subsubsection \Examples\ text \ Print messages from theory \<^verbatim>\HOL.Nat\ of session \<^verbatim>\HOL\, using Unicode rendering of Isabelle symbols and a margin of 100 characters: @{verbatim [display] \ isabelle build_log -T HOL.Nat -U -m 100 HOL\}
Print warnings about ambiguous input (inner syntax) of session \<^verbatim>\HOL-Library\, which is built beforehand: @{verbatim [display] \ isabelle build HOL-Library isabelle build_log -H "Warning" -M "Ambiguous input" HOL-Library\}
Print all errors from all sessions, e.g. from a partial build of Isabelle/AFP: @{verbatim [display] \ isabelle build_log -H "Error" $(isabelle sessions -a -d AFP/thys)\} \
section \Retrieve theory exports \label{sec:tool-export}\ text \ The @{tool_def "export"} tool retrieves theory exports from the session database. Its command-line usage is: @{verbatim [display] \Usage: isabelle export [OPTIONS] SESSION Options are: -O DIR output directory for exported files (default: "export") -d DIR include session directory -l list exports -n no build of session -o OPTION override Isabelle system OPTION (via NAME=VAL or NAME) -p NUM prune path of exported files by NUM elements -x PATTERN extract files matching pattern (e.g. "*:**" for all) List or export theory exports for SESSION: named blobs produced by isabelle build. Option -l or -x is required; option -x may be repeated. The PATTERN language resembles glob patterns in the shell, with ? and * (both excluding ":" and "/"), ** (excluding ":"), and [abc] or [^abc], and variants {pattern1,pattern2,pattern3}.\}
\<^medskip> The specified session is updated via @{tool build} (\secref{sec:tool-build}), with the same options \<^verbatim>\-d\, \<^verbatim>\-o\.
The option \<^verbatim>\-n\ suppresses the implicit build process: it means that a potentially outdated session database is used!
\<^medskip> Option \<^verbatim>\-l\ lists all stored exports, with compound names \theory\\<^verbatim>\:\\name\.
\<^medskip> Option \<^verbatim>\-x\ extracts stored exports whose compound name matches the given pattern. Note that wild cards ``\<^verbatim>\?\'' and ``\<^verbatim>\*\'' do not match the separators ``\<^verbatim>\:\'' and ``\<^verbatim>\/\''; the wild card \<^verbatim>\**\ matches over directory name hierarchies separated by ``\<^verbatim>\/\''. Thus the pattern ``\<^verbatim>\*:**\'' matches \<^emph>\all\ theory exports. Multiple options \<^verbatim>\-x\ refer to the union of all specified patterns.
Option \<^verbatim>\-O\ specifies an alternative output directory for option \<^verbatim>\-x\: the default is \<^verbatim>\export\ within the current directory. Each theory creates its own sub-directory hierarchy, using the session-qualified theory name.
Option \<^verbatim>\-p\ specifies the number of elements that should be pruned from each name: it makes it possible to reduce the resulting directory hierarchy, at the risk of overwriting files due to loss of uniqueness. \
section \Dump PIDE session database \label{sec:tool-dump}\ text \ The @{tool_def "dump"} tool dumps information from the cumulative PIDE session database (which is processed on the spot). Its command-line usage is: @{verbatim [display] \Usage: isabelle dump [OPTIONS] [SESSIONS ...] Options are: -A NAMES dump named aspects (default: ...) -B NAME include session NAME and all descendants -D DIR include session directory and select its sessions -O DIR output directory for dumped files (default: "dump") -R refer to requirements of selected sessions -X NAME exclude sessions from group NAME and all descendants -a select all sessions -b NAME base logic image (default "Pure") -d DIR include session directory -g NAME select session group NAME -o OPTION override Isabelle system OPTION (via NAME=VAL or NAME) -v verbose -x NAME exclude session NAME and all descendants Dump cumulative PIDE session database, with the following aspects: ...\}
\<^medskip> Options \<^verbatim>\-B\, \<^verbatim>\-D\, \<^verbatim>\-R\, \<^verbatim>\-X\, \<^verbatim>\-a\, \<^verbatim>\-d\, \<^verbatim>\-g\, \<^verbatim>\-x\ and the remaining command-line arguments specify sessions as in @{tool build} (\secref{sec:tool-build}): the cumulative PIDE database of all their loaded theories is dumped to the output directory of option \<^verbatim>\-O\ (default: \<^verbatim>\dump\ in the current directory).
\<^medskip> Option \<^verbatim>\-b\ specifies an optional base logic image, for improved scalability of the PIDE session. Its theories are only processed if it is included in the overall session selection.
\<^medskip> Option \<^verbatim>\-o\ overrides Isabelle system options as for @{tool build} (\secref{sec:tool-build}).
\<^medskip> Option \<^verbatim>\-v\ increases the general level of verbosity.
\<^medskip> Option \<^verbatim>\-A\ specifies named aspects of the dump, as a comma-separated list. The default is to dump all known aspects, as given in the command-line usage of the tool. The underlying Isabelle/Scala operation \<^scala_method>\isabelle.Dump.dump\ takes aspects as user-defined operations on the final PIDE state and document version. This makes it possible to imitate Prover IDE rendering under program control.
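\<^medskip> For example, here is a sketch of a dump restricted to particular aspects, assuming that \<^verbatim>\markup\ and \<^verbatim>\messages\ occur among the aspect names listed in the command-line usage above: @{verbatim [display] \ isabelle dump -A markup,messages -O output HOL\}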
\ subsubsection \Examples\ text \ Dump all Isabelle/ZF sessions (which are rather small): @{verbatim [display] \ isabelle dump -v -B ZF\}
\<^smallskip> Dump the quite substantial \<^verbatim>\HOL-Analysis\ session, with full bootstrap from Isabelle/Pure: @{verbatim [display] \ isabelle dump -v HOL-Analysis\}
\<^smallskip> Dump all sessions connected to HOL-Analysis, using main Isabelle/HOL as a basis: @{verbatim [display] \ isabelle dump -v -b HOL -B HOL-Analysis\}
This results in uniform PIDE markup for everything, except for the Isabelle/Pure bootstrap process itself. Producing that on the spot requires several GB of heap space, both for the Isabelle/Scala and Isabelle/ML process (in 64bit mode). Here are some relevant settings (\secref{sec:boot}) for such ambitious applications: @{verbatim [display] \ ISABELLE_TOOL_JAVA_OPTIONS="-Xms4g -Xmx32g -Xss16m" ML_OPTIONS="--minheap 4G --maxheap 32G" \} \
section \Update theory sources based on PIDE markup \label{sec:tool-update}\ text \ The @{tool_def "update"} tool updates theory sources based on markup that is produced by the regular @{tool build} process (\secref{sec:tool-build}). Its command-line usage is: @{verbatim [display] \Usage: isabelle update [OPTIONS] [SESSIONS ...] Options are: -B NAME include session NAME and all descendants -D DIR include session directory and select its sessions -R refer to requirements of selected sessions -X NAME exclude sessions from group NAME and all descendants -a select all sessions -b build heap images -c clean build -d DIR include session directory -f fresh build -g NAME select session group NAME -j INT maximum number of parallel jobs (default 1) -l NAME additional base logic -n no build -- take existing session build databases -o OPTION override Isabelle system OPTION (via NAME=VAL or NAME) -u OPT override "update" option for selected sessions -v verbose -x NAME exclude session NAME and all descendants Update theory sources based on PIDE markup produced by "isabelle build".\}
\<^medskip> Most options are the same as for @{tool build} (\secref{sec:tool-build}).
\<^medskip> Option \<^verbatim>\-l\ specifies one or more base logics: these sessions and their ancestors are \<^emph>\excluded\ from the update.
\<^medskip> Option \<^verbatim>\-u\ refers to specific \<^verbatim>\update\ options, by relying on naming convention: ``\<^verbatim>\-u\~\OPT\'' is a shortcut for ``\<^verbatim>\-o\~\<^verbatim>\update_\\OPT\''.
\<^medskip> The following \<^verbatim>\update\ options are supported:
\<^item> @{system_option_ref update_inner_syntax_cartouches} to update inner syntax (types, terms, etc.)~to use cartouches, instead of double-quoted strings or atomic identifiers. For example, ``\<^theory_text>\lemma \x = x\\'' is replaced by ``\<^theory_text>\lemma \x = x\\'', and ``\<^theory_text>\assume A\'' is replaced by ``\<^theory_text>\assume \A\\''.
\<^item> @{system_option update_mixfix_cartouches} to update mixfix templates to use cartouches instead of double-quoted strings. For example, ``\<^theory_text>\(infixl \+\ 65)\'' is replaced by ``\<^theory_text>\(infixl \+\ 65)\''.
\<^item> @{system_option_ref update_control_cartouches} to update antiquotations to use the compact form with control symbol and cartouche argument. For example, ``\@{term \x + y\}\'' is replaced by ``\\<^term>\x + y\\'' (the control symbol is literally \<^verbatim>\\<^term>\.)
\<^item> @{system_option_ref update_path_cartouches} to update file-system paths to use cartouches: this depends on language markup provided by semantic processing of parsed input.
\<^item> @{system_option_ref update_cite} to update {\LaTeX} \<^verbatim>\\cite\ commands and old-style \<^verbatim>\@{cite "name"}\ document antiquotations.
It is also possible to produce custom updates in Isabelle/ML, by reporting \<^ML>\Markup.update\ with the precise source position and a replacement text. This operation should be made conditional on specific system options, similar to the ones above. Searching the above option names in ML sources of \<^dir>\$ISABELLE_HOME/src/Pure\ provides some examples.
Updates can be in conflict by producing nested or overlapping edits: this may require running @{tool update} multiple times. \
subsubsection \Examples\ text \ Update some cartouche notation in all theory sources required for session \<^verbatim>\HOL-Analysis\ (and ancestors): @{verbatim [display] \ isabelle update -u mixfix_cartouches HOL-Analysis\}
\<^smallskip> Update the same for all application sessions based on \<^verbatim>\HOL-Analysis\, but do not change the underlying \<^verbatim>\HOL\ (and \<^verbatim>\Pure\) session: @{verbatim [display] \ isabelle update -u mixfix_cartouches -l HOL -B HOL-Analysis\}
\<^smallskip> Update all sessions that happen to be properly built beforehand: @{verbatim [display] \ isabelle update -u mixfix_cartouches -n -a\} \
section \Explore sessions structure\ text \ The @{tool_def "sessions"} tool explores the sessions structure. Its command-line usage is: @{verbatim [display] \Usage: isabelle sessions [OPTIONS] [SESSIONS ...] Options are: -B NAME include session NAME and all descendants -D DIR include session directory and select its sessions -R refer to requirements of selected sessions -X NAME exclude sessions from group NAME and all descendants -a select all sessions -b follow session build dependencies (default: source imports) -d DIR include session directory -g NAME select session group NAME -x NAME exclude session NAME and all descendants Explore the structure of Isabelle sessions and print result names in topological order (on stdout).\}
Arguments and options for session selection resemble @{tool build} (\secref{sec:tool-build}). \
subsubsection \Examples\ text \ All sessions of the Isabelle distribution: @{verbatim [display] \ isabelle sessions -a\}
\<^medskip> Sessions that are imported by \<^verbatim>\ZF\: @{verbatim [display] \ isabelle sessions ZF\}
\<^medskip> Sessions that are required to build \<^verbatim>\ZF\: @{verbatim [display] \ isabelle sessions -b ZF\}
\<^medskip> Sessions that are based on \<^verbatim>\ZF\ (and imported by it): @{verbatim [display] \ isabelle sessions -B ZF\}
\<^medskip> All sessions of Isabelle/AFP (based in directory \<^path>\AFP\): @{verbatim [display] \ isabelle sessions -D AFP/thys\}
\<^medskip> Sessions required by Isabelle/AFP (based in directory \<^path>\AFP\): @{verbatim [display] \ isabelle sessions -R -D AFP/thys\} \
section \Synchronize source repositories and session images for Isabelle and AFP\ text \ The @{tool_def sync} tool synchronizes a local Isabelle and AFP source repository, possibly with prebuilt \<^verbatim>\.jar\ files and session images.
Its command-line usage is: @{verbatim [display] \Usage: isabelle sync [OPTIONS] TARGET Options are: -A ROOT include AFP with given root directory (":" for $AFP_BASE) -H purge heaps directory on target -I NAME include session heap image and build database (based on accidental local state) -J preserve *.jar files -P protect spaces in target file names: more robust, less portable -S PATH SSH control path for connection multiplexing -T thorough treatment of file content and directory times -a REV explicit AFP revision (default: state of working directory) -n no changes: dry-run -p PORT SSH port -r REV explicit revision (default: state of working directory) -v verbose Synchronize Isabelle + AFP repositories, based on "isabelle hg_sync".\} The approach is to apply @{tool hg_sync} (see \secref{sec:tool-hg-sync}) on the underlying Isabelle repository, and an optional AFP repository. Consequently, the Isabelle installation needs to be a Mercurial repository clone: a regular download of the Isabelle distribution is not sufficient! On the target side, AFP is placed into @{setting ISABELLE_HOME} as immediate sub-directory with the literal name \<^verbatim>\AFP\; thus it can be easily included elsewhere, e.g. @{tool build}~\<^verbatim>\-d\~\<^verbatim>\'~~/AFP'\ on the remote side. \<^medskip> Options \<^verbatim>\-P\, \<^verbatim>\-S\, \<^verbatim>\-T\, \<^verbatim>\-n\, \<^verbatim>\-p\, \<^verbatim>\-v\ are the same as the underlying @{tool hg_sync}. \<^medskip> Options \<^verbatim>\-r\ and \<^verbatim>\-a\ are the same as option \<^verbatim>\-r\ for @{tool hg_sync}, but for the Isabelle and AFP repositories, respectively. The AFP version is only used if a corresponding repository is given via option \<^verbatim>\-A\, either with explicit root directory, or as \<^verbatim>\-A:\ to refer to \<^verbatim>\$AFP_BASE\ (this assumes AFP as component of the local Isabelle installation). If no AFP repository is given, an existing \<^verbatim>\AFP\ directory on the target remains unchanged. \<^medskip> Option \<^verbatim>\-J\ uploads existing \<^verbatim>\.jar\ files from the working directory, which are usually Isabelle/Scala/Java modules under control of @{tool scala_build} via \<^verbatim>\etc/build.props\ (see also \secref{sec:scala-build}). Thus the dependency management is accurate: bad uploads will be rebuilt eventually (or ignored). This might fail for very old Isabelle versions, when going into the past via option \<^verbatim>\-r\: here it is better to omit option \<^verbatim>\-J\ and thus purge \<^verbatim>\.jar\ files on the target (because they do not belong to the repository). \<^medskip> Option \<^verbatim>\-I\ uploads a collection of session images. The set of \<^verbatim>\-I\ options specifies the end-points in the session build graph, including all required ancestors. The result collection is uploaded using the underlying \<^verbatim>\rsync\ policies, so unchanged images are not sent again. Session images are assembled within the target \<^verbatim>\heaps\ directory: this scheme fits together with @{tool build}~\<^verbatim>\-o system_heaps\. Images are taken as-is from the local Isabelle installation, regardless of option \<^verbatim>\-r\. Upload of bad images could waste time and space, but running e.g. @{tool build} on the target will check dependencies accurately and rebuild outdated images on demand. 
\<^medskip> Option \<^verbatim>\-H\ tells the underlying \<^verbatim>\rsync\ process to purge the \<^verbatim>\heaps\ directory on the target, before uploading new images via option \<^verbatim>\-I\. The default is to work monotonically: old material that is not overwritten remains unchanged. Over time, this may lead to unused garbage, due to changes in session names or the Poly/ML version. Option \<^verbatim>\-H\ helps to avoid wasting file-system space. \
subsubsection \Examples\ text \ For quick testing of Isabelle + AFP on a remote machine, upload changed sources, jars, and local session images for \<^verbatim>\HOL\: @{verbatim [display] \ isabelle sync -A: -I HOL -J testmachine:test/isabelle_afp\}
Assuming that the local \<^verbatim>\HOL\ hierarchy is up-to-date, and the local and remote ML platforms coincide, a remote @{tool build} will proceed without building \<^verbatim>\HOL\ again.
\<^medskip> Here is a variation for extra-clean testing of Isabelle + AFP: no option \<^verbatim>\-J\, but option \<^verbatim>\-T\ to disable the default ``quick check'' of \<^verbatim>\rsync\ (which only inspects file sizes and date stamps); existing heaps are deleted: @{verbatim [display] \ isabelle sync -A: -T -H testmachine:test/isabelle_afp\} \ end
diff --git a/src/Pure/Admin/build_log.scala b/src/Pure/Admin/build_log.scala --- a/src/Pure/Admin/build_log.scala +++ b/src/Pure/Admin/build_log.scala @@ -1,1117 +1,1117 @@ /* Title: Pure/Admin/build_log.scala Author: Makarius Management of build log files and database storage. */ package isabelle import java.io.{File => JFile} import java.time.format.{DateTimeFormatter, DateTimeParseException} import java.util.Locale import scala.collection.immutable.SortedMap import scala.collection.mutable import scala.util.matching.Regex object Build_Log { /** content **/ /* properties */ object Prop { val build_tags = SQL.Column.string("build_tags") // lines val build_args = SQL.Column.string("build_args") // lines val build_group_id = SQL.Column.string("build_group_id") val build_id = SQL.Column.string("build_id") val build_engine = SQL.Column.string("build_engine") val build_host = SQL.Column.string("build_host") val build_start = SQL.Column.date("build_start") val build_end = SQL.Column.date("build_end") val isabelle_version = SQL.Column.string("isabelle_version") val afp_version = SQL.Column.string("afp_version") val all_props: List[SQL.Column] = List(build_tags, build_args, build_group_id, build_id, build_engine, build_host, build_start, build_end, isabelle_version, afp_version) } /* settings */ object Settings { val ISABELLE_BUILD_OPTIONS = SQL.Column.string("ISABELLE_BUILD_OPTIONS") val ML_PLATFORM = SQL.Column.string("ML_PLATFORM") val ML_HOME = SQL.Column.string("ML_HOME") val ML_SYSTEM = SQL.Column.string("ML_SYSTEM") val ML_OPTIONS = SQL.Column.string("ML_OPTIONS") val ml_settings = List(ML_PLATFORM, ML_HOME, ML_SYSTEM, ML_OPTIONS) val all_settings = ISABELLE_BUILD_OPTIONS :: ml_settings type Entry = (String, String) type T = List[Entry] object Entry { def unapply(s: String): Option[Entry] = for { (a, b) <- Properties.Eq.unapply(s) } yield (a, Library.perhaps_unquote(b)) def getenv(a: String): String = Properties.Eq(a, quote(Isabelle_System.getenv(a))) } def show(): String = cat_lines( List(Entry.getenv("ISABELLE_TOOL_JAVA_OPTIONS"), Entry.getenv(ISABELLE_BUILD_OPTIONS.name), "") ::: ml_settings.map(c => Entry.getenv(c.name))) } /* file names */ def log_date(date: Date): String = String.format(Locale.ROOT, "%s.%05d",
DateTimeFormatter.ofPattern("yyyy-MM-dd").format(date.rep), java.lang.Long.valueOf((date.time - date.midnight.time).ms / 1000)) def log_subdir(date: Date): Path = Path.explode("log") + Path.explode(date.rep.getYear.toString) def log_filename(engine: String, date: Date, more: List[String] = Nil): Path = Path.explode((engine :: log_date(date) :: more).mkString("", "_", ".log")) /** log file **/ def print_date(date: Date): String = Log_File.Date_Format(date) object Log_File { /* log file */ def plain_name(name: String): String = { List(".log", ".log.gz", ".log.xz", ".gz", ".xz").find(name.endsWith) match { case Some(s) => Library.try_unsuffix(s, name).get case None => name } } def apply(name: String, lines: List[String]): Log_File = new Log_File(plain_name(name), lines.map(Library.trim_line)) def apply(name: String, text: String): Log_File = new Log_File(plain_name(name), Library.trim_split_lines(text)) def apply(file: JFile): Log_File = { val name = file.getName val text = if (File.is_gz(name)) File.read_gzip(file) else if (File.is_xz(name)) File.read_xz(file) else File.read(file) apply(name, text) } def apply(path: Path): Log_File = apply(path.file) /* log file collections */ def is_log(file: JFile, prefixes: List[String] = List(Build_History.log_prefix, Identify.log_prefix, Identify.log_prefix2, Isatest.log_prefix, AFP_Test.log_prefix), suffixes: List[String] = List(".log", ".log.gz", ".log.xz") ): Boolean = { val name = file.getName prefixes.exists(name.startsWith) && suffixes.exists(name.endsWith) && name != "isatest.log" && name != "afp-test.log" && name != "main.log" } /* date format */ val Date_Format = { val fmts = Date.Formatter.variants( List("EEE MMM d HH:mm:ss O yyyy", "EEE MMM d HH:mm:ss VV yyyy"), List(Locale.ENGLISH, Locale.GERMAN)) ::: List( DateTimeFormatter.RFC_1123_DATE_TIME, Date.Formatter.pattern("EEE MMM d HH:mm:ss yyyy").withZone(Date.timezone_berlin)) def tune_timezone(s: String): String = s match { case "CET" | "MET" => "GMT+1" case "CEST" | "MEST" => "GMT+2" case "EST" => "Europe/Berlin" case _ => s } def tune_weekday(s: String): String = s match { case "Die" => "Di" case "Mit" => "Mi" case "Don" => "Do" case "Fre" => "Fr" case "Sam" => "Sa" case "Son" => "So" case _ => s } def tune(s: String): String = Word.implode( Word.explode(s) match { case a :: "M\uFFFDr" :: bs => tune_weekday(a) :: "Mär" :: bs.map(tune_timezone) case a :: bs => tune_weekday(a) :: bs.map(tune_timezone) case Nil => Nil } ) Date.Format.make(fmts, tune) } } class Log_File private(val name: String, val lines: List[String]) { log_file => override def toString: String = name def text: String = cat_lines(lines) def err(msg: String): Nothing = error("Error in log file " + quote(name) + ": " + msg) /* date format */ object Strict_Date { def unapply(s: String): Some[Date] = try { Some(Log_File.Date_Format.parse(s)) } catch { case exn: DateTimeParseException => log_file.err(exn.getMessage) } } /* inlined text */ def filter(Marker: Protocol_Message.Marker): List[String] = for (Marker(text) <- lines) yield text def find(Marker: Protocol_Message.Marker): Option[String] = lines.collectFirst({ case Marker(text) => text }) def find_match(regexes: List[Regex]): Option[String] = regexes match { case Nil => None case regex :: rest => lines.iterator.map(regex.unapplySeq(_)).find(res => res.isDefined && res.get.length == 1). 
map(res => res.get.head) orElse find_match(rest) } /* settings */ def get_setting(name: String): Option[Settings.Entry] = lines.collectFirst({ case Settings.Entry(a, b) if a == name => a -> b }) def get_all_settings: Settings.T = for { c <- Settings.all_settings; entry <- get_setting(c.name) } yield entry /* properties (YXML) */ val cache: XML.Cache = XML.Cache.make() def parse_props(text: String): Properties.T = try { cache.props(XML.Decode.properties(YXML.parse_body(text))) } catch { case _: XML.Error => log_file.err("malformed properties") } def filter_props(marker: Protocol_Message.Marker): List[Properties.T] = for (text <- filter(marker) if YXML.detect(text)) yield parse_props(text) def find_props(marker: Protocol_Message.Marker): Option[Properties.T] = for (text <- find(marker) if YXML.detect(text)) yield parse_props(text) /* parse various formats */ def parse_meta_info(): Meta_Info = Build_Log.parse_meta_info(log_file) def parse_build_info(ml_statistics: Boolean = false): Build_Info = Build_Log.parse_build_info(log_file, ml_statistics) def parse_session_info( command_timings: Boolean = false, theory_timings: Boolean = false, ml_statistics: Boolean = false, task_statistics: Boolean = false): Session_Info = Build_Log.parse_session_info( log_file, command_timings, theory_timings, ml_statistics, task_statistics) } /** digested meta info: produced by Admin/build_other in log.xz file **/ object Meta_Info { val empty: Meta_Info = Meta_Info(Nil, Nil) } sealed case class Meta_Info(props: Properties.T, settings: Settings.T) { def is_empty: Boolean = props.isEmpty && settings.isEmpty def get(c: SQL.Column): Option[String] = Properties.get(props, c.name) orElse Properties.get(settings, c.name) def get_date(c: SQL.Column): Option[Date] = get(c).map(Log_File.Date_Format.parse) } object Identify { val log_prefix = "isabelle_identify_" val log_prefix2 = "plain_identify_" def engine(log_file: Log_File): String = if (log_file.name.startsWith(log_prefix2)) "plain_identify" else "identify" def content(date: Date, isabelle_version: Option[String], afp_version: Option[String]): String = terminate_lines( List("isabelle_identify: " + Build_Log.print_date(date), "") ::: isabelle_version.map("Isabelle version: " + _).toList ::: afp_version.map("AFP version: " + _).toList) val Start = new Regex("""^isabelle_identify: (.+)$""") val No_End = new Regex("""$.""") val Isabelle_Version = List(new Regex("""^Isabelle version: (\S+)$""")) val AFP_Version = List(new Regex("""^AFP version: (\S+)$""")) } object Isatest { val log_prefix = "isatest-makeall-" val engine = "isatest" val Start = new Regex("""^------------------- starting test --- (.+) --- (.+)$""") val End = new Regex("""^------------------- test (?:successful|FAILED) --- (.+) --- .*$""") val Isabelle_Version = List(new Regex("""^Isabelle version: (\S+)$""")) } object AFP_Test { val log_prefix = "afp-test-devel-" val engine = "afp-test" val Start = new Regex("""^Start test(?: for .+)? at ([^,]+), (.*)$""") val Start_Old = new Regex("""^Start test(?: for .+)? 
at ([^,]+)$""") val End = new Regex("""^End test on (.+), .+, elapsed time:.*$""") val Isabelle_Version = List(new Regex("""^Isabelle version: .* -- hg id (\S+)$""")) val AFP_Version = List(new Regex("""^AFP version: .* -- hg id (\S+)$""")) val Bad_Init = new Regex("""^cp:.*: Disc quota exceeded$""") } object Jenkins { val log_prefix = "jenkins_" val engine = "jenkins" val Host = new Regex("""^Building remotely on (\S+) \((\S+)\).*$""") val Start = new Regex("""^(?:Started by an SCM change|Started from command line by admin|).*$""") val Start_Date = new Regex("""^Build started at (.+)$""") val No_End = new Regex("""$.""") val Isabelle_Version = List(new Regex("""^(?:Build for Isabelle id|Isabelle id) (\w+).*$"""), new Regex("""^ISABELLE_CI_REPO_ID="(\w+)".*$"""), new Regex("""^(\w{12}) tip.*$""")) val AFP_Version = List(new Regex("""^(?:Build for AFP id|AFP id) (\w+).*$"""), new Regex("""^ISABELLE_CI_AFP_ID="(\w+)".*$""")) val CONFIGURATION = "=== CONFIGURATION ===" val BUILD = "=== BUILD ===" } private def parse_meta_info(log_file: Log_File): Meta_Info = { def parse(engine: String, host: String, start: Date, End: Regex, Isabelle_Version: List[Regex], AFP_Version: List[Regex] ): Meta_Info = { val build_id = { val prefix = proper_string(host) orElse proper_string(engine) getOrElse "build" prefix + ":" + start.time.ms } val build_engine = if (engine == "") Nil else List(Prop.build_engine.name -> engine) val build_host = if (host == "") Nil else List(Prop.build_host.name -> host) val start_date = List(Prop.build_start.name -> print_date(start)) val end_date = log_file.lines.last match { case End(log_file.Strict_Date(end_date)) => List(Prop.build_end.name -> print_date(end_date)) case _ => Nil } val isabelle_version = log_file.find_match(Isabelle_Version).map(Prop.isabelle_version.name -> _) val afp_version = log_file.find_match(AFP_Version).map(Prop.afp_version.name -> _) Meta_Info((Prop.build_id.name -> build_id) :: build_engine ::: build_host ::: start_date ::: end_date ::: isabelle_version.toList ::: afp_version.toList, log_file.get_all_settings) } log_file.lines match { case line :: _ if Protocol.Meta_Info_Marker.test_yxml(line) => Meta_Info(log_file.find_props(Protocol.Meta_Info_Marker).get, log_file.get_all_settings) case Identify.Start(log_file.Strict_Date(start)) :: _ => parse(Identify.engine(log_file), "", start, Identify.No_End, Identify.Isabelle_Version, Identify.AFP_Version) case Isatest.Start(log_file.Strict_Date(start), host) :: _ => parse(Isatest.engine, host, start, Isatest.End, Isatest.Isabelle_Version, Nil) case AFP_Test.Start(log_file.Strict_Date(start), host) :: _ => parse(AFP_Test.engine, host, start, AFP_Test.End, AFP_Test.Isabelle_Version, AFP_Test.AFP_Version) case AFP_Test.Start_Old(log_file.Strict_Date(start)) :: _ => parse(AFP_Test.engine, "", start, AFP_Test.End, AFP_Test.Isabelle_Version, AFP_Test.AFP_Version) case line :: _ if line.startsWith("\u0000") => Meta_Info.empty case List(Isatest.End(_)) => Meta_Info.empty case _ :: AFP_Test.Bad_Init() :: _ => Meta_Info.empty case Nil => Meta_Info.empty case _ => log_file.err("cannot detect log file format") } } /** build info: toplevel output of isabelle build or Admin/build_other **/ val SESSION_NAME = "session_name" object Session_Status extends Enumeration { val existing, finished, failed, cancelled = Value } sealed case class Session_Entry( chapter: String = "", groups: List[String] = Nil, threads: Option[Int] = None, timing: Timing = Timing.zero, ml_timing: Timing = Timing.zero, sources: Option[String] = None, 
heap_size: Option[Space] = None, status: Option[Session_Status.Value] = None, errors: List[String] = Nil, theory_timings: Map[String, Timing] = Map.empty, ml_statistics: List[Properties.T] = Nil ) { def proper_groups: Option[String] = if (groups.isEmpty) None else Some(cat_lines(groups)) def finished: Boolean = status == Some(Session_Status.finished) def failed: Boolean = status == Some(Session_Status.failed) } object Build_Info { val sessions_dummy: Map[String, Session_Entry] = Map("" -> Session_Entry(theory_timings = Map("" -> Timing.zero))) } sealed case class Build_Info(sessions: Map[String, Session_Entry]) { def finished_sessions: List[String] = for ((a, b) <- sessions.toList if b.finished) yield a def failed_sessions: List[String] = for ((a, b) <- sessions.toList if b.failed) yield a } private def parse_build_info(log_file: Log_File, parse_ml_statistics: Boolean): Build_Info = { object Chapter_Name { def unapply(s: String): Some[(String, String)] = space_explode('/', s) match { case List(chapter, name) => Some((chapter, name)) case _ => Some(("", s)) } } val Session_No_Groups = new Regex("""^Session (\S+)$""") val Session_Groups = new Regex("""^Session (\S+) \((.*)\)$""") val Session_Finished1 = new Regex("""^Finished (\S+) \((\d+):(\d+):(\d+) elapsed time, (\d+):(\d+):(\d+) cpu time.*$""") val Session_Finished2 = new Regex("""^Finished ([^\s/]+) \((\d+):(\d+):(\d+) elapsed time.*$""") val Session_Timing = new Regex("""^Timing (\S+) \((\d+) threads, (\d+\.\d+)s elapsed time, (\d+\.\d+)s cpu time, (\d+\.\d+)s GC time.*$""") val Session_Started1 = new Regex("""^(?:Running|Building) (\S+) \.\.\.$""") val Session_Started2 = new Regex("""^(?:Running|Building) (\S+) on \S+ \.\.\.$""") val Sources = new Regex("""^Sources (\S+) (\S{""" + SHA1.digest_length + """})$""") val Heap = new Regex("""^Heap (\S+) \((\d+) bytes\)$""") object Theory_Timing { def unapply(line: String): Option[(String, (String, Timing))] = Protocol.Theory_Timing_Marker.unapply(line.replace('~', '-')).map(log_file.parse_props) match { case Some((SESSION_NAME, session) :: props) => for (theory <- Markup.Name.unapply(props)) yield (session, theory -> Markup.Timing_Properties.get(props)) case _ => None } } var chapter = Map.empty[String, String] var groups = Map.empty[String, List[String]] var threads = Map.empty[String, Int] var timing = Map.empty[String, Timing] var ml_timing = Map.empty[String, Timing] var started = Set.empty[String] var sources = Map.empty[String, String] var heap_sizes = Map.empty[String, Space] var theory_timings = Map.empty[String, Map[String, Timing]] var ml_statistics = Map.empty[String, List[Properties.T]] var errors = Map.empty[String, List[String]] def all_sessions: Set[String] = chapter.keySet ++ groups.keySet ++ threads.keySet ++ timing.keySet ++ ml_timing.keySet ++ started ++ sources.keySet ++ heap_sizes.keySet ++ theory_timings.keySet ++ ml_statistics.keySet for (line <- log_file.lines) { line match { case Session_No_Groups(Chapter_Name(chapt, name)) => chapter += (name -> chapt) groups += (name -> Nil) case Session_Groups(Chapter_Name(chapt, name), grps) => chapter += (name -> chapt) groups += (name -> Word.explode(grps)) case Session_Started1(name) => started += name case Session_Started2(name) => started += name case Session_Finished1(name, Value.Int(e1), Value.Int(e2), Value.Int(e3), Value.Int(c1), Value.Int(c2), Value.Int(c3)) => val elapsed = Time.hms(e1, e2, e3) val cpu = Time.hms(c1, c2, c3) timing += (name -> Timing(elapsed, cpu, Time.zero)) case Session_Finished2(name, 
Value.Int(e1), Value.Int(e2), Value.Int(e3)) => val elapsed = Time.hms(e1, e2, e3) timing += (name -> Timing(elapsed, Time.zero, Time.zero)) case Session_Timing(name, Value.Int(t), Value.Double(e), Value.Double(c), Value.Double(g)) => val elapsed = Time.seconds(e) val cpu = Time.seconds(c) val gc = Time.seconds(g) ml_timing += (name -> Timing(elapsed, cpu, gc)) threads += (name -> t) case Sources(name, s) => sources += (name -> s) case Heap(name, Value.Long(size)) => heap_sizes += (name -> Space.bytes(size)) case _ if Protocol.Theory_Timing_Marker.test_yxml(line) => line match { case Theory_Timing(name, theory_timing) => theory_timings += (name -> (theory_timings.getOrElse(name, Map.empty) + theory_timing)) case _ => log_file.err("malformed theory_timing " + quote(line)) } case _ if parse_ml_statistics && Protocol.ML_Statistics_Marker.test_yxml(line) => Protocol.ML_Statistics_Marker.unapply(line).map(log_file.parse_props) match { case Some((SESSION_NAME, name) :: props) => ml_statistics += (name -> (props :: ml_statistics.getOrElse(name, Nil))) case _ => log_file.err("malformed ML_statistics " + quote(line)) } case _ if Protocol.Error_Message_Marker.test_yxml(line) => Protocol.Error_Message_Marker.unapply(line).map(log_file.parse_props) match { case Some(List((SESSION_NAME, name), (Markup.CONTENT, msg))) => errors += (name -> (msg :: errors.getOrElse(name, Nil))) case _ => log_file.err("malformed error message " + quote(line)) } case _ => } } val sessions = Map( (for (name <- all_sessions.toList) yield { val status = if (timing.isDefinedAt(name) || ml_timing.isDefinedAt(name)) Session_Status.finished else if (started(name)) Session_Status.failed else Session_Status.existing val entry = Session_Entry( chapter = chapter.getOrElse(name, ""), groups = groups.getOrElse(name, Nil), threads = threads.get(name), timing = timing.getOrElse(name, Timing.zero), ml_timing = ml_timing.getOrElse(name, Timing.zero), sources = sources.get(name), heap_size = heap_sizes.get(name), status = Some(status), errors = errors.getOrElse(name, Nil).reverse, theory_timings = theory_timings.getOrElse(name, Map.empty), ml_statistics = ml_statistics.getOrElse(name, Nil).reverse) (name -> entry) }):_*) Build_Info(sessions) } /** session info: produced by isabelle build as session database **/ sealed case class Session_Info( session_timing: Properties.T, command_timings: List[Properties.T], theory_timings: List[Properties.T], ml_statistics: List[Properties.T], task_statistics: List[Properties.T], errors: List[String] ) { def error(s: String): Session_Info = copy(errors = errors ::: List(s)) } private def parse_session_info( log_file: Log_File, command_timings: Boolean, theory_timings: Boolean, ml_statistics: Boolean, task_statistics: Boolean ): Session_Info = { Session_Info( session_timing = log_file.find_props(Protocol.Session_Timing_Marker) getOrElse Nil, command_timings = if (command_timings) log_file.filter_props(Protocol.Command_Timing_Marker) else Nil, theory_timings = if (theory_timings) log_file.filter_props(Protocol.Theory_Timing_Marker) else Nil, ml_statistics = if (ml_statistics) log_file.filter_props(Protocol.ML_Statistics_Marker) else Nil, task_statistics = if (task_statistics) log_file.filter_props(Protocol.Task_Statistics_Marker) else Nil, errors = log_file.filter(Protocol.Error_Message_Marker)) } def compress_errors( errors: List[String], cache: Compress.Cache = Compress.Cache.none ): Option[Bytes] = if (errors.isEmpty) None else { Some(Bytes(YXML.string_of_body(XML.Encode.list(XML.Encode.string)(errors))). 
compress(cache = cache)) } def uncompress_errors(bytes: Bytes, cache: XML.Cache = XML.Cache.make()): List[String] = if (bytes.is_empty) Nil else { XML.Decode.list(YXML.string_of_body)( YXML.parse_body(bytes.uncompress(cache = cache.compress).text, cache = cache)) } /** persistent store **/ /* SQL data model */ object Data { def build_log_table(name: String, columns: List[SQL.Column], body: String = ""): SQL.Table = SQL.Table("isabelle_build_log_" + name, columns, body) /* main content */ val log_name = SQL.Column.string("log_name").make_primary_key val session_name = SQL.Column.string("session_name").make_primary_key val theory_name = SQL.Column.string("theory_name").make_primary_key val chapter = SQL.Column.string("chapter") val groups = SQL.Column.string("groups") val threads = SQL.Column.int("threads") val timing_elapsed = SQL.Column.long("timing_elapsed") val timing_cpu = SQL.Column.long("timing_cpu") val timing_gc = SQL.Column.long("timing_gc") val timing_factor = SQL.Column.double("timing_factor") val ml_timing_elapsed = SQL.Column.long("ml_timing_elapsed") val ml_timing_cpu = SQL.Column.long("ml_timing_cpu") val ml_timing_gc = SQL.Column.long("ml_timing_gc") val ml_timing_factor = SQL.Column.double("ml_timing_factor") val theory_timing_elapsed = SQL.Column.long("theory_timing_elapsed") val theory_timing_cpu = SQL.Column.long("theory_timing_cpu") val theory_timing_gc = SQL.Column.long("theory_timing_gc") val heap_size = SQL.Column.long("heap_size") val status = SQL.Column.string("status") val errors = SQL.Column.bytes("errors") val sources = SQL.Column.string("sources") val ml_statistics = SQL.Column.bytes("ml_statistics") val known = SQL.Column.bool("known") val meta_info_table = build_log_table("meta_info", log_name :: Prop.all_props ::: Settings.all_settings) val sessions_table = build_log_table("sessions", List(log_name, session_name, chapter, groups, threads, timing_elapsed, timing_cpu, timing_gc, timing_factor, ml_timing_elapsed, ml_timing_cpu, ml_timing_gc, ml_timing_factor, heap_size, status, errors, sources)) val theories_table = build_log_table("theories", List(log_name, session_name, theory_name, theory_timing_elapsed, theory_timing_cpu, theory_timing_gc)) val ml_statistics_table = build_log_table("ml_statistics", List(log_name, session_name, ml_statistics)) /* AFP versions */ val isabelle_afp_versions_table: SQL.Table = { val version1 = Prop.isabelle_version val version2 = Prop.afp_version build_log_table("isabelle_afp_versions", List(version1.make_primary_key, version2), SQL.select(List(version1, version2), distinct = true) + meta_info_table + SQL.where(SQL.and(version1.defined, version2.defined))) } /* earliest pull date for repository version (PostgreSQL queries) */ def pull_date(afp: Boolean = false): SQL.Column = if (afp) SQL.Column.date("afp_pull_date") else SQL.Column.date("pull_date") def pull_date_table(afp: Boolean = false): SQL.Table = { val (name, versions) = if (afp) ("afp_pull_date", List(Prop.isabelle_version, Prop.afp_version)) else ("pull_date", List(Prop.isabelle_version)) build_log_table(name, versions.map(_.make_primary_key) ::: List(pull_date(afp)), "SELECT " + versions.mkString(", ") + ", min(" + Prop.build_start + ") AS " + pull_date(afp) + " FROM " + meta_info_table + " WHERE " + SQL.AND((versions ::: List(Prop.build_start)).map(_.defined)) + " GROUP BY " + versions.mkString(", ")) } /* recent entries */ def recent_time(days: Int): PostgreSQL.Source = "now() - INTERVAL '" + days.max(0) + " days'" def recent_pull_date_table( days: Int, rev: String = 
"", afp_rev: Option[String] = None ): SQL.Table = { val afp = afp_rev.isDefined val rev2 = afp_rev.getOrElse("") val table = pull_date_table(afp) val eq_rev = if_proper(rev, Prop.isabelle_version(table).equal(rev)) val eq_rev2 = if_proper(rev2, Prop.afp_version(table).equal(rev2)) SQL.Table("recent_pull_date", table.columns, table.select(table.columns, sql = SQL.where( SQL.or(pull_date(afp)(table).ident + " > " + recent_time(days), SQL.and(eq_rev, eq_rev2))))) } def select_recent_log_names(days: Int): PostgreSQL.Source = { val table1 = meta_info_table val table2 = recent_pull_date_table(days) table1.select(List(log_name), distinct = true, sql = SQL.join_inner + table2.query_named + " ON " + Prop.isabelle_version(table1) + " = " + Prop.isabelle_version(table2)) } def select_recent_versions( days: Int, rev: String = "", afp_rev: Option[String] = None, sql: PostgreSQL.Source = "" ): PostgreSQL.Source = { val afp = afp_rev.isDefined val version = Prop.isabelle_version val table1 = recent_pull_date_table(days, rev = rev, afp_rev = afp_rev) val table2 = meta_info_table val aux_table = SQL.Table("aux", table2.columns, table2.select(sql = sql)) val columns = table1.columns.map(c => c(table1)) ::: List(known.copy(expr = log_name(aux_table).defined)) SQL.select(columns, distinct = true) + table1.query_named + SQL.join_outer + aux_table.query_named + " ON " + version(table1) + " = " + version(aux_table) + SQL.order_by(List(pull_date(afp)(table1)), descending = true) } /* universal view on main data */ val universal_table: SQL.Table = { val afp_pull_date = pull_date(afp = true) val version1 = Prop.isabelle_version val version2 = Prop.afp_version val table1 = meta_info_table val table2 = pull_date_table(afp = true) val table3 = pull_date_table() val a_columns = log_name :: afp_pull_date :: table1.columns.tail val a_table = SQL.Table("a", a_columns, SQL.select(List(log_name, afp_pull_date) ::: table1.columns.tail.map(_.apply(table1))) + table1 + SQL.join_outer + table2 + " ON " + SQL.and( version1(table1).ident + " = " + version1(table2).ident, version2(table1).ident + " = " + version2(table2).ident)) val b_columns = log_name :: pull_date() :: a_columns.tail val b_table = SQL.Table("b", b_columns, SQL.select( List(log_name(a_table), pull_date()(table3)) ::: a_columns.tail.map(_.apply(a_table))) + a_table.query_named + SQL.join_outer + table3 + " ON " + version1(a_table) + " = " + version1(table3)) val c_columns = b_columns ::: sessions_table.columns.tail val c_table = SQL.Table("c", c_columns, SQL.select(log_name(b_table) :: c_columns.tail) + b_table.query_named + SQL.join_inner + sessions_table + " ON " + log_name(b_table) + " = " + log_name(sessions_table)) SQL.Table("isabelle_build_log", c_columns ::: List(ml_statistics), { SQL.select(c_columns.map(_.apply(c_table)) ::: List(ml_statistics)) + c_table.query_named + SQL.join_outer + ml_statistics_table + " ON " + SQL.and( log_name(c_table).ident + " = " + log_name(ml_statistics_table).ident, session_name(c_table).ident + " = " + session_name(ml_statistics_table).ident) }) } } /* database access */ def store(options: Options, cache: XML.Cache = XML.Cache.make()): Store = new Store(options, cache) class Store private[Build_Log](options: Options, val cache: XML.Cache) { def open_database( user: String = options.string("build_log_database_user"), password: String = options.string("build_log_database_password"), database: String = options.string("build_log_database_name"), host: String = options.string("build_log_database_host"), port: Int = 
options.int("build_log_database_port"), ssh_host: String = options.string("build_log_ssh_host"), ssh_user: String = options.string("build_log_ssh_user"), ssh_port: Int = options.int("build_log_ssh_port") ): PostgreSQL.Database = { PostgreSQL.open_database( user = user, password = password, database = database, host = host, port = port, ssh = if (ssh_host == "") None else Some(SSH.open_session(options, host = ssh_host, user = ssh_user, port = ssh_port)), ssh_close = true) } def update_database( db: PostgreSQL.Database, dirs: List[Path], ml_statistics: Boolean = false): Unit = { val log_files = dirs.flatMap(dir => File.find_files(dir.file, pred = Log_File.is_log(_), follow_links = true)) write_info(db, log_files, ml_statistics = ml_statistics) db.create_view(Data.pull_date_table()) db.create_view(Data.pull_date_table(afp = true)) db.create_view(Data.universal_table) } def snapshot_database( db: PostgreSQL.Database, sqlite_database: Path, days: Int = 100, ml_statistics: Boolean = false ): Unit = { Isabelle_System.make_directory(sqlite_database.dir) sqlite_database.file.delete using(SQLite.open_database(sqlite_database)) { db2 => db.transaction { db2.transaction { // main content db2.create_table(Data.meta_info_table) db2.create_table(Data.sessions_table) db2.create_table(Data.theories_table) db2.create_table(Data.ml_statistics_table) val recent_log_names = db.execute_query_statement( Data.select_recent_log_names(days), List.from[String], res => res.string(Data.log_name)) for (log_name <- recent_log_names) { read_meta_info(db, log_name).foreach(meta_info => update_meta_info(db2, log_name, meta_info)) update_sessions(db2, log_name, read_build_info(db, log_name)) if (ml_statistics) { update_ml_statistics(db2, log_name, read_build_info(db, log_name, ml_statistics = true)) } } // pull_date for (afp <- List(false, true)) { val afp_rev = if (afp) Some("") else None val table = Data.pull_date_table(afp) db2.create_table(table) db2.using_statement(table.insert()) { stmt2 => db.using_statement( Data.recent_pull_date_table(days, afp_rev = afp_rev).query) { stmt => val res = stmt.execute_query() while (res.next()) { for ((c, i) <- table.columns.zipWithIndex) { stmt2.string(i + 1) = res.get_string(c) } stmt2.execute() } } } } // full view db2.create_view(Data.universal_table) } } - db2.rebuild() + db2.vacuum() } } def domain(db: SQL.Database, table: SQL.Table, column: SQL.Column): Set[String] = db.execute_query_statement( table.select(List(column), distinct = true), Set.from[String], res => res.string(column)) def update_meta_info(db: SQL.Database, log_name: String, meta_info: Meta_Info): Unit = db.execute_statement(db.insert_permissive(Data.meta_info_table), body = { stmt => stmt.string(1) = log_name for ((c, i) <- Data.meta_info_table.columns.tail.zipWithIndex) { if (c.T == SQL.Type.Date) stmt.date(i + 2) = meta_info.get_date(c) else stmt.string(i + 2) = meta_info.get(c) } }) def update_sessions(db: SQL.Database, log_name: String, build_info: Build_Info): Unit = db.execute_statement(db.insert_permissive(Data.sessions_table), body = { stmt => val sessions = if (build_info.sessions.isEmpty) Build_Info.sessions_dummy else build_info.sessions for ((session_name, session) <- sessions) { stmt.string(1) = log_name stmt.string(2) = session_name stmt.string(3) = proper_string(session.chapter) stmt.string(4) = session.proper_groups stmt.int(5) = session.threads stmt.long(6) = session.timing.elapsed.proper_ms stmt.long(7) = session.timing.cpu.proper_ms stmt.long(8) = session.timing.gc.proper_ms stmt.double(9) = 
session.timing.factor stmt.long(10) = session.ml_timing.elapsed.proper_ms stmt.long(11) = session.ml_timing.cpu.proper_ms stmt.long(12) = session.ml_timing.gc.proper_ms stmt.double(13) = session.ml_timing.factor stmt.long(14) = session.heap_size.map(_.bytes) stmt.string(15) = session.status.map(_.toString) stmt.bytes(16) = compress_errors(session.errors, cache = cache.compress) stmt.string(17) = session.sources } }) def update_theories(db: SQL.Database, log_name: String, build_info: Build_Info): Unit = db.execute_statement(db.insert_permissive(Data.theories_table), body = { stmt => val sessions = if (build_info.sessions.forall({ case (_, session) => session.theory_timings.isEmpty })) Build_Info.sessions_dummy else build_info.sessions for { (session_name, session) <- sessions (theory_name, timing) <- session.theory_timings } { stmt.string(1) = log_name stmt.string(2) = session_name stmt.string(3) = theory_name stmt.long(4) = timing.elapsed.ms stmt.long(5) = timing.cpu.ms stmt.long(6) = timing.gc.ms } }) def update_ml_statistics(db: SQL.Database, log_name: String, build_info: Build_Info): Unit = db.execute_statement(db.insert_permissive(Data.ml_statistics_table), body = { stmt => val ml_stats: List[(String, Option[Bytes])] = Par_List.map[(String, Session_Entry), (String, Option[Bytes])]( { case (a, b) => (a, Properties.compress(b.ml_statistics, cache = cache.compress).proper) }, build_info.sessions.iterator.filter(p => p._2.ml_statistics.nonEmpty).toList) val entries = if (ml_stats.nonEmpty) ml_stats else List("" -> None) for ((session_name, ml_statistics) <- entries) { stmt.string(1) = log_name stmt.string(2) = session_name stmt.bytes(3) = ml_statistics } }) def write_info(db: SQL.Database, files: List[JFile], ml_statistics: Boolean = false): Unit = { abstract class Table_Status(table: SQL.Table) { db.create_table(table) private var known: Set[String] = domain(db, table, Data.log_name) def required(file: JFile): Boolean = !known(Log_File.plain_name(file.getName)) def update_db(db: SQL.Database, log_file: Log_File): Unit def update(log_file: Log_File): Unit = { if (!known(log_file.name)) { update_db(db, log_file) known += log_file.name } } } val status = List( new Table_Status(Data.meta_info_table) { override def update_db(db: SQL.Database, log_file: Log_File): Unit = update_meta_info(db, log_file.name, log_file.parse_meta_info()) }, new Table_Status(Data.sessions_table) { override def update_db(db: SQL.Database, log_file: Log_File): Unit = update_sessions(db, log_file.name, log_file.parse_build_info()) }, new Table_Status(Data.theories_table) { override def update_db(db: SQL.Database, log_file: Log_File): Unit = update_theories(db, log_file.name, log_file.parse_build_info()) }, new Table_Status(Data.ml_statistics_table) { override def update_db(db: SQL.Database, log_file: Log_File): Unit = if (ml_statistics) { update_ml_statistics(db, log_file.name, log_file.parse_build_info(ml_statistics = true)) } }) for (file_group <- files.filter(file => status.exists(_.required(file))). 
grouped(options.int("build_log_transaction_size") max 1)) { val log_files = Par_List.map[JFile, Log_File](Log_File.apply, file_group) db.transaction { log_files.foreach(log_file => status.foreach(_.update(log_file))) } } } def read_meta_info(db: SQL.Database, log_name: String): Option[Meta_Info] = { val table = Data.meta_info_table val columns = table.columns.tail db.execute_query_statementO[Meta_Info]( table.select(columns, sql = Data.log_name.where_equal(log_name)), { res => val results = columns.map(c => c.name -> (if (c.T == SQL.Type.Date) res.get_date(c).map(Log_File.Date_Format(_)) else res.get_string(c))) val n = Prop.all_props.length val props = for ((x, Some(y)) <- results.take(n)) yield (x, y) val settings = for ((x, Some(y)) <- results.drop(n)) yield (x, y) Meta_Info(props, settings) } ) } def read_build_info( db: SQL.Database, log_name: String, session_names: List[String] = Nil, ml_statistics: Boolean = false ): Build_Info = { val table1 = Data.sessions_table val table2 = Data.ml_statistics_table val columns1 = table1.columns.tail.map(_.apply(table1)) val (columns, from) = if (ml_statistics) { val columns = columns1 ::: List(Data.ml_statistics(table2)) val join = table1.ident + SQL.join_outer + table2.ident + " ON " + SQL.and( Data.log_name(table1).ident + " = " + Data.log_name(table2).ident, Data.session_name(table1).ident + " = " + Data.session_name(table2).ident) (columns, SQL.enclose(join)) } else (columns1, table1.ident) val where = SQL.where( SQL.and( Data.log_name(table1).equal(log_name), Data.session_name(table1).ident + " <> ''", if_proper(session_names, Data.session_name(table1).member(session_names)))) val sessions = db.execute_query_statement( SQL.select(columns, sql = from + where), Map.from[String, Session_Entry], { res => val session_name = res.string(Data.session_name) val session_entry = Session_Entry( chapter = res.string(Data.chapter), groups = split_lines(res.string(Data.groups)), threads = res.get_int(Data.threads), timing = res.timing( Data.timing_elapsed, Data.timing_cpu, Data.timing_gc), ml_timing = res.timing( Data.ml_timing_elapsed, Data.ml_timing_cpu, Data.ml_timing_gc), sources = res.get_string(Data.sources), heap_size = res.get_long(Data.heap_size).map(Space.bytes), status = res.get_string(Data.status).map(Session_Status.withName), errors = uncompress_errors(res.bytes(Data.errors), cache = cache), ml_statistics = if (ml_statistics) { Properties.uncompress(res.bytes(Data.ml_statistics), cache = cache) } else Nil) session_name -> session_entry } ) Build_Info(sessions) } } } diff --git a/src/Pure/General/sql.scala b/src/Pure/General/sql.scala --- a/src/Pure/General/sql.scala +++ b/src/Pure/General/sql.scala @@ -1,619 +1,627 @@ /* Title: Pure/General/sql.scala Author: Makarius Support for SQL databases: SQLite and PostgreSQL. 
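The Database trait below provides a common interface; backend-specific details (SQL types, permissive insert, vacuum) are supplied by the SQLite and PostgreSQL implementations.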
*/ package isabelle import java.time.OffsetDateTime import java.sql.{DriverManager, Connection, PreparedStatement, ResultSet} import org.sqlite.jdbc4.JDBC4Connection import org.postgresql.{PGConnection, PGNotification} import scala.collection.mutable object SQL { /** SQL language **/ type Source = String /* concrete syntax */ def escape_char(c: Char): String = c match { case '\u0000' => "\\0" case '\'' => "\\'" case '\"' => "\\\"" case '\b' => "\\b" case '\n' => "\\n" case '\r' => "\\r" case '\t' => "\\t" case '\u001a' => "\\Z" case '\\' => "\\\\" case _ => c.toString } def string(s: String): Source = s.iterator.map(escape_char).mkString("'", "", "'") def ident(s: String): Source = Long_Name.implode(Long_Name.explode(s).map(a => quote(a.replace("\"", "\"\"")))) def enclose(s: Source): Source = "(" + s + ")" def enclosure(ss: Iterable[Source]): Source = ss.mkString("(", ", ", ")") def separate(sql: Source): Source = (if (sql.isEmpty || sql.startsWith(" ")) "" else " ") + sql def select(columns: List[Column] = Nil, distinct: Boolean = false, sql: Source = ""): Source = "SELECT " + (if (distinct) "DISTINCT " else "") + (if (columns.isEmpty) "*" else commas(columns.map(_.ident))) + " FROM " + sql val join_outer: Source = " LEFT OUTER JOIN " val join_inner: Source = " INNER JOIN " def join(outer: Boolean): Source = if (outer) join_outer else join_inner def infix(op: Source, args: Iterable[Source]): Source = { val body = args.iterator.filter(_.nonEmpty).mkString(" " + op + " ") if_proper(body, enclose(body)) } def AND(args: Iterable[Source]): Source = infix("AND", args) def OR(args: Iterable[Source]): Source = infix("OR", args) def and(args: Source*): Source = AND(args) def or(args: Source*): Source = OR(args) val TRUE: Source = "TRUE" val FALSE: Source = "FALSE" def equal(sql: Source, s: String): Source = sql + " = " + string(s) def member(sql: Source, set: Iterable[String]): Source = if (set.isEmpty) FALSE else OR(set.iterator.map(equal(sql, _)).toList) def where(sql: Source): Source = if_proper(sql, " WHERE " + sql) /* types */ object Type extends Enumeration { val Boolean = Value("BOOLEAN") val Int = Value("INTEGER") val Long = Value("BIGINT") val Double = Value("DOUBLE PRECISION") val String = Value("TEXT") val Bytes = Value("BLOB") val Date = Value("TIMESTAMP WITH TIME ZONE") } def sql_type_default(T: Type.Value): Source = T.toString def sql_type_sqlite(T: Type.Value): Source = if (T == Type.Boolean) "INTEGER" else if (T == Type.Date) "TEXT" else sql_type_default(T) def sql_type_postgresql(T: Type.Value): Source = if (T == Type.Bytes) "BYTEA" else sql_type_default(T) /* columns */ object Column { def bool(name: String, strict: Boolean = false, primary_key: Boolean = false): Column = Column(name, Type.Boolean, strict, primary_key) def int(name: String, strict: Boolean = false, primary_key: Boolean = false): Column = Column(name, Type.Int, strict, primary_key) def long(name: String, strict: Boolean = false, primary_key: Boolean = false): Column = Column(name, Type.Long, strict, primary_key) def double(name: String, strict: Boolean = false, primary_key: Boolean = false): Column = Column(name, Type.Double, strict, primary_key) def string(name: String, strict: Boolean = false, primary_key: Boolean = false): Column = Column(name, Type.String, strict, primary_key) def bytes(name: String, strict: Boolean = false, primary_key: Boolean = false): Column = Column(name, Type.Bytes, strict, primary_key) def date(name: String, strict: Boolean = false, primary_key: Boolean = false): Column = Column(name, 
Type.Date, strict, primary_key) } sealed case class Column( name: String, T: Type.Value, strict: Boolean = false, primary_key: Boolean = false, expr: SQL.Source = "" ) { def make_primary_key: Column = copy(primary_key = true) def apply(table: Table): Column = Column(Long_Name.qualify(table.name, name), T, strict = strict, primary_key = primary_key) def ident: Source = if (expr == "") SQL.ident(name) else enclose(expr) + " AS " + SQL.ident(name) def decl(sql_type: Type.Value => Source): Source = ident + " " + sql_type(T) + (if (strict || primary_key) " NOT NULL" else "") def defined: String = ident + " IS NOT NULL" def undefined: String = ident + " IS NULL" def equal(s: String): Source = SQL.equal(ident, s) def member(set: Iterable[String]): Source = SQL.member(ident, set) def where_equal(s: String): Source = SQL.where(equal(s)) def where_member(set: Iterable[String]): Source = SQL.where(member(set)) override def toString: Source = ident } def order_by(columns: List[Column], descending: Boolean = false): Source = " ORDER BY " + columns.mkString(", ") + (if (descending) " DESC" else "") /* tables */ sealed case class Table(name: String, columns: List[Column], body: Source = "") { Library.duplicates(columns.map(_.name)) match { case Nil => case bad => error("Duplicate column names " + commas_quote(bad) + " for table " + quote(name)) } def ident: Source = SQL.ident(name) def query: Source = if (body == "") error("Missing SQL body for table " + quote(name)) else SQL.enclose(body) def query_named: Source = query + " AS " + SQL.ident(name) def create(strict: Boolean, sql_type: Type.Value => Source): Source = { val primary_key = columns.filter(_.primary_key).map(_.name) match { case Nil => Nil case keys => List("PRIMARY KEY " + enclosure(keys)) } "CREATE TABLE " + (if (strict) "" else "IF NOT EXISTS ") + ident + " " + enclosure(columns.map(_.decl(sql_type)) ::: primary_key) } def create_index(index_name: String, index_columns: List[Column], strict: Boolean = false, unique: Boolean = false): Source = "CREATE " + (if (unique) "UNIQUE " else "") + "INDEX " + (if (strict) "" else "IF NOT EXISTS ") + SQL.ident(index_name) + " ON " + ident + " " + enclosure(index_columns.map(_.name)) def insert_cmd(cmd: Source = "INSERT", sql: Source = ""): Source = cmd + " INTO " + ident + " VALUES " + enclosure(columns.map(_ => "?")) + SQL.separate(sql) def insert(sql: Source = ""): Source = insert_cmd(sql = sql) def delete(sql: Source = ""): Source = "DELETE FROM " + ident + SQL.separate(sql) def update(update_columns: List[Column] = Nil, sql: Source = ""): Source = "UPDATE " + ident + " SET " + commas(update_columns.map(c => c.ident + " = ?")) + SQL.separate(sql) def select( select_columns: List[Column] = Nil, distinct: Boolean = false, sql: Source = "" ): Source = SQL.select(select_columns, distinct = distinct, sql = ident + SQL.separate(sql)) override def toString: Source = ident } /* table groups */ object Tables { def list(list: List[Table]): Tables = new Tables(list) + val empty: Tables = list(Nil) def apply(args: Table*): Tables = list(args.toList) } final class Tables private(val list: List[Table]) extends Iterable[Table] { override def toString: String = list.mkString("SQL.Tables(", ", ", ")") + def ::: (other: Tables): Tables = new Tables(other.list ::: list) + def iterator: Iterator[Table] = list.iterator // requires transaction def create_lock(db: Database): Unit = { foreach(db.create_table(_)) lock(db) } // requires transaction def lock(db: Database): Unit = { val sql = db.lock_tables(list) if 
(sql.nonEmpty) db.execute_statement(sql) } } /** SQL database operations **/ /* statements */ class Statement private[SQL](val db: Database, val rep: PreparedStatement) extends AutoCloseable { stmt => object bool { def update(i: Int, x: Boolean): Unit = rep.setBoolean(i, x) def update(i: Int, x: Option[Boolean]): Unit = { if (x.isDefined) update(i, x.get) else rep.setNull(i, java.sql.Types.BOOLEAN) } } object int { def update(i: Int, x: Int): Unit = rep.setInt(i, x) def update(i: Int, x: Option[Int]): Unit = { if (x.isDefined) update(i, x.get) else rep.setNull(i, java.sql.Types.INTEGER) } } object long { def update(i: Int, x: Long): Unit = rep.setLong(i, x) def update(i: Int, x: Option[Long]): Unit = { if (x.isDefined) update(i, x.get) else rep.setNull(i, java.sql.Types.BIGINT) } } object double { def update(i: Int, x: Double): Unit = rep.setDouble(i, x) def update(i: Int, x: Option[Double]): Unit = { if (x.isDefined) update(i, x.get) else rep.setNull(i, java.sql.Types.DOUBLE) } } object string { def update(i: Int, x: String): Unit = rep.setString(i, x) def update(i: Int, x: Option[String]): Unit = update(i, x.orNull) } object bytes { def update(i: Int, bytes: Bytes): Unit = { if (bytes == null) rep.setBytes(i, null) else rep.setBinaryStream(i, bytes.stream(), bytes.length) } def update(i: Int, bytes: Option[Bytes]): Unit = update(i, bytes.orNull) } object date { def update(i: Int, date: Date): Unit = db.update_date(stmt, i, date) def update(i: Int, date: Option[Date]): Unit = update(i, date.orNull) } def execute(): Boolean = rep.execute() def execute_query(): Result = new Result(this, rep.executeQuery()) def close(): Unit = rep.close() } /* results */ class Result private[SQL](val stmt: Statement, val rep: ResultSet) { res => def next(): Boolean = rep.next() def iterator[A](get: Result => A): Iterator[A] = new Iterator[A] { private var _next: Boolean = res.next() def hasNext: Boolean = _next def next(): A = { val x = get(res); _next = res.next(); x } } def bool(column: Column): Boolean = rep.getBoolean(column.name) def int(column: Column): Int = rep.getInt(column.name) def long(column: Column): Long = rep.getLong(column.name) def double(column: Column): Double = rep.getDouble(column.name) def string(column: Column): String = { val s = rep.getString(column.name) if (s == null) "" else s } def bytes(column: Column): Bytes = { val bs = rep.getBytes(column.name) if (bs == null) Bytes.empty else Bytes(bs) } def date(column: Column): Date = stmt.db.date(res, column) def timing(c1: Column, c2: Column, c3: Column): Timing = Timing(Time.ms(long(c1)), Time.ms(long(c2)), Time.ms(long(c3))) def get[A](column: Column, f: Column => A): Option[A] = { val x = f(column) if (rep.wasNull || x == null) None else Some(x) } def get_bool(column: Column): Option[Boolean] = get(column, bool) def get_int(column: Column): Option[Int] = get(column, int) def get_long(column: Column): Option[Long] = get(column, long) def get_double(column: Column): Option[Double] = get(column, double) def get_string(column: Column): Option[String] = get(column, string) def get_bytes(column: Column): Option[Bytes] = get(column, bytes) def get_date(column: Column): Option[Date] = get(column, date) } /* database */ trait Database extends AutoCloseable { db => def is_sqlite: Boolean = isInstanceOf[SQLite.Database] def is_postgresql: Boolean = isInstanceOf[PostgreSQL.Database] - def rebuild(): Unit = () + def vacuum(tables: SQL.Tables = SQL.Tables.empty): Unit def now(): Date /* types */ def sql_type(T: Type.Value): Source /* connection */ 
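// underlying JDBC connection; the concrete backend is recovered by matching on the connection class (JDBC4Connection vs. PGConnection)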
def connection: Connection def sqlite_connection: Option[JDBC4Connection] = connection match { case conn: JDBC4Connection => Some(conn) case _ => None } def postgresql_connection: Option[PGConnection] = connection match { case conn: PGConnection => Some(conn) case _ => None } def the_sqlite_connection: JDBC4Connection = sqlite_connection getOrElse error("SQLite connection expected, but found " + connection.getClass.getName) def the_postgresql_connection: PGConnection = postgresql_connection getOrElse error("PostgreSQL connection expected, but found " + connection.getClass.getName) def close(): Unit = connection.close() def transaction[A](body: => A): A = { val auto_commit = connection.getAutoCommit() try { connection.setAutoCommit(false) val savepoint = connection.setSavepoint() try { val result = body connection.commit() result } catch { case exn: Throwable => connection.rollback(savepoint); throw exn } } finally { connection.setAutoCommit(auto_commit) } } def transaction_lock[A](tables: Tables)(body: => A): A = transaction { tables.lock(db); body } def lock_tables(tables: List[Table]): Source = "" // PostgreSQL only /* statements and results */ def statement(sql: Source): Statement = new Statement(db, connection.prepareStatement(sql)) def using_statement[A](sql: Source)(f: Statement => A): A = using(statement(sql))(f) def execute_statement(sql: Source, body: Statement => Unit = _ => ()): Unit = using_statement(sql) { stmt => body(stmt); stmt.execute() } def execute_query_statement[A, B]( sql: Source, make_result: Iterator[A] => B, get: Result => A ): B = using_statement(sql)(stmt => make_result(stmt.execute_query().iterator(get))) def execute_query_statementO[A](sql: Source, get: Result => A): Option[A] = execute_query_statement[A, Option[A]](sql, _.nextOption, get) def execute_query_statementB(sql: Source): Boolean = using_statement(sql)(stmt => stmt.execute_query().next()) def update_date(stmt: Statement, i: Int, date: Date): Unit def date(res: Result, column: Column): Date def insert_permissive(table: Table, sql: Source = ""): Source /* tables and views */ def tables: List[String] = { val result = new mutable.ListBuffer[String] val rs = connection.getMetaData.getTables(null, null, "%", null) while (rs.next) { result += rs.getString(3) } result.toList } def create_table(table: Table, strict: Boolean = false, sql: Source = ""): Unit = execute_statement(table.create(strict, sql_type) + SQL.separate(sql)) def create_index(table: Table, name: String, columns: List[Column], strict: Boolean = false, unique: Boolean = false): Unit = execute_statement(table.create_index(name, columns, strict, unique)) def create_view(table: Table, strict: Boolean = false): Unit = { if (strict || !tables.contains(table.name)) { execute_statement("CREATE VIEW " + table + " AS " + { table.query; table.body }) } } } } /** SQLite **/ object SQLite { // see https://www.sqlite.org/lang_datefunc.html val date_format: Date.Format = Date.Format("uuuu-MM-dd HH:mm:ss.SSS x") lazy val init_jdbc: Unit = { val lib_path = Path.explode("$ISABELLE_SQLITE_HOME/" + Platform.jvm_platform) val lib_name = File.get_file(lib_path).file_name System.setProperty("org.sqlite.lib.path", File.platform_path(lib_path)) System.setProperty("org.sqlite.lib.name", lib_name) Class.forName("org.sqlite.JDBC") } def open_database(path: Path): Database = { init_jdbc val path0 = path.expand val s0 = File.platform_path(path0) val s1 = if (Platform.is_windows) s0.replace('\\', '/') else s0 val connection = DriverManager.getConnection("jdbc:sqlite:" + s1) 
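// wrap the raw JDBC connection; the expanded path serves as the printable database name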
new Database(path0.toString, connection) } class Database private[SQLite](name: String, val connection: Connection) extends SQL.Database { override def toString: String = name - override def rebuild(): Unit = execute_statement("VACUUM") + + override def vacuum(tables: SQL.Tables = SQL.Tables.empty): Unit = + execute_statement("VACUUM") // always FULL override def now(): Date = Date.now() def sql_type(T: SQL.Type.Value): SQL.Source = SQL.sql_type_sqlite(T) def update_date(stmt: SQL.Statement, i: Int, date: Date): Unit = if (date == null) stmt.string(i) = (null: String) else stmt.string(i) = date_format(date) def date(res: SQL.Result, column: SQL.Column): Date = proper_string(res.string(column)) match { case None => null case Some(s) => date_format.parse(s) } def insert_permissive(table: SQL.Table, sql: SQL.Source = ""): SQL.Source = table.insert_cmd(cmd = "INSERT OR IGNORE", sql = sql) } } /** PostgreSQL **/ object PostgreSQL { type Source = SQL.Source val default_port = 5432 lazy val init_jdbc: Unit = Class.forName("org.postgresql.Driver") def open_database( user: String, password: String, database: String = "", host: String = "", port: Int = 0, ssh: Option[SSH.Session] = None, ssh_close: Boolean = false, // see https://www.postgresql.org/docs/current/transaction-iso.html transaction_isolation: Int = Connection.TRANSACTION_SERIALIZABLE ): Database = { init_jdbc if (user == "") error("Undefined database user") val db_host = proper_string(host) getOrElse "localhost" val db_port = if (port > 0 && port != default_port) ":" + port else "" val db_name = "/" + (proper_string(database) getOrElse user) val (url, name, port_forwarding) = ssh match { case None => val spec = db_host + db_port + db_name val url = "jdbc:postgresql://" + spec val name = user + "@" + spec (url, name, None) case Some(ssh) => val fw = ssh.port_forwarding(remote_host = db_host, remote_port = if (port > 0) port else default_port, ssh_close = ssh_close) val url = "jdbc:postgresql://localhost:" + fw.port + db_name val name = user + "@" + fw + db_name + " via ssh " + ssh (url, name, Some(fw)) } try { val connection = DriverManager.getConnection(url, user, password) connection.setTransactionIsolation(transaction_isolation) new Database(name, connection, port_forwarding) } catch { case exn: Throwable => port_forwarding.foreach(_.close()); throw exn } } class Database private[PostgreSQL]( name: String, val connection: Connection, port_forwarding: Option[SSH.Port_Forwarding] ) extends SQL.Database { override def toString: String = name + override def vacuum(tables: SQL.Tables = SQL.Tables.empty): Unit = + execute_statement("VACUUM" + if_proper(tables.list, " " + commas(tables.list.map(_.ident)))) + override def now(): Date = { val now = SQL.Column.date("now") execute_query_statementO[Date]("SELECT NOW() as " + now.ident, res => res.date(now)) .getOrElse(error("Failed to get current date/time from database server " + toString)) } def sql_type(T: SQL.Type.Value): SQL.Source = SQL.sql_type_postgresql(T) // see https://jdbc.postgresql.org/documentation/head/8-date-time.html def update_date(stmt: SQL.Statement, i: Int, date: Date): Unit = if (date == null) stmt.rep.setObject(i, null) else stmt.rep.setObject(i, OffsetDateTime.from(date.to(Date.timezone_utc).rep)) def date(res: SQL.Result, column: SQL.Column): Date = { val obj = res.rep.getObject(column.name, classOf[OffsetDateTime]) if (obj == null) null else Date.instant(obj.toInstant) } def insert_permissive(table: SQL.Table, sql: SQL.Source = ""): SQL.Source = table.insert_cmd(sql = 
if_proper(sql, sql + " ") + "ON CONFLICT DO NOTHING") /* explicit locking: only applicable to PostgreSQL within transaction context */ // see https://www.postgresql.org/docs/current/sql-lock.html // see https://www.postgresql.org/docs/current/explicit-locking.html override def lock_tables(tables: List[SQL.Table]): PostgreSQL.Source = "LOCK TABLE " + tables.mkString(", ") + " IN ACCESS EXCLUSIVE MODE" /* notifications: IPC via database server */ // see https://www.postgresql.org/docs/current/sql-notify.html def listen(name: String): Unit = execute_statement("LISTEN " + SQL.ident(name)) def unlisten(name: String = "*"): Unit = execute_statement("UNLISTEN " + (if (name == "*") name else SQL.ident(name))) def notify(name: String, payload: String = ""): Unit = execute_statement("NOTIFY " + SQL.ident(name) + if_proper(payload, ", " + SQL.string(payload))) def get_notifications(): List[PGNotification] = the_postgresql_connection.getNotifications() match { case null => Nil case array => array.toList } override def close(): Unit = { super.close(); port_forwarding.foreach(_.close()) } } } diff --git a/src/Pure/ML/ml_process.scala b/src/Pure/ML/ml_process.scala --- a/src/Pure/ML/ml_process.scala +++ b/src/Pure/ML/ml_process.scala @@ -1,184 +1,187 @@ /* Title: Pure/ML/ml_process.scala Author: Makarius The raw ML process. */ package isabelle import java.util.{Map => JMap, HashMap} import java.io.{File => JFile} object ML_Process { + def bootstrap_shasum(): SHA1.Shasum = + SHA1.shasum_meta_info(SHA1.digest(Path.explode("$POLYML_EXE"))) + def session_heaps( store: Sessions.Store, session_background: Sessions.Background, logic: String = "" ): List[Path] = { val logic_name = Isabelle_System.default_logic(logic) session_background.sessions_structure.selection(logic_name). build_requirements(List(logic_name)). 
map(store.the_heap) } def apply( options: Options, session_background: Sessions.Background, session_heaps: List[Path], use_prelude: List[String] = Nil, eval_main: String = "", args: List[String] = Nil, modes: List[String] = Nil, cwd: JFile = null, env: JMap[String, String] = Isabelle_System.settings(), redirect: Boolean = false, cleanup: () => Unit = () => () ): Bash.Process = { val eval_init = if (session_heaps.isEmpty) { List( """ fun chapter (_: string) = (); fun section (_: string) = (); fun subsection (_: string) = (); fun subsubsection (_: string) = (); fun paragraph (_: string) = (); fun subparagraph (_: string) = (); val ML_file = PolyML.use; """, if (Platform.is_windows) "fun exit 0 = OS.Process.exit OS.Process.success" + " | exit 1 = OS.Process.exit OS.Process.failure" + " | exit rc = OS.Process.exit (RunCall.unsafeCast (Word8.fromInt rc))" else "fun exit rc = Posix.Process.exit (Word8.fromInt rc)", "PolyML.Compiler.prompt1 := \"Poly/ML> \"", "PolyML.Compiler.prompt2 := \"Poly/ML# \"") } else { List( "(PolyML.SaveState.loadHierarchy " + ML_Syntax.print_list( ML_Syntax.print_string_bytes)(session_heaps.map(File.platform_path)) + "; PolyML.print_depth 0)") } val eval_modes = if (modes.isEmpty) Nil else List("Print_Mode.add_modes " + ML_Syntax.print_list(ML_Syntax.print_string_bytes)(modes)) // options val eval_options = if (session_heaps.isEmpty) Nil else List("Options.load_default ()") val isabelle_process_options = Isabelle_System.tmp_file("options") Isabelle_System.chmod("600", File.path(isabelle_process_options)) File.write(isabelle_process_options, YXML.string_of_body(options.encode)) // session resources val eval_init_session = if (session_heaps.isEmpty) Nil else List("Resources.init_session_env ()") val init_session = Isabelle_System.tmp_file("init_session") Isabelle_System.chmod("600", File.path(init_session)) File.write(init_session, new Resources(session_background).init_session_yxml) // process val eval_process = proper_string(eval_main).getOrElse( if (session_heaps.isEmpty) { "PolyML.print_depth " + ML_Syntax.print_int(options.int("ML_print_depth")) } else "Isabelle_Process.init ()") // ISABELLE_TMP val isabelle_tmp = Isabelle_System.tmp_dir("process") val ml_runtime_options = { val ml_options = Word.explode(Isabelle_System.getenv("ML_OPTIONS")) val ml_options1 = if (ml_options.exists(_.containsSlice("gcthreads"))) ml_options else ml_options ::: List("--gcthreads", options.int("threads").toString) val ml_options2 = if (!Platform.is_windows || ml_options.exists(_.containsSlice("codepage"))) ml_options1 else ml_options1 ::: List("--codepage", "utf8") ml_options2 ::: List("--exportstats") } // bash val bash_args = ml_runtime_options ::: (eval_init ::: eval_modes ::: eval_options ::: eval_init_session).flatMap(List("--eval", _)) ::: use_prelude.flatMap(List("--use", _)) ::: List("--eval", eval_process) ::: args val bash_env = new HashMap(env) bash_env.put("ISABELLE_PROCESS_OPTIONS", File.standard_path(isabelle_process_options)) bash_env.put("ISABELLE_INIT_SESSION", File.standard_path(init_session)) bash_env.put("ISABELLE_TMP", File.standard_path(isabelle_tmp)) bash_env.put("POLYSTATSDIR", isabelle_tmp.getAbsolutePath) val process_policy = options.string("process_policy") val process_prefix = if_proper(process_policy, process_policy + " ") Bash.process(process_prefix + "\"$POLYML_EXE\" -q " + Bash.strings(bash_args), cwd = cwd, env = bash_env, redirect = redirect, cleanup = { () => isabelle_process_options.delete init_session.delete Isabelle_System.rm_tree(isabelle_tmp) 
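// the caller-supplied cleanup hook runs last, after the temporary option/session files and ISABELLE_TMP have been discarded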
cleanup() }) } /* Isabelle tool wrapper */ val isabelle_tool = Isabelle_Tool("process", "raw ML process (batch mode)", Scala_Project.here, { args => var dirs: List[Path] = Nil var eval_args: List[String] = Nil var logic = Isabelle_System.getenv("ISABELLE_LOGIC") var modes: List[String] = Nil var options = Options.init() val getopts = Getopts(""" Usage: isabelle process [OPTIONS] Options are: -T THEORY load theory -d DIR include session directory -e ML_EXPR evaluate ML expression on startup -f ML_FILE evaluate ML file on startup -l NAME logic session name (default ISABELLE_LOGIC=""" + quote(logic) + """) -m MODE add print mode for output -o OPTION override Isabelle system OPTION (via NAME=VAL or NAME) Run the raw Isabelle ML process in batch mode. """, "T:" -> (arg => eval_args = eval_args ::: List("--eval", "use_thy " + ML_Syntax.print_string_bytes(arg))), "d:" -> (arg => dirs = dirs ::: List(Path.explode(arg))), "e:" -> (arg => eval_args = eval_args ::: List("--eval", arg)), "f:" -> (arg => eval_args = eval_args ::: List("--use", arg)), "l:" -> (arg => logic = arg), "m:" -> (arg => modes = arg :: modes), "o:" -> (arg => options = options + arg)) val more_args = getopts(args) if (args.isEmpty || more_args.nonEmpty) getopts.usage() val store = Sessions.store(options) val session_background = Sessions.background(options, logic, dirs = dirs).check_errors val session_heaps = ML_Process.session_heaps(store, session_background, logic = logic) val result = ML_Process(options, session_background, session_heaps, args = eval_args, modes = modes) .result( progress_stdout = Output.writeln(_, stdout = true), progress_stderr = Output.writeln(_)) sys.exit(result.rc) }) } diff --git a/src/Pure/ROOT.ML b/src/Pure/ROOT.ML --- a/src/Pure/ROOT.ML +++ b/src/Pure/ROOT.ML @@ -1,369 +1,368 @@ (* Title: Pure/ROOT.ML Author: Makarius Main entry point for the Isabelle/Pure bootstrap process. Note: When this file is open in the Prover IDE, the ML files of Isabelle/Pure can be explored interactively. This is a separate copy of Pure within Pure: it does not affect the running logic session. 
*) chapter "Isabelle/Pure bootstrap"; ML_file "ML/ml_name_space.ML"; section "Bootstrap phase 0: Poly/ML setup"; ML_file "ML/ml_init.ML"; ML_file "ML/ml_system.ML"; ML_file "General/basics.ML"; ML_file "General/symbol_explode.ML"; ML_file "Concurrent/multithreading.ML"; ML_file "Concurrent/unsynchronized.ML"; ML_file "Concurrent/synchronized.ML"; ML_file "Concurrent/counter.ML"; ML_file "ML/ml_heap.ML"; ML_file "ML/ml_print_depth0.ML"; ML_file "ML/ml_pretty.ML"; ML_file "ML/ml_compiler0.ML"; section "Bootstrap phase 1: towards ML within position context"; subsection "Library of general tools"; ML_file "library.ML"; ML_file "General/print_mode.ML"; ML_file "General/alist.ML"; ML_file "General/table.ML"; ML_file "General/random.ML"; ML_file "General/value.ML"; ML_file "General/properties.ML"; ML_file "General/output.ML"; ML_file "PIDE/markup.ML"; ML_file "General/utf8.ML"; ML_file "General/scan.ML"; ML_file "General/source.ML"; ML_file "General/symbol.ML"; ML_file "General/position.ML"; ML_file "General/symbol_pos.ML"; ML_file "General/input.ML"; ML_file "General/comment.ML"; ML_file "General/antiquote.ML"; ML_file "ML/ml_lex.ML"; ML_file "ML/ml_compiler1.ML"; section "Bootstrap phase 2: towards ML within Isar context"; subsection "Library of general tools"; ML_file "General/integer.ML"; ML_file "General/stack.ML"; ML_file "General/queue.ML"; ML_file "General/heap.ML"; ML_file "General/same.ML"; ML_file "General/ord_list.ML"; ML_file "General/balanced_tree.ML"; ML_file "General/linear_set.ML"; ML_file "General/buffer.ML"; ML_file "General/pretty.ML"; ML_file "General/rat.ML"; ML_file "PIDE/xml.ML"; ML_file "General/path.ML"; ML_file "General/url.ML"; ML_file "System/bash.ML"; ML_file "General/file_stream.ML"; ML_file "General/bytes.ML"; ML_file "General/file.ML"; ML_file "General/long_name.ML"; ML_file "General/binding.ML"; ML_file "General/seq.ML"; ML_file "General/time.ML"; ML_file "General/timing.ML"; ML_file "General/sha1.ML"; ML_file "PIDE/yxml.ML"; ML_file "ML/ml_profiling.ML"; ML_file "PIDE/byte_message.ML"; ML_file "PIDE/protocol_message.ML"; ML_file "PIDE/document_id.ML"; ML_file "General/socket_io.ML"; ML_file "General/change_table.ML"; ML_file "General/graph.ML"; ML_file "System/options.ML"; subsection "Fundamental structures"; ML_file "name.ML"; ML_file "term.ML"; ML_file "context.ML"; ML_file "config.ML"; ML_file "context_position.ML"; ML_file "soft_type_system.ML"; subsection "Concurrency within the ML runtime"; ML_file "ML/exn_properties.ML"; ML_file_no_debug "ML/exn_debugger.ML"; ML_file "Concurrent/thread_data_virtual.ML"; ML_file "Concurrent/isabelle_thread.ML"; ML_file "Concurrent/single_assignment.ML"; ML_file "Concurrent/par_exn.ML"; ML_file "Concurrent/task_queue.ML"; ML_file "Concurrent/future.ML"; ML_file "Concurrent/event_timer.ML"; ML_file "Concurrent/timeout.ML"; ML_file "Concurrent/lazy.ML"; ML_file "Concurrent/par_list.ML"; ML_file "Concurrent/mailbox.ML"; ML_file "Concurrent/cache.ML"; ML_file "PIDE/active.ML"; ML_file "Thy/export.ML"; subsection "Inner syntax"; ML_file "Syntax/type_annotation.ML"; ML_file "Syntax/term_position.ML"; ML_file "Syntax/lexicon.ML"; ML_file "Syntax/ast.ML"; ML_file "Syntax/syntax_ext.ML"; ML_file "Syntax/parser.ML"; ML_file "Syntax/syntax_trans.ML"; ML_file "Syntax/mixfix.ML"; ML_file "Syntax/printer.ML"; ML_file "Syntax/syntax.ML"; subsection "Core of tactical proof system"; ML_file "term_ord.ML"; ML_file "term_items.ML"; ML_file "term_subst.ML"; ML_file "General/completion.ML"; ML_file "General/name_space.ML"; ML_file "sorts.ML"; 
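(*note: loading order is significant -- each ML file may refer only to structures compiled before it*)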
ML_file "type.ML"; ML_file "logic.ML"; ML_file "Syntax/simple_syntax.ML"; ML_file "net.ML"; ML_file "item_net.ML"; ML_file "envir.ML"; ML_file "consts.ML"; ML_file "term_xml.ML"; ML_file "primitive_defs.ML"; ML_file "sign.ML"; ML_file "defs.ML"; ML_file "term_sharing.ML"; ML_file "pattern.ML"; ML_file "unify.ML"; ML_file "theory.ML"; ML_file "proofterm.ML"; ML_file "thm.ML"; ML_file "cterm_items.ML"; ML_file "more_pattern.ML"; ML_file "more_unify.ML"; ML_file "more_thm.ML"; ML_file "facts.ML"; ML_file "thm_name.ML"; ML_file "global_theory.ML"; ML_file "pure_thy.ML"; ML_file "drule.ML"; ML_file "morphism.ML"; ML_file "variable.ML"; ML_file "conv.ML"; ML_file "goal_display.ML"; ML_file "tactical.ML"; ML_file "search.ML"; ML_file "tactic.ML"; ML_file "raw_simplifier.ML"; ML_file "conjunction.ML"; ML_file "assumption.ML"; subsection "Isar -- Intelligible Semi-Automated Reasoning"; (*ML support and global execution*) ML_file "ML/ml_syntax.ML"; ML_file "ML/ml_env.ML"; ML_file "ML/ml_options.ML"; ML_file "ML/ml_print_depth.ML"; ML_file_no_debug "Isar/runtime.ML"; ML_file "PIDE/execution.ML"; ML_file "ML/ml_compiler.ML"; ML_file "skip_proof.ML"; ML_file "goal.ML"; (*outer syntax*) ML_file "Isar/keyword.ML"; ML_file "Isar/token.ML"; ML_file "Isar/parse.ML"; ML_file "Thy/document_source.ML"; ML_file "Thy/thy_header.ML"; ML_file "Thy/document_marker.ML"; (*proof context*) ML_file "Isar/object_logic.ML"; ML_file "Isar/rule_cases.ML"; ML_file "Isar/auto_bind.ML"; ML_file "type_infer.ML"; ML_file "Syntax/local_syntax.ML"; ML_file "Isar/proof_context.ML"; ML_file "type_infer_context.ML"; ML_file "Syntax/syntax_phases.ML"; ML_file "Isar/args.ML"; (*theory specifications*) ML_file "Isar/local_defs.ML"; ML_file "Isar/local_theory.ML"; ML_file "Isar/entity.ML"; ML_file "PIDE/command_span.ML"; ML_file "Thy/thy_element.ML"; ML_file "Thy/markdown.ML"; ML_file "Thy/latex.ML"; (*ML with context and antiquotations*) ML_file "ML/ml_context.ML"; ML_file "ML/ml_antiquotation.ML"; ML_file "ML/ml_compiler2.ML"; ML_file "ML/ml_antiquotations.ML"; section "Bootstrap phase 3: towards theory Pure and final ML toplevel setup"; (*basic proof engine*) ML_file "par_tactical.ML"; ML_file "context_tactic.ML"; ML_file "Isar/attrib.ML"; ML_file "Isar/proof_display.ML"; ML_file "Isar/context_rules.ML"; ML_file "Isar/method.ML"; ML_file "Isar/proof.ML"; ML_file "Isar/element.ML"; ML_file "Isar/obtain.ML"; ML_file "Isar/subgoal.ML"; ML_file "Isar/calculation.ML"; (*local theories and targets*) ML_file "Isar/locale.ML"; ML_file "Isar/generic_target.ML"; ML_file "Isar/bundle.ML"; ML_file "Isar/overloading.ML"; ML_file "axclass.ML"; ML_file "Isar/class.ML"; ML_file "Isar/named_target.ML"; ML_file "Isar/expression.ML"; ML_file "Isar/interpretation.ML"; ML_file "Isar/class_declaration.ML"; ML_file "Isar/target_context.ML"; ML_file "Isar/experiment.ML"; ML_file "simplifier.ML"; ML_file "Tools/plugin.ML"; (*executable theory content*) ML_file "Isar/code.ML"; (*specifications*) ML_file "Isar/spec_rules.ML"; ML_file "Isar/specification.ML"; ML_file "Isar/parse_spec.ML"; ML_file "Isar/typedecl.ML"; (*toplevel transactions*) ML_file "Isar/proof_node.ML"; ML_file "Isar/toplevel.ML"; (*proof term operations*) ML_file "Proof/proof_rewrite_rules.ML"; ML_file "Proof/proof_syntax.ML"; ML_file "Proof/proof_checker.ML"; ML_file "Proof/extraction.ML"; (*Isabelle system*) ML_file "PIDE/protocol_command.ML"; ML_file "System/java.ML"; ML_file "System/scala.ML"; ML_file "System/process_result.ML"; ML_file "System/isabelle_system.ML"; (*theory documents*) 
ML_file "Thy/term_style.ML"; ML_file "Isar/outer_syntax.ML"; ML_file "Thy/document_antiquotation.ML"; ML_file "Thy/document_output.ML"; ML_file "Thy/document_antiquotations.ML"; ML_file "General/graph_display.ML"; ML_file "pure_syn.ML"; ML_file "PIDE/resources.ML"; ML_file "PIDE/command.ML"; ML_file "PIDE/query_operation.ML"; ML_file "Thy/thy_info.ML"; ML_file "thm_deps.ML"; ML_file "Thy/export_theory.ML"; ML_file "Thy/sessions.ML"; ML_file "PIDE/session.ML"; ML_file "PIDE/document.ML"; (*ML add-ons*) ML_file "ML/ml_pp.ML"; ML_file "ML/ml_thms.ML"; ML_file "ML/ml_instantiate.ML"; ML_file "ML/ml_file.ML"; ML_file "ML/ml_pid.ML"; (*theory and proof operations*) ML_file "Isar/isar_cmd.ML"; subsection "Isabelle/Isar system"; ML_file "System/command_line.ML"; ML_file "System/message_channel.ML"; ML_file "System/isabelle_process.ML"; ML_file "System/scala_compiler.ML"; ML_file "System/isabelle_tool.ML"; ML_file "Thy/bibtex.ML"; ML_file "PIDE/protocol.ML"; ML_file "General/output_primitives_virtual.ML"; subsection "Miscellaneous tools and packages for Pure Isabelle"; ML_file "General/base64.ML"; ML_file "General/xz.ML"; ML_file "General/zstd.ML"; ML_file "Tools/prismjs.ML"; ML_file "Tools/build.ML"; ML_file "Tools/named_thms.ML"; ML_file "Tools/print_operation.ML"; ML_file "Tools/rail.ML"; ML_file "Tools/rule_insts.ML"; ML_file "Tools/thy_deps.ML"; ML_file "Tools/class_deps.ML"; ML_file "Tools/find_theorems.ML"; ML_file "Tools/find_consts.ML"; ML_file "Tools/simplifier_trace.ML"; ML_file_no_debug "Tools/debugger.ML"; ML_file "Tools/named_theorems.ML"; ML_file "Tools/doc.ML"; ML_file "Tools/jedit.ML"; ML_file "Tools/ghc.ML"; ML_file "Tools/generated_files.ML"; - diff --git a/src/Pure/Tools/build.scala b/src/Pure/Tools/build.scala --- a/src/Pure/Tools/build.scala +++ b/src/Pure/Tools/build.scala @@ -1,678 +1,681 @@ /* Title: Pure/Tools/build.scala Author: Makarius Options: :folding=explicit: Command-line tools to build and manage Isabelle sessions. 
*/ package isabelle import scala.collection.mutable import scala.util.matching.Regex object Build { /** "isabelle build" **/ /* results */ object Results { def apply(context: Build_Process.Context, results: Map[String, Process_Result]): Results = new Results(context.store, context.build_deps, results) } class Results private( val store: Sessions.Store, val deps: Sessions.Deps, results: Map[String, Process_Result] ) { def cache: Term.Cache = store.cache def sessions_ok: List[String] = (for { name <- deps.sessions_structure.build_topological_order.iterator result <- results.get(name) if result.ok } yield name).toList def info(name: String): Sessions.Info = deps.sessions_structure(name) def sessions: Set[String] = results.keySet def cancelled(name: String): Boolean = !results(name).defined def apply(name: String): Process_Result = results(name).strict val rc: Int = results.valuesIterator.map(_.strict.rc).foldLeft(Process_Result.RC.ok)(_ max _) def ok: Boolean = rc == Process_Result.RC.ok def unfinished: List[String] = sessions.iterator.filterNot(apply(_).ok).toList.sorted override def toString: String = rc.toString } /* engine */ class Engine(val name: String) extends Isabelle_System.Service { override def toString: String = name def init(build_context: Build_Process.Context, build_progress: Progress): Build_Process = new Build_Process(build_context, build_progress) } class Default_Engine extends Engine("") { override def toString: String = "<default>" } lazy val engines: List[Engine] = Isabelle_System.make_services(classOf[Engine]) def get_engine(name: String): Engine = engines.find(_.name == name).getOrElse(error("Bad build engine " + quote(name))) /* options */ def hostname(options: Options): String = Isabelle_System.hostname(options.string("build_hostname")) def build_init(options: Options, cache: Term.Cache = Term.Cache.make()): Sessions.Store = { val build_options = options + "completion_limit=0" + "editor_tracing_messages=0" + ("pide_reports=" + options.bool("build_pide_reports")) val store = Sessions.store(build_options, cache = cache) Isabelle_Fonts.init() store } /* build */ def build( options: Options, selection: Sessions.Selection = Sessions.Selection.empty, browser_info: Browser_Info.Config = Browser_Info.Config.none, progress: Progress = new Progress, check_unknown_files: Boolean = false, build_heap: Boolean = false, clean_build: Boolean = false, dirs: List[Path] = Nil, select_dirs: List[Path] = Nil, infos: List[Sessions.Info] = Nil, numa_shuffling: Boolean = false, max_jobs: Int = 1, list_files: Boolean = false, check_keywords: Set[String] = Set.empty, fresh_build: Boolean = false, no_build: Boolean = false, soft_build: Boolean = false, export_files: Boolean = false, augment_options: String => List[Options.Spec] = _ => Nil, session_setup: (String, Session) => Unit = (_, _) => (), cache: Term.Cache = Term.Cache.make() ): Results = { val store = build_init(options, cache = cache) val build_options = store.options /* session selection and dependencies */ val full_sessions = Sessions.load_structure(build_options, dirs = dirs, select_dirs = select_dirs, infos = infos, augment_options = augment_options) val full_sessions_selection = full_sessions.imports_selection(selection) val build_deps = { val deps0 = Sessions.deps(full_sessions.selection(selection), progress = progress, inlined_files = true, list_files = list_files, check_keywords = check_keywords).check_errors if (soft_build && !fresh_build) { val outdated = deps0.sessions_structure.build_topological_order.flatMap(name =>
store.try_open_database(name) match { case Some(db) => using(db)(store.read_build(_, name)) match { case Some(build) if build.ok && build.sources == deps0.sources_shasum(name) => None case _ => Some(name) } case None => Some(name) }) Sessions.deps(full_sessions.selection(Sessions.Selection(sessions = outdated)), progress = progress, inlined_files = true).check_errors } else deps0 } /* check unknown files */ if (check_unknown_files) { val source_files = (for { (_, base) <- build_deps.session_bases.iterator (path, _) <- base.session_sources.iterator } yield path).toList Mercurial.check_files(source_files)._2 match { case Nil => case unknown_files => progress.echo_warning("Unknown files (not part of the underlying Mercurial repository):" + unknown_files.map(File.standard_path).sorted.mkString("\n ", "\n ", "")) } } /* build process and results */ val build_context = Build_Process.Context(store, build_deps, progress = progress, hostname = hostname(build_options), build_heap = build_heap, numa_shuffling = numa_shuffling, max_jobs = max_jobs, fresh_build = fresh_build, no_build = no_build, session_setup = session_setup, master = true) store.prepare_output() build_context.prepare_database() if (clean_build) { for (name <- full_sessions.imports_descendants(full_sessions_selection)) { val (relevant, ok) = store.clean_output(name) if (relevant) { if (ok) progress.echo("Cleaned " + name) else progress.echo(name + " FAILED to clean") } } } val results = Isabelle_Thread.uninterruptible { val engine = get_engine(build_options.string("build_engine")) using(engine.init(build_context, progress)) { build_process => val res = build_process.run() Results(build_context, res) } } if (export_files) { for (name <- full_sessions_selection.iterator if results(name).ok) { val info = results.info(name) if (info.export_files.nonEmpty) { progress.echo("Exporting " + info.name + " ...") for ((dir, prune, pats) <- info.export_files) { Export.export_files(store, name, info.dir + dir, progress = if (progress.verbose) progress else new Progress, export_prune = prune, export_patterns = pats) } } } } val presentation_sessions = results.sessions_ok.filter(name => browser_info.enabled(results.info(name))) if (presentation_sessions.nonEmpty && !progress.stopped) { Browser_Info.build(browser_info, results.store, results.deps, presentation_sessions, progress = progress) } if (!results.ok && (progress.verbose || !no_build)) { progress.echo("Unfinished session(s): " + commas(results.unfinished)) } results } /* build logic image */ def build_logic(options: Options, logic: String, progress: Progress = new Progress, build_heap: Boolean = false, dirs: List[Path] = Nil, fresh: Boolean = false, strict: Boolean = false ): Int = { val selection = Sessions.Selection.session(logic) val rc = if (!fresh && build(options, selection = selection, build_heap = build_heap, no_build = true, dirs = dirs).ok) Process_Result.RC.ok else { progress.echo("Build started for Isabelle/" + logic + " ...") build(options, selection = selection, progress = progress, build_heap = build_heap, fresh_build = fresh, dirs = dirs).rc } if (strict && rc != Process_Result.RC.ok) error("Failed to build Isabelle/" + logic) else rc } /* command-line wrapper */ val isabelle_tool1 = Isabelle_Tool("build", "build and manage Isabelle sessions", Scala_Project.here, { args => var base_sessions: List[String] = Nil var select_dirs: List[Path] = Nil var numa_shuffling = false var browser_info = Browser_Info.Config.none var requirements = false var soft_build = false var 
exclude_session_groups: List[String] = Nil var all_sessions = false var build_heap = false var clean_build = false var dirs: List[Path] = Nil var export_files = false var fresh_build = false var session_groups: List[String] = Nil var max_jobs = 1 var check_keywords: Set[String] = Set.empty var list_files = false var no_build = false var options = Options.init(specs = Options.Spec.ISABELLE_BUILD_OPTIONS) var verbose = false var exclude_sessions: List[String] = Nil val getopts = Getopts(""" Usage: isabelle build [OPTIONS] [SESSIONS ...] Options are: -B NAME include session NAME and all descendants -D DIR include session directory and select its sessions -N cyclic shuffling of NUMA CPU nodes (performance tuning) -P DIR enable HTML/PDF presentation in directory (":" for default) -R refer to requirements of selected sessions -S soft build: only observe changes of sources, not heap images -X NAME exclude sessions from group NAME and all descendants -a select all sessions -b build heap images -c clean build -d DIR include session directory -e export files from session specification into file-system -f fresh build -g NAME select session group NAME -j INT maximum number of parallel jobs (default 1) -k KEYWORD check theory sources for conflicts with proposed keywords -l list session source files -n no build -- take existing session build databases -o OPTION override Isabelle system OPTION (via NAME=VAL or NAME) -v verbose -x NAME exclude session NAME and all descendants - Build and manage Isabelle sessions, depending on implicit settings: + Build and manage Isabelle sessions: ML heaps, session databases, documents. -""" + Library.indent_lines(2, Build_Log.Settings.show()) + "\n", + Notable system options: see "isabelle options -l -t build" + + Notable system settings: +""" + Library.indent_lines(4, Build_Log.Settings.show()) + "\n", "B:" -> (arg => base_sessions = base_sessions ::: List(arg)), "D:" -> (arg => select_dirs = select_dirs ::: List(Path.explode(arg))), "N" -> (_ => numa_shuffling = true), "P:" -> (arg => browser_info = Browser_Info.Config.make(arg)), "R" -> (_ => requirements = true), "S" -> (_ => soft_build = true), "X:" -> (arg => exclude_session_groups = exclude_session_groups ::: List(arg)), "a" -> (_ => all_sessions = true), "b" -> (_ => build_heap = true), "c" -> (_ => clean_build = true), "d:" -> (arg => dirs = dirs ::: List(Path.explode(arg))), "e" -> (_ => export_files = true), "f" -> (_ => fresh_build = true), "g:" -> (arg => session_groups = session_groups ::: List(arg)), "j:" -> (arg => max_jobs = Value.Int.parse(arg)), "k:" -> (arg => check_keywords = check_keywords + arg), "l" -> (_ => list_files = true), "n" -> (_ => no_build = true), "o:" -> (arg => options = options + arg), "v" -> (_ => verbose = true), "x:" -> (arg => exclude_sessions = exclude_sessions ::: List(arg))) val sessions = getopts(args) val progress = new Console_Progress(verbose = verbose) val start_date = Date.now() progress.echo( "Started at " + Build_Log.print_date(start_date) + " (" + Isabelle_System.getenv("ML_IDENTIFIER") + " on " + hostname(options) +")", verbose = true) progress.echo(Build_Log.Settings.show() + "\n", verbose = true) val results = progress.interrupt_handler { build(options, selection = Sessions.Selection( requirements = requirements, all_sessions = all_sessions, base_sessions = base_sessions, exclude_session_groups = exclude_session_groups, exclude_sessions = exclude_sessions, session_groups = session_groups, sessions = sessions), browser_info = browser_info, progress = progress, 
check_unknown_files = Mercurial.is_repository(Path.ISABELLE_HOME), build_heap = build_heap, clean_build = clean_build, dirs = dirs, select_dirs = select_dirs, numa_shuffling = Host.numa_check(progress, numa_shuffling), max_jobs = max_jobs, list_files = list_files, check_keywords = check_keywords, fresh_build = fresh_build, no_build = no_build, soft_build = soft_build, export_files = export_files) } val stop_date = Date.now() val elapsed_time = stop_date.time - start_date.time progress.echo("\nFinished at " + Build_Log.print_date(stop_date), verbose = true) val total_timing = results.sessions.iterator.map(a => results(a).timing).foldLeft(Timing.zero)(_ + _). copy(elapsed = elapsed_time) progress.echo(total_timing.message_resources) sys.exit(results.rc) }) /** "isabelle build_worker" **/ /* build_worker */ def build_worker( options: Options, build_uuid: String, progress: Progress = new Progress, dirs: List[Path] = Nil, infos: List[Sessions.Info] = Nil, numa_shuffling: Boolean = false, max_jobs: Int = 1, session_setup: (String, Session) => Unit = (_, _) => (), cache: Term.Cache = Term.Cache.make() ): Results = { val store = build_init(options, cache = cache) val build_options = store.options progress.echo("build worker for " + build_uuid) progress.echo_warning("FIXME") ??? } /* command-line wrapper */ val isabelle_tool2 = Isabelle_Tool("build_worker", "external worker for session build process", Scala_Project.here, { args => var numa_shuffling = false var dirs: List[Path] = Nil var max_jobs = 1 var options = Options.init(specs = Options.Spec.ISABELLE_BUILD_OPTIONS) var verbose = false var build_uuid = "" val getopts = Getopts(""" Usage: isabelle build_worker [OPTIONS] Options are: -N cyclic shuffling of NUMA CPU nodes (performance tuning) -U UUID Universally Unique Identifier of the build process -d DIR include session directory -j INT maximum number of parallel jobs (default 1) -o OPTION override Isabelle system OPTION (via NAME=VAL or NAME) -v verbose Run as external worker for session build process, as identified via option -U UUID.
-""" + Library.indent_lines(2, Build_Log.Settings.show()) + "\n", +""", "N" -> (_ => numa_shuffling = true), "U:" -> (arg => build_uuid = arg), "d:" -> (arg => dirs = dirs ::: List(Path.explode(arg))), "j:" -> (arg => max_jobs = Value.Int.parse(arg)), "o:" -> (arg => options = options + arg), "v" -> (_ => verbose = true)) val more_args = getopts(args) if (more_args.nonEmpty) getopts.usage() if (build_uuid.isEmpty) error("Missing UUID for build process (option -U)") val progress = new Console_Progress(verbose = verbose) val results = progress.interrupt_handler { build_worker(options, build_uuid, progress = progress, dirs = dirs, numa_shuffling = Host.numa_check(progress, numa_shuffling), max_jobs = max_jobs) } sys.exit(results.rc) }) /** "isabelle build_log" **/ /* theory markup/messages from session database */ def read_theory( theory_context: Export.Theory_Context, unicode_symbols: Boolean = false ): Option[Document.Snapshot] = { def decode_bytes(bytes: Bytes): String = Symbol.output(unicode_symbols, UTF8.decode_permissive(bytes)) def read(name: String): Export.Entry = theory_context(name, permissive = true) def read_xml(name: String): XML.Body = YXML.parse_body(decode_bytes(read(name).bytes), cache = theory_context.cache) def read_source_file(name: String): Sessions.Source_File = theory_context.session_context.source_file(name) for { id <- theory_context.document_id() (thy_file, blobs_files) <- theory_context.files(permissive = true) } yield { val master_dir = Path.explode(Url.strip_base_name(thy_file).getOrElse( error("Cannot determine theory master directory: " + quote(thy_file)))) val blobs = blobs_files.map { name => val path = Path.explode(name) val src_path = File.relative_path(master_dir, path).getOrElse(path) val file = read_source_file(name) val bytes = file.bytes val text = decode_bytes(bytes) val chunk = Symbol.Text_Chunk(text) Command.Blob(Document.Node.Name(name), src_path, Some((file.digest, chunk))) -> Document.Blobs.Item(bytes, text, chunk, changed = false) } val thy_source = decode_bytes(read_source_file(thy_file).bytes) val thy_xml = read_xml(Export.MARKUP) val blobs_xml = for (i <- (1 to blobs.length).toList) yield read_xml(Export.MARKUP + i) val markups_index = Command.Markup_Index.make(blobs.map(_._1)) val markups = Command.Markups.make( for ((index, xml) <- markups_index.zip(thy_xml :: blobs_xml)) yield index -> Markup_Tree.from_XML(xml)) val results = Command.Results.make( for (elem @ XML.Elem(Markup(_, Markup.Serial(i)), _) <- read_xml(Export.MESSAGES)) yield i -> elem) val command = Command.unparsed(thy_source, theory = true, id = id, node_name = Document.Node.Name(thy_file, theory = theory_context.theory), blobs_info = Command.Blobs_Info.make(blobs), markups = markups, results = results) val doc_blobs = Document.Blobs.make(blobs) Document.State.init.snippet(command, doc_blobs) } } /* print messages */ def print_log( options: Options, sessions: List[String], theories: List[String] = Nil, message_head: List[Regex] = Nil, message_body: List[Regex] = Nil, progress: Progress = new Progress, margin: Double = Pretty.default_margin, breakgain: Double = Pretty.default_breakgain, metric: Pretty.Metric = Symbol.Metric, unicode_symbols: Boolean = false ): Unit = { val store = Sessions.store(options) val session = new Session(options, Resources.bootstrap) def check(filter: List[Regex], make_string: => String): Boolean = filter.isEmpty || { val s = Output.clean_yxml(make_string) filter.forall(r => r.findFirstIn(Output.clean_yxml(s)).nonEmpty) } def print(session_name: 
String): Unit = { using(Export.open_session_context0(store, session_name)) { session_context => val result = for { db <- session_context.session_db() theories = store.read_theories(db, session_name) errors = store.read_errors(db, session_name) info <- store.read_build(db, session_name) } yield (theories, errors, info.return_code) result match { case None => store.error_database(session_name) case Some((used_theories, errors, rc)) => theories.filterNot(used_theories.toSet) match { case Nil => case bad => error("Unknown theories " + commas_quote(bad)) } val print_theories = if (theories.isEmpty) used_theories else used_theories.filter(theories.toSet) for (thy <- print_theories) { val thy_heading = "\nTheory " + quote(thy) + " (in " + session_name + ")" + ":" Build_Job.read_theory(session_context.theory(thy), unicode_symbols = unicode_symbols) match { case None => progress.echo(thy_heading + " MISSING") case Some(snapshot) => val rendering = new Rendering(snapshot, options, session) val messages = rendering.text_messages(Text.Range.full) .filter(message => progress.verbose || Protocol.is_exported(message.info)) if (messages.nonEmpty) { val line_document = Line.Document(snapshot.node.source) val buffer = new mutable.ListBuffer[String] for (Text.Info(range, elem) <- messages) { val line = line_document.position(range.start).line1 val pos = Position.Line_File(line, snapshot.node_name.node) def message_text: String = Protocol.message_text(elem, heading = true, pos = pos, margin = margin, breakgain = breakgain, metric = metric) val ok = check(message_head, Protocol.message_heading(elem, pos)) && check(message_body, XML.content(Pretty.unformatted(List(elem)))) if (ok) buffer += message_text } if (buffer.nonEmpty) { progress.echo(thy_heading) buffer.foreach(progress.echo(_)) } } } } if (errors.nonEmpty) { val msg = Symbol.output(unicode_symbols, cat_lines(errors)) progress.echo("\nBuild errors:\n" + Output.error_message_text(msg)) } if (rc != Process_Result.RC.ok) { progress.echo("\n" + Process_Result.RC.print_long(rc)) } } } } val errors = new mutable.ListBuffer[String] for (session_name <- sessions) { Exn.interruptible_capture(print(session_name)) match { case Exn.Res(_) => case Exn.Exn(exn) => errors += Exn.message(exn) } } if (errors.nonEmpty) error(cat_lines(errors.toList)) } /* command-line wrapper */ val isabelle_tool3 = Isabelle_Tool("build_log", "print messages from session build database", Scala_Project.here, { args => /* arguments */ var message_head = List.empty[Regex] var message_body = List.empty[Regex] var unicode_symbols = false var theories: List[String] = Nil var margin = Pretty.default_margin var options = Options.init() var verbose = false val getopts = Getopts(""" Usage: isabelle build_log [OPTIONS] [SESSIONS ...] Options are: -H REGEX filter messages by matching against head -M REGEX filter messages by matching against body -T NAME restrict to given theories (multiple options possible) -U output Unicode symbols -m MARGIN margin for pretty printing (default: """ + margin + """) -o OPTION override Isabelle system OPTION (via NAME=VAL or NAME) -v print all messages, including information etc. Print messages from the session build database of the given sessions, without any checks against current sources or session structure: results from old sessions or failed builds can be printed as well. Multiple options -H and -M are conjunctive: all given patterns need to match. Patterns match any substring, but ^ or $ may be used to match the start or end explicitly.
""", "H:" -> (arg => message_head = message_head ::: List(arg.r)), "M:" -> (arg => message_body = message_body ::: List(arg.r)), "T:" -> (arg => theories = theories ::: List(arg)), "U" -> (_ => unicode_symbols = true), "m:" -> (arg => margin = Value.Double.parse(arg)), "o:" -> (arg => options = options + arg), "v" -> (_ => verbose = true)) val sessions = getopts(args) val progress = new Console_Progress(verbose = verbose) if (sessions.isEmpty) progress.echo_warning("No sessions to print") else { print_log(options, sessions, theories = theories, message_head = message_head, message_body = message_body, margin = margin, progress = progress, unicode_symbols = unicode_symbols) } }) } diff --git a/src/Pure/Tools/build_job.scala b/src/Pure/Tools/build_job.scala --- a/src/Pure/Tools/build_job.scala +++ b/src/Pure/Tools/build_job.scala @@ -1,600 +1,588 @@ /* Title: Pure/Tools/build_job.scala Author: Makarius Build job running prover process, with rudimentary PIDE session. */ package isabelle import scala.collection.mutable trait Build_Job { - def job_name: String - def node_info: Host.Node_Info def cancel(): Unit = () def is_finished: Boolean = false def join: (Process_Result, SHA1.Shasum) = (Process_Result.undefined, SHA1.no_shasum) } object Build_Job { - sealed case class Result(node_info: Host.Node_Info, process_result: Process_Result) { - def ok: Boolean = process_result.ok - } - - /* build session */ - def is_session_name(job_name: String): Boolean = !Long_Name.is_qualified(job_name) - def start_session( build_context: Build_Process.Context, progress: Progress, log: Logger, session_background: Sessions.Background, input_shasum: SHA1.Shasum, node_info: Host.Node_Info ): Session_Job = { new Session_Job(build_context, progress, log, session_background, input_shasum, node_info) } object Session_Context { def load( build_uuid: String, name: String, deps: List[String], ancestors: List[String], session_prefs: String, sources_shasum: SHA1.Shasum, timeout: Time, store: Sessions.Store, progress: Progress = new Progress ): Session_Context = { def default: Session_Context = Session_Context( name, deps, ancestors, session_prefs, sources_shasum, timeout, Time.zero, Bytes.empty, build_uuid) store.try_open_database(name) match { case None => default case Some(db) => def ignore_error(msg: String) = { progress.echo_warning( "Ignoring bad database " + db + " for session " + quote(name) + if_proper(msg, ":\n" + msg)) default } try { val command_timings = store.read_command_timings(db, name) val elapsed = store.read_session_timing(db, name) match { case Markup.Elapsed(s) => Time.seconds(s) case _ => Time.zero } new Session_Context( name, deps, ancestors, session_prefs, sources_shasum, timeout, elapsed, command_timings, build_uuid) } catch { case ERROR(msg) => ignore_error(msg) case exn: java.lang.Error => ignore_error(Exn.message(exn)) case _: XML.Error => ignore_error("XML.Error") } finally { db.close() } } } } sealed case class Session_Context( name: String, deps: List[String], ancestors: List[String], session_prefs: String, sources_shasum: SHA1.Shasum, timeout: Time, old_time: Time, old_command_timings_blob: Bytes, build_uuid: String - ) { - override def toString: String = name - } + ) extends Library.Named class Session_Job private[Build_Job]( build_context: Build_Process.Context, progress: Progress, log: Logger, session_background: Sessions.Background, input_shasum: SHA1.Shasum, - override val node_info: Host.Node_Info + node_info: Host.Node_Info ) extends Build_Job { private val store = build_context.store 
def session_name: String = session_background.session_name - def job_name: String = session_name private val info: Sessions.Info = session_background.sessions_structure(session_name) private val options: Options = node_info.process_policy(info.options) private val session_sources = Sessions.Sources.load(session_background.base, cache = store.cache.compress) private val store_heap = build_context.store_heap(session_name) private val future_result: Future[(Process_Result, SHA1.Shasum)] = Future.thread("build", uninterruptible = true) { val env = Isabelle_System.settings( List("ISABELLE_ML_DEBUGGER" -> options.bool("ML_debugger").toString)) val session_heaps = session_background.info.parent match { case None => Nil case Some(logic) => ML_Process.session_heaps(store, session_background, logic = logic) } val use_prelude = if (session_heaps.isEmpty) Thy_Header.ml_roots.map(_._1) else Nil val eval_store = if (store_heap) { (if (info.theories.nonEmpty) List("ML_Heap.share_common_data ()") else Nil) ::: List("ML_Heap.save_child " + ML_Syntax.print_string_bytes(File.platform_path(store.output_heap(session_name)))) } else Nil def session_blobs(node_name: Document.Node.Name): List[(Command.Blob, Document.Blobs.Item)] = session_background.base.theory_load_commands.get(node_name.theory) match { case None => Nil case Some(spans) => val syntax = session_background.base.theory_syntax(node_name) val master_dir = Path.explode(node_name.master_dir) for (span <- spans; file <- span.loaded_files(syntax).files) yield { val src_path = Path.explode(file) val blob_name = Document.Node.Name(File.symbolic_path(master_dir + src_path)) val bytes = session_sources(blob_name.node).bytes val text = bytes.text val chunk = Symbol.Text_Chunk(text) Command.Blob(blob_name, src_path, Some((SHA1.digest(bytes), chunk))) -> Document.Blobs.Item(bytes, text, chunk, changed = false) } } /* session */ val resources = new Resources(session_background, log = log, command_timings = build_context.old_command_timings(session_name)) val session = new Session(options, resources) { override val cache: Term.Cache = store.cache override def build_blobs_info(node_name: Document.Node.Name): Command.Blobs_Info = Command.Blobs_Info.make(session_blobs(node_name)) override def build_blobs(node_name: Document.Node.Name): Document.Blobs = Document.Blobs.make(session_blobs(node_name)) } object Build_Session_Errors { private val promise: Promise[List[String]] = Future.promise def result: Exn.Result[List[String]] = promise.join_result def cancel(): Unit = promise.cancel() def apply(errs: List[String]): Unit = { try { promise.fulfill(errs) } catch { case _: IllegalStateException => } } } val export_consumer = Export.consumer(store.open_database(session_name, output = true), store.cache, progress = progress) val stdout = new StringBuilder(1000) val stderr = new StringBuilder(1000) val command_timings = new mutable.ListBuffer[Properties.T] val theory_timings = new mutable.ListBuffer[Properties.T] val session_timings = new mutable.ListBuffer[Properties.T] val runtime_statistics = new mutable.ListBuffer[Properties.T] val task_statistics = new mutable.ListBuffer[Properties.T] def fun( name: String, acc: mutable.ListBuffer[Properties.T], unapply: Properties.T => Option[Properties.T] ): (String, Session.Protocol_Function) = { name -> ((msg: Prover.Protocol_Output) => unapply(msg.properties) match { case Some(props) => acc += props; true case _ => false }) } session.init_protocol_handler(new Session.Protocol_Handler { override def exit(): Unit = 
Build_Session_Errors.cancel() private def build_session_finished(msg: Prover.Protocol_Output): Boolean = { val (rc, errors) = try { val (rc, errs) = { import XML.Decode._ pair(int, list(x => x))(Symbol.decode_yxml(msg.text)) } val errors = for (err <- errs) yield { val prt = Protocol_Message.expose_no_reports(err) Pretty.string_of(prt, metric = Symbol.Metric) } (rc, errors) } catch { case ERROR(err) => (Process_Result.RC.failure, List(err)) } session.protocol_command("Prover.stop", rc.toString) Build_Session_Errors(errors) true } private def loading_theory(msg: Prover.Protocol_Output): Boolean = msg.properties match { case Markup.Loading_Theory(Markup.Name(name)) => progress.theory(Progress.Theory(name, session = session_name)) false case _ => false } private def export_(msg: Prover.Protocol_Output): Boolean = msg.properties match { case Protocol.Export(args) => export_consumer.make_entry(session_name, args, msg.chunk) true case _ => false } override val functions: Session.Protocol_Functions = List( Markup.Build_Session_Finished.name -> build_session_finished, Markup.Loading_Theory.name -> loading_theory, Markup.EXPORT -> export_, fun(Markup.Theory_Timing.name, theory_timings, Markup.Theory_Timing.unapply), fun(Markup.Session_Timing.name, session_timings, Markup.Session_Timing.unapply), fun(Markup.Task_Statistics.name, task_statistics, Markup.Task_Statistics.unapply)) }) session.command_timings += Session.Consumer("command_timings") { case Session.Command_Timing(props) => for { elapsed <- Markup.Elapsed.unapply(props) elapsed_time = Time.seconds(elapsed) if elapsed_time.is_relevant && elapsed_time >= options.seconds("command_timing_threshold") } command_timings += props.filter(Markup.command_timing_property) } session.runtime_statistics += Session.Consumer("ML_statistics") { case Session.Runtime_Statistics(props) => runtime_statistics += props } session.finished_theories += Session.Consumer[Document.Snapshot]("finished_theories") { case snapshot => if (!progress.stopped) { def export_(name: String, xml: XML.Body, compress: Boolean = true): Unit = { if (!progress.stopped) { val theory_name = snapshot.node_name.theory val args = Protocol.Export.Args(theory_name = theory_name, name = name, compress = compress) val body = Bytes(Symbol.encode(YXML.string_of_body(xml))) export_consumer.make_entry(session_name, args, body) } } def export_text(name: String, text: String, compress: Boolean = true): Unit = export_(name, List(XML.Text(text)), compress = compress) for (command <- snapshot.snippet_command) { export_text(Export.DOCUMENT_ID, command.id.toString, compress = false) } export_text(Export.FILES, cat_lines(snapshot.node_files.map(name => File.symbolic_path(name.path))), compress = false) for ((blob_name, i) <- snapshot.node_files.tail.zipWithIndex) { val xml = snapshot.switch(blob_name).xml_markup() export_(Export.MARKUP + (i + 1), xml) } export_(Export.MARKUP, snapshot.xml_markup()) export_(Export.MESSAGES, snapshot.messages.map(_._1)) } } session.all_messages += Session.Consumer[Any]("build_session_output") { case msg: Prover.Output => val message = msg.message if (msg.is_system) resources.log(Protocol.message_text(message)) if (msg.is_stdout) { stdout ++= Symbol.encode(XML.content(message)) } else if (msg.is_stderr) { stderr ++= Symbol.encode(XML.content(message)) } else if (msg.is_exit) { val err = "Prover terminated" + (msg.properties match { case Markup.Process_Result(result) => ": " + result.print_rc case _ => "" }) Build_Session_Errors(List(err)) } case _ => } 
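        // A self-contained sketch of the protocol-function dispatch set up above:
        // handlers are keyed by message name, and each handler accumulates the
        // properties it recognizes, reporting whether the message was consumed.
        // Msg, Props and the "theory_timing" handler are illustrative stand-ins,
        // not the actual Prover.Protocol_Output / Session types.
        locally {
          import scala.collection.mutable

          type Props = List[(String, String)]
          final case class Msg(name: String, props: Props)

          def fun(name: String, acc: mutable.ListBuffer[Props], unapply: Props => Option[Props])
              : (String, Msg => Boolean) =
            name -> ((msg: Msg) =>
              unapply(msg.props) match {
                case Some(props) => acc += props; true
                case None => false
              })

          val theory_timings = new mutable.ListBuffer[Props]
          val handlers: Map[String, Msg => Boolean] =
            Map(fun("theory_timing", theory_timings, props => Some(props)))

          // unknown names are ignored; known names feed their accumulator
          def dispatch(msg: Msg): Boolean =
            handlers.get(msg.name) match {
              case Some(handler) => handler(msg)
              case None => false
            }

          dispatch(Msg("theory_timing", List("elapsed" -> "1.5")))  // consumed: buffer grows
        }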
build_context.session_setup(session_name, session) val eval_main = Command_Line.ML_tool("Isabelle_Process.init_build ()" :: eval_store) /* process */ val process = Isabelle_Process.start(options, session, session_background, session_heaps, use_prelude = use_prelude, eval_main = eval_main, cwd = info.dir.file, env = env) val timeout_request: Option[Event_Timer.Request] = if (info.timeout_ignored) None else Some(Event_Timer.request(Time.now() + info.timeout) { process.terminate() }) val build_errors = Isabelle_Thread.interrupt_handler(_ => process.terminate()) { Exn.capture { process.await_startup() } match { case Exn.Res(_) => val resources_yxml = resources.init_session_yxml val encode_options: XML.Encode.T[Options] = options => session.prover_options(options).encode val args_yxml = YXML.string_of_body( { import XML.Encode._ pair(string, list(pair(encode_options, list(pair(string, properties)))))( (session_name, info.theories)) }) session.protocol_command("build_session", resources_yxml, args_yxml) Build_Session_Errors.result case Exn.Exn(exn) => Exn.Res(List(Exn.message(exn))) } } val result0 = Isabelle_Thread.interrupt_handler(_ => process.terminate()) { process.await_shutdown() } val was_timeout = timeout_request match { case None => false case Some(request) => !request.cancel() } session.stop() val export_errors = export_consumer.shutdown(close = true).map(Output.error_message_text) val (document_output, document_errors) = try { if (build_errors.isInstanceOf[Exn.Res[_]] && result0.ok && info.documents.nonEmpty) { using(Export.open_database_context(store)) { database_context => val documents = using(database_context.open_session(session_background)) { session_context => Document_Build.build_documents( Document_Build.context(session_context, progress = progress), output_sources = info.document_output, output_pdf = info.document_output) } using(database_context.open_database(session_name, output = true))(session_database => documents.foreach(_.write(session_database.db, session_name))) (documents.flatMap(_.log_lines), Nil) } } else (Nil, Nil) } catch { case exn: Document_Build.Build_Error => (exn.log_lines, exn.log_errors) case Exn.Interrupt.ERROR(msg) => (Nil, List(msg)) } /* process result */ val result1 = { val theory_timing = theory_timings.iterator.flatMap( { case props @ Markup.Name(name) => Some(name -> props) case _ => None }).toMap val used_theory_timings = for { (name, _) <- session_background.base.used_theories } yield theory_timing.getOrElse(name.theory, Markup.Name(name.theory)) val more_output = Library.trim_line(stdout.toString) :: command_timings.toList.map(Protocol.Command_Timing_Marker.apply) ::: used_theory_timings.map(Protocol.Theory_Timing_Marker.apply) ::: session_timings.toList.map(Protocol.Session_Timing_Marker.apply) ::: runtime_statistics.toList.map(Protocol.ML_Statistics_Marker.apply) ::: task_statistics.toList.map(Protocol.Task_Statistics_Marker.apply) ::: document_output result0.output(more_output) .error(Library.trim_line(stderr.toString)) .errors_rc(export_errors ::: document_errors) } val result2 = build_errors match { case Exn.Res(build_errs) => val errs = build_errs ::: document_errors if (errs.nonEmpty) { result1.error_rc.output( errs.flatMap(s => split_lines(Output.error_message_text(s))) ::: errs.map(Protocol.Error_Message_Marker.apply)) } else if (progress.stopped && result1.ok) result1.copy(rc = Process_Result.RC.interrupt) else result1 case Exn.Exn(Exn.Interrupt()) => if (result1.ok) result1.copy(rc = Process_Result.RC.interrupt) else result1 case 
Exn.Exn(exn) => throw exn } val process_result = if (result2.ok) result2 else if (was_timeout) result2.error(Output.error_message_text("Timeout")).timeout_rc else if (result2.interrupted) result2.error(Output.error_message_text("Interrupt")) else result2 /* output heap */ val output_shasum = if (process_result.ok && store_heap && store.output_heap(session_name).is_file) { SHA1.shasum(ML_Heap.write_digest(store.output_heap(session_name)), session_name) } else SHA1.no_shasum val log_lines = process_result.out_lines.filterNot(Protocol_Message.Marker.test) val build_log = Build_Log.Log_File(session_name, process_result.out_lines). parse_session_info( command_timings = true, theory_timings = true, ml_statistics = true, task_statistics = true) // write log file if (process_result.ok) { File.write_gzip(store.output_log_gz(session_name), terminate_lines(log_lines)) } else File.write(store.output_log(session_name), terminate_lines(log_lines)) // write database using(store.open_database(session_name, output = true))(db => store.write_session_info(db, session_name, session_sources, build_log = if (process_result.timeout) build_log.error("Timeout") else build_log, build = Sessions.Build_Info( sources = build_context.sources_shasum(session_name), input_heaps = input_shasum, output_heap = output_shasum, process_result.rc, build_context.build_uuid))) // messages process_result.err_lines.foreach(progress.echo(_)) if (process_result.ok) { val props = build_log.session_timing val threads = Markup.Session_Timing.Threads.unapply(props) getOrElse 1 val timing = Markup.Timing_Properties.get(props) progress.echo( "Timing " + session_name + " (" + threads + " threads, " + timing.message_factor + ")", verbose = true) progress.echo( "Finished " + session_name + " (" + process_result.timing.message_resources + ")") } else { progress.echo( session_name + " FAILED (see also \"isabelle build_log -H Error " + session_name + "\")") if (!process_result.interrupted) { val tail = info.options.int("process_output_tail") val suffix = if (tail == 0) log_lines else log_lines.drop(log_lines.length - tail max 0) val prefix = if (log_lines.length == suffix.length) Nil else List("...") progress.echo(Library.trim_line(cat_lines(prefix ::: suffix))) } } (process_result.copy(out_lines = log_lines), output_shasum) } override def cancel(): Unit = future_result.cancel() override def is_finished: Boolean = future_result.is_finished override def join: (Process_Result, SHA1.Shasum) = future_result.join } /* theory markup/messages from session database */ def read_theory( theory_context: Export.Theory_Context, unicode_symbols: Boolean = false ): Option[Document.Snapshot] = { def decode_bytes(bytes: Bytes): String = Symbol.output(unicode_symbols, UTF8.decode_permissive(bytes)) def read(name: String): Export.Entry = theory_context(name, permissive = true) def read_xml(name: String): XML.Body = YXML.parse_body(decode_bytes(read(name).bytes), cache = theory_context.cache) def read_source_file(name: String): Sessions.Source_File = theory_context.session_context.source_file(name) for { id <- theory_context.document_id() (thy_file, blobs_files) <- theory_context.files(permissive = true) } yield { val master_dir = Path.explode(Url.strip_base_name(thy_file).getOrElse( error("Cannot determine theory master directory: " + quote(thy_file)))) val blobs = blobs_files.map { name => val path = Path.explode(name) val src_path = File.relative_path(master_dir, path).getOrElse(path) val file = read_source_file(name) val bytes = file.bytes val text = 
decode_bytes(bytes) val chunk = Symbol.Text_Chunk(text) Command.Blob(Document.Node.Name(name), src_path, Some((file.digest, chunk))) -> Document.Blobs.Item(bytes, text, chunk, changed = false) } val thy_source = decode_bytes(read_source_file(thy_file).bytes) val thy_xml = read_xml(Export.MARKUP) val blobs_xml = for (i <- (1 to blobs.length).toList) yield read_xml(Export.MARKUP + i) val markups_index = Command.Markup_Index.make(blobs.map(_._1)) val markups = Command.Markups.make( for ((index, xml) <- markups_index.zip(thy_xml :: blobs_xml)) yield index -> Markup_Tree.from_XML(xml)) val results = Command.Results.make( for (elem @ XML.Elem(Markup(_, Markup.Serial(i)), _) <- read_xml(Export.MESSAGES)) yield i -> elem) val command = Command.unparsed(thy_source, theory = true, id = id, node_name = Document.Node.Name(thy_file, theory = theory_context.theory), blobs_info = Command.Blobs_Info.make(blobs), markups = markups, results = results) val doc_blobs = Document.Blobs.make(blobs) Document.State.init.snippet(command, doc_blobs) } } } diff --git a/src/Pure/Tools/build_process.scala b/src/Pure/Tools/build_process.scala --- a/src/Pure/Tools/build_process.scala +++ b/src/Pure/Tools/build_process.scala @@ -1,1092 +1,1228 @@ /* Title: Pure/Tools/build_process.scala Author: Makarius Build process for sessions, with build database, optional heap, and optional presentation. */ package isabelle import scala.collection.immutable.SortedMap import scala.math.Ordering import scala.annotation.tailrec object Build_Process { /** static context **/ object Context { def apply( store: Sessions.Store, build_deps: Sessions.Deps, progress: Progress = new Progress, ml_platform: String = Isabelle_System.getenv("ML_PLATFORM"), hostname: String = Isabelle_System.hostname(), numa_shuffling: Boolean = false, build_heap: Boolean = false, max_jobs: Int = 1, fresh_build: Boolean = false, no_build: Boolean = false, session_setup: (String, Session) => Unit = (_, _) => (), build_uuid: String = UUID.random().toString, master: Boolean = false, ): Context = { val sessions_structure = build_deps.sessions_structure val build_graph = sessions_structure.build_graph val sessions = Map.from( for ((name, (info, _)) <- build_graph.iterator) yield { val deps = info.parent.toList val ancestors = sessions_structure.build_requirements(deps) val sources_shasum = build_deps.sources_shasum(name) val session_context = Build_Job.Session_Context.load( build_uuid, name, deps, ancestors, info.session_prefs, sources_shasum, info.timeout, store, progress = progress) name -> session_context }) val sessions_time = { val maximals = build_graph.maximals.toSet def descendants_time(name: String): Double = { if (maximals.contains(name)) sessions(name).old_time.seconds else { val descendants = build_graph.all_succs(List(name)).toSet val g = build_graph.restrict(descendants) (0.0 :: g.maximals.flatMap { desc => val ps = g.all_preds(List(desc)) if (ps.exists(p => !sessions.isDefinedAt(p))) None else Some(ps.map(p => sessions(p).old_time.seconds).sum) }).max } } Map.from( for (name <- sessions.keysIterator) yield name -> descendants_time(name)).withDefaultValue(0.0) } val ordering = new Ordering[String] { def compare(name1: String, name2: String): Int = sessions_time(name2) compare sessions_time(name1) match { case 0 => sessions(name2).timeout compare sessions(name1).timeout match { case 0 => name1 compare name2 case ord => ord } case ord => ord } } val numa_nodes = Host.numa_nodes(enabled = numa_shuffling) new Context(store, build_deps, sessions, ordering, 
ml_platform, hostname, numa_nodes, build_heap = build_heap, max_jobs = max_jobs, fresh_build = fresh_build, no_build = no_build, session_setup, build_uuid = build_uuid, master = master) } } final class Context private( val store: Sessions.Store, val build_deps: Sessions.Deps, val sessions: State.Sessions, val ordering: Ordering[String], val ml_platform: String, val hostname: String, val numa_nodes: List[Int], val build_heap: Boolean, val max_jobs: Int, val fresh_build: Boolean, val no_build: Boolean, val session_setup: (String, Session) => Unit, val build_uuid: String, val master: Boolean ) { override def toString: String = - "Build_Process.Context(build_uuid = " + quote(build_uuid) + ")" + "Build_Process.Context(build_uuid = " + quote(build_uuid) + + if_proper(master, ", master = true") + ")" def build_options: Options = store.options def sessions_structure: Sessions.Structure = build_deps.sessions_structure def sources_shasum(name: String): SHA1.Shasum = sessions(name).sources_shasum def old_command_timings(name: String): List[Properties.T] = sessions.get(name) match { case Some(session_context) => Properties.uncompress(session_context.old_command_timings_blob, cache = store.cache) case None => Nil } def prepare_database(): Unit = { using_option(store.open_build_database()) { db => db.transaction { Data.all_tables.create_lock(db) Data.clean_build(db) } - db.rebuild() + db.vacuum(Data.all_tables) } } def store_heap(name: String): Boolean = build_heap || Sessions.is_pure(name) || sessions.valuesIterator.exists(_.ancestors.contains(name)) def worker_active: Boolean = max_jobs > 0 } /** dynamic state **/ type Progress_Messages = SortedMap[Long, Progress.Message] + val progress_messages_empty: Progress_Messages = SortedMap.empty + + case class Build( + build_uuid: String, + ml_platform: String, + options: String, + start: Date, + stop: Option[Date], + progress_stopped: Boolean + ) case class Worker( worker_uuid: String, build_uuid: String, hostname: String, java_pid: Long, java_start: Date, start: Date, stamp: Date, stop: Option[Date], serial: Long ) case class Task( name: String, deps: List[String], - info: JSON.Object.T = JSON.Object.empty + info: JSON.Object.T, + build_uuid: String ) { def is_ready: Boolean = deps.isEmpty def resolve(dep: String): Task = if (deps.contains(dep)) copy(deps = deps.filterNot(_ == dep)) else this } case class Job( name: String, worker_uuid: String, build_uuid: String, node_info: Host.Node_Info, build: Option[Build_Job] - ) { + ) extends Library.Named { def no_build: Job = copy(build = None) } case class Result( + name: String, + worker_uuid: String, + build_uuid: String, + node_info: Host.Node_Info, process_result: Process_Result, output_shasum: SHA1.Shasum, - node_info: Host.Node_Info, current: Boolean - ) { + ) extends Library.Named { def ok: Boolean = process_result.ok } + sealed case class Snapshot( + progress_messages: Progress_Messages, + builds: List[Build], // available build configurations + workers: List[Worker], // available worker processes + sessions: State.Sessions, // static build targets + pending: State.Pending, // dynamic build "queue" + running: State.Running, // presently running jobs + results: State.Results) // finished results + object State { type Sessions = Map[String, Build_Job.Session_Context] - type Workers = List[Worker] type Pending = List[Task] type Running = Map[String, Job] type Results = Map[String, Result] def inc_serial(serial: Long): Long = { require(serial < java.lang.Long.MAX_VALUE, "serial overflow") serial + 1 } } sealed 
case class State( serial: Long = 0, progress_seen: Long = 0, numa_next: Int = 0, - sessions: State.Sessions = Map.empty, // static build targets - workers: State.Workers = Nil, // available worker processes - pending: State.Pending = Nil, // dynamic build "queue" - running: State.Running = Map.empty, // presently running jobs - results: State.Results = Map.empty // finished results + sessions: State.Sessions = Map.empty, + pending: State.Pending = Nil, + running: State.Running = Map.empty, + results: State.Results = Map.empty ) { require(serial >= 0, "serial underflow") def inc_serial: State = copy(serial = State.inc_serial(serial)) def set_serial(i: Long): State = { require(serial <= i, "non-monotonic change of serial") copy(serial = i) } def progress_serial(message_serial: Long = serial): State = if (message_serial > progress_seen) copy(progress_seen = message_serial) else error("Bad serial " + message_serial + " for progress output (already seen)") - def set_workers(new_workers: State.Workers): State = copy(workers = new_workers) - def next_numa_node(numa_nodes: List[Int]): (Option[Int], State) = if (numa_nodes.isEmpty) (None, this) else { val available = numa_nodes.zipWithIndex val used = Set.from(for (job <- running.valuesIterator; i <- job.node_info.numa_node) yield i) val numa_index = available.collectFirst({ case (n, i) if n == numa_next => i }).getOrElse(0) val candidates = available.drop(numa_index) ::: available.take(numa_index) val (n, i) = candidates.find({ case (n, i) => i == numa_index && !used(n) }) orElse candidates.find({ case (n, _) => !used(n) }) getOrElse candidates.head (Some(n), copy(numa_next = numa_nodes((i + 1) % numa_nodes.length))) } def finished: Boolean = pending.isEmpty def remove_pending(name: String): State = copy(pending = pending.flatMap( entry => if (entry.name == name) None else Some(entry.resolve(name)))) def is_running(name: String): Boolean = running.isDefinedAt(name) def stop_running(): Unit = for (job <- running.valuesIterator; build <- job.build) build.cancel() - def finished_running(): List[Build_Job] = + def finished_running(): List[Job] = List.from( for (job <- running.valuesIterator; build <- job.build if build.is_finished) - yield build) + yield job) def add_running(job: Job): State = copy(running = running + (job.name -> job)) def remove_running(name: String): State = copy(running = running - name) def make_result( - name: String, + result_name: (String, String, String), process_result: Process_Result, output_shasum: SHA1.Shasum, node_info: Host.Node_Info = Host.Node_Info.none, current: Boolean = false ): State = { - val entry = name -> Build_Process.Result(process_result, output_shasum, node_info, current) - copy(results = results + entry) + val (name, worker_uuid, build_uuid) = result_name + val result = + Result(name, worker_uuid, build_uuid, node_info, process_result, output_shasum, current) + copy(results = results + (name -> result)) } } /** SQL data model **/ object Data { def make_table(name: String, columns: List[SQL.Column], body: String = ""): SQL.Table = SQL.Table("isabelle_build" + if_proper(name, "_" + name), columns, body = body) + def pull_data[A <: Library.Named]( + data_domain: Set[String], + data_iterator: Set[String] => Iterator[A], + old_data: Map[String, A] + ): Map[String, A] = { + val dom = data_domain -- old_data.keysIterator + val data = old_data -- old_data.keysIterator.filterNot(dom) + if (dom.isEmpty) data + else data_iterator(dom).foldLeft(data) { case (map, a) => map + (a.name -> a) } + } + + def pull0[A <: 
Library.Named]( + new_data: Map[String, A], + old_data: Map[String, A] + ): Map[String, A] = { + pull_data(new_data.keySet, dom => new_data.valuesIterator.filter(a => dom(a.name)), old_data) + } + + def pull1[A <: Library.Named]( + data_domain: Set[String], + data_base: Set[String] => Map[String, A], + old_data: Map[String, A] + ): Map[String, A] = { + pull_data(data_domain, dom => data_base(dom).valuesIterator, old_data) + } + object Generic { val build_uuid = SQL.Column.string("build_uuid") val worker_uuid = SQL.Column.string("worker_uuid") val name = SQL.Column.string("name") def sql( build_uuid: String = "", worker_uuid: String = "", - name: String = "", names: Iterable[String] = Nil ): SQL.Source = SQL.and( if_proper(build_uuid, Generic.build_uuid.equal(build_uuid)), if_proper(worker_uuid, Generic.worker_uuid.equal(worker_uuid)), - if_proper(name, Generic.name.equal(name)), if_proper(names, Generic.name.member(names))) + + def sql_where( + build_uuid: String = "", + worker_uuid: String = "", + names: Iterable[String] = Nil + ): SQL.Source = { + SQL.where(sql(build_uuid = build_uuid, worker_uuid = worker_uuid, names = names)) + } } /* base table */ object Base { val build_uuid = Generic.build_uuid.make_primary_key val ml_platform = SQL.Column.string("ml_platform") val options = SQL.Column.string("options") val start = SQL.Column.date("start") val stop = SQL.Column.date("stop") val progress_stopped = SQL.Column.bool("progress_stopped") val table = make_table("", List(build_uuid, ml_platform, options, start, stop, progress_stopped)) } + def read_builds(db: SQL.Database, build_uuid: String = ""): List[Build] = + db.execute_query_statement( + Base.table.select(sql = Generic.sql_where(build_uuid = build_uuid)), + List.from[Build], + { res => + val build_uuid = res.string(Base.build_uuid) + val ml_platform = res.string(Base.ml_platform) + val options = res.string(Base.options) + val start = res.date(Base.start) + val stop = res.get_date(Base.stop) + val progress_stopped = res.bool(Base.progress_stopped) + Build(build_uuid, ml_platform, options, start, stop, progress_stopped) + }) + def start_build( db: SQL.Database, build_uuid: String, ml_platform: String, options: String, progress_stopped: Boolean ): Unit = { db.execute_statement(Base.table.insert(), body = { stmt => stmt.string(1) = build_uuid stmt.string(2) = ml_platform stmt.string(3) = options stmt.date(4) = db.now() stmt.date(5) = None stmt.bool(6) = progress_stopped }) } def stop_build(db: SQL.Database, build_uuid: String): Unit = db.execute_statement( Base.table.update(List(Base.stop), sql = Base.build_uuid.where_equal(build_uuid)), body = { stmt => stmt.date(1) = db.now() }) def clean_build(db: SQL.Database): Unit = { val old = db.execute_query_statement( Base.table.select(List(Base.build_uuid), sql = SQL.where(Base.stop.defined)), List.from[String], res => res.string(Base.build_uuid)) if (old.nonEmpty) { - for (table <- List(Base.table, Sessions.table, Progress.table, Workers.table)) { + for (table <- build_uuid_tables) { db.execute_statement(table.delete(sql = Generic.build_uuid.where_member(old))) } } } /* sessions */ object Sessions { val name = Generic.name.make_primary_key val deps = SQL.Column.string("deps") val ancestors = SQL.Column.string("ancestors") val options = SQL.Column.string("options") val sources = SQL.Column.string("sources") val timeout = SQL.Column.long("timeout") val old_time = SQL.Column.long("old_time") val old_command_timings = SQL.Column.bytes("old_command_timings") val build_uuid = Generic.build_uuid val 
table = make_table("sessions", List(name, deps, ancestors, options, sources, timeout, old_time, old_command_timings, build_uuid)) } def read_sessions_domain(db: SQL.Database): Set[String] = db.execute_query_statement( Sessions.table.select(List(Sessions.name)), Set.from[String], res => res.string(Sessions.name)) def read_sessions(db: SQL.Database, names: Iterable[String] = Nil): State.Sessions = db.execute_query_statement( Sessions.table.select(sql = if_proper(names, Sessions.name.where_member(names))), Map.from[String, Build_Job.Session_Context], { res => val name = res.string(Sessions.name) val deps = split_lines(res.string(Sessions.deps)) val ancestors = split_lines(res.string(Sessions.ancestors)) val options = res.string(Sessions.options) val sources_shasum = SHA1.fake_shasum(res.string(Sessions.sources)) val timeout = Time.ms(res.long(Sessions.timeout)) val old_time = Time.ms(res.long(Sessions.old_time)) val old_command_timings_blob = res.bytes(Sessions.old_command_timings) val build_uuid = res.string(Sessions.build_uuid) name -> Build_Job.Session_Context(name, deps, ancestors, options, sources_shasum, timeout, old_time, old_command_timings_blob, build_uuid) } ) def update_sessions(db:SQL.Database, sessions: State.Sessions): Boolean = { val old_sessions = read_sessions_domain(db) val insert = sessions.iterator.filterNot(p => old_sessions.contains(p._1)).toList for ((name, session) <- insert) { db.execute_statement(Sessions.table.insert(), body = { stmt => stmt.string(1) = name stmt.string(2) = cat_lines(session.deps) stmt.string(3) = cat_lines(session.ancestors) stmt.string(4) = session.session_prefs stmt.string(5) = session.sources_shasum.toString stmt.long(6) = session.timeout.ms stmt.long(7) = session.old_time.ms stmt.bytes(8) = session.old_command_timings_blob stmt.string(9) = session.build_uuid }) } insert.nonEmpty } /* progress */ object Progress { val serial = SQL.Column.long("serial").make_primary_key val kind = SQL.Column.int("kind") val text = SQL.Column.string("text") val verbose = SQL.Column.bool("verbose") val build_uuid = Generic.build_uuid val table = make_table("progress", List(serial, kind, text, verbose, build_uuid)) } def read_progress(db: SQL.Database, seen: Long = 0, build_uuid: String = ""): Progress_Messages = db.execute_query_statement( Progress.table.select( sql = SQL.where( SQL.and( if (seen <= 0) "" else Progress.serial.ident + " > " + seen, Generic.sql(build_uuid = build_uuid)))), SortedMap.from[Long, isabelle.Progress.Message], { res => val serial = res.long(Progress.serial) val kind = isabelle.Progress.Kind(res.int(Progress.kind)) val text = res.string(Progress.text) val verbose = res.bool(Progress.verbose) serial -> isabelle.Progress.Message(kind, text, verbose = verbose) } ) def write_progress( db: SQL.Database, message_serial: Long, message: isabelle.Progress.Message, build_uuid: String ): Unit = { db.execute_statement(Progress.table.insert(), body = { stmt => stmt.long(1) = message_serial stmt.int(2) = message.kind.id stmt.string(3) = message.text stmt.bool(4) = message.verbose stmt.string(5) = build_uuid }) } def sync_progress( db: SQL.Database, seen: Long, build_uuid: String, build_progress: Progress ): (Progress_Messages, Boolean) = { require(build_uuid.nonEmpty) val messages = read_progress(db, seen = seen, build_uuid = build_uuid) val stopped_db = db.execute_query_statementO[Boolean]( Base.table.select(List(Base.progress_stopped), sql = SQL.where(Base.build_uuid.equal(build_uuid))), res => res.bool(Base.progress_stopped) ).getOrElse(false) def 
stop_db(): Unit = db.execute_statement( Base.table.update( List(Base.progress_stopped), sql = Base.build_uuid.where_equal(build_uuid)), body = { stmt => stmt.bool(1) = true }) val stopped = build_progress.stopped if (stopped_db && !stopped) build_progress.stop() if (stopped && !stopped_db) stop_db() (messages, messages.isEmpty && stopped_db == stopped) } /* workers */ object Workers { val worker_uuid = Generic.worker_uuid.make_primary_key val build_uuid = Generic.build_uuid val hostname = SQL.Column.string("hostname") val java_pid = SQL.Column.long("java_pid") val java_start = SQL.Column.date("java_start") val start = SQL.Column.date("start") val stamp = SQL.Column.date("stamp") val stop = SQL.Column.date("stop") val serial = SQL.Column.long("serial") val table = make_table("workers", List(worker_uuid, build_uuid, hostname, java_pid, java_start, start, stamp, stop, serial)) val serial_max = serial.copy(expr = "MAX(" + serial.ident + ")") } + def read_serial(db: SQL.Database): Long = + db.execute_query_statementO[Long]( + Workers.table.select(List(Workers.serial_max)), _.long(Workers.serial)).getOrElse(0L) + def read_workers( db: SQL.Database, build_uuid: String = "", worker_uuid: String = "" - ): State.Workers = { + ): List[Worker] = { db.execute_query_statement( - Workers.table.select(sql = - SQL.where(Generic.sql(build_uuid = build_uuid, worker_uuid = worker_uuid))), + Workers.table.select( + sql = Generic.sql_where(build_uuid = build_uuid, worker_uuid = worker_uuid)), List.from[Worker], { res => Worker( worker_uuid = res.string(Workers.worker_uuid), build_uuid = res.string(Workers.build_uuid), hostname = res.string(Workers.hostname), java_pid = res.long(Workers.java_pid), java_start = res.date(Workers.java_start), start = res.date(Workers.start), stamp = res.date(Workers.stamp), stop = res.get_date(Workers.stop), serial = res.long(Workers.serial)) }) } - def serial_max(db: SQL.Database): Long = - db.execute_query_statementO[Long]( - Workers.table.select(List(Workers.serial_max)), - res => res.long(Workers.serial) - ).getOrElse(0L) - def start_worker( db: SQL.Database, worker_uuid: String, build_uuid: String, hostname: String, java_pid: Long, - java_start: Date - ): Long = { + java_start: Date, + serial: Long + ): Unit = { def err(msg: String): Nothing = error("Cannot start worker " + worker_uuid + if_proper(msg, "\n" + msg)) val build_stop = db.execute_query_statementO( Base.table.select(List(Base.stop), sql = Base.build_uuid.where_equal(build_uuid)), res => res.get_date(Base.stop)) build_stop match { case Some(None) => case Some(Some(_)) => err("for already stopped build process " + build_uuid) case None => err("for unknown build process " + build_uuid) } - val serial = serial_max(db) db.execute_statement(Workers.table.insert(), body = { stmt => val now = db.now() stmt.string(1) = worker_uuid stmt.string(2) = build_uuid stmt.string(3) = hostname stmt.long(4) = java_pid stmt.date(5) = java_start stmt.date(6) = now stmt.date(7) = now stmt.date(8) = None stmt.long(9) = serial }) - serial } def stamp_worker( db: SQL.Database, worker_uuid: String, serial: Long, stop: Boolean = false ): Unit = { val sql = Workers.table.update(List(Workers.stamp, Workers.stop, Workers.serial), sql = Workers.worker_uuid.where_equal(worker_uuid)) db.execute_statement(sql, body = { stmt => val now = db.now() stmt.date(1) = now stmt.date(2) = if (stop) Some(now) else None stmt.long(3) = serial }) } /* pending jobs */ object Pending { val name = Generic.name.make_primary_key val deps = SQL.Column.string("deps") 
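      // A self-contained sketch of the reconciliation pattern used by the
      // update_* operations below (update_pending, update_running, ...): diff the
      // stored rows against the desired state, delete what disappeared, insert
      // what is new, and report whether anything changed. The generic row type
      // and callbacks are illustrative stand-ins for Library.symmetric_difference
      // and the actual SQL statements.
      private object Sync_Sketch {
        def symmetric_difference[A](old_rows: List[A], new_rows: List[A]): (List[A], List[A]) =
          (old_rows.filterNot(new_rows.toSet), new_rows.filterNot(old_rows.toSet))

        def update[A](stored: List[A], desired: List[A])(
          delete: A => Unit,
          insert: A => Unit
        ): Boolean = {
          val (obsolete, fresh) = symmetric_difference(stored, desired)
          obsolete.foreach(delete)
          fresh.foreach(insert)
          obsolete.nonEmpty || fresh.nonEmpty  // true if the table was modified
        }
      }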
  /* pending jobs */

  object Pending {
    val name = Generic.name.make_primary_key
    val deps = SQL.Column.string("deps")
    val info = SQL.Column.string("info")
+    val build_uuid = Generic.build_uuid

-    val table = make_table("pending", List(name, deps, info))
+    val table = make_table("pending", List(name, deps, info, build_uuid))
  }

  def read_pending(db: SQL.Database): List[Task] =
    db.execute_query_statement(
      Pending.table.select(sql = SQL.order_by(List(Pending.name))),
      List.from[Task],
      { res =>
        val name = res.string(Pending.name)
        val deps = res.string(Pending.deps)
        val info = res.string(Pending.info)
-        Task(name, split_lines(deps), info = JSON.Object.parse(info))
+        val build_uuid = res.string(Pending.build_uuid)
+        Task(name, split_lines(deps), JSON.Object.parse(info), build_uuid)
      })

  def update_pending(db: SQL.Database, pending: State.Pending): Boolean = {
    val old_pending = read_pending(db)
    val (delete, insert) = Library.symmetric_difference(old_pending, pending)

    if (delete.nonEmpty) {
      db.execute_statement(
-        Pending.table.delete(sql = SQL.where(Generic.sql(names = delete.map(_.name)))))
+        Pending.table.delete(sql = Generic.sql_where(names = delete.map(_.name))))
    }
-    for (entry <- insert) {
+    for (task <- insert) {
      db.execute_statement(Pending.table.insert(), body =
        { stmt =>
-          stmt.string(1) = entry.name
-          stmt.string(2) = cat_lines(entry.deps)
-          stmt.string(3) = JSON.Format(entry.info)
+          stmt.string(1) = task.name
+          stmt.string(2) = cat_lines(task.deps)
+          stmt.string(3) = JSON.Format(task.info)
+          stmt.string(4) = task.build_uuid
        })
    }

    delete.nonEmpty || insert.nonEmpty
  }


  /* running jobs */

  object Running {
    val name = Generic.name.make_primary_key
    val worker_uuid = Generic.worker_uuid
    val build_uuid = Generic.build_uuid
    val hostname = SQL.Column.string("hostname")
    val numa_node = SQL.Column.int("numa_node")

    val table = make_table("running", List(name, worker_uuid, build_uuid, hostname, numa_node))
  }

-  def read_running(db: SQL.Database): List[Job] =
+  def read_running(db: SQL.Database): State.Running =
    db.execute_query_statement(
      Running.table.select(sql = SQL.order_by(List(Running.name))),
-      List.from[Job],
+      Map.from[String, Job],
      { res =>
        val name = res.string(Running.name)
        val worker_uuid = res.string(Running.worker_uuid)
        val build_uuid = res.string(Running.build_uuid)
        val hostname = res.string(Running.hostname)
        val numa_node = res.get_int(Running.numa_node)
-        Job(name, worker_uuid, build_uuid, Host.Node_Info(hostname, numa_node), None)
+        name -> Job(name, worker_uuid, build_uuid, Host.Node_Info(hostname, numa_node), None)
      }
    )

  def update_running(db: SQL.Database, running: State.Running): Boolean = {
-    val running0 = read_running(db)
+    val running0 = read_running(db).valuesIterator.toList
    val running1 = running.valuesIterator.map(_.no_build).toList
    val (delete, insert) = Library.symmetric_difference(running0, running1)

    if (delete.nonEmpty) {
      db.execute_statement(
-        Running.table.delete(sql = SQL.where(Generic.sql(names = delete.map(_.name)))))
+        Running.table.delete(sql = Generic.sql_where(names = delete.map(_.name))))
    }
    for (job <- insert) {
      db.execute_statement(Running.table.insert(), body =
        { stmt =>
          stmt.string(1) = job.name
          stmt.string(2) = job.worker_uuid
          stmt.string(3) = job.build_uuid
          stmt.string(4) = job.node_info.hostname
          stmt.int(5) = job.node_info.numa_node
        })
    }

    delete.nonEmpty || insert.nonEmpty
  }
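
// === Editor's illustration (not part of the patch) =========================
// update_pending and update_running follow the same reconciliation pattern:
// diff the database rows against the in-memory state, delete rows that
// disappeared, insert rows that are new, and report whether anything
// changed. A minimal stand-in using Set instead of SQL:
object Reconcile {
  def update[A](db_rows: Set[A], state_rows: Set[A]): (Set[A], Set[A], Boolean) = {
    val delete = db_rows -- state_rows   // rows present in db but not in state
    val insert = state_rows -- db_rows   // rows present in state but not in db
    (delete, insert, delete.nonEmpty || insert.nonEmpty)
  }

  def main(args: Array[String]): Unit = {
    val (del, ins, changed) = update(Set("A", "B"), Set("B", "C"))
    assert(del == Set("A") && ins == Set("C") && changed)
  }
}
// ===========================================================================
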
SQL.Column.string("out") val err = SQL.Column.string("err") val timing_elapsed = SQL.Column.long("timing_elapsed") val timing_cpu = SQL.Column.long("timing_cpu") val timing_gc = SQL.Column.long("timing_gc") + val output_shasum = SQL.Column.string("output_shasum") + val current = SQL.Column.bool("current") val table = make_table("results", - List(name, hostname, numa_node, rc, out, err, timing_elapsed, timing_cpu, timing_gc)) + List(name, worker_uuid, build_uuid, hostname, numa_node, + rc, out, err, timing_elapsed, timing_cpu, timing_gc, output_shasum, current)) } def read_results_domain(db: SQL.Database): Set[String] = db.execute_query_statement( Results.table.select(List(Results.name)), Set.from[String], res => res.string(Results.name)) - def read_results(db: SQL.Database, names: List[String] = Nil): Map[String, Build_Job.Result] = + def read_results(db: SQL.Database, names: Iterable[String] = Nil): State.Results = db.execute_query_statement( Results.table.select(sql = if_proper(names, Results.name.where_member(names))), - Map.from[String, Build_Job.Result], + Map.from[String, Result], { res => val name = res.string(Results.name) + val worker_uuid = res.string(Results.worker_uuid) + val build_uuid = res.string(Results.build_uuid) val hostname = res.string(Results.hostname) val numa_node = res.get_int(Results.numa_node) + val node_info = Host.Node_Info(hostname, numa_node) + val rc = res.int(Results.rc) val out = res.string(Results.out) val err = res.string(Results.err) val timing = res.timing( Results.timing_elapsed, Results.timing_cpu, Results.timing_gc) - val node_info = Host.Node_Info(hostname, numa_node) val process_result = Process_Result(rc, out_lines = split_lines(out), err_lines = split_lines(err), timing = timing) - name -> Build_Job.Result(node_info, process_result) + + val output_shasum = SHA1.fake_shasum(res.string(Results.output_shasum)) + val current = res.bool(Results.current) + + name -> + Result(name, worker_uuid, build_uuid, node_info, process_result, output_shasum, current) } ) def update_results(db: SQL.Database, results: State.Results): Boolean = { val old_results = read_results_domain(db) - val insert = results.iterator.filterNot(p => old_results.contains(p._1)).toList + val insert = results.valuesIterator.filterNot(res => old_results.contains(res.name)).toList - for ((name, result) <- insert) { - val node_info = result.node_info + for (result <- insert) { val process_result = result.process_result db.execute_statement(Results.table.insert(), body = { stmt => - stmt.string(1) = name - stmt.string(2) = node_info.hostname - stmt.int(3) = node_info.numa_node - stmt.int(4) = process_result.rc - stmt.string(5) = cat_lines(process_result.out_lines) - stmt.string(6) = cat_lines(process_result.err_lines) - stmt.long(7) = process_result.timing.elapsed.ms - stmt.long(8) = process_result.timing.cpu.ms - stmt.long(9) = process_result.timing.gc.ms + stmt.string(1) = result.name + stmt.string(2) = result.worker_uuid + stmt.string(3) = result.build_uuid + stmt.string(4) = result.node_info.hostname + stmt.int(5) = result.node_info.numa_node + stmt.int(6) = process_result.rc + stmt.string(7) = cat_lines(process_result.out_lines) + stmt.string(8) = cat_lines(process_result.err_lines) + stmt.long(9) = process_result.timing.elapsed.ms + stmt.long(10) = process_result.timing.cpu.ms + stmt.long(11) = process_result.timing.gc.ms + stmt.string(12) = result.output_shasum.toString + stmt.bool(13) = result.current }) } insert.nonEmpty } /* collective operations */ val all_tables: SQL.Tables = 
  /* collective operations */

  val all_tables: SQL.Tables =
    SQL.Tables(
      Base.table,
      Workers.table,
      Progress.table,
      Sessions.table,
      Pending.table,
      Running.table,
      Results.table,
      Host.Data.Node_Info.table)

+  val build_uuid_tables =
+    all_tables.filter(table =>
+      table.columns.exists(column => column.name == Generic.build_uuid.name))
+
+  def pull_database(
+    db: SQL.Database,
+    worker_uuid: String,
+    hostname: String,
+    state: State
+  ): State = {
+    val serial_db = read_serial(db)
+    if (serial_db == state.serial) state
+    else {
+      val serial = serial_db max state.serial
+      stamp_worker(db, worker_uuid, serial)
+
+      val numa_next = Host.Data.read_numa_next(db, hostname)
+      val sessions = pull1(read_sessions_domain(db), read_sessions(db, _), state.sessions)
+      val pending = read_pending(db)
+      val running = pull0(read_running(db), state.running)
+      val results = pull1(read_results_domain(db), read_results(db, _), state.results)
+
+      state.copy(serial = serial, numa_next = numa_next, sessions = sessions,
+        pending = pending, running = running, results = results)
+    }
+  }
+
  def update_database(
    db: SQL.Database,
    worker_uuid: String,
    build_uuid: String,
    hostname: String,
    state: State
  ): State = {
    val changed =
      List(
        update_sessions(db, state.sessions),
        update_pending(db, state.pending),
        update_running(db, state.running),
        update_results(db, state.results),
        Host.Data.update_numa_next(db, hostname, state.numa_next))

-    val serial0 = serial_max(db)
+    val serial0 = state.serial
    val serial = if (changed.exists(identity)) State.inc_serial(serial0) else serial0
    stamp_worker(db, worker_uuid, serial)

-    state.set_serial(serial).set_workers(read_workers(db))
+    state.set_serial(serial)
  }
}
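
// === Editor's illustration (not part of the patch) =========================
// pull_database and update_database cooperate through a single serial
// counter: if the database serial differs from the local one, the worker
// pulls fresh rows; after local changes it bumps the serial and writes back.
// A toy model of that handshake; all names are invented for the sketch:
object Serial_Sync {
  final case class Db(var serial: Long, var rows: Set[String])
  final case class Local(var serial: Long, var rows: Set[String])

  def pull(db: Db, st: Local): Unit =
    if (db.serial != st.serial) { st.rows = db.rows; st.serial = db.serial max st.serial }

  def push(db: Db, st: Local, changed: Boolean): Unit =
    if (changed) { st.serial += 1; db.rows = st.rows; db.serial = st.serial }

  def main(args: Array[String]): Unit = {
    val db = Db(5L, Set("A")); val st = Local(3L, Set())
    pull(db, st); assert(st.serial == 5L && st.rows == Set("A"))
    st.rows += "B"; push(db, st, changed = true)
    assert(db.serial == 6L && db.rows == Set("A", "B"))
  }
}
// ===========================================================================
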
/** main process **/

class Build_Process(
  protected final val build_context: Build_Process.Context,
  protected final val build_progress: Progress
)
extends AutoCloseable {
  /* context */

  protected final val store: Sessions.Store = build_context.store
  protected final val build_options: Options = store.options
  protected final val build_deps: Sessions.Deps = build_context.build_deps
+  protected final val hostname: String = build_context.hostname
  protected final val build_uuid: String = build_context.build_uuid
  protected final val worker_uuid: String = UUID.random().toString

  override def toString: String =
    "Build_Process(worker_uuid = " + quote(worker_uuid) +
      ", build_uuid = " + quote(build_uuid) +
      if_proper(build_context.master, ", master = true") + ")"


  /* global state: internal var vs. external database */

-  private var _state: Build_Process.State = init_state(Build_Process.State())
+  private var _state: Build_Process.State = Build_Process.State()

  private val _database: Option[SQL.Database] = store.open_build_database()

  def close(): Unit = synchronized { _database.foreach(_.close()) }

  protected def synchronized_database[A](body: => A): A = synchronized {
    _database match {
      case None => body
      case Some(db) =>
-        @tailrec def loop(): A = {
-          val sync_progress =
-            db.transaction_lock(Build_Process.Data.all_tables) {
-              val (messages, sync) =
-                Build_Process.Data.sync_progress(
-                  db, _state.progress_seen, build_uuid, build_progress)
-              if (sync) Left(body) else Right(messages)
-            }
-          sync_progress match {
+        def pull_database(): Unit = {
+          _state = Build_Process.Data.pull_database(db, worker_uuid, hostname, _state)
+        }
+
+        def sync_database(): Unit = {
+          _state =
+            Build_Process.Data.update_database(db, worker_uuid, build_uuid, hostname, _state)
+        }
+
+        def attempt(): Either[A, Build_Process.Progress_Messages] = {
+          val (messages, sync) =
+            Build_Process.Data.sync_progress(
+              db, _state.progress_seen, build_uuid, build_progress)
+          if (sync) Left { pull_database(); val res = body; sync_database(); res }
+          else Right(messages)
+        }
+
+        @tailrec def attempts(): A = {
+          db.transaction_lock(Build_Process.Data.all_tables) { attempt() } match {
            case Left(res) => res
            case Right(messages) =>
              for ((message_serial, message) <- messages) {
                _state = _state.progress_serial(message_serial = message_serial)
                if (build_progress.do_output(message)) build_progress.output(message)
              }
-              loop()
+              attempts()
          }
        }
-        loop()
-    }
-  }
-
-  private def sync_database(): Unit =
-    synchronized_database {
-      for (db <- _database) {
-        _state =
-          Build_Process.Data.update_database(
-            db, worker_uuid, build_uuid, build_context.hostname, _state)
+        attempts()
    }
  }
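
// === Editor's illustration (not part of the patch) =========================
// synchronized_database above retries: under the table lock it first checks
// whether progress is in sync; if not, it gives up the lock, replays the
// pending messages locally, and tries again. The recursion shape, reduced to
// its essentials (names invented for the sketch):
import scala.annotation.tailrec

object Retry_Loop {
  // 'attempt' returns Left(result) on success, or Right(work) to process
  // before retrying (cf. the pending progress messages above)
  @tailrec def attempts[A, B](attempt: () => Either[A, B])(replay: B => Unit): A =
    attempt() match {
      case Left(res) => res
      case Right(pending) => replay(pending); attempts(attempt)(replay)
    }

  def main(args: Array[String]): Unit = {
    var behind = 2
    val res =
      attempts(() => if (behind == 0) Left("done") else Right(behind)) { _ => behind -= 1 }
    assert(res == "done")
  }
}
// ===========================================================================
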
  /* progress backed by database */

  private def progress_output(message: Progress.Message, build_progress_output: => Unit): Unit = {
    synchronized_database {
      _state = _state.inc_serial.progress_serial()
      for (db <- _database) {
        Build_Process.Data.write_progress(db, _state.serial, message, build_uuid)
        Build_Process.Data.stamp_worker(db, worker_uuid, _state.serial)
      }
      build_progress_output
    }
  }

  protected object progress extends Progress {
    override def verbose: Boolean = build_progress.verbose

    override def output(message: Progress.Message): Unit =
      progress_output(message, if (do_output(message)) build_progress.output(message))

    override def theory(theory: Progress.Theory): Unit =
      progress_output(theory.message, build_progress.theory(theory))

    override def nodes_status(nodes_status: Document_Status.Nodes_Status): Unit =
      build_progress.nodes_status(nodes_status)

    override def stop(): Unit = build_progress.stop()
    override def stopped: Boolean = build_progress.stopped
  }

  protected val log: Logger = Logger.make_system_log(progress, build_options)


  /* policy operations */

  protected def init_state(state: Build_Process.State): Build_Process.State = {
    val sessions1 =
      build_context.sessions.foldLeft(state.sessions) {
        case (map, (name, session)) =>
          if (state.sessions.isDefinedAt(name)) map
          else map + (name -> session)
      }

    val old_pending = state.pending.iterator.map(_.name).toSet
    val new_pending =
      List.from(
        for {
          (name, session_context) <- build_context.sessions.iterator
          if !old_pending(name)
-        } yield Build_Process.Task(name, session_context.deps))
+        } yield Build_Process.Task(name, session_context.deps, JSON.Object.empty, build_uuid))
    val pending1 = new_pending ::: state.pending

    state.copy(sessions = sessions1, pending = pending1)
  }

  protected def next_job(state: Build_Process.State): Option[String] =
    if (progress.stopped || state.running.size < build_context.max_jobs) {
      state.pending.filter(entry => entry.is_ready && !state.is_running(entry.name))
        .sortBy(_.name)(build_context.ordering)
        .headOption.map(_.name)
    }
    else None

  protected def start_session(state: Build_Process.State, session_name: String): Build_Process.State = {
    val ancestor_results =
      for (a <- build_context.sessions(session_name).ancestors) yield state.results(a)

    val input_shasum =
-      if (ancestor_results.isEmpty) {
-        SHA1.shasum_meta_info(SHA1.digest(Path.explode("$POLYML_EXE")))
-      }
+      if (ancestor_results.isEmpty) ML_Process.bootstrap_shasum()
      else SHA1.flat_shasum(ancestor_results.map(_.output_shasum))

    val store_heap = build_context.store_heap(session_name)

    val (current, output_shasum) =
      store.check_output(session_name,
        sources_shasum = build_context.sources_shasum(session_name),
        input_shasum = input_shasum,
        fresh_build = build_context.fresh_build,
        store_heap = store_heap)

    val all_current = current && ancestor_results.forall(_.current)

+    val result_name = (session_name, worker_uuid, build_uuid)
+
    if (all_current) {
      state
        .remove_pending(session_name)
-        .make_result(session_name, Process_Result.ok, output_shasum, current = true)
+        .make_result(result_name, Process_Result.ok, output_shasum, current = true)
    }
    else if (build_context.no_build) {
      progress.echo("Skipping " + session_name + " ...", verbose = true)
      state.
        remove_pending(session_name).
-        make_result(session_name, Process_Result.error, output_shasum)
+        make_result(result_name, Process_Result.error, output_shasum)
    }
    else if (progress.stopped || !ancestor_results.forall(_.ok)) {
      progress.echo(session_name + " CANCELLED")
      state
        .remove_pending(session_name)
-        .make_result(session_name, Process_Result.undefined, output_shasum)
+        .make_result(result_name, Process_Result.undefined, output_shasum)
    }
    else {
      val (numa_node, state1) = state.next_numa_node(build_context.numa_nodes)
-      val node_info = Host.Node_Info(build_context.hostname, numa_node)
+      val node_info = Host.Node_Info(hostname, numa_node)
      progress.echo(
        (if (store_heap) "Building " else "Running ") + session_name +
          if_proper(node_info.numa_node, " on " + node_info) + " ...")

      store.init_output(session_name)

      val build =
        Build_Job.start_session(build_context, progress, log,
          build_deps.background(session_name), input_shasum, node_info)

      val job = Build_Process.Job(session_name, worker_uuid, build_uuid, node_info, Some(build))

      state1.add_running(job)
    }
  }


  /* build process roles */

-  final def start_build(): Unit = synchronized_database {
+  final def is_session_name(job_name: String): Boolean =
+    !Long_Name.is_qualified(job_name)
+
+  protected final def start_build(): Unit = synchronized_database {
    for (db <- _database) {
      Build_Process.Data.start_build(db, build_uuid, build_context.ml_platform,
        build_context.sessions_structure.session_prefs, progress.stopped)
    }
  }

-  final def stop_build(): Unit = synchronized_database {
+  protected final def stop_build(): Unit = synchronized_database {
    for (db <- _database) {
      Build_Process.Data.stop_build(db, build_uuid)
    }
  }
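
// === Editor's illustration (not part of the patch) =========================
// next_job above picks the first ready, not-yet-running task, subject to the
// max_jobs limit. The same selection with plain lists, with the session
// ordering reduced to the name itself (all names invented for the sketch):
object Next_Job {
  final case class Task(name: String, ready: Boolean)

  def next(pending: List[Task], running: Set[String], max_jobs: Int): Option[String] =
    if (running.size < max_jobs)
      pending.filter(t => t.ready && !running(t.name)).map(_.name).sorted.headOption
    else None

  def main(args: Array[String]): Unit = {
    val pending = List(Task("HOL", true), Task("FOL", true), Task("ZF", false))
    assert(next(pending, running = Set("FOL"), max_jobs = 2) == Some("HOL"))
    assert(next(pending, running = Set("FOL", "HOL"), max_jobs = 2) == None)
  }
}
// ===========================================================================
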
-  final def start_worker(): Unit = synchronized_database {
+  protected final def start_worker(): Unit = synchronized_database {
    for (db <- _database) {
      val java = ProcessHandle.current()
      val java_pid = java.pid
      val java_start = Date.instant(java.info.startInstant.get)
-      val serial =
-        Build_Process.Data.start_worker(
-          db, worker_uuid, build_uuid, build_context.hostname, java_pid, java_start)
-      _state = _state.set_serial(serial)
+      _state = _state.inc_serial
+      Build_Process.Data.start_worker(
+        db, worker_uuid, build_uuid, hostname, java_pid, java_start, _state.serial)
    }
  }

-  final def stop_worker(): Unit = synchronized_database {
+  protected final def stop_worker(): Unit = synchronized_database {
    for (db <- _database) {
      Build_Process.Data.stamp_worker(db, worker_uuid, _state.serial, stop = true)
    }
  }


  /* run */

  def run(): Map[String, Process_Result] = {
+    if (build_context.master) synchronized_database { _state = init_state(_state) }
+
    def finished(): Boolean = synchronized_database { _state.finished }

    def sleep(): Unit =
      Isabelle_Thread.interrupt_handler(_ => progress.stop()) {
        build_options.seconds("editor_input_delay").sleep()
      }

    def start_job(): Boolean = synchronized_database {
      next_job(_state) match {
        case Some(name) =>
-          if (Build_Job.is_session_name(name)) {
+          if (is_session_name(name)) {
            _state = start_session(_state, name)
            true
          }
          else error("Unsupported build job name " + quote(name))
        case None => false
      }
    }

    if (finished()) {
      progress.echo_warning("Nothing to build")
      Map.empty[String, Process_Result]
    }
    else {
      if (build_context.master) start_build()
      start_worker()
      if (build_context.master && !build_context.worker_active) {
        progress.echo("Waiting for external workers ...")
      }

      try {
        while (!finished()) {
          synchronized_database {
            if (progress.stopped) _state.stop_running()

-            for (build <- _state.finished_running()) {
-              val job_name = build.job_name
-              val (process_result, output_shasum) = build.join
+            for (job <- _state.finished_running()) {
+              val result_name = (job.name, worker_uuid, build_uuid)
+              val (process_result, output_shasum) = job.build.get.join
              _state = _state.
-                remove_pending(job_name).
-                remove_running(job_name).
-                make_result(job_name, process_result, output_shasum, node_info = build.node_info)
+                remove_pending(job.name).
+                remove_running(job.name).
+                make_result(result_name, process_result, output_shasum, node_info = job.node_info)
            }
          }
-          if (!start_job()) {
-            sync_database()
-            sleep()
-          }
+          if (!start_job()) sleep()
        }
      }
      finally {
        stop_worker()
        if (build_context.master) stop_build()
      }

      synchronized_database {
        for ((name, result) <- _state.results) yield name -> result.process_result
      }
    }
  }
+
+
+  /* snapshot */
+
+  def snapshot(): Build_Process.Snapshot = synchronized_database {
+    val (progress_messages, builds, workers) =
+      _database match {
+        case None => (Build_Process.progress_messages_empty, Nil, Nil)
+        case Some(db) =>
+          (Build_Process.Data.read_progress(db),
+           Build_Process.Data.read_builds(db),
+           Build_Process.Data.read_workers(db))
+      }
+    Build_Process.Snapshot(
+      progress_messages = progress_messages,
+      builds = builds,
+      workers = workers,
+      sessions = _state.sessions,
+      pending = _state.pending,
+      running = _state.running,
+      results = _state.results)
+  }
}
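
// === Editor's illustration (not part of the patch) =========================
// The revised run() loop above interleaves three steps until everything is
// finished: harvest completed jobs, try to start a new one, and otherwise
// sleep (database syncing now happens inside synchronized_database itself,
// so the explicit sync_database() call is gone). In outline, with all names
// invented for the sketch:
object Run_Loop {
  def run(finished: () => Boolean, harvest: () => Unit,
      start_job: () => Boolean, sleep: () => Unit): Unit =
    while (!finished()) {
      harvest()                   // fold finished jobs into results
      if (!start_job()) sleep()   // back off when nothing can be started
    }

  def main(args: Array[String]): Unit = {
    var todo = 3
    run(() => todo == 0, () => (), () => { if (todo > 0) todo -= 1; todo > 0 }, () => ())
  }
}
// ===========================================================================
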
diff --git a/src/Pure/library.scala b/src/Pure/library.scala
--- a/src/Pure/library.scala
+++ b/src/Pure/library.scala
@@ -1,311 +1,316 @@
/*  Title:      Pure/library.scala
    Author:     Makarius

Basic library.
*/

package isabelle

import scala.annotation.tailrec
import scala.collection.mutable
import scala.util.matching.Regex

object Library {
  /* resource management */

  def using[A <: AutoCloseable, B](a: A)(f: A => B): B = {
    try { f(a) }
    finally { if (a != null) a.close() }
  }

  def using_option[A <: AutoCloseable, B](opt: Option[A])(f: A => B): Option[B] =
    opt.map(a => using(a)(f))


  /* integers */

  private val small_int = 10000
  private lazy val small_int_table = {
    val array = new Array[String](small_int)
    for (i <- 0 until small_int) array(i) = i.toString
    array
  }

  def is_small_int(s: String): Boolean = {
    val len = s.length
    1 <= len && len <= 4 &&
      s.forall(c => '0' <= c && c <= '9') &&
      (len == 1 || s(0) != '0')
  }

  def signed_string_of_long(i: Long): String =
    if (0 <= i && i < small_int) small_int_table(i.toInt)
    else i.toString

  def signed_string_of_int(i: Int): String =
    if (0 <= i && i < small_int) small_int_table(i)
    else i.toString


  /* separated chunks */

  def separate[A](s: A, list: List[A]): List[A] = {
    val result = new mutable.ListBuffer[A]
    var first = true
    for (x <- list) {
      if (first) {
        first = false
        result += x
      }
      else {
        result += s
        result += x
      }
    }
    result.toList
  }

  def separated_chunks(sep: Char => Boolean, source: CharSequence): Iterator[CharSequence] =
    new Iterator[CharSequence] {
      private val end = source.length
      private def next_chunk(i: Int): Option[(CharSequence, Int)] = {
        if (i < end) {
          var j = i
          while ({
            j += 1
            j < end && !sep(source.charAt(j))
          }) ()
          Some((source.subSequence(i + 1, j), j))
        }
        else None
      }
      private var state: Option[(CharSequence, Int)] = if (end == 0) None else next_chunk(-1)

      def hasNext: Boolean = state.isDefined
      def next(): CharSequence =
        state match {
          case Some((s, i)) => state = next_chunk(i); s
          case None => Iterator.empty.next()
        }
    }

  def space_explode(sep: Char, str: String): List[String] =
    separated_chunks(_ == sep, str).map(_.toString).toList


  /* lines */

  def count_newlines(str: String): Int = str.count(_ == '\n')

  def terminate_lines(lines: IterableOnce[String]): String = {
    val it = lines.iterator
    if (it.isEmpty) "" else it.mkString("", "\n", "\n")
  }

  def cat_lines(lines: IterableOnce[String]): String =
    lines.iterator.mkString("\n")

  def split_lines(str: String): List[String] = space_explode('\n', str)

  def prefix_lines(prfx: String, str: String): String =
    isabelle.setup.Library.prefix_lines(prfx, str)

  def indent_lines(n: Int, str: String): String =
    prefix_lines(Symbol.spaces(n), str)

  def first_line(source: CharSequence): String = {
    val lines = separated_chunks(_ == '\n', source)
    if (lines.hasNext) lines.next().toString
    else ""
  }

  def trim_line(s: String): String =
    isabelle.setup.Library.trim_line(s)

  def trim_split_lines(s: String): List[String] =
    split_lines(trim_line(s)).map(trim_line)

  def encode_lines(s: String): String = s.replace('\n', '\u000b')
  def decode_lines(s: String): String = s.replace('\u000b', '\n')


  /* strings */

  def make_string(f: StringBuilder => Unit, capacity: Int = 16): String = {
    val s = new StringBuilder(capacity)
    f(s)
    s.toString
  }

  def try_unprefix(prfx: String, s: String): Option[String] =
    if (s.startsWith(prfx)) Some(s.substring(prfx.length)) else None

  def try_unsuffix(sffx: String, s: String): Option[String] =
    if (s.endsWith(sffx)) Some(s.substring(0, s.length - sffx.length)) else None

  def perhaps_unprefix(prfx: String, s: String): String = try_unprefix(prfx, s) getOrElse s
  def perhaps_unsuffix(sffx: String, s: String): String = try_unsuffix(sffx, s) getOrElse s

  def isolate_substring(s: String): String = new String(s.toCharArray)
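
// === Editor's illustration (not part of the patch) =========================
// Examples of the chunking and line utilities above, assuming the isabelle
// package is on the classpath; the expected values follow directly from the
// definitions of separated_chunks / space_explode / split_lines / cat_lines:
import isabelle.Library

object Chunks_Demo {
  def main(args: Array[String]): Unit = {
    assert(Library.space_explode(':', "a:b:c") == List("a", "b", "c"))
    assert(Library.space_explode(':', "") == Nil)           // empty input: no chunks
    assert(Library.split_lines("x\ny") == List("x", "y"))
    assert(Library.cat_lines(List("x", "y")) == "x\ny")     // inverse of split_lines
  }
}
// ===========================================================================
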
  def strip_ansi_color(s: String): String =
    s.replaceAll("\u001b\\[\\d+m", "")


  /* quote */

  def single_quote(s: String): String = "'" + s + "'"

  def quote(s: String): String = "\"" + s + "\""

  def try_unquote(s: String): Option[String] =
    if (s.startsWith("\"") && s.endsWith("\"")) Some(s.substring(1, s.length - 1))
    else None

  def perhaps_unquote(s: String): String = try_unquote(s) getOrElse s

  def commas(ss: Iterable[String]): String = ss.iterator.mkString(", ")
  def commas_quote(ss: Iterable[String]): String = ss.iterator.map(quote).mkString(", ")


  /* CharSequence */

  class Reverse(text: CharSequence, start: Int, end: Int) extends CharSequence {
    require(0 <= start && start <= end && end <= text.length, "bad reverse range")

    def this(text: CharSequence) = this(text, 0, text.length)

    def length: Int = end - start
    def charAt(i: Int): Char = text.charAt(end - i - 1)

    def subSequence(i: Int, j: Int): CharSequence =
      if (0 <= i && i <= j && j <= length) new Reverse(text, end - j, end - i)
      else throw new IndexOutOfBoundsException

    override def toString: String = {
      val buf = new StringBuilder(length)
      for (i <- 0 until length)
        buf.append(charAt(i))
      buf.toString
    }
  }

  class Line_Termination(text: CharSequence) extends CharSequence {
    def length: Int = text.length + 1
    def charAt(i: Int): Char = if (i == text.length) '\n' else text.charAt(i)
    def subSequence(i: Int, j: Int): CharSequence =
      if (j == text.length + 1) new Line_Termination(text.subSequence(i, j - 1))
      else text.subSequence(i, j)
    override def toString: String = text.toString + "\n"
  }


  /* regular expressions */

  def make_regex(s: String): Option[Regex] =
    try { Some(new Regex(s)) } catch { case ERROR(_) => None }

  def is_regex_meta(c: Char): Boolean = """()[]{}\^$|?*+.<>-=!""".contains(c)

  def escape_regex(s: String): String =
    if (s.exists(is_regex_meta)) {
      (for (c <- s.iterator)
       yield { if (is_regex_meta(c)) "\\" + c.toString else c.toString }).mkString
    }
    else s


  /* lists */

  def take_prefix[A](pred: A => Boolean, xs: List[A]): (List[A], List[A]) =
    (xs.takeWhile(pred), xs.dropWhile(pred))

  def take_suffix[A](pred: A => Boolean, xs: List[A]): (List[A], List[A]) = {
    val rev_xs = xs.reverse
    (rev_xs.dropWhile(pred).reverse, rev_xs.takeWhile(pred).reverse)
  }

  def trim[A](pred: A => Boolean, xs: List[A]): List[A] =
    take_suffix(pred, take_prefix(pred, xs)._2)._1

  def member[A, B](xs: List[A])(x: B): Boolean = xs.contains(x)
  def insert[A](x: A)(xs: List[A]): List[A] = if (xs.contains(x)) xs else x :: xs
  def remove[A, B](x: B)(xs: List[A]): List[A] = if (member(xs)(x)) xs.filterNot(_ == x) else xs
  def update[A](x: A)(xs: List[A]): List[A] = x :: remove(x)(xs)

  def merge[A](xs: List[A], ys: List[A]): List[A] =
    if (xs.eq(ys)) xs
    else if (xs.isEmpty) ys
    else ys.foldRight(xs)(Library.insert(_)(_))

  def distinct[A](xs: List[A], eq: (A, A) => Boolean = (x: A, y: A) => x == y): List[A] = {
    val result = new mutable.ListBuffer[A]
    xs.foreach(x => if (!result.exists(y => eq(x, y))) result += x)
    result.toList
  }

  def duplicates[A](lst: List[A], eq: (A, A) => Boolean = (x: A, y: A) => x == y): List[A] = {
    val result = new mutable.ListBuffer[A]
    @tailrec def dups(rest: List[A]): Unit =
      rest match {
        case Nil =>
        case x :: xs =>
          if (!result.exists(y => eq(x, y)) && xs.exists(y => eq(x, y))) result += x
          dups(xs)
      }
    dups(lst)
    result.toList
  }
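
// === Editor's illustration (not part of the patch) =========================
// Examples of the list utilities above; expected values follow directly from
// the definitions of distinct / duplicates / insert / update:
import isabelle.Library

object List_Demo {
  def main(args: Array[String]): Unit = {
    assert(Library.distinct(List(1, 2, 1, 3)) == List(1, 2, 3))
    assert(Library.duplicates(List(1, 2, 1, 3, 2)) == List(1, 2))
    assert(Library.insert(2)(List(1, 2)) == List(1, 2))   // already present: unchanged
    assert(Library.update(2)(List(1, 2)) == List(2, 1))   // moved to the front
  }
}
// ===========================================================================
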
argument expected") } def symmetric_difference[A](xs: List[A], ys: List[A]): (List[A], List[A]) = (xs.filterNot(ys.toSet), ys.filterNot(xs.toSet)) /* proper values */ def proper_bool(b: Boolean): Option[Boolean] = if (!b) None else Some(b) def proper_string(s: String): Option[String] = if (s == null || s == "") None else Some(s) def proper_list[A](list: List[A]): Option[List[A]] = if (list == null || list.isEmpty) None else Some(list) def if_proper[A](x: Iterable[A], body: => String): String = if (x == null || x.isEmpty) "" else body def if_proper(b: Boolean, body: => String): String = if (!b) "" else body /* reflection */ def is_subclass[A, B](a: Class[A], b: Class[B]): Boolean = { import scala.language.existentials @tailrec def subclass(c: Class[_]): Boolean = { c == b || { val d = c.getSuperclass; d != null && subclass(d) } } subclass(a) } def as_subclass[C](c: Class[C])(x: AnyRef): Option[C] = if (x == null || is_subclass(x.getClass, c)) Some(x.asInstanceOf[C]) else None + + + /* named items */ + + trait Named { def name: String } }