Ketrew is:
This is the 1.1.0 release of Ketrew; it is backwards compatible with the previous 1.0.0 version.
Ketrew requires at least OCaml 4.02.0 and should be able to build & work on any Unix platform.
If you have opam up and running:
opam remote add -k git smondet https://github.com/smondet/dev-opam-repo
opam install ketrew
At runtime, you then need ssh in the $PATH.
This gets you the ketrew executable and the ketrew_pure and ketrew libraries.
See the development documentation to find out how to build Ketrew (and its dependencies) from the sources.
The EDSL is an OCaml library in which all the functions build a workflow data structure. A single function, Ketrew.Client.submit, is then used to submit workflows to the engine.
A workflow is a graph of “targets”.
There are 3 kinds of links between targets:
Any OCaml program can use the EDSL (as a script, compiled, or even inside the toplevel); see the documentation of the EDSL API.
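To make the "graph of targets" idea concrete, here is a minimal, self-contained sketch. It is a toy model only: the `target` record, the `target` constructor, and `run_order` are hypothetical names invented for illustration and are not part of Ketrew's EDSL. It models just one kind of link (dependencies) and computes an order in which every target comes after the targets it depends on.

```ocaml
(* Toy model only: these types and functions are hypothetical and are NOT
   Ketrew's EDSL. They only illustrate a workflow as a graph of targets. *)
type target = {
  name : string;
  dependencies : target list; (* targets that must be done first *)
}

(* Build a node of the graph; with no dependencies it is a single-node
   workflow. *)
let target ?(dependencies = []) name = { name; dependencies }

(* Return the target names so that each target appears after all of its
   dependencies (a depth-first traversal that visits each name once). *)
let run_order root =
  let rec visit seen t =
    if List.mem t.name seen then seen
    else
      let seen = List.fold_left visit seen t.dependencies in
      t.name :: seen
  in
  List.rev (visit [] root)

let () =
  let fetch = target "fetch-data" in
  let align = target "align" ~dependencies:[fetch] in
  let report = target "report" ~dependencies:[align; fetch] in
  print_endline (String.concat " -> " (run_order report))
```

In this sketch the whole workflow is reachable from a single root value, which is also why submitting one target is enough to hand over an entire graph.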
This example is a “single-target” workflow that runs an arbitrary shell command on an LSF-based cluster:
    #use "topfind"
    #thread
    #require "ketrew"

    let run_command_with_lsf cmd =
      let module KEDSL = Ketrew.EDSL in
      let host =
        (* `Host.parse` takes a URI and creates a “Host” data structure: a place
           to run stuff. *)
        KEDSL.Host.parse
          "ssh://user42@MyLSFCluster/home/user42/ketrew-playground/?shell=bash"
        (* This one is an SSH host, named `MyLSFCluster`.
           The directory `/home/user42/ketrew-playground/` will be used by Ketrew
           to monitor the jobs. *)
      in
      let program =
        (* A “program” is a data structure representing “extended shell scripts”.
           `Program.sh` creates one out of a shell command. *)
        KEDSL.Program.sh cmd in
      let lsf_build_process =
        (* A “build process” is a method for making things:
           `lsf` creates a data structure that represents a job running a
           `program` with the LSF scheduling engine, on the host `host`. *)
        KEDSL.lsf
          ~queue:"normal-people" ~wall_limit:"1:30"
          ~processors:(`Min_max (1,1)) ~host program
      in
      (* The function `KEDSL.target` creates a node in the workflow graph.
         This one is very simple, it has a name and a build-process,
         and since it doesn't have dependencies or fallbacks, it is a
         “single-node” workflow: *)
      KEDSL.target "run_command_with_lsf" ~make:lsf_build_process

    let () =
      let workflow =
        (* Create the workflow with the first argument of the command line: *)
        run_command_with_lsf Sys.argv.(1) in
      (* Then, `Client.submit` is the only function that “does” something,
         it submits the workflow to the engine: *)
      Ketrew.Client.submit workflow
      (* If Ketrew is in Standalone mode, this means writing the workflow in the
         database (nothing runs yet, you need to run Ketrew's engine yourself).
         If Ketrew is in Client-Server mode, this means sending the workflow to
         the server over HTTPS. The server will start running the workflow
         right away. *)
If you actually have access to an LSF cluster and want to try this workflow, see below: “For The Impatient”.
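The host URI passed to `Host.parse` in the LSF example packs several pieces of information into one string. The following self-contained sketch splits that same URI by hand to show which part is which. It is an illustration only and not how Ketrew parses hosts; the `host_parts` record, `split_once`, and `parse_host_uri` are hypothetical names introduced here.

```ocaml
(* Illustration only: a hand-rolled split of the example URI. This is NOT
   Ketrew's parser; all names below are hypothetical. *)
type host_parts = {
  scheme : string;      (* e.g. "ssh" *)
  user : string option; (* e.g. Some "user42" *)
  hostname : string;    (* e.g. "MyLSFCluster" *)
  playground : string;  (* directory used to monitor the jobs *)
  query : string;       (* e.g. "shell=bash", or "" when absent *)
}

(* Split [s] at the first occurrence of the character [sep], if any. *)
let split_once sep s =
  match String.index_opt s sep with
  | None -> None
  | Some i ->
    Some (String.sub s 0 i, String.sub s (i + 1) (String.length s - i - 1))

let parse_host_uri uri =
  match split_once ':' uri with
  | None -> None
  | Some (scheme, rest) ->
    let n = String.length rest in
    (* After the scheme we expect "//" before the user and host names. *)
    if n < 2 || String.sub rest 0 2 <> "//" then None
    else
      let rest = String.sub rest 2 (n - 2) in
      let before_query, query =
        match split_once '?' rest with
        | Some (b, q) -> (b, q)
        | None -> (rest, "")
      in
      let authority, playground =
        match split_once '/' before_query with
        | Some (a, p) -> (a, "/" ^ p)
        | None -> (before_query, "/")
      in
      let user, hostname =
        match split_once '@' authority with
        | Some (u, h) -> (Some u, h)
        | None -> (None, authority)
      in
      Some { scheme; user; hostname; playground; query }
```

Applied to the URI from the LSF example, this yields the scheme `ssh`, the user `user42`, the host `MyLSFCluster`, the playground directory `/home/user42/ketrew-playground/`, and the query `shell=bash`.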
To learn more about the EDSL, you can also explore examples of increasingly complex workflows (work in progress).
Let's say the LSF workflow example above is in a file my_first_workflow.ml:
The first time you use Ketrew, you need to call init:
ketrew init
Then you can submit your workflow:
ocaml my_first_workflow.ml 'du -sh $HOME'
When the function Ketrew.Client.submit is called, the workflow is submitted but not yet running. To run the engine, do:
ketrew run loop
The engine will run until you type q or until there is nothing left to do.
At any time, you can check the status of your workflows and do many things with them, for example with:
ketrew interact
which is an interactive text-based interface.
Let's go back to the beginning; to create a configuration file, run:
ketrew init
This creates $HOME/.ketrew/configuration.json (see ketrew init --help to choose another path).
By default, this configures Ketrew in Standalone mode; see the documentation on the configuration file to tweak it.
The default for Ketrew is to run in “Standalone” mode.
From the command-line client, one can both query and run the engine. See first ketrew --help, then:
- ketrew status --help
- ketrew run fix (see ketrew run-engine --help)
- ketrew kill + the target identifier, or ketrew kill --interactive (see ketrew kill --help)
See also ketrew interact --help or ketrew explore --help for fun one-key-based navigation.
In Client-Server mode, the Ketrew engine runs a proper server which is accessed over an HTTP API.
See the commands ketrew start-server --help and ketrew stop-server --help.
The client works in the same way as in “Standalone” mode.
From here:
- See src/test/Workflow_Examples.ml for examples and the documentation of the EDSL API.
- Ketrew_lsf, or in the tests: src/test/dummy_plugin.ml.
- ketrew as a client.
- Ketrew_engine.

Ketrew is released under the Apache 2.0 license.