Commit e08b06b1 authored by Jonas Kaufmann

docs: write orchestration concepts

parent b7aeecaf
......@@ -27,30 +27,126 @@
SimBricks Orchestration
###################################
Our orchestration framework replaces hand-crafted scripts for setting up and
running experiments. Instead, experiments are described declaratively. The
orchestration framework then takes care of the details: launching the
respective component simulators, setting up the SimBricks communication channels
between them, and monitoring their execution. All output is collected in a JSON
file, which allows post-processing afterwards.
******************************
Concepts
******************************
To declare experiments, we rely on a few concepts and terms, which we
introduce below.
Experiments
===========
An *experiment* defines which component simulators to run, how they are
connected, and which workload is executed. To define an experiment, instantiate
the class :class:`~simbricks.orchestration.experiments.Experiment` in your own
Python module. The class has member functions to further define the component
simulators to run. SimBricks comes with many pre-defined experiments, which can
serve as starting points and are located in the repository under
``experiments/pyexps``.
.. autoclass:: simbricks.orchestration.experiments.Experiment
:members: add_host, add_pcidev, add_nic, add_network
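The declarative style described above can be illustrated with a minimal,
self-contained sketch. ``SketchExperiment`` and ``SketchSim`` are hypothetical
stand-ins written only for this illustration. They mirror the member names
listed above (``add_host``, ``add_nic``, ``add_network``) but are not the
SimBricks classes, and the simulator names are made up.

```python
from dataclasses import dataclass, field


@dataclass
class SketchSim:
    """Hypothetical stand-in for a component simulator."""
    name: str


@dataclass
class SketchExperiment:
    """Hypothetical stand-in that only records what should be run."""
    name: str
    hosts: list = field(default_factory=list)
    nics: list = field(default_factory=list)
    networks: list = field(default_factory=list)

    def add_host(self, sim):
        self.hosts.append(sim)

    def add_nic(self, sim):
        self.nics.append(sim)

    def add_network(self, sim):
        self.networks.append(sim)

    def all_simulators(self):
        # Everything a launcher would have to start for this experiment.
        return self.hosts + self.nics + self.networks


# Describe the experiment; nothing is launched at this point.
e = SketchExperiment('two_hosts')
for role in ('client', 'server'):
    e.add_host(SketchSim(f'host.{role}'))
    e.add_nic(SketchSim(f'nic.{role}'))
e.add_network(SketchSim('net.switch'))

sim_names = [s.name for s in e.all_simulators()]
```

The point of the sketch is the separation of concerns: the module only
describes what to run, and a separate launcher decides how to start it.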
Runs
====
Experiments can be executed multiple times, for example, to gain statistical
insights when including a random or non-deterministic component. We call each
execution one *run* of the experiment. Each run produces its own output JSON
file. The file name includes the number of the run.
The number of runs can be specified when invoking the orchestration framework,
see :ref:`sec-command-line`. When using simulator checkpointing, we use one run
to boot the simulator and take the checkpoint, and a second one to execute the
actual experiment. This is why two output JSON files are produced in that case.
For more information, see :ref:`sec-checkpointing`.
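Because each run writes its own JSON file, post-processing usually starts by
loading one file per run. The sketch below shows this under stated
assumptions: the naming scheme ``<experiment>-<run>.json`` and the file
contents are placeholders chosen for the example, not the format the
orchestration framework actually uses.

```python
import json
import tempfile
from pathlib import Path


def output_path(out_dir, exp_name, run):
    """Build the output path for one run (assumed naming scheme)."""
    return Path(out_dir) / f'{exp_name}-{run}.json'


def load_runs(out_dir, exp_name, num_runs, first_run=1):
    """Load the JSON output of every run into a dict keyed by run number."""
    results = {}
    for run in range(first_run, first_run + num_runs):
        with open(output_path(out_dir, exp_name, run), encoding='utf-8') as f:
            results[run] = json.load(f)
    return results


# Demo with placeholder files standing in for real experiment output.
with tempfile.TemporaryDirectory() as out_dir:
    for run in (1, 2, 3):
        placeholder = {'run': run}
        output_path(out_dir, 'my_exp', run).write_text(json.dumps(placeholder))
    runs = load_runs(out_dir, 'my_exp', num_runs=3)

run_numbers = sorted(runs)
```

Adjust ``output_path`` to match the file names your installation produces
before using a helper like this on real output.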
Component Simulators
====================
SimBricks provides several ready-to-use component simulators that
can be used in experiments. This selection includes host, NIC, network, and PCI
device simulators. Each simulator is implemented in a class deriving from
:class:`~simbricks.orchestration.simulators.Simulator`, which provides the
necessary commands and arguments for its execution and for specifying the
SimBricks communication channel to connect to. We also offer more specialized
base classes for the different component types, which implement common member
functions, for example, to add connected NICs or network component simulators
to a host component simulator. All implemented component simulators can be
found in the module :mod:`simbricks.orchestration.simulators`.
.. automodule:: simbricks.orchestration.simulators
.. autoclass:: simbricks.orchestration.simulators.Simulator
:members: resreq_cores, resreq_mem, prep_cmds, run_cmd, dependencies
.. autoclass:: simbricks.orchestration.simulators.HostSim
:members: add_pcidev, add_nic, add_netdirect
.. autoclass:: simbricks.orchestration.simulators.NICSim
:members: set_network
.. autoclass:: simbricks.orchestration.simulators.NetSim
:members: connect_network
.. autoclass:: simbricks.orchestration.simulators.PCIDevSim
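To make the role of the base class more concrete, here is a simplified,
self-contained sketch. The classes are hypothetical stand-ins that reuse the
member names documented above (``resreq_cores``, ``resreq_mem``,
``prep_cmds``, ``run_cmd``, ``dependencies``, ``set_network``). Their
signatures, the memory unit, the command strings, and the meaning given to
``dependencies`` are assumptions made for this example, not the SimBricks
implementation.

```python
class SketchSimulator:
    """Hypothetical stand-in base class, not the SimBricks ``Simulator``."""

    def resreq_cores(self):
        # Number of cores the simulator needs (assumed meaning).
        return 1

    def resreq_mem(self):
        # Memory the simulator needs, assumed here to be in MB.
        return 64

    def prep_cmds(self):
        # Commands to run before the simulator itself starts.
        return []

    def run_cmd(self):
        # Command line that starts the simulator.
        raise NotImplementedError

    def dependencies(self):
        # Simulators that must be started before this one (assumed meaning).
        return []


class SketchNIC(SketchSimulator):
    def __init__(self, name):
        self.name = name
        self.network = None

    def set_network(self, net):
        # Record the connection on both ends.
        self.network = net
        net.nics.append(self)

    def run_cmd(self):
        return f'./sketch_nic --name {self.name}'


class SketchNet(SketchSimulator):
    def __init__(self, name):
        self.name = name
        self.nics = []

    def dependencies(self):
        # In this sketch the network is started after its NICs.
        return list(self.nics)

    def run_cmd(self):
        ports = ' '.join(f'--port {n.name}' for n in self.nics)
        return f'./sketch_switch {ports}'


net = SketchNet('net.switch')
for nic_name in ('nic.a', 'nic.b'):
    SketchNIC(nic_name).set_network(net)

switch_cmd = net.run_cmd()
start_first = [n.name for n in net.dependencies()]
```

Each subclass only supplies what differs between simulators, here the command
line and its dependencies, while resource defaults live in the base class.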
.. _sec-node_configuration:
Node Configuration
==================
The configuration and workload to run on individual host simulators or, more
generally, nodes that should run in the experiment, are defined using the
classes :class:`~simbricks.orchestration.nodeconfig.NodeConfig` and
:class:`~simbricks.orchestration.nodeconfig.AppConfig`, respectively.
:class:`~simbricks.orchestration.nodeconfig.NodeConfig` defines, for example,
the networking configuration like IP address and subnet mask, how much system
memory the node has, and which disk image to run. The latter can be used, for
example, to run a specific version of the Linux kernel on a node. You can find
more information on this in the :ref:`next section <sec-howto-custom_image>`.
:class:`~simbricks.orchestration.nodeconfig.NodeConfig` contains an attribute
for a :class:`~simbricks.orchestration.nodeconfig.AppConfig`.
.. automodule:: simbricks.orchestration.nodeconfig
.. autoclass:: simbricks.orchestration.nodeconfig.NodeConfig
:members: ip, prefix, mtu, cores, memory, disk_image, app, run_cmds, cleanup_cmds, config_files
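As a small illustration of the networking part of a node configuration, the
sketch below derives the subnet mask from an IP address and prefix length
using Python's standard ``ipaddress`` module. ``SketchNodeConfig`` is a
hypothetical stand-in that borrows a few attribute names from the list above.
The default values and the memory unit are assumptions for the example.

```python
import ipaddress
from dataclasses import dataclass


@dataclass
class SketchNodeConfig:
    """Hypothetical stand-in, not the SimBricks ``NodeConfig``."""
    ip: str = '10.0.0.1'
    prefix: int = 24      # prefix length of the subnet
    mtu: int = 1500
    cores: int = 1
    memory: int = 512     # assumed to be in MB

    def interface(self):
        # Combine address and prefix length into one interface object.
        return ipaddress.ip_interface(f'{self.ip}/{self.prefix}')


server = SketchNodeConfig(ip='10.0.0.1')
client = SketchNodeConfig(ip='10.0.0.2')

netmask = str(server.interface().netmask)
same_subnet = client.interface().network == server.interface().network
```

A check like ``same_subnet`` can catch a mistyped address before a long
simulation is started.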
.. _sec-app_configuration:
Application Configuration
-------------------------
The class :class:`~simbricks.orchestration.nodeconfig.AppConfig` offers member
functions to define the concrete commands to run on the node. It also provides a
member function
:meth:`~simbricks.orchestration.nodeconfig.AppConfig.config_files` to specify
additional files to be made available on the simulated node. Files are
specified as key-value pairs, where the key is the filename inside the
simulated guest system and the value is an IO handle to the file on the
machine running the simulators.
.. autoclass:: simbricks.orchestration.nodeconfig.AppConfig
:members: run_cmds, config_files
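The following self-contained sketch shows the shape of such an application
configuration. ``SketchAppConfig`` and ``SketchPingApp`` are hypothetical
stand-ins. The method names follow the members listed above, while their
signatures, the ``ping`` workload, and the file name ``targets.txt`` are
assumptions chosen for illustration.

```python
import io


class SketchAppConfig:
    """Hypothetical stand-in, not the SimBricks ``AppConfig``."""

    def run_cmds(self):
        # Commands to execute on the node, in order.
        return []

    def config_files(self):
        # Maps a filename inside the guest to an IO handle for its content.
        return {}


class SketchPingApp(SketchAppConfig):
    def __init__(self, target_ip, count=3):
        self.target_ip = target_ip
        self.count = count

    def run_cmds(self):
        return [f'ping -c {self.count} {self.target_ip}']

    def config_files(self):
        # An in-memory handle; a handle from open(path, 'rb') works the same way.
        content = f'{self.target_ip}\n'.encode()
        return {'targets.txt': io.BytesIO(content)}


app = SketchPingApp('10.0.0.1')
cmds = app.run_cmds()
files = {name: handle.read() for name, handle in app.config_files().items()}
```

Returning IO handles instead of paths means the file content can come from
disk or be generated in memory by the experiment script.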
******************************
Running Experiments
******************************
.. _sec-command-line:
Command Line
====================
......