diff --git a/doc/api/smartsim_api.rst b/doc/api/smartsim_api.rst
index d9615e04cc..1ce097bed7 100644
--- a/doc/api/smartsim_api.rst
+++ b/doc/api/smartsim_api.rst
@@ -42,11 +42,8 @@ Settings
 .. currentmodule:: smartsim.settings
 
 Settings are provided to ``Model`` and ``Ensemble`` objects
-to provide parameters for how a job should be executed. Some
-are specifically meant for certain launchers like ``SbatchSettings``
-is solely meant for system using Slurm as a workload manager.
-``MpirunSettings`` for OpenMPI based jobs is supported by Slurm
-and PBSPro.
+to provide parameters for how a job should be executed. For
+more information, see ``LaunchSettings``
 
 Types of Settings:
 
@@ -54,16 +51,7 @@ Types of Settings:
 .. autosummary::
 
     RunSettings
-    SrunSettings
-    AprunSettings
-    MpirunSettings
-    MpiexecSettings
-    OrterunSettings
-    JsrunSettings
     DragonRunSettings
-    SbatchSettings
-    QsubBatchSettings
-    BsubBatchSettings
 
 Settings objects can accept a container object that defines a container
 runtime, image, and arguments to use for the workload. Below is a list of
@@ -76,89 +64,26 @@ Types of Containers:
     Singularity
 
 
-.. _rs-api:
+.. _ls_api:
 
-RunSettings
+LaunchSettings
 -----------
 
 
 When running SmartSim on laptops and single node workstations,
-the base ``RunSettings`` object is used to parameterize jobs.
-``RunSettings`` include a ``run_command`` parameter for local
+the base ``LaunchSettings`` object is used to parameterize jobs.
+``LaunchSettings`` include a ``run_command`` parameter for local
 launches that utilize a parallel launch binary like
 ``mpirun``, ``mpiexec``, and others.
 
 
 .. autosummary::
 
-    RunSettings.add_exe_args
+    RunSettings.env_vars
+    RunSettings.launch_args
     RunSettings.update_env
-
-.. autoclass:: RunSettings
-    :inherited-members:
-    :undoc-members:
-    :members:
-
-
-.. _srun_api:
-
-SrunSettings
-------------
-
-
-``SrunSettings`` can be used for running on existing allocations,
-running jobs in interactive allocations, and for adding srun
-steps to a batch.
-
-
- .. autosummary::
-
-    SrunSettings.set_nodes
-    SrunSettings.set_node_feature
-    SrunSettings.set_tasks
-    SrunSettings.set_tasks_per_node
-    SrunSettings.set_walltime
-    SrunSettings.set_hostlist
-    SrunSettings.set_excluded_hosts
-    SrunSettings.set_cpus_per_task
-    SrunSettings.add_exe_args
-    SrunSettings.format_run_args
-    SrunSettings.format_env_vars
-    SrunSettings.update_env
-
-.. autoclass:: SrunSettings
-    :inherited-members:
-    :undoc-members:
-    :members:
-
-
-.. _aprun_api:
-
-AprunSettings
--------------
-
-
-``AprunSettings`` can be used on any system that supports the
-Cray ALPS layer. SmartSim supports using ``AprunSettings``
-on PBSPro WLM systems.
-
-``AprunSettings`` can be used in interactive session (on allocation)
-and within batch launches (e.g., ``QsubBatchSettings``)
-
-
- .. autosummary::
-
-    AprunSettings.set_cpus_per_task
-    AprunSettings.set_hostlist
-    AprunSettings.set_tasks
-    AprunSettings.set_tasks_per_node
-    AprunSettings.make_mpmd
-    AprunSettings.add_exe_args
-    AprunSettings.format_run_args
-    AprunSettings.format_env_vars
-    AprunSettings.update_env
-
-.. autoclass:: AprunSettings
+
+.. autoclass:: LaunchSettings
     :inherited-members:
     :undoc-members:
     :members:
@@ -174,7 +99,7 @@ PBS, if Dragon is available in the Python environment (see `_dragon_install`
 for instructions on how to install it through ``smart``).
 
 ``DragonRunSettings`` can be used in interactive sessions (on allcation)
-and within batch launches (i.e. ``SbatchSettings`` or ``QsubBatchSettings``,
+and within batch launches (i.e. ``sbatch`` or ``qsubbatch``,
 for Slurm and PBS sessions, respectively).
 
 .. autosummary::
@@ -187,207 +112,6 @@ for Slurm and PBS sessions, respectively).
     :members:
 
 
-.. _jsrun_api:
-
-JsrunSettings
--------------
-
-
-``JsrunSettings`` can be used on any system that supports the
-IBM LSF launcher.
-
-``JsrunSettings`` can be used in interactive session (on allocation)
-and within batch launches (i.e. ``BsubBatchSettings``)
-
-
- .. autosummary::
-
-    JsrunSettings.set_num_rs
-    JsrunSettings.set_cpus_per_rs
-    JsrunSettings.set_gpus_per_rs
-    JsrunSettings.set_rs_per_host
-    JsrunSettings.set_tasks
-    JsrunSettings.set_tasks_per_rs
-    JsrunSettings.set_binding
-    JsrunSettings.make_mpmd
-    JsrunSettings.set_mpmd_preamble
-    JsrunSettings.update_env
-    JsrunSettings.set_erf_sets
-    JsrunSettings.format_env_vars
-    JsrunSettings.format_run_args
-
-
-.. autoclass:: JsrunSettings
-    :inherited-members:
-    :undoc-members:
-    :members:
-
-.. _openmpi_run_api:
-
-MpirunSettings
---------------
-
-
-``MpirunSettings`` are for launching with OpenMPI. ``MpirunSettings`` are
-supported on Slurm and PBSpro.
-
-
- .. autosummary::
-
-    MpirunSettings.set_cpus_per_task
-    MpirunSettings.set_hostlist
-    MpirunSettings.set_tasks
-    MpirunSettings.set_task_map
-    MpirunSettings.make_mpmd
-    MpirunSettings.add_exe_args
-    MpirunSettings.format_run_args
-    MpirunSettings.format_env_vars
-    MpirunSettings.update_env
-
-.. autoclass:: MpirunSettings
-    :inherited-members:
-    :undoc-members:
-    :members:
-
-.. _openmpi_exec_api:
-
-MpiexecSettings
----------------
-
-
-``MpiexecSettings`` are for launching with OpenMPI's ``mpiexec``. ``MpirunSettings`` are
-supported on Slurm and PBSpro.
-
-
- .. autosummary::
-
-    MpiexecSettings.set_cpus_per_task
-    MpiexecSettings.set_hostlist
-    MpiexecSettings.set_tasks
-    MpiexecSettings.set_task_map
-    MpiexecSettings.make_mpmd
-    MpiexecSettings.add_exe_args
-    MpiexecSettings.format_run_args
-    MpiexecSettings.format_env_vars
-    MpiexecSettings.update_env
-
-.. autoclass:: MpiexecSettings
-    :inherited-members:
-    :undoc-members:
-    :members:
-
-.. _openmpi_orte_api:
-
-OrterunSettings
----------------
-
-
-``OrterunSettings`` are for launching with OpenMPI's ``orterun``. ``OrterunSettings`` are
-supported on Slurm and PBSpro.
-
-
- .. autosummary::
-
-    OrterunSettings.set_cpus_per_task
-    OrterunSettings.set_hostlist
-    OrterunSettings.set_tasks
-    OrterunSettings.set_task_map
-    OrterunSettings.make_mpmd
-    OrterunSettings.add_exe_args
-    OrterunSettings.format_run_args
-    OrterunSettings.format_env_vars
-    OrterunSettings.update_env
-
-.. autoclass:: OrterunSettings
-    :inherited-members:
-    :undoc-members:
-    :members:
-
-
-------------------------------------------
-
-
-.. _sbatch_api:
-
-SbatchSettings
---------------
-
-
-``SbatchSettings`` are used for launching batches onto Slurm
-WLM systems.
-
-
- .. autosummary::
-
-    SbatchSettings.set_account
-    SbatchSettings.set_batch_command
-    SbatchSettings.set_nodes
-    SbatchSettings.set_hostlist
-    SbatchSettings.set_partition
-    SbatchSettings.set_queue
-    SbatchSettings.set_walltime
-    SbatchSettings.format_batch_args
-
-.. autoclass:: SbatchSettings
-    :inherited-members:
-    :undoc-members:
-    :members:
-
-.. _qsub_api:
-
-QsubBatchSettings
------------------
-
-
-``QsubBatchSettings`` are used to configure jobs that should
-be launched as a batch on PBSPro systems.
-
-
- .. autosummary::
-
-    QsubBatchSettings.set_account
-    QsubBatchSettings.set_batch_command
-    QsubBatchSettings.set_nodes
-    QsubBatchSettings.set_ncpus
-    QsubBatchSettings.set_queue
-    QsubBatchSettings.set_resource
-    QsubBatchSettings.set_walltime
-    QsubBatchSettings.format_batch_args
-
-
-.. autoclass:: QsubBatchSettings
-    :inherited-members:
-    :undoc-members:
-    :members:
-
-
-.. _bsub_api:
-
-BsubBatchSettings
------------------
-
-
-``BsubBatchSettings`` are used to configure jobs that should
-be launched as a batch on LSF systems.
-
-
- .. autosummary::
-
-    BsubBatchSettings.set_walltime
-    BsubBatchSettings.set_smts
-    BsubBatchSettings.set_project
-    BsubBatchSettings.set_nodes
-    BsubBatchSettings.set_expert_mode_req
-    BsubBatchSettings.set_hostlist
-    BsubBatchSettings.set_tasks
-    BsubBatchSettings.format_batch_args
-
-
-.. autoclass:: BsubBatchSettings
-    :inherited-members:
-    :undoc-members:
-    :members:
-
 .. _singularity_api:
 
 Singularity
@@ -405,75 +129,69 @@ container.
 
 .. _orc_api:
 
-Orchestrator
+FeatureStore
 ============
 
 .. currentmodule:: smartsim.database
 
 .. autosummary::
 
-    Orchestrator.__init__
-    Orchestrator.db_identifier
-    Orchestrator.num_shards
-    Orchestrator.db_nodes
-    Orchestrator.hosts
-    Orchestrator.reset_hosts
-    Orchestrator.remove_stale_files
-    Orchestrator.get_address
-    Orchestrator.is_active
-    Orchestrator.set_cpus
-    Orchestrator.set_walltime
-    Orchestrator.set_hosts
-    Orchestrator.set_batch_arg
-    Orchestrator.set_run_arg
-    Orchestrator.enable_checkpoints
-    Orchestrator.set_max_memory
-    Orchestrator.set_eviction_strategy
-    Orchestrator.set_max_clients
-    Orchestrator.set_max_message_size
-    Orchestrator.set_db_conf
-    Orchestrator.telemetry
-    Orchestrator.checkpoint_file
-    Orchestrator.batch
-
-Orchestrator
+    FeatureStore.__init__
+    FeatureStore.fs_identifier
+    FeatureStore.num_shards
+    FeatureStore.fs_nodes
+    FeatureStore.hosts
+    FeatureStore.reset_hosts
+    FeatureStore.remove_stale_files
+    FeatureStore.get_address
+    FeatureStore.is_active
+    FeatureStore.set_cpus
+    FeatureStore.set_walltime
+    FeatureStore.set_hosts
+    FeatureStore.set_batch_arg
+    FeatureStore.set_run_arg
+    FeatureStore.enable_checkpoints
+    FeatureStore.set_max_memory
+    FeatureStore.set_eviction_strategy
+    FeatureStore.set_max_clients
+    FeatureStore.set_max_message_size
+    FeatureStore.set_fs_conf
+    FeatureStore.telemetry
+    FeatureStore.checkpoint_file
+    FeatureStore.batch
+
+FeatureStore
 ------------
 
-.. _orchestrator_api:
+.. _featurestore_api:
 
-.. autoclass:: Orchestrator
+.. autoclass:: FeatureStore
     :members:
     :inherited-members:
     :undoc-members:
 
 
 .. _model_api:
 
-Model
-=====
+Application
+===========
 
-.. currentmodule:: smartsim.entity.model
+.. currentmodule:: smartsim.entity
 
 .. autosummary::
 
-    Model.__init__
-    Model.attach_generator_files
-    Model.colocate_db
-    Model.colocate_db_tcp
-    Model.colocate_db_uds
-    Model.colocated
-    Model.add_ml_model
-    Model.add_script
-    Model.add_function
-    Model.params_to_args
-    Model.register_incoming_entity
-    Model.enable_key_prefixing
-    Model.disable_key_prefixing
-    Model.query_key_prefixing
-
-Model
+    Application.__init__
+    Application.exe
+    Application.exe_args
+    Application.file_parameters
+    Application.incoming_entities
+    Application.key_prefixing_enabled
+    Application.add_exe_args
+    Application.as_executable_sequence
+
+Application
 -----
 
-.. autoclass:: Model
+.. autoclass:: Application
     :members:
     :show-inheritance:
     :inherited-members:
@@ -481,20 +199,20 @@ Model
 Ensemble
 ========
 
-.. currentmodule:: smartsim.entity.ensemble
+.. currentmodule:: smartsim.builders
 
 .. autosummary::
 
     Ensemble.__init__
-    Ensemble.add_model
-    Ensemble.add_ml_model
-    Ensemble.add_script
-    Ensemble.add_function
-    Ensemble.attach_generator_files
-    Ensemble.enable_key_prefixing
-    Ensemble.models
-    Ensemble.query_key_prefixing
-    Ensemble.register_incoming_entity
+    Ensemble.exe
+    Ensemble.exe_args
+    Ensemble.exe_arg_parameters
+    Ensemble.files
+    Ensemble.file_parameters
+    Ensemble.max_permutations
+    Ensemble.permutation_strategy
+    Ensemble.replicas
+    Ensemble.build_jobs
 
 Ensemble
 --------
diff --git a/doc/batch_settings.rst b/doc/batch_settings.rst
index 07cef4c95e..b54b73f19c 100644
--- a/doc/batch_settings.rst
+++ b/doc/batch_settings.rst
@@ -13,11 +13,7 @@ launching capabilities tailored for specific workload managers (WLMs). Each Smar
 `launcher` interfaces with a ``BatchSettings`` subclass specific to a system's WLM:
 
 - The Slurm `launcher` supports:
-  - :ref:`SbatchSettings`
-- The PBS Pro `launcher` supports:
-  - :ref:`QsubBatchSettings`
-- The LSF `launcher` supports:
-  - :ref:`BsubBatchSettings`
+  - :ref:`LaunchSettings`
 
 .. note::
     The local `launcher` does not support batch jobs.
diff --git a/doc/experiment.rst b/doc/experiment.rst
index 716df12282..b8c2b7484a 100644
--- a/doc/experiment.rst
+++ b/doc/experiment.rst
@@ -27,8 +27,8 @@ Settings are given to ``Model`` and ``Ensemble`` objects to provide parameters f
 Once a workflow component is initialized (e.g. ``Orchestrator``, ``Model`` or ``Ensemble``), a user has access to
 the associated entity API which supports configuring and retrieving the entities' information:
 
-* :ref:`Orchestrator API`
-* :ref:`Model API`
+* :ref:`FeatureStore API`
+* :ref:`Application API`
 * :ref:`Ensemble API`
 
 There is no limit to the number of SmartSim entities a user can
@@ -103,10 +103,10 @@ associated ``Experiment.create_...`` factory method shown below.
      - Return Type
    * - ``create_database``
      - ``orch = exp.create_database([port, db_nodes, ...])``
-     - :ref:`Orchestrator `
+     - :ref:`FeatureStore `
    * - ``create_model``
      - ``model = exp.create_model(name, run_settings)``
-     - :ref:`Model `
+     - :ref:`Application `
    * - ``create_ensemble``
      - ``ensemble = exp.create_ensemble(name[, params, ...])``
      - :ref:`Ensemble `
diff --git a/doc/run_settings.rst b/doc/run_settings.rst
index ed12df8cbe..842b85aa37 100644
--- a/doc/run_settings.rst
+++ b/doc/run_settings.rst
@@ -122,12 +122,12 @@ for each job scheduler.
 
     .. group-tab:: Slurm
 
-        The Slurm `launcher` supports the :ref:`SrunSettings API ` as well as the :ref:`MpirunSettings API `,
-        :ref:`MpiexecSettings API ` and :ref:`OrterunSettings API ` that each can be used to run executables
-        with launch binaries like `"srun"`, `"mpirun"`, `"mpiexec"` and `"orterun"`. Below we step through initializing a ``SrunSettings`` and ``MpirunSettings``
+        The Slurm `launcher` supports the :ref:`LaunchSettings API ` that can be
+        used to run executables with launch binaries like `"srun"`, `"mpirun"`, `"mpiexec"`
+        and `"orterun"`. Below we step through initializing a ``LaunchSettings``
         instance on a Slurm based machine using the associated `run_command`.
 
-        **SrunSettings**
+        **LaunchSettings**
 
         Run a job with the `srun` command on a Slurm based system. Any arguments passed in
         the `run_args` dict will be converted into `srun` arguments and prefixed with `"--"`.
@@ -151,165 +151,6 @@ for each job scheduler.
             # Set the number of tasks for this job
             run_settings.set_tasks_per_node(25)
 
-        **MpirunSettings**
-
-        Run a job with the `mpirun` command (MPI-standard) on a Slurm based system. Any
-        arguments passed in the `run_args` dict will be converted into `mpirun` arguments
-        and prefixed with `"--"`. Values of `None` can be provided for arguments that do
-        not have values.
-
-        .. code-block:: python
-
-            from smartsim import Experiment
-
-            # Initialize the Experiment and provide launcher Slurm
-            exp = Experiment("name-of-experiment", launcher="slurm")
-
-            # Initialize a MpirunSettings object
-            run_settings = exp.create_run_settings(exe="echo", exe_args="Hello World", run_command="mpirun")
-            # Set the number of cpus to use per task
-            run_settings.set_cpus_per_task(2)
-            # Set the number of tasks for this job
-            run_settings.set_tasks(100)
-            # Set the number of tasks for this job
-            run_settings.set_tasks_per_node(25)
-
-        Users may replace `mpirun` with `mpiexec` or `orterun`.
-
-
-        .. note::
-            SmartSim will look for an allocation by accessing the associated WLM job ID environment variable. If an allocation
-            is present, the entity will be launched on the reserved compute resources. A user may also specify the allocation ID
-            when initializing a run settings object via the `alloc` argument. If an allocation is specified, the entity receiving
-            these run parameters will launch on that allocation.
-
-    .. group-tab:: PBS Pro
-        The PBS Pro `launcher` supports the :ref:`AprunSettings API ` as well as the :ref:`MpirunSettings API `,
-        :ref:`MpiexecSettings API ` and :ref:`OrterunSettings API ` that each can be used to run executables
-        with launch binaries like `"aprun"`, `"mpirun"`, `"mpiexec"` and `"orterun"`. Below we step through initializing a ``AprunSettings`` and ``MpirunSettings``
-        instance on a PBS Pro based machine using the associated `run_command`.
-
-        **AprunSettings**
-
-        Run a job with `aprun` command on a PBS Pro based system. Any arguments passed in
-        the `run_args` dict will be converted into `aprun` arguments and prefixed with `--`.
-        Values of `None` can be provided for arguments that do not have values.
-
-        .. code-block:: python
-
-            from smartsim import Experiment
-
-            # Initialize the experiment and provide launcher PBS Pro
-            exp = Experiment("name-of-experiment", launcher="pbs")
-
-            # Initialize a AprunSettings object
-            run_settings = exp.create_run_settings(exe="echo", exe_args="Hello World", run_command="aprun")
-            # Set the number of cpus to use per task
-            run_settings.set_cpus_per_task(2)
-            # Set the number of tasks for this job
-            run_settings.set_tasks(100)
-            # Set the number of tasks for this job
-            run_settings.set_tasks_per_node(25)
-
-        **MpirunSettings**
-
-        Run a job with `mpirun` command on a PBS Pro based system. Any arguments passed
-        in the `run_args` dict will be converted into `mpirun` arguments and prefixed with `--`.
-        Values of `None` can be provided for arguments that do not have values.
-
-        .. code-block:: python
-
-            from smartsim import Experiment
-
-            # Initialize the experiment and provide launcher PBS Pro
-            exp = Experiment("name-of-experiment", launcher="pbs")
-
-            # Initialize a MpirunSettings object
-            run_settings = exp.create_run_settings(exe="echo", exe_args="Hello World", run_command="mpirun")
-            # Set the number of cpus to use per task
-            run_settings.set_cpus_per_task(2)
-            # Set the number of tasks for this job
-            run_settings.set_tasks(100)
-            # Set the number of tasks for this job
-            run_settings.set_tasks_per_node(25)
-
-        Users may replace `mpirun` with `mpiexec` or `orterun`.
-
-    .. group-tab:: PALS
-        The PALS `launcher` supports the :ref:`MpiexecSettings API ` that can be used to run executables
-        with the `mpiexec` launch binary. Below we step through initializing a ``MpiexecSettings`` instance on a PALS
-        based machine using the associated `run_command`.
-
-        **MpiexecSettings**
-
-        Run a job with `mpiexec` command on a PALS based system. Any arguments passed in the `run_args` dict will be converted into `mpiexec` arguments and prefixed with `--`.
-        Values of `None` can be provided for arguments that do not have values.
-
-        .. code-block:: python
-
-            from smartsim import Experiment
-
-            # Initialize the experiment and provide launcher PALS
-            exp = Experiment("name-of-experiment", launcher="pals")
-
-            # Initialize a MpiexecSettings object
-            run_settings = exp.create_run_settings(exe="echo", exe_args="Hello World", run_command="mpiexec")
-            # Set the number of tasks for this job
-            run_settings.set_tasks(100)
-            # Set the number of tasks for this job
-            run_settings.set_tasks_per_node(25)
-
-    .. group-tab:: LSF
-        The LSF `launcher` supports the :ref:`JsrunSettings API ` as well as the :ref:`MpirunSettings API `,
-        :ref:`MpiexecSettings API ` and :ref:`OrterunSettings API ` that each can be used to run executables
-        with launch binaries like `"jsrun"`, `"mpirun"`, `"mpiexec"` and `"orterun"`. Below we step through initializing a ``JsrunSettings`` and ``MpirunSettings``
-        instance on a LSF based machine using the associated `run_command`.
-
-        **JsrunSettings**
-
-        Run a job with `jsrun` command on a LSF based system. Any arguments passed in the
-        `run_args` dict will be converted into `jsrun` arguments and prefixed with `--`.
-        Values of `None` can be provided for arguments that do not have values.
-
-        .. code-block:: python
-
-            from smartsim import Experiment
-
-            # Initialize the experiment and provide launcher LSF
-            exp = Experiment("name-of-experiment", launcher="lsf")
-
-            # Initialize a JsrunSettings object
-            run_settings = exp.create_run_settings(exe="echo", exe_args="Hello World", run_command="jsrun")
-            # Set the number of cpus to use per task
-            run_settings.set_cpus_per_task(2)
-            # Set the number of tasks for this job
-            run_settings.set_tasks(100)
-            # Set the number of tasks for this job
-            run_settings.set_tasks_per_node(25)
-
-        **MpirunSettings**
-
-        Run a job with `mpirun` command on a LSF based system. Any arguments passed in the
-        `run_args` dict will be converted into `mpirun` arguments and prefixed with `--`.
-        Values of `None` can be provided for arguments that do not have values.
-
-        .. code-block:: python
-
-            from smartsim import Experiment
-
-            # Initialize the experiment and provide launcher LSF
-            exp = Experiment("name-of-experiment", launcher="lsf")
-
-            # Initialize a MpirunSettings object
-            run_settings = exp.create_run_settings(exe="echo", exe_args="Hello World", run_command="mpirun")
-            # Set the number of cpus to use per task
-            run_settings.set_cpus_per_task(2)
-            # Set the number of tasks for this job
-            run_settings.set_tasks(100)
-            # Set the number of tasks for this job
-            run_settings.set_tasks_per_node(25)
-
-        Users may replace `mpirun` with `mpiexec` or `orterun`.
 
     .. group-tab:: Dragon
         The Dragon `launcher` does not need any launch binary. Below we step through initializing a ``DragonRunSettings`` instance on a Slurm-