From 2418b4179af1054e67cc62305dae6808e7387ca6 Mon Sep 17 00:00:00 2001
From: Hanna
Date: Thu, 7 Nov 2024 13:48:53 +0100
Subject: [PATCH] first version of new runner explanation

---
 doc/intro.rst | 68 +++++++++++++++++++++++----------------------------
 1 file changed, 31 insertions(+), 37 deletions(-)

diff --git a/doc/intro.rst b/doc/intro.rst
index f0588846..67b90d27 100644
--- a/doc/intro.rst
+++ b/doc/intro.rst
@@ -247,7 +247,7 @@ the reason for the contract enforced on the Optimizee constructor
 
 Note that all the (non-exploring) paramters to the `Optimizer` is passed in to its constructor through a
 :func:`~collections.namedtuple` to keep the paramters documented. For examples see :class:`.GeneticAlgorithmParameters`
-or :class:`.SimulatedAnnealingParameters`
+or :class:`.CrossEntropyParameters`
 
 The :meth:`~l2l.optimizers.optimizer.Optimizer.post_process` function:
 ----------------------------------------------------------------------
@@ -295,43 +295,28 @@ logging and recording. See the source of :file:`bin/l2l-template.py` for more de
 Execution setup
 ~~~~~~~~~~~~~~~
 
-The L2L framework works with JUBE in order to deploy the execution of the different instances of the optimizee on
-the available computational resources. This requires that the trajectory contains a parameter group called JUBE_params
+The L2L framework works together with workers from a runner class to distribute the execution of the different instances of the optimizee on
+the available computational resources. This requires that the trajectory contains a parameter group called runner_params
 which contains details for the right execution of the program.
 
-**Mandatory** steps to define the execution of the optimizees:
-1. Add a parameter group to the :obj: traj called JUBE_params using its :meth: f_add_parameter_group.
-2. Setup the execution command :attr: exec by using the trajectory :meth: f_add_parameter_to_group.
-Add parameter to group receives three parameters, which in this case should be specified as:
-group_name=JUBE_params, key="exec", val=
-This will be used to launch individual optimizees. An example of a simple call without using MPI calls
-is: "python " + os.path.join(paths.simulation_path, "run_files/run_optimizee.py"
-3. Setup the ready and working paths :attr: exec by using the trajectory :meth: f_add_parameter_to_group.
-Add parameter to group receives three parameters, which in this case should be specified as:
-group_name=JUBE_params, key="paths", val=
- should contain the root working path. An example of this path is:
-paths = Paths(name, dict(run_num='test'), root_dir_path=, suffix="-example")
-
-In order to launch simulations on a laptop or a local cluster without a scheduler, only the mandatory parameters must
-be specified. These parameters are part of the template.
-
-To launch the simulations on a cluster with a scheduler, the following optional parameters must be defined. They currently match
-slurm but this can also be adjusted to other schedulers.
-1. Name of the scheduler, :atr: "scheduler", e.g. "Slurm"
-2. Command to submit jobs to the schedulers, :atr: "submit_cmd", e.g. "sbatch"
-3. Template file for the particular scheduler, :atr: "job_file", e.g. "job.run"
-4. Number of nodes to request for each run, :atr: "nodes", e.g. "1"
-5. Requested time for the compute resources, :atr: "walltime", e.g. "00:01:00"
-6. MPI Processes per node, :atr: "ppn", e.g. "1"
-7. CPU cores per MPI process, :atr: "cpu_pp", e.g. "1"
-8. Threads per process, :atr: "threads_pp", e.g. "1"
-9. Type of emails to be sent from the scheduler, :atr: "mail_mode", e.g. "ALL"
-10. Email to notify events from the scheduler, :atr: "mail_address", e.g. "me@mymail.com"
-11. Error file for the job, :atr: "err_file", e.g. "stderr"
-12. Output file for the job, :atr: "out_file", e.g. "stdout"
-13. MPI Processes per job, :atr: "tasks_per_job", e.g. "1"
-
-See the :file: 'l2l-template-scheduler.py' for a base file with all these parameters.
+**Mandatory** steps to define the execution of the optimizees, if you do not want to use the default parameters:
+
+1. Create a dictionary that contains the runner parameters:
+   * **srun**: This is the srun command that is called when running the program in parallel to execute an individual.
+     The default is an empty string, which means that the program is only executed locally.
+   * **exec**: This is the command to execute an individual. The default is set to python.
+   * **max_workers**: The maximum number of workers must be determined by the user,
+     depending on how many compute resources were requested and how many are required per individual.
+     For example: a total of 1 node with 100 cores was requested;
+     50 cores are required for an individual, therefore a maximum of 2 workers may be used.
+     The default is set to 32 workers.
+   * **work_path**: Specifies the path for the workspace. The results of the simulation,
+     the trajectories and the logs for the individual workers are stored here. The default is set to the root_dir_path
+     of the experiment.
+   * **path_obj**: Stores the path object.
+2. Pass the dictionary to the experiment while calling **experiment.prepare_experiment(runner_params=params)**.
+
+See the :file:`bin/l2l-template.py` for a base file with all these parameters.
 
 Examples
 ********
@@ -348,12 +333,21 @@ Data postprocessing
 
 Todo...
 
+.. _checkpointing:
+
+Checkpointing
+*************
+
+Currently, checkpointing is only available for genetic algorithms.
+Here, a generation from a previous simulation can be read in and continued from there.
+For an example look at :file:`bin/l2l-fun-ga-checkpoint.py`
+
 .. _parallelization:
 
 Parallelization
 ***************
 
-We also support running different instances of the experiments on different cores and hosts using Jube.
+We also support running different instances of the experiments on different cores.
 
 .. _logging:
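
The runner dictionary described in the "Execution setup" hunk can be sketched as follows. This is a minimal illustration under stated assumptions, not the framework's own code: the key names (srun, exec, max_workers, work_path) come from the patch text, while the helper ``max_workers_for``, the ``results`` directory and all values are hypothetical. The **path_obj** entry is left out here; whether a partial dictionary falls back to the defaults for missing keys is an assumption that should be checked against :file:`bin/l2l-template.py`.

```python
import os


def max_workers_for(total_cores: int, cores_per_individual: int) -> int:
    """Hypothetical helper: number of individuals that fit on the allocation at once."""
    if total_cores <= 0 or cores_per_individual <= 0:
        raise ValueError("core counts must be positive")
    # Floor division, but never fewer than one worker.
    return max(1, total_cores // cores_per_individual)


# Key names follow the patch text; the values are illustrative assumptions.
runner_params = {
    "srun": "",        # empty string: individuals are executed locally
    "exec": "python",  # command used to execute an individual
    # 1 node with 100 cores, 50 cores per individual -> 2 workers
    "max_workers": max_workers_for(total_cores=100, cores_per_individual=50),
    "work_path": os.path.join(os.getcwd(), "results"),  # assumed workspace location
}

# According to the patch text, the dictionary is then handed over with
# experiment.prepare_experiment(runner_params=runner_params)
```

Deriving **max_workers** from the requested allocation, rather than hard-coding it, keeps the value consistent when the job size changes.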