nntool.slurm.function
Classes
- class nntool.slurm.function.SlurmFunction(submit_fn, default_submit_fn_args=None, default_submit_fn_kwargs=None)
A function wrapper for a Slurm job. It can run as a distributed or non-distributed job, controlled by use_distributed_env in the Slurm dataclass.
- is_configured()
Whether the slurm function has been configured.
- Returns:
True if the slurm function has been configured, False otherwise
- Return type:
bool
- is_distributed()
Whether the slurm function is distributed.
- Returns:
True if the slurm function is distributed, False otherwise
- Return type:
bool
- configure(slurm_config, slurm_params_kwargs=None, slurm_submit_kwargs=None, slurm_task_kwargs=None, system_argv=None, pack_code_include_fn=None, pack_code_exclude_fn=None)
Update the Slurm configuration for the Slurm function. The configured function can run as a distributed or non-distributed job (controlled by use_distributed_env in the Slurm dataclass).
Exported Distributed Environment Variables
- NNTOOL_SLURM_HAS_BEEN_SET_UP is a special environment variable indicating that Slurm has been set up.
- After setup, the distributed job is launched and the following variables are exported:
  - num_processes: int
  - num_machines: int
  - machine_rank: int
  - main_process_ip: str
  - main_process_port: int
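The exported values listed above can be gathered into one typed object before a training script uses them. The following is a minimal sketch, not part of nntool: `DistributedEnv` and `read_distributed_env` are hypothetical helpers, and the lookup keys are assumed to match the listed names verbatim (the names actually exported may carry a prefix or different casing).

```python
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class DistributedEnv:
    """Hypothetical container for the five exported values."""

    num_processes: int
    num_machines: int
    machine_rank: int
    main_process_ip: str
    main_process_port: int


def read_distributed_env(env: Mapping[str, str]) -> DistributedEnv:
    """Parse the exported values from an environment-like mapping.

    Assumption: the keys equal the documented names exactly. Adjust the
    keys if the real exported names differ.
    """
    return DistributedEnv(
        num_processes=int(env["num_processes"]),
        num_machines=int(env["num_machines"]),
        machine_rank=int(env["machine_rank"]),
        main_process_ip=env["main_process_ip"],
        main_process_port=int(env["main_process_port"]),
    )


# Example with a plain dict standing in for os.environ.
example_env = {
    "num_processes": "8",
    "num_machines": "2",
    "machine_rank": "1",
    "main_process_ip": "10.0.0.5",
    "main_process_port": "29500",
}
dist_env = read_distributed_env(example_env)
print(dist_env)
```

In a real job the mapping would typically be `os.environ`; a plain dict is used here so the sketch runs anywhere.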
- Parameters:
slurm_config (SlurmConfig) – the Slurm configuration dataclass
slurm_params_kwargs (Dict[str, str] | None) – extra slurm arguments for the slurm configuration, defaults to {}
slurm_submit_kwargs (Dict[str, str] | None) – extra slurm arguments for srun or sbatch, defaults to {}
slurm_task_kwargs (Dict[str, str] | None) – extra arguments for the setting of distributed task, defaults to {}
system_argv (List[str] | None) – the system arguments for the second launch in the distributed task; if None, the current system arguments sys.argv[1:] are used. Defaults to None
- Returns:
a new copy with configured slurm parameters
- Return type:
- submit(*submit_fn_args, **submit_fn_kwargs)
An alias for __call__.
- Parameters:
submit_fn_args – arguments for the submit_fn
submit_fn_kwargs – keyword arguments for the submit_fn
- Raises:
Exception – if the submit_fn is not set up
- Returns:
Slurm Job or the return value of the submit_fn
- Return type:
Job | Any
- map_array(*submit_fn_args, **submit_fn_kwargs)
Run the submit_fn with the given arguments and keyword arguments. In Slurm mode the call is non-blocking; in other modes it blocks. If no arguments or keyword arguments are given, the default arguments and keyword arguments are used.
- Parameters:
submit_fn_args – arguments for the submit_fn
submit_fn_kwargs – keyword arguments for the submit_fn
- Raises:
Exception – if the submit_fn is not set up
- Returns:
Slurm Job or the return value of the submit_fn
- Return type:
Job[Any] | List[Job[Any]] | Any
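The fallback to default arguments described above can be illustrated without Slurm. This is a minimal sketch under stated assumptions, not nntool's implementation: `DefaultingCall` is a hypothetical stand-in, and it assumes the defaults apply only when the caller passes neither positional nor keyword arguments.

```python
from typing import Any, Callable, Dict, Optional, Tuple


class DefaultingCall:
    """Hypothetical wrapper that mimics the documented default-argument fallback."""

    def __init__(
        self,
        fn: Callable[..., Any],
        default_args: Optional[Tuple[Any, ...]] = None,
        default_kwargs: Optional[Dict[str, Any]] = None,
    ) -> None:
        self.fn = fn
        self.default_args = tuple(default_args or ())
        self.default_kwargs = dict(default_kwargs or {})

    def __call__(self, *args: Any, **kwargs: Any) -> Any:
        # Assumption: use the stored defaults only when nothing at all is passed.
        if not args and not kwargs:
            args, kwargs = self.default_args, self.default_kwargs
        return self.fn(*args, **kwargs)


def train(lr: float, epochs: int = 1) -> str:
    """Toy function standing in for a real submit_fn."""
    return f"lr={lr} epochs={epochs}"


wrapped = DefaultingCall(train, default_args=(0.1,), default_kwargs={"epochs": 3})
with_defaults = wrapped()        # no arguments: stored defaults are used
with_override = wrapped(0.5)     # explicit arguments: defaults are ignored
print(with_defaults)
print(with_override)
```

Whether the real class merges partial arguments with the defaults is not stated in this page, so the all-or-nothing rule here is only one plausible reading.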
- on_condition(jobs, condition='afterok')
Mark this job to run only after the provided Slurm jobs have finished. Calling this method multiple times combines different conditions.
- Parameters:
jobs (Job | List[Job] | Tuple[Job]) – dependent jobs
condition (Literal['afterany', 'afterok', 'afternotok']) – run condition, defaults to “afterok”
- Returns:
the function itself
- Return type:
- afterok(*jobs)
Mark the function to run after the provided Slurm jobs have completed successfully.
- Returns:
the new slurm function with the condition
- Return type:
- afterany(*jobs)
Mark the function to run after the provided Slurm jobs have terminated, regardless of their exit status.
- Returns:
the new slurm function with the condition
- Return type:
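The conditions accepted by on_condition share their names with the dependency types of Slurm's `sbatch --dependency` option, where several conditions are joined with commas and job IDs with colons. The sketch below builds such a string; `build_dependency` is a hypothetical helper, and how nntool itself translates conditions into submission arguments is an assumption not confirmed by this page.

```python
from typing import Iterable, List, Tuple

# The three condition names documented for on_condition.
VALID_CONDITIONS = ("afterany", "afterok", "afternotok")


def build_dependency(conditions: Iterable[Tuple[str, Iterable[int]]]) -> str:
    """Join (condition, job_ids) pairs into a Slurm dependency string.

    Pairs are separated by commas, which Slurm treats as "all must hold".
    """
    parts: List[str] = []
    for condition, job_ids in conditions:
        if condition not in VALID_CONDITIONS:
            raise ValueError(f"unsupported condition: {condition!r}")
        ids = ":".join(str(job_id) for job_id in job_ids)
        if not ids:
            raise ValueError(f"no job ids given for {condition!r}")
        parts.append(f"{condition}:{ids}")
    return ",".join(parts)


# Two conditions combined, as repeated on_condition calls might do.
dependency = build_dependency([("afterok", [101, 102]), ("afterany", [205])])
print(f"--dependency={dependency}")
```

Under these assumptions, the job would start only once jobs 101 and 102 have succeeded and job 205 has terminated with any exit status.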