mednet.engine.callbacks
Lightning callbacks to log custom measurements.
Classes
LoggingCallback – Callback to log various training metrics and device information.
- class mednet.engine.callbacks.LoggingCallback(resource_monitor)
Bases: Callback
Callback to log various training metrics and device information.
Rationale:
Losses are logged at the end of every batch, accumulated and handled by the lightning framework.
Everything else is done at the end of a training or validation epoch and mostly concerns runtime metrics such as memory and CPU/GPU utilisation.
- Parameters:
resource_monitor (ResourceMonitor) – A monitor that watches resource usage (CPU/GPU) in a separate process, fully asynchronously from code execution.
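The split described in the rationale (minimal work per batch, heavier work once per epoch) can be sketched without Lightning at all. The class below is a hypothetical stand-in, not mednet's `LoggingCallback`: the name `EpochAggregator`, its two methods, and the use of plain floats as losses are assumptions made purely for illustration.

```python
from statistics import mean


class EpochAggregator:
    """Hypothetical stand-in illustrating batch-level vs epoch-level work."""

    def __init__(self) -> None:
        self._batch_losses: list[float] = []
        self.epoch_means: list[float] = []

    def on_batch_end(self, loss: float) -> None:
        # Per-batch work stays minimal: store one number, no I/O.
        self._batch_losses.append(loss)

    def on_epoch_end(self) -> float:
        # Per-epoch work may be heavier: aggregate, then reset the buffer.
        epoch_mean = mean(self._batch_losses)
        self.epoch_means.append(epoch_mean)
        self._batch_losses.clear()
        return epoch_mean


aggregator = EpochAggregator()
for loss in (0.9, 0.7, 0.5):
    aggregator.on_batch_end(loss)
epoch_mean = aggregator.on_epoch_end()
```

Resetting the buffer at epoch end keeps memory use bounded by the number of batches in a single epoch.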
- on_train_start(trainer, pl_module)
Execute actions when training starts (lightning callback).
This method is executed whenever you start training a module.
- Parameters:
trainer (Trainer) – The Lightning trainer object.
pl_module (LightningModule) – The lightning module that is being trained.
- on_train_end(trainer, pl_module)
Execute actions when training ends (lightning callback).
This method is executed whenever you end training a module.
- Parameters:
trainer (Trainer) – The Lightning trainer object.
pl_module (LightningModule) – The lightning module that is being trained.
- on_train_epoch_start(trainer, pl_module)
Execute actions when a training epoch starts (lightning callback).
This method is executed whenever a training epoch starts. Presumably, epochs happen as often as possible. You want to make this code very fast. Do not log things to the terminal or the like, or do complicated (lengthy) calculations.
Warning
This is executed while you are training. Be very succinct or face the consequences of slow training!
- Parameters:
trainer (Trainer) – The Lightning trainer object.
pl_module (LightningModule) – The lightning module that is being trained.
- Return type:
None
- on_train_epoch_end(trainer, pl_module)
Execute actions after a training epoch ends (lightning callback).
This method is executed whenever a training epoch ends. Presumably, epochs happen as often as possible. You want to make this code relatively fast to avoid significant runtime slow-downs.
- Parameters:
trainer (Trainer) – The Lightning trainer object.
pl_module (LightningModule) – The lightning module that is being trained.
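As a rough illustration of how paired epoch-start and epoch-end hooks can cooperate, the sketch below records wall-clock time per epoch. It is a stand-alone stand-in, not mednet's implementation: `EpochTimer` and its attributes are hypothetical, the `trainer` and `pl_module` arguments are omitted for brevity, and whether `LoggingCallback` measures epoch duration this way is an assumption.

```python
import time


class EpochTimer:
    """Hypothetical stand-in pairing epoch-start and epoch-end hooks."""

    def __init__(self) -> None:
        self._start: float | None = None
        self.durations: list[float] = []

    def on_train_epoch_start(self) -> None:
        # Only take a timestamp here: the hook must return quickly.
        self._start = time.perf_counter()

    def on_train_epoch_end(self) -> None:
        if self._start is None:
            raise RuntimeError("epoch ended before it was started")
        # Elapsed wall-clock seconds for the epoch that just finished.
        self.durations.append(time.perf_counter() - self._start)
        self._start = None


timer = EpochTimer()
for _ in range(2):
    timer.on_train_epoch_start()
    sum(range(1000))  # placeholder for the actual training work
    timer.on_train_epoch_end()
```

`time.perf_counter()` is used because it is monotonic, so the measured duration cannot become negative if the system clock is adjusted mid-epoch.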
- on_train_batch_end(trainer, pl_module, outputs, batch, batch_idx)
Execute actions after a training batch ends (lightning callback).
This method is executed whenever a training batch ends. Presumably, batches happen as often as possible. You want to make this code very fast. Do not log things to the terminal or the like, or do complicated (lengthy) calculations.
Warning
This is executed while you are training. Be very succinct or face the consequences of slow training!
- Parameters:
trainer (Trainer) – The Lightning trainer object.
pl_module (LightningModule) – The lightning module that is being trained.
outputs (Mapping[str, Tensor]) – The outputs of the module’s training_step.
batch (Mapping[str, Any]) – The data that the training step received.
batch_idx (int) – The relative number of the batch.
- Return type:
None
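One way to check that a batch-end hook stays cheap is to have it track its own cumulative cost. The following is a minimal sketch under stated assumptions: `TimedBatchHook` is a hypothetical stand-in rather than part of mednet, plain floats replace tensors, the `trainer` and `pl_module` arguments are dropped, and the `"loss"` key is assumed to be present in `outputs`.

```python
import time
from collections.abc import Mapping
from typing import Any


class TimedBatchHook:
    """Hypothetical stand-in: a batch-end hook that tracks its own cost."""

    def __init__(self) -> None:
        self.losses: list[float] = []
        self.hook_seconds = 0.0  # cumulative time spent inside the hook

    def on_train_batch_end(
        self,
        outputs: Mapping[str, float],
        batch: Mapping[str, Any],
        batch_idx: int,
    ) -> None:
        start = time.perf_counter()
        # Cheap work only: read one value and store it, no printing.
        self.losses.append(float(outputs["loss"]))
        self.hook_seconds += time.perf_counter() - start


hook = TimedBatchHook()
for batch_idx, loss in enumerate((0.8, 0.6, 0.4)):
    hook.on_train_batch_end({"loss": loss}, {"image": None}, batch_idx)
```

Comparing `hook_seconds` against the total training time after a run gives a quick estimate of how much overhead the hook adds.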
- on_validation_epoch_start(trainer, pl_module)
Execute actions when a validation epoch starts (lightning callback).
This method is executed whenever a validation epoch starts. Presumably, epochs happen as often as possible. You want to make this code very fast. Do not log things to the terminal or the like, or do complicated (lengthy) calculations.
Warning
This is executed while you are training. Be very succinct or face the consequences of slow training!
- Parameters:
trainer (Trainer) – The Lightning trainer object.
pl_module (LightningModule) – The lightning module that is being trained.
- Return type:
None
- on_validation_epoch_end(trainer, pl_module)
Execute actions after a validation epoch ends (lightning callback).
This method is executed whenever a validation epoch ends. Presumably, epochs happen as often as possible. You want to make this code relatively fast to avoid significant runtime slow-downs.
- Parameters:
trainer (Trainer) – The Lightning trainer object.
pl_module (LightningModule) – The lightning module that is being trained.
- Return type:
None
- on_validation_batch_end(trainer, pl_module, outputs, batch, batch_idx, dataloader_idx=0)
Execute actions after a validation batch ends (lightning callback).
This method is executed whenever a validation batch ends. Presumably, batches happen as often as possible. You want to make this code very fast. Do not log things to the terminal or the like, or do complicated (lengthy) calculations.
Warning
This is executed while you are training. Be very succinct or face the consequences of slow training!
- Parameters:
trainer (Trainer) – The Lightning trainer object.
pl_module (LightningModule) – The lightning module that is being trained.
outputs (Tensor) – The outputs of the module’s validation_step.
batch (Mapping[str, Any]) – The data that the validation step received.
batch_idx (int) – The relative number of the batch.
dataloader_idx (int) – Index of the dataloader used during validation. Use this to figure out which dataset was used for this validation batch.
- Return type:
None
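The role of `dataloader_idx` is easiest to see when several validation sets are in play. The sketch below groups per-batch losses by dataloader index and summarises them at epoch end. It is a hypothetical stand-in: `PerLoaderValidation` is not part of mednet, losses are plain floats rather than tensors, and the method signatures are simplified for illustration.

```python
from collections import defaultdict
from statistics import mean


class PerLoaderValidation:
    """Hypothetical stand-in grouping validation losses by dataloader."""

    def __init__(self) -> None:
        self._losses: dict[int, list[float]] = defaultdict(list)

    def on_validation_batch_end(
        self, loss: float, batch_idx: int, dataloader_idx: int = 0
    ) -> None:
        # Keep each dataset's losses apart using the dataloader index.
        self._losses[dataloader_idx].append(loss)

    def on_validation_epoch_end(self) -> dict[int, float]:
        # One mean loss per dataloader, then reset for the next epoch.
        summary = {idx: mean(values) for idx, values in self._losses.items()}
        self._losses.clear()
        return summary


tracker = PerLoaderValidation()
tracker.on_validation_batch_end(0.5, batch_idx=0, dataloader_idx=0)
tracker.on_validation_batch_end(0.3, batch_idx=1, dataloader_idx=0)
tracker.on_validation_batch_end(1.0, batch_idx=0, dataloader_idx=1)
summary = tracker.on_validation_epoch_end()
```

Keying on `dataloader_idx` avoids mixing losses from different validation sets into a single average.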