mindcv.optim
optim init
- mindcv.optim.create_optimizer(params, opt='adam', lr=0.001, weight_decay=0, momentum=0.9, nesterov=False, filter_bias_and_bn=True, loss_scale=1.0, schedule_decay=0.004, checkpoint_path='', eps=1e-10, **kwargs)
Creates an optimizer by name.
- Parameters
params (Union[list[Parameter], list[dict]]) – network parameters, given either as a list of parameters or as a list of dicts. When the list elements are dicts, the allowed keys are “params”, “lr”, “weight_decay”, “grad_centralization” and “order_params”.
opt (str) – wrapped optimizer. Options include ‘sgd’, ‘nesterov’, ‘momentum’, ‘adam’, ‘adamw’, ‘rmsprop’, ‘adagrad’ and ‘lamb’. ‘adam’ is the default choice for convolution-based networks; ‘adamw’ is recommended for ViT-based networks. Default: ‘adam’.
lr (Optional[float]) – learning rate, given as a float or an lr scheduler. Both fixed and dynamic learning rates are supported. Default: 1e-3.
weight_decay (float) – weight decay factor. It can be a constant value or a Cell; it is a Cell only when dynamic weight decay is applied. Dynamic weight decay works like a dynamic learning rate: the user defines a weight decay schedule that takes only the global step as input, and during training the optimizer calls the WeightDecaySchedule instance to get the weight decay value for the current step. Default: 0.
momentum (float) – momentum, if the optimizer supports it. Default: 0.9.
nesterov (bool) – whether to use the Nesterov Accelerated Gradient (NAG) algorithm to update the gradients. Default: False.
filter_bias_and_bn (bool) – whether to exclude batch norm parameters and biases from weight decay. If True, weight decay is not applied to BN parameters or to the bias in Conv or Dense layers. Default: True.
loss_scale (float) – A floating point value for the loss scale, which must be larger than 0.0. Default: 1.0.
schedule_decay (float) – Default: 0.004.
checkpoint_path (str) – Default: ‘’.
eps (float) – Default: 1e-10.
- Returns
Optimizer object
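
Two of the behaviours described in the parameter list can be illustrated without the framework. The sketch below is not the MindCV implementation: `Param`, `group_params` and `weight_decay_at` are hypothetical names, and matching BN parameters and biases by the substrings "gamma", "beta" and "bias" is an assumption made for illustration. It shows how parameters might be split into the dict groups that `params` accepts when `filter_bias_and_bn` is True, and what a weight decay schedule that depends only on the global step could look like.

```python
from dataclasses import dataclass


@dataclass
class Param:
    """Hypothetical stand-in for a framework parameter; only the name is used."""
    name: str


def group_params(params, weight_decay):
    """Split parameters into a decayed group and a non-decayed group.

    Assumption: BN scale/shift parameters contain 'gamma'/'beta' in their
    names and biases contain 'bias'. Real networks may use other names.
    """
    no_decay_keys = ("gamma", "beta", "bias")
    decay, no_decay = [], []
    for p in params:
        if any(key in p.name for key in no_decay_keys):
            no_decay.append(p)
        else:
            decay.append(p)
    # Same dict keys as listed for `params`: "params", "weight_decay", "order_params".
    return [
        {"params": decay, "weight_decay": weight_decay},
        {"params": no_decay, "weight_decay": 0.0},
        {"order_params": params},
    ]


def weight_decay_at(step, base=0.05, total_steps=1000, floor=0.0):
    """Illustrative dynamic weight decay: a linear ramp from `base` to `floor`.

    The only input that changes during training is the global step.
    """
    frac = min(max(step / total_steps, 0.0), 1.0)  # clamp progress to [0, 1]
    return base + (floor - base) * frac


# Example: five parameters, two of which receive weight decay.
example = [
    Param("conv1.weight"),
    Param("bn1.gamma"),
    Param("bn1.beta"),
    Param("fc.weight"),
    Param("fc.bias"),
]
groups = group_params(example, weight_decay=1e-4)
```

In an actual training script the grouped list would be passed as `params`, and the schedule would be wrapped in whatever Cell type the framework expects for dynamic weight decay; both steps are outside the scope of this sketch.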