mindcv.optim

optim init

mindcv.optim.create_optimizer(params, opt='adam', lr=0.001, weight_decay=0, momentum=0.9, nesterov=False, filter_bias_and_bn=True, loss_scale=1.0, schedule_decay=0.004, checkpoint_path='', eps=1e-10, **kwargs)

Creates an optimizer by name.

Parameters
  • params (Union[list[Parameter], list[dict]]) – network parameters, given either as a list of parameters or as a list of dicts. When the list elements are dicts, the allowed keys are “params”, “lr”, “weight_decay”, “grad_centralization” and “order_params”.

  • opt (str) – name of the wrapped optimizer. Supported choices include ‘sgd’, ‘nesterov’, ‘momentum’, ‘adam’, ‘adamw’, ‘rmsprop’, ‘adagrad’ and ‘lamb’. ‘adam’ is the default choice for convolution-based networks; ‘adamw’ is recommended for ViT-based networks. Default: ‘adam’.

  • lr (Optional[float]) – learning rate, given as a float or as a learning rate scheduler. Both fixed and dynamic learning rates are supported. Default: 1e-3.

  • weight_decay (float) – weight decay factor. It can be a constant value or a Cell; it is a Cell only when dynamic weight decay is applied. Dynamic weight decay works like a dynamic learning rate: the user defines a weight decay schedule that takes only the global step as input, and during training the optimizer calls the WeightDecaySchedule instance to get the weight decay value for the current step. Default: 0.

  • momentum (float) – momentum, used only if the optimizer supports it. Default: 0.9.

  • nesterov (bool) – Whether to use Nesterov Accelerated Gradient (NAG) algorithm to update the gradients. Default: False.

  • filter_bias_and_bn (bool) – whether to exclude batch norm parameters and biases from weight decay. If True, weight decay is not applied to BN parameters or to the biases of Conv and Dense layers. Default: True.

  • loss_scale (float) – A floating point value for the loss scale, which must be larger than 0.0. Default: 1.0.

  • schedule_decay (float) –

  • checkpoint_path (str) –

  • eps (float) –
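
The list-of-dicts form of params and the effect of filter_bias_and_bn can be illustrated with a minimal sketch in plain Python. It is not MindCV's implementation: parameters are represented by their names as strings, the helper group_params is hypothetical, and the name-based rule (treating names containing "bias", "beta" or "gamma" as bias or BN parameters) is an assumption made for illustration.

```python
# Illustrative sketch only: parameters are plain name strings, and the
# name-based rule below is an assumption, not MindCV's actual logic.
NO_DECAY_KEYWORDS = ("bias", "beta", "gamma")  # assumed naming convention


def group_params(param_names, weight_decay):
    """Split parameters into a decayed group and a non-decayed group."""
    decay, no_decay = [], []
    for name in param_names:
        if any(key in name for key in NO_DECAY_KEYWORDS):
            no_decay.append(name)  # bias / BN parameters: no weight decay
        else:
            decay.append(name)     # all other weights: decay applied
    # Each dict uses the "params" and "weight_decay" keys described above.
    return [
        {"params": decay, "weight_decay": weight_decay},
        {"params": no_decay, "weight_decay": 0.0},
    ]


names = ["conv1.weight", "conv1.bias", "bn1.gamma", "bn1.beta", "fc.weight"]
groups = group_params(names, weight_decay=1e-4)
for group in groups:
    print(group["weight_decay"], group["params"])
```

In a real network the groups would hold Parameter objects rather than strings, and the resulting list could be passed as params.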

Returns

Optimizer object
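
The dynamic weight decay described for weight_decay amounts to a callable that maps the global step to a decay value. The sketch below shows that idea in plain Python under stated assumptions: the class name CosineWeightDecay, its arguments and the cosine shape are all hypothetical, and in MindSpore the schedule would be written as a Cell rather than a plain class.

```python
import math


class CosineWeightDecay:
    """Hypothetical schedule: anneal weight decay from max_wd to min_wd.

    Plain-Python stand-in for a weight decay schedule; it takes only the
    global step as input, as the parameter description requires.
    """

    def __init__(self, min_wd, max_wd, total_steps):
        self.min_wd = min_wd
        self.max_wd = max_wd
        self.total_steps = total_steps

    def __call__(self, global_step):
        # Clamp the step so the value stays at min_wd after total_steps.
        step = min(max(global_step, 0), self.total_steps)
        cosine = 0.5 * (1.0 + math.cos(math.pi * step / self.total_steps))
        return self.min_wd + (self.max_wd - self.min_wd) * cosine


schedule = CosineWeightDecay(min_wd=1e-5, max_wd=1e-4, total_steps=1000)
for step in (0, 500, 1000):
    print(step, schedule(step))
```

The optimizer would query such a schedule once per training step, so the callable should be cheap and depend on nothing but the step counter.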