General
MAML is a meta-learning method that learns a model's parameters such that a small number of gradient updates leads to fast learning on a new task.
Properties
- MAML does not expand the number of learned parameters
- MAML places no constraints on the model architecture
- MAML can be combined with common deep-learning architectures such as RNNs, CNNs, and MLPs
Problem Setup
Single Task
Model: $f_\theta$ with parameters $\theta$
Dataset: $\mathcal{D} = \{(x^{(j)}, y^{(j)})\}$
Goal: $\min_\theta \mathcal{L}(f_\theta, \mathcal{D})$
Task: $\mathcal{T} = \{\mathcal{L}(x_1, a_1, \ldots, x_H, a_H),\, q(x_1),\, q(x_{t+1} \mid x_t, a_t),\, H\}$, where $q$ is the distribution that generates the data for the task, whether in supervised learning (regression or classification) or reinforcement learning.
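As a concrete illustration of this setup, here is a minimal JAX sketch with a linear model standing in for $f_\theta$ and a mean-squared-error loss; the specific model, data, and step size are illustrative assumptions, not choices made by the paper.

```python
import jax
import jax.numpy as jnp

# Model f_theta: a linear map standing in for an arbitrary architecture.
def f(theta, x):
    w, b = theta
    return x @ w + b

# L(f_theta, D): mean squared error over the dataset D = {(x, y)}.
def loss(theta, dataset):
    x, y = dataset
    return jnp.mean((f(theta, x) - y) ** 2)

# Toy dataset D and initial parameters theta (illustrative only).
key = jax.random.PRNGKey(0)
x = jax.random.normal(key, (32, 3))
dataset = (x, x @ jnp.ones((3, 1)))
theta = (jnp.zeros((3, 1)), jnp.zeros(()))

# Goal: min_theta L(f_theta, D), here via plain gradient descent.
for _ in range(100):
    grads = jax.grad(loss)(theta, dataset)
    theta = jax.tree_util.tree_map(lambda p, g: p - 0.1 * g, theta, grads)
```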
Multi Task
Model: $f_{\theta_i'}$, where $i$ is the task index
Dataset: $\mathcal{D}_i$ drawn from task $\mathcal{T}_i \sim p(\mathcal{T})$
Goal: $\min_\theta \sum_{\mathcal{T}_i \sim p(\mathcal{T})} \mathcal{L}_{\mathcal{T}_i}(f_{\theta_i'})$
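The paper's sinusoid-regression experiments give a concrete instance of $p(\mathcal{T})$: each task is a sine wave with amplitude sampled from $[0.1, 5.0]$ and phase from $[0, \pi]$, with inputs drawn uniformly from $[-5.0, 5.0]$. A sketch of sampling a task $\mathcal{T}_i$ and its dataset $\mathcal{D}_i$ (function names are illustrative):

```python
import jax
import jax.numpy as jnp

# T_i ~ p(T): sample a sine wave's amplitude and phase.
def sample_task(key):
    k1, k2 = jax.random.split(key)
    amplitude = jax.random.uniform(k1, minval=0.1, maxval=5.0)
    phase = jax.random.uniform(k2, minval=0.0, maxval=jnp.pi)
    return amplitude, phase

# D_i: n input/output pairs generated by task T_i, with x ~ U[-5, 5].
def sample_dataset(key, task, n=10):
    amplitude, phase = task
    x = jax.random.uniform(key, (n, 1), minval=-5.0, maxval=5.0)
    y = amplitude * jnp.sin(x - phase)
    return x, y

key = jax.random.PRNGKey(0)
task_key, data_key = jax.random.split(key)
task = sample_task(task_key)                # one task from p(T)
x_i, y_i = sample_dataset(data_key, task)   # its dataset D_i
```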
Example
Assume we have a model $f_\theta$ with parameters $\theta$. When adapting to a new task $\mathcal{T}_i$, the parameters $\theta$ become $\theta_i'$. The new parameter vector $\theta_i'$ is computed using one or more gradient descent updates on task $\mathcal{T}_i$. One gradient update is thus:

$$\theta_i' = \theta - \alpha \nabla_\theta \mathcal{L}_{\mathcal{T}_i}(f_\theta)$$
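A minimal sketch of this inner update in JAX, assuming an MSE loss and a linear stand-in model (both illustrative choices; MAML itself fixes neither):

```python
import jax
import jax.numpy as jnp

# L_{T_i}(f_theta): MSE of a linear stand-in model on task T_i's data.
def loss(theta, x, y):
    return jnp.mean((x @ theta - y) ** 2)

# One inner step: theta_i' = theta - alpha * grad_theta L_{T_i}(f_theta).
def inner_update(theta, x, y, alpha=0.01):
    return theta - alpha * jax.grad(loss)(theta, x, y)

theta = jnp.zeros((3, 1))
x, y = jnp.ones((4, 3)), jnp.ones((4, 1))
theta_prime = inner_update(theta, x, y)  # adapted parameters theta_i'
```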
The step size $\alpha$ can be learned or set as a fixed hyperparameter. The model parameters are trained by optimizing the performance of $f_{\theta_i'}$ with respect to $\theta$ across the tasks sampled from $p(\mathcal{T})$. The meta-objective is thus:

$$\min_\theta \sum_{\mathcal{T}_i \sim p(\mathcal{T})} \mathcal{L}_{\mathcal{T}_i}(f_{\theta_i'}) = \min_\theta \sum_{\mathcal{T}_i \sim p(\mathcal{T})} \mathcal{L}_{\mathcal{T}_i}\big(f_{\theta - \alpha \nabla_\theta \mathcal{L}_{\mathcal{T}_i}(f_\theta)}\big)$$
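Expressed as code, the per-task term of this objective evaluates the loss at the adapted parameters $\theta_i'$ while remaining a function of $\theta$. This sketch assumes each task provides separate adaptation and evaluation samples, as in the paper's supervised setting; the linear model and MSE loss are stand-in assumptions:

```python
import jax
import jax.numpy as jnp

# L_{T_i}(f_theta): MSE of a linear stand-in model.
def loss(theta, x, y):
    return jnp.mean((x @ theta - y) ** 2)

# Per-task meta-objective: adapt theta on the task's adaptation samples,
# then evaluate the *adapted* parameters theta_i' on fresh samples.
# The result is still a function of theta, which is what gets optimized.
def meta_loss(theta, x_tr, y_tr, x_te, y_te, alpha=0.01):
    theta_prime = theta - alpha * jax.grad(loss)(theta, x_tr, y_tr)
    return loss(theta_prime, x_te, y_te)
```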
In meta-optimization, we optimize the model parameters $\theta$, whereas the objective is computed using the updated model parameters $\theta_i'$. MAML aims to optimize the model parameters such that one or a small number of gradient steps on a new task produces maximally effective behavior on that task. The meta-optimization across tasks is performed via stochastic gradient descent (SGD), so the model parameters $\theta$ are updated as follows:

$$\theta \leftarrow \theta - \beta \nabla_\theta \sum_{\mathcal{T}_i \sim p(\mathcal{T})} \mathcal{L}_{\mathcal{T}_i}(f_{\theta_i'})$$

where $\beta$ is the meta step size.
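Putting the pieces together, one meta-optimization step can be sketched as follows. JAX is convenient here because `jax.grad` composes, so differentiating through the inner update (including the second-order term) happens automatically. The model, loss, and batch layout are illustrative assumptions:

```python
import jax
import jax.numpy as jnp

# L_{T_i}(f_theta): MSE of a linear stand-in model.
def loss(theta, x, y):
    return jnp.mean((x @ theta - y) ** 2)

# Meta-objective summed over a batch of tasks. batch is assumed to be
# (x_tr, y_tr, x_te, y_te), each array with a leading task dimension.
def meta_loss(theta, batch, alpha=0.01):
    def per_task(x_tr, y_tr, x_te, y_te):
        # Inner update on task T_i, then loss of the adapted parameters.
        theta_prime = theta - alpha * jax.grad(loss)(theta, x_tr, y_tr)
        return loss(theta_prime, x_te, y_te)
    return jnp.sum(jax.vmap(per_task)(*batch))

# One SGD meta-update:
# theta <- theta - beta * grad_theta sum_i L_{T_i}(f_{theta_i'}).
def meta_step(theta, batch, beta=0.001):
    return theta - beta * jax.grad(meta_loss)(theta, batch)
```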
Source: paperswithcode.com, https://arxiv.org/abs/1703.03400v3