On Learning Under Dataset Noise

I am extremely curious about how data, or experience, drives a model to learn imperfect but useful prediction rules. One aspect of this learning process is the noise in the data from which the model learns. On the last day of the second decade of the 21st century, I would like to summarize some papers I have encountered during my daily browsing.

Note that learning under noise is closely related to topics such as curriculum learning, data augmentation, data selection, and active learning, which may or may not be covered in this post; I hope to write something about them one day.

I want to write this summary because of one particular paper (and its accompanying blog post):

This is a method paper aimed at empirical improvement. Its basic ideas are:

  1. Dataset pruning
  2. Example ranking
  3. Confidence-weighted training

which, when done properly, is in my view the best practice for learning under noise.
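As a concrete illustration (not the paper's actual implementation), here is a minimal sketch of that pipeline, assuming a per-example confidence score is already available from some upstream estimate; the names `prune_and_weight`, `confidences`, and `prune_fraction` are hypothetical.

```python
import numpy as np
import torch
import torch.nn.functional as F

def prune_and_weight(confidences, prune_fraction=0.1):
    """Rank examples by a per-example confidence score, drop the lowest
    fraction, and return the surviving indices plus their weights.

    `confidences` is assumed to come from an upstream uncertainty estimate."""
    confidences = np.asarray(confidences)
    order = np.argsort(confidences)            # noisiest examples first
    n_prune = int(len(order) * prune_fraction)
    keep = order[n_prune:]                     # pruning + ranking
    return keep, confidences[keep]             # weights for training

def confidence_weighted_loss(logits, targets, weights):
    """Cross-entropy where each example's contribution is scaled by its
    confidence, so suspected label noise pulls less on the gradients."""
    per_example = F.cross_entropy(logits, targets, reduction="none")
    weights = torch.as_tensor(weights, dtype=per_example.dtype,
                              device=per_example.device)
    return (per_example * weights).mean()
```

The design choice here is simply that pruning removes the worst examples outright, while the remaining ones are still discounted in proportion to how much we trust their labels.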

One thing that really interests me is their so-called model-agnostic dataset uncertainty estimation method.
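I do not yet know the details of their estimator, so the following is only my guess at what a model-agnostic estimate could look like: train several models (or several cross-validation folds), average their predicted class probabilities per example, and use the predictive entropy of that average as the uncertainty score. The function name `ensemble_uncertainty` and the array layout are my own assumptions.

```python
import numpy as np

def ensemble_uncertainty(prob_matrix):
    """Per-example uncertainty from the predictions of several models.

    `prob_matrix` has shape (n_models, n_examples, n_classes): class
    probabilities from models trained independently or on different
    cross-validation folds. Returns the predictive entropy of the
    averaged distribution, one score per example."""
    mean_probs = np.asarray(prob_matrix).mean(axis=0)
    return -(mean_probs * np.log(mean_probs + 1e-12)).sum(axis=1)
```

Examples whose averaged prediction is close to uniform (high entropy) would then be treated as uncertain and pruned or down-weighted first.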

Learning under noise

Uncertainty estimate

Data selection

Curriculum learning