The Question of Distillation from Weak Teachers
During my PhD, I worked on the problem of Distillation from Weak Teachers. While model distillation (or knowledge distillation) is a well-known concept in th...
The EM algorithm is a versatile technique for performing Maximum Likelihood Estimation (MLE) under hidden variables. In this post, we will go over the Expect...