{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Model-based Machine Learning Project\n",
    "\n",
    "## Project description\n",
    "Everybody familiar with machine learning knows that good data is essiential for making a good model. Garbage in - garbage out. However, data is a scarce resource and is typically expensine to gather. Active learning is a method that tries to minimize the amount of data needed by choosing the most informative data points to label and thus, avoid spending time on labeling data which add no new information to the model. But to choose the most informative data point, it is common practice to use information from the model itself to decide the informativeness of a data point. So now we have chicken-or-the-egg problem!\n",
    "\n",
    "To make a good model, we should have some informative data points. But on the other hand, to get informative data points we should have a good model. \n",
    "\n",
    "The typical approach is to 1) get some initial data, 2) fit a single model, 3) use the model to find informative data points and then 4) repeat 2-3 until converge or some critiria is satisfied.(*1)\n",
    "\n",
    "One way to tackle this problem, is to use multiple models, say 3 or 10, to decide on the most informative data points. This is called Query-by-Committee (QBC) and works well in practice both for classification and regression problems. One can also see this committee as an *ensemble model*, for example multiple Gaussian Processes with a specific kernel but different (hyper)parameters. Now the question is, why restrict ourself to only consider a small amount of models, when we can go fully Bayesian?\n",
    "\n",
    "The boring answer is that a fully Bayesian approach is computationally expensive (infeasible with n>2000..). However, a fully Bayesian approach also have some clear advantages:\n",
    "* We can partly solve the problem of fitting a wrong model(*2) by considering an arbitrarily amount of models instead.\n",
    "* Avoid being overconfident on point estimates --> we have the full posterior\n",
    "* Priors on unknown parameters help with regularization of the model when we have only a small data set (MAP instead of MLE)\n",
    "\n",
    "As such, we might be able to avoid on wrong model fits and thus, also avoid labeling data points that are not really informative given the true parameters of the model. This is what I will investigate in this project.\n",
    "\n",
    "*1: This is of course a simplied description. Using the model to find informative data points is called a *model-based* sampling strategy but you could also use a *model-free* sampling strategy, e.g. maximin distance to existing data point, so you don't have the problem of relying on the model. However, in practice the model-based strategies tends to outperform the model-free strategies.\n",
    "\n",
    "*2: Under the assumption that the model supports the data.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python NOSTROMO",
   "language": "python",
   "name": "nostromo"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.2"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
