{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Fixed Confidence Best Arm Identification in Bayesian Settings\n",
    "\n",
    "This code is the official implementation of 'Fixed Confidence Best Arm Identification in Bayesian Settings.'\n",
    "\n",
    "To proceed, simply press 'shift+enter' for each cell. \n",
    "\n",
    "### Requirements\n",
    "The following cell includes all the required packages for this code."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "executionInfo": {
     "elapsed": 126,
     "status": "ok",
     "timestamp": 1702800713658,
     "user": {
      "displayName": "Kyoungseok Jang",
      "userId": "07027991917961288603"
     },
     "user_tz": 300
    },
    "id": "eaxTLEc5WWBD"
   },
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "from scipy.stats import norm\n",
    "from scipy.integrate import quad\n",
    "from scipy.optimize import minimize, minimize_scalar\n",
    "import random\n",
    "import time"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Prior Distribution class\n",
    "\n",
    "This class is for implementing prior distribution $H$ and computing prior-dependent constants such as $L(H)$ efficiently. \n",
    "*prior\\_distribution* takes three main parameters:\n",
    "- *prior\\_mean*: corresponds to $(m_i)_{i=1}^k$, the mean vector of the prior distribution.\n",
    "- *prior\\_std*: corrsponds to $(\\xi_i)_{i=1}^k$, the vector of standard deviations.\n",
    "- *instance\\_std*: corresponds to $(\\sigma_i)_{i=1}^k$, the standard deviations of the reward distributions. \n",
    "\n",
    "We will mainly use two methods from this distribution:\n",
    "- sample\\_instance(): Sample $(\\mu_i)_{i=1}^k$ from the prior distribution $H$. \n",
    "- get_Delta_0($\\delta$): Compute $\\Delta_0$, which is defined in Algorithm 1. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "executionInfo": {
     "elapsed": 196,
     "status": "ok",
     "timestamp": 1702800715193,
     "user": {
      "displayName": "Kyoungseok Jang",
      "userId": "07027991917961288603"
     },
     "user_tz": 300
    },
    "id": "Xedzt0j2dqK5"
   },
   "outputs": [],
   "source": [
    "class prior_distribution:\n",
    "  def __init__(self, prior_mean, prior_std, instance_std):\n",
    "    self.K=np.size(prior_mean)\n",
    "    self.prior_mean = prior_mean\n",
    "    self.prior_std = prior_std\n",
    "    self.prior_cov_mat = np.diag(self.prior_std**2)\n",
    "    self.instance_std = instance_std\n",
    "    self._get_Lij()\n",
    "\n",
    "  def sample_instance(self):\n",
    "    return np.random.multivariate_normal(self.prior_mean, self.prior_cov_mat)\n",
    "\n",
    "  def _get_Lij(self):\n",
    "    self.Lij_whole = np.zeros((self.K, self.K)) #whole list of Lij\n",
    "    for i in range(self.K):\n",
    "      for j in range(self.K):\n",
    "        if i==j:\n",
    "          self.Lij_whole[i][j]=0\n",
    "        else:\n",
    "          integrand= lambda x: self._integrand(x,i,j)\n",
    "          self.Lij_whole[i][j]=quad(integrand, -np.inf, np.inf)[0]\n",
    "    self.Lij=np.sum(self.Lij_whole)\n",
    "\n",
    "  def _integrand(self,x,i,j):\n",
    "    product=1\n",
    "    for s in range(self.K):\n",
    "      if (s==i or s==j):\n",
    "        product=product*norm.pdf(self._standardize(x,s))\n",
    "      else:\n",
    "        product=product*norm.cdf(self._standardize(x,s))\n",
    "    return product\n",
    "\n",
    "  def _standardize(self,x,i):\n",
    "    return (x-self.prior_mean[i])/self.prior_std[i]\n",
    "\n",
    "  def get_Delta_0(self, delta):\n",
    "    return delta/(4*self.Lij)\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "XnOWbW4-2H6H"
   },
   "source": [
    "# Best-Arm-Identification Algorithm\n",
    "## 0) Parent Class\n",
    "Each BAI algorithm will have the following four methonds as its main method:\n",
    "- \\_\\_init\\_\\_: for initialization\n",
    "- sample(): A sampling rule $(A_t)_t$, which determines the arm to draw at round $t$ based on the previous history.\n",
    "- stopping\\_criterion(): when to stop the sampling\n",
    "- update(): Update the information after each sampling.\n",
    "- recommendation(): A decision rule $J$, which determines the arm the forecaster recommends based on his sampling history\n",
    "\n",
    "All algorithms take only two inputs:\n",
    "- prior_dist: corresponds to $H$ in our main paper. It will be *prior\\_distribution* instance.\n",
    "- delta: the confidence level $\\delta$ in our main paper. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "executionInfo": {
     "elapsed": 153,
     "status": "ok",
     "timestamp": 1702800717566,
     "user": {
      "displayName": "Kyoungseok Jang",
      "userId": "07027991917961288603"
     },
     "user_tz": 300
    },
    "id": "P7X4dR0Qd3YL"
   },
   "outputs": [],
   "source": [
    "from ast import Pass\n",
    "class BAI_Algorithm:                    #Parent class for all algorithms\n",
    "  def __init__(self, prior_dist, delta):\n",
    "    Pass\n",
    "\n",
    "  def sample(self):\n",
    "    raise NotImplementedError\n",
    "\n",
    "  def stopping_criterion(self): # False for continue, True for stop\n",
    "    raise NotImplementedError\n",
    "\n",
    "  def update(self):\n",
    "    raise NotImplementedError\n",
    "\n",
    "  def recommendation(self):\n",
    "    raise NotImplementedError"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 1) Top-Two Thompson Sampling\n",
    "\n",
    "is a class that implements the algorithm devised in the paper ['Simple bayesian algorithms for best arm identification'](https://proceedings.mlr.press/v49/russo16.html)\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {
    "executionInfo": {
     "elapsed": 151,
     "status": "ok",
     "timestamp": 1702800955389,
     "user": {
      "displayName": "Kyoungseok Jang",
      "userId": "07027991917961288603"
     },
     "user_tz": 300
    },
    "id": "WxtRkL315nsK"
   },
   "outputs": [],
   "source": [
    "class TTTS(BAI_Algorithm):              #Top Two Thompson Sampling\n",
    "  def __init__(self, prior_dist, delta):\n",
    "    self.beta = 0.5\n",
    "    self.prior_dist = prior_dist\n",
    "    self.K= prior_dist.K\n",
    "    self.delta = delta\n",
    "    self.C = 1\n",
    "    self.alpha=1\n",
    "    self.hist=[ [] for _ in range(self.K) ] # list of empty lists\n",
    "    self.pos_mean=np.zeros(self.K)\n",
    "    self.pos_std=np.zeros(self.K)\n",
    "    self.n_list=np.zeros(self.K)\n",
    "    self.m_hat = np.zeros(self.K)\n",
    "\n",
    "  def sample(self):\n",
    "    if np.min(self.n_list)==0:\n",
    "      return np.argmin(self.n_list)\n",
    "    alpha=np.zeros(self.K)\n",
    "    index_sample=np.random.binomial(1,self.beta)\n",
    "    alpha=self.posterior_sample()\n",
    "    bestI=np.argmax(alpha)\n",
    "\n",
    "    final_index=bestI         #With probability beta, return best index from the sample\n",
    "\n",
    "    if index_sample==0:       #With prob. 1-beta, repeat sampling until new best index appears\n",
    "      bestJ=bestI\n",
    "      while bestI==bestJ:\n",
    "        alpha=self.posterior_sample()\n",
    "        _mask = np.zeros(self.K, dtype=bool)\n",
    "        _mask[bestI] = True\n",
    "        masked = np.ma.array(alpha, mask=_mask)\n",
    "        bestJ=np.argmax(masked)\n",
    "      final_index=bestJ\n",
    "    return final_index\n",
    "\n",
    "  def posterior_sample(self):               #Function that calculates posterior distribution\n",
    "\n",
    "    sample_result=np.zeros(self.K)\n",
    "    for i in range(self.K):\n",
    "      sample_result[i]=np.random.normal(self.pos_mean[i],self.pos_std[i])\n",
    "    return sample_result\n",
    "\n",
    "\n",
    "  def update(self, a, reward):\n",
    "                          #Posterior update\n",
    "    self.m_hat[a] = (reward+self.n_list[a]*self.m_hat[a])/(self.n_list[a]+1)               #sample mean\n",
    "    self.n_list[a] = self.n_list[a]+1\n",
    "    m_p   = self.prior_dist.prior_mean[a]       #prior mean\n",
    "    var_a = self.prior_dist.instance_std[a]**2  # variance of the arm\n",
    "    var_p = self.prior_dist.prior_std[a]**2     # variance of the prior distribution\n",
    "    self.pos_mean[a] = (m_p*var_p + self.n_list[a]*self.m_hat[a]*var_a)/(self.n_list[a]*var_a + var_p)\n",
    "    self.pos_std[a]  = np.sqrt(var_a*var_p/(self.n_list[a]*var_a + var_p))\n",
    "\n",
    "\n",
    "\n",
    "\n",
    "  def stopping_criterion(self):\n",
    "    T=np.sum(self.n_list)\n",
    "    if T>100000000:\n",
    "        return True\n",
    "    if np.min(self.n_list)==0:\n",
    "      return False\n",
    "    i_max=np.argmax(self.m_hat)\n",
    "    W = np.zeros(self.K)\n",
    "    for j in range(self.K):\n",
    "      if i_max==j:\n",
    "          W[j]=np.inf\n",
    "      else:\n",
    "        tempi=self.n_list[i_max]/self.prior_dist.instance_std[i_max]**2\n",
    "        tempj=self.n_list[j]/self.prior_dist.instance_std[j]**2\n",
    "        infx=(tempi*self.m_hat[i_max]+tempj*self.m_hat[j])/(tempi+tempj)\n",
    "        W[j]=self.n_list[i_max]*self.Kinf_neg(i_max,infx)+self.n_list[j]*self.Kinf_pos(j,infx)\n",
    "\n",
    "    minWij=np.min(W)\n",
    "    threshold = np.log(self.C*(T**self.alpha) / self.delta) ####################### Need to be adjusted later\n",
    "\n",
    "    if minWij>threshold:\n",
    "      return True\n",
    "    else:\n",
    "      return False\n",
    "\n",
    "\n",
    "\n",
    "  def Kinf_neg(self, i, x):\n",
    "    if x>self.m_hat[i]:\n",
    "      return 0\n",
    "    else:\n",
    "      return self.KL_div(i,x)\n",
    "\n",
    "  def Kinf_pos(self,j,x):\n",
    "    if x<self.m_hat[j]:\n",
    "      return 0\n",
    "    else:\n",
    "      return self.KL_div(j,x)\n",
    "\n",
    "  def KL_div(self,i,x):\n",
    "    return (x-self.m_hat[i])**2/(2*prior_dist.instance_std[i]**2)\n",
    "\n",
    "  def recommendation(self):\n",
    "    return np.argmax(self.m_hat)\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 2) Top-Two UCB\n",
    "is a class that implements the algorithm devised in the paper ['Non-asymptotic analysis of a ucb-based top two algorithm'](https://proceedings.neurips.cc/paper_files/paper/2023/hash/d9b564716709357b4bccec9fc9ad04d2-Abstract-Conference.html)\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [],
   "source": [
    "class TTUCB(BAI_Algorithm):              #Top Two Thompson Sampling - UNDER PROGRESS\n",
    "  def __init__(self, prior_dist, delta):\n",
    "    self.beta = 0.5\n",
    "    self.prior_dist = prior_dist\n",
    "    self.K= prior_dist.K\n",
    "    self.delta = delta\n",
    "    self.n_la = np.zeros((self.K, self.K)) # number of arm 'a'(second) pull when the leader was 'l'(first)\n",
    "    self.n_list=np.zeros(self.K) # number of each arm pull\n",
    "    self.leader_list=np.zeros(self.K)\n",
    "    self.m_hat = np.zeros(self.K)\n",
    "    self.width = np.ones(self.K) # Not precisely width, without log factor\n",
    "    self.UCB = self.m_hat+self.width*np.inf\n",
    "\n",
    "  def sample(self):\n",
    "    if np.min(self.n_list)==0: #Pull each arm at least once\n",
    "        return np.argmin(self.n_list)\n",
    "    bestI=np.argmax(self.m_hat)\n",
    "    leader_ind = np.argmax(self.UCB)\n",
    "    if self.beta*self.leader_list[leader_ind]<self.n_la[leader_ind][leader_ind]: #Pick Challenger\n",
    "        challenger_measure = np.zeros(self.K)\n",
    "        for i in range(self.K):\n",
    "            if i==leader_ind:\n",
    "                challenger_measure[i]=np.inf\n",
    "            else:\n",
    "                challenger_measure[i]=np.max(self.m_hat[leader_ind]-self.m_hat[i], 0)/np.sqrt(1/self.n_list[leader_ind]+1/self.n_list[i])\n",
    "        challenger_ind=np.argmin(challenger_measure)\n",
    "        return challenger_ind\n",
    "    return leader_ind\n",
    "    \n",
    "    \n",
    "\n",
    "\n",
    "\n",
    "  def update(self, a, reward):\n",
    "      leader_ind = np.argmax(self.UCB)\n",
    "      self.n_la[leader_ind][a]=self.n_la[leader_ind][a]+1\n",
    "      self.leader_list[leader_ind]=self.leader_list[leader_ind]+1\n",
    "      self.m_hat[a] = (reward+self.n_list[a]*self.m_hat[a])/(self.n_list[a]+1)               #sample mean\n",
    "      self.n_list[a]=self.n_list[a]+1\n",
    "\n",
    "      T=np.sum(self.n_list)\n",
    "      self.width[a]=1/np.sqrt(self.n_list[a])\n",
    "      self.UCB=self.m_hat+self.width*np.sqrt(4*np.log(T+1))\n",
    "      \n",
    "\n",
    "\n",
    "  def stopping_criterion(self):\n",
    "    T=np.sum(self.n_list)\n",
    "    if T>100000000:\n",
    "        return True\n",
    "    if np.min(self.n_list)==0:\n",
    "      return False\n",
    "    i_max=np.argmax(self.m_hat)\n",
    "    W = np.zeros(self.K)\n",
    "    minW = np.inf\n",
    "    for j in range(self.K):\n",
    "        if i_max==j:\n",
    "            W[j]=np.inf\n",
    "        else:\n",
    "            W[j]=(self.m_hat[i_max]-self.m_hat[j])/np.sqrt(1/self.n_list[i_max]+1/self.n_list[j])\n",
    "        minW=np.min((minW, W[j]))\n",
    "\n",
    "    C=2*self.C_G(0.5*np.log((self.K-1)/self.delta))+4*np.log(4+np.log((T-1)/2))\n",
    "    threshold = np.sqrt(2*C)\n",
    "    if minW>threshold:\n",
    "      return True\n",
    "    else:\n",
    "      return False\n",
    "\n",
    "\n",
    "  def recommendation(self):\n",
    "    return np.argmax(self.m_hat)\n",
    "\n",
    "  def C_G(self,x):\n",
    "      return x+ np.log(x)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 3) Our Algorithm (Algorithm 1, Successive Elimination)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {
    "executionInfo": {
     "elapsed": 140,
     "status": "ok",
     "timestamp": 1702800721936,
     "user": {
      "displayName": "Kyoungseok Jang",
      "userId": "07027991917961288603"
     },
     "user_tz": 300
    },
    "id": "bCA2xpUzNyfU"
   },
   "outputs": [],
   "source": [
    "class Elimination(BAI_Algorithm):\n",
    "  def __init__(self, prior_dist, delta):\n",
    "    self.prior_dist = prior_dist\n",
    "    self.K= prior_dist.K\n",
    "    self.delta=delta\n",
    "    self.Delta0=prior_dist.get_Delta_0(delta)\n",
    "    self.survived_arms = list(range(self.K))\n",
    "    self.need_elimination = False\n",
    "    self.Delta_safe = np.inf\n",
    "\n",
    "    #self.hist=[ [] for _ in range(self.K) ] # list of empty lists\n",
    "    self.n_list = np.zeros(self.K)\n",
    "    self.m_hat = np.zeros(self.K)\n",
    "\n",
    "\n",
    "  def sample(self):\n",
    "#    n_list = [len(self.hist[i]) for i in self.survived_arms]\n",
    "    chosen = np.argmin(self.n_list[self.survived_arms])\n",
    "    if chosen==len(self.survived_arms)-1:\n",
    "      self.need_elimination=True\n",
    "    return self.survived_arms[chosen]\n",
    "\n",
    "  def recommendation(self):\n",
    "    if len(self.survived_arms)==1:\n",
    "      return self.survived_arms[0]\n",
    "    elif self.Delta_safe<self.Delta0:\n",
    "      return random.choice(self.survived_arms)\n",
    "    else:\n",
    "      print('Error: recommendation should be done only after stopping criterion')\n",
    "      return -1\n",
    "\n",
    "  def stopping_criterion(self):\n",
    "    if self.need_elimination:\n",
    "      self.elim()\n",
    "\n",
    "    if len(self.survived_arms)==1:                # Note that these two parameters, self,survived_arms and self.Delta_safe only changes when elim() has been called.\n",
    "      return True\n",
    "    elif self.Delta_safe<self.Delta0:\n",
    "      return True\n",
    "    else:\n",
    "      return False\n",
    "\n",
    "\n",
    "  def elim(self):\n",
    "    ucbs=np.zeros(len(self.survived_arms))\n",
    "    lcbs=np.zeros(len(self.survived_arms))\n",
    "    next_survived=[]\n",
    "\n",
    "    for i in range(len(self.survived_arms)):                      # Compute UCB and LCB\n",
    "      ucbs[i], lcbs[i] = self.conf_bounds(self.m_hat,self.survived_arms[i])\n",
    "    lcbmax=np.max(lcbs)                               # Maximum LCB\n",
    "    ucbmax=np.max(ucbs)                               # Maximum UCB\n",
    "    for i in range(len(self.survived_arms)):                       # Include all arms which satisfies UCB>LCBMAX to a new basket\n",
    "      if ucbs[i]>lcbmax:\n",
    "        next_survived.append(self.survived_arms[i])\n",
    "\n",
    "    self.Delta_safe=ucbmax-lcbmax\n",
    "    self.survived_arms=next_survived\n",
    "    self.need_elimination = False\n",
    "\n",
    "  def conf_bounds(self,m_hat,i):\n",
    "    m_hat=self.m_hat[i]\n",
    "    n=self.n_list[i]\n",
    "    width=np.sqrt(\n",
    "        2*self.prior_dist.instance_std[i]**2\n",
    "        *np.log(12*self.K*n**2/self.delta**2/np.pi**2)/n\n",
    "    )\n",
    "    return m_hat+width, m_hat-width\n",
    "\n",
    "  def update(self, a, reward):\n",
    "    self.m_hat[a]=(self.m_hat[a]*self.n_list[a]+reward)/(self.n_list[a]+1)\n",
    "    self.n_list[a]=self.n_list[a]+1\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "p-v2O_fU2Fd2"
   },
   "source": [
    "#\n",
    " General Experiment Design\n",
    "A class designed to effectively manage the experimental environment. This is mainly for guaranteeing same instance samples $(\\mu_i)_{i=1}^k$ for all algorithms. It takes the following major inputs:\n",
    "- prior\\_dist: corresponds to $H$ in our main paper. It will be *prior\\_distribution* instance.\n",
    "- $\\delta$: Error probability. \n",
    "- algolist: list of algorithms for this experiment. \n",
    "\n",
    "Methods:\n",
    "- single\\_experiment(mean, algoname): run single experiment for *algoname* with given mean $(\\mu_i)_{i=1}^k$. Returns stopping time and whether the prediction is correct or wrong. \n",
    "- monte\\_carlo\\_experiment(num\\_of\\_exp): repeat experiment *num\\_of\\_exp* times, but for each repetition all algorithms share same $(\\mu_i)_{i=1}^k$. Output is expected stopping time and success rate of each algorithm. \n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {
    "executionInfo": {
     "elapsed": 132,
     "status": "ok",
     "timestamp": 1702800724516,
     "user": {
      "displayName": "Kyoungseok Jang",
      "userId": "07027991917961288603"
     },
     "user_tz": 300
    },
    "id": "nY_V_--KlYth"
   },
   "outputs": [],
   "source": [
    "class Experiment:\n",
    "  def __init__(self, prior_dist, delta, algolist):\n",
    "    self.delta=delta\n",
    "    self.algolist=algolist                              # Strings of algorithm names, such as ['Elim', 'TTTS']\n",
    "    self.prior_dist=prior_dist\n",
    "    self.K=self.prior_dist.K                            # Number of Arms\n",
    "    self.stopping_time_hist = []                        # Record of Stopping time\n",
    "    self.success_hist = []             \n",
    "    self.time_spent=[]                                  # Record of time spent (unit: sec)\n",
    "\n",
    "\n",
    "  def monte_carlo_experiment(self, num_of_exp): # running multiple experiment - num_of_exp times\n",
    "    exp_stopping_time=np.zeros(len(self.algolist))\n",
    "    success_rate=np.zeros(len(self.algolist))\n",
    "    for i in range(num_of_exp):\n",
    "      mean=self.prior_dist.sample_instance()\n",
    "      for alg in range(len(self.algolist)):\n",
    "        algoname=self.algolist[alg]\n",
    "        start_time=time.time()\n",
    "        sample_stopping_time, sample_success = self.single_experiment(mean, algoname)\n",
    "        elapsed=time.time()-start_time\n",
    "        self.stopping_time_hist.append(sample_stopping_time)\n",
    "        self.success_hist.append(sample_success)\n",
    "        self.time_spent.append(elapsed)\n",
    "        exp_stopping_time[alg] =exp_stopping_time[alg]+sample_stopping_time/num_of_exp\n",
    "        success_rate[alg]      =success_rate[alg]+sample_success/num_of_exp\n",
    "    return exp_stopping_time, success_rate\n",
    "\n",
    "\n",
    "\n",
    "  def single_experiment(self, mean, algoname):\n",
    "    print(\"Starting an experiment - mean \")\n",
    "    print(mean)\n",
    "    alg=BAI_Algorithm(self.prior_dist,self.delta)\n",
    "    if algoname=='Elim':\n",
    "      print(\"Algo: Elim\")\n",
    "      alg = Elimination(self.prior_dist, self.delta)\n",
    "    elif algoname=='TTTS':\n",
    "      print(\"Algo: TTTS\")\n",
    "      alg = TTTS(self.prior_dist, self.delta)\n",
    "    elif algoname=='TTUCB':\n",
    "      print(\"Algo: TTUCB\")\n",
    "      alg = TTUCB(self.prior_dist, self.delta)\n",
    "    answer=np.argmax(mean)\n",
    "    std=self.prior_dist.instance_std\n",
    "    stopping_time=0\n",
    "    while not alg.stopping_criterion():\n",
    "      a=alg.sample()\n",
    "      reward = np.random.normal(mean[a],std[a])\n",
    "      alg.update(a, reward)\n",
    "      stopping_time=stopping_time+1\n",
    "    print('Final stopping time: %d'%stopping_time)\n",
    "    if answer==alg.recommendation():\n",
    "      return stopping_time,1\n",
    "    else:\n",
    "      return stopping_time,0\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Start of the Main Code\n",
    "Parameters\n",
    "- $K$: number of arms.\n",
    "- *prior\\_mean*: corresponds to $(m_i)_{i=1}^k$, the mean vector of the prior distribution.\n",
    "- *prior\\_std*: corrsponds to $(\\xi_i)_{i=1}^k$, the vector of standard deviations.\n",
    "- *instance\\_std*: corresponds to $(\\sigma_i)_{i=1}^k$, the standard deviations of the reward distributions. \n",
    "- $\\delta$: error probability."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 34,
   "metadata": {
    "executionInfo": {
     "elapsed": 144,
     "status": "ok",
     "timestamp": 1702800727426,
     "user": {
      "displayName": "Kyoungseok Jang",
      "userId": "07027991917961288603"
     },
     "user_tz": 300
    },
    "id": "CtsNGPqo9Mr_"
   },
   "outputs": [],
   "source": [
    "K=10\n",
    "\n",
    "prior_mean=np.random.normal(0,1,K)\n",
    "prior_std=np.random.uniform(0.5,1.5,K)\n",
    "instance_std=np.random.uniform(0.5,1.5,K)\n",
    "delta=0.1"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 68,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "1.498\n",
      "1.262\n",
      "1.485\n",
      "0.963\n",
      "1.375\n",
      "0.969\n",
      "1.357\n",
      "1.238\n",
      "1.088\n",
      "0.699\n",
      "[-0.05261541  0.52767271 -0.33152129 -0.36781578 -0.27298596  0.90868056\n",
      "  0.41845086 -1.16969592  0.87341374 -0.40510192]\n",
      "[0.60372243 1.47713147 1.1629126  0.98813025 0.56025528 0.5131734\n",
      " 0.9966126  1.33176296 0.82807594 0.83345155]\n",
      "[1.49848059 1.2616226  1.48519329 0.96316585 1.37547129 0.96947624\n",
      " 1.35714261 1.23770972 1.08750015 0.69850409]\n"
     ]
    }
   ],
   "source": [
    "for num in instance_std:\n",
    "    print(f\"{num:.3f}\")\n",
    "print(prior_mean)\n",
    "print(prior_std)\n",
    "print(instance_std)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Create prior distribution $H$"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 36,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0.02295460533220602"
      ]
     },
     "execution_count": 36,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "prior_dist=prior_distribution(prior_mean, prior_std,instance_std)\n",
    "prior_dist.get_Delta_0(delta)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Experiment instance"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 37,
   "metadata": {
    "executionInfo": {
     "elapsed": 166,
     "status": "ok",
     "timestamp": 1702800962508,
     "user": {
      "displayName": "Kyoungseok Jang",
      "userId": "07027991917961288603"
     },
     "user_tz": 300
    },
    "id": "MqKIm4yh-Uwq"
   },
   "outputs": [],
   "source": [
    "myExp=Experiment(prior_dist, delta, ['TTUCB', 'Elim'])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Run the experiment 500 times.\n",
    "- res: average stopping time variable\n",
    "\n",
    "- ult: success rate variable"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 53,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Starting an experiment - mean \n",
      "[ 0.25528724  4.75096401 -0.33664548 -0.59653064  0.38963043  1.21926175\n",
      "  1.09682653 -2.37778621  0.43921825 -1.45202965]\n",
      "Algo: TTUCB\n",
      "Final stopping time: 31\n",
      "Starting an experiment - mean \n",
      "[ 0.25528724  4.75096401 -0.33664548 -0.59653064  0.38963043  1.21926175\n",
      "  1.09682653 -2.37778621  0.43921825 -1.45202965]\n",
      "Algo: Elim\n",
      "Final stopping time: 91\n",
      "Starting an experiment - mean \n",
      "[-0.3632643   1.69729698 -0.15311838  0.9177278  -0.22217531  0.41517566\n",
      " -0.13509455 -0.15444146  1.78432413  2.08676965]\n",
      "Algo: TTUCB\n",
      "Final stopping time: 1851\n",
      "Starting an experiment - mean \n",
      "[-0.3632643   1.69729698 -0.15311838  0.9177278  -0.22217531  0.41517566\n",
      " -0.13509455 -0.15444146  1.78432413  2.08676965]\n",
      "Algo: Elim\n",
      "Final stopping time: 4573\n",
      "Starting an experiment - mean \n",
      "[-0.85251203 -2.07892373 -1.83512867  0.95547776  0.19339355  1.4678991\n",
      " -0.29362518 -3.63694519  0.09558368 -0.81641633]\n",
      "Algo: TTUCB\n",
      "Final stopping time: 807\n",
      "Starting an experiment - mean \n",
      "[-0.85251203 -2.07892373 -1.83512867  0.95547776  0.19339355  1.4678991\n",
      " -0.29362518 -3.63694519  0.09558368 -0.81641633]\n",
      "Algo: Elim\n",
      "Final stopping time: 1450\n",
      "Starting an experiment - mean \n",
      "[-0.43562532  0.65370906  1.34041439 -1.78623825 -0.93259208  0.85356072\n",
      "  0.45813544 -0.37594264  0.5294146  -0.82316061]\n",
      "Algo: TTUCB\n",
      "Final stopping time: 1168\n",
      "Starting an experiment - mean \n",
      "[-0.43562532  0.65370906  1.34041439 -1.78623825 -0.93259208  0.85356072\n",
      "  0.45813544 -0.37594264  0.5294146  -0.82316061]\n",
      "Algo: Elim\n",
      "Final stopping time: 3247\n",
      "Starting an experiment - mean \n",
      "[-0.0719189  -3.2759647  -0.33345931 -1.41936233 -0.10292223  0.9670966\n",
      " -2.12995885 -2.33550516  0.90063603  1.55924872]\n",
      "Algo: TTUCB\n",
      "Final stopping time: 515\n",
      "Starting an experiment - mean \n",
      "[-0.0719189  -3.2759647  -0.33345931 -1.41936233 -0.10292223  0.9670966\n",
      " -2.12995885 -2.33550516  0.90063603  1.55924872]\n",
      "Algo: Elim\n",
      "Final stopping time: 865\n",
      "Starting an experiment - mean \n",
      "[ 1.14897885  2.14506155 -1.47917     0.04942026 -0.94369282  0.36064006\n",
      "  0.37339241 -0.92075087  0.82709129 -0.06370744]\n",
      "Algo: TTUCB\n",
      "Final stopping time: 220\n",
      "Starting an experiment - mean \n",
      "[ 1.14897885  2.14506155 -1.47917     0.04942026 -0.94369282  0.36064006\n",
      "  0.37339241 -0.92075087  0.82709129 -0.06370744]\n",
      "Algo: Elim\n",
      "Final stopping time: 1272\n",
      "Starting an experiment - mean \n",
      "[-0.15858929 -1.61719078  0.19067601  0.99459796 -0.53957012  0.89726508\n",
      " -0.14287434 -2.70123832  0.32854782 -1.2589087 ]\n",
      "Algo: TTUCB\n",
      "Final stopping time: 10175\n",
      "Starting an experiment - mean \n",
      "[-0.15858929 -1.61719078  0.19067601  0.99459796 -0.53957012  0.89726508\n",
      " -0.14287434 -2.70123832  0.32854782 -1.2589087 ]\n",
      "Algo: Elim\n",
      "Final stopping time: 51510\n",
      "Starting an experiment - mean \n",
      "[-0.08850125 -1.66545111  0.35729302 -2.47161509  0.10369513  0.83714755\n",
      "  1.01935037 -1.95224657  0.32540277 -0.30499933]\n",
      "Algo: TTUCB\n",
      "Final stopping time: 5417\n",
      "Starting an experiment - mean \n",
      "[-0.08850125 -1.66545111  0.35729302 -2.47161509  0.10369513  0.83714755\n",
      "  1.01935037 -1.95224657  0.32540277 -0.30499933]\n",
      "Algo: Elim\n",
      "Final stopping time: 14118\n",
      "Starting an experiment - mean \n",
      "[-0.4105383   1.13332605 -2.17571361  0.11222745 -0.79329035  1.03991603\n",
      "  0.3858053   0.39686422  0.87227945 -0.42651837]\n",
      "Algo: TTUCB\n",
      "Final stopping time: 13428\n",
      "Starting an experiment - mean \n",
      "[-0.4105383   1.13332605 -2.17571361  0.11222745 -0.79329035  1.03991603\n",
      "  0.3858053   0.39686422  0.87227945 -0.42651837]\n",
      "Algo: Elim\n",
      "Final stopping time: 90216\n",
      "Starting an experiment - mean \n",
      "[-1.3079754  -0.75497023 -0.38508793  0.30566363  0.31768623  0.80136574\n",
      "  1.50574224 -1.13040762 -0.58885986  0.38403149]\n",
      "Algo: TTUCB\n",
      "Final stopping time: 306\n",
      "Starting an experiment - mean \n",
      "[-1.3079754  -0.75497023 -0.38508793  0.30566363  0.31768623  0.80136574\n",
      "  1.50574224 -1.13040762 -0.58885986  0.38403149]\n",
      "Algo: Elim\n",
      "Final stopping time: 1749\n",
      "Starting an experiment - mean \n",
      "[ 0.44470755 -1.25085464 -0.35621296 -0.16934585 -0.57248522  0.4610093\n",
      " -0.49417971  0.48295924  0.20940267 -0.16523475]\n",
      "Algo: TTUCB\n",
      "Final stopping time: 258663\n",
      "Starting an experiment - mean \n",
      "[ 0.44470755 -1.25085464 -0.35621296 -0.16934585 -0.57248522  0.4610093\n",
      " -0.49417971  0.48295924  0.20940267 -0.16523475]\n",
      "Algo: Elim\n",
      "Final stopping time: 2036621\n",
      "Starting an experiment - mean \n",
      "[ 0.52763544  1.17342741 -1.77902021 -1.49588339 -0.4066282   0.83736342\n",
      "  0.63700506 -1.25578486  0.32949717 -1.40492081]\n",
      "Algo: TTUCB\n",
      "Final stopping time: 1056\n",
      "Starting an experiment - mean \n",
      "[ 0.52763544  1.17342741 -1.77902021 -1.49588339 -0.4066282   0.83736342\n",
      "  0.63700506 -1.25578486  0.32949717 -1.40492081]\n",
      "Algo: Elim\n",
      "Final stopping time: 7009\n",
      "Starting an experiment - mean \n",
      "[-1.15784639  0.87959325 -1.95514143  0.83616774 -0.10303358  0.84739873\n",
      "  0.34576978 -0.15764263 -0.13120778 -0.01229118]\n",
      "Algo: TTUCB\n",
      "Final stopping time: 178750\n",
      "Starting an experiment - mean \n",
      "[-1.15784639  0.87959325 -1.95514143  0.83616774 -0.10303358  0.84739873\n",
      "  0.34576978 -0.15764263 -0.13120778 -0.01229118]\n",
      "Algo: Elim\n",
      "Final stopping time: 715552\n",
      "Starting an experiment - mean \n",
      "[ 0.85715765  0.6651657  -0.01482706 -0.26745168 -0.09114702  1.10847253\n",
      " -1.70935735 -2.62021347  0.6418751   0.02030858]\n",
      "Algo: TTUCB\n",
      "Final stopping time: 1704\n",
      "Starting an experiment - mean \n",
      "[ 0.85715765  0.6651657  -0.01482706 -0.26745168 -0.09114702  1.10847253\n",
      " -1.70935735 -2.62021347  0.6418751   0.02030858]\n",
      "Algo: Elim\n",
      "Final stopping time: 9755\n",
      "Starting an experiment - mean \n",
      "[-1.06768337 -0.41852551  0.84807337 -0.81045679  0.4254489   0.55762579\n",
      " -2.27230434  0.34290238  1.51333398 -1.5700033 ]\n",
      "Algo: TTUCB\n",
      "Final stopping time: 248\n",
      "Starting an experiment - mean \n",
      "[-1.06768337 -0.41852551  0.84807337 -0.81045679  0.4254489   0.55762579\n",
      " -2.27230434  0.34290238  1.51333398 -1.5700033 ]\n",
      "Algo: Elim\n",
      "Final stopping time: 1525\n",
      "Starting an experiment - mean \n",
      "[ 0.14984337  1.60846377 -1.06757272 -2.29227499 -0.57193914  0.66521534\n",
      "  0.31013473 -2.5831894  -0.70976161  1.36193321]\n",
      "Algo: TTUCB\n",
      "Final stopping time: 2360\n",
      "Starting an experiment - mean \n",
      "[ 0.14984337  1.60846377 -1.06757272 -2.29227499 -0.57193914  0.66521534\n",
      "  0.31013473 -2.5831894  -0.70976161  1.36193321]\n",
      "Algo: Elim\n",
      "Final stopping time: 5108\n",
      "Starting an experiment - mean \n",
      "[-0.82037247  2.77007537 -0.60384571 -1.6080881  -0.33618714  0.40846607\n",
      " -0.21692404 -2.16235001  1.21496458  0.32142112]\n",
      "Algo: TTUCB\n",
      "Final stopping time: 95\n",
      "Starting an experiment - mean \n",
      "[-0.82037247  2.77007537 -0.60384571 -1.6080881  -0.33618714  0.40846607\n",
      " -0.21692404 -2.16235001  1.21496458  0.32142112]\n",
      "Algo: Elim\n",
      "Final stopping time: 207\n",
      "Starting an experiment - mean \n",
      "[ 1.63231422e-02  1.55703135e+00  3.72298444e+00  5.61542236e-02\n",
      "  6.91481011e-01  5.58530018e-01  3.18310255e+00 -1.62424380e+00\n",
      "  1.63860304e+00 -3.00952580e-03]\n",
      "Algo: TTUCB\n",
      "Final stopping time: 100\n",
      "Starting an experiment - mean \n",
      "[ 1.63231422e-02  1.55703135e+00  3.72298444e+00  5.61542236e-02\n",
      "  6.91481011e-01  5.58530018e-01  3.18310255e+00 -1.62424380e+00\n",
      "  1.63860304e+00 -3.00952580e-03]\n",
      "Algo: Elim\n",
      "Final stopping time: 2585\n",
      "Starting an experiment - mean \n",
      "[-0.43726062  1.08731788 -1.34364557  1.48167289  0.24860917  1.73024534\n",
      " -1.19658197  1.14400792  1.0897997  -0.31477151]\n",
      "Algo: TTUCB\n",
      "Final stopping time: 2053\n",
      "Starting an experiment - mean \n",
      "[-0.43726062  1.08731788 -1.34364557  1.48167289  0.24860917  1.73024534\n",
      " -1.19658197  1.14400792  1.0897997  -0.31477151]\n",
      "Algo: Elim\n",
      "Final stopping time: 5434\n",
      "Starting an experiment - mean \n",
      "[-0.0394097  -0.57113017 -0.14814505  0.09014889 -0.74650841  1.15383637\n",
      "  1.47866398 -0.53580666  1.34828008 -0.27842612]\n",
      "Algo: TTUCB\n",
      "Final stopping time: 8677\n",
      "Starting an experiment - mean \n",
      "[-0.0394097  -0.57113017 -0.14814505  0.09014889 -0.74650841  1.15383637\n",
      "  1.47866398 -0.53580666  1.34828008 -0.27842612]\n",
      "Algo: Elim\n",
      "Final stopping time: 38795\n",
      "Starting an experiment - mean \n",
      "[-0.03028543 -3.8240494  -1.97959278 -1.54225919  0.19849124  0.3566103\n",
      "  0.45269237 -0.63482616  1.22968373 -0.65572259]\n",
      "Algo: TTUCB\n",
      "Final stopping time: 529\n",
      "Starting an experiment - mean \n",
      "[-0.03028543 -3.8240494  -1.97959278 -1.54225919  0.19849124  0.3566103\n",
      "  0.45269237 -0.63482616  1.22968373 -0.65572259]\n",
      "Algo: Elim\n",
      "Final stopping time: 1367\n",
      "Starting an experiment - mean \n",
      "[-0.20609929  1.90618595  0.7607494   0.50046444 -0.49021609  0.74628372\n",
      "  0.7424141   1.68539102 -0.0100296  -0.76703369]\n",
      "Algo: TTUCB\n",
      "Final stopping time: 3105\n",
      "Starting an experiment - mean \n",
      "[-0.20609929  1.90618595  0.7607494   0.50046444 -0.49021609  0.74628372\n",
      "  0.7424141   1.68539102 -0.0100296  -0.76703369]\n",
      "Algo: Elim\n",
      "Final stopping time: 15461\n",
      "Starting an experiment - mean \n",
      "[ 0.01367194 -0.81281167  0.14049231 -1.16909717 -0.84137922  0.92139173\n",
      " -0.1288358  -1.49558777  1.28055639 -0.34520153]\n",
      "Algo: TTUCB\n",
      "Final stopping time: 543\n",
      "Starting an experiment - mean \n",
      "[ 0.01367194 -0.81281167  0.14049231 -1.16909717 -0.84137922  0.92139173\n",
      " -0.1288358  -1.49558777  1.28055639 -0.34520153]\n",
      "Algo: Elim\n",
      "Final stopping time: 2893\n",
      "Starting an experiment - mean \n",
      "[ 0.21539932 -2.42071459  0.76469565  1.32063002 -0.53408225  0.05224983\n",
      " -0.25742508  0.39114123  2.28825246 -1.01983915]\n",
      "Algo: TTUCB\n",
      "Final stopping time: 186\n",
      "Starting an experiment - mean \n",
      "[ 0.21539932 -2.42071459  0.76469565  1.32063002 -0.53408225  0.05224983\n",
      " -0.25742508  0.39114123  2.28825246 -1.01983915]\n",
      "Algo: Elim\n",
      "Final stopping time: 542\n",
      "Starting an experiment - mean \n",
      "[ 0.2402469   2.19499809  1.57876982 -0.2022328  -0.46917715  1.19264201\n",
      "  1.12157732 -2.1102034   1.7801498   1.25295213]\n",
      "Algo: TTUCB\n",
      "Final stopping time: 840\n",
      "Starting an experiment - mean \n",
      "[ 0.2402469   2.19499809  1.57876982 -0.2022328  -0.46917715  1.19264201\n",
      "  1.12157732 -2.1102034   1.7801498   1.25295213]\n",
      "Algo: Elim\n",
      "Final stopping time: 5544\n",
      "Starting an experiment - mean \n",
      "[-0.84249174  1.71609279 -1.12523242 -0.82991542 -0.46691041  0.74133916\n",
      " -0.15982929 -0.6511885   1.46940034 -0.85720263]\n",
      "Algo: TTUCB\n",
      "Final stopping time: 1684\n",
      "Starting an experiment - mean \n",
      "[-0.84249174  1.71609279 -1.12523242 -0.82991542 -0.46691041  0.74133916\n",
      " -0.15982929 -0.6511885   1.46940034 -0.85720263]\n",
      "Algo: Elim\n",
      "Final stopping time: 10424\n",
      "Starting an experiment - mean \n",
      "[-0.60548397  1.44288011 -0.90895097 -1.99901664  0.23050697  0.70692324\n",
      "  1.38737923 -0.91192422  0.70535634 -1.43603879]\n",
      "Algo: TTUCB\n",
      "Final stopping time: 26704\n",
      "Starting an experiment - mean \n",
      "[-0.60548397  1.44288011 -0.90895097 -1.99901664  0.23050697  0.70692324\n",
      "  1.38737923 -0.91192422  0.70535634 -1.43603879]\n",
      "Algo: Elim\n",
      "Final stopping time: 435964\n",
      "Starting an experiment - mean \n",
      "[ 0.7564799  -0.09689432  0.59240255 -0.20071102 -0.74751868  0.85866569\n",
      " -1.57728608 -2.5206416  -0.4240859   0.44018165]\n",
      "Algo: TTUCB\n",
      "Final stopping time: 6702\n",
      "Starting an experiment - mean \n",
      "[ 0.7564799  -0.09689432  0.59240255 -0.20071102 -0.74751868  0.85866569\n",
      " -1.57728608 -2.5206416  -0.4240859   0.44018165]\n",
      "Algo: Elim\n",
      "Final stopping time: 62403\n",
      "Starting an experiment - mean \n",
      "[-0.58755538 -0.81998033 -0.68174924  0.54564107 -0.78965849  1.46831836\n",
      "  0.34396045 -3.97530401  1.3137357  -0.57580483]\n",
      "Algo: TTUCB\n",
      "Final stopping time: 5489\n",
      "Starting an experiment - mean \n",
      "[-0.58755538 -0.81998033 -0.68174924  0.54564107 -0.78965849  1.46831836\n",
      "  0.34396045 -3.97530401  1.3137357  -0.57580483]\n",
      "Algo: Elim\n",
      "Final stopping time: 20814\n",
      "Starting an experiment - mean \n",
      "[-0.58363402  2.60000835 -0.92661943  1.77298475  0.57351523  1.08919303\n",
      "  1.67393385  0.92548954  0.39518837  0.47300613]\n",
      "Algo: TTUCB\n",
      "Final stopping time: 290\n",
      "Starting an experiment - mean \n",
      "[-0.58363402  2.60000835 -0.92661943  1.77298475  0.57351523  1.08919303\n",
      "  1.67393385  0.92548954  0.39518837  0.47300613]\n",
      "Algo: Elim\n",
      "Final stopping time: 1308\n",
      "Starting an experiment - mean \n",
      "[ 0.32120428  0.87845239 -1.93772107 -0.08413163 -1.43542076  0.54047327\n",
      "  0.28653385 -1.07022213  1.18591992 -0.77444723]\n",
      "Algo: TTUCB\n",
      "Final stopping time: 1977\n",
      "Starting an experiment - mean \n",
      "[ 0.32120428  0.87845239 -1.93772107 -0.08413163 -1.43542076  0.54047327\n",
      "  0.28653385 -1.07022213  1.18591992 -0.77444723]\n",
      "Algo: Elim\n",
      "Final stopping time: 7442\n",
      "Starting an experiment - mean \n",
      "[ 0.55100978  0.79503547 -0.716161   -1.24836486  0.07492675  1.75710254\n",
      "  0.73877905 -1.10719081  1.76309506 -0.07392676]\n",
      "Algo: TTUCB\n",
      "Final stopping time: 5085874\n",
      "Starting an experiment - mean \n",
      "[ 0.55100978  0.79503547 -0.716161   -1.24836486  0.07492675  1.75710254\n",
      "  0.73877905 -1.10719081  1.76309506 -0.07392676]\n",
      "Algo: Elim\n",
      "Final stopping time: 1212005\n",
      "Starting an experiment - mean \n",
      "[ 1.41422487 -0.7622189  -0.89744119  0.26951496 -0.80570307  0.80794304\n",
      "  2.00592371 -0.52393607  1.31863266 -0.32688131]\n",
      "Algo: TTUCB\n",
      "Final stopping time: 674\n",
      "Starting an experiment - mean \n",
      "[ 1.41422487 -0.7622189  -0.89744119  0.26951496 -0.80570307  0.80794304\n",
      "  2.00592371 -0.52393607  1.31863266 -0.32688131]\n",
      "Algo: Elim\n",
      "Final stopping time: 2417\n",
      "Starting an experiment - mean \n",
      "[ 0.20832052  0.12707139 -0.0131443  -0.57458533 -0.26231081  0.7343488\n",
      "  0.22183271 -0.21501816  1.572697   -1.40874386]\n",
      "Algo: TTUCB\n",
      "Final stopping time: 271\n",
      "Starting an experiment - mean \n",
      "[ 0.20832052  0.12707139 -0.0131443  -0.57458533 -0.26231081  0.7343488\n",
      "  0.22183271 -0.21501816  1.572697   -1.40874386]\n",
      "Algo: Elim\n",
      "Final stopping time: 1227\n",
      "Starting an experiment - mean \n",
      "[-0.15269873  1.35886604  0.9215184  -1.00538192 -0.84495076  1.71468765\n",
      "  1.40998281 -0.77260011  1.34888186 -0.87650391]\n",
      "Algo: TTUCB\n",
      "Final stopping time: 2276\n",
      "Starting an experiment - mean \n",
      "[-0.15269873  1.35886604  0.9215184  -1.00538192 -0.84495076  1.71468765\n",
      "  1.40998281 -0.77260011  1.34888186 -0.87650391]\n",
      "Algo: Elim\n",
      "Final stopping time: 9966\n",
      "Starting an experiment - mean \n",
      "[-0.59714184 -3.75344613 -2.3855143  -2.59563122  0.66004473  1.13337042\n",
      "  1.21895685 -1.50370427  0.13200016 -0.21354562]\n",
      "Algo: TTUCB\n",
      "Final stopping time: 20497\n",
      "Starting an experiment - mean \n",
      "[-0.59714184 -3.75344613 -2.3855143  -2.59563122  0.66004473  1.13337042\n",
      "  1.21895685 -1.50370427  0.13200016 -0.21354562]\n",
      "Algo: Elim\n",
      "Final stopping time: 72904\n",
      "Starting an experiment - mean \n",
      "[ 1.1478788   1.40568418  0.58488585  0.311221   -0.60510545  1.54022969\n",
      "  0.87690474  0.07339487  0.66227065 -0.06117891]\n",
      "Algo: TTUCB\n",
      "Final stopping time: 9637\n",
      "Starting an experiment - mean \n",
      "[ 1.1478788   1.40568418  0.58488585  0.311221   -0.60510545  1.54022969\n",
      "  0.87690474  0.07339487  0.66227065 -0.06117891]\n",
      "Algo: Elim\n",
      "Final stopping time: 33100\n",
      "Starting an experiment - mean \n",
      "[ 0.34530593 -1.85229123  0.74905022  0.13613854  0.26410394  1.0330078\n",
      "  0.81814474 -1.05490523  0.62968895  0.1512609 ]\n",
      "Algo: TTUCB\n",
      "Final stopping time: 3660\n",
      "Starting an experiment - mean \n",
      "[ 0.34530593 -1.85229123  0.74905022  0.13613854  0.26410394  1.0330078\n",
      "  0.81814474 -1.05490523  0.62968895  0.1512609 ]\n",
      "Algo: Elim\n",
      "Final stopping time: 24317\n",
      "Starting an experiment - mean \n",
      "[-2.43526269e-01 -8.14959012e-03  1.15515519e+00 -4.45053837e-01\n",
      " -5.68888573e-01  1.56469070e+00  9.54582623e-02 -2.59818017e+00\n",
      "  3.01069700e-01  2.49737582e-03]\n",
      "Algo: TTUCB\n",
      "Final stopping time: 959\n",
      "Starting an experiment - mean \n",
      "[-2.43526269e-01 -8.14959012e-03  1.15515519e+00 -4.45053837e-01\n",
      " -5.68888573e-01  1.56469070e+00  9.54582623e-02 -2.59818017e+00\n",
      "  3.01069700e-01  2.49737582e-03]\n",
      "Algo: Elim\n",
      "Final stopping time: 3125\n",
      "Starting an experiment - mean \n",
      "[-0.82484414  1.92890036  1.65795033  0.36175321 -0.96952713  1.34178732\n",
      "  0.59532226 -0.1271613   0.82180565  1.04057456]\n",
      "Algo: TTUCB\n",
      "Final stopping time: 1599\n",
      "Starting an experiment - mean \n",
      "[-0.82484414  1.92890036  1.65795033  0.36175321 -0.96952713  1.34178732\n",
      "  0.59532226 -0.1271613   0.82180565  1.04057456]\n",
      "Algo: Elim\n",
      "Final stopping time: 8159\n",
      "Starting an experiment - mean \n",
      "[ 0.91711161  0.19583473 -2.48193251 -0.3401074   0.05735277  1.40439629\n",
      "  0.52614036 -0.39436432  1.93607174 -0.50577024]\n",
      "Algo: TTUCB\n",
      "Final stopping time: 411\n",
      "Starting an experiment - mean \n",
      "[ 0.91711161  0.19583473 -2.48193251 -0.3401074   0.05735277  1.40439629\n",
      "  0.52614036 -0.39436432  1.93607174 -0.50577024]\n",
      "Algo: Elim\n",
      "Final stopping time: 2132\n",
      "Starting an experiment - mean \n",
      "[-0.77370984  1.75788559 -0.49460121  0.4590849   0.21519428  0.44669441\n",
      " -0.6732355  -2.08320816  1.2208101  -0.37747429]\n",
      "Algo: TTUCB\n",
      "Final stopping time: 544\n",
      "Starting an experiment - mean \n",
      "[-0.77370984  1.75788559 -0.49460121  0.4590849   0.21519428  0.44669441\n",
      " -0.6732355  -2.08320816  1.2208101  -0.37747429]\n",
      "Algo: Elim\n",
      "Final stopping time: 2105\n",
      "Starting an experiment - mean \n",
      "[-0.24395337  1.46437594  0.68608111  0.50620132 -1.41643045  1.85186094\n",
      "  0.0787199   0.44165075  1.41883146  0.58603361]\n",
      "Algo: TTUCB\n",
      "Final stopping time: 1097\n",
      "Starting an experiment - mean \n",
      "[-0.24395337  1.46437594  0.68608111  0.50620132 -1.41643045  1.85186094\n",
      "  0.0787199   0.44165075  1.41883146  0.58603361]\n",
      "Algo: Elim\n",
      "Final stopping time: 3676\n",
      "Starting an experiment - mean \n",
      "[ 0.58957818  1.93302792 -0.61386193 -1.43139559  0.18898006  0.10891259\n",
      "  0.70304038 -3.87190546  1.44344436  0.27924561]\n",
      "Algo: TTUCB\n",
      "Final stopping time: 344\n",
      "Starting an experiment - mean \n",
      "[ 0.58957818  1.93302792 -0.61386193 -1.43139559  0.18898006  0.10891259\n",
      "  0.70304038 -3.87190546  1.44344436  0.27924561]\n",
      "Algo: Elim\n",
      "Final stopping time: 2790\n",
      "Starting an experiment - mean \n",
      "[ 0.23035333 -1.02229449  0.32776792  2.0818022   0.07770131  0.17408874\n",
      " -1.75544586 -0.66171223  0.50983539 -0.23507126]\n",
      "Algo: TTUCB\n",
      "Final stopping time: 103\n",
      "Starting an experiment - mean \n",
      "[ 0.23035333 -1.02229449  0.32776792  2.0818022   0.07770131  0.17408874\n",
      " -1.75544586 -0.66171223  0.50983539 -0.23507126]\n",
      "Algo: Elim\n",
      "Final stopping time: 452\n",
      "Starting an experiment - mean \n",
      "[ 0.04863439  1.62638044  1.80604334 -0.02732846 -0.61462536  0.64571872\n",
      "  1.13849982 -2.6785882   2.07994716 -0.35617484]\n",
      "Algo: TTUCB\n",
      "Final stopping time: 2147\n",
      "Starting an experiment - mean \n",
      "[ 0.04863439  1.62638044  1.80604334 -0.02732846 -0.61462536  0.64571872\n",
      "  1.13849982 -2.6785882   2.07994716 -0.35617484]\n",
      "Algo: Elim\n",
      "Final stopping time: 9661\n",
      "Starting an experiment - mean \n",
      "[ 0.05727515  0.44012625 -1.64554491  0.16067791 -1.0323611   2.0814487\n",
      "  0.63784965 -4.16721914 -1.29341834 -1.07315884]\n",
      "Algo: TTUCB\n",
      "Final stopping time: 136\n",
      "Starting an experiment - mean \n",
      "[ 0.05727515  0.44012625 -1.64554491  0.16067791 -1.0323611   2.0814487\n",
      "  0.63784965 -4.16721914 -1.29341834 -1.07315884]\n",
      "Algo: Elim\n",
      "Final stopping time: 497\n",
      "Starting an experiment - mean \n",
      "[ 0.21623845  1.15599578  0.81932558 -0.81971762 -0.30903797  1.55686314\n",
      "  0.13357902 -2.64640936 -0.27541758 -1.84977339]\n",
      "Algo: TTUCB\n",
      "Final stopping time: 685\n",
      "Starting an experiment - mean \n",
      "[ 0.21623845  1.15599578  0.81932558 -0.81971762 -0.30903797  1.55686314\n",
      "  0.13357902 -2.64640936 -0.27541758 -1.84977339]\n",
      "Algo: Elim\n",
      "Final stopping time: 3638\n",
      "Starting an experiment - mean \n",
      "[ 0.21527716  1.65985352 -2.03109803 -0.1650518   1.12928031  1.03891951\n",
      " -0.73365474  0.54088077  0.19506328  0.55993506]\n",
      "Algo: TTUCB\n",
      "Final stopping time: 609\n",
      "Starting an experiment - mean \n",
      "[ 0.21527716  1.65985352 -2.03109803 -0.1650518   1.12928031  1.03891951\n",
      " -0.73365474  0.54088077  0.19506328  0.55993506]\n",
      "Algo: Elim\n",
      "Final stopping time: 2251\n",
      "Starting an experiment - mean \n",
      "[ 0.62396021  0.86077909  1.01047059 -0.86269248 -1.31187511  1.49914426\n",
      " -0.92078188 -0.61929037  0.28186123  0.79817902]\n",
      "Algo: TTUCB\n",
      "Final stopping time: 909\n",
      "Starting an experiment - mean \n",
      "[ 0.62396021  0.86077909  1.01047059 -0.86269248 -1.31187511  1.49914426\n",
      " -0.92078188 -0.61929037  0.28186123  0.79817902]\n",
      "Algo: Elim\n",
      "Final stopping time: 2928\n",
      "Starting an experiment - mean \n",
      "[ 0.34345543  3.16958324 -1.33767847 -0.40225486 -0.01419035  0.60513032\n",
      " -0.64908908 -0.9555705   1.51887103 -1.51302867]\n",
      "Algo: TTUCB\n",
      "Final stopping time: 72\n",
      "Starting an experiment - mean \n",
      "[ 0.34345543  3.16958324 -1.33767847 -0.40225486 -0.01419035  0.60513032\n",
      " -0.64908908 -0.9555705   1.51887103 -1.51302867]\n",
      "Algo: Elim\n",
      "Final stopping time: 155\n",
      "Starting an experiment - mean \n",
      "[-0.27218136  3.84158539 -0.3721809   1.36921416  0.01626839  1.44320422\n",
      "  1.40964348 -2.19061154  0.12420324  0.74067276]\n",
      "Algo: TTUCB\n",
      "Final stopping time: 62\n",
      "Starting an experiment - mean \n",
      "[-0.27218136  3.84158539 -0.3721809   1.36921416  0.01626839  1.44320422\n",
      "  1.40964348 -2.19061154  0.12420324  0.74067276]\n",
      "Algo: Elim\n",
      "Final stopping time: 175\n",
      "Starting an experiment - mean \n",
      "[-0.21592907  0.27749396 -1.30421831 -2.81587652 -0.22662191  1.42022524\n",
      " -0.94410689 -0.13979698  1.3401002  -2.00612153]\n",
      "Algo: TTUCB\n",
      "Final stopping time: 24756\n",
      "Starting an experiment - mean \n",
      "[-0.21592907  0.27749396 -1.30421831 -2.81587652 -0.22662191  1.42022524\n",
      " -0.94410689 -0.13979698  1.3401002  -2.00612153]\n",
      "Algo: Elim\n",
      "Final stopping time: 70855\n",
      "Starting an experiment - mean \n",
      "[ 0.67226424 -1.02711841 -0.17550445  0.59835208 -0.98948126  0.69308873\n",
      "  0.33550924 -2.03787668  2.06950728 -0.92897301]\n",
      "Algo: TTUCB\n",
      "Final stopping time: 189\n",
      "Starting an experiment - mean \n",
      "[ 0.67226424 -1.02711841 -0.17550445  0.59835208 -0.98948126  0.69308873\n",
      "  0.33550924 -2.03787668  2.06950728 -0.92897301]\n",
      "Algo: Elim\n",
      "Final stopping time: 599\n",
      "Starting an experiment - mean \n",
      "[-0.27670605  1.4842974   0.5066769   0.29754918  0.10249581  1.10768807\n",
      " -0.00853048 -4.15689738  2.38010235 -0.56874475]\n",
      "Algo: TTUCB\n",
      "Final stopping time: 273\n",
      "Starting an experiment - mean \n",
      "[-0.27670605  1.4842974   0.5066769   0.29754918  0.10249581  1.10768807\n",
      " -0.00853048 -4.15689738  2.38010235 -0.56874475]\n",
      "Algo: Elim\n",
      "Final stopping time: 857\n",
      "Starting an experiment - mean \n",
      "[-0.84474259  3.5934331  -1.03219724 -0.30116383 -0.73273412  1.43341292\n",
      "  0.25600769  0.37270717  1.43262682 -0.53538335]\n",
      "Algo: TTUCB\n",
      "Final stopping time: 64\n",
      "Starting an experiment - mean \n",
      "[-0.84474259  3.5934331  -1.03219724 -0.30116383 -0.73273412  1.43341292\n",
      "  0.25600769  0.37270717  1.43262682 -0.53538335]\n",
      "Algo: Elim\n",
      "Final stopping time: 201\n",
      "Starting an experiment - mean \n",
      "[-0.50712247  0.38477912  0.27008264  0.91616435 -1.11210284  0.09920528\n",
      " -1.36136343 -2.68104192  0.37510708 -0.23164388]\n",
      "Algo: TTUCB\n",
      "Final stopping time: 1118\n",
      "Starting an experiment - mean \n",
      "[-0.50712247  0.38477912  0.27008264  0.91616435 -1.11210284  0.09920528\n",
      " -1.36136343 -2.68104192  0.37510708 -0.23164388]\n",
      "Algo: Elim\n",
      "Final stopping time: 3324\n",
      "Starting an experiment - mean \n",
      "[ 0.66326376  0.56218474  0.74139273 -1.03531518 -0.19873761 -0.31879723\n",
      " -1.22541047  0.01103535  1.22469598 -0.93419466]\n",
      "Algo: TTUCB\n",
      "Final stopping time: 778\n",
      "Starting an experiment - mean \n",
      "[ 0.66326376  0.56218474  0.74139273 -1.03531518 -0.19873761 -0.31879723\n",
      " -1.22541047  0.01103535  1.22469598 -0.93419466]\n",
      "Algo: Elim\n",
      "Final stopping time: 4871\n",
      "Starting an experiment - mean \n",
      "[ 0.07789329  0.59115898 -0.24790812 -0.5166546  -0.04114645 -0.46117633\n",
      " -1.95366505 -1.97973079  0.64545625 -1.2474412 ]\n",
      "Algo: TTUCB\n",
      "Final stopping time: 35028\n",
      "Starting an experiment - mean \n",
      "[ 0.07789329  0.59115898 -0.24790812 -0.5166546  -0.04114645 -0.46117633\n",
      " -1.95366505 -1.97973079  0.64545625 -1.2474412 ]\n",
      "Algo: Elim\n",
      "Final stopping time: 207067\n",
      "Starting an experiment - mean \n",
      "[ 0.18378959 -0.83631738  0.42878517  0.61075637 -0.03173242  0.05511736\n",
      " -0.37900696 -0.85673038  1.3682263  -0.83056754]\n",
      "Algo: TTUCB\n",
      "Final stopping time: 350\n",
      "Starting an experiment - mean \n",
      "[ 0.18378959 -0.83631738  0.42878517  0.61075637 -0.03173242  0.05511736\n",
      " -0.37900696 -0.85673038  1.3682263  -0.83056754]\n",
      "Algo: Elim\n",
      "Final stopping time: 1084\n",
      "Starting an experiment - mean \n",
      "[ 0.81848382  2.18850003 -1.04723276  1.20313707 -1.27192426  0.4751723\n",
      "  0.67005098 -1.74004205  1.5436342  -0.08495898]\n",
      "Algo: TTUCB\n",
      "Final stopping time: 291\n",
      "Starting an experiment - mean \n",
      "[ 0.81848382  2.18850003 -1.04723276  1.20313707 -1.27192426  0.4751723\n",
      "  0.67005098 -1.74004205  1.5436342  -0.08495898]\n",
      "Algo: Elim\n",
      "Final stopping time: 1824\n",
      "Starting an experiment - mean \n",
      "[ 0.61178342  1.97053516  0.26459766  0.74523292 -0.88776607  1.0680447\n",
      "  0.10398698 -1.35722015 -1.33744262  0.6798107 ]\n",
      "Algo: TTUCB\n",
      "Final stopping time: 220\n",
      "Starting an experiment - mean \n",
      "[ 0.61178342  1.97053516  0.26459766  0.74523292 -0.88776607  1.0680447\n",
      "  0.10398698 -1.35722015 -1.33744262  0.6798107 ]\n",
      "Algo: Elim\n",
      "Final stopping time: 1193\n",
      "Starting an experiment - mean \n",
      "[ 0.09735425  2.32565449  0.7684657   0.07367514 -0.73480942  1.32033084\n",
      " -0.45726277 -2.21942349  1.01526213 -0.07619396]\n",
      "Algo: TTUCB\n",
      "Final stopping time: 246\n",
      "Starting an experiment - mean \n",
      "[ 0.09735425  2.32565449  0.7684657   0.07367514 -0.73480942  1.32033084\n",
      " -0.45726277 -2.21942349  1.01526213 -0.07619396]\n",
      "Algo: Elim\n",
      "Final stopping time: 878\n",
      "Starting an experiment - mean \n",
      "[-0.46280458  0.68127364 -0.89051629  0.53174283  0.37316877  1.28448878\n",
      "  1.88468605 -2.62240444  0.11949426 -0.51139926]\n",
      "Algo: TTUCB\n",
      "Final stopping time: 471\n",
      "Starting an experiment - mean \n",
      "[-0.46280458  0.68127364 -0.89051629  0.53174283  0.37316877  1.28448878\n",
      "  1.88468605 -2.62240444  0.11949426 -0.51139926]\n",
      "Algo: Elim\n",
      "Final stopping time: 1734\n",
      "Starting an experiment - mean \n",
      "[-0.4398849  -1.00652313  0.14422065  0.17878543 -0.20810685  0.57089495\n",
      "  0.40573296 -2.20610804  0.32723876 -0.41415629]\n",
      "Algo: TTUCB\n",
      "Final stopping time: 7640\n",
      "Starting an experiment - mean \n",
      "[-0.4398849  -1.00652313  0.14422065  0.17878543 -0.20810685  0.57089495\n",
      "  0.40573296 -2.20610804  0.32723876 -0.41415629]\n",
      "Algo: Elim\n",
      "Final stopping time: 25271\n",
      "Starting an experiment - mean \n",
      "[-0.25526    -3.97193847 -0.12412451 -0.10624618 -0.15728393  1.2837724\n",
      "  0.78836744 -2.91659185  1.25731552 -0.26240268]\n",
      "Algo: TTUCB\n",
      "Final stopping time: 260478\n",
      "Starting an experiment - mean \n",
      "[-0.25526    -3.97193847 -0.12412451 -0.10624618 -0.15728393  1.2837724\n",
      "  0.78836744 -2.91659185  1.25731552 -0.26240268]\n",
      "Algo: Elim\n",
      "Final stopping time: 688000\n",
      "Starting an experiment - mean \n",
      "[-0.6431886  -1.20645356 -1.11217092 -0.27371112 -0.50138868  0.12291855\n",
      "  0.07821047 -4.26280499  2.10718593  0.6235176 ]\n",
      "Algo: TTUCB\n",
      "Final stopping time: 134\n",
      "Starting an experiment - mean \n",
      "[-0.6431886  -1.20645356 -1.11217092 -0.27371112 -0.50138868  0.12291855\n",
      "  0.07821047 -4.26280499  2.10718593  0.6235176 ]\n",
      "Algo: Elim\n",
      "Final stopping time: 221\n",
      "Starting an experiment - mean \n",
      "[ 0.07299327  3.30301327 -0.14352586  0.64857435 -0.32384206  1.36714719\n",
      " -0.6471856  -0.06083934 -0.43277969 -0.19550903]\n",
      "Algo: TTUCB\n",
      "Final stopping time: 98\n",
      "Starting an experiment - mean \n",
      "[ 0.07299327  3.30301327 -0.14352586  0.64857435 -0.32384206  1.36714719\n",
      " -0.6471856  -0.06083934 -0.43277969 -0.19550903]\n",
      "Algo: Elim\n",
      "Final stopping time: 263\n",
      "Starting an experiment - mean \n",
      "[ 0.01827719 -0.34829818 -0.35418127 -0.49775093 -1.05060808  0.56800294\n",
      "  1.33890972 -2.49816835 -0.78060269 -0.39963821]\n",
      "Algo: TTUCB\n",
      "Final stopping time: 181\n",
      "Starting an experiment - mean \n",
      "[ 0.01827719 -0.34829818 -0.35418127 -0.49775093 -1.05060808  0.56800294\n",
      "  1.33890972 -2.49816835 -0.78060269 -0.39963821]\n",
      "Algo: Elim\n",
      "Final stopping time: 1248\n",
      "Starting an experiment - mean \n",
      "[-1.25889402 -0.16033008  0.57311971 -0.88487905 -0.30259052  1.33580173\n",
      " -0.46789718 -2.1156851  -0.60956775  0.51655676]\n",
      "Algo: TTUCB\n",
      "Final stopping time: 400\n",
      "Starting an experiment - mean \n",
      "[-1.25889402 -0.16033008  0.57311971 -0.88487905 -0.30259052  1.33580173\n",
      " -0.46789718 -2.1156851  -0.60956775  0.51655676]\n",
      "Algo: Elim\n",
      "Final stopping time: 1272\n",
      "Starting an experiment - mean \n",
      "[ 0.31812118  2.90557921 -0.17080406  0.05440629  1.55603971  1.39293568\n",
      " -0.41826318 -0.93043798  1.53583885 -1.41051739]\n",
      "Algo: TTUCB\n",
      "Final stopping time: 125\n",
      "Starting an experiment - mean \n",
      "[ 0.31812118  2.90557921 -0.17080406  0.05440629  1.55603971  1.39293568\n",
      " -0.41826318 -0.93043798  1.53583885 -1.41051739]\n",
      "Algo: Elim\n",
      "Final stopping time: 616\n",
      "Starting an experiment - mean \n",
      "[-0.38277118  0.80206934  0.12026038 -0.05563762 -0.25734235  0.78130366\n",
      " -0.24973464  0.08898352 -0.34640104  0.03167669]\n",
      "Algo: TTUCB\n",
      "Final stopping time: 103726\n",
      "Starting an experiment - mean \n",
      "[-0.38277118  0.80206934  0.12026038 -0.05563762 -0.25734235  0.78130366\n",
      " -0.24973464  0.08898352 -0.34640104  0.03167669]\n",
      "Algo: Elim\n",
      "Final stopping time: 1662687\n",
      "Starting an experiment - mean \n",
      "[-0.07881791  1.59765642  2.59035128 -0.0508068  -0.7686916   0.55256588\n",
      " -1.29192413 -0.538541    1.83226075 -0.43401678]\n",
      "Algo: TTUCB\n",
      "Final stopping time: 263\n",
      "Starting an experiment - mean \n",
      "[-0.07881791  1.59765642  2.59035128 -0.0508068  -0.7686916   0.55256588\n",
      " -1.29192413 -0.538541    1.83226075 -0.43401678]\n",
      "Algo: Elim\n",
      "Final stopping time: 884\n",
      "Starting an experiment - mean \n",
      "[ 0.10151492  1.78812798 -1.79853022 -0.46180054  0.24903947  0.64233119\n",
      "  1.03042213 -0.49499439  0.440636   -0.25474983]\n",
      "Algo: TTUCB\n",
      "Final stopping time: 340\n",
      "Starting an experiment - mean \n",
      "[ 0.10151492  1.78812798 -1.79853022 -0.46180054  0.24903947  0.64233119\n",
      "  1.03042213 -0.49499439  0.440636   -0.25474983]\n",
      "Algo: Elim\n",
      "Final stopping time: 1205\n",
      "Starting an experiment - mean \n",
      "[-0.33921541  0.25076509  0.37181353  0.47610285 -0.20347682  1.79469709\n",
      "  0.37347464 -1.35461759  0.47769739  0.59133661]\n",
      "Algo: TTUCB\n",
      "Final stopping time: 207\n",
      "Starting an experiment - mean \n",
      "[-0.33921541  0.25076509  0.37181353  0.47610285 -0.20347682  1.79469709\n",
      "  0.37347464 -1.35461759  0.47769739  0.59133661]\n",
      "Algo: Elim\n",
      "Final stopping time: 647\n",
      "Starting an experiment - mean \n",
      "[ 0.17833653  0.43656936  0.38349734 -0.61129258 -0.24716966  1.24371082\n",
      " -0.53859307 -2.206163    0.86600069 -1.32277774]\n",
      "Algo: TTUCB\n",
      "Final stopping time: 1260\n",
      "Starting an experiment - mean \n",
      "[ 0.17833653  0.43656936  0.38349734 -0.61129258 -0.24716966  1.24371082\n",
      " -0.53859307 -2.206163    0.86600069 -1.32277774]\n",
      "Algo: Elim\n",
      "Final stopping time: 2923\n",
      "Starting an experiment - mean \n",
      "[-0.70465965  0.38043588 -2.48723543 -0.96306133 -0.75737356  0.54410066\n",
      "  1.32572567 -1.30482618  0.67198129 -1.15814235]\n",
      "Algo: TTUCB\n",
      "Final stopping time: 497\n",
      "Starting an experiment - mean \n",
      "[-0.70465965  0.38043588 -2.48723543 -0.96306133 -0.75737356  0.54410066\n",
      "  1.32572567 -1.30482618  0.67198129 -1.15814235]\n",
      "Algo: Elim\n",
      "Final stopping time: 1983\n",
      "Starting an experiment - mean \n",
      "[ 0.6230725  -0.03093752 -0.74144194  0.13890957  0.11442457  1.09369153\n",
      "  1.02909565 -1.5120724   1.02593895 -0.50598918]\n",
      "Algo: TTUCB\n",
      "Final stopping time: 52236\n",
      "Starting an experiment - mean \n",
      "[ 0.6230725  -0.03093752 -0.74144194  0.13890957  0.11442457  1.09369153\n",
      "  1.02909565 -1.5120724   1.02593895 -0.50598918]\n",
      "Algo: Elim\n",
      "Final stopping time: 204331\n",
      "Starting an experiment - mean \n",
      "[-0.89731036 -0.64730437  0.771684   -0.33626143 -0.7929227   0.08805702\n",
      " -1.06948169 -0.41519865  0.34126316 -0.72816223]\n",
      "Algo: TTUCB\n",
      "Final stopping time: 1090\n",
      "Starting an experiment - mean \n",
      "[-0.89731036 -0.64730437  0.771684   -0.33626143 -0.7929227   0.08805702\n",
      " -1.06948169 -0.41519865  0.34126316 -0.72816223]\n",
      "Algo: Elim\n",
      "Final stopping time: 3732\n",
      "Starting an experiment - mean \n",
      "[ 0.20913805  1.3212347  -0.41768451 -0.21211141  0.27191976  0.960876\n",
      " -0.39657509 -3.02753794  0.92167553 -1.59519304]\n",
      "Algo: TTUCB\n",
      "Final stopping time: 1576\n",
      "Starting an experiment - mean \n",
      "[ 0.20913805  1.3212347  -0.41768451 -0.21211141  0.27191976  0.960876\n",
      " -0.39657509 -3.02753794  0.92167553 -1.59519304]\n",
      "Algo: Elim\n",
      "Final stopping time: 5813\n",
      "Starting an experiment - mean \n",
      "[-0.50050139  0.08189881 -1.22426333 -0.67319175 -0.79397068 -0.19009532\n",
      " -0.44350722 -0.19837988  1.3636287   0.18210738]\n",
      "Algo: TTUCB\n",
      "Final stopping time: 141\n",
      "Starting an experiment - mean \n",
      "[-0.50050139  0.08189881 -1.22426333 -0.67319175 -0.79397068 -0.19009532\n",
      " -0.44350722 -0.19837988  1.3636287   0.18210738]\n",
      "Algo: Elim\n",
      "Final stopping time: 400\n",
      "Starting an experiment - mean \n",
      "[ 0.76911663 -0.00273495 -0.09905039  0.0560932   0.13456968  0.84662735\n",
      " -1.36757456  0.16244199  2.01308305  1.20539389]\n",
      "Algo: TTUCB\n",
      "Final stopping time: 480\n",
      "Starting an experiment - mean \n",
      "[ 0.76911663 -0.00273495 -0.09905039  0.0560932   0.13456968  0.84662735\n",
      " -1.36757456  0.16244199  2.01308305  1.20539389]\n",
      "Algo: Elim\n",
      "Final stopping time: 755\n",
      "Starting an experiment - mean \n",
      "[ 0.74904969 -0.6779648  -2.48406996  1.34145666  0.02728006  0.53621378\n",
      "  1.43898263 -0.60722989  0.32164337  0.83464254]\n",
      "Algo: TTUCB\n",
      "Final stopping time: 10413\n",
      "Starting an experiment - mean \n",
      "[ 0.74904969 -0.6779648  -2.48406996  1.34145666  0.02728006  0.53621378\n",
      "  1.43898263 -0.60722989  0.32164337  0.83464254]\n",
      "Algo: Elim\n",
      "Final stopping time: 49031\n",
      "Starting an experiment - mean \n",
      "[ 0.13441811 -0.3400285  -1.30832711 -0.7878551  -0.0309684   1.31712476\n",
      " -0.22311615  0.10568803  1.4713716  -1.07433388]\n",
      "Algo: TTUCB\n",
      "Final stopping time: 4325\n",
      "Starting an experiment - mean \n",
      "[ 0.13441811 -0.3400285  -1.30832711 -0.7878551  -0.0309684   1.31712476\n",
      " -0.22311615  0.10568803  1.4713716  -1.07433388]\n",
      "Algo: Elim\n",
      "Final stopping time: 21651\n",
      "Starting an experiment - mean \n",
      "[ 0.11967944  2.60960354  0.22724971 -0.23377575  0.62573745  1.1075344\n",
      "  0.18283099 -1.32359508  0.08245839  0.12973196]\n",
      "Algo: TTUCB\n",
      "Final stopping time: 98\n",
      "Starting an experiment - mean \n",
      "[ 0.11967944  2.60960354  0.22724971 -0.23377575  0.62573745  1.1075344\n",
      "  0.18283099 -1.32359508  0.08245839  0.12973196]\n",
      "Algo: Elim\n",
      "Final stopping time: 267\n",
      "Starting an experiment - mean \n",
      "[-0.3849586   1.41565766  0.68109824 -0.78595966 -0.56430107  0.68393974\n",
      "  0.19971975  0.0490007  -1.13124029 -1.48656168]\n",
      "Algo: TTUCB\n",
      "Final stopping time: 387\n",
      "Starting an experiment - mean \n",
      "[-0.3849586   1.41565766  0.68109824 -0.78595966 -0.56430107  0.68393974\n",
      "  0.19971975  0.0490007  -1.13124029 -1.48656168]\n",
      "Algo: Elim\n",
      "Final stopping time: 1505\n",
      "Starting an experiment - mean \n",
      "[ 0.61685403  1.4276782   0.12628456 -0.83219171 -0.17806544  0.95341713\n",
      " -0.32938178  1.06436465  2.09181204 -1.99157832]\n",
      "Algo: TTUCB\n",
      "Final stopping time: 309\n",
      "Starting an experiment - mean \n",
      "[ 0.61685403  1.4276782   0.12628456 -0.83219171 -0.17806544  0.95341713\n",
      " -0.32938178  1.06436465  2.09181204 -1.99157832]\n",
      "Algo: Elim\n",
      "Final stopping time: 1102\n",
      "Starting an experiment - mean \n",
      "[-0.21949224  1.86862953  0.27396094  1.80071773  0.80412082  1.86642205\n",
      " -0.1196411  -2.78601632  1.73921168  0.33313968]\n",
      "Algo: TTUCB\n",
      "Final stopping time: 35366480\n",
      "Starting an experiment - mean \n",
      "[-0.21949224  1.86862953  0.27396094  1.80071773  0.80412082  1.86642205\n",
      " -0.1196411  -2.78601632  1.73921168  0.33313968]\n",
      "Algo: Elim\n",
      "Final stopping time: 1729264\n",
      "Starting an experiment - mean \n",
      "[ 0.7334254  -0.7177483   0.1156498   0.11914768  0.04036798  0.53478365\n",
      "  1.39317177 -0.6362539   2.28543411  0.78118474]\n",
      "Algo: TTUCB\n",
      "Final stopping time: 283\n",
      "Starting an experiment - mean \n",
      "[ 0.7334254  -0.7177483   0.1156498   0.11914768  0.04036798  0.53478365\n",
      "  1.39317177 -0.6362539   2.28543411  0.78118474]\n",
      "Algo: Elim\n",
      "Final stopping time: 1072\n",
      "Starting an experiment - mean \n",
      "[ 0.29268556  1.17355708 -0.2218175  -1.1009817  -0.14410772  0.71869114\n",
      "  0.09930412 -1.57629965 -0.24396512 -0.71364378]\n",
      "Algo: TTUCB\n",
      "Final stopping time: 879\n",
      "Starting an experiment - mean \n",
      "[ 0.29268556  1.17355708 -0.2218175  -1.1009817  -0.14410772  0.71869114\n",
      "  0.09930412 -1.57629965 -0.24396512 -0.71364378]\n",
      "Algo: Elim\n",
      "Final stopping time: 3342\n",
      "Starting an experiment - mean \n",
      "[ 0.70540084  1.39278298 -1.70191293  0.2665429  -0.53981977  1.41674891\n",
      " -0.80649592  0.03212709  0.53154903 -0.22107815]\n",
      "Algo: TTUCB\n",
      "Final stopping time: 269542\n",
      "Starting an experiment - mean \n",
      "[ 0.70540084  1.39278298 -1.70191293  0.2665429  -0.53981977  1.41674891\n",
      " -0.80649592  0.03212709  0.53154903 -0.22107815]\n",
      "Algo: Elim\n",
      "Final stopping time: 949530\n",
      "Starting an experiment - mean \n",
      "[-0.09373526  2.47030983 -0.68640829 -0.63765133 -0.42671049  1.3231103\n",
      "  0.25683933 -0.38612172  1.85197683 -0.40664737]\n",
      "Algo: TTUCB\n",
      "Final stopping time: 405\n",
      "Starting an experiment - mean \n",
      "[-0.09373526  2.47030983 -0.68640829 -0.63765133 -0.42671049  1.3231103\n",
      "  0.25683933 -0.38612172  1.85197683 -0.40664737]\n",
      "Algo: Elim\n",
      "Final stopping time: 1053\n",
      "Starting an experiment - mean \n",
      "[-0.0981816   1.52772017  0.51005169  0.08175576 -0.17772725  0.9051571\n",
      "  0.294095   -2.79284296  1.650713   -1.80587734]\n",
      "Algo: TTUCB\n",
      "Final stopping time: 16545\n",
      "Starting an experiment - mean \n",
      "[-0.0981816   1.52772017  0.51005169  0.08175576 -0.17772725  0.9051571\n",
      "  0.294095   -2.79284296  1.650713   -1.80587734]\n",
      "Algo: Elim\n",
      "Final stopping time: 39080\n",
      "Starting an experiment - mean \n",
      "[ 0.24532677  0.90594031 -0.16157349  0.84157942  0.00392258  0.33645731\n",
      "  0.18746568 -1.66079933  0.0870834  -0.18340025]\n",
      "Algo: TTUCB\n",
      "Final stopping time: 14538\n",
      "Starting an experiment - mean \n",
      "[ 0.24532677  0.90594031 -0.16157349  0.84157942  0.00392258  0.33645731\n",
      "  0.18746568 -1.66079933  0.0870834  -0.18340025]\n",
      "Algo: Elim\n",
      "Final stopping time: 109263\n",
      "Starting an experiment - mean \n",
      "[-0.08020202 -0.9501707   0.61887735 -0.88724656  0.1985164   1.44527016\n",
      "  0.14597385 -0.59119784  1.3138627  -1.15052167]\n",
      "Algo: TTUCB\n",
      "Final stopping time: 9209\n",
      "Starting an experiment - mean \n",
      "[-0.08020202 -0.9501707   0.61887735 -0.88724656  0.1985164   1.44527016\n",
      "  0.14597385 -0.59119784  1.3138627  -1.15052167]\n",
      "Algo: Elim\n",
      "Final stopping time: 13906\n",
      "Starting an experiment - mean \n",
      "[-0.46557251  0.18669066 -0.09593575  0.90679875  0.25273021  1.08735504\n",
      " -0.26221846 -3.52995276  0.41216247  1.10779169]\n",
      "Algo: TTUCB\n",
      "Final stopping time: 221883\n",
      "Starting an experiment - mean \n",
      "[-0.46557251  0.18669066 -0.09593575  0.90679875  0.25273021  1.08735504\n",
      " -0.26221846 -3.52995276  0.41216247  1.10779169]\n",
      "Algo: Elim\n",
      "Final stopping time: 475398\n",
      "Starting an experiment - mean \n",
      "[-0.54143345  1.88949932 -0.0155136  -0.17154022  0.3967036   0.66957071\n",
      "  0.88090602 -3.30684227  0.38887499 -1.2918264 ]\n",
      "Algo: TTUCB\n",
      "Final stopping time: 137\n",
      "Starting an experiment - mean \n",
      "[-0.54143345  1.88949932 -0.0155136  -0.17154022  0.3967036   0.66957071\n",
      "  0.88090602 -3.30684227  0.38887499 -1.2918264 ]\n",
      "Algo: Elim\n",
      "Final stopping time: 641\n",
      "Starting an experiment - mean \n",
      "[-1.67636564 -0.7023827   1.03741903 -0.06202934 -0.05847168  0.78354231\n",
      "  1.38555364 -0.08557426  1.40891245 -1.44167639]\n",
      "Algo: TTUCB\n",
      "Final stopping time: 189970\n",
      "Starting an experiment - mean \n",
      "[-1.67636564 -0.7023827   1.03741903 -0.06202934 -0.05847168  0.78354231\n",
      "  1.38555364 -0.08557426  1.40891245 -1.44167639]\n",
      "Algo: Elim\n",
      "Final stopping time: 1214321\n",
      "Starting an experiment - mean \n",
      "[-0.21459041 -0.28268004  1.00403553  0.34474679 -0.26929059  0.57023574\n",
      "  1.85677766 -0.51467165  1.60564942 -1.41499545]\n",
      "Algo: TTUCB\n",
      "Final stopping time: 3235\n",
      "Starting an experiment - mean \n",
      "[-0.21459041 -0.28268004  1.00403553  0.34474679 -0.26929059  0.57023574\n",
      "  1.85677766 -0.51467165  1.60564942 -1.41499545]\n",
      "Algo: Elim\n",
      "Final stopping time: 11129\n",
      "Starting an experiment - mean \n",
      "[ 0.01577542  0.96590116  0.24896186  0.96361093 -0.66370369  1.62482015\n",
      "  0.6108908  -0.3402913   0.63683202 -1.40369137]\n",
      "Algo: TTUCB\n",
      "Final stopping time: 714\n",
      "Starting an experiment - mean \n",
      "[ 0.01577542  0.96590116  0.24896186  0.96361093 -0.66370369  1.62482015\n",
      "  0.6108908  -0.3402913   0.63683202 -1.40369137]\n",
      "Algo: Elim\n",
      "Final stopping time: 1977\n",
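      "4418.53435921669\n"
     ]
    }
   ],
   "source": [
    "# Time the full Monte Carlo experiment (wall-clock seconds)\n",
    "tictictic=time.time()\n",
    "res, ult= myExp.monte_carlo_experiment(500)\n",
    "toctoctoc=time.time()\n",
    "# Elapsed time is end minus start, so the printed value is positive\n",
    "print(toctoctoc-tictictic)"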
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Stopping time variables\n",
    "\n",
    "- ElimStop: Stopping time history for our algorithm (Elim)\n",
    "- TTTSStop: Stopping time history for TTTS\n",
    "- TTUCBStop: Stopping time history for TTUCB\n",
    "\n",
    "### Computation time variables\n",
    "\n",
    "- ElimCompTime: Computation time history for our algorithm (Elim)\n",
    "- TTTSCompTime: Computation time history for TTTS\n",
    "- TTUCBCompTime: Computation time history for TTUCB\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 54,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "executionInfo": {
     "elapsed": 141,
     "status": "ok",
     "timestamp": 1702786024399,
     "user": {
      "displayName": "Kyoungseok Jang",
      "userId": "07027991917961288603"
     },
     "user_tz": 300
    },
    "id": "DQlyiFmxwODu",
    "outputId": "102e3eb6-8861-4586-b366-3425dc420200"
   },
   "outputs": [],
   "source": [
    "# Number of Monte Carlo runs to extract; should not exceed the number of runs performed\n",
    "lnow=500\n",
    "# Each run logs TTUCB first and Elim second, so even indices are TTUCB and odd indices are Elim\n",
    "ElimStop = [myExp.stopping_time_hist[2*i+1] for i in range(lnow)]\n",
    "# TTTS is disabled here; its 3*i+1 indexing assumes three algorithms per run\n",
    "#TTTSStop = [myExp.stopping_time_hist[3*i+1] for i in range(lnow)]\n",
    "TTUCBStop = [myExp.stopping_time_hist[2*i] for i in range(lnow)]\n",
    "ElimCompTime = [myExp.time_spent[2*i+1] for i in range(lnow)]\n",
    "#TTTSCompTime = [myExp.time_spent[3*i+1] for i in range(lnow)]\n",
    "TTUCBCompTime = [myExp.time_spent[2*i] for i in range(lnow)]\n",
    "# Save stopping times and computation times, one value per line\n",
    "np.savetxt('Elim_K2_1000.txt', ElimStop, delimiter=',')\n",
    "#np.savetxt('TTTS_K2_1000.txt', TTTSStop, delimiter=',')\n",
    "np.savetxt('TTUCB_K2_1000.txt', TTUCBStop, delimiter=',')\n",
    "np.savetxt('Elim_Comptime_K2_1000.txt', ElimCompTime, delimiter=',')\n",
    "#np.savetxt('TTTS_Comptime_K2_1000.txt', TTTSCompTime, delimiter=',')\n",
    "np.savetxt('TTUCB_Comptime_K2_1000.txt', TTUCBCompTime, delimiter=',')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 55,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "110410.874\n",
      "709560.568\n"
     ]
    }
   ],
   "source": [
    "print(np.mean(ElimStop))\n",
    "#print(np.mean(TTTSStop))\n",
    "print(np.mean(TTUCBStop))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 56,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "1.1770308356285095\n",
      "243.8882636961937\n"
     ]
    }
   ],
   "source": [
    "print(np.mean(ElimCompTime))\n",
    "#print(np.mean(TTTSCompTime))\n",
    "print(np.mean(TTUCBCompTime))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 57,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "2355069\n",
      "100000001\n"
     ]
    }
   ],
   "source": [
    "print(np.max(ElimStop))\n",
    "#print(np.max(TTTSStop))\n",
    "print(np.max(TTUCBStop))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Recorded wall-clock times from earlier runs:\n",
    "# 100 experiments: 12213 seconds\n",
    "# 200 experiments: 105223.81118011475 seconds\n",
    "# 100 experiments: 4418 seconds"
   ]
  }
 ],
 "metadata": {
  "colab": {
   "authorship_tag": "ABX9TyM3WMzKUtrcLePPPBMWj71C",
   "name": "",
   "version": ""
  },
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11.8"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
