Added Thompson Sampling Strategy for Gaussian with Unknown Mean + Variance #12
base: master
Conversation
        samples = [self._sample(**params) for params in self.posterior_params]
        return np.argmax(samples)

    def pull_arm(self, arm_index):
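For context, the arm-selection step being reviewed is standard Thompson sampling: draw one mean from each arm's posterior and pull the argmax. Here is a minimal, self-contained sketch (not the PR's implementation) assuming a Normal-Inverse-Gamma posterior per arm; the parameter names `mu0`, `kappa`, `alpha`, `beta` and the helper names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_posterior_mean(mu0, kappa, alpha, beta, rng):
    """Draw one mean from a Normal-Inverse-Gamma posterior:
    sigma2 ~ InvGamma(alpha, beta), mu | sigma2 ~ N(mu0, sigma2 / kappa)."""
    sigma2 = beta / rng.gamma(alpha)  # inverse-gamma draw via 1/Gamma
    return rng.normal(mu0, np.sqrt(sigma2 / kappa))

def select_arm(posterior_params, rng):
    """Thompson sampling: pull the arm whose sampled mean is largest."""
    samples = [sample_posterior_mean(rng=rng, **p) for p in posterior_params]
    return int(np.argmax(samples))

# Two hypothetical arms; arm 1 has a higher posterior location.
params = [
    {"mu0": 0.0, "kappa": 5.0, "alpha": 3.0, "beta": 2.0},
    {"mu0": 1.0, "kappa": 5.0, "alpha": 3.0, "beta": 2.0},
]
arm = select_arm(params, rng)
```

Because selection is by sampling rather than by posterior mean, arms with wide posteriors still get explored occasionally.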
Just realized this isn't private. Can you change that (or I can do it)? Users shouldn't be interacting with the bandit directly except through .fit().
Sounds good to me. I can add that tonight.
    Attributes:
        num_arms (int)
        posterior_params (list of list): posterior parameters
        estimated_arm_means (ndarray): posterior predictive mean
I don't think it's the posterior predictive mean; isn't it just the posterior mean?
I think they are the same in this case, since the posterior mean of the mean is also the posterior predictive mean. And we want to maximize the posterior predictive mean.
The variance/standard deviation is another story. The posterior variance of the mean is smaller than the posterior predictive variance
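The point above can be checked numerically. Under a Normal-Inverse-Gamma posterior (hypothetical parameter names `mu0`, `kappa`, `alpha`, `beta`, not taken from the PR), both the posterior of the mean and the posterior predictive are Student-t distributions centred at `mu0`, so the two means coincide, while the predictive variance is larger by a factor of `kappa + 1`:

```python
# Hypothetical Normal-Inverse-Gamma posterior parameters for one arm
mu0, kappa, alpha, beta = 1.0, 5.0, 3.0, 2.0

# Both mu | data and x_new | data are Student-t centred at mu0,
# so the posterior mean of the mean equals the posterior predictive mean.
posterior_mean = mu0
predictive_mean = mu0

# Their variances differ (finite for alpha > 1):
var_of_mean = beta / (kappa * (alpha - 1))                    # Var(mu | data)
predictive_var = beta * (kappa + 1) / (kappa * (alpha - 1))   # Var(x_new | data)

print(var_of_mean, predictive_var)  # predictive variance is (kappa + 1)x larger
```

This is exactly the asymmetry the thread describes: the means agree, the spreads do not.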
Ohh, yeah, I see. They were the same since Wes and I only considered returning the mean. Hmm, I'm not sure what's best here.
""" | ||
def __init__(self, bandit, **kwargs): | ||
self.bandit = bandit | ||
self.num_arms = bandit.num_arms |
I'd like to make this into a property to keep the style consistent.
    @property
    def estimated_arm_sds(self):
        return np.array([self.sigma2 + params['sigma2']
Why is it the sum of these two?
I think this was in the first commit and fixed in the second. Originally, I wanted estimated_arm_sds to be the estimated arm standard deviation (the posterior predictive standard deviation). The sum was changed to just params['sigma2'] in the second commit.
Don't forget there's a 'view all changes' option (so you don't have to look at the commits one by one).
Made some unnecessary minor changes (property + private function). Currently it is (2). The goal is to give the user insight into why different arms may be pulled (the max of samples from the arm posteriors), hence it's the standard deviation of the posterior. The name might be a bit misleading.
Hmm, yeah, I don't know what's best between the posterior predictive SD and the posterior SD. Would it be fine to return both, but name them as such?
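Returning both under explicit names could look something like the sketch below. This is a hypothetical class, not the PR's code; the property names and the Normal-Inverse-Gamma parameter keys (`kappa`, `alpha`, `beta`) are assumptions made for illustration.

```python
import numpy as np

class GaussianTS:
    """Sketch: expose both SDs under explicit names (hypothetical class)."""

    def __init__(self, posterior_params):
        # Each dict holds Normal-Inverse-Gamma parameters for one arm.
        self.posterior_params = posterior_params

    @property
    def posterior_sds(self):
        # SD of mu | data: sqrt(beta / (kappa * (alpha - 1)))
        return np.array([np.sqrt(p["beta"] / (p["kappa"] * (p["alpha"] - 1)))
                         for p in self.posterior_params])

    @property
    def posterior_predictive_sds(self):
        # SD of x_new | data: sqrt(beta * (kappa + 1) / (kappa * (alpha - 1)))
        return np.array([np.sqrt(p["beta"] * (p["kappa"] + 1)
                                 / (p["kappa"] * (p["alpha"] - 1)))
                         for p in self.posterior_params])
```

Naming the two quantities separately sidesteps the ambiguity of a single `estimated_arm_sds` attribute.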
Also moved strategy.py into a subfolder and split it out a bit. Note: I found that selecting prior parameters is important.
This resolves #7