Random High-Low Searching

Consider a row of n identical boxes arranged in a line from left to 
right.  We know that in one of the boxes, selected secretly at random, 
has been placed a red marble, whereas in all the boxes to the left of 
that box has been placed a black marble, and in all the boxes to the 
right has been placed a white marble.  Assuming we adopt the strategy
of randopmly selecting from among the boxes that have not yet been
ruled out, what is the expected number of boxes we would need to check 
to find the red marble?

Let f_n denote the expected number of guesses before finding the box
with the red marble.  Obviously the probability of selecting the 
correct box on the first guess is 1/n.  If the correct box is not 
selected on the first guess then the expected number of additional 
guesses to reach the drawn number is f_k, where k is the remaining 
number of possible choices after the first guess.  Therefore, letting
F_k denote the weighted average of f_k for all possible k following
the first guess, we have

             / 1 \           / n-1 \  /             \
     f_n  = ( --- ) f_1  +  ( ----- )(  f_1  +  F_k  )         (1)
             \ n /           \  n  /  \             /

To determine F_k, suppose the first guess is the jth box. If j is not 
the correct box then either j is too high and k=j-1, or j is too low 
and k=n-j.  The probabilities of these two outcomes are (j-1)/(n-1) 
and (n-j)/(n-1) respectively.  Thus the weighted average of all 
possible values of f_k is
                  _                                         _
           1     |   / j-1 \               / n-j \           |
   F_k  =  - SUM |  ( ----- ) f_{j-1}  +  ( ----- ) f_{n-j}  |
           n     |_  \ n-1 /               \ n-1 /          _|


Inserting this expression for F_k into (1) gives the recurrence

                               2    n-1
                f_n  =  1  +  ---  SUM   j f_j
                              n^2   j=1

from which it follows that

                    / n^2 - 1 \               / 2n - 1 \
           f_n  =  ( --------- ) f_{n-1}  +  ( -------- )
                    \   n^2   /               \   n^2  /

In terms of the variable  u_j = j f_j - 1 this recurrence reduces to

                       /     1 \
              u_n  =  ( 1 + --- ) u_{n-1}  +  2
                       \     n /

with u_1=0.  The solution of this first order recurrence is just

                                  n+1  1
                 u_n  =  2 (n+1) SUM  ---
                                  k=3  k

Expressing this in terms of fn gives

                        / n+1 \  /   n   1 \
              f_n  =  2( ----- )(  SUM  --- ) - 3
                        \  n  /  \  k=1  k /

which for large n can be approximated by

                   f_n  ~  2 ln(n) - (3 - 2g)

where g = 0.57721... is Euler's constant.


By the way, if, instead of choosing randomly from the boxes not 
already ruled out, we select the box in the middle of the viable 
range, then the expected number of guesses would be

                 1 - 2^(m+1) + (m+1)(n+1)
         f_n  =  ------------------------
                            n

where m is the least integer greater than log_2(n).  In particular, 
if n is a power of 2 then

                         / n+1 \  ln(n)     n-2
                f_n  =  ( ----- ) -----  -  ---
                         \  n  /  ln(2)      n

which approaches

                         ln(n)
                 f_n  ~  -----  -  1
                         ln(2)

for large n.  It's interesting that this is just one less than the 
maximum number of guesses that could be required using the "mid-box" 
strategy.

For a more explicit determination of the constituient probabilities,
see Expected Length of Random High-Low Search.

Return to MathPages Main Menu
Сайт управляется системой uCoz