Two-Dimensional Bernoulli Trials

Suppose we have a tray with a grid of "dimples" onto which marbles 
can be placed.  Say this tray is 20x20 (2 dimensional).  We dip 
this tray into a large barrel of red and blue marbles, with a 
population density of 95% red and 5% blue marbles, and each dimple
acquires a marble.  What is the probability of achieving a 5x5 
square region of all red marbles?

This question is harder than it looks.  The direct inclusion-exclusion 
approach will work in principle, but seems to be computationally 
intractable for the stated problem.  Also, the simple recurrence 
relations that can be used to compute the probabilities of runs in 
one-dimensional Bernoulli trials aren't as easy to apply in this 
two-dimensional case.  So, does anyone know a good method for 
computing the exact answer to the question?

In general suppose we have an m-by-m square array of Bernoulli
trials, and the probability of success on each individual trial
is p.  What is the probability that this array will contain at
least one n-by-n square of successes?

Denoting this probability by Pr{m,n,p}, we obviously have

                 Pr{m,1,p}  =  (1-p)^(m^2)

so in the following we assume n>1.  If  m-n = 1  then I think the 
answer is

                /                                         \
 Pr{n+1,n,p} = ( 4 - 4p^n - 2p^(2n-1) + 4p^(2n) - p^(2n+1) ) p^(n^2) 
                \                                         /

In general, I think the answer for m-n=k can be expressed as

               Pr{n+k,n,p}  =  Q(p) p^(n^2)                    (1)

where Q(p) is a polynomial in p with coefficients c[j] and exponents
a[j] as shown below

  Q(p) =  c[0]  +  c[1] p^a[1]  +  c[2] p^a[2]  +  ... + c[t] p^a[t]

So the above solution for k=1 (= m-n) can be specified by the 
coefficients and exponents of Q as follows

                j   c[j]   a[j]
               ---  ----   ----
                0     4      0
                1    -4      n
                2    -2     2n-1
                3     4     2n
                4    -1     2n+1

If  k = 2  I think the answer is given by equation (1) with the 
coefficients and exponents of Q shown below

                j    c[j]   a[j]
               ---   ----   ----
                0      9      0
                1    -12      n
                2     -8     2n-1
                3     16     2n
                4     -4     2n+1
                5     -8     3n-2
                6     16     3n-1
                7     -8     3n
                8     -2     4n-4
                9     10     4n-2
               10    -12     4n-1
               11      4     4n

With k=3 the coefficients and exponents of Q are evidently given
by
               j      c[j]    a[j]
              ---     ----    ----
               0       16      0
               1      -24      n
               2      -18     2n-1
               3       36     2n
               4       -9     2n+1
               5      -24     3n-2
               6       48     3n-1
               7      -24     3n
               8       -8     4n-4
               9      -12     4n-3
              10       64     4n-2
              11      -60     4n-1
              12       16     4n
              13       -8     5n-6
              14       16     5n-4
              15       32     5n-3
              16      -80     5n-2
              17       48     5n-1
              18       -8     5n
              19       -2     6n-9
              20        8     6n-6
              21        4     6n-5
              22       -8     6n-4
              23      -30     6n-3
              24       48     6n-2
              25      -24     6n-1
              26        4     6n

These results suggest that the first several terms of Pr{n+k,n,p}
are given by

          /                                                       \
 p^(n^2) ( (k+1)^2 - 2(k+1)(k) p^n - k^2 p^(2n-1) (2-4p+p^2) - ... )
          \                                                       /

so if p is sufficiently small (say, less than 1/2) these terms could 
be used to approximate the answer.  Unfortunately for large values
of p the result doesn't converge in these terms.  For example, the
original problem had m=20,n=5, so we have k=15 and the first few 
terms of the solution are

          /                                          \
    p^25 ( 256 - 480p^5 - 225p^9 [2 - 4p + p^2] - ... )
          \                                          /

If p was equal to, say, 1/2 we would probably be safe in assuming all
subsequent terms are negligible (because the coefficients c[j] don't
seem to increase very fast), allowing us to estimate the result as

         (0.5)^25 (256 - 15 - 0.10986)  ~  7.18(10)^-6

However, with p=0.95 as in the original problem the result clearly
doesn't converge unless nearly all of the terms in Q(p) are included.
Since the number of terms seems to at least double with each increment 
of k, it seems there would be well over 100,000 terms in the full 
expression for Pr{20,5,p}.

I suppose another approach would be to regard the m^2 trials as a
linear sequence (taking them row by row), and treat the squares of
successes as linear patterns, e.g., with m=5,n=2 the squares are
just linear sequences of the form 1100011 (where 0 means "don't care").  
However, I don't see any simply way of excluding wrap-around" squares.

Does anyone know a good way of computing the answer to the original
question?

Return to MathPages Main Menu
Сайт управляется системой uCoz