Two-Dimensional Bernoulli Trials
Suppose we have a tray with a grid of "dimples" onto which marbles
can be placed. Say this tray is 20x20 (2 dimensional). We dip
this tray into a large barrel of red and blue marbles, with a
population density of 95% red and 5% blue marbles, and each dimple
acquires a marble. What is the probability of achieving a 5x5
square region of all red marbles?
This question is harder than it looks. The direct inclusion-exclusion
approach will work in principle, but seems to be computationally
intractable for the stated problem. Also, the simple recurrence
relations that can be used to compute the probabilities of runs in
one-dimensional Bernoulli trials aren't as easy to apply in this
two-dimensional case. So, does anyone know a good method for
computing the exact answer to the question?
In general suppose we have an m-by-m square array of Bernoulli
trials, and the probability of success on each individual trial
is p. What is the probability that this array will contain at
least one n-by-n square of successes?
Denoting this probability by Pr{m,n,p}, we obviously have
Pr{m,1,p} = (1-p)^(m^2)
so in the following we assume n>1. If m-n = 1 then I think the
answer is
/ \
Pr{n+1,n,p} = ( 4 - 4p^n - 2p^(2n-1) + 4p^(2n) - p^(2n+1) ) p^(n^2)
\ /
In general, I think the answer for m-n=k can be expressed as
Pr{n+k,n,p} = Q(p) p^(n^2) (1)
where Q(p) is a polynomial in p with coefficients c[j] and exponents
a[j] as shown below
Q(p) = c[0] + c[1] p^a[1] + c[2] p^a[2] + ... + c[t] p^a[t]
So the above solution for k=1 (= m-n) can be specified by the
coefficients and exponents of Q as follows
j c[j] a[j]
--- ---- ----
0 4 0
1 -4 n
2 -2 2n-1
3 4 2n
4 -1 2n+1
If k = 2 I think the answer is given by equation (1) with the
coefficients and exponents of Q shown below
j c[j] a[j]
--- ---- ----
0 9 0
1 -12 n
2 -8 2n-1
3 16 2n
4 -4 2n+1
5 -8 3n-2
6 16 3n-1
7 -8 3n
8 -2 4n-4
9 10 4n-2
10 -12 4n-1
11 4 4n
With k=3 the coefficients and exponents of Q are evidently given
by
j c[j] a[j]
--- ---- ----
0 16 0
1 -24 n
2 -18 2n-1
3 36 2n
4 -9 2n+1
5 -24 3n-2
6 48 3n-1
7 -24 3n
8 -8 4n-4
9 -12 4n-3
10 64 4n-2
11 -60 4n-1
12 16 4n
13 -8 5n-6
14 16 5n-4
15 32 5n-3
16 -80 5n-2
17 48 5n-1
18 -8 5n
19 -2 6n-9
20 8 6n-6
21 4 6n-5
22 -8 6n-4
23 -30 6n-3
24 48 6n-2
25 -24 6n-1
26 4 6n
These results suggest that the first several terms of Pr{n+k,n,p}
are given by
/ \
p^(n^2) ( (k+1)^2 - 2(k+1)(k) p^n - k^2 p^(2n-1) (2-4p+p^2) - ... )
\ /
so if p is sufficiently small (say, less than 1/2) these terms could
be used to approximate the answer. Unfortunately for large values
of p the result doesn't converge in these terms. For example, the
original problem had m=20,n=5, so we have k=15 and the first few
terms of the solution are
/ \
p^25 ( 256 - 480p^5 - 225p^9 [2 - 4p + p^2] - ... )
\ /
If p was equal to, say, 1/2 we would probably be safe in assuming all
subsequent terms are negligible (because the coefficients c[j] don't
seem to increase very fast), allowing us to estimate the result as
(0.5)^25 (256 - 15 - 0.10986) ~ 7.18(10)^-6
However, with p=0.95 as in the original problem the result clearly
doesn't converge unless nearly all of the terms in Q(p) are included.
Since the number of terms seems to at least double with each increment
of k, it seems there would be well over 100,000 terms in the full
expression for Pr{20,5,p}.
I suppose another approach would be to regard the m^2 trials as a
linear sequence (taking them row by row), and treat the squares of
successes as linear patterns, e.g., with m=5,n=2 the squares are
just linear sequences of the form 1100011 (where 0 means "don't care").
However, I don't see any simply way of excluding wrap-around" squares.
Does anyone know a good way of computing the answer to the original
question?
Return to MathPages Main Menu
Сайт управляется системой
uCoz