N Points On Sphere all in One Hemisphere

What are the odds that n randomly chosen points on a sphere lie in 
the same hemisphere?  We might first consider the question of points 
on a circle.  William Feller's book "An Introduction to Probability 
Theory and Its Applications" gives the probability that n arcs of 
length A chosen independently and randomly on the perimeter of a 
circle of circumference C cover the whole circle as

                     m           / n \   /        A  \ n-1
      Pr{cover} =  SUM  (-1)^k  (     ) (  1 - k ---  )
                    k=0          \ k /   \        C  /

where m is the greatest integer less than C/A.  The negation of this
covering event is that the arcs leave at least one gap, meaning that
the largest separation between the center points of two neighboring
arcs is greater than A.  This is equivalent to the condition that
all n center points fall on an arc of length less than C-A.  Thus,
the probability that all points fall on an arc of length less than
C-A is just  1 - Pr{cover}.  Letting x denote the fraction (C-A)/C
we have
                     m           / n \   /            \ n-1
      Pr{x;n} =  - SUM  (-1)^k  (     ) (  1 - k (1-x) )
                    k=1          \ k /   \            /

In particular, with x = 1/2 (meaning all n points fall in some half of
the circle) this gives Pr{1/2,n} = n/2^(n-1).

Another way of approaching this problem is to split the surface
of S^k in half along various axes, and use inclusion-exclusion to
compute the probability of all the points falling into one of those
discrete halves.  Then carry this over to the limit as the number
of halves approaches all possible halves.  This is admittedly not 
very efficient, but some of the results for the discrete cases are
interesting.

To illustrate, consider the simple case of S^1.  Divide the perimeter
into 2k equal arcs denoted sequentially by a_1, a_2,..., a_(2k), and
let A_i denote the event of all n randomly placed points lying on 
the semi-circle composed of the arcs a_i, a_(i+1),...a_(i+k-1).  To
determine the probability that at least one A_i is true, apply the
inclusion-exclusion principle step-by-step as follows:

   Pr{A_1}  =  (1/2)^n

   Pr{A_1 U A_2}  =  (1/2)^n  +  (1/2)^n  -  ((k-1)/2k)^n

   Pr{A_1 U A_2 U A_3}  

               =  Pr{A_1 U A_2} + Pr{A_3} - Pr{(A_1 U A_2) AND A_3}

and so on.  As each additional A_i up to A_(k+1) is included in the
union, the probability of the union is increased by 

                     / 1 \ n      / k-1 \ n
                    ( --- )   -  ( ----- )
                     \ 2 /        \  2k /

so we have

                                / 1 \ n      // 1 \ n    / k-1 \ n\
 Pr{A_1 U A_2 U...U A_(k+1)} = ( --- )  + k (( --- )  - ( ----- )  )
                                \ 2 /        \\ 2 /      \  2k /  /

As each of the remaining terms A_(k+i), i=2 to k, is added to the
union, the probability is increased by

       / 1 \ n      / k-1 \ n      / i-1 \ n      / i-2 \ n
      ( --- )   -  ( ----- )   -  ( ----- )   +  ( ----- )
       \ 2 /        \  2k /        \  2k /        \  2k /

The 3rd and 4th terms in these quantities cancel on consecutive values
of i, so their net effect is just a single term  -((k-1)/(2k))^n.  Thus, 
the probability of all n points falling in k contiguous arcs is
                                  _                        _
                                 |  / 1 \ n      / k-1 \ n  |
  Pr{A_1 U ... U A_(2k)}  =   2k | ( --- )   -  ( ----- )   | 
                                 |_ \ 2 /        \  2k /   _|

                                           _                 _
                                     k    |      /     1 \ n  |
                              =   ------- | 1 - ( 1 - --- )   |
                                  2^(n-1) |_     \     k /   _|

Expanding the binomial, we have

  Pr{A_1 U ... U A_(2k)} 
                       _                                       _
                 n    |     n-1  /1\    (n-1)(n-2)  /1\ 2       |
           =  ------- | 1 - --- ( - ) + ---------- ( - )  - ... |
              2^(n-1) |_     2!  \k/        3!      \k/        _|

which gives the result n/2^(n-1) in the limit as k -> inf.  It's
interesting that if our circle is not infinitely divisible, but 
is really composed of a large number of discrete segments (as 
may actually be the case in some quantum sense), then the true 
probability is given by the above discrete formula (with some 
suitable k) rather than the continuous limit.

So much for the trivial S^1 case.  Can we apply this same method 
to the S^2 case?  In principle it's certainly possible; simply draw 
several great circles on the sphere and then use inclusion-exclusion 
to find the probability of all n points falling in one of those
hemispheres.  However, it's not so easy to come up with a nice metric
for the surface of a sphere consisting entirely of symmetrical great 
circles such that the resolution can be increased indefinitely.

Choosing just a single axis and drawing the corresponding great circle, 
it's clear that the probability of all n points falling in one of these 
two specific hemispheres is (1/2)^(n-1).  Now suppose we choose three 
orthogonal axes and draw the corresponding great circles.  This divides 
the surface into 8 identical "triangular" regions that define 6 distinct 
hemispheres.  By inclusion-exclusion we find that the probability of 
all n points falling in one of these 6 hemispheres is exactly

                    / 1 \ n         / 1 \ n        / 1 \ n
   Pr{n;3}  =    6 ( --- )   -  12 ( --- )   +  8 ( --- )
                    \ 2 /           \ 4 /          \ 8 /

In the case of FOUR great circles (normal to the four axes of a cube)
the derivation is a bit more challenging.  In this case the surface
is divided into 6 "square" regions and 8 "triangular" regions.  If we 
let s and t denote the areas of these two types of regions as a fraction 
of the total area, then each of the 8 hemispheres has area 4t+3s = 1/2.  
The non-zero overlaps between two hemispheres come in two types: there 
are 12 of area 2t+2s and 12 of area 2t+s.  The non-zero overlaps 
between three hemispheres also come in two types: there are 24 of area 
t+s and 8 of area t.  Finally, the non-zero overlaps between four 
hemispheres come in two types: there are 8 of area t and 6 of area s.  
Thus, the probability of all n points falling within one of these 8 
hemispheres is

   Pr{n;4} = 

       8(4t+3s)^n - 12(2t+2s)^n - 12(2t+s)^n + 24(t+s)^n - 6(s)^n

Noting that the spherical angle at each corner of the triangular
regions is  2invtan(1/sqrt(2))  we have

                     t = 0.043869914...
                     s = 0.108173448...

and the probability can be expressed as
                     _                                              _
            / 1 \ n |           n      n       / 1-q \ n           n |
Pr{n;4} = 2( --- )  | 4 - 6(1-q)  -  6q   + 12( ----- )   - 3(1-2q)  |
            \ 2 /   |_                         \  2  /              _|

where q = 2invtan(1/sqrt(2))/PI.

By the way, the four great circles in this example are the projections 
of the edges of an inscribed "cuboctahedron", which is one of the
13 semi-regular Archimedean solids.  The only other Archimedian solid
whose projected edges are great circles is the "icosidodecahedron",
which has 6 great circles (one normal to each of the 6 axes of the
icosahedron).  I started working out the details of that case, but
then I decided to tackle the harder case of 10 great circles on a 
sphere, normal to the 10 axes of a dodecahedron.  This gives 20 
distinct hemispheres.  (Interestingly, the corresponding solid is 
not one of the Archimedean solids, because it has two distinct 
kinds of verticies.)

These 10 great circles divide the surface of the sphere into a total 
of 92 regions: 60 triangles, 20 hexagons, and 12 pentagons.  Letting
t, h, and p denote these individual areas as fractions of the total
sphere's area, each hemisphere has area 6p + 10h + 30t = 1/2.  The
inclusion-exclusion formula gives the overall probability that all
n points fall in one of these 20 hemispheres as

 Pr{n;10} =   20(6p+10h+30t)^n - 30(4p+8h+22t)^n - 60(4p+6h+18t)^n

            + 60(3p+6h+16t)^n + 60(3p+5h+14t)^n - 60(2p+5h+12t)^n
 
            - 60(2p+4h+12t)^n + 120(2p+4h+11t)^n - 60(2p+4h+10t)^n

            - 30(2p+2h+8t)^n + 60(2p+2h+7t)^n - 30(2p+2h+6t)^n

            + 12(p+5h+10t)^n


Another (simpler) variation on the circle problem is to determine 
the probability that n points uniformly distributed on a unit interval 
will all fall within some interval of length q.  In this case the 
probability is  nq^(n-1) - (n-1)q^n.  This problem also has many 
fascinating geometrical interpretations.

Doug Zare comments that if we let P(n,d) be the probability that the 
origin is contained in the convex hull of n points chosen uniformly 
and independently on the surface of the d-sphere, then

                 P(n-1,d-1) + P(n-1,d)
       P(n,d) = -----------------------
                           2

which implies that the values are partial sums of rows of Pascal's
triangle, normalized.  That's a nice result.  I've been focusing on 
the complementary problem of finding the probability Pr{n,d} that n 
randomly chosen points on S^d all fall within some "hemisphere".  I 
found the following explicit formulas, which agree exactly with the
above recurrence relation:


  2^(n-1) Pr{n,0} =  1



  2^(n-1) Pr{n,1} =      n


                             n(n-1)
  2^(n-1) Pr{n,2} =  1   +   ------
                               2!

                                    n(n-1)(n-2)
  2^(n-1) Pr{n,3} =      n     +    -----------
                                         3!

                             n(n-1)             n(n-1)(n-2)(n-3)
  2^(n-1) Pr{n,4} =  1   +   ------      +      ----------------
                               2!                      4!

and so on.  The terms in the numerator of Pr{n,d} are alternating 
binomial coefficients from the nth row of Pascal's triangle, and 
if d=(n-2)/2 (where n is even), the formula covers just half of the 
row. The sum of alternating coefficients over this half-row is (2^n)/4, 
so we have  Pr{2d+2,d} = 1/2.

Oddly enough, the above formula for S^2 disagrees with the formula
given in the rec.puzzles FAQ, which is 1 - [1 - (1/2)^(n-2)]^n.  The 
FAQ's formula would more reasonable if n was replaced with n-1, but 
it still seems to differ from the correct answer for n>4.  As a 
confidence builder I wrote a little program to randomly pick n points
on a sphere and check to see if they are all in one hemisphere.  The 
results are summarized below:
                                           rec.puzzles FAQ
           n    simulation     Pr{n,2}     (with n-1 for n)
         ----   ----------    --------     ---------------
           1      1.0000       1.0000         0.0000
           2      1.0000       1.0000         2.0000
           3      1.0000       1.0000         1.0000
           4      0.8742       0.8750         0.8750
           5      0.6862       0.6875         0.6835
           6      0.4954       0.5000         0.4871
           7      0.3365       0.3438         0.3211
           8      0.2295       0.2266         0.1993
           9      0.1508       0.1445         0.1183
          10      0.0889       0.0898         0.0681

These results (10000 samples per n) definitely support the formula 
Pr{n,2}, and disagree with the FAQ.

Anyway, it's interesting that the formulas for Pr{n,d} seem to be 
closely related to the formula for the probability of n points all 
falling within some k contiguous arcs of a circle (S^1) that has 
been divided into 2k equal arcs.  This was given in an earlier post
as 
                      _                                      _
               1     |     n(n-1) 1     n(n-1)(n-2)  1        |
 Pr{n;k} =  -------  | n - ------ -  +  ----------- --- - ... |
            2^(n-1)  |_      2!   k         3!      k^2      _|


It's almost as if, as k->1, this probability on the discretized 
circle approaches  Pr{n,2j+1} - Pr{n,2j} + 1/2^(n-1)  for sufficiently 
large j.  It's also interesting that the formula for Pr{n,d} is
fundamentally different - but complementary - for odd dimensions 
and for even dimensions, sort of like sine and cosine functions.

By the way, regarding the discrete cases on S^2, I filled in the
case of 6 great circles (normal to the axes of an icosahedron). To
summarize all the discrete results I've found, here are the
formulas for the probability that n random points will all fall 
on one side of one of a G great circles:

  P1{n} = 2(t)^n       
    
        where t = 0.50000000

  P3{n} = 6(4t)^n - 12(2t)^n + 8(t)^n     
        
        where t = 0.125000

  P4{n} = 8(4t+3s)^n - 12(2t+2s)^n - 12(2t+s)^n + 24(t+s)^n - 6(s)^n

         where t = 0.043869914,  s = 0.108173448

  P6{n} = 12(6p+10t)^n - 30(4p+6t)^n + 20(3p+4t)^n
                       - 30(2p+4t)^n + 60(2p+3t)^n - 30(2p+2t)^n

         where  t = 0.014312286762175,  p = 0.059479522063042

  P10{n} =   20(6p+10h+30t)^n - 30(4p+8h+22t)^n - 60(4p+6h+18t)^n
            + 60(3p+6h+16t)^n + 60(3p+5h+14t)^n - 60(2p+5h+12t)^n
            - 60(2p+4h+12t)^n + 120(2p+4h+11t)^n - 60(2p+4h+10t)^n
            - 30(2p+2h+8t)^n  +  60(2p+2h+7t)^n - 30(2p+2h+6t)^n
            + 12(p+5h+10t)^n

      where  p = 0.013028988120384
             h = 0.029670196644958
             t = 0.004170754471287

It's interesting to review the coefficients for each of these cases.
The solids listed below indicate that the great circles are normal to 
the axes of that solid.

   G
  ----  
   1                                                         2 = 2
   3   octahedron                                       6-12+8 = 2
   4   hexahedron/tetrahedron                     8-12-12+24-6 = 2
   6   icosahedron                           12-30+20-30+60-30 = 2
  10   dodecahedron    20-30-60+60+60-60-60+120-60-30+60-30+12 = 2


(The hexahedron, i.e., cube, and the tetrahedron are the same case,
because the four "axes" of the tetrahedron, taken to be the lines
from the center to each vertex, coincide with the four axes of the
cube, where each axis goes through two verticies.)

Notice that the sum of the coefficients is always 2.  In contrast, ]
the sum of the coefficients on S^1 is always 0.  This seems to be the 
Euler topological invariant of the surface.  Along these same lines,
notice from the above formula for Pr{n,2} that n great circles divide 
a 2-sphere into n^2 - n + 2 parts (assuming no three of the circles 
intersect at a point, which has probability zero for randomly selected 
axes).  This invariant provides one of the TWO keys to a very nice 
proof of the original problem.  The second key is to split up the 
random selection of a point into two orthogonal steps.  The first 
step is to randomly choose an axis for the point.  This narrows down 
the location to one of two places, the two points of intersection of 
the axis with the sphere.  The second step is to randomly select one 
of those two points.

So, to place n points on the surface of a 2-sphere, first randomly
select n axes, each of which corresponds to a great circle at it's 
"equator".  These great circles split the sphere into n^2 - n + 2 
regions.  We then need to make n binary choices to determine for
each point whether to place it at the "North" or "South" pole of 
it's axis, so there are 2^n possible choices overall, and each of
these is equally likely (becasue we randomly choose North or South
in each case).

For each point, let's highlight the hemisphere (Northern or Sourthern)
that contains the point we've chosen.  Clearly the n points all lie
in a single hemisphere if and only if there is some region on the
sphere where all n highlighted hemispheres overlap.

The question then becomes:  How many of our 2^n equi-probable choices
will result in some overlap of all n selected hemispheres?  Here we
notice that each of the n^2 - n + 2 regions into which the sphere 
has been divided represents a unique set of choices for which the 
n hemispheres all overlap (the overlap consisting of that region
itself).  Therefore, the probability that all n points lie in one
hemisphere is (n^2 - n + 2)/2^n, in agreement with the formula 
for Pr{n,2}.
Return to MathPages Main Menu
Сайт управляется системой uCoz