Multipole Logo Multipole Sprite
Unix
Mac OS X
Windows
Unix
Windows

Computational Simulation of Multi-Body Interactions with O(n) Scaling

Abstract

Molecular dynamics calculations have historically required significant amounts of computer time since they require integration of the equations of motion over significant periods of time for large numbers of particles. This research describes a new class of algorithms which not only speeds up these computations, but provides for much slower growth in the time required for doing even larger problems. Both the underlying theory and implementation details are provided.

Introduction

Molecular dynamics solves Hamilton's equation of motion for a set of generalized coordinates qi and momenta pi



q
 

i 
= ∂H

∂pi
          

p
 

i 
= − ∂H

∂qi

For a potential V, which is conservative, Cartesian coordinates xi and velocities vi are used, yielding a Hamiltonian


H({pi,xi}) =

i 
pi2/2mi + V({xi})

From Newton's equations, the forces Fi are given by


Fi = −

∂xi
V({xi})

which are a function of the interparticle distance only.

Put in pragmatic terms, the problem is to calculate these forces efficiently. The majority of the time required is in computing the non-bonded interactions (for N atoms there are approximately N2 interactions, so the complexity of the algorithm is denoted O(N2)). In fact, the computation of bonded forces is even more expensive O(N3), but since there are far fewer bonded interactions this cost turns out to be minor. For certain classes of problems, this has been reduced to O(N). Since this work was done specifically for the domain of structural mechanics where there exist a wide range of constraints for which there is no molecular analog, simplification to the chemical world remains a formidable task.

There have been previous efforts to speed up the direct method of calculating potentials [Hockney 81,Appel 85]. These methods use both heuristics and approximations. For example, many molecular simulations disregard any potential due to particles more than a certain distance from a particular atom. Arnold [Arnold 90] and Rycerz [Rycerz 91] discuss an O(N) algorithm for molecular dynamics simulation that approximates the potential by restricting the pairwise potentials to the nearest neighbors of each atom. Typically these methods have no known a priori estimates of the accuracy of the approximation nor any a posteriori metric of precision.

The most straightforward technique for decreasing the number of non-bonded interactions is to consider only interactions within a preset (or perhaps dynamically adjustable) distance. The computational overhead in doing this becomes significant for larger systems. Lists of "nearby" atoms can be maintained and updated regularly, saving some additional time, but none of these changes the fundamental scaling of the problem - it grows with the square of the number of particles.

When use of neighbor lists becomes impractical, there are two general classes of solutions:

The work described here is essentially a hybrid of these two techniques and provides a significant advantage over each individually while at the same time yielding a scaling of computational work that is a linear function of the number of atoms, not a quadratic one.

We have implemented a multipole algorithm that solves the interparticle gravitational potential (1/r) of N particles to within a prescribable round-off error and whose computational effort grows only linearly with increasing N, i.e. O(N). This makes possible simulations of a large number of atoms on the order of 500,000 or more. Typically, thousands to millions of time steps are required for a realistic dynamic simulation. The linear nature of the cost of computing the potential with the sequential multipole algorithm allows many more time-steps, in less time, than previously practical.

The multipole algorithm can be implemented on a parallel (SIMD - Single Instruction, Multiple Datapath) computer and the computational effort asymptotically approaches O(logN) as the number of separate processors approaches N. Katzenelson [Katzenelson 88] has estimated 24 seconds would be required per time step for a million equally distributed particles on the CM-2 Connection machine (65,536 processors, costing $6MM) using a parallel multipole method. He has shown that even with a smaller array of processors, e.g. 2x2x2, it is possible to get impressive performance improvements.

We plan to extend our initial implementation of the O(N) N-body algorithm to include a Lennard-Jones potential and to implement the parallel multipole potential algorithm for both the gravitational and the Lennard-Jones potential. The net effect of our algorithm is to decrease the number of actual function evaluations. Since direct computation of 1/r is much less time consuming then the computation of 1/r6 − 1/r12, the speed-ups that we will achieve with Lennard-Jones will be far in excess of what are reported for gravitational potentials.

In this work we review the method and its computational structure, and then discuss the efficient implementation of the algorithm on a sequential computer. We numerically verify the algorithm, and compare its speed with that of an O(N2) direct method. We conclude by presenting ideas on how the research concerning this algorithm should proceed.

We must stress that, while all algorithmic analysis uses the capital O to parameterize complexity, the constant of proportionality are not the same for all the algorithms. There is considerable computational overhead in setting up the data structures to permit the faster evaluation, so it is incorrect to assume an N-fold speed-up for all N with this new algorithm.

Multipole Expansion of the Potential

Greengard and Rokhlin [Greengard 86] discovered an algorithm with complexity O(N). Their algorithm computes the potentials (or the forces) by partitioning them into two parts:


φtotal = φnear−fieldfar−field

where φnear−field is the potential due to nearby particles and φfar−field is the potential due to faraway particles. φnear−field is computed by direct evaluation and φfar−field is represented as an absolutely convergent Taylor series. Given a required precision ε, the number of terms p in the Taylor series can be determined in advance to obtain a particular accuracy.

Zhao [Zhao 87] expands the various functions to a series of the derivatives of 1/R where R is the distance from the center of a sphere (whose function is being computed) to the point at which the potential is desired. Unlike Greengard [Greengard 88], who uses spherical coordinates, Zhao's algorithm avoids the evaluation of (direction) cosines by use of Cartesian coordinates (x1,x2,x3).

The algorithm relies on the representation of the potential fields of sources by multipole expansions, and the ability to shift these expansions to common points of reference so they can be summed into single expansions. These expansions represent the potential fields of a region of space. They are evaluated at the position of each source to find the resulting potentials. The ability to choose how many terms of the expansion are computed gives control over the accuracy of the algorithm.

In the next few sections we give a brief overview of Zhao's formulation of the O(N) many body algorithm. For a more detailed explanation of the algorithm see [Zhao 87], [Katzenelson 88], or [Simon 90].

Theorems and Lemmas

The theorems and lemmas derived by Zhao are summarized below. They are essential in understanding the multipole algorithm in the next section. For the detailed statement and proof of these theorems, see [Zhao 87] or [Greengard 88].

The gravitational potential at a point x ∈ R3 due to a point mass μ located at y ∈ R3


φy(x) = − μ

|| x − y ||

Figure 1 illustrates the quantities needed in the multipole theorems. The potential at point x ∈ R3 is due to mass μ at point y ∈ R3. y(0) is the reference point, and γ is the angle between vectors [(y(0)y)] and [(y(0)x)] . Also there are quantities r = ||x − y ||, r0 = ||x −y(0) ||, and R = ||x − y(0) ||.

figure1
Figure 1: Gravitational field.

The multipole theorems produce an expansion for 1/R in a region of analysis outside a sphere. The theorems are:

The ability to sum individual expansions to obtain the multipole expansion of a potential depends on these expansions having a common reference point and a common region of convergence. If they do not, we must shift the series to a common reference point using the lemmas that follow.

Predicted Error

From the Zhao's Multipole Expansion theorem, the error estimate for the multipole method is:


|ε| ≤ C
r0

R

p+1

 

where p is the number of terms in the Taylor series expansion. We choose to set the ratio |r0/R| in the error estimate to 1/2 . This restricts the applicability of the expansion to within twice the radius of the circumscribed sphere. Therefore, for a required accuracy of ε, p = log2 ε, we can then expect a factor of 2 decrease in error for each additional term in in the expansion.

As we retain more and more terms in the multipole expansions, the accuracy of the algorithm improves. When p = 20, the error of the computation is at the level of machine round-off precision (or beyond).

In a Cartesian coordinate system, we can define a computational data structure made up of many cubes that represent the three-dimensional space under consideration. We will use this data structure to implement the multipole-expansion method to compute the potential within the predicted error in O(N). The details of this computational data structure are discussed next.

Computational Data Structure

Three-dimensional space may be regarded as a cube which may be divided into eight subcubes, each of which may further be divided into eight subcubes, etc. This decomposition is represented using a tree data structure in which each cube is represented by a node. This data structure is known as an eight-way tree or "oct-tree" [Knuth 81].

Each level of the tree corresponds to one level of the three-dimensional spatial decomposition. Level 0, the root, is equivalent to the entire cube, while level l + 1, the children, is obtained from level l, the parent, by subdivision of each region into eight equal parts. The number of distinct cubes at level l is equal to 8l. Figure 2 illustrates the decomposition of the N-body cube.

figure2
Figure 2:

The N-body cube and the first two levels of refinement, indicated by the solid and dashed lines, respectively.

The cube is continually subdivided until there are only a few particles associated with each node. The nodes of a level that are not further decomposed are leaves. Given a uniform spatial distribution of particles, the level of decomposition grows logarithmically with the number of particles to enforce the constraint that the number of particles in each leaf cube is bounded by a predetermined constant. The level of decomposition, or height of the tree, has been determined to be log8 N. Choosing fewer levels will force more particles per leaf node. If we define the average number of particles for a leaf node to be n, then the condition n << N must be met to maintain the linear nature of the multipole algorithm. Conversely, if we choose more levels, then too much time is spent calculating the far-field potential.

For each cube we define its near field, far field, and interactive field. The near-field of a cube consists of neighboring cubes at the same level of decomposition such that the distance from the center of the cube to the center of the near-field neighbor is within 2√3 times the cube edge length. This value is the body diagonal of the cube and the diameter of the circumscribed sphere as required by the multipole theorems. The number of nodes in the near-field is bounded by 81. The far-field of a cube is the exterior of the near-field. Finally, the interactive-field of a cube is the part of its far-field that is in the near-field of the cube's parent. That is, all cubes in the parent's near-field that are not in the near-field of the current cube under consideration are members of the interactive field. The number of nodes in the interactive-field is bounded by 567. There are 8 interactive-field nodes for each of the 81 near-field nodes of the parent, minus the 81 near-field nodes of the current node (8*81−81 = 567).

The near, far, and interactive fields in two dimensions are illustrated in figure 3.

. . . . . . . .
. . . . . . . .
i i i i i i . .
i i n n n i . .
i i n s n i . .
i i n n n i . .
i i i i i i . .
i i i i i i . .

Figure 3: Near-field (n), far-field (.), and interactive-field (i) of square s.

Now we have defined the computational data structure of the O(N) N-body algorithm. The next section describes how Zhao's algorithm can utilize this structure to compute the potential of N particles in three-dimensions.

Algorithm

A formal description of the multipole-expansion algorithm follows. It comes directly from [Zhao 87]. The notation used to describe the multipole-expansion algorithm includes:

Following is a formal description of the algorithm.

Each local expansion is described by the coefficients of a p-term polynomial. Direct evaluation of this polynomial at a particle yields the potential. The force is available analytically and therefore there is no need for numerical differentiation.

Zhao's O(N) algorithm distinguishes between the computation of the potential due to the near field particles ("near-field potentials") and far field particles ("far-field potentials"). The near-field potential is computed using direct evaluation. The far-field potential is obtained by evaluating the p-term multipole expansion at the particle position.

The goal of the multipole computation is to compute φfar−field with O(N) work. This is achieved by propagating values, first from the leaves to the root through the computational tree, then from the root to the leaves in two distinct phases. In the first phase the far-field interactions φil are computed by shifting the multipole expansions of the child nodes to the center of the parent node and adding the results. φil is calculated for all cubes at all levels, beginning at the finest level of refinement (the leaf nodes). The multipole-expansion of the field due to each particle is calculated relative to the center of each leaf node and summed. Proceeding upward, the expansions of the eight child nodes are shifted relative to the parent's center and summed. This computation is performed at each level propagating values upward through the tree. At the end of this phase of the algorithm, each node at every level has a multipole-expansion representation of the field due to all particles within its boundaries. These expansions are only valid outside the region defined by the near field of each cube.

In the second phase, we form the local expansions ψil for all cubes at all levels beginning at the coarsest level (root node). The far-field multipole-expansions of all nodes in the interactive field of the node under consideration are converted into local expansions relative to the node's center and are summed. Next, shift the expansion of the parent cube to the center of the current cube under consideration and sum with the local expansions computed from the interactive field. Note, the root node does not have a parent or interactive field so it's local expansion is defined to be zero. The local expansions are propagated down the tree. They describe the field due to all particles in the system that are not contained in the current node or its nearest neighbors.

Finally, for each cube i at the finest level n, we evaluate the local expansion φin for each particle in cube i. Each local expansion is described by the coefficients of a p-term Taylor series. Direct evaluation of this series at a particle yields the potential. Summing the local expansion at each leaf node forms φfar−field. The interactions of each particle in cube i are computed directly and summed to form φnear−field. Finally summing φfar−field and φnear−field gives us the desired potential.

Multipole-expansion Computer Program

A computer program using this multipole-expansion algorithm has been implemented in the C computer language. The efficient implementation of the algorithm requires a language that supports data structures and dynamic memory. C provides both and is also portable across many computer architectures. The source code is in appendix B. The code has been compiled and executed on many computers including the Cray Y-MP, IBM RS/6000, Stardent Titan, Sun 4/280, Decstation 3100, MIPS 3420, and a VAX 8800.

There are significant optimization opportunities for the multipole-expansion algorithm. In the next section, we discuss the optimizations (both implemented and potential) of the algorithm. Next, we discuss the memory requirements of our method. Finally we present results from the execution of the algorithm in which we compare speed and error of the multipole-expansion method to that of the direct evaluation method.

Algorithm Optimizations

Reviewing the algorithm and monitoring of the algorithm execution revealed opportunities for both machine independent and machine dependent optimizations.

The machine independent optimizations we discovered follow. Some of the optimizations are discussed in [Zhao 87] and [Katzenelson 88]. This fact is noted in the relevant optimization discussion.

The following machine dependent optimizations were included in the code found in appendix B. Each was discovered by the author (Cristy).

Memory Requirements

In addition to reducing the execution time of the multipole-expansion algorithm, we are also concerned about its computer memory requirements. The multipole-expansion method requires maintaining the decomposition tree in memory and therefore requires more memory than the naive direct evaluation method. Computers that are memory constrained may inhibit good running times of the algorithm due to excessive paging. In some cases the algorithm may not run at all due to shortages of memory. The memory requirements of the multipole-expansion algorithm are discussed here.

The particles themselves require storage. Each particle has a position associated with it. Other information may also be associated with each particle such as velocity, acceleration, mass, etc.

The total number of cubes at all levels of the decomposition in the multipole-expansion tree is


80 + 81 + ... + 8n = 8n+1−1

7
= 8N − 1

7

Each node of the decomposition tree stores the coordinate of the node center, pointers to its children and its parent, and pointers to its near-field and interactive-field. Each node also stores the multipole and local expansions.

Each node has these memory requirements in computer words:

  1. 3 - x, y, and z coordinate of the node center.
  2. 1 - parent pointer.
  3. 8 - children pointers.
  4. 80 - near-field pointers.
  5. 140 - interactive-field pointers.
  6. φin(x) and ψin(x) -
    1

    6
    (p+1)(p+2)(p+3)
    coefficients.

For a large number of particles, the memory required can potentially reach many gigabytes. Reducing this memory requirement is highly desirable.

A tree of depth four, for example, has 4673 (1+64+511+4095) nodes. With p = 3 we need 20 terms in each expansion. The total memory words required just to store the expansions is about 1,275 kilobytes.

On some computer architectures (i.e. Connection Machine) it is possible to address the near and interactive fields directly. This improvement can save us over 220 words of memory per node. Addressing the near and interactive fields directly is discussed in section 6.

Execution Time of the Algorithm

In this section we compare the execution time of the O(N) multipole-expansion computer program to that of direct evaluation. The results were obtained from the program running on an IBM RS/6000 model 550. For each of our tests, we use the super-node heuristic (section 5.1) and require the average accuracy to be at least 10−3. Note, that in these tests we did not precompute constants in the lemmas to avoid repeated calculations. A significant speed-up is expected when this is implemented.

For the speed comparison tests, the positions of the particles were randomly distributed within a cube and the resulting density was roughly uniform. Table 1 compares the execution of the multipole-expansion algorithm to that of direct evaluation. The number of terms in the expansion, p, was chosen as 3 and the calculated potentials are accurate in average to about 10−4. The running time excludes time spent executing in system mode and is accurate to the nearest second. Figure 4 shows the results of the test on the running time of the algorithm as a plot. In the plot, both the running-time axis and number-of-particles axis are scaled logarithmically.

From table 1 and figure 4 one can see that the CPU time requirements of the multipole algorithm grow linearly with number of particles. The improvement in speed, as expected, increases with the number of particles. Note that the 256,000 particle case executes over 30 times faster than the direct method.

Note, that due to the overhead of creating the decomposition tree for the multipole method, the direct method is faster for systems of fewer than 512 particles.

Number of Execution Time (seconds)
Particles Multipole-expansion Direct Speed-up
511 1 1 1.00
1023 1 1 1.00
2047 3 4 1.33
4095 11 17 1.55
8191 18 67 3.72
32767 165 1084 6.57
262143 2307 69218 30.00

Table 1: Execution time of the multipole-expansion method with p = 3 and 80 near-field nodes compared to that of the direct evaluation method.

figure4
Figure 4: Execution time growth rate of the multipole-expansion method compared to that of the direct evaluation method.

In the remaining tests, the number of terms in the expansions are noted in the table captions as p. They show what effect changing p or the near-field size has on the speed and accuracy of the algorithm. Table 2 shows how the speed-up of the algorithm is reduced as p is increased from 3 to 6. Increased p means improved accuracy at the cost of longer running times.

Number of Execution Time (seconds)
Particles Multipole-expansion Direct Speed-up
511 2 1 0.50
1023 2 1 0.50
2047 4 4 1.00
4095 13 17 1.30
8191 53 67 1.26
32767 203 1084 5.34
262143 2790 69218 24.80

Table 2: Execution time of the multipole-expansion method with p = 6 and 80 near-field nodes compared to that of the direct evaluation method.

The highest cost per node is associated with converting the multipole to a local expansion for nodes in the interactive field. By reducing the number of near-field nodes, and thus the interactive-field nodes, the program can be sped up at the cost of introducing more error. A smaller near-field also reduces the number of direct calculations. The accuracy of the algorithm improves as we retain more and more terms in the multipole expansion. The error can therefore be compensated by increasing p. If great accuracy is not required, we can reduce the near field size and retain just a few terms in the expansions. The combination of these heuristics can significantly speed-up the algorithm. The effects of this trade-off are shown in tables 3 and 4.

Number of Execution Time (seconds)
Particles Multipole-expansion Direct Speed-up
511 1 1 1.00
1023 1 1 1.00
2047 3 4 1.33
4095 9 17 1.89
8191 22 67 3.05
32767 133 1084 8.15
262143 1785 69218 38.78

Table 3: Execution time of the multipole-expansion method with p = 4 and 56 near-field nodes compared to that of the direct evaluation method.


Number of Execution Time (seconds)
Particles Multipole-expansion Direct Speed-up
511 1 1 1.00
1023 1 1 1.00
2047 3 4 1.33
4095 6 17 2.83
8191 18 67 3.72
32767 78 1084 13.90
262143 975 69218 70.99

Table 4: Execution time of the multipole-expansion method with p = 5 and 26 near-field nodes compared to that of the direct evaluation method.

In addition to good running-times, we are interested in getting accurate results. The accuracy of the computer program is discussed in the next section.

Accuracy of the Computer Program

The accuracy of the multipole-expansion method can be measured by comparing the results of multipole-expansion method to the exact potentials (within machine precision) computed with the direct method for the same set of particles. The near-field of each node is defined so that the ratio in the error bounds, described in section 4.2, is 1/2. Table 5 confirms the accuracies of the results obtained by the multipole algorithm as in agreement with the error bounds. p is the highest degree of harmonics retained in an expansion. As p increases the accuracy of the multipole-expansion method improves.

p Potential Energy εpotential−energy
1 7.6658134 3.4 x 10−4
2 7.6662437 9.3 x 10−5
3 7.6661345 1.6 x 10−5
4 7.6661536 3.3 x 10−6
5 7.6661512 8.6 x 10−7
6 7.6661500 3.8 x 10−7
7 7.6661498 5.3 x 10−7
8 7.6661504 6.0 x 10−8
9 7.6661503 8.0 x 10−10

Table 5: Accuracy test of the multipole-expansion method for 1023 uniformly distributed particles.

The error due to truncating a multipole expansion is a mixed-sign series. Consequently, the error from truncating the multipole expansion does not necessarily decrease monotonically when retaining more terms. While the theoretical bounds on the maximum error decreases exponentially with an increase of precision, the actual error decays irregularly. However, the error will always be less than the predicted error as p increases.

Table 6 shows the effect of different values of p and near-field sizes. We expect, when p is increased from 3 to 6, the difference in maximum error between the multipole-expansion and the direct method to be reduced. This greater accuracy requires more computation. From the previous section we know that we can speed up the algorithm by reducing the near-field size. Table 6 shows that the error introduced by this heuristic is still within our bounds of required accuracy.

Number of
Particles Near-field Precision Potential Energy εpotential−energy
511 80 3 1.91 0
511 80 6 1.91 0
511 56 4 1.91 0
511 26 5 1.91 0
1023 80 3 7.67 1.6 x 10−5
1023 80 6 7.67 3.8 x 10−7
1023 56 4 7.67 1.0 x 10−5
1023 26 5 7.67 5.4 x 10−6
2047 80 3 30.64 7.0 x 10−5
2047 80 6 30.64 5.4 x 10−6
2047 56 4 30.64 3.2 x 10−5
2047 26 5 30.64 4.8 x 10−5
4095 80 3 122.79 1.7 x 10−4
4095 80 6 122.79 2.0 x 10−5
4095 56 4 122.79 4.3 x 10−5
4095 26 5 122.79 3.4 x 10−4
8191 80 3 491.47 5.8 x 10−5
8191 80 6 491.47 1.0 x 10−4
8191 56 4 491.47 1.1 x 10−5
8191 26 5 491.47 1.6 x 10−3
32767 80 3 7897.16 8.3 x 10−4
32767 80 6 7897.16 1.6 x 10−3
32767 56 4 7897.16 3.6 x 10−4
32767 26 5 7897.16 0.03
262143 80 3 504945.13 0.15
262143 80 6 504945.13 0.11
262143 56 4 504945.13 4.5 x 10−3
262143 26 5 504945.13 2.2 x 10−4

Table 6: Accuracy test of the multipole-expansion method for various near-field and p configurations.

Future Research

The algorithm as implemented demonstrates that the gravitational potential can be calculated in O(N). There are still many areas that need to be investigated concerning the N-body algorithm. These areas include

Acknowledgements

Steve Lustig spent many hours introducing me to numerical simulations and helping me understand the mathematical foundation of the multipole algorithm. Steve also derived the theorems for the multipole Lennard-Jones potential and helped me debug the C code. Discussions with Feng Zhao helped me overcome some initial misunderstanding about the algorithm.

Richard Merrifield derived the lexicographical ordering of a tetrahedral array, a useful optimization of the algorithm.

Steve Lustig, Richard Merrifield, and Karen Bloch reviewed this paper and made suggestions for its improvement.

References

[Appel 85]
Appel, A. W., An Efficient Program for Many-body Simulation, SIAM. J. Sci. Stat. Comput., 6(1), January 1985, pp. 85-103.
[Arnold 90]
A. Arnold, N. Mauser, An Efficient Method of Bookkeeping Next Neighbors in Molecular Dynamics Simulations, Computer Physics Communications, 1990, 59, pp. 267-275.
[Bertschinger 91]
E. Bertschinger, J. Gelb, Cosmological N-body Simulations, Computers in Physics, March/April 1991, pp. 164-179.
[Greengard 86]
Leslie Greengard and Vladimir Rokhlin, A Fast Algorithm for Particle Simulations, Technical Report 459, Yale Computer Science Department, Yale University, New Haven, CT, April 1986.
[Greengard 88]
Leslie Greengard, The Rapid Evaluation of Potential Fields in Particle Systems, The MIT Press, Cambridge, MA, 1988.
[Hockney 81]
R.W. Hockney, J.W. Eastwood, Computer Simulations Using Particles, McGraw-Hill, New York, NY, 1981.
[Katzenelson 88]
J. Katzenelson, Computational Structure of the N-body Problem, MIT Artificial Intelligence Laboratory Memo 1042, Cambridge, MA, April 1988.
[Kittel 56]
C. Kittel, Introduction to Solid State Physics, Appendix A, New York, NY, 1956.
[Knuth 81]
D. Knuth, The Art of Computer Programming, Volume 1-3, Addison-Wesley, 2nd edition, 1981.
[Lustig 90]
S. Lustig, Toward a Molecular Dynamics Algorithm with O(logN) Efficiency: Multipole Series Representations and Shifting Lemmas for the Lennard-Jones 12-6 Potential, du Pont Central Research and Development Memo, November 1990.
[Merrifield 91]
R. E. Merrifield, Lexicographical Ordering of a Tetrahedral Array, du Pont Central Research and Development Memo, February 1991.
[Rycerz 91]
Z. Rycerz, P. Jacobs, Vectorized Program of Order N for Molecular Dynamics Simulation of Condensed Matter, Computer Physics Communications, 1991, 62, pp. 125-144.
[Samet 90]
H. Samet, Applications of Spatial Data Structures, Addison-Wesley Publishing Company, Reading, MA., 1990.
[Simon 90]
T. Simon, Optimization of an O(N) Algorithm for N-body Simulations, MIT Artificial Intelligence Laboratory Technical Report, Cambridge, MA, June 1990.
[Zhao 87]
F. Zhao, An O(N) Algorithm for Three-dimensional N-body simulations, MIT Articial Intelligence Laboratory Technical Report TR-995, Cambridge, MA, October 1987.
[Zhao 89]
F. Zhao, S. Johnsson, The Parallel Multipole Method on the Connection Machine, MIT Articial Intelligence Laboratory Technical Report CS89-6, Cambridge, MA., October 1989.

Multipole-expansion Sample Execution

This appendix contains a sample execution output from the three-dimensional N-body algorithm.

Particles: 262143
Precision: 3
Coefficients: 20
Cube depth: 5
Cube lower left front vertex: 0 0 0
Cube upper right behind vertex: 256 256 256
CreateCube: 1s
Classification: 2s
DefineNearField: 38
DefineInteractiveField: 23s
ComputePhi: 10s
ComputePsi: 2228s
Near: 39394.834
Far: 465545.44
The multipole-expansion potential is 504945.28 (2307).
The naive potential is 504945.13 (69218).
The expansion error: 0.14561708

Multipole-expansion C Code

This appendix contains sequential code implementing the three-dimensional N-body algorithm in the C language. The numbers to the left of the code is the number of times the statement has been executed for 1023 uniformly distributed particles and 3 terms in the expansions. Lines which have not been executed are prefixed with #####.

           	/*
           	%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
           	%
           	%
           	%
           	%                 N   N         BBBB    OOO   DDDD    Y   Y
           	%                 NN  N         B   B  O   O  D   D    Y Y
           	%                 N N N  =====  BBBB   O   O  D   D     Y
           	%                 N  NN         B   B  O   O  D   D     Y
           	%                 N   N         BBBB    OOO   DDDD      Y
           	%
           	%
           	%          An O(N) Algorithm for Three-Dimensional N-Body Simulations
           	%
           	%
           	%
           	%                           Software Design
           	%                             John Cristy
           	%                            January  2000
           	%
           	%                          Algorithm Analysis
           	%                             Steve Lustig
           	%
           	%
           	%  Copyright 2000 E. I. du Pont de Nemours & Company
           	%
           	%  Permission to copy, modify, distribute, or sell this software or its
           	%  documentation for any purpose is hereby denied without specific, written
           	%  prior permission from E. I. du Pont de Nemours & Company.
           	%
           	%  E. I. du Pont de Nemours & Company disclaims all warranties with regard
           	%  to this software, including all implied warranties of merchantability
           	%  and fitness, in no event shall E. I. du Pont de Nemours & Company be
           	%  liable for any special, indirect or consequential damages or any
           	%  damages whatsoever resulting from loss of use, data or profits, whether
           	%  in an action of contract, negligence or other tortious action, arising
           	%  out of or in connection with the use or performance of this software.
           	%
           	%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
           	%
           	%  N-body implements a sequential multipole algorithm that solves the
           	%  interparticle gravitational potential 1/r of N particles to within
           	%  round-off error and whose computational effort grows only linearly with
           	%  increasing N, i.e. O(N).  This makes possible simulations of a large
           	%  number of atoms on the order of 500,000 or more.  Typically, thousands
           	%  to millions of time steps are required for a realistic dynamic
           	%  simulation.  The linear nature of the cost of computing the potential
           	%  with the sequential multipole algorithm allows many more time-steps, in
           	%  less time, than previously possible.
           	%
           	%  The multipole-expansion algorithm computes the potential by
           	%  partitioning them into two parts: close-by particles (`near-field
           	%  potentials') and the potential of far-away particles (`far-field
           	%  potentials').  The near-field potential is computed using direct
           	%  evaluation.  The far-field potential is represented as an absolutely
           	%  convergent Taylor series.  Given a required precision e, the number of
           	%  terms p in the Taylor series can be determined in advance to obtain a
           	%  particular accuracy.
           	%
           	%  The goal of the multipole computation is to compute the far-field with
           	%  O(N) work.  This is achieved by propagating values, first upwards
           	%  through the computational tree, then downward in two distinct phases.
           	%  In the first (upward) phase the far-field interactions are computed by
           	%  shifting the multipole expansions of the child nodes to the center of
           	%  the parent node and adding the results.  The far-field interaction is
           	%  calculated for all cubes at all levels, beginning at the finest level
           	%  of refinement (the leaf nodes).  The multipole-expansion of the field
           	%  due to each particle is calculated relative to the center of each leaf
           	%  node and summed.  Proceeding upward, the expansions of the eight child
           	%  nodes are shifted relative to the parent's center and summed.  This
           	%  computation is performed at each level propagating values upward
           	%  through the tree.  At the end of this phase of the algorithm, each node
           	%  at every level has a multipole-expansion representation of the field
           	%  due to all particles within its boundaries.  These expansions are only
           	%  valid outside the region defined by the near field of each cube.
           	%
           	%  In the second (downward) phase, we form the local expansions for all
           	%  cubes at all levels beginning at the coarsest level (root node).  The
           	%  far-field multipole-expansions of all nodes in the interactive field of
           	%  the node under consideration are converted into local expansions
           	%  relative to the node's center and are summed.  Next, shift the
           	%  expansion of the parent cube to the center of the current cube under
           	%  consideration and sum with the local expansions computed from the
           	%  interactive field.  Note, the root node does not have a parent or
           	%  interactive field so it's local expansion is defined to be zero.  The
           	%  local expansions are propagated down the tree.  They describe the field
           	%  due to all particles in the system that are not contained in the
           	%  current node or its nearest neighbors.
           	%
           	%  Finally, for each cube i at the finest level n, we evaluate the local
           	%  expansion for each particle in cube i.  Each local expansion is
           	%  described by the coefficients of a p-term Taylor series.  Direct
           	%  evaluation of this series at a particle yields the potential.
           	%  Summating the local expansion at each leaf node generates the potential
           	%  or force due to all particles outside the nearest neighbor cubes. The
           	%  interactions of each particle in cube i are computed directly and
           	%  summed to generate the potential due to the nearest neightbors.
           	%  Finally summating far-field and near-field potential gives us the
           	%  desired potential.
           	%
           	%  A discussion and formal description of the multipole-expansion algorithm
           	%  can be found in these references:
           	%
           	%    "Computational Simulation of Multi-Body Interactions woth O(N) Scaling",
           	%    John Cristy and David A. Pensak, Central Research and Development,
           	%    E.I. Du Pont de Nemours, Technical Report, June 2000.
           	%
           	%    "An O(N) Algorithm for Three-dimensional N-body Simulations", Feng Zhao,
           	%    MIT Artificial Intelligence Lab, Technical Report 995, 1987.
           	%
           	%  The formula for the lexicographical ordering of a tetrahedral array was
           	%  derived by Richard Merrifield.
           	%
           	%
           	*/

           	/*
           	  Include declarations
           	*/
           	#include <stdio.h>
           	#include <ctype.h>
           	#include <math.h>
           	#ifdef __STDC__
           	#include <stdlib.h>
           	#else
           	#ifndef vms
           	#include <malloc.h>
           	#include <memory.h>
           	
           	extern long
           	  strtol(),
           	  time();
           	#endif
           	#endif
           	/*
           	  Define declarations for the N-body program.
           	*/
           	#define AbsoluteValue(x)  ((x) < 0 ? -(x) : (x))
           	#ifndef False
           	#define False  0
           	#endif
           	#define CircumscribedSphereRadius  3.0  /* choose from: 3.0, 2.5, 2.0, 1.5 */
           	#define GravitationalConstant  1.0e-3
           	#define HighestDegreeHarmonic  20
           	#define Max(x,y)  (((x) > (y)) ? (x) : (y))
           	#define MaxChildren  8
           	#define MaxNodesInteractiveField  140  /* choose from: 140, 119, 77, 56 */
           	#define MaxNodesNearField  80  /* choose from: 80, 56, 32, 26 */
           	#define MaxTreeDepth  8  /* Log2(MaxRange) */
           	#define Min(x,y)  (((x) < (y)) ? (x) : (y))
           	#define NodesInAQueue  4681  /* 1+8+64+512+4096 */
           	#define TetrahedralArray(array,a,b,g)  \
           	  (*(array+(cube.tetra_three[a]-cube.tetra_two[(a)+(b)]+(g))))
           	#ifndef True
           	#define True  1
           	#endif
           	#define Warning(message,qualifier)  \
           	{  \
           	  (void) fprintf(stderr,"%s: %s",application_name,message);  \
           	  if (qualifier != (char *) NULL)  \
           	    (void) fprintf(stderr," (%s)",qualifier);  \
           	  (void) fprintf(stderr,".\n");  \
           	}

           	/*
           	  Structures.
           	*/
           	typedef struct _Particle
           	{
           	  float
           	    x,
           	    y,
           	    z;
           	
           	  struct _Particle
           	    *next;
           	} Particle;
           	
           	typedef struct _Node
           	{
           	  struct _Node
           	    *parent,
           	    *child[MaxChildren],
           	    *near_field[MaxNodesNearField+1],
           	    *interactive_field[MaxNodesInteractiveField+8];
           	
           	  unsigned int
           	    id,
           	    level;
           	
           	  float
           	    mid_x,
           	    mid_y,
           	    mid_z,
           	    *phi,
           	    *psi;
           	
           	  Particle
           	    *particle;
           	} Node;
           	
           	typedef struct _Nodes
           	{
           	  Node
           	    nodes[NodesInAQueue];
           	
           	  float
           	    *phi,
           	    *psi;
           	
           	  struct _Nodes
           	    *next;
           	} Nodes;
           	
           	typedef struct _Cube
           	{
           	  Node
           	    *root;
           	
           	  unsigned int
           	    depth;
           	
           	  int
           	    level;
           	
           	  Node
           	    **near_neighbor;
           	
           	  unsigned int
           	    nodes,
           	    free_nodes;
           	
           	  Node
           	    *next_node;
           	
           	  Nodes
           	    *node_queue;
           	
           	  double
           	    edge_length[MaxTreeDepth],
           	    diagonal_length[MaxTreeDepth],
           	    binomial[HighestDegreeHarmonic+1][HighestDegreeHarmonic+1];
           	
           	  unsigned int
           	    tetra_two[HighestDegreeHarmonic+3],
           	    tetra_three[HighestDegreeHarmonic+3],
           	    precision,
           	    coefficients;
           	
           	  float
           	    *phi,
           	    *phi_tilde,
           	    *psi,
           	    *psi_tilde;
           	
           	  double
           	    *ijk_binomial,
           	    *ijk_factorial,
           	    near_potential,
           	    far_potential;
           	} Cube;
           	/*
           	  Variable declarations.
           	*/
           	char
           	  *application_name;

           	/*
           	  Forward declarations.
           	*/
           	static double
           	  Power();
           	
           	static int
           	  InNearField();
           	
           	static Node
           	  *InitializeNode();
           	
           	static void
           	  Error(),
           	  EvaluatePotential(),
           	  FindNeighbors(),
           	  LocalExpansion(),
           	  MultipoleExpansion();

           	/*
           	  Global variables.
           	*/
           	static Cube
           	  cube;

           	/*
           	  Machine dependent elapsed process time routine.
           	*/
           	long int ProcessTime()
     16 -> 	{
           	#ifdef SYSV
           	#include <sys/types.h>
           	#include <sys/times.h>
           	#include <sys/param.h>
           	
           	  struct tms
           	    usage;
           	
           	  times(&usage);
           	  return(usage.tms_utime/HZ);
           	#else
           	#include <sys/time.h>
           	#include <sys/resource.h>
           	
           	  struct rusage
           	    usage;
           	
           	  (void) getrusage(RUSAGE_SELF,&usage);
           	  return(usage.ru_utime.tv_sec);
           	#endif
           	}

           	/*
           	%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
           	%
           	%
           	%
           	%  B i n o m i a l
           	%
           	%
           	%
           	%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
           	%
           	%  Function Binomial returns the binomial coefficient (n,k).
           	%
           	%  The format of the Binomial routine is:
           	%
           	%    Binomial(n,k)
           	%
           	%  A description of each parameter follows:
           	%
           	%    o n,k: Specifies an unsigned int.
           	%
           	%
           	*/
           	static unsigned int Binomial(n,k)
           	int
           	  n,
           	  k;
     23 -> 	{
           	  register int
           	    i,
           	    j;
           	
           	  unsigned int
           	    binomial;
           	
           	  if (n < k)
      6 -> 	    return(0);
     17 -> 	  binomial=1;
           	  k=Min(k,n-k);
     17 -> 	  j=n;
           	  for (i=1; i <= k; i++)
           	  {
     14 -> 	    binomial=(binomial*(double) j/i)+0.5;
           	    j--;
           	  }
     17 -> 	  return(binomial);
           	}

           	/*
           	%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
           	%
           	%
           	%
           	%  C l a s s i f i c a t i o n
           	%
           	%
           	%
           	%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
           	%
           	%  Function Classification creates a linked list of particles in the leaf
           	%  nodes of the particle cube.
           	%
           	%  The format of the Classification routine is:
           	%
           	%    Classification(node)
           	%
           	%  A description of each parameter follows:
           	%
           	%    o node: Specifies a pointer to a Node structure.
           	%
           	%
           	*/
           	static void Classification(particles,number_particles)
           	Particle
           	  *particles;
           	
           	unsigned int
           	  number_particles;
      1 -> 	{
           	  register int
           	    i;
           	
           	  register Node
           	    *node;
           	
           	  register Particle
           	    *p;
           	
           	  register unsigned int
           	    id,
           	    level;
           	
           	  p=particles;
           	  for (i=0; i < number_particles; i++)
           	  {
           	    /*
           	      Descend the tree to a particular leaf node.
           	    */
   1023 -> 	    node=cube.root;
           	    for (level=1; level < cube.depth; level++)
           	    {
           	      id=(p->x > node->mid_x) | (p->y > node->mid_y) << 1 |
   2046 -> 	        (p->z > node->mid_z) << 2;
           	      node=node->child[id];
           	    }
   1023 -> 	    p->next=node->particle;
           	    node->particle=p;
           	    p++;
           	  }
      1 -> 	}

           	/*
           	%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
           	%
           	%
           	%
           	%  C o m p u t e P h i
           	%
           	%
           	%
           	%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
           	%
           	%  Function ComputePhi forms the multipole expansions of the potential field
           	%  due to particle in each node.
           	%
           	%  The format of the ComputePhi routine is:
           	%
           	%    ComputePhi(node)
           	%
           	%  A description of each parameter follows:
           	%
           	%    o node: Specifies a pointer to a Node structure.
           	%
           	%
           	*/
           	static void ComputePhi(node)
           	register Node
           	  *node;
     73 -> 	{
           	  double
           	    x_distance,
           	    y_distance,
           	    z_distance;
           	
           	  int
           	    i,
           	    j,
           	    k;
           	
           	  register double
           	    *ijk,
           	    sum;
           	
           	  register float
           	    *phi;
           	
           	  register int
           	    a,
           	    b,
           	    g;
           	
           	  unsigned int
           	    id;
           	
           	  /*
           	    Recursively descend the particle cube.
           	  */
           	  if (node->level < (cube.depth-1))
      9 -> 	    for (id=0; id < MaxChildren; id++)
     72 -> 	      ComputePhi(node->child[id]);
     73 -> 	  if (node->level == (cube.depth-1))
     64 -> 	    if (node->particle != (Particle *) NULL)
           	      {
           	        register Particle
           	          *p;
           	
           	        /*
           	          Form multipole expansions of potential field due to particles
           	          in each cube about the cube center at the leaf nodes.
           	        */
     64 -> 	        phi=node->phi;
           	        ijk=cube.ijk_factorial;
    256 -> 	        for (i=0; i <= cube.precision; i++)
    256 -> 	          for (j=0; j <= (cube.precision-i); j++)
    640 -> 	            for (k=0; k <= (cube.precision-i-j); k++)
           	            {
   1280 -> 	              sum=0.0;
           	              for (p=node->particle; p != (Particle *) NULL; p=p->next)
           	              {
  20460 -> 	                x_distance=p->x-node->mid_x;
           	                y_distance=p->y-node->mid_y;
           	                z_distance=p->z-node->mid_z;
           	                sum+=Power(x_distance,i)*Power(y_distance,j)*
           	                  Power(z_distance,k);
           	              }
   1280 -> 	              sum*=Power((double) -1.0,i+j+k)*(*ijk++);
           	              *phi++=sum;
           	            }
           	      }
     73 -> 	  if (node->level > 0)
           	    {
           	      /*
           	        Form multipole expansions about the centers of each cube at all
           	        internal nodes, each expansion representing the potential field
           	        due to all particles contained in the cube.  Shift the center
           	        of the cube's expansion to the parent cube center using the
           	        `Translation of a Multipole Expansion' lemma.
           	      */
     72 -> 	      MultipoleExpansion(node,node->parent,cube.phi_tilde);
           	      phi=node->parent->phi;
    288 -> 	      for (i=0; i <= cube.precision; i++)
    288 -> 	        for (j=0; j <= (cube.precision-i); j++)
    720 -> 	          for (k=0; k <= (cube.precision-i-j); k++)
           	          {
   1440 -> 	            sum=0.0;
   2520 -> 	            for (a=0; a <= i; a++)
   2520 -> 	              for (b=0; b <= j; b++)
   4032 -> 	                for (g=0; g <= k; g++)
           	                  sum+=TetrahedralArray(node->phi,a,b,g)*
   6048 -> 	                    TetrahedralArray(cube.phi_tilde,i-a,j-b,k-g);
   1440 -> 	            *phi+=sum;
           	            phi++;
           	          }
           	    }
     73 -> 	}

           	/*
           	%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
           	%
           	%
           	%
           	%  C o m p u t e P s i
           	%
           	%
           	%
           	%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
           	%
           	%  Function ComputePsi forms a local expansion which describes the potential
           	%  field due to all particles in the system that are not contained in the
           	%  current cube or its near-field.
           	%
           	%  The format of the ComputePsi routine is:
           	%
           	%    ComputePsi(node)
           	%
           	%  A description of each parameter follows:
           	%
           	%    o node: Specifies a pointer to a Node structure.
           	%
           	%
           	*/
           	static void ComputePsi(node)
           	register Node
           	  *node;
     73 -> 	{
           	  double
           	    x_distance,
           	    y_distance,
           	    z_distance;
           	
           	  int
           	    i,
           	    j,
           	    k,
           	    n;
           	
           	  register int
           	    a,
           	    b,
           	    g;
           	
           	  register double
           	    *ijk,
           	    sum;
           	
           	  register float
           	    *phi,
           	    *psi;
           	
           	  register Node
           	    **interactive_neighbor;
           	
           	  unsigned int
           	    id;
           	
           	  if (node->level == (cube.depth-1))
           	    {
           	      /*
           	        Form a local expansion by using the `Conversion of Multipole
           	        into a Local Expansion' lemma to convert the multipole expansion
           	        of each cube in the interaction-field of the current leaf cube
           	        to a local expansion about the center of the leaf.
           	      */
     64 -> 	      interactive_neighbor=node->interactive_field;
           	      while (*interactive_neighbor != (Node *) NULL)
           	      {
           	        /*
           	          Shift multipole expansion of the interactive cube to the
           	          center of the current leaf cube.
           	        */
   1336 -> 	        LocalExpansion(*interactive_neighbor,node,cube.psi_tilde);
           	        psi=node->psi;
           	        phi=(*interactive_neighbor)->phi;
           	        ijk=cube.ijk_factorial;
   5344 -> 	        for (i=0; i <= cube.precision; i++)
  13360 -> 	          for (j=0; j <= (cube.precision-i); j++)
  13360 -> 	            for (k=0; k <= (cube.precision-i-j); k++)
           	            {
  26720 -> 	              sum=0.0;
           	              n=i+j+k;
  46760 -> 	              for (a=0; a <= (cube.precision-n); a++)
  46760 -> 	                for (b=0; b <= (cube.precision-n-a); b++)
  74816 -> 	                  for (g=0; g <= (cube.precision-n-a-b); g++)
           	                    sum+=TetrahedralArray(phi,a,b,g)*
 112224 -> 	                      TetrahedralArray(cube.psi_tilde,i+a,j+b,k+g);
  26720 -> 	              *psi+=sum*(*ijk++);
           	              psi++;
           	            }
   1336 -> 	        interactive_neighbor++;
           	      }
           	      /*
           	        Shift local expansion of the parent cube to the center of the
           	        current leaf cube using the `Translation of a Local Expansion'
           	        lemma.
           	      */
     64 -> 	      x_distance=node->mid_x-node->parent->mid_x;
           	      y_distance=node->mid_y-node->parent->mid_y;
           	      z_distance=node->mid_z-node->parent->mid_z;
           	      psi=node->psi;
           	      ijk=cube.ijk_factorial;
    256 -> 	      for (i=0; i <= cube.precision; i++)
    256 -> 	        for (j=0; j <= (cube.precision-i); j++)
    640 -> 	          for (k=0; k <= (cube.precision-i-j); k++)
           	          {
   1280 -> 	            sum=0.0;
           	            n=i+j+k;
   2240 -> 	            for (a=0; a <= (cube.precision-n); a++)
   2240 -> 	              for (b=0; b <= (cube.precision-n-a); b++)
   3584 -> 	                for (g=0; g <= (cube.precision-n-a-b); g++)
           	                  sum+=TetrahedralArray(node->parent->psi,i+a,j+b,k+g)*
           	                    TetrahedralArray(cube.ijk_factorial,a,b,g)*
   5376 -> 	                    Power(x_distance,a)*Power(y_distance,b)*Power(z_distance,g);
   1280 -> 	            *psi+=sum*(*ijk++);
           	            psi++;
           	          }
     64 -> 	      EvaluatePotential(node);
           	    }
           	  else
      9 -> 	    if (node->level > 0)
           	      {
           	        /*
           	          Form a local expansion by using the `Conversion of Multipole
           	          into a Local Expansion' lemma to convert the multipole
           	          expansion of each cube in the interaction-field of the
           	          current cube to a local expansion about the center of the
           	          cube.
           	        */
      8 -> 	        interactive_neighbor=node->interactive_field;
           	        while (*interactive_neighbor != (Node *) NULL)
           	        {
           	          /*
           	            Shift multipole expansion of the interactive cube to the
           	            center of the current cube.
           	          */
  ##### -> 	          LocalExpansion(*interactive_neighbor,node,cube.psi_tilde);
           	          psi=node->psi;
           	          phi=(*interactive_neighbor)->phi;
  ##### -> 	          for (i=0; i <= cube.precision; i++)
  ##### -> 	            for (j=0; j <= (cube.precision-i); j++)
  ##### -> 	              for (k=0; k <= (cube.precision-i-j); k++)
           	              {
  ##### -> 	                sum=0.0;
           	                n=i+j+k;
  ##### -> 	                for (a=0; a <= (cube.precision-n); a++)
  ##### -> 	                  for (b=0; b <= (cube.precision-n-a); b++)
  ##### -> 	                    for (g=0; g <= (cube.precision-n-a-b); g++)
           	                      sum+=TetrahedralArray(phi,a,b,g)*
  ##### -> 	                        TetrahedralArray(cube.psi_tilde,i+a,j+b,k+g);
  ##### -> 	                *psi+=sum;
           	                psi++;
           	              }
  ##### -> 	          interactive_neighbor++;
           	        }
           	        /*
           	          Shift local expansion of the parent cube to the center of the
           	          current node using the `Translation of a Local Expansion'.
           	        */
      8 -> 	        x_distance=node->mid_x-node->parent->mid_x;
           	        y_distance=node->mid_y-node->parent->mid_y;
           	        z_distance=node->mid_z-node->parent->mid_z;
           	        psi=node->psi;
     32 -> 	        for (i=0; i <= cube.precision; i++)
     32 -> 	          for (j=0; j <= (cube.precision-i); j++)
     80 -> 	            for (k=0; k <= (cube.precision-i-j); k++)
           	            {
    160 -> 	              sum=0.0;
           	              n=i+j+k;
    280 -> 	              for (a=0; a <= (cube.precision-n); a++)
    280 -> 	                for (b=0; b <= (cube.precision-n-a); b++)
    448 -> 	                  for (g=0; g <= (cube.precision-n-a-b); g++)
           	                    sum+=TetrahedralArray(node->parent->psi,i+a,j+b,k+g)*
           	                      TetrahedralArray(cube.ijk_factorial,a,b,g)*
           	                      Power(x_distance,a)*Power(y_distance,b)*
    672 -> 	                      Power(z_distance,g);
    160 -> 	              *psi+=sum;
           	              psi++;
           	            }
           	      }
           	  /*
           	    Recursively descend the particle cube.
           	  */
     73 -> 	  if (node->level < (cube.depth-1))
      9 -> 	    for (id=0; id < MaxChildren; id++)
     72 -> 	      ComputePsi(node->child[id]);
     73 -> 	}

           	/*
           	%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
           	%
           	%
           	%
           	%  C r e a t e C u b e
           	%
           	%
           	%
           	%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
           	%
           	%  Function CreateCube creates all levels of the particle cube.
           	%
           	%  The format of the CreateCube routine is:
           	%
           	%    CreateCube(node)
           	%
           	%  A description of each parameter follows:
           	%
           	%    o node: Specifies a pointer to a Node structure.
           	%
           	%
           	*/
           	static void CreateCube(node)
           	register Node
           	  *node;
      9 -> 	{
           	  register double
           	    bisect;
           	
           	  register unsigned int
           	    id,
           	    level;
           	
           	  /*
           	    Create a child cube and recursively descend to the next level.
           	  */
           	  level=node->level+1;
           	  bisect=cube.edge_length[level]*0.5;
     72 -> 	  for (id=0; id < MaxChildren; id++)
           	  {
           	    node->child[id]=InitializeNode(id,level,node,
           	      node->mid_x+(id & 1 ? bisect : -bisect),
           	      node->mid_y+(id & 2 ? bisect : -bisect),
     72 -> 	      node->mid_z+(id & 4 ? bisect : -bisect));
     72 -> 	    if (node->child[id] == (Node *) NULL)
  ##### -> 	      Error("unable to allocate memory",(char *) NULL);
     72 -> 	    if (level < (cube.depth-1))
      8 -> 	      CreateCube(node->child[id]);
           	  }
      9 -> 	}

           	/*
           	%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
           	%
           	%
           	%
           	%  D e f i n e I n t e r a c t i v e F i e l d
           	%
           	%
           	%
           	%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
           	%
           	%  Function DefineInteractiveField assigns the near-field of a particular cube.
           	%  The interaction-field of a node is the set of nodes which are children of
           	%  the nodes in the near-field of it parent and which are not in the near-field
           	%  of itself.  If all eight children of a parent node is in the interaction-
           	%  field, they are replaced by their common parent node (super_node).
           	%
           	%  The format of the DefineInteractiveField routine is:
           	%
           	%    DefineInteractiveField(node)
           	%
           	%  A description of each parameter follows:
           	%
           	%    o node: Specifies a pointer to a Node structure.
           	%
           	%
           	*/
           	static void DefineInteractiveField(node)
           	register Node
           	  *node;
     73 -> 	{
           	  register Node
           	    **interactive_neighbor,
           	    **near_neighbor;
           	
           	  register unsigned int
           	    id,
           	    super_node;
           	
           	  /*
           	    Recursively descend the particle cube.
           	  */
           	  if (node->level < (cube.depth-1))
      9 -> 	    for (id=0; id < MaxChildren; id++)
     72 -> 	      DefineInteractiveField(node->child[id]);
     73 -> 	  if (node->level > 0)
           	    {
           	      /*
           	        Nodes in the parents near-field are candidates for the interactive
           	        field.
           	      */
     72 -> 	      interactive_neighbor=node->interactive_field;
           	      near_neighbor=node->parent->near_field;
           	      while (*near_neighbor != (Node *) NULL)
           	      {
    448 -> 	        super_node=True;
   3584 -> 	        for (id=0; id < MaxChildren; id++)
   3584 -> 	          if (InNearField((*near_neighbor)->child[id],node))
   1688 -> 	            super_node=False;
           	          else
           	            {
   1896 -> 	              *interactive_neighbor=(*near_neighbor)->child[id];
           	              interactive_neighbor++;
           	            }
    448 -> 	        if (super_node)
           	          {
           	            /*
           	              Supernode:  All 8 children represented by this node.
           	            */
     80 -> 	            interactive_neighbor-=MaxChildren;
           	            *interactive_neighbor=(*near_neighbor);
           	            interactive_neighbor++;
           	          }
    448 -> 	        near_neighbor++;
           	      }
     72 -> 	      *interactive_neighbor=(Node *) NULL;
           	    }
     73 -> 	}

           	/*
           	%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
           	%
           	%
           	%
           	%  D e f i n e N e a r F i e l d
           	%
           	%
           	%
           	%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
           	%
           	%  Function DefineNearField assigns the near-field of a particular cube.  The
           	%  near-field of a node is the set of nodes at the same level of refinement
           	%  which shares a boundary point with it.
           	%
           	%  The format of the DefineNearField routine is:
           	%
           	%    DefineNearField(node)
           	%
           	%  A description of each parameter follows:
           	%
           	%    o node: Specifies a pointer to a Node structure.
           	%
           	%
           	*/
           	static void DefineNearField(node)
           	register Node
           	  *node;
     73 -> 	{
           	  register unsigned int
           	    id;
           	
           	  /*
           	    Recursively descend the particle cube.
           	  */
           	  if (node->level < (cube.depth-1))
      9 -> 	    for (id=0; id < MaxChildren; id++)
     72 -> 	      DefineNearField(node->child[id]);
     73 -> 	  if (node->level > 0)
           	    {
           	      /*
           	        Find the near neighbors of this particular node.
           	      */
     72 -> 	      cube.near_neighbor=node->near_field;
           	      FindNeighbors(cube.root,node);
           	      *cube.near_neighbor=(Node *) NULL;
           	    }
     73 -> 	}

           	/*
           	%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
           	%
           	%
           	%
           	%   E r r o r
           	%
           	%
           	%
           	%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
           	%
           	%  Function Error displays an error message and then terminates the program.
           	%
           	%  The format of the Error routine is:
           	%
           	%      Error(message,qualifier)
           	%
           	%  A description of each parameter follows:
           	%
           	%    o message: Specifies the message to display before terminating the
           	%      program.
           	%
           	%    o qualifier: Specifies any qualifier to the message.
           	%
           	%
           	*/
           	static void Error(message,qualifier)
           	char
           	  *message,
           	  *qualifier;
  ##### -> 	{
           	  (void) fprintf(stderr,"%s: %s",application_name,message);
           	  if (qualifier != (char *) NULL)
  ##### -> 	    (void) fprintf(stderr," (%s)",qualifier);
  ##### -> 	  (void) fprintf(stderr,".\n");
           	  exit(1);
           	}

           	/*
           	%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
           	%
           	%
           	%
           	%  E v a l u a t e P o t e n t i a l
           	%
           	%
           	%
           	%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
           	%
           	%  Function EvaluatePotential evaluates the local expansions at the particle
           	%  position to determine the far field potential.  The direct near-field
           	%  potential is summed to give the total potential.
           	%
           	%  The format of the EvaluatePotential routine is:
           	%
           	%    EvaluatePotential(node)
           	%
           	%  A description of each parameter follows:
           	%
           	%    o node: Specifies a pointer to a Node structure.
           	%
           	%
           	*/
           	static void EvaluatePotential(node)
           	Node
           	  *node;
     64 -> 	{
           	  double
           	    distance,
           	    far_potential,
           	    near_potential,
           	    x_distance,
           	    y_distance,
           	    z_distance;
           	
           	  register float
           	    *psi;
           	
           	  register int
           	    i,
           	    j,
           	    k;
           	
           	  register Node
           	    **near_neighbor;
           	
           	  register Particle
           	    *p,
           	    *q;
           	
           	  far_potential=0.0;
           	  near_potential=0.0;
   1023 -> 	  for (p=node->particle; p != (Particle *) NULL; p=p->next)
           	  {
           	    /*
           	      Evaluate local expansions at particle positions.
           	    */
   1023 -> 	    x_distance=p->x-node->mid_x;
           	    y_distance=p->y-node->mid_y;
           	    z_distance=p->z-node->mid_z;
           	    psi=node->psi;
   4092 -> 	    for (i=0; i <= cube.precision; i++)
  10230 -> 	      for (j=0; j <= (cube.precision-i); j++)
  10230 -> 	        for (k=0; k <= (cube.precision-i-j); k++)
           	        {
           	          far_potential+=(*psi)*Power(x_distance,i)*Power(y_distance,j)*
  20460 -> 	            Power(z_distance,k);
           	          psi++;
           	        }
           	    /*
           	      Compute potential due to particles in current cube directly.  Use
           	      Newton's third law to reduce the number of pairwise interactions.
           	    */
   1023 -> 	    for (q=p->next; q != (Particle *) NULL; q=q->next)
           	    {
   8222 -> 	      x_distance=p->x-q->x;
           	      y_distance=p->y-q->y;
           	      z_distance=p->z-q->z;
           	      distance=sqrt(x_distance*x_distance+y_distance*y_distance+
           	        z_distance*z_distance);
           	      near_potential+=1.0/distance;
           	    }
           	    /*
           	      Compute potential due to particles in near-field directly.
           	    */
   1023 -> 	    near_neighbor=node->near_field;
           	    while (*near_neighbor != (Node *) NULL)
           	    {
  34024 -> 	      for (q=(*near_neighbor)->particle; q != (Particle *) NULL; q=q->next)
           	      {
 540504 -> 	        if (p > q)
 270252 -> 	          continue;  /* canonical form */
 270252 -> 	        x_distance=p->x-q->x;
           	        y_distance=p->y-q->y;
           	        z_distance=p->z-q->z;
           	        distance=sqrt(x_distance*x_distance+y_distance*y_distance+
           	          z_distance*z_distance);
           	        near_potential+=1.0/distance;
           	      }
  34024 -> 	      near_neighbor++;
           	    }
           	  }
     64 -> 	  cube.near_potential+=2.0*GravitationalConstant*near_potential;
           	  cube.far_potential+=GravitationalConstant*far_potential;
           	}

           	/*
           	%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
           	%
           	%
           	%
           	%  F i n d N e i g h b o r s
           	%
           	%
           	%
           	%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
           	%
           	%  Function FindNeighbors determines which nodes in the particle cube are
           	%  in the near-field of the reference node.  The near-field of a cube is
           	%  defined as consisting of those cubes that are on the same level as the
           	%  cube, and have a distance to the center of the cube less than sqrt(3)
           	%  of the cube edge length.
           	%
           	%  The format of the FindNeighbors routine is:
           	%
           	%    FindNeighbors(node,reference_node)
           	%
           	%  A description of each parameter follows:
           	%
           	%    o node: Specifies a pointer to a Node structure.
           	%
           	%    o reference_node: Specifies a pointer to a Node structure.
           	%
           	%
           	*/
           	static void FindNeighbors(node,reference_node)
           	register Node
           	  *node,
           	  *reference_node;
   4744 -> 	{
           	  register double
           	    distance_squared,
           	    x_distance,
           	    y_distance,
           	    z_distance;
           	
           	  unsigned int
           	    id;
           	
           	  /*
           	    Recursively descend the particle cube.
           	  */
           	  if (node->level < reference_node->level)
    584 -> 	    for (id=0; id < MaxChildren; id++)
   4672 -> 	      FindNeighbors(node->child[id],reference_node);
   4744 -> 	  if (node->level == reference_node->level)
   4160 -> 	    if (node != reference_node)
           	      {
           	        /*
           	          Determine if this node is within the specified distance of the
           	          reference node.
           	        */
   4088 -> 	        x_distance=reference_node->mid_x-node->mid_x;
           	        y_distance=reference_node->mid_y-node->mid_y;
           	        z_distance=reference_node->mid_z-node->mid_z;
           	        distance_squared=x_distance*x_distance+y_distance*y_distance+
           	          z_distance*z_distance;
           	        if (distance_squared <= cube.diagonal_length[node->level])
           	          {
   2192 -> 	            *cube.near_neighbor=node;
           	            cube.near_neighbor++;
           	          }
           	      }
   4744 -> 	}

           	/*
           	%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
           	%
           	%
           	%
           	%  I n i t i a l i z e N o d e
           	%
           	%
           	%
           	%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
           	%
           	%  Function InitializeNode allocates memory for a new node in the color cube
           	%  tree and presets all fields to zero.
           	%
           	%  The format of the InitializeNode routine is:
           	%
           	%      node=InitializeNode(id,level,parent,mid_x,mid_y,mid_z)
           	%
           	%  A description of each parameter follows.
           	%
           	%    o node: The InitializeNode routine returns this integer address.
           	%
           	%    o id: Specifies the child number of the node.
           	%
           	%    o level: Specifies the level in the particle cube the node resides.
           	%
           	%    o parent: Specifies the parent of the initialized node.
           	%
           	%    o mid_x: Specifies the mid point of the x extent for this node.
           	%
           	%    o mid_y: Specifies the mid point of the y extent for this node.
           	%
           	%    o mid_z: Specifies the mid point of the z extent for this node.
           	%
           	%
           	*/
           	static Node *InitializeNode(id,level,parent,mid_x,mid_y,mid_z)
           	unsigned int
           	  id,
           	  level;
           	
           	Node
           	  *parent;
           	
           	float
           	  mid_x,
           	  mid_y,
           	  mid_z;
     73 -> 	{
           	  register float
           	    *phi,
           	    *psi;
           	
           	  register int
           	    i;
           	
           	  register Node
           	    *node;
           	
           	  if (cube.free_nodes == 0)
           	    {
           	      Nodes
           	        *nodes;
           	
           	      /*
           	        Allocate a queue of nodes.
           	      */
      1 -> 	      nodes=(Nodes *) malloc(sizeof(Nodes));
           	      if (nodes == (Nodes *) NULL)
  ##### -> 	        return((Node *) NULL);
           	      nodes->phi=(float *)
      1 -> 	        malloc(NodesInAQueue*cube.coefficients*sizeof(float));
           	      if (nodes->phi == (float *) NULL)
  ##### -> 	        return((Node *) NULL);
           	      nodes->psi=(float *)
      1 -> 	        malloc(NodesInAQueue*cube.coefficients*sizeof(float));
           	      if (nodes->phi == (float *) NULL)
  ##### -> 	        return((Node *) NULL);
      1 -> 	      nodes->next=cube.node_queue;
           	      cube.node_queue=nodes;
           	      cube.next_node=nodes->nodes;
           	      cube.free_nodes=NodesInAQueue;
           	      cube.phi=nodes->phi;
           	      cube.psi=nodes->psi;
           	    }
           	  /*
           	    Initialize node fields.
           	  */
     73 -> 	  node=cube.next_node;
           	  node->parent=parent;
           	  for (i=0; i < MaxChildren; i++)
    584 -> 	    node->child[i]=(Node *) NULL;
     73 -> 	  node->near_field[0]=(Node *) NULL;
           	  node->interactive_field[0]=(Node *) NULL;
           	  node->id=id;
           	  node->level=level;
           	  node->mid_x=mid_x;
           	  node->mid_y=mid_y;
           	  node->mid_z=mid_z;
           	  node->phi=cube.phi;
           	  node->psi=cube.psi;
           	  /*
           	    Set multipole and local expansions to zero.
           	  */
           	  phi=node->phi;
           	  psi=node->psi;
           	  for (i=0; i < cube.coefficients; i++)
           	  {
   1460 -> 	    *phi++=0.0;
           	    *psi++=0.0;
           	  }
     73 -> 	  node->particle=(Particle *) NULL;
           	  /*
           	    Prepare for next node.
           	  */
           	  cube.nodes++;
           	  cube.free_nodes--;
           	  cube.next_node++;
           	  cube.phi+=cube.coefficients;
           	  cube.psi+=cube.coefficients;
           	  return(node);
           	}

           	/*
           	%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
           	%
           	%
           	%
           	%  I n N e a r F i e l d
           	%
           	%
           	%
           	%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
           	%
           	%  Function InNearField returns True if the specified node is a member of the
           	%  near-field.
           	%
           	%  The format of the InNearField routine is:
           	%
           	%    InNearField(reference_node,node)
           	%
           	%  A description of each parameter follows:
           	%
           	%    o reference_node: Specifies a pointer to a Node structure.
           	%
           	%    o node: Specifies a pointer to a Node structure.
           	%
           	%
           	*/
           	static int InNearField(reference_node,node)
           	register Node
           	  *reference_node,
           	  *node;
   3584 -> 	{
           	  register Node
           	    **near_neighbor;
           	
           	  if (reference_node == node)
  ##### -> 	    return(True);
           	  /*
           	    Search the near-field for the specified node.
           	  */
   3584 -> 	  near_neighbor=node->near_field;
           	  while (*near_neighbor != (Node *) NULL)
           	  {
  89180 -> 	    if (*near_neighbor == reference_node)
   1688 -> 	      return(True);
  87492 -> 	    near_neighbor++;
           	  }
   1896 -> 	  return(False);
           	}

           	/*
           	%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
           	%
           	%
           	%
           	%  L o c a l E x p a n s i o n
           	%
           	%
           	%
           	%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
           	%
           	%  Function LocalExpansion forms a local expansion which describes the
           	%  potential field due to all particles in the system that are not contained
           	%  in the current cube or its near-field.
           	%
           	%  The format of the LocalExpansion routine is:
           	%
           	%    LocalExpansion(interactive_neighbor,node,psi_tilde)
           	%
           	%  A description of each parameter follows:
           	%
           	%    o interactive_neighbor: Specifies a pointer to a Node structure.
           	%
           	%    o node: Specifies a pointer to a Node structure.
           	%
           	%    o psi_tilde: Specifies a pointer to an array of floats representing the
           	%      local expansion for this node.
           	%
           	%
           	*/
           	static void LocalExpansion(interactive_neighbor,node,psi_tilde)
           	Node
           	  *interactive_neighbor,
           	  *node;
           	
           	float
           	  *psi_tilde;
   1336 -> 	{
           	  double
           	    distance,
           	    tau_x,
           	    tau_y,
           	    tau_z,
           	    x_distance,
           	    y_distance,
           	    z_distance;
           	
           	  int
           	    i,
           	    j,
           	    k,
           	    n;
           	
           	  register double
           	    *ijk,
           	    sum;
           	
           	  register int
           	    a,
           	    b,
           	    g;
           	
           	  static double
           	    r_zero[HighestDegreeHarmonic+1];
           	
           	  /*
           	    Initialize Psi' as specified by Theorem 3.2.2.
           	  */
           	  x_distance=interactive_neighbor->mid_x-node->mid_x;
           	  y_distance=interactive_neighbor->mid_y-node->mid_y;
           	  z_distance=interactive_neighbor->mid_z-node->mid_z;
           	  distance=sqrt(x_distance*x_distance+y_distance*y_distance+
           	    z_distance*z_distance);
           	  tau_x=2.0*(x_distance/distance);
           	  tau_y=2.0*(y_distance/distance);
           	  tau_z=2.0*(z_distance/distance);
           	  for (i=0; i <= cube.precision; i++)
   5344 -> 	    r_zero[i]=1.0/Power(distance,i+1);
   1336 -> 	  ijk=cube.ijk_factorial;
   5344 -> 	  for (i=0; i <= cube.precision; i++)
  13360 -> 	    for (j=0; j <= (cube.precision-i); j++)
  13360 -> 	      for (k=0; k <= (cube.precision-i-j); k++)
           	      {
  26720 -> 	        sum=0.0;
  32064 -> 	        for (a=0; a <= (i/2); a++)
  32064 -> 	          for (b=0; b <= (j/2); b++)
  37408 -> 	            for (g=0; g <= (k/2); g++)
           	              sum+=TetrahedralArray(cube.ijk_binomial,i-a,j-b,k-g)*
           	                cube.binomial[i-a][a]*cube.binomial[j-b][b]*
           	                cube.binomial[k-g][g]*Power(tau_x,i-2*a)*
  42752 -> 	                Power(tau_y,j-2*b)*Power(tau_z,k-2*g);
  26720 -> 	        n=i+j+k;
           	        sum*=Power((double) -1.0,n)*r_zero[n]/(*ijk++);
           	        *psi_tilde++=sum;
           	      }
   1336 -> 	}

           	/*
           	%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
           	%
           	%
           	%
           	%  M u l t i p o l e E x p a n s i o n
           	%
           	%
           	%
           	%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
           	%
           	%  Function MultipoleExpansion forms the multipole expansions of the potential
           	%  field due to particle in each node.
           	%
           	%  The format of the MultipoleExpansion routine is:
           	%
           	%    MultipoleExpansion(node,parent,phi_tilde)
           	%
           	%  A description of each parameter follows:
           	%
           	%    o node: Specifies a pointer to a Node structure.
           	%
           	%    o parent: Specifies a pointer to a Node structure.
           	%
           	%    o phi_tilde: Specifies a pointer to an array of floats representing the
           	%      multipole expansion for this node.
           	%
           	%
           	*/
           	static void MultipoleExpansion(node,parent,phi_tilde)
           	Node
           	  *node,
           	  *parent;
           	
           	register float
           	  *phi_tilde;
     72 -> 	{
           	  double
           	    x_distance,
           	    y_distance,
           	    z_distance;
           	
           	  register double
           	    *ijk;
           	
           	  register int
           	    i,
           	    j,
           	    k;
           	
           	  /*
           	    Initialize Phi' as specified by Theorem 3.2.1.
           	  */
           	  x_distance=node->mid_x-parent->mid_x;
           	  y_distance=node->mid_y-parent->mid_y;
           	  z_distance=node->mid_z-parent->mid_z;
           	  ijk=cube.ijk_factorial;
    288 -> 	  for (i=0; i <= cube.precision; i++)
    288 -> 	    for (j=0; j <= (cube.precision-i); j++)
    720 -> 	      for (k=0; k <= (cube.precision-i-j); k++)
           	        *phi_tilde++=Power(-x_distance,i)*Power(-y_distance,j)*
   1440 -> 	          Power(-z_distance,k)*(*ijk++);
     72 -> 	}

           	/*
           	%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
           	%
           	%
           	%
           	%  M u l t i p o l e P o t e n t i a l
           	%
           	%
           	%
           	%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
           	%
           	%  Function MultipolePotential calculates the interaction potentials of a
           	%  number of particles in O(N).
           	%
           	%  The format of the MultipolePotential routine is:
           	%
           	%    potential=MultipolePotential(particles,number_particles,precision,
           	%      minimum_extent,maximum_extent,tree_depth)
           	%
           	%  A description of each parameter follows:
           	%
           	%    o particles: Specifies a pointer to an array of Particles structures.
           	%
           	%    o number_particles: Specifies the number of particles in the
           	%      particles array.
           	%
           	%    o precision: Specifies the number of terms to use in the multipole
           	%      and local expansions.
           	%
           	%    o minimum_extent: Specifies the minumim of the cube extent.
           	%
           	%    o maximum_extent: Specifies the maximum of the cube extent.
           	%
           	%    o tree_depth: Normally, this integer value is zero or one.  A zero
           	%      tells MultipolePotential to choose a optimal tree depth of
           	%      Log8(number_particles).
           	%
           	%
           	*/
           	double MultipolePotential(particles,number_particles,precision,minimum_extent,
           	  maximum_extent,tree_depth)
           	Particle
           	  *particles;
           	
           	unsigned int
           	  number_particles,
           	  precision,
           	  tree_depth;
           	
           	float
           	  maximum_extent,
           	  minimum_extent;
      1 -> 	{
           	  double
           	    factorial[HighestDegreeHarmonic+1];
           	
           	  int
           	    count,
           	    start_time;
           	
           	  Nodes
           	    *nodes;
           	
           	  register double
           	    *ijk,
           	    *ijk_binomial,
           	    sum;
           	
           	  register int
           	    i,
           	    j,
           	    k,
           	    n;
           	
           	  unsigned int
           	    level;
           	
           	  (void) fprintf(stderr,"Particles: %d\n",number_particles);
           	  cube.precision=precision;
           	  if (cube.precision > HighestDegreeHarmonic)
  ##### -> 	    cube.precision=HighestDegreeHarmonic;
      1 -> 	  (void) fprintf(stderr,"Precision: %d\n",cube.precision);
           	  cube.coefficients=Binomial((int) cube.precision+3,3);
           	  (void) fprintf(stderr,"Coefficients: %d\n",cube.coefficients);
           	  /*
           	    Initialize ijk factorial table.
           	  */
           	  factorial[0]=1.0;
           	  sum=1.0;
           	  for (i=1; i <= cube.precision; i++)
           	  {
      3 -> 	    sum*=(double) i;
           	    factorial[i]=sum;
           	  }
      1 -> 	  cube.ijk_factorial=(double *) malloc(cube.coefficients*sizeof(double));
           	  if (cube.ijk_factorial == (double *) NULL)
  ##### -> 	    Error("unable to allocate memory",(char *) NULL);
      1 -> 	  ijk=cube.ijk_factorial;
      4 -> 	  for (i=0; i <= cube.precision; i++)
     10 -> 	    for (j=0; j <= (cube.precision-i); j++)
     10 -> 	      for (k=0; k <= (cube.precision-i-j); k++)
     20 -> 	        *ijk++=1.0/(factorial[i]*factorial[j]*factorial[k]);
           	  /*
           	    Initialize ijk binomial table: [(-1/2)*(-1/2-1)*(-1/2-2)*...].
           	  */
      1 -> 	  factorial[0]=1.0;
           	  sum=1.0;
           	  for (i=1; i <= cube.precision; i++)
           	  {
      3 -> 	    sum*=(-0.5-(double) i+1.0);
           	    factorial[i]=sum;
           	  }
      1 -> 	  cube.ijk_binomial=(double *) malloc(cube.coefficients*sizeof(double));
           	  if (cube.ijk_binomial == (double *) NULL)
  ##### -> 	    Error("unable to allocate memory",(char *) NULL);
      1 -> 	  ijk_binomial=cube.ijk_binomial;
           	  ijk=cube.ijk_factorial;
      4 -> 	  for (i=0; i <= cube.precision; i++)
     10 -> 	    for (j=0; j <= (cube.precision-i); j++)
     10 -> 	      for (k=0; k <= (cube.precision-i-j); k++)
     20 -> 	        *ijk_binomial++=factorial[i+j+k]*(*ijk++);
           	  /*
           	    Initialize binomial coefficient table.
           	  */
      1 -> 	  for (n=0; n <= cube.precision; n++)
      4 -> 	    for (k=0; k <= n; k++)
     10 -> 	      cube.binomial[n][k]=(double) Binomial(n,k);
           	  /*
           	    Initialize Tetrahedral cubic array lookup table.
           	  */
      1 -> 	  for (i=0; i <= (cube.precision+2); i++)
      6 -> 	    cube.tetra_three[i]=cube.coefficients-Binomial((int) cube.precision-i+2,3);
      1 -> 	  for (i=0; i <= (cube.precision+2); i++)
      6 -> 	    cube.tetra_two[i]=Binomial((int) cube.precision-i+2,2);
           	  /*
           	    Allocate Phi' & Psi'.
           	  */
      1 -> 	  cube.phi_tilde=(float *) malloc(cube.coefficients*sizeof(float));
           	  cube.psi_tilde=(float *) malloc(cube.coefficients*sizeof(float));
           	  if ((cube.phi_tilde == (float *) NULL) ||
           	      (cube.psi_tilde == (float *) NULL))
  ##### -> 	    Error("unable to allocate memory",(char *) NULL);
      1 -> 	  cube.near_potential=0.0;
           	  cube.far_potential=0.0;
           	  /*
           	    Choose a level of refinement:  depth is log8(number_particles)-1.
           	  */
           	  if (tree_depth > 1)
  ##### -> 	    cube.depth=Min(tree_depth,8);
           	  else
           	    {
      1 -> 	      cube.depth=0;
           	      count=number_particles >> 3;
           	      do
           	      {
      3 -> 	        count>>=3;
           	        cube.depth++;
           	      }
           	      while (count > 0);
      1 -> 	      if (cube.depth < 1)
  ##### -> 	        cube.depth=1;
           	    }
      1 -> 	  (void) fprintf(stderr,"Cube depth: %d\n",cube.depth);
           	  /*
           	    Initialize edge length table.
           	  */
           	  (void) fprintf(stderr,"Cube lower left front vertex: %.8g %.8g %.8g\n",
           	    minimum_extent,minimum_extent,minimum_extent);
           	  (void) fprintf(stderr,"Cube upper right behind vertex: %.8g %.8g %.8g\n",
           	    maximum_extent,maximum_extent,maximum_extent);
           	  cube.edge_length[0]=maximum_extent-minimum_extent;
           	  cube.diagonal_length[0]=2.0*CircumscribedSphereRadius*cube.edge_length[0]*
           	    cube.edge_length[0];
           	  for (level=1; level <= cube.depth; level++)
           	  {
      3 -> 	    cube.edge_length[level]=cube.edge_length[level-1]*0.5;
           	    cube.diagonal_length[level]=2.0*CircumscribedSphereRadius*
           	      cube.edge_length[level]*cube.edge_length[level];
           	  }
           	  /*
           	    Allocate particle description tree.
           	  */
      1 -> 	  cube.node_queue=(Nodes *) NULL;
           	  cube.nodes=0;
           	  cube.free_nodes=0;
           	  cube.root=InitializeNode(0,0,(Node *) NULL,
           	    minimum_extent+cube.edge_length[0]*0.5,
           	    minimum_extent+cube.edge_length[0]*0.5,
           	    minimum_extent+cube.edge_length[0]*0.5);
           	  if (cube.root == (Node *) NULL)
  ##### -> 	    Error("unable to allocate memory",(char *) NULL);
      1 -> 	  cube.root->parent=cube.root;
           	  /*
           	    Multibody steps:
           	  */
           	  (void) fprintf(stderr,"CreateCube: ");
           	  start_time=ProcessTime();
           	  CreateCube(cube.root);
           	  (void) fprintf(stderr,"%ds\n",ProcessTime()-start_time);
           	  (void) fprintf(stderr,"Classification: ");
           	  start_time=ProcessTime();
           	  Classification(particles,number_particles);
           	  (void) fprintf(stderr,"%ds\n",ProcessTime()-start_time);
           	  (void) fprintf(stderr,"DefineNearField: ");
           	  start_time=ProcessTime();
           	  DefineNearField(cube.root);
           	  (void) fprintf(stderr,"%ds\n",ProcessTime()-start_time);
           	  (void) fprintf(stderr,"DefineInteractiveField: ");
           	  start_time=ProcessTime();
           	  DefineInteractiveField(cube.root);
           	  (void) fprintf(stderr,"%ds\n",ProcessTime()-start_time);
           	  (void) fprintf(stderr,"ComputePhi: ");
           	  start_time=ProcessTime();
           	  ComputePhi(cube.root);
           	  (void) fprintf(stderr,"%ds\n",ProcessTime()-start_time);
           	  (void) fprintf(stderr,"ComputePsi: ");
           	  start_time=ProcessTime();
           	  ComputePsi(cube.root);
           	  (void) fprintf(stderr,"%ds\n",ProcessTime()-start_time);
           	  (void) fprintf(stderr,"Near: %.8g\nFar: %.8g\n",cube.near_potential,
           	    cube.far_potential);
           	  /*
           	    Release N-body cube tree storage.
           	  */
           	  do
           	  {
      1 -> 	    nodes=cube.node_queue->next;
           	    (void) free((void *) cube.node_queue->phi);
           	    (void) free((void *) cube.node_queue->psi);
           	    (void) free((void *) cube.node_queue);
           	    cube.node_queue=nodes;
           	  }
           	  while (cube.node_queue != (Nodes *) NULL);
      1 -> 	  (void) free((void *) cube.ijk_factorial);
           	  (void) free((void *) cube.ijk_binomial);
           	  (void) free((void *) cube.phi_tilde);
           	  (void) free((void *) cube.psi_tilde);
           	  return(cube.near_potential+cube.far_potential);
           	}

           	/*
           	%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
           	%
           	%
           	%
           	%  N a i v e P o t e n t i a l
           	%
           	%
           	%
           	%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
           	%
           	%  Function NaivePotential calculates the interaction potentials of a
           	%  number of particles using the naive direct method.
           	%
           	%  The format of the NaivePotential routine is:
           	%
           	%    potential=NaivePotential(particles,number_particles)
           	%
           	%  A description of each parameter follows:
           	%
           	%    o particles: Specifies a pointer to an array of Particles structures.
           	%
           	%    o number_particles: Specifies the number of particles in the
           	%      particles array.
           	%
           	%
           	*/
           	static double NaivePotential(particles,number_particles)
           	Particle
           	  *particles;
           	
           	unsigned int
           	  number_particles;
      1 -> 	{
           	  double
           	    distance,
           	    total_potential,
           	    x_distance,
           	    y_distance,
           	    z_distance;
           	
           	  register int
           	    i,
           	    j;
           	
           	  register Particle
           	    *p,
           	    *q;
           	
           	  /*
           	    Calculate gravitational total potential.  Use Newton's third law to
           	    reduce the number of pairwise interactions.
           	  */
           	  total_potential=0.0;
           	  p=particles;
           	  for (i=0; i < number_particles; i++)
           	  {
   1023 -> 	    q=p;
           	    for (j=(i+1); j < number_particles; j++)
           	    {
 522753 -> 	      q++;
           	      x_distance=p->x-q->x;
           	      y_distance=p->y-q->y;
           	      z_distance=p->z-q->z;
           	      distance=sqrt(x_distance*x_distance+y_distance*y_distance+
           	        z_distance*z_distance);
           	      total_potential+=1.0/distance;
           	    }
   1023 -> 	    p++;
           	  }
      1 -> 	  return(2.0*GravitationalConstant*total_potential);
           	}

           	/*
           	%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
           	%
           	%
           	%
           	%  P o w e r
           	%
           	%
           	%
           	%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
           	%
           	%  Function Power returns the x value raised to the y'th power.
           	%
           	%  The format of the Power routine is:
           	%
           	%    Power(x,n)
           	%
           	%  A description of each parameter follows:
           	%
           	%    o x: Specifies a double value.
           	%
           	%    o n: Specifies a non-negative integer.
           	%
           	%
           	*/
           	static double Power(x,n)
           	register double
           	  x;
           	
           	register int
           	  n;
 306824 -> 	{
           	  register double
           	    power;
           	
           	  for (power=1.0; n > 0; n--)
 251590 -> 	    power*=x;
 306824 -> 	  return(power);
           	}

           	/*
           	%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
           	%
           	%
           	%
           	%   M a i n
           	%
           	%
           	%
           	%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
           	%
           	%
           	*/
           	main(argc,argv)
           	int
           	  argc;
           	
           	char
           	  **argv;
      1 -> 	{
           	  double
           	    multipole_potential,
           	    naive_potential;
           	
           	  float
           	    maximum_extent,
           	    minimum_extent;
           	
           	  int
           	    i;
           	
           	  long int
           	    start_time;
           	
           	  Particle
           	    *particles;
           	
           	  unsigned int
           	    number_particles,
           	    precision;
           	
           	  /*
           	    Allocate particle array.
           	  */
           	  application_name=argv[0];
           	  precision=3;
           	  if (argc > 1)
  ##### -> 	    precision=atoi(argv[1]);
      1 -> 	  (void) scanf("%d",&number_particles);
           	  particles=(Particle *) malloc(number_particles*sizeof(Particle));
           	  if (particles == (Particle *) NULL)
  ##### -> 	    Error("unable to allocate memory",(char *) NULL);
           	  /*
           	    Read particles positions.
           	  */
      1 -> 	  (void) scanf("%f %f %f",&particles[0].x,&particles[0].y,&particles[0].z);
           	  minimum_extent=particles[0].x;
           	  maximum_extent=particles[0].x;
   1022 -> 	  for (i=1; i < number_particles; i++)
           	  {
   1022 -> 	    (void) scanf("%f %f %f",&particles[i].x,&particles[i].y,&particles[i].z);
           	    /*
           	      Define vertexes of the particle cube.
           	    */
           	    if (particles[i].x < minimum_extent)
      1 -> 	      minimum_extent=particles[i].x;
   1022 -> 	    if (particles[i].y < minimum_extent)
      4 -> 	      minimum_extent=particles[i].y;
   1022 -> 	    if (particles[i].z < minimum_extent)
      3 -> 	      minimum_extent=particles[i].z;
   1022 -> 	    if (particles[i].x > maximum_extent)
      4 -> 	      maximum_extent=particles[i].x;
   1022 -> 	    if (particles[i].y > maximum_extent)
      1 -> 	      maximum_extent=particles[i].y;
   1022 -> 	    if (particles[i].z > maximum_extent)
      3 -> 	      maximum_extent=particles[i].z;
           	  }
      1 -> 	  minimum_extent=floor(minimum_extent);
           	  maximum_extent=ceil(maximum_extent);
           	  start_time=ProcessTime();
           	  multipole_potential=MultipolePotential(particles,number_particles,precision,
           	    minimum_extent,maximum_extent,0);
           	  (void) fprintf(stderr,"The multipole-expansion potential is %.8g (%ds).\n",
           	    multipole_potential,ProcessTime()-start_time);
           	  start_time=ProcessTime();
           	  naive_potential=NaivePotential(particles,number_particles);
           	  (void) fprintf(stderr,"The naive potential is %.8g (%ds).\n",
           	    naive_potential,ProcessTime()-start_time);
           	  (void) fprintf(stderr,"The expansion error: %.8g\n\n",
           	    AbsoluteValue(multipole_potential-naive_potential));
      1 -> 	  (void) free((void *) particles);
           	  return(False);
           	}