VOBOZ


Version 1.03 Documentation


Note: this is the documentation for v. 1.03, not the current version, 1.2. In v. 1.2, the biggest differences are that (1) it is easier to install (thanks, Rick Wagner!), (2) it is combined with ZOBOV, and (3) boz returns some more data in the text file. Also, boz works a bit differently now; it may be worth trying boz from both v. 1.2 and v. 1.03. The extra fields in v. 1.2 are explained in the comments at the end of boz.c. This disclaimer may disappear if I get a chance to update this page.

   VOBOZ (VOronoi BOund Zones) is an "almost-parameter-free" algorithm designed to detect haloes in N-body cosmological simulations, written by Mark Neyrinck, with "advice" from my advisors, Andrew Hamilton and Nick Gnedin.  This file contains instructions on installing and using VOBOZ.  It would be difficult to use these instructions by themselves; they are intended to be used in conjunction with the paper describing VOBOZ, available at astro-ph/0402346.  However, we hope that, with help from the paper, this file contains enough information to get one started using VOBOZ.  Of course, questions, comments, and bug reports (without absolute guarantees of prompt fixing) are very welcome; please direct them to Mark Neyrinck.  This is free software, which may be freely copied, modified, and redistributed, as long as the authors are acknowledged.  There is no warranty or other guarantee of fitness for VOBOZ; it is provided "as is."

    The URL of this file is http://www.ifa.hawaii.edu/~neyrinck/voboz/vobozhelp.html.

    This document was last updated on 1/18/2008.

    Here is a table of contents, for easy navigation:
Installation
Use
    voz
        vozinit
        voz1b1
        voztie
    jovoz
    boz

Installation

     Assuming you already have a UNIX computer and a C compiler for it, there are two things that you need to download to use VOBOZ: our software, voboz1.03.tar.gz; and the Qhull package, qhull-2003.1-src.tgz.  (On a Sun workstation, I have had difficulty getting this newest version of Qhull to compile; if this happens to you, try the penultimate version, qhull2002.1.tgz.)

    To install VOBOZ, first gunzip and untar the Qhull package into its own directory (e.g. ~/qhull-2003.1/).  Go into its src subdirectory (e.g. ~/qhull-2003.1/src/), and adjust the Makefile until everything compiles via make.  Qhull also includes a utility, Make-config.sh, which is supposed to ease this process.

    Next, you get to compile VOBOZ itself.  Gunzip and untar voboz1.03.tar.gz, whose contents will go into their own voboz/ directory.  You should get the following files:

Makefile      jovoz.c       voz.h         voztie.c
boz.c         readfiles.c   voz1b1.c      vozutil.c
findrtop.c    readme.txt    vozinit.c


Now, unfortunately, more Makefile fiddling will probably be necessary in the VOBOZ directory, but we sincerely hope that, without too much trouble, "make" will produce for you a quintet of useful programs.  Make sure that the paths to the Qhull header files and library file are correct in the Makefile.  By default, the Makefile is set to work on the IBM p690 supercomputer at NCSA.  Please email me if you encounter difficulties.

Use

    The VOBOZ algorithm has three steps: (1) computing the Voronoi diagram of the simulation particles, returning the volume of each particle and its set of adjacent particles; (2) grouping them into prospective haloes; and (3) discarding particles in each halo which are not bound to it, possibly destroying the halo.  Likewise, there are three different groups of programs comprising VOBOZ:

voz

    This trio of programs (VOronoi Zones) computes the Voronoi diagram of a set of particles satisfying periodic boundary conditions, and, for each particle, the programs return the volume of its Voronoi cell, and its set of Voronoi adjacencies.  This step is split across three programs because the memory requirements of computing the Voronoi diagram can be prohibitive if a large simulation is analyzed directly.  So, we break up the simulation into at least two equal parts in each dimension, resulting in at least eight "sub-boxes."

vozinit

    This first program does not actually have to be run, but it gives some idea of how equal the partition of the simulation will be, checks that the number of guard points is sufficient, and generates a script file which may be used to complete the "voz" step.  The input parameters are:

arg1: position file -- discussed below
arg2: buffer size (default 0.1) -- discussed below
arg3: box size -- the range of positions of particles in each dimension
arg4: number of divisions (default 2) -- the no. of partitions in each dimension; must be at least 2 (giving 8 sub-boxes)
arg5: suffix describing this run -- a label for the output files

    The position file contains the positions of all particles.  By default, this is a Fortran 77 unformatted file, written as below.   The user may also wish to modify the input routines in readfiles.c to suit his/her own data format.

open(unit=1, file=outname, form='unformatted')
write(1) num_particles
write(1) (x(i),i=1,num_particles)
write(1) (y(i),i=1,num_particles)
write(1) (z(i),i=1,num_particles)
close(1)
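
    For readers working in C, here is a minimal sketch of how a file written this way might be read back.  It is not the code in readfiles.c.  It assumes that each Fortran write produces one record framed by 4-byte length markers (a common compiler convention, but not a guaranteed one), that num_particles is a 4-byte integer, that the coordinates are 4-byte reals, and that the file is in the machine's native byte order.  The names read_f77_record and read_positions are made up for this example.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Read one Fortran unformatted sequential record whose payload must be
   exactly nbytes long.  ASSUMPTION: 4-byte record markers before and
   after the payload.  Returns 0 on success, -1 on any mismatch. */
int read_f77_record(FILE *fp, void *buf, size_t nbytes)
{
    int32_t head, tail;

    assert(fp != NULL && buf != NULL);
    if (fread(&head, sizeof head, 1, fp) != 1)
        return -1;
    if (head < 0 || (size_t)head != nbytes)
        return -1;
    if (nbytes > 0 && fread(buf, 1, nbytes, fp) != nbytes)
        return -1;
    if (fread(&tail, sizeof tail, 1, fp) != 1)
        return -1;
    return tail == head ? 0 : -1;
}

/* Read the particle count, then the x, y and z arrays (one record each).
   On success, returns the number of particles and sets xyz[0..2] to
   malloc'd arrays that the caller must free.  Returns -1 on failure. */
int read_positions(FILE *fp, float *xyz[3])
{
    int32_t np;
    int d;

    xyz[0] = xyz[1] = xyz[2] = NULL;
    if (read_f77_record(fp, &np, sizeof np) != 0 || np <= 0)
        return -1;
    for (d = 0; d < 3; d++) {
        size_t nbytes = (size_t)np * sizeof(float);
        xyz[d] = malloc(nbytes);
        if (xyz[d] == NULL || read_f77_record(fp, xyz[d], nbytes) != 0) {
            free(xyz[0]);
            free(xyz[1]);
            free(xyz[2]);
            xyz[0] = xyz[1] = xyz[2] = NULL;
            return -1;
        }
    }
    return (int)np;
}
```

    If the record markers in your file are 8 bytes wide, or the file was written on a machine with the opposite byte order, the marker check will fail and the function returns -1 instead of returning garbage.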
 
     The buffer size sets the size, in units such that the box size of the data cube is 1, of the buffer around each sub-box when calculating the Voronoi diagram.  To reduce the amount of memory required, VOBOZ divides the particle data cube into at least two equal parts in each dimension, resulting in a minimum of eight sub-boxes.  Undoubtedly, particles along the edge of each sub-box will have neighbors outside it, and the buffer should be big enough to catch all neighbors.  To make sure that it is big enough, guard particles are deployed inside the buffer.  If a guard particle is returned as a neighbor of particle p inside the sub-box, there is a chance that a particle outside the buffer should be included in p's list of neighbors (which also affects the volume returned).  See the paper for more details on this process.
    It is possible that, for a given buffer size, the number of guard points (a hard-coded number) will be insufficient.   If vozinit gives you an error message declaring as much, you will have to either increase the buffer size, or increase the number of guard points, which is set in voz.h.
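
    To make the geometry of the buffer concrete, here is a small sketch, along one axis only, of the test "does this coordinate fall inside a given sub-box once it is widened by the buffer?"  It is an illustration, not code from VOBOZ: it assumes coordinates already scaled so that the box size is 1, sub-boxes of equal width 1/ndiv, and periodic wrapping.  The function name in_padded_subbox_1d is hypothetical.

```c
#include <assert.h>

/* Returns 1 if coordinate x (in units where the box size is 1) lies in
   sub-box number idx (0 .. ndiv-1) along one axis, after the sub-box has
   been widened by `border` on each side.  Periodic boundary conditions
   are applied.  Illustrative only; not taken from the VOBOZ source. */
int in_padded_subbox_1d(double x, int idx, int ndiv, double border)
{
    double width, center, d;

    assert(ndiv >= 2 && idx >= 0 && idx < ndiv);
    width = 1.0 / ndiv;
    center = (idx + 0.5) * width;

    /* Signed periodic separation from the sub-box center, in [-0.5, 0.5). */
    d = x - center;
    while (d >= 0.5)
        d -= 1.0;
    while (d < -0.5)
        d += 1.0;
    if (d < 0.0)
        d = -d;

    return d <= 0.5 * width + border;
}
```

    With ndiv = 2 and a buffer of 0.1, sub-box 0 covers [0, 0.5), so a particle at x = 0.95 is accepted because of the periodic wrap, while one at x = 0.75 is not.  A particle belongs to the padded sub-box in three dimensions when this test passes on all three axes.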

    The output of vozinit is a script file which, if paths are defined to allow it, will run voz1b1 on each sub-box, and then voztie.

voz1b1

    This program (VOZ one-by-one) calculates the Voronoi diagram on one sub-box of the data cube.  Ideally, one would not have to use the input parameters, since they would be set in the vozinit script file, but one argument might have to be changed if a guard point is encountered while diagramming one of the sub-boxes.  Only the sub-box(es) encountering guard particles need be rediagrammed. Here are the relevant arguments:

arg2: border size
arg6-8: 0 to (Ndiv-1)


    If a guard point is encountered, the easiest way to fix the problem is to expand the border size (which will increase both the memory requirement and the calculation time).  This may require increasing the number of guard particles in voz.h and recompiling voz1b1.  Arguments 6-8 are labels for the sub-box being calculated.

    The output of voz1b1 is a file containing the Voronoi adjacencies and volumes of particles in the sub-box: part.%s.%02d.%02d.%02d, where %s is the "suffix," and the %02d's are the three two-digit labels identifying the sub-box.
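
    As a quick illustration of that naming pattern, the sketch below builds the file name for one sub-box with snprintf.  The helper name subbox_filename is hypothetical; only the format string is taken from the text above.

```c
#include <assert.h>
#include <stdio.h>

/* Write the voz1b1 output file name for sub-box (i, j, k) into `out`.
   Returns 0 on success, -1 if the buffer is too small.
   The helper itself is illustrative; only the pattern comes from the docs. */
int subbox_filename(char *out, size_t outlen, const char *suffix,
                    int i, int j, int k)
{
    int n;

    assert(out != NULL && suffix != NULL);
    n = snprintf(out, outlen, "part.%s.%02d.%02d.%02d", suffix, i, j, k);
    return (n >= 0 && (size_t)n < outlen) ? 0 : -1;
}
```

    For the suffix "run1" and sub-box labels 0, 1, 1, this gives part.run1.00.01.01.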

voztie

    This program ties the sub-boxes together, returning single adjacency (adj%s.dat, where %s is the "suffix") and volume (vol%s.dat) files for the whole data cube.  The volume file is a plain binary file as written from C (not a Fortran 77 unformatted file): it consists of a 4-byte integer containing the number of particles, followed by an array of 4-byte floats containing each particle volume.  The adjacency file is formatted in a more complicated manner to reduce its size from the doubly linked list used within the code.  If you wish to access this, please look at the voztie.c code, or the jovoz.c code, which reads the adjacency file.
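
    The volume file layout described above is simple enough to read in a few lines.  The following is a sketch under the assumption that the file is read on a machine with the same byte order as the one that wrote it; the function name read_volumes is made up for this example and does not appear in the VOBOZ source.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Read a volume file: one 4-byte integer (the particle count) followed
   by that many 4-byte floats.  On success, returns the count and stores
   a malloc'd array in *vol (caller frees).  Returns -1 on failure.
   ASSUMPTION: native byte order. */
int read_volumes(FILE *fp, float **vol)
{
    int32_t np;

    assert(fp != NULL && vol != NULL);
    *vol = NULL;
    if (fread(&np, sizeof np, 1, fp) != 1 || np <= 0)
        return -1;
    *vol = malloc((size_t)np * sizeof(float));
    if (*vol == NULL)
        return -1;
    if (fread(*vol, sizeof(float), (size_t)np, fp) != (size_t)np) {
        free(*vol);
        *vol = NULL;
        return -1;
    }
    return (int)np;
}
```

    Open the file in binary mode ("rb") before passing it in.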

jovoz

    This program (JOin VOronoi Zones) first finds "zones," (one for each density maximum) and then links them together in the manner described in the paper.  Its arguments are:

arg1: adjacency file
arg2: volume file
arg3: output zone membership file
arg4: output text file
arg5: volume tolerance (e.g. 1)


    The volume tolerance is the main parameter present in VOBOZ, limiting the growth of haloes into low-density regions.   Its canonical value is 0.01, corresponding to a density of 100 times the mean (the volumes in the volume file should be normalized so that the mean density is 1).

    There are two outputs of jovoz: a zone membership file which contains the particles in each zone, followed by the zones linked to each zone, and a text file.  The text file lists, for each zone: an integer identifying the zone; the number of (pre-unbinding) particles linked to it before a zone with a higher core density is encountered; the ratio of its "strongest link" (see the paper for the definition) volume to its peak volume, called r in the paper; and the "strongest link" volume.

boz

    This program (BOund Zones) removes unbound particles from the haloes found by jovoz.  Its arguments are:

arg1: box size
arg2: Omega_matter
arg3: scale factor a
arg4: position file
arg5: velocity file
arg6: volume file
arg7: input zone file
arg8: output bound zone file
arg9: output text file
arg10: unbinding f

    The new argument here is f.   To unbind particles accurately in high-velocity-dispersion regions, it is necessary to unbind only the most unbound particles first.  This is because jovoz may have included some particles with quite different velocities than the velocity centroid of the halo, skewing the initial calculation of its velocity centroid.  In practice, this is done by multiplying the potential energy by a factor large enough that no particles are unbound (the largest ratio of kinetic to potential energy of any particle in the halo), and then dividing this multiplier by f  at each subsequent iteration, continuing until either the halo is completely unbound or the multiplier reaches unity.  If no particles are unbound in an iteration while the multiplier is still descending, the multiplier is again reduced to the maximal ratio of kinetic to potential energy; this is to eliminate as many unnecessary iterations as possible.  After the multiplier reaches one, the iterations continue with the true unbinding criterion until either the halo is completely unbound, or no particles are unbound in an iteration.
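
    The multiplier schedule is easier to follow in code.  What follows is a deliberately simplified, one-dimensional toy, not the implementation in boz.c: each particle has a velocity and a fixed potential-energy magnitude (boz recomputes potentials, which this toy does not), and the only thing recomputed at each iteration is the velocity centroid.  The names unbind_toy and ke_over_pe are hypothetical.  For brevity, the first pass of the loop both sets the multiplier to the largest kinetic-to-potential ratio and divides it by f, and the "jump down to the maximal ratio" is applied at the start of every descending iteration.

```c
#include <assert.h>
#include <float.h>

/* Ratio of kinetic to potential energy (per unit mass) for one particle,
   with the kinetic energy measured relative to the velocity centroid. */
static double ke_over_pe(double v, double vbar, double phi)
{
    double dv = v - vbar;
    assert(phi > 0.0);
    return 0.5 * dv * dv / phi;
}

/* Toy 1-D unbinding loop (illustrative; not the boz.c algorithm itself).
   v[i]     velocity of particle i
   phi[i]   magnitude of its potential energy per unit mass (held fixed)
   bound[i] in: 1 if particle i starts in the halo; out: 1 if still bound
   f        factor (> 1) by which the multiplier is divided per iteration
   Returns the number of particles left bound. */
int unbind_toy(const double *v, const double *phi, int *bound, int n, double f)
{
    double mult = DBL_MAX;  /* lowered to the largest ratio on the first pass */
    int nbound = 0, i;

    assert(f > 1.0);
    for (i = 0; i < n; i++)
        if (bound[i])
            nbound++;

    while (nbound > 0) {
        double vbar = 0.0, maxratio = 0.0;
        int removed = 0;

        /* Velocity centroid of the particles that are still bound. */
        for (i = 0; i < n; i++)
            if (bound[i])
                vbar += v[i];
        vbar /= nbound;

        /* Largest kinetic-to-potential ratio among them. */
        for (i = 0; i < n; i++)
            if (bound[i] && ke_over_pe(v[i], vbar, phi[i]) > maxratio)
                maxratio = ke_over_pe(v[i], vbar, phi[i]);

        if (mult > 1.0) {
            /* Still descending: skip multipliers at which nothing could be
               unbound, then step down by f, but never below unity. */
            if (maxratio < mult)
                mult = maxratio;
            mult /= f;
            if (mult < 1.0)
                mult = 1.0;
        }

        /* Unbind particles whose kinetic energy exceeds mult * potential. */
        for (i = 0; i < n; i++)
            if (bound[i] && ke_over_pe(v[i], vbar, phi[i]) > mult) {
                bound[i] = 0;
                removed++;
            }
        nbound -= removed;

        /* With the true criterion (mult == 1), stop once nothing changes. */
        if (mult == 1.0 && removed == 0)
            break;
    }
    return nbound;
}
```

    With velocities {0, 1, -1, 0.5, 100}, unit potentials, and f = 2, this loop removes only the outlier at 100 and keeps the other four particles.  Applying the true criterion (multiplier 1) straight away would instead unbind every particle, because the outlier drags the initial velocity centroid to 20.1.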

    If the user is cavalier with processor time, (s)he may wish to use the parameter-free way of unbinding, one particle at a time.  There are some commented-out blocks which should facilitate changing the code to do this.

    There are also a few hard-coded parameters (which should not affect the results) in boz: the threshold number NPTOTOL of particles above which a halo is unbound using the deeper and shallower potential bounds (see paper for details), and the number of partitions NGRID in each dimension into which the halo is gridded in calculating these bounds.  For haloes smaller than the threshold, the potential is calculated directly, an order-n^2 operation.  If there are haloes with large numbers of particles in the simulation, it might expedite things to use a large number of partitions, since the accuracy of the bounds increases with the number of partitions, requiring only one or two bounds to be calculated for more particles.  On the other hand, the time spent on each halo with a number of particles above the threshold scales as NGRID^3, so one should not be too cavalier in partitioning.  A third parameter here allows one to check the progress of boz, telling it to report on the unbinding of haloes with an initial number of particles above PRNTHR. All of these parameters appear in #define statements at the top of boz.c.

    boz is the only program in the package which we have formally parallelized (although voz1b1 may be run on different parts of the simulation in parallel).  There are several "#pragma" compiler directives which have been commented out, which are for the xlc_r compiler on the IBM p690 supercomputer at NCSA.  These can probably guide users wishing to parallelize under other architectures, as well.  Also, if the "output bound zone file" already exists when boz is started, all of the haloes previously unbound are read in from it, and progress recommences at the halo where boz left off the last time.
    Of course, it is vitally important, if tedious, to get the units right in boz.  The velocity file is read in the same format as the position file, in readfiles.c.   When velread(char *velfile, float ***v, float cell) is called from boz.c, cell should be set so that velocities in km/sec are returned.  This can be done by altering the #defined NSIM (the "factory" setting is 256), the box size with which boz is called, or both; cell = boxsize/NSIM, and represents the physical size in Mpc/h of a cell in the simulation if there is one particle per cell.  The velocities in the file which velread is by default set to read must be multiplied by 100*cell to be in km/sec; it is likely that this is not so for the user's velocity file, so he/she should look at the way velocities are read and converted, and the way the "boundness" (kinetic + potential energy) is calculated. A key quantity to look at is "potfactreal," which, as indicated in the program, is GM_particle, times whatever unit conversions are necessary to get the potential energy into (km/sec)^2.  We apologize for any pain the user experiences in checking these units, and will gladly try to help if a user needs it.
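
    The default conversion described above amounts to the arithmetic below.  This is a sketch of that arithmetic only, not the body of velread; the function names cell_size and vel_to_km_per_sec are hypothetical, and the factor of 100*cell applies only if your velocity file follows the same convention as the default one.

```c
#include <assert.h>

#define NSIM 256  /* the "factory" setting quoted in the text */

/* Physical size of one cell in Mpc/h, for one particle per cell. */
double cell_size(double boxsize)
{
    assert(boxsize > 0.0);
    return boxsize / NSIM;
}

/* Convert n velocities in place from the default file units to km/sec,
   by multiplying each one by 100 * cell. */
void vel_to_km_per_sec(float *v, int n, double boxsize)
{
    double factor = 100.0 * cell_size(boxsize);
    int i;

    for (i = 0; i < n; i++)
        v[i] = (float)(v[i] * factor);
}
```

    For example, with NSIM = 256 and a box size of 128 Mpc/h, cell is 0.5 and every stored velocity is multiplied by 50.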

    The output from  boz (and of VOBOZ, since this is the last step) consists of two files.  The first, the "output bound zone file," is formatted as follows:

4-byte integer: # of haloes/zones (including destroyed ones), and then, for each halo (even the destroyed ones): {
    4-byte integer: # of bound particles -- this is zero for destroyed haloes
    an array of 4-byte integers containing the particle numbers of each bound particle in the halo
}
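
    A minimal sketch of walking through this file in C is given below.  It only counts haloes and bound particles; a real reader would store the particle numbers where the comment indicates.  It assumes native byte order, and the function name scan_bound_zones is hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Scan an "output bound zone file".  Returns the number of haloes listed
   (including destroyed ones), or -1 if the file is malformed or truncated.
   *nsurviving receives the number of haloes with at least one bound
   particle; *ntotal receives the total number of bound particles.
   ASSUMPTION: native byte order. */
int scan_bound_zones(FILE *fp, int *nsurviving, long *ntotal)
{
    int32_t nhalo, nb, id, h, i;

    assert(fp != NULL && nsurviving != NULL && ntotal != NULL);
    *nsurviving = 0;
    *ntotal = 0;
    if (fread(&nhalo, sizeof nhalo, 1, fp) != 1 || nhalo < 0)
        return -1;
    for (h = 0; h < nhalo; h++) {
        /* Number of bound particles; zero marks a destroyed halo. */
        if (fread(&nb, sizeof nb, 1, fp) != 1 || nb < 0)
            return -1;
        if (nb > 0)
            (*nsurviving)++;
        *ntotal += nb;
        for (i = 0; i < nb; i++) {
            if (fread(&id, sizeof id, 1, fp) != 1)
                return -1;
            /* id is the particle number of one bound member of halo h;
               a real reader would store it here. */
        }
    }
    return (int)nhalo;
}
```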

    The second output file, the "output text file," is formatted as  follows:

First line: Number of pre-unbinding zones; number of bound haloes, and then, on one line for each BOUND halo: {
    An integer identifying the zone (this is the same as in the jovoz text file, and is the number of the halo in the "output bound zone file")
    The number of bound particles, followed by the total (pre-unbinding) number of particles, in the halo
    The ratio of the halo's "strongest link" (see the paper for the definition) volume to its peak volume, called r in the paper
    The volume of the halo's peak particle -- note that this particle may have been unbound
    The x, y, and z coordinates of the halo's peak particle, in units of the boxsize
    vx, vy, and vz, the three coordinates of the halo's velocity centroid
}
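
    One way to parse a halo line of this text file in C is sketched below.  It assumes that the eleven fields listed above appear in that order on a single line, separated by whitespace, which is how the v. 1.03 layout is described here; v. 1.2 writes extra fields that this sketch does not read.  The struct BozHalo and the function parse_boz_line are names invented for this example.

```c
#include <assert.h>
#include <stdio.h>

/* One line of the boz "output text file" (v. 1.03 layout as described). */
typedef struct {
    int zone;        /* zone identifier, as in the jovoz text file */
    int nbound;      /* number of bound particles */
    int ntotal;      /* pre-unbinding number of particles */
    double r;        /* strongest-link volume / peak volume */
    double peakvol;  /* volume of the halo's peak particle */
    double pos[3];   /* peak particle position, in units of the box size */
    double vel[3];   /* velocity centroid of the halo */
} BozHalo;

/* Parse one halo line.  Returns 0 on success, -1 if fewer than eleven
   fields could be read.  ASSUMPTION: whitespace-separated fields. */
int parse_boz_line(const char *line, BozHalo *h)
{
    int nread;

    assert(line != NULL && h != NULL);
    nread = sscanf(line, "%d %d %d %lf %lf %lf %lf %lf %lf %lf %lf",
                   &h->zone, &h->nbound, &h->ntotal, &h->r, &h->peakvol,
                   &h->pos[0], &h->pos[1], &h->pos[2],
                   &h->vel[0], &h->vel[1], &h->vel[2]);
    return nread == 11 ? 0 : -1;
}
```

    The first line of the file holds only two numbers (the number of pre-unbinding zones and the number of bound haloes), so it should be read separately before this function is applied to the remaining lines.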