Input and output formats for gauge configurations
Several gauge configuration formats are in common use in lattice QCD calculations.
They differ in several ways, including how the header is implemented and how the Gaugefield
object to be saved is converted to binary. When deciding between ILDG and NERSC, note that
ILDG takes up more storage space, since it is not compressed. The benefit of ILDG is that
it has more informative metadata and allows one to store configurations on the Lattice
Data Grid.
ILDG
The International Lattice Data Grid (ILDG) format has two main advantages, namely:
The ILDG is the largest attempt by the lattice community to make gauge configurations generated by groups around the world publicly available, and we strive to be part of that community. If we would like to use their storage, we need to adhere to their format.
The ILDG format is perhaps the most descriptive (in the sense of metadata) and safe (in the sense of being sensitive to corrupted configurations) format available to the lattice community. More information about the ILDG effort can be found here. You may also be interested in this ILDG publication.
How an ILDG configuration is packaged
A file saved in ILDG format consists of several parts packaged using the Lattice QCD Interchange Message Encapsulation (LIME) format. (You can learn more about LIME below.) LIME files are generally organized as follows:
One encapsulates ASCII or binary data into records.
The records are packaged into messages. In SIMULATeQCD, we do not really take advantage of this hierarchy, and organize our output LIME files as four messages, each containing a single record.
The ildg-format record is an XML document with a set of immutable parameters needed to
read the binary. Here is an example:
<?xml version="1.0" encoding="UTF-8"?>
<ildgFormat xmlns="http://www.lqcd.org/ildg"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.lqcd.org/ildg/filefmt.xsd">
<version> 1.0 </version>
<field> su3gauge </field>
<precision> 32 </precision>
<lx> 20 </lx> <ly> 20 </ly> <lz> 20 </lz> <lt> 64 </lt>
</ildgFormat>
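As a rough illustration, the parameters in an ildg-format record like the one above can be pulled out with a standard XML parser. The sketch below uses Python's standard library; the helper name `parse_ildg_format` and the returned dictionary layout are hypothetical and not part of SIMULATeQCD.

```python
import xml.etree.ElementTree as ET

# Namespace declared on the <ildgFormat> root element in the example above.
ILDG_NS = {"ildg": "http://www.lqcd.org/ildg"}

# The example record from the text, as it would sit in the LIME file (bytes).
EXAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<ildgFormat xmlns="http://www.lqcd.org/ildg"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.lqcd.org/ildg/filefmt.xsd">
<version> 1.0 </version>
<field> su3gauge </field>
<precision> 32 </precision>
<lx> 20 </lx> <ly> 20 </ly> <lz> 20 </lz> <lt> 64 </lt>
</ildgFormat>"""


def parse_ildg_format(xml_bytes):
    """Return the ildg-format parameters as a dict (hypothetical helper)."""
    root = ET.fromstring(xml_bytes)

    def text(tag):
        # Values in the record are padded with spaces, so strip them.
        return root.find(f"ildg:{tag}", ILDG_NS).text.strip()

    return {
        "version": text("version"),
        "field": text("field"),
        "precision": int(text("precision")),
        "dims": tuple(int(text(t)) for t in ("lx", "ly", "lz", "lt")),
    }


info = parse_ildg_format(EXAMPLE)
```

Here `info["precision"]` and `info["dims"]` are the two pieces of information a reader needs before it can interpret the binary record.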
The ildg-binary-data record is the stored binary data. It is a sequence of IEEE floating
point numbers. The precision is given in the ildg-format record. The byte order is big-endian.
In this format a Gaugefield is stored as an 8 (or 7) dimensional array of floating point
(or complex) numbers. The dimensions, ordered from slowest to fastest running index, are
site index \(t\)
site index \(z\)
site index \(y\)
site index \(x\)
direction index \(\mu\)
color index \(a\)
color index \(b\)
index indicating real (0) or imaginary (1) part
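The index ordering above fixes where each matrix element sits in the binary record. The following sketch computes the byte offset of one complex element and reads it back; the function names are hypothetical, and the values of 4 directions and 3 colors are assumptions appropriate for an SU(3) gauge field in four dimensions.

```python
import struct


def link_offset(x, y, z, t, mu, a, b, dims, precision=32, nd=4, nc=3):
    """Byte offset of element (a, b) of the link at site (x, y, z, t), direction mu.

    dims = (lx, ly, lz, lt); t is the slowest index, the real/imaginary
    index the fastest, as in the list above.
    """
    lx, ly, lz, lt = dims
    site = x + lx * (y + ly * (z + lz * t))        # x runs fastest among sites
    element = ((site * nd + mu) * nc + a) * nc + b  # index of the complex number
    return element * 2 * (precision // 8)          # 2 floats per complex number


def read_element(buf, offset, precision=32):
    """Read one big-endian complex number starting at the given byte offset."""
    fmt = ">2f" if precision == 32 else ">2d"
    re, im = struct.unpack_from(fmt, buf, offset)
    return complex(re, im)


# Toy 2^4 lattice whose i-th complex entry is (i, -i), to check the indexing.
dims = (2, 2, 2, 2)
n_complex = 2 * 2 * 2 * 2 * 4 * 3 * 3
buf = b"".join(struct.pack(">2f", float(i), -float(i)) for i in range(n_complex))
value = read_element(buf, link_offset(1, 0, 0, 0, 0, 0, 0, dims))
```

Moving one step in x skips one full site, i.e. 4 links of 9 complex numbers each, so `value` is entry number 36 of the toy buffer.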
The next record is an ildg-data-lfn record. When the configuration is uploaded somewhere,
the Logical File Name (LFN) is the string used to identify it. The ILDG standard for
constructing LFNs is
LFN = "mc://ldg/"+collaborationName+"/"+projectName+"/"+ensembleName+"/"+fileName;
Therefore this is how we construct it automatically in SIMULATeQCD.
The last record is the checksum, which is a number characteristic of some binary data.
This checksum is technically not required by the ILDG format, so we refer to it
as a scidac-checksum.
Two checksums are calculated for SIMULATeQCD’s ILDG configurations, both of which
are extremely sensitive to changes in the binary file; indeed if even a single bit is
changed, the checksum changes. By comparing the expected checksum saved in the header
of an ILDG file with the calculated checksum upon read in, one can tell whether the file
has been corrupted.
The checksum
What is crc32?
CRC is the abbreviation for Cyclic Redundancy Code, which generates the checksum from
binary data. If the checksum is 32 bits wide, it is called crc32. The aim of the
checksum is to enable the receiver of gauge configurations to determine whether the binary
has been corrupted or not. To do this, the configuration generator constructs a value (called a checksum)
that is a function of the binary data, and appends it to the header. The receiver can then use
the same function to calculate the checksum of the received configuration data and compare it
with the appended checksum to see if the binary data was correctly received. A checksum could
range from an 8-bit to a 32-bit number. At least two aspects are necessary to generate a good
checksum: First, the register should be wide enough to keep the probability of failure low
(e.g. the probability of failure for a 32-bit checksum is 1/2^32). Second, the formula that gives the
checksum should be very sensitive to the input data. To compute the checksum, one takes a CRC
polynomial and performs some bit-wise operations on the binary data with respect to the polynomial. This
checksum is computed on each site. The checksums from all sites are then combined into one by
again performing some bit-wise operations. To make it stronger, one takes a lexicographic rotation
of the checksum coming from each site. As a result one gets two checksums. The function
uint32_t checksum_crc32_sitedata(const char *ptr_buffer, size_t bytes)
computes the crc32 checksum from the site data (4 links, each link a 3x3 matrix of complex numbers).
The function
void checksum_crc32_combine(Checksum *checksum_crc32,size_t global_vol, uint32_t cs_crc32_sd[])
combines all checksums from sites and performs the lexicographic rotation. The function
void checksum_crc32_accumulator(Checksum *checksum_crc32, size_t site_index, char *ptr_buffer, size_t sitedata_bytes)
does the same job as the two functions above, but it should only be used on a single GPU.
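The per-site checksum and the combination step can be sketched in a few lines of Python. This is not SIMULATeQCD's implementation: the function names are hypothetical, `zlib.crc32` is assumed to stand in for the CRC polynomial actually used, and the rotation amounts (site index modulo 29 and modulo 31) are an assumption based on the convention commonly attributed to the SciDAC QIO library.

```python
import zlib

MASK32 = 0xFFFFFFFF


def rotl32(value, shift):
    """Rotate a 32-bit value left by `shift` bits."""
    shift %= 32
    return ((value << shift) | (value >> (32 - shift))) & MASK32


def site_crc32(site_bytes):
    """crc32 of the binary data of one site (4 links of 3x3 complex numbers)."""
    return zlib.crc32(site_bytes) & MASK32


def combine_site_checksums(site_crcs):
    """Fold per-site crc32 values, in lexicographic site order, into two checksums.

    Assumed convention: rotate each site's crc32 by its index modulo 29
    (first checksum) or modulo 31 (second checksum), then XOR everything.
    """
    sum_a = 0
    sum_b = 0
    for index, crc in enumerate(site_crcs):
        sum_a ^= rotl32(crc, index % 29)
        sum_b ^= rotl32(crc, index % 31)
    return sum_a, sum_b


# Four fake sites of 288 bytes each (4 links x 9 complex x 2 floats x 4 bytes).
sites = [bytes([i]) * 288 for i in range(4)]
checksums = combine_site_checksums(site_crc32(s) for s in sites)

# Flip a single bit in one site and recompute.
corrupted = list(sites)
corrupted[2] = bytes([corrupted[2][0] ^ 0x01]) + corrupted[2][1:]
checksums_corrupted = combine_site_checksums(site_crc32(s) for s in corrupted)
```

Because each site's contribution depends on its position, swapping the data of two sites also changes the result, which a plain XOR of the per-site values would not detect.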
How a configuration is saved on the Lattice Data Grid
Once the ILDG configuration is packaged as a LIME file in the way specified above, it is ready to be stored physically somewhere. This physical location where it is stored is called the “Storage Element”. If we want to write a script later to find this saved configuration, the script needs to know where the configuration is stored. Therefore for each ILDG file there is a corresponding XML file stored in the QCDml configuration format. This XML file must validate against the QCDml configuration schema given here. (To learn more about XML files and schemas, you can look e.g. here.)
In addition to the location of the ILDG configuration, the QCDml file knows a bunch of metadata about the configuration, like who made it and what algorithm was used. This QCDml configuration file is then stored somewhere else, called the “File Catalogue”. Scripts that search for ILDG configurations will interact with the File Catalogue.
How ILDG is implemented in SIMULATeQCD
One of the QCDml metadata entries is the location of the configuration on the Lattice Data Grid, which is not known at the time of generation. Therefore some post-processing is always needed to get an ILDG configuration ready for storage.
With this in mind, what is implemented at the time of writing is this: Each ILDG configuration made by SIMULATeQCD is a LIME file with the minimal amount of information required for convenient reading by SIMULATeQCD, whose gauge field is stored in binary according to the convention above. Since we can read LIME format already, we are able to read ILDG configurations. However we cannot control how ILDG readers are implemented in other codes, e.g. QUDA, so a configuration made by SIMULATeQCD will in general require further processing to be readable by other codes.
More about LIME
For detailed information about LIME, see its GitHub project here. A LIME record is packed as follows:
A 144-byte header
The data (maximum of \(2^{63}\) bytes)
Some null padding (0-7 bytes as needed)
The header is organized into 18 64-bit (8 byte) words as follows:
word | content |
---|---|
0 | subheader |
1 | data length in bytes |
2-17 | 128 byte LIME-type |
where the subheader consists of
bits | content |
---|---|
0-31 | LIME magic number |
32-47 | LIME file version number |
48 | message begin bit |
49 | message end bit |
50-63 | reserved |
The long int LIME magic number, \(1164413355_{10}=456789ab_{16}\), is used to identify
a record in LIME format. The version number is a short int. The three integer numbers in
the header, i.e. the magic number, version number, and data length, are written in
IEEE big-endian byte order for their data types, long, short, and long long, respectively.
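The header layout described above maps directly onto a fixed-size binary structure. The sketch below packs and unpacks such a header with Python's `struct` module. The helper names are hypothetical, the default version number of 1 is an assumption, and the bit masks assume that bits are counted from the most significant end of word 0, so that the message begin and message end bits are the top two bits of the last 16-bit field.

```python
import struct

LIME_MAGIC = 0x456789AB  # 1164413355 in decimal

# Big-endian: 32-bit magic, 16-bit version, 16 bits of flags and reserved
# space, 64-bit data length, 128-byte LIME-type string.
HEADER_FORMAT = ">IHHQ128s"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)  # 18 words of 8 bytes

# Assumption: bit 48 and bit 49 of word 0, counted from the most significant bit.
MB_BIT = 0x8000  # message begin
ME_BIT = 0x4000  # message end


def pack_lime_header(lime_type, data_length, msg_begin, msg_end, version=1):
    """Build a 144-byte record header (hypothetical helper)."""
    flags = (MB_BIT if msg_begin else 0) | (ME_BIT if msg_end else 0)
    # struct pads the type string with null bytes up to 128 bytes.
    return struct.pack(HEADER_FORMAT, LIME_MAGIC, version, flags,
                       data_length, lime_type.encode("ascii"))


def unpack_lime_header(header):
    """Decode a 144-byte record header into a dict."""
    magic, version, flags, data_length, raw_type = struct.unpack(
        HEADER_FORMAT, header)
    if magic != LIME_MAGIC:
        raise ValueError("not a LIME record header")
    return {
        "version": version,
        "msg_begin": bool(flags & MB_BIT),
        "msg_end": bool(flags & ME_BIT),
        "data_length": data_length,
        "type": raw_type.rstrip(b"\0").decode("ascii"),
    }


def padding_bytes(data_length):
    """Null bytes needed after the data to reach a multiple of 8 bytes."""
    return (-data_length) % 8


header = pack_lime_header("ildg-format", 300, msg_begin=True, msg_end=True)
fields = unpack_lime_header(header)
```

A reader can walk a LIME file by decoding a header, skipping `data_length + padding_bytes(data_length)` bytes, and repeating until the end of the file.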
MILC
MILC format is the format of the MILC code base. As of v7.7.11, these binaries are always in single precision. They save all three rows.
NERSC
A NERSC format file consists of a simple header
BEGIN_HEADER
DATATYPE = 4D_SU3_GAUGE_3x3
DIMENSION_1 = 8
DIMENSION_2 = 8
DIMENSION_3 = 8
DIMENSION_4 = 4
CHECKSUM = 436aa5c1
LINK_TRACE = 0.002564709374
PLAQUETTE = 0.311637549
FLOATING_POINT = IEEE64BIG
END_HEADER
followed by the binary. The NERSC checksum is essentially a sum over all elements of
all links in the lattice. This checksum is not as sensitive as the ILDG checksum.
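Since the NERSC header is plain text made of `KEY = VALUE` lines, it is straightforward to parse. The following sketch splits a file's contents into a header dictionary and the offset at which the binary begins; the function name is hypothetical, and the sketch assumes the header ends with a newline directly after `END_HEADER`.

```python
EXAMPLE_HEADER = (
    b"BEGIN_HEADER\n"
    b"DATATYPE = 4D_SU3_GAUGE_3x3\n"
    b"DIMENSION_1 = 8\n"
    b"DIMENSION_2 = 8\n"
    b"DIMENSION_3 = 8\n"
    b"DIMENSION_4 = 4\n"
    b"CHECKSUM = 436aa5c1\n"
    b"LINK_TRACE = 0.002564709374\n"
    b"PLAQUETTE = 0.311637549\n"
    b"FLOATING_POINT = IEEE64BIG\n"
    b"END_HEADER\n"
)


def parse_nersc_header(raw):
    """Return (header dict, byte offset of the binary) for NERSC file contents."""
    end_marker = b"END_HEADER\n"  # assumed to be followed directly by the binary
    end = raw.index(end_marker) + len(end_marker)
    lines = raw[:end].decode("ascii").splitlines()
    if lines[0].strip() != "BEGIN_HEADER":
        raise ValueError("missing BEGIN_HEADER")
    header = {}
    for line in lines[1:-1]:
        key, _, value = line.partition("=")
        header[key.strip()] = value.strip()
    return header, end


# Header followed by 16 placeholder bytes standing in for the binary.
raw = EXAMPLE_HEADER + b"\x00" * 16
header, data_offset = parse_nersc_header(raw)
dims = tuple(int(header[f"DIMENSION_{i}"]) for i in range(1, 5))
expected_checksum = int(header["CHECKSUM"], 16)
```

The `DATATYPE` and `FLOATING_POINT` entries tell the reader how many rows were stored and in which precision and byte order, so they should be checked before interpreting the bytes after `data_offset`.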
An advantage of NERSC format is that one has the option of saving only two rows of each
link, then reconstructing the third row on read-in. Such compressed gauge configurations
save a good deal of storage space.
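The reconstruction works because, for a matrix in SU(3), the third row is the complex conjugate of the cross product of the first two rows. A minimal sketch in plain Python follows; the function name is hypothetical and no attempt is made to match SIMULATeQCD's own data layout.

```python
import cmath


def reconstruct_third_row(row0, row1):
    """Third row of an SU(3) matrix from its first two rows.

    Each row is a sequence of three complex numbers. The result is the
    complex conjugate of the cross product row0 x row1.
    """
    a, b = row0, row1
    cross = (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )
    return tuple(complex(c).conjugate() for c in cross)


# Check with a diagonal SU(3) matrix diag(e^{i*alpha}, e^{i*beta}, e^{-i*(alpha+beta)}).
alpha, beta = 0.3, 1.1
row0 = (cmath.exp(1j * alpha), 0j, 0j)
row1 = (0j, cmath.exp(1j * beta), 0j)
row2 = reconstruct_third_row(row0, row1)
expected = cmath.exp(-1j * (alpha + beta))
```

Storing two rows instead of three reduces the size of the binary by one third, at the cost of this small computation for each link on read-in.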
OPENQCD
OPENQCD format is the format of the openQCD library. OPENQCD has a somewhat unconventional setup: It stores the spatial and time dimensions in two integers, the plaquette as a double, and then all the links as \(3\times 3\) complex matrices. However, it only stores links for odd sites, with links going both forward and backward from those sites. To obtain the links for the even sites, one needs to read these in and take the conjugate.