Correlator Class

Here we explain the ideas behind and methods contained in the Correlator header file. The classes described here can be found in src/math/correlators.h. If you have further questions after reading this wiki article, I suggest you have a look at this class yourself, which is thoroughly commented, and will explain in more detail some things that are not fully explained here.

The main class is the CorrelatorTools class, CorrelatorTools<PREC,true,HaloDepth> corrTools; which contains all the methods and attributes needed to correlate objects. What objects are to be correlated? These objects are called CorrField objects, which are essentially arrays of arbitrary type that inherit from the LatticeContainer and include some initialization methods, like the ability to set all elements to zero() or one(). Where are the results stored? Results are stored in objects called Correlator objects, which are basically CorrField objects but indexed slightly differently.

You can correlate whatever kinds of mathematical objects A and B you like; for instance you could correlate Polyakov loops or hadron interpolators. In order to give the user maximum flexibility, I tried to make the calculation of A and B separate from the correlator class. Therefore the general idea of a correlation calculation is to calculate A and B ahead of time yourself, then copy A and B into CorrField objects. What is meant by “correlate”? Well, that is up to you. This framework will allow one to calculate any function \(\langle f(A_x, B_y)\rangle\), where \(f\) is a function that you can design yourself, and \(x\) and \(y\) run over one of a few possible domains. The method

corrTools.correlateAt< fieldType, corrType, f >("domain", CPUfield1, CPUfield2, CPUnorm, CPUcorr);

will carry out this correlator calculation. Here fieldType and corrType are the types of the CorrField and Correlator, respectively. Note that your function \(f\) is implemented as a template parameter. Inside the correlateAt method, a kernel is called depending on the domain you chose. In this method, GPU copies of these CPU arrays will be made and passed to the kernel. The result will be stored in CPUcorr.

The CPUnorm vector is a Correlator-type object that keeps track of how many times a correlator at distance \(r\) was calculated. This is something that you should pre-calculate ahead of time using corrTools.createNorm; the result will be saved in a normalization file that you can load later in your production runs. This is not particularly elegant, and createNorm runs on the CPU only so it’s rather slow. One can think of a more elegant way to implement this or a way to speed it up. Nevertheless this procedure works, and after reading in with corrTools.readNorm, the correlator calculation seems to be rather fast. (See the benchmarking below for details.)

Example correlator calculation

Here is an example taken from src/testing/main_correlatorTest.cpp. In this example we will correlate the \(\mu=2\) component of a gauge field with itself using the 4d all-to-all domain. Here \(f(A,B)={\rm tr}~(AB)\).

gaugeAccessor<PREC> _gauge(gauge.getAccessor());

CorrField<false,SU3<PREC>> CPUfield1(commBase, corrTools.vol4);
CorrField<false,SU3<PREC>> CPUfield2(commBase, corrTools.vol4);
Correlator<false,PREC> CPUnorm(commBase, corrTools.UAr2max);
Correlator<false,COMPLEX(PREC)> CPUcorrComplex(commBase, corrTools.UAr2max);

for(int m=0; m<corrTools.vol4; m++) {
    _CPUfield3.setValue(m, _gauge.getLink(GInd::getSiteMu(m,2)));
    _CPUfield4.setValue(m, _gauge.getLink(GInd::getSiteMu(m,2)));
}

corrTools.correlateAt<SU3<PREC>,COMPLEX(PREC),AxB<PREC>>("spacetime", CPUfield1, CPUfield2, CPUnorm, CPUcorrComplex, XYswapSymmetry = false);

Here, we have set sizes with vol4, i.e. \(N_s^3\times N_\tau\), and UAr2max, which is the largest 4d squared separation possible, given periodic BCs. UA stands for “unrestricted all”; this is opposed to “restricted” (R) domains that do not include all possible spacetime pairs, and “spatial” (S) domains, which only compute spatial pairs. The XYswapSymmetry flag is true if the correlator is insensitive to interchange of \(x\) and \(y\), which can happen for example for some Polyakov loop correlators. Setting this flag to true may decrease computation time by as much as 70% (see the benchmarking below for details). If you aren’t sure what to do, just leave it as false.

Singlet, octet, and average Polyakov loop correlators

This section contains code meant only for constructing certain Polyakov loop correlators. It was implemented by porting C code written by Olaf Kaczmarek into SIMULATeQCD and therefore is not utilizing the Correlator class fully. In principle these could be implemented in the future with the Correlator class. Instead, the following methods, part of the more specific polyakovLoopCorrelator class, can be found in src/modules/gaugeFixing. One can instantiate this class with, e.g. polyakovLoopCorrelator<PREC,true,HaloDepth> PLC(gauge) Various correlators can be constructed from untraced Polyakov loops. Three examples include the singlet, octet, and average. Some useful information about these correlators can be found here, here, and here. The polyakovLoopCorrelator class includes a function to compute these correlators, which can be obtained with the following snippet of code:

const int distmax=corrTools.distmax;
std::vector<PREC> vec_plca(distmax);
std::vector<PREC> vec_plc1(distmax);
std::vector<PREC> vec_plc8(distmax);
std::vector<int>  vec_factor(distmax);
vec_factor=corrTools.getFactorArray();
corrTools.PLCtoArrays(vec_plca, vec_plc1, vec_plc8);

The first line grabs distmax\(=N_s^2/4+1\) so that the resultant vectors vec_* have the correct size. The vectors are indexed by squared distance \(r^2\). Some squared separations are not possible on a lattice in 3D; for example \(r^2=7\) is not allowed. Therefore several of the entries will be zero. The resultant vectors are normalized inside PLCtoArrays by vec_factor, which counts the number of contributions to each correlator at a particular \(r^2\). Hence, vec_factor[r2]==0 for every disallowed distance. NOTE: getFactorArray must be called before PLCtoArrays, as getFactorArray also initializes the factor array. This initialization routine gives back vec_factor because it can be used, for example, if you want to write to file the correlators at allowed distances only. For instance one accomplishes this for the average channel with

for (int r2=0 ; r2<distmax ; r2++) {
    if (vec_factor[r2]>0) {
        if (commBase.MyRank()==0) {
            plcresultfile << std::setw(7) << dx << "  " << std::setw(13) << std::scientific << vec_plca[r2] << std::endl;
        }
    }
}

NOTE: The singlet and octet Polyakov loop correlations depend on the gauge, and in particular, they should be measured in the Coulomb gauge. This means you must fix the gauge before measuring these quantities! Do learn how to do this, please have a look at the gauge fixing page. Don’t forget to re-unitarize before taking your measurements! The singlet, octet, and average correlators are tested in src/testing/main_gfixplcTest.cpp.

Some benchmarks:

This table shows results for the 4d all-to-all tested on NVIDIA Pascal GPUs. This is the time it takes to multiply two SU3 identity matrices with themselves, which is a good estimate for how long it takes to correlate two CorrFields of type SU3 on the lattices of the given size. For reference, 1 minute is 60 000 ms.

Symmetric Results:

\(N_s\)	\(N_\tau\)	UA id3 x id3 [ms]	US id3 x id3 [ms]
8	4	12	5
24	6	3240	78
24	8	5474	78
20	20	9708	48
32	6	15306	470
32	8	25473	469
40	8	96564	1901
48	6	172147	5170
48	8	285101	5174
42	12	270474	2420
48	12	594554	5183
56	8	732532	13296

Asymmetric Results:

\(N_s\)	\(N_\tau\)	UA id3 x id3 [ms]	US id3 x id3 [ms]
8	4	15	5
24	6	4027	97
24	8	6746	96
20	20	13410	47
32	6	22265	633
32	8	37821	632
40	8	155048	2465
48	6	283987	8014
48	8	476607	8016
42	12	448504	3896
48	12	1007574	8012
56	8	1239982	18401