Getting started
There are two possible ways to build SIMULATeQCD. If you are running on your own laptop or desktop and have an NVIDIA GPU, we recommend that you use the container build. The container will automatically grab all the software you need. If you are running on an HPC system or want to use AMD, we recommend you compile manually and ensure that all needed software already exists on the system you're using.
Prerequisites
Before cloning anything, we recommend you get git-lfs. The reason we recommend this is that we have several configurations used to test some of the SIMULATeQCD methods. These are too large to keep in a conventional repository, so we host them using git-lfs. SIMULATeQCD will work without it, but you won't be able to run the tests.
You can install git-lfs by
# For Debian-based systems
sudo apt install git-lfs
# For Arch-based systems
sudo pacman -S git-lfs
and activate it by calling git lfs install. If you do not have superuser privileges where you are, you can use wget as follows:
wget https://github.com/git-lfs/git-lfs/releases/download/v3.0.2/git-lfs-linux-amd64-v3.0.2.tar.gz
tar -xf git-lfs-linux-amd64-v3.0.2.tar.gz
PREFIX=/path/to/install/dir ./install.sh
and you can activate it with /path/to/install/dir/bin/git-lfs install. Finally, you will need to add export PATH=/path/to/install/dir/bin:$PATH to your .bashrc.
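Either way, a quick sanity check that git-lfs is available and on your PATH is:
git lfs version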
How to download the code
First, make sure you have activated git-lfs using git lfs install.
The code can then be cloned to your folder using:
git clone https://github.com/LatticeQCD/SIMULATeQCD.git
If you are using two-factor authentication on GitHub, which is likely the case if you are interested in developing SIMULATeQCD, you will need to use the command
git clone git@github.com:LatticeQCD/SIMULATeQCD.git
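Putting the pieces together, a typical first-time download might look like this; the git lfs pull step simply makes sure the test configurations tracked by git-lfs are actually fetched:
git lfs install
git clone https://github.com/LatticeQCD/SIMULATeQCD.git
cd SIMULATeQCD
git lfs pull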
Build (manual)
If you would like to use AMD, would like to make substantial contributions to SIMULATeQCD, or are running on an HPC system, you will need to do a manual build. If you have an NVIDIA GPU, are running locally, and don’t plan to do much development, we recommend you skip down to the container build section, because the container will take care of all the hassle of finding and installing software automatically.
For the manual build, you will need to make sure the following are installed:
- cmake (some versions have the "--pthread" compiler bug; versions that definitely work are 3.14.6 or 3.19.2)
- a C++ compiler with C++17 support
- MPI (e.g. openmpi-4.0.4)
- CUDA Toolkit version 11+ or HIP
- pip install -r requirements.txt to build the documentation
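Before configuring, it can be worth checking which versions your environment actually picks up, for example:
cmake --version
mpicxx --version   # the exact MPI compiler wrapper depends on your installation
nvcc --version     # CUDA builds only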
Building source with CUDA
To build the source with CUDA, you need to have the CUDA Toolkit
version 11.0 or higher installed on your machine.
To set up the compilation, create a folder outside of the code directory (e.g. ../buildSIMULATeQCD/) and from there call the following example script:
cmake ../SIMULATeQCD/ \
-DARCHITECTURE="80" \
-DUSE_GPU_AWARE_MPI=ON \
-DUSE_GPU_P2P=ON
Here, it is assumed that your source code folder is called SIMULATeQCD. Do NOT compile your code in the source code folder!
You can set the CUDA installation path manually by setting the cmake parameter -DCUDA_TOOLKIT_ROOT_DIR.
-DARCHITECTURE sets the GPU architecture (i.e. the compute capability version without the decimal point). For example, use "70" for Volta or "80" for Ampere.
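For instance, a complete out-of-source configuration for a Volta card with CUDA installed in a non-default location might look like the sketch below; the CUDA path is just an assumed example:
# from inside the SIMULATeQCD source folder: create a sibling build folder
mkdir -p ../buildSIMULATeQCD && cd ../buildSIMULATeQCD
cmake ../SIMULATeQCD/ \
-DARCHITECTURE="70" \
-DUSE_GPU_AWARE_MPI=ON \
-DUSE_GPU_P2P=ON \
-DCUDA_TOOLKIT_ROOT_DIR=/opt/cuda-11.4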
Building source with HIP for NVIDIA platforms (Experimental!)
In order to build the source with HIP for NVIDIA platforms, you need to make sure that
- HIP is properly installed on your machine
- CUDA is properly installed on your machine
- the environment variable HIP_PATH holds the path to the HIP installation folder
- the environment variables CC and CXX hold the path to the HIP clang compiler
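As a rough sketch of the last two points, assuming a ROCm-style installation under /opt/rocm (all paths below are assumptions; adjust them to your own system):
# Assumed installation paths; adjust to your system
export HIP_PATH=/opt/rocm/hip
export CC=/opt/rocm/llvm/bin/clang
export CXX=/opt/rocm/llvm/bin/clang++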
To set up the compilation, create a folder outside of the code directory (e.g. ../buildSIMULATeQCD/) and from there call the following example script:
cmake ../SIMULATeQCD/ \
-DARCHITECTURE="80" \
-DUSE_GPU_AWARE_MPI=ON \
-DUSE_GPU_P2P=OFF \
-DBACKEND="hip_nvidia" \
Here, it is assumed that your source code folder is called SIMULATeQCD. Do NOT compile your code in the source code folder!
You can set the HIP installation path manually by setting the cmake parameter -DHIP_PATH.
You can also set the CUDA installation path manually by setting the cmake parameter -DCUDA_TOOLKIT_ROOT_DIR.
-DARCHITECTURE sets the GPU architecture.
-DUSE_GPU_P2P=ON is not yet supported by this backend.
Building source with HIP for AMD platforms (Experimental!)
In order to build the source with HIP for AMD platforms, you need to make sure that
- HIP is properly installed on your machine
- the environment variable HIP_PATH holds the path to the HIP installation folder
- the environment variables CC and CXX hold the path to the HIP clang compiler
To set up the compilation, create a folder outside of the code directory (e.g. ../buildSIMULATeQCD/) and from there call the following example script:
cmake ../SIMULATeQCD/ \
-DARCHITECTURE="gfx906,gfx908" \
-DUSE_GPU_AWARE_MPI=ON \
-DUSE_GPU_P2P=OFF \
-DBACKEND="hip_amd" \
Here, it is assumed that your source code folder is called SIMULATeQCD. Do NOT compile your code in the source code folder!
You can set the HIP installation path manually by setting the cmake parameter -DHIP_PATH.
-DARCHITECTURE sets the GPU architecture, e.g. gfx906 or gfx908.
-DUSE_GPU_P2P=ON is not yet supported by this backend.
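If you are unsure which gfx target your card corresponds to, one way to look it up (assuming ROCm's rocminfo utility is available) is:
rocminfo | grep -m 1 gfx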
Compiling particular executables
Inside the build folder, you can now begin to use make to compile your executables, e.g.
make NameOfExecutable
If you would like to speed up the compiling process, add the option -j, which will compile in parallel using all available CPU threads. You can also specify the number of threads manually using, for example, -j 4.
You also have the option to compile certain subsets of executables. For instance, make tests will make all the executables used for testing. One can also compile applications, examples, profilers, tools, and everything. To see a full list of available executables, look at SIMULATeQCD/CMakeLists.txt.
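For example, to build all test executables in parallel on 4 threads:
make -j 4 tests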
Build (container)
If you just want to get something running quickly on your laptop or desktop, this is likely the easiest way to go.
Install Podman
On RHEL-based (Rocky/CentOS/RHEL) systems
Before continuing, update your system and install Podman with sudo dnf update -y && sudo dnf install -y podman, then reboot with sudo reboot. (The reboot just makes avoiding permissions/kernel issues easy, because that stuff is reread on boot.)
On Arch-based systems
See the install instructions. If you installed Arch before the upgrade to shadow (as in /etc/shadow) 4.11.1-3, rootless podman may encounter some issues. The build script will check for these anomalies and prompt you if you need to fix them.
Other *NIX Systems
If you have a non-RHEL-based OS, see here for installation instructions.
Make sure Podman works
Run podman run hello-world as your user to test your privileges. If this does not run correctly, the container build will not function.
If you see the error:
ERRO[0014] cannot find UID/GID for user u6042105: No subuid ranges found for user "u6042105" in /etc/subuid - check rootless mode in man pages.
this indicates someone has modified the standard user privileges or you are running an older operating system. To fix this error, run sudo usermod --add-subuids 100000-165535 --add-subgids 100000-165535 <YOUR_USER> && podman system migrate
WARNING: If you are SSH'ing to your server, make sure you SSH as a user and not as root. If you SSH as root and then su to a user, podman will issue ERRO[0000] XDG_RUNTIME_DIR directory "/run/user/0" is not owned by the current user. This happens because the user that originally set up /run is root rather than your user.
Build the code
1. Update podman-build/config.yml with any settings you would like to use for your build. This includes your target output directory.
2. You can run <where_you_downloaded>/simulate_qcd.sh list to get a list of possible build targets.
3. If you want to change where the code outputs to, you need to update OUTPUT_DIRECTORY in podman-build/config.yml. It will create a folder called build in the specified folder.
4. Run chmod +x ./simulate_qcd.sh && ./simulate_qcd.sh build
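Putting these steps together, a typical session from the directory where you cloned the repository might look like:
chmod +x ./simulate_qcd.sh
./simulate_qcd.sh list    # show the available build targets
./simulate_qcd.sh build   # build according to podman-build/config.yml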
How to run
On a cluster using slurm
If you are on a cluster that uses slurm, then inside of your sbatch script do not use mpiexec or mpirun, but instead do
srun -n <NoGPUs> ./<program>
where <NoGPUs> is the number of GPUs you want to use.
The <program> likely already points to some default parameter file. If you would like to pass your own parameters, one way to do this is to make your own parameter file. This can then be passed as an argument like
srun -n <NoGPUs> ./<program> /path/to/your/parameter/file
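For reference, a minimal sbatch script wrapping this call could look like the sketch below; the #SBATCH directives are placeholders, and the right partition, time limit, and GPU request depend on your cluster:
#!/bin/bash
#SBATCH --job-name=simulateqcd   # illustrative job name
#SBATCH --nodes=1                # placeholder resource request
#SBATCH --gres=gpu:4             # placeholder: request 4 GPUs (cluster-specific syntax)
srun -n 4 ./<program> /path/to/your/parameter/file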
Example parameter files can be found in the parameter folder. We have tried to explain what each parameter means in each .param file. You can learn more about how general parameters are implemented here.
On your local machine (desktop, laptop, …)
Unless you are using only one GPU, any program has to be launched using mpirun or mpiexec.
For example:
mpiexec -np <NoGPUs> ./<program>
Similarly to the above, if you want to pass your own parameter file,
mpiexec -np <NoGPUs> ./<program> /path/to/your/parameter/file
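As an illustration, a single-GPU run can typically be launched directly, while a two-GPU run goes through MPI:
# Single GPU: launch the program directly
./<program> /path/to/your/parameter/file
# Two GPUs: launch through mpiexec
mpiexec -np 2 ./<program> /path/to/your/parameter/file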