Basic CST Simulation Using INSA Cluster (Tutorial)

From HERMES Wiki

Revision as of 16:15, 13 January 2025

Prerequisites

This tutorial focuses on running a CST Studio Suite simulation on the research cluster. Before doing so, however, you must first get access to the cluster. To assist users with this task, the IT System Direction (DSI) has created a user guide, which can be found here (you will need to log in to INSA's intranet to access it). A version in French can be found here.

The first section of the guide presents the infrastructure of the cluster, and the second one explains how to get access rights and establish an SSH connection. Before continuing with this tutorial, please read the first two sections of the user guide and follow the steps indicated there. Once you have a working SSH connection, you will be able to follow this tutorial and launch a CST simulation.

Configuring an SSH key

Sending a simulation file to the cluster

The first step in running a simulation on the cluster is to copy the CST model to our personal space on the cluster. We can do this with an SFTP client. On a Windows machine, WinSCP is a good choice. Alternatively, you can use FileZilla, which has versions for Windows, GNU/Linux and macOS.

When you first run WinSCP, it shows the following screen, where you can enter the research cluster address and your login details (make sure that the "file protocol" option is "SFTP" and the "port number" is "22").

If for any reason this window does not open by default when you launch the program, you can open it manually by clicking on "New tab" in the main WinSCP window.

If you do not specify a password at the login screen, you will be prompted to do so during the connection process.

Once you are connected, WinSCP will show the files of your local computer in the left pane, and those of the cluster in the right pane. Note that, by default, the right pane shows the files of your personal space in the cluster storage system (/data00/.mcalcul/username/).

Using the left pane, we navigate to the folder where we have stored the CST model to be simulated on the cluster (for this tutorial, the file is called "EQA_FiveRings_TD.cst"). Note that all the simulation options (mesh, solver settings, etc.) must be configured on the local PC before this file is sent to the cluster.

We right-click on the name of the file and, in the context menu that appears, move the cursor to the "Upload..." option. This opens a submenu, where we click on "Upload...".

A pop-up window will open, asking in which path of the cluster we would like to store the file. We make sure that our personal directory is selected (/data00/.mcalcul/username/), and we click OK.

After a few seconds, the file will be uploaded and we will be able to see it in the right pane of the WinSCP application.
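For readers who prefer a console, the same upload can be sketched with the OpenSSH scp client. This is only an illustration and not part of the WinSCP procedure above: CLUSTER_HOST is a placeholder that must be replaced with the address given in the DSI user guide, and "username" stands for your own login. The snippet only prints the command so that it can be reviewed before it is run.

```shell
# Placeholders (assumptions): replace them with your own values.
CLUSTER_USER="username"                          # your cluster login
CLUSTER_HOST="cluster.example.org"               # NOT the real address; see the DSI user guide
REMOTE_DIR="/data00/.mcalcul/${CLUSTER_USER}/"   # personal space, as shown in WinSCP
MODEL="EQA_FiveRings_TD.cst"                     # model already configured on the local PC

# scp takes the port with -P (capital letter); 22 is the port used in this tutorial.
UPLOAD_CMD="scp -P 22 ${MODEL} ${CLUSTER_USER}@${CLUSTER_HOST}:${REMOTE_DIR}"

# Print the command for review; paste it into the console to perform the upload.
echo "${UPLOAD_CMD}"
```

Printing the command first makes it easy to check the destination path before any file is transferred.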

Launching a simulation

Once the CST model has been transferred to the cluster, we connect via SSH to launch the simulation. To do so, open a console and use the following command (replacing lpololop with your username).
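The command appears only as a screenshot in the original page. A typical OpenSSH invocation has the form sketched below. The host name used here is a placeholder (an assumption, not the real cluster address, which is given in the DSI user guide), and the snippet only prints the command instead of opening a connection.

```shell
# Placeholders (assumptions): replace them with your own values.
CLUSTER_USER="lpololop"               # replace with your username
CLUSTER_HOST="cluster.example.org"    # NOT the real address; see the DSI user guide

# ssh takes the port with -p (lowercase letter); 22 is the default SSH port.
SSH_CMD="ssh -p 22 ${CLUSTER_USER}@${CLUSTER_HOST}"

# Print the command for review; type it in the console to open the session.
echo "${SSH_CMD}"
```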

After pressing Enter, some information regarding the use of the cluster will be displayed, and you will be prompted to enter your password. Type your password and press Enter. Please note that no characters are displayed in the console while you type your password. This is completely normal.

After entering your password, some information regarding the current workload of the cluster will be displayed, and you will be presented with a command-line interface to navigate the cluster file system and launch jobs.

It must be noted that the command prompt has changed from:

C:\Users\lpololop>

To:

lpololop@crt:~$

This indicates that the commands we enter in our console are now executed on the cluster, and not on our local computer. For this reason, we must be very careful, since our actions could potentially impact the work of our colleagues. The cluster runs a GNU/Linux operating system. You can find a reference of frequently used commands [here](https://github.com/RehanSaeed/Bash-Cheat-Sheet).

Some utilities have been installed in the cluster to assist you when launching a CST simulation. We can launch the job execution utility by issuing the following command:

This will launch a wizard that will help us launch our simulation. First, we are asked in which queue of the cluster we would like to execute our simulation. Detailed information on the hardware characteristics of the queues can be found in the aforementioned cluster user guide.

Whenever possible, we will launch our simulations on the insa-cpu queue. The nodes of the insa-gpu queue have the same CPU capabilities as those of insa-cpu, so we could also use them if all the insa-cpu nodes are busy. Nevertheless, we should opt for insa-cpu nodes whenever possible in order to leave the insa-gpu nodes to other cluster users who may need nodes with GPU capabilities (for the moment, GPU acceleration of CST simulations is not possible).

Alternatively, we can also launch our simulations on the insa-cpu-lte queue. These nodes are slightly less powerful than the insa-cpu nodes, but they can still handle many heavy simulations.

To see which nodes of the cluster are currently in use we can access the cluster monitoring web application. In the screenshot below we can see that the three nodes of the insa-cpu-lte queue (crn04, crn05 and crn06) are free. Therefore, we select this queue for the simulation.
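As a console alternative to the monitoring web application, the Slurm sinfo command can list the state of the nodes. The sketch below assumes that the queue names used in this tutorial are also the names of the Slurm partitions. This tutorial does not confirm that, so treat the output as indicative only.

```shell
QUEUE="insa-cpu-lte"   # queue chosen in this tutorial; assumed to be a Slurm partition name

if command -v sinfo >/dev/null 2>&1; then
    # -p selects the partition, -N prints one line per node, and
    # -o "%N %T" shows the node name followed by its state (idle, allocated, mixed...).
    sinfo -p "${QUEUE}" -N -o "%N %T"
else
    echo "sinfo not found: run this on the cluster after connecting via SSH"
fi
```

Nodes reported as idle are the ones that CST can take over exclusively.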

Now, we are prompted to enter the path of the file that contains the CST model to simulate. Tip: While typing the path, you can use the Tab key to autocomplete folder and file names.

The following screen allows us to select the solver to use. For this example we will choose the Time Domain solver. Note that this menu also allows us to launch the optimizer and the parameter sweep. If either of these tools is selected, the solver currently defined in the CST project will be used.

In the next screen we select "Distributed Computing" as the cluster acceleration method to use.

After that, we specify the number of cluster nodes that we want to use. Since we have selected the insa-cpu-lte queue, we could specify a maximum of three nodes, which is the total number of nodes in this queue. However, for this tutorial we are going to use just two nodes.

Note that CST requires exclusive use of the nodes to launch the simulation. This means that if the chosen queue does not have as many completely free nodes as we have requested, our simulation will be put in a waiting queue until the computational resources are available.

The following screen allows us to specify a maximum computation time for the simulation. This makes it possible to stop the simulation automatically if something unexpected happens and it takes longer than it should. If no time is specified, the simulation will run for a maximum of 15 days.

Finally, we are shown a summary of all the options that we have selected for our simulation. We can press Enter to confirm or Ctrl+C to cancel. In fact, we can also use Ctrl+C at any of the previous screens to abort the process if we make an error at any of the configuration steps.

After pressing Enter, the number assigned to our job will be displayed. We can use this number to check the progress of our simulation.

Monitoring the progress of a simulation

The command squeue -u username returns a list of the current jobs submitted by the specified user. Each job is identified by a jobid, and we can see other information such as the nodes where it is running or how long it has been running.

By using the command scontrol show job jobid we can retrieve more details on a specific job.
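The two commands above can be typed as follows. The job number 123456 is a hypothetical value used only for illustration; use the number displayed by the wizard when the job was submitted. The check on squeue simply keeps the snippet harmless when it is run outside the cluster.

```shell
CLUSTER_USER="${USER:-username}"   # login name on the cluster
JOB_ID="123456"                    # hypothetical job number; use the one shown at submission

if command -v squeue >/dev/null 2>&1; then
    # List all the jobs currently submitted by this user.
    squeue -u "${CLUSTER_USER}"
    # Show the detailed record of one job (nodes, time limit, state...).
    scontrol show job "${JOB_ID}"
else
    echo "squeue not found: run this on the cluster after connecting via SSH"
fi
```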

CST also provides a specific utility to monitor the progress of a simulation. This tool (installed in the ClusterUtilities folder) is called cst_job_connect. When launching this tool, we must specify the path of the CST file by using the -d parameter, as shown in the screenshot below.
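In text form, the call could look like the sketch below. Only the tool name and the -d parameter come from this tutorial. The location of the ClusterUtilities folder is not given here, so the snippet assumes that cst_job_connect is reachable through the PATH and prints a hint otherwise.

```shell
# Path of the model on the cluster ("username" is a placeholder for your login).
MODEL_PATH="/data00/.mcalcul/username/EQA_FiveRings_TD.cst"

if command -v cst_job_connect >/dev/null 2>&1; then
    # -d takes the path of the CST file whose simulation we want to follow.
    cst_job_connect -d "${MODEL_PATH}"
else
    echo "cst_job_connect is not on the PATH: call it from the ClusterUtilities folder"
fi
```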

The tool will connect to the simulation and show its progress as well as the most recent messages and warnings. This screen updates automatically to reveal new information. We can close the utility by pressing Ctrl+C.

Alternatively, we can also follow the progress of the simulation by looking at the log file. To do so, we go back to WinSCP and click on the "refresh" button.

The files corresponding to the simulation that we just launched will appear in the right pane of the window. IMPORTANT: The folder pointed to by the arrow contains the data used by CST for the simulation, but the log file is not there. So please do not open this folder.

If we scroll down, we will see the log file next to the *.cst file that contains our simulation model. It has the same name as the CST model, with the suffix ".slurm.jobid.log".

If we double-click on this file, it will be opened in our default plain text editor. In this file we can find the progress of the simulation up to the current moment. However, we will need to close and reopen it in order to see any updates.
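To avoid reopening the file, the log can also be followed from the SSH session with tail -f, which prints new lines as they are written. The helper below is a sketch: find_job_log is a hypothetical name, and the wildcard covers both possible readings of the naming rule (with or without the ".cst" extension before ".slurm").

```shell
# find_job_log MODEL_PATH JOB_ID
# Prints the path of the Slurm log stored next to the model, if it exists.
find_job_log() {
    model="$1"
    job_id="$2"
    # The log shares the model name and ends in ".slurm.<jobid>.log"; the
    # wildcard tolerates the ".cst" extension being kept or dropped.
    for candidate in "${model%.cst}"*".slurm.${job_id}.log"; do
        if [ -f "${candidate}" ]; then
            echo "${candidate}"
            return 0
        fi
    done
    return 1
}

# Example (on the cluster; 123456 is a hypothetical job number):
# tail -f "$(find_job_log /data00/.mcalcul/username/EQA_FiveRings_TD.cst 123456)"
```

Press Ctrl+C to stop tail; this only closes the viewer and does not affect the simulation.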

Retrieving the results

We will know that the simulation has finished because it will no longer be displayed when we use the command squeue -u username.
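Instead of running squeue by hand, the check can be automated with a small polling loop. This is a minimal sketch: wait_for_job is a hypothetical helper, not one of the utilities installed on the cluster, and it relies on squeue printing nothing for a job that is no longer queued or running.

```shell
# wait_for_job JOB_ID [INTERVAL_SECONDS]
# Polls squeue until the job no longer appears in the list.
wait_for_job() {
    job_id="$1"
    interval="${2:-60}"
    # -h drops the header line and -j restricts the listing to one job,
    # so an empty output means the job is no longer queued or running.
    while [ -n "$(squeue -h -j "${job_id}" 2>/dev/null)" ]; do
        sleep "${interval}"
    done
    echo "Job ${job_id} is no longer listed by squeue"
}

# Example (on the cluster; 123456 is a hypothetical job number):
# wait_for_job 123456 120
```

A polling interval of a minute or more keeps the load on the scheduler negligible.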

If we now check the log file, we should find something like this.

To download the simulation results to our local computer, we must select both the file with the CST model and the folder with the simulation data. In our example, the names of these two elements are, respectively, "EQA_FiveRings_TD.cst" and "EQA_FiveRings_TD". We can select both of them by holding the Ctrl key and clicking on each of them. Then we right-click on one of them to open the context menu, where we select the "Download..." option. A submenu will open, where we select "Download..." too.

Then, a new window will pop up asking in which folder we want to store the downloaded files. After specifying the path, we can click OK to proceed with the download.

Once the download is finished, we can open the downloaded CST file as we would any other CST model on our PC. The simulation results computed by the cluster will be available in the Navigation Tree.
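The same download can be sketched from a local console with scp, where the -r option copies the results folder recursively. As before, CLUSTER_HOST is a placeholder and not the real cluster address, LOCAL_DIR is a hypothetical destination folder, and the snippet only prints the command for review.

```shell
# Placeholders (assumptions): replace them with your own values.
CLUSTER_USER="username"               # your cluster login
CLUSTER_HOST="cluster.example.org"    # NOT the real address; see the DSI user guide
REMOTE="${CLUSTER_USER}@${CLUSTER_HOST}:/data00/.mcalcul/${CLUSTER_USER}"
LOCAL_DIR="./cst_results"             # hypothetical local destination folder

# Both the model file and the folder with the simulation data are needed;
# -r makes scp copy the folder and everything inside it.
DOWNLOAD_CMD="scp -P 22 -r ${REMOTE}/EQA_FiveRings_TD.cst ${REMOTE}/EQA_FiveRings_TD ${LOCAL_DIR}/"

# Print the commands for review; run them from the local computer.
echo "mkdir -p ${LOCAL_DIR} && ${DOWNLOAD_CMD}"
```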