Visualization Cluster Basics

One question I receive frequently is how we set up the visualization clusters used at VRAC|HCI. Here is a very brief overview of how to set up a visualization cluster on Linux, along with some best-practice advice I’ve learned over the years.

Red Hat Enterprise Linux 6

(most of these instructions directly apply to older RHEL versions and other Linux distributions)

Before you begin, you will need: an nVidia GPU and a Quadro Sync card (for multi-GPU installations)

  • Install a basic RHEL system.
  • Be sure to install packages such as gdm and xorg, among others.
  • Best practice is to disallow auto-updates of packages such as the kernel and mesa so that an update does not break your nVidia driver install. You can still update these packages, but know that you’ll have to rebuild your nVidia driver afterward. Blocking packages with Yum is done by adding an exclude line to the [main] section of the /etc/yum.conf file. Optionally, I like to remove the rhgb package and block its future installation so I get verbose boot messages.
exclude= mesa-libGL, mesa-libGL-devel, xorg-x11-server-Xorg, rhgb, kernel*
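If you repeat this on several nodes, the exclude step can be scripted. The following is a minimal sketch, not a definitive implementation: the helper name add_yum_exclude is hypothetical, and it assumes that [main] is the last (or only) section in the file, as in a stock /etc/yum.conf, so that an appended line lands inside [main].

```shell
#!/bin/sh
# Sketch only: add_yum_exclude is a hypothetical helper, not part of yum.
# It appends an exclude line to the given config file unless one exists.
# Assumption: [main] is the last (or only) section in the file, so a line
# appended at the end of the file belongs to [main].
add_yum_exclude() {
    conf="$1"
    # Same package list as in the text; yum accepts space-separated names.
    excludes='mesa-libGL mesa-libGL-devel xorg-x11-server-Xorg rhgb kernel*'
    if grep -q '^exclude=' "$conf"; then
        echo "exclude line already present in $conf; merge by hand" >&2
        return 1
    fi
    printf 'exclude=%s\n' "$excludes" >> "$conf"
}

# Example (as root), after backing up the file:
#   cp -p /etc/yum.conf /etc/yum.conf.bak
#   add_yum_exclude /etc/yum.conf
```

The helper refuses to touch a file that already has an exclude line, since merging two package lists automatically is easy to get wrong.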
  • Disable Nouveau! This kernel module conflicts with the nVidia driver. Disabling Nouveau takes two steps: first, block it in your grub.conf file; second, block the module with a modprobe blacklist conf file. Once you’ve done both steps, you’ll need to reboot your system!
    • Edit your /etc/grub.conf file and append the text rdblacklist=nouveau nouveau.modeset=0 to your kernel lines. (Optionally, you can remove the rhgb option while you are making this edit.)
    • Create a file named /etc/modprobe.d/blacklist-nouveau.conf and add the following text to that file:
blacklist nouveau
options nouveau modeset=0
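The two Nouveau steps above can be sketched as a pair of shell functions. Both function names are hypothetical, and the sed expression assumes a legacy GRUB grub.conf in which each boot entry has a line beginning with "kernel". Treat this as an illustration and back up both files before trying it.

```shell
#!/bin/sh
# Sketch only: both helpers are hypothetical names used for illustration.

# Append the Nouveau-disabling arguments to every "kernel" line in a
# legacy GRUB config, skipping lines that already carry them.
append_nouveau_kernel_args() {
    conf="$1"
    sed -i \
        -e '/^[[:space:]]*kernel /{' \
        -e '/rdblacklist=nouveau/!s/$/ rdblacklist=nouveau nouveau.modeset=0/' \
        -e '}' "$conf"
}

# Write the modprobe blacklist file into the given directory
# (defaults to /etc/modprobe.d).
write_nouveau_blacklist() {
    dir="${1:-/etc/modprobe.d}"
    cat > "$dir/blacklist-nouveau.conf" <<'EOF'
blacklist nouveau
options nouveau modeset=0
EOF
}

# Example (as root), followed by a reboot:
#   cp -p /etc/grub.conf /etc/grub.conf.bak
#   append_nouveau_kernel_args /etc/grub.conf
#   write_nouveau_blacklist
```

Running append_nouveau_kernel_args twice leaves the file unchanged the second time, so it is safe to rerun after adding a new kernel entry.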
  • Install your nVidia driver. Make sure you’ve rebooted since disabling Nouveau in the previous steps. Visit the nVidia website and download the latest Unix driver. Note that the latest driver does not always contain the features your card requires, so don’t be afraid to test older versions if you encounter issues. Once you have the driver downloaded, you’ll need to put your system in runlevel 3 (non-graphics mode), and, assuming you want to compile or develop applications, you’ll want to add the option that installs the OpenGL headers. Here is the series of commands you should execute as the root user from a text console (for example, switch to one with Ctrl-Alt-F3). Substitute the name of the driver file you downloaded for the placeholder. If the driver install asks to modify your xorg file, let it; we’ll overwrite the xorg file again in the next step. When you run the init 5 command you should see the nVidia logo, and your graphics should start just fine.
init 3
sh <nVidia-driver-file>.run --opengl-headers
init 5
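Before running the installer, it can be worth confirming that Nouveau really is gone after the reboot. Here is a small pre-flight sketch; the function name nouveau_unloaded is hypothetical, and it simply inspects lsmod-style text on standard input.

```shell
#!/bin/sh
# Sketch only: nouveau_unloaded is a hypothetical helper.
# Reads lsmod-style output on stdin and succeeds only when no line
# starts with the module name "nouveau".
nouveau_unloaded() {
    ! grep -q '^nouveau[[:space:]]'
}

# Example:
#   lsmod | nouveau_unloaded || echo "Nouveau is still loaded; recheck the blacklist and reboot"
```

Reading from standard input keeps the check easy to try against saved lsmod output from another node.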
  • Configure your xorg.conf file! This is not always simple, and which options you use depends entirely on your setup. However, I will include a few suggestions that benefit almost every installation. You can configure the /etc/X11/xorg.conf file manually, which is what I typically do, but for the sake of this document I’m going to recommend you use the nvidia-xconfig command first. The no-composite option has been required for every cluster I’ve worked with; it’s essentially mandatory if you plan to do 3D stereo. Run nvidia-xconfig -A to see all the available options.
nvidia-xconfig --no-composite
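To confirm the option took effect on each node, you can look for the Composite setting in the generated file. This sketch assumes the setting is recorded as Option "Composite" "Disable" inside an Extensions section; the function name composite_disabled is hypothetical.

```shell
#!/bin/sh
# Sketch only: composite_disabled is a hypothetical helper.
# Assumption: the Composite extension is turned off with a line of the
# form   Option "Composite" "Disable"   in the X config file.
composite_disabled() {
    grep -Eq '^[[:space:]]*Option[[:space:]]+"Composite"[[:space:]]+"Disable"' "$1"
}

# Example:
#   composite_disabled /etc/X11/xorg.conf || echo "Composite is not disabled"
```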
  • Create a generic user on the image nodes. Use the useradd command for this. This is optional but gives you extra options to control your environment. I’ll use vrac as my generic user example.
  • Modify your GDM preferences.
    • On all of your cluster image nodes, edit your /etc/gdm/custom.conf file and add a section to allow remote TCP X connections. If you created the optional user mentioned above, you can have that user log in automatically by adding some additional options to the same file.
    • xhost – The last step in allowing remote applications to connect to the nodes graphically is having the image node(s) run an xhost allow command. There are a couple of ways I like to do this:
      1. Append the command /usr/bin/xhost + to the end of /etc/gdm/Init/Default and /etc/gdm/PostLogin/Default. Note: If you are concerned about security, you can limit your xhost to a specific set of hosts. This is something I prefer to handle with iptables (firewall).
      2. Create a file named .config/autostart/xhost.desktop in the home directory of your auto-login user with the following text:
      [Desktop Entry]
      Type=Application
      Name=No name
      Exec=xhost +
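The GDM changes described above can be sketched as follows. This assumes the GDM 2.x key names DisallowTCP (in the [security] section) and AutomaticLoginEnable / AutomaticLogin (in the [daemon] section); the function name write_gdm_custom_conf is hypothetical, and vrac is the example user from this article. Note that the sketch replaces the whole file, so it keeps a backup copy first.

```shell
#!/bin/sh
# Sketch only: write_gdm_custom_conf is a hypothetical helper.
# Assumption: GDM 2.x key names (DisallowTCP, AutomaticLoginEnable,
# AutomaticLogin). The target file is overwritten, so any existing
# copy is saved as <file>.bak first.
write_gdm_custom_conf() {
    conf="$1"
    user="$2"
    if [ -f "$conf" ]; then
        cp -p "$conf" "$conf.bak"
    fi
    cat > "$conf" <<EOF
[daemon]
AutomaticLoginEnable=true
AutomaticLogin=$user

[security]
DisallowTCP=false
EOF
}

# Example (as root) on each image node:
#   useradd vrac
#   write_gdm_custom_conf /etc/gdm/custom.conf vrac
```

GDM reads this file at startup, so the change takes effect the next time GDM is restarted, for example by switching to runlevel 3 and back to 5.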
  • EDID – When you boot a Linux machine, your nVidia GPU will only activate the DVI ports on which it detects an EDID. This can be a problem if your connection to the display is broken due to a video switcher or failed hardware. The only way to recover is to reboot after resolving the issue, and this is a pain. Thankfully, nVidia provides a means for users to retrieve the EDID information from the monitor, store it in a file, and use that stored file at boot time to configure the port. This eliminates potentially time-consuming support headaches. Doing this is relatively simple; check out this page on EDID Spoofing.
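Since a stored EDID file is only useful if it was captured intact, a quick sanity check before relying on it can save a trip to the machine room. The sketch below tests three properties of the EDID format: the fixed 8-byte header, a size that is a multiple of 128 bytes, and a first block whose bytes sum to zero modulo 256. The function name edid_looks_valid and the file path in the example are hypothetical.

```shell
#!/bin/sh
# Sketch only: edid_looks_valid is a hypothetical helper.
# Checks a binary EDID dump for the fixed header, a plausible size,
# and a correct checksum on the first 128-byte block.
edid_looks_valid() {
    file="$1"
    # The first 8 bytes of every EDID are 00 FF FF FF FF FF FF 00.
    header=$(od -An -tx1 -N8 "$file" | tr -d ' \n')
    [ "$header" = "00ffffffffffff00" ] || return 1
    # EDID data comes in 128-byte blocks (base block plus extensions).
    size=$(wc -c < "$file" | tr -d ' ')
    [ "$size" -ge 128 ] && [ $((size % 128)) -eq 0 ] || return 1
    # All 128 bytes of the base block must sum to 0 modulo 256.
    sum=$(od -An -tu1 -v -N128 "$file" |
        awk '{ for (i = 1; i <= NF; i++) s += $i } END { print s % 256 }')
    [ "$sum" -eq 0 ]
}

# Example (the path is an assumption):
#   edid_looks_valid /etc/X11/edid-dfp0.bin || echo "EDID file looks corrupt"
```

A file that passes can still describe the wrong monitor, so this only guards against truncated or corrupted captures.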