Quickstart

Remote Access

VPN

Remote access to CITA is available through a VPN (Virtual Private Network). The VPN tunnels through CITA's firewall, making it easy to log in to our servers and access various resources. An SSH gateway is also available for your initial login (to set up your account and obtain the VPN configuration files), for legacy use, or for quick connections from a lightweight setup such as a computer on which you cannot install new software.

CITA's workstations and servers are all firewalled. You will not be able to SSH into, or rsync/scp files to, a workstation or server from any machine outside of CITA's network without using the VPN or the SSH gateway (the sole exception is moving files in from the SciNet system -- see SciNet Tips for more information).

To transfer data from a remote machine into CITA's storage spaces, you can either use the VPN, or see the instructions on the Data Transfer through Gateway page to use our SSH gateway to receive your files. To transfer data from SciNet, see SciNet Tips.
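As a rough sketch, once the VPN is connected a transfer is an ordinary rsync straight to a CITA machine; the hostname, username and destination directory below are placeholders, and the per-user directory on scratch-lustre is an assumption -- substitute your own machine and path:

rsync -av --progress mydata/ your_username@your_machine.cita.utoronto.ca:/mnt/scratch-lustre/your_username/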

Gateway

You can also access CITA computers by logging in through the gateway machine gw.cita.utoronto.ca (alternatives, in case it is down, are gw2 and gw3). If your CITA machine is connected to the internal DHCP network and you know its IP address, you can also reach it through the gateways.

Please note that the gateway provides limited services and is used only as an access point -- you cannot run most commands or do work on this machine, so you must ssh from gw into another machine (such as trinity or your personal workstation).
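For example, a two-hop login through the gateway looks roughly like this (your_username is a placeholder, and any CITA machine can stand in for trinity):

ssh your_username@gw.cita.utoronto.ca   # land on the gateway (limited shell)
ssh trinity                             # hop from gw to a working machine

If your local machine runs OpenSSH 7.3 or newer, the hop can be automated with a ProxyJump entry in ~/.ssh/config -- a sketch, with an arbitrarily chosen host alias, assuming gw can resolve the short internal hostnames:

Host cita-trinity
    HostName trinity
    User your_username
    ProxyJump your_username@gw.cita.utoronto.ca

After that, ssh cita-trinity (and scp/rsync to cita-trinity) goes through the gateway transparently.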

See [1] (http://wiki.cita.utoronto.ca/mediawiki/index.php/Using_Our_Gateway) for more info.

Computers

Desktops

Everyone has their own desktop for personal use. Each desktop has the same environment, so the same /home space, the same mailbox folder in /var/spool/mail, and the same paths to software are visible on every machine.

Each machine also has its own /scratch space that is available for general use. This space is never backed up. It is visible on all other machines through the NFS mount point /cita/scratch/your_machine_name.

We offer workstations running CentOS 6 or 7 -- see the Desktops page for more details. We recommend CentOS 6 if you are doing a lot of computational work on your workstation, since it offers the fastest access to our storage systems and access to all software modules, and code built on it is binary compatible with the Sunnyvale cluster. The upgrade to CentOS 7 will occur in late 2018.

Servers

There are several general purpose servers for interactive use -- prawn, kingcrab, lobster, homard, mussel, elk and trinity (aka moose). If your desktop isn't up to the task, these servers are the place to go.

General purpose/high-memory:

prawn     96GB/Intel 12-cores
kingcrab 128GB/Intel 16-cores
lobster  256GB/Intel 16-cores
homard   256GB/Intel 16-cores
mussel  1024GB/AMD   32-cores

Interactive nodes (for tasks with lower memory needs):

trinity   8GB/Intel 8-cores
elk       8GB/Intel 8-cores

Sunnyvale Cluster

The Sunnyvale Cluster contains approximately 120 nodes and is set up to run parallel jobs through a batch system. You must log in to a front-end node to use the cluster and submit jobs. These machines have a separate /home space but use the same user accounts/passwords as all other CITA machines -- if you have a general CITA user account it will work for Sunnyvale too. Check here [2] (http://wiki.cita.utoronto.ca/mediawiki/index.php/Sunnyvale) for detailed info.

Sunnyvale front-end:

bubbles (aka ricky)       24GB/Intel 8-cores

The front-end node can be used for interactive parallel code development and shorter tasks and is the access point to the rest of the cluster.
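From a CITA machine (or over the VPN), logging in is a plain ssh to the front end; the short hostname below assumes you are inside the CITA network, and you should add your_username@ if your local username differs from your CITA one:

ssh bubbles    # Sunnyvale front end (aka ricky)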

Sunnyvale Queues

Sunnyvale has the following batch queues:

workq: General purpose queue for single-node or single-core jobs. Any available node will be used, starting with the slowest nodes. The allowed number of processes per node (ppn) is in the range 1 to 8, and the nodes parameter must equal 1, e.g. #PBS -l nodes=1:ppn=8. You will be allocated a minimum of 2GB/core.

fastq: 24 nodes : 8 cores/node : 24GB/node : gigabit interconnect : max ppn=8 : Older generation Intel

hpq: 49 nodes : 12 cores/node : 32GB/node : Infiniband interconnect : max ppn=16 : Intel Sandybridge

sandyq: 35 nodes : 16 cores/node : 64GB/node : Infiniband interconnect : max ppn=16 : Intel Sandybridge

greenq: 10 nodes : 32 cores/node : 128GB/node : Infiniband interconnect : max ppn=32 : Intel Skylake

Remember to add a line such as #PBS -q sandyq to your batch script so your job is submitted to the desired queue.
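Putting this together, a minimal sandyq batch script might look like the sketch below; the job name, walltime, module and executable are placeholders, and only the -q and -l lines come from the queue specifications above:

#!/bin/bash
#PBS -N my_job                  # job name (placeholder)
#PBS -q sandyq                  # queue to submit to
#PBS -l nodes=2:ppn=16          # two sandyq nodes, 16 cores each
#PBS -l walltime=24:00:00       # requested wall-clock time (placeholder)

cd $PBS_O_WORKDIR               # start in the directory the job was submitted from
module load intel               # load whatever modules your code needs (example)
mpirun -np 32 ./my_code         # placeholder executable; 32 = nodes x ppn

Submit it with qsub myscript.pbs and check its status with qstat -u your_username; the exact mpirun invocation may differ depending on the MPI module you load.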

For more information on submitting batch jobs, read here (http://wiki.cita.utoronto.ca/mediawiki/index.php/Sunnyvale#PBS).

GPU-enabled machines

grizzly - AMD GPU - contact Prof. Ue-Li Pen

Storage

Our storage comes in two types, backed-up and scratch. The larger storage systems run Lustre, a scalable network-based file system that can handle high I/O bandwidth, and Gluster, another distributed file system. All interactive servers, the Sunnyvale cluster nodes and most CITA workstations directly mount the Lustre filesystems. Our smaller storage systems, which include user home space and desktop scratch space, are served over NFS, the network filesystem.

Any storage labelled 'scratch' is NOT backed up or guaranteed to be protected from data loss, though scratch-lustre and scratch-gl have historically lost only a minimal amount of data over the past 5 years and are generally quite reliable. Local /scratch on workstations is somewhat less reliable, though data loss there is also rare. That said, if you absolutely cannot afford to lose data, do not store it on a scratch filesystem.

Home

All users are allocated 10-20GB of space on CITA's main home disks. This space is backed up daily, in multiple locations, so we recommend storing important small files such as code and documents here. Larger files that would take up much of your quota, or exceed it, will need to be stored on another filesystem (see below). Also please note that the home space is not designed for heavy-duty I/O, i.e. rapidly reading/writing a few very large files or a very large number of small files, as this slows down all of CITA's workstations. Please use local scratch or one of the Lustre filesystems below for your heavy-duty I/O.
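To see how much of your home quota you are using, something like the following works on most Linux machines (the quota command assumes quotas are reported through the standard quota tools; if it prints nothing useful, du still gives the total):

du -sh $HOME    # total size of your home directory
quota -s        # per-filesystem quota report, in human-readable units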

Local

/scratch          Varying-size (few GB to multi-TB, though generally approx 100GB) local workstation scratch space, on a disk inside the machine itself rather than a network mount
/scratch-local    Additional varying-size local scratch space available on a few workstations, e.g. where a RAID array is present

Lustre

/mnt/scratch-lustre  100T  large scratch space
/mnt/raid-cita       100T  postdoc space - backed up
/mnt/raid-project    600T  faculty research project space - backed up


NFS-only

/cita/scratch/your-machine-name  
/cita/d/scratch-lustre
/cita/d/raid-cita
/cita/d/raid-project
/cita/d/sunny-home (Sunnyvale's /home space)

NFS mount points are prefixed with /cita/d, e.g. /cita/d/scratch-lustre, while Lustre filesystem mount points are prefixed with /mnt, e.g. /mnt/raid-cita. Lustre-enabled nodes include trinity, prawn, bubbles, ricky, all Sunnyvale nodes and most workstations. The Lustre mount points are much faster and preferred when available.
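A quick way to check which flavour a given machine has is to ask df for the filesystem type (a sketch; on machines without the Lustre client only the /cita/d path will exist):

df -hT /mnt/scratch-lustre      # reports type 'lustre' on Lustre-enabled nodes
df -hT /cita/d/scratch-lustre   # the NFS mount, available everywhere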

If a scratch space exceeds 95% capacity, an automated e-mail is sent to the biggest users asking them to clean up. Scratch space is never backed up. We are not responsible for lost files in case of hardware failure.

More info on storage is available here [3] (http://wiki.cita.utoronto.ca/mediawiki/index.php/Disk_Space)

Backups

Directories on raid-cita, raid-project and all home spaces are backed up to tape on a regular basis. Live incremental backups are also available. Please contact requests@cita.utoronto.ca if you have lost data and we will try to retrieve it for you.

Personal Webspace

Every user has a working directory, linked to their username, for building a personal CITA website. It is located here:

/cita/d/www/home/your_username

The website URL is:

http://www.cita.utoronto.ca/~your_username

You can also use this space for sharing files with external collaborators. See this link [4] (http://wiki.cita.utoronto.ca/mediawiki/index.php/Web_Downloads) for detailed instructions for setting up public and private folders.
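For example, to share a single file publicly you can copy it into your web directory and make it world-readable (a sketch; the detailed public/private folder setup is described on the page linked above):

cp myplot.png /cita/d/www/home/your_username/
chmod a+r /cita/d/www/home/your_username/myplot.png
# now reachable at http://www.cita.utoronto.ca/~your_username/myplot.png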

Software

Commercial Packages

Binaries for many packages are located in /cita/local/bin, but some are loaded with modules. Some of these packages have a limited number of licenses, so please exit the programs when you are finished.


Maple            /cita/local/bin/maple (xmaple too) - only works on trinity

SMONGO           /cita/local/bin/sm

Modules
IDL              module load idl (various versions); idl or idlde to run
Intel compilers  module avail intel (to see versions); module load intel (to load)
MATLAB           module load matlab; matlab (to run)
Mathematica      module load mathematica (various versions available)

Open source Packages

Desktops come with most of the usual software provided in the CentOS Linux distribution. If something is missing or you need something non-standard, please make a request. The usual packages include gcc, gfortran, python and extras, gnuplot, svn, etc. SAGE, an open-source MATLAB/Mathematica equivalent, is also available as a module.

Modules

Most CITA systems, including the main servers, the Sunnyvale cluster and workstations, use the excellent modules system for packages that are either unsupported or outdated in the standard packages of some OSes. It also allows you to swap between newer and older versions if necessary. The basic commands are:

module avail (list all available modules)
module load gcc cuda python (load one or more modules' default versions - see which are default versions using module avail)
module load intel/intel-14.0.0 (load a specific version of a module)
module list (list currently loaded modules)
module unload gcc (unload the specified module)
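A typical compile session, using the version syntax above, might look like this (the compiler version and source file are placeholders, and it is assumed that the intel module puts the Intel compilers on your PATH):

module load intel/intel-14.0.0   # pick a specific compiler version
module list                      # confirm what is loaded
ifort -O2 -o mycode mycode.f90   # Intel Fortran compiler provided by the module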

See the Modules page for more details.

Please request new modules if you need them.

Questions?

Please see our Frequently Asked Questions page.