Log into ondemand.c3plus3.org and click the Clusters
-> Falcon Shell Access
buttons. This will open a new window/tab with a terminal interface. If you'd like more direct SSH access (described below), you'll need to set up RSA keys for passwordless authentication to ondemand. To do this from the Falcon Shell:
ssh-keygen
cat .ssh/id_rsa.pub >> .ssh/authorized_keys
chmod 600 .ssh/authorized_keys
Now on your local workstation, create RSA keys
ssh-keygen
Copy the contents of ~/.ssh/id_rsa.pub
from your workstation and paste that into the authorized_keys file on the Falcon shell (on a new line):
nano .ssh/authorized_keys
Start up Terminal (or PuTTY), then log in directly using the appropriate 'jump box':
BSU Users:
ssh -J username@bsu-hpclogin.c3plus3.org username.bsu@staging1.c3plus3.org
UI Users:
ssh -J username@login-ui.c3plus3.org username.ui@staging1.c3plus3.org
ISU Users:
ssh -J username@login-isu.c3plus3.org username.isu@staging1.c3plus3.org
The jump boxes are configured to only accept SSH connections from each respective university, so you may need to connect to your home university's VPN or be physically on-campus for the connection to work. You can also connect to ondemand.c3plus3.org
in the same manner, but the staging1 server is better suited to interactive user sessions.
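If you connect often, you can store the jump-host settings in your OpenSSH client configuration so a plain `ssh staging1` does the whole hop. A sketch for a UI user (substitute your own username and university suffix; the alias name is our choice):

```
# ~/.ssh/config -- illustrative entries, adjust for your university
Host staging1
    HostName staging1.c3plus3.org
    User username.ui
    ProxyJump username@login-ui.c3plus3.org
```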
When you log in, you are brought to your home directory by default
pwd
should return
/lfs/your_user_name
No matter which server you log into, your home directory will be the same. This is the magic of distributed file systems.
By default you are running in the Bash Shell, which is how you interact with the file system, start programs etc. If you want to search for a command, include the word bash in your query. For example you could google 'bash create directory path'. Here are some of the most common and useful bash commands:
ls
- show me the contents of the current directory
mkdir <dir_name>
- create a new directory
mkdir yournamehere
cd <dir_name>
- change directory
cd yournamehere
cd ..
- change to the parent directory
Example:
boswald.ui@ondemand ~ $ mkdir workshop
boswald.ui@ondemand ~ $ cd workshop
boswald.ui@ondemand ~/workshop $ cd ..
nano <filename>
- edit (and create if necessary) a file
nano somefile.txt
rm <filename>
- delete a file
mv <filename> <destination>
- move or rename a file
man <command>
- show help documentation
which <command>
- locate the actual executable file of a command (and test whether it exists)
top
- show system utilization
cat <filename>
- print the contents of a file to screen (std out)
less <filename>
- show the contents of a file interactively
To download data directly from the internet, use wget
Let's get some data to work with: the Mycobacterium tuberculosis 16S ribosomal RNA
wget -nc http://hpc.uidaho.edu/example-data/Myco.tb.fasta
If your data is on your local computer, you can scp
the data to the server:
scp -J username@login-isu.c3plus3.org /path/to/local/data username.isu@ondemand.c3plus3.org:/path/to/destination
(replace username and username.isu with your username and university acronym)
Note: SCP does not deal well with spaces in paths or filenames.
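Quoting keeps a path with spaces together as a single argument. A minimal local demonstration using cp (the file names here are made up; the same quoting applies when you build an scp command):

```shell
mkdir -p "my data"                      # directory name containing a space
echo "hello" > "my data/input file.txt" # file name containing a space
cp "my data/input file.txt" copy.txt    # without the quotes, cp would see two arguments
```

Note that for scp the remote side expands the path a second time, so a remote path with spaces typically needs an extra layer of quoting; when possible, it is simplest to avoid spaces in filenames altogether.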
Other options:
Shared login nodes
There are two servers that serve as 'head' nodes, ondemand.c3plus3.org
and staging1.c3plus3.org
. The staging server should be used for compiling software for Falcon nodes, as its processor is the same generation as the Falcon nodes. The staging server also has more CPU cores and RAM available and is dedicated to these types of pre-processing tasks.
Now is as good a time as any to say: be a good computational neighbor. Our servers are shared by many researchers, so please don't start a computationally intensive job on ondemand.c3plus3.org. Linux servers cope relatively well with an overloaded processor, but if you run them out of memory, they first slow down markedly as they start using hard disk space to offload memory (called swapping). Then the system invokes a process called the OOM killer, which kills processes more or less at random in a last-ditch effort to keep the system from freezing completely.
If you accidentally start a process running and want to stop it use Ctrl-C
(when it's running interactively). If you know the process id (from top) you can stop it with the kill command.
kill 2345
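You can try this safely with a throwaway background job (the sleep command below is just a stand-in for a runaway process):

```shell
sleep 60 &              # start a long-running job in the background
pid=$!                  # $! holds the PID of the most recent background job
kill $pid               # send the default TERM signal
wait $pid 2>/dev/null   # reap the job; its exit status reflects the signal
```

If a process ignores the default TERM signal, `kill -9 <pid>` sends KILL, which cannot be caught or ignored.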
If you use the up and down arrow keys, you can scroll through all the commands you've previously entered. To see a list of all the commands you've entered, use the history command. When you have a bunch of commands in your history (it will store about 1000), pipe the history through another command like less or tail.
history
history | less
history | tail -n 40
You can write a program to do whatever you want using only Bash - but the syntax is a bit different than most other programming languages. First and foremost, spaces matter. In most programming languages the following three lines are equivalent:
a=10
a = 10
a= 10
In Bash, only the first is correct. Once you assign a value to a variable, refer to it by prepending a $
echo $a
The echo command simply means print to the screen (STDOUT). If you just enter $a
, the Bash shell will try to run the command 10 (and give you an error). Similarly, only the first of these commands will work:
if [ $a -lt 11 ]; then echo "less than eleven"; fi
if[ $a -lt 11]; then echo "less than eleven"; fi
if [$a -lt 11]; then echo "less than eleven"; fi
Let's experiment with looping and conditionals. First, let's create a new directory
cd ..
mkdir bashfun
cd bashfun
Now let's create a bunch of input files from the built-in $RANDOM variable
for i in {1..50}; do echo $RANDOM > num.$i; done
As an exercise, we'll now create two directories and sort the files by whether the numbers in them are even or odd.
mkdir even odd
for nf in $(ls num.*); do rn=$(cat $nf); if [ $(expr $rn % 2) -eq 0 ]; then mv $nf even/ ; else mv $nf odd/ ; fi; done
ls even
ls odd
cat even/*
cat odd/*
Let's deconstruct the above for statement:
# when you wrap text in a $(), that tells Bash to execute the commands within
for nf in $(ls num.*) # list all the files that start with num. and loop over them
do # starts the execution loop
rn=$(cat $nf) # read the file with the name stored in nf and store it as rn
# this really only works when the file contains a single line
if [ $(expr $rn % 2) -eq 0 ] # expr tells Bash to do mathematical operations
then # % means modulo, or the remainder of integer division
# compare numbers in Bash with -lt -gt and -eq
mv $nf even/ # move the file to the even directory
else # the above if returned false, so
mv $nf odd; # move the file to the odd directory
fi # end if command
done # end for loop
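A side note: expr is an external program, and modern Bash can do the same integer math with its built-in $(( )) arithmetic expansion, which avoids launching a new process on every loop iteration. A minimal sketch (the number here is hard-coded instead of read from a file):

```shell
rn=17                            # stand-in for a number read from a file
if [ $((rn % 2)) -eq 0 ]; then   # $(( )) is Bash's built-in arithmetic
    parity=even
else
    parity=odd
fi
echo "$rn is $parity"            # prints: 17 is odd
```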
All of the above commands could be put into a script, and then executed repeatedly. Here's what that would look like:
#!/bin/bash
# create a bunch of random numbers
for i in {1..50}; do echo $RANDOM > num.$i; done
# sort them
for nf in $(ls num.*); do
rn=$(cat $nf)
if [ $(expr $rn % 2) -eq 0 ]; then
mv $nf even/
else mv $nf odd
fi
done
Create a file named random_sort.sh with the above script. The first line is called the shebang line, and indicates what interpreter to use to run the script - in our case Bash (other options could be python or perl etc...). Comment lines start with a #, and are skipped over by Bash. To make this script executable, we need to set the executable bit
chmod +x random_sort.sh
Then we can execute it with:
./random_sort.sh
Why the ./
? This tells Bash to look in the current directory for the executable, which it would otherwise not do - because it is a security risk. Bash looks for executables in the $PATH
. To see what directories are currently in the $PATH
, we can just echo it out.
echo $PATH
If you've still got the modules from above loaded, you should see their directories listed. Unload the modules and see how the $PATH
changes.
module unload mrbayes
echo $PATH
Mostly, the module command just manipulates your $PATH
(It also can set other environment variables and load other modules).
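In essence, loading a module is similar to prepending a software directory to your $PATH yourself. A simplified sketch (the install directory below is made up for illustration; real module paths will differ):

```shell
export PATH="/opt/apps/mrbayes/bin:$PATH"  # hypothetical install location
# The shell now searches that directory first when resolving commands
echo "$PATH" | tr ':' '\n' | head -n 1     # prints: /opt/apps/mrbayes/bin
```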
Let's modify our script to accept a command line argument - the number of random number files to generate.
#!/bin/bash
if [ -z "$1" ]; then
echo "You need to enter a number"
exit
fi
# create a bunch of random numbers
for i in $(seq 1 $1); do echo $RANDOM > num.$i; done
# sort them
for nf in $(ls num.*); do
rn=$(cat $nf)
if [ $(expr $rn % 2) -eq 0 ]; then
mv $nf even/
else mv $nf odd
fi
done
Command line arguments are passed to a script in the variables $1, $2, $3 ... etc. (the variable $0 contains the name of the script/command). At the top of the script we check to see if the $1 variable is empty (-z), and if it is the script exits. Now if we run our script with a number, it will generate that many files.
./random_sort.sh 10
Of course, there are more advanced methods to parse command line arguments.
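For instance, Bash's built-in getopts handles flag-style options. A sketch of how our script might accept a -n flag instead of a bare positional argument (the flag name and default are our choices, not part of the original script):

```shell
#!/bin/bash
count=50                      # default number of files to generate
while getopts "n:" opt; do    # "n:" means -n takes an argument
  case $opt in
    n) count=$OPTARG ;;
    *) echo "Usage: $0 [-n count]"; exit 1 ;;
  esac
done
shift $((OPTIND - 1))         # drop the parsed options from $1, $2, ...
echo "generating $count files"
```

Run as ./random_sort.sh -n 10 to override the default.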
Practice exercises: