Accompanying lecture: Introduction to the Command Line
Software to Install
Make sure that you have each of the following software packages sucessfully installed before continuing.
Mac (OS X)
Windows 10
Windows (Not 10)
Bash and the Command Line
You may have heard about using the command line before to interact with a computer. Most scientists are required to interact with computers without a Graphical User Interface (GUI) i.e. without a mouse or a screen. They need to run programs, create programs on remote computers, and move files around all without a mouse and a screen. The way they do this is by interacting with the computer via the command line, basically telling the computer through text what to do instead of through pointing and clicking, as most of us are used to.
What is usually meant when people say “I use the command line” or “I use the terminal” is “I use Bash to work on my Unix machine”. Bash is what is known as a shell, which acts as a command interface between you and a computer which is running a Unix based operating system. It basically wraps the operating system with a human interface (hence the name shell). Unix is just a type of operating system, which many other operating systems are based on. Mac operating system OS X is actually based on Unix and there are also many flavors of Linux based on Unix as well (Red Hat, Ubuntu, CentOS, etc.). Windows is completely different, and a lot of the time that makes it unfit for scientific work if you need to use the command line a lot.
Getting Set Up
Mac (OS X)
You already have a Bash command line on your computer! Simply open the application Terminal and you’re ready to go. This will open a Bash shell that lets you interact with your computer textually instead of visually.
Windows
Unfortunately, Windows is not based on the Unix operating system and thus it does not have a Bash shell. Instead it uses a different shell, which you can access all the same. Simply open the application Command Prompt. To find it, hit the Windows Key. If you are on Windows 10, just start typing “cmd” and it should pop up. On Windows 7, there should be a search function with the same capability, just type “cmd” into there to find the program. Since the shell provided by Windows is not Bash, it is largely irrelevant. If you don’t have Windows 10, this is the end of the road for you, and you’ll have to use PuTTY in order to access another machine which has a bash shell you can use, which we’ll talk about later. If you do have Windows 10, read on.
Ubuntu on Windows
If you use Windows 10, then you actually already have an Ubuntu installation on your computer. Follow this guide in order to get it set up on your computer. Once you have the Ubuntu subsystem set up, you can access a bash shell by opening the Command Prompt, as before, and simply typing bash
to open the bash shell.
Now, you’re set up and ready to do some scientific computing!
Accesing Far Away Places via SSH
Most of the time, scientists don’t run programs and store data on their own computers. Our personal computers simply are not powerful enough, do not have enough disk space, and do not have fast enough internet connections to act as a reasonable computing resource for us. Usually, we use our own command line and the command SSH to access the command line of a different computer..
In Bash
Open your Bash shell using one of the methods described above.
The syntax to use SSH is pretty simple:
ssh <your_user_name>@<domain_name>
To log into our computing resources, we will be using the following command:
ssh -Y uvastudent@lwalab.phys.unm.edu
This command says “Log me into the user uvastudent on the computer accessed by lwalab.phys.unm.edu (a computer at the University of New Mexico).” The -Y
means “please forward me the graphics from that computer to mine”. If you have XQuartz (OSX) or XMing (Windows) open on your computer, then when you open a program with a graphical interface on the remote machine, the graphical interface will instead appear on your computer. When accessing the remote computer lwalab.phys.unm.edu
for the first time, you will be prompted with a message something like The authenticity of host ... can't be established.
Just type yes
when you receive this message. You will also be asked for the password to the uvastudent account. Enter it when prompted. Don’t be worred when no text appears on the screen: it’s there, it’s just invisible.
In PuTTY
Open PuTTY, then type into the ‘Host Name (or IP address field)’, uvastudent@lwalab.phys.unm.edu
. Then click the + next to SSH on the left menu. Then click X11 in the SSH submenu. Click the checkbox next to ‘Enable X11 fowarding’ and in the field ‘X display location’ type 0.0.0.0
. This is equivalent to the -Y
flag we passed to ssh
before. Now click ‘Open’ at the bottom of the screen. Enter the password for our computing account and follow the instructions below.
#
Once logged into the lwalab.phy.unm.edu
computer, run the following command:
ssh -Y lwaucf1
You will have to enter the same password again. lwaucf1
can be replaced with lwaucf#
(any number 1, 2, 3, 4, 5, or 6) to access any of the 6 LWA UCF computing nodes. Now we are saying, “Using my username (now uvastudent), log into the computer lwaucf#”.
Now we are logged into the LWA computing cluster. This is where all of our data will be stored and we will run all of our commands to process the data.
Using the Command Line
Navigation
Once logged in, you will be presented to a shell, similar to your own. We’re going to explore what’s on this computer by using textual commands. To execute a command, you type its name followed by the ‘Enter’ key to execute it. Type ls
to list what is in the current directory. You should see two items: examples.desktop
and uva_students
. Type ls uva_students
to list the contents of the uva_students
folder. Type cd
to change directory into uva_students
, which moves you into the folder. Type ls
again to list what’s in this directory. You should see another folder called sgs7cr
. This is a folder I made for myself. Make a directory by typing mkdir <your_id>
, where
These are the basic commands to move around and inspect the your computer’s files and folders: ls
, cd
.
Copying Files
The second important thing we do with our computers is move files around, delete them, and rename them. So how do we do that? We’re going to copy a file from my student directory to yours. Make sure you’re in the home directory (cd ~
). Then use the command
cp uva_students/sgs7cr/steven.txt uva_students/<your_id>/
to copy the file steven.txt
to the directory called <your_id>.
Notice that we can execute commands at any level of the file system. We can either cd
to the folder sgs7cr
then execute our cp
command, or we can use cp
from the home directory and simply specify the full path to the file we want to copy. All cp
commands will come in the form
cp <path/to/source/file> <path/to/destination/folder/>
Check out the other files that are in my directory using ls uva_students/sgs7cr/
. How can we copy all of those files at once? You can do this by using the wildcard character *
. *
in bash stands for ‘anything’. For example, A*
means ‘get me everything that starts with A’. *z
means ‘get me everything that ends in z’. A*z
means ‘get me everything that starts with A and ends with z’. Let’s copy everything from my directory to yours using the command
cp uva_students/sgs7cr/* uva_students/<your_id>/
You should get a message cp: omitting directory ‘uva_students/sgs7cr/data’
cp: omitting directory ‘uva_students/sgs7cr/junk_folder’
cp: overwrite ‘uva_students/<your_id>/steven.txt’?
. The first two messages mean that cp
won’t work with directories. The third message means that the file steven.txt
already exists in your directory and thus cp
needs to confirm whether you want to keep or overwrite the old file. Simply type y
or yes
to confirm the request or n
or no
to deny it. Now, how do we resolve the fact that we can’t copy directories? All you need to do is use the command
cp -r uva_students/sgs7cr/* uva_students/<your_id>/
The -r
here simply means recursive and it just tells cp
to copy everything from my folder, including all directories and everything inside those directories into the new folder. Before, cp
didn’t know what to do when it reached a directory that isn’t empty. Type ls uva_students/<your_id>/
to confirm the contents of your directory. Finally, try running
cp -r uva_students/sgs7cr/* uva_students/<your_id>/
again. You’ll notice that now you get a bunch of messages asking whether cp
can overwrite your files. You could just hit y
or n
for every file that you’re copying. An alternative if you want to overwrite every file automatically is to use the command
yes | cp -r uva_students/sgs7cr/* uva_students/<your_id>/
This command pipes the outpt of the command yes
to the cp
command. Try typing yes
just for fun to see what it does. Type Ctrl+C
to stop the infinite output of y
.
Moving and Renaming Files
Similarly to cp
, you can move files using the syntax
mv <path/to/source/file> <path/to/destination/folder/>
To see this in action, let’s move steven.txt
to a different location. Let’s go to your directory using cd uva_students/<your_id>
and move the file steven.txt
to <your_name>.txt
using the command
mv steven.txt <your_name>.txt
Type ls
to list what is in your directory now. Notice that steven.txt
is gone and <your_name>.txt
took it’s place. We can also use mv
to rename files!
Removing Files
Now how do we delete files? Let’s go into the junk folder in your current directory: cd junk_folder
. Type ls
to check out what’s in your directory. Now let’s remove a file using
rm junk_1.txt
Check your directory again to make sure it’s gone. Notice that we also got a confirmation message here. If we don’t want to receive a confirmation message for rm
, we can pass it the -f
flag to force deletion. Try it:
rm -f junk_2.txt
Type ls
to make sure it’s gone. Now what if we want to get rid of all of the files in this folder? If we remember our handy wildcard symbol *
from before, then we could just use the command rm -f *
to delete all of the files in the current directory. However, we’re just going to remove the entire junk_folder
. Move up one directory using cd ..
. And now let’s try to remove the folder using
rm junk_folder
It should say rm: cannot remove ‘junk_folder/’: Is a directory
. So how do we remove a directory if we can’t use rm
? To make a directory, we used mkdir
, so to remove a directory we would think to use rmdir
. Try running it:
rmdir junk_folder
But this still will give an error: rmdir: failed to remove ‘junk_folder/’: Directory not empty
. The best way to remove a directory is instead to use
rm -r junk_folder
Here, the -r
means ‘recursive’, which just means when rm
tries to remove junk_folder
and junk_folder
isn’t empty, it will first try to delete everything inside junk_folder
by calling rm -r <file>
on every file inside junk_folder. You will get a bunch of confirmation messages though asking if you want to enter into the directory and remove files. If we need to delete a lot of files, then this becomes a hassle. Instead, we can use
rm -rf junk_folder
and if we wanted to remove all files and folders in your current directory we would use
rm -rf *
!!!WARNING!!! I may have just taught you the most dangerous command you can run. rm -rf *
will delete everything in the current directory without prompting you and with no way to recover any of it. Always be careful when removing files and folders using the command line. If you go to the home directory and call rm -rf *
on then everyone’s files will be deleted with no way to recover them. Please heed these words of advice: Always be cautious with the commands you are running. It’s VERY likely that you will accidentally delete your own data, or the work of others. This has actually happened before! I’ve heard stories about students who have accidentally deleted the entirety of their research projects, or the entirety of the data they are basing their research on.
Okay, now we know how to move about and explore the file system as well as move, copy, and delete files. The last few commands I will show you allow you to check up on the hardware of the computer you are working on.
Type into the command line
top
Top is a simple command that lets you see a summary of the top processes (the ones using the most CPU resources) running on the computer right now. top
gives you a lot of information to work with, but the most important things to look at are the columns labeled ‘PID’, ‘USER’, ‘%CPU’, and ‘COMMAND’. ‘COMMAND’ gives you the name of the proccess. For example, if you were running the web browser Firefox, then there would be a process called ‘firefox’ in the ‘COMMAND’ column. Next to ‘firefox’ in the ‘%CPU’ column you can see what percentage of the Central Processing Unit (CPU) Firefox is using. Under USER
you can see who is running Firefox. Finally under PID
you can see Firefox’s processor ID, which is just a unique numerical label assigned to Firefox. When you want to kill a process, you just type k
while in top
, then type in the PID of the process. Alternatively, if you know the PID after leaving top
, you can return to the command line and type
kill <PID>
to kill the process.
Next, we’ll look at the commands du
and df
. Go back to your own directory: cd ~/uva_students/<your_id>
. Now type
du -h
du
gives you the disk space usage of the current directory you are in. It spits back to you the sizes of all of the files and folders in your current directory. -h
here means human readable, meaning it will give you sizes in terms of K, M, G, or T, which stand for kilobyte, megabyte, gigabyte, or terabyte respectively.
We can also use the command
df -h
to show us the disk file system space usage. This means that it will show us how full all of the hard drives that are connected to our computing system are. Notice that a lot of the hard drives are pretty full! This means that we should be respectful of the other users on the system and quickly process, reduce, and delete the raw observational data that we obtain so that the file system doesn’t fill up.
Vim - Making and Editting Files
Previously, we’ve only moved around and deleted files that others have created? How do we make our own? When working over SSH, the most efficient way to create and edit files it use a slimmed down and efficient text editor, which you probably aren’t used to using. I’m going to be introducing you to one text editor called Vim.
Vim
Vim is a barebones, command line text editor which is popular among many scientists who have to use the command line a lot. Go to your directory cd ~/uva_students/<your_id>
. Now type
vim <your_name>.txt
to open the text file <your_name>.txt
. What you should see is a few lines of text followed by a bunch of tildes on the side and numbers and words on the bottom. Completely foreign and weird I know. Hit i
on your keyboard to enter into Insert mode. You’ll notice at the bottom of the screen the words --INSERT--
. This means that you can start adding and editing text to the file. Try it out. Use the arrow keys to move around (no way to click around in the command line). Hit Esc
on your keyboard to escape to command mode. Now, if you start typing, you’ll notice that text doesn’t get added to the file, and some other weird stuff might happen. Don’t worry though. You only need to hit i
to enter Insert mode to type stuff, or Esc
to escape from Insert mode.
Make sure you’re in the command mode. Now type :w
. Notice that now the text shows up at the bottom. That’s because you’re entering a command. Hit Enter
on your keyboard to execute :w
which tells Vim to write the file. Now type :q
to quit Vim. Now open up the text file again in Vim, as before. Enter into Insert mode using i
and change the line my name is steven
to my name is <your_name>
. You can also write and quit at the same time using the command :wq
.
The way to search files in Vim is by using the command /
while in command mode. Type /<your_name>
to find your name in the text file. Type /i
to find each occurance of the letter ‘i’ in the text file. Hit n
on your keyboard to move forward to the next occurance of ‘i’ and type Shift+n
to move to the previous occurance of ‘i’.
Vim might seem a bit confusing and strange at first, but I promise you it is one of the fastest, most efficient, and most convienient ways to edit text files within the command line, so try to use it! If you hate Vim, there are other alternatives like Emacs and nano, which can be used the same way as Vim from the command line:
emacs <your_name>.txt
and
nano <your_name>.txt
But in my opinion, they are way harder to use without much benefit over good old Vim.
Slightly More Advanced Stuff
screen
There’s a neat command called screen
which lets you run multiple terminal “screens”. Try it out:
screen
This should open a new, fresh command line for you to use. Now type Ctrl+A
then D
to get out of the screen. Notice you get a message saying detached from ...
. That other bash session we opened is still alive! Type
screen -ls
To see a list of all of the current screens that are running. You may notice that the names of the screens are just numbers followed by characters. Those characters after the number are the session name, which we can and should specify. For our purposes, you should start new screens using the command
screen -S <your_id>
Let’s get back to the one you just created though. Type screen -ls
to list the current screen and find the one you just opened. You can return to that screen using
screen -x <session_id>
You don’t have to type the entirety of the session ID, you just need to provide enough characters to differentiate it from all of the others. Since we will be using our user IDs to start new screens, use instead
screen -x <your_id>
PATH and Environment Variables
You may have wondered what these strange commands we are running are. Well, they’re not actually inherent to Bash itself, they’re just programs that Bash knows how to execute. So where are these programs located and how does Bash know where to find them? The answer is the PATH. Type
echo $PATH
You should get as output /usr/local/sbin:/usr/local/bin:/usr/sbin:...
, which is the names of a bunch of directires separated by colons. Each of these is the name is a folder that contains programs that you want Bash to be able to find. Notice that most of the folders have ‘bin’ in their name. This is a convention: On Linux systems, executable programs are usually installed to a folder called ‘bin’. Let’s try confusing Bash. Type
export PATH=""
Now try using our old friend ls
. It doesn’t work! You should get an error: -bash: ls: No such file or directory
. Now type echo $PATH
again. Nothing get’s printed out! That’s because in the previous export
command we told Bash that the PATH is nothing (""
), therefore Bash has no idea where to look to find any of our programs (including ls
). Let’s get things back to normal by typing
export PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/usr/local/cuda/bin:/usr/local/tempo2/bin:/usr/local/presto/bin:/usr/local/casapy
PATH is what is known in Bash as an ‘environment variable’. An environment variable is just a placeholder for a location (in the case of PATH) or some information that another program or Bash might use. You can set your own environment variables by using the command
export <var_name>=<var_defintion>
For example, let’s try
export <your_id>=/home/uvastudent/uva_students/<your_id>/
Now try
cd $<your_id>
The $
, which we saw before with echo
just means replace the environment variable with what it contains. You should find that the previous command took you to your own folder, since $<your_id>
was replaced with the path you specified before.