Accompanying lecture: Introduction to the Command Line

Software to Install

Make sure that you have each of the following software packages successfully installed before continuing.

Mac (OS X)

Windows 10

Windows (Not 10)

Bash and the Command Line

You may have heard about using the command line to interact with a computer. Most scientists are required to interact with computers that have no Graphical User Interface (GUI), i.e. no windows or icons to point and click on. They need to run programs, create programs on remote computers, and move files around, all without a mouse. The way they do this is by interacting with the computer via the command line: telling the computer what to do through text instead of through pointing and clicking, as most of us are used to.

What is usually meant when people say “I use the command line” or “I use the terminal” is “I use Bash to work on my Unix machine”. Bash is what is known as a shell, which acts as a command interface between you and a computer running a Unix-based operating system. It wraps the operating system with a human interface (hence the name shell). Unix is a family of operating systems that many others are modeled on. The Mac operating system OS X is based on Unix, and the many flavors of Linux (Red Hat, Ubuntu, CentOS, etc.) are Unix-like as well. Windows is completely different, which often makes it awkward for scientific work that relies heavily on the command line.

Getting Set Up

Mac (OS X)

You already have a Bash command line on your computer! Simply open the application Terminal and you’re ready to go. This will open a Bash shell that lets you interact with your computer textually instead of visually.

Windows

Unfortunately, Windows is not based on the Unix operating system and thus it does not have a Bash shell. Instead it uses a different shell, which you can access all the same. Simply open the application Command Prompt. To find it, hit the Windows Key. If you are on Windows 10, just start typing “cmd” and it should pop up. On Windows 7, there should be a search function with the same capability; just type “cmd” into it to find the program. Since the shell provided by Windows is not Bash, it is largely irrelevant to this tutorial. If you don’t have Windows 10, you’ll have to use PuTTY to access another machine that has a Bash shell you can use, which we’ll talk about later. If you do have Windows 10, read on.

Ubuntu on Windows

If you use Windows 10, then you actually already have an Ubuntu installation on your computer. Follow this guide in order to get it set up on your computer. Once you have the Ubuntu subsystem set up, you can access a bash shell by opening the Command Prompt, as before, and simply typing bash to open the bash shell.

Now, you’re set up and ready to do some scientific computing!

Accessing Far Away Places via SSH

Most of the time, scientists don’t run programs and store data on their own computers. Our personal computers simply are not powerful enough, do not have enough disk space, and do not have fast enough internet connections to act as a reasonable computing resource for us. Usually, we use our own command line and the command ssh to access the command line of a different computer.

In Bash

Open your Bash shell using one of the methods described above.

The syntax to use SSH is pretty simple:

ssh <your_user_name>@<domain_name>

To log into our computing resources, we will be using the following command:

ssh -Y uvastudent@lwalab.phys.unm.edu

This command says “Log me into the user uvastudent on the computer accessed by lwalab.phys.unm.edu (a computer at the University of New Mexico).” The -Y means “please forward the graphics from that computer to mine”. If you have XQuartz (OS X) or Xming (Windows) open on your computer, then when you open a program with a graphical interface on the remote machine, the graphical interface will appear on your computer instead. When accessing the remote computer lwalab.phys.unm.edu for the first time, you will be prompted with a message something like The authenticity of host ... can't be established. Just type yes when you receive this message. You will also be asked for the password to the uvastudent account. Enter it when prompted. Don’t be worried when no text appears on the screen as you type the password: it’s there, it’s just invisible.

In PuTTY

Open PuTTY, then type uvastudent@lwalab.phys.unm.edu into the ‘Host Name (or IP address)’ field. Then click the + next to SSH on the left menu. Then click X11 in the SSH submenu. Click the checkbox next to ‘Enable X11 forwarding’ and in the field ‘X display location’ type 0.0.0.0. This is equivalent to the -Y flag we passed to ssh before. Now click ‘Open’ at the bottom of the screen. Enter the password for our computing account and follow the instructions below.


Once logged into the lwalab.phys.unm.edu computer, run the following command:

ssh -Y lwaucf1

You will have to enter the same password again. lwaucf1 can be replaced with lwaucf# (any number 1, 2, 3, 4, 5, or 6) to access any of the 6 LWA UCF computing nodes. Now we are saying, “Using my username (now uvastudent), log into the computer lwaucf#”. Now we are logged into the LWA computing cluster. This is where all of our data will be stored and we will run all of our commands to process the data.

Using the Command Line

Once logged in, you will be presented with a shell, similar to your own. We’re going to explore what’s on this computer by using textual commands. To execute a command, type its name and press the ‘Enter’ key. Type ls to list what is in the current directory. You should see two items: examples.desktop and uva_students. Type ls uva_students to list the contents of the uva_students folder. Type cd uva_students to change directory into uva_students, which moves you into the folder. Type ls again to list what’s in this directory. You should see another folder called sgs7cr. This is a folder I made for myself. Make a directory by typing mkdir <your_id>, where <your_id> is replaced with your id. Type ls again to confirm your new folder was made. Change directory into the folder you just made by typing cd <your_id>. Type pwd to print the working directory. It should output /home/uvastudent/uva_students/<your_id>. Type cd ~ to return to the home directory, which is where you started from. Type pwd again and notice that you are in the /home/uvastudent folder. The ~ is just a placeholder for the folder /home/uvastudent.

These are the basic commands to move around and inspect your computer’s files and folders: ls, cd, pwd, and mkdir.
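If you want to practice these commands without touching the shared uva_students folder, the same sequence can be tried in a throwaway directory. This is only a minimal sketch: it assumes the mktemp program is available, and the names sandbox and my_id are made up for illustration (my_id stands in for <your_id>).

```shell
# Create a scratch directory to practice in (assumes mktemp is available).
sandbox=$(mktemp -d)
cd "$sandbox"

# Make a folder and confirm that it exists.
mkdir my_id          # my_id is a stand-in for <your_id>
ls                   # lists: my_id

# Move into the new folder and print where we are.
cd my_id
here=$(pwd)
echo "$here"         # a path ending in /my_id

# cd .. moves up one level, back into the scratch directory.
cd ..
pwd
```

Nothing here depends on the LWA machines, so it can be run on any computer with a Bash shell.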

Copying Files

The second important thing we do with our computers is move files around, delete them, and rename them. So how do we do that? We’re going to copy a file from my student directory to yours. Make sure you’re in the home directory (cd ~). Then use the command

cp uva_students/sgs7cr/steven.txt uva_students/<your_id>/

to copy the file steven.txt to the directory called <your_id>. Notice that we can execute commands at any level of the file system. We can either cd to the folder sgs7cr then execute our cp command, or we can use cp from the home directory and simply specify the full path to the file we want to copy. All cp commands will come in the form

cp <path/to/source/file> <path/to/destination/folder/>

Check out the other files that are in my directory using ls uva_students/sgs7cr/. How can we copy all of those files at once? You can do this by using the wildcard character *. * in bash stands for ‘anything’. For example, A* means ‘get me everything that starts with A’. *z means ‘get me everything that ends in z’. A*z means ‘get me everything that starts with A and ends with z’. Let’s copy everything from my directory to yours using the command

cp uva_students/sgs7cr/* uva_students/<your_id>/

You should get messages like cp: omitting directory ‘uva_students/sgs7cr/data’, cp: omitting directory ‘uva_students/sgs7cr/junk_folder’, and cp: overwrite ‘uva_students/<your_id>/steven.txt’?. The first two messages mean that cp skips directories unless told otherwise. The third message means that the file steven.txt already exists in your directory, so cp needs to confirm whether you want to keep or overwrite the old file. Simply type y or yes to confirm the request or n or no to deny it. Now, how do we resolve the fact that we can’t copy directories? All you need to do is use the command

cp -r uva_students/sgs7cr/* uva_students/<your_id>/

The -r here means recursive: it tells cp to copy everything from my folder, including all directories and everything inside them, into the new folder. Without -r, cp skips any directory it encounters, whether or not it is empty. Type ls uva_students/<your_id>/ to confirm the contents of your directory. Finally, try running

cp -r uva_students/sgs7cr/* uva_students/<your_id>/

again. You’ll notice that now you get a bunch of messages asking whether cp can overwrite your files. You could just hit y or n for every file that you’re copying. An alternative if you want to overwrite every file automatically is to use the command

yes | cp -r uva_students/sgs7cr/* uva_students/<your_id>/

This command pipes the output of the command yes to the cp command. Try typing yes just for fun to see what it does. Type Ctrl+C to stop the infinite output of y.
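The copying steps above can be rehearsed on a small practice tree before you try them on real data. The sketch below is not the lecture's directory layout: the sandbox path and the file names inside it are invented for illustration, and the -i flag is used here on the assumption that the confirmation prompts you saw come from cp being configured to ask before overwriting.

```shell
# Build a small practice tree (all names here are made up for illustration).
sandbox=$(mktemp -d)
mkdir -p "$sandbox/src/data" "$sandbox/dest"
echo "my name is steven" > "$sandbox/src/steven.txt"
echo "notes" > "$sandbox/src/notes.txt"
echo "1 2 3" > "$sandbox/src/data/run1.dat"

# Wildcard: copy only the files whose names end in .txt.
cp "$sandbox"/src/*.txt "$sandbox/dest/"

# -r copies directories and everything inside them as well.
cp -r "$sandbox"/src/* "$sandbox/dest/"

# -i asks before each overwrite; piping yes into cp answers "y" every time.
yes | cp -ri "$sandbox"/src/* "$sandbox/dest/"

# List the result, including the contents of subdirectories (-R).
ls -R "$sandbox/dest"
```

Note that the wildcard sits outside the quotes: Bash only expands * when it is unquoted.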

Moving and Renaming Files

Similarly to cp, you can move files using the syntax

mv <path/to/source/file> <path/to/destination/folder/>

To see this in action, let’s move steven.txt to a different location. Let’s go to your directory using cd uva_students/<your_id> and move the file steven.txt to <your_name>.txt using the command

mv steven.txt <your_name>.txt

Type ls to list what is in your directory now. Notice that steven.txt is gone and <your_name>.txt took its place. We can also use mv to rename files!
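The difference between renaming and moving comes down to what the destination is. Here is a small sketch in a scratch directory; the names sandbox, archive, and my_name.txt are placeholders I made up, not files on the LWA machines.

```shell
sandbox=$(mktemp -d)
mkdir "$sandbox/archive"
echo "my name is steven" > "$sandbox/steven.txt"

# Rename: the destination is a new file name in the same folder.
mv "$sandbox/steven.txt" "$sandbox/my_name.txt"

# Move: the destination is a folder, so the file keeps its name.
mv "$sandbox/my_name.txt" "$sandbox/archive/"

ls "$sandbox/archive"    # lists: my_name.txt
```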

Removing Files

Now how do we delete files? Let’s go into the junk folder in your current directory: cd junk_folder. Type ls to check out what’s in your directory. Now let’s remove a file using

rm junk_1.txt

Check your directory again to make sure it’s gone. Notice that we also got a confirmation message here. If we don’t want to receive a confirmation message for rm, we can pass it the -f flag to force deletion. Try it:

rm -f junk_2.txt

Type ls to make sure it’s gone. Now what if we want to get rid of all of the files in this folder? If we remember our handy wildcard symbol * from before, then we could just use the command rm -f * to delete all of the files in the current directory. However, we’re just going to remove the entire junk_folder. Move up one directory using cd .. (two dots). Now let’s try to remove the folder using

rm junk_folder

It should say rm: cannot remove ‘junk_folder/’: Is a directory. So how do we remove a directory if we can’t use rm? To make a directory, we used mkdir, so to remove a directory we would think to use rmdir. Try running it:

rmdir junk_folder

But this still will give an error: rmdir: failed to remove ‘junk_folder/’: Directory not empty. The best way to remove a directory is instead to use

rm -r junk_folder

Here, the -r means ‘recursive’, which just means when rm tries to remove junk_folder and junk_folder isn’t empty, it will first try to delete everything inside junk_folder by calling rm -r <file> on every file inside junk_folder. You will get a bunch of confirmation messages though asking if you want to enter into the directory and remove files. If we need to delete a lot of files, then this becomes a hassle. Instead, we can use

rm -rf junk_folder

and if we wanted to remove all files and folders in your current directory we would use

rm -rf *

!!!WARNING!!! I may have just taught you the most dangerous command you can run. rm -rf * will delete everything in the current directory without prompting you and with no way to recover any of it. Always be careful when removing files and folders using the command line. If you go to the home directory and call rm -rf * there, then everyone’s files will be deleted with no way to recover them. Please heed these words of advice: always be cautious with the commands you are running. It’s VERY easy to accidentally delete your own data, or the work of others. This has actually happened before! I’ve heard stories about students who have accidentally deleted the entirety of their research projects, or the entirety of the data they are basing their research on.
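One habit that helps here is to look at what a wildcard matches before handing it to rm, and to name the thing you are deleting explicitly. The following is a sketch of that habit in a scratch directory, not a required workflow; sandbox and keep_me.txt are made-up names, and touch is used only to create empty practice files.

```shell
sandbox=$(mktemp -d)
mkdir "$sandbox/junk_folder"
# touch creates empty files to practice on.
touch "$sandbox/junk_folder/junk_1.txt" "$sandbox/junk_folder/junk_2.txt"
touch "$sandbox/keep_me.txt"

# Preview what a wildcard matches before using the same pattern with rm.
ls "$sandbox"/junk_folder/*

# rmdir refuses to remove a non-empty directory, so nothing is lost here.
rmdir "$sandbox/junk_folder" 2>/dev/null || echo "rmdir refused: directory not empty"

# Remove the folder and its contents, naming the target explicitly.
rm -r "$sandbox/junk_folder"

ls "$sandbox"    # lists: keep_me.txt
```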

Okay, now we know how to move about and explore the file system as well as move, copy, and delete files. The last few commands I will show you allow you to check up on the hardware of the computer you are working on.

Type into the command line

top

top is a simple command that lets you see a summary of the top processes (the ones using the most CPU resources) running on the computer right now. top gives you a lot of information to work with, but the most important things to look at are the columns labeled ‘PID’, ‘USER’, ‘%CPU’, and ‘COMMAND’. ‘COMMAND’ gives you the name of the process. For example, if you were running the web browser Firefox, then there would be a process called ‘firefox’ in the ‘COMMAND’ column. Next to ‘firefox’ in the ‘%CPU’ column you can see what percentage of the Central Processing Unit (CPU) Firefox is using. Under USER you can see who is running Firefox. Finally, under PID you can see Firefox’s process ID, which is just a unique numerical label assigned to that process. When you want to kill a process, you just type k while in top, then type in the PID of the process. Alternatively, if you know the PID after leaving top, you can return to the command line and type

kill <PID>

to kill the process.
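To try kill without risking anyone’s real work, you can start a harmless process of your own and stop it by its PID. This is a sketch under a couple of assumptions: sleep is used as a stand-in for a long-running program, and kill -0 (which sends no signal and only checks that the process exists) is used in place of looking the process up in top.

```shell
# Start a harmless process in the background; sleep just waits.
sleep 300 &
pid=$!                      # $! holds the PID of the last background process
echo "started sleep with PID $pid"

# kill -0 sends no signal; it only checks whether the process exists.
if kill -0 "$pid" 2>/dev/null; then
    echo "process $pid is running"
fi

# Ask the process to stop, then wait for it to finish exiting.
# wait reports the killed process's non-zero exit status, so ignore it.
kill "$pid"
wait "$pid" 2>/dev/null || true

# Check again: the process should no longer exist.
if kill -0 "$pid" 2>/dev/null; then
    status="still running"
else
    status="gone"
fi
echo "$status"
```

You can only kill processes that you own, which is why the USER column in top is worth checking first.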

Next, we’ll look at the commands du and df. Go back to your own directory: cd ~/uva_students/<your_id>. Now type

du -h

du gives you the disk space usage of the current directory you are in. It reports the size of every folder beneath the current directory, ending with the total for the current directory itself. -h here means human readable, meaning it will give you sizes in terms of K, M, G, or T, which stand for kilobyte, megabyte, gigabyte, or terabyte respectively.

We can also use the command

df -h

to show us the disk file system space usage. This means that it will show us how full all of the hard drives that are connected to our computing system are. Notice that a lot of the hard drives are pretty full! This means that we should be respectful of the other users on the system and quickly process, reduce, and delete the raw observational data that we obtain so that the file system doesn’t fill up.
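Both commands also accept a path, which is handy when the full output is too long to read. The sketch below measures a scratch directory; the names sandbox and big.dat are invented, and dd is used only to create a file of a known size to measure.

```shell
sandbox=$(mktemp -d)
mkdir "$sandbox/data"
# Write a 1 MiB file of zeros so there is something to measure.
dd if=/dev/zero of="$sandbox/data/big.dat" bs=1024 count=1024 2>/dev/null

# -s prints one summary line per argument instead of every subdirectory.
du -sh "$sandbox"/*

# Passing a path to df reports only the file system that holds it.
df -h "$sandbox"
```

The size du reports can differ a little from the file’s length, since it measures the disk space actually allocated.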

Vim - Making and Editing Files

Previously, we’ve only moved around and deleted files that others have created. How do we make our own? When working over SSH, the most efficient way to create and edit files is to use a slimmed down and efficient text editor, which you probably aren’t used to using. I’m going to introduce you to one text editor called Vim.

Vim

Vim is a barebones, command line text editor which is popular among many scientists who have to use the command line a lot. Go to your directory cd ~/uva_students/<your_id>. Now type

vim <your_name>.txt

to open the text file <your_name>.txt. What you should see is a few lines of text followed by a bunch of tildes on the side and numbers and words on the bottom. Completely foreign and weird I know. Hit i on your keyboard to enter into Insert mode. You’ll notice at the bottom of the screen the words --INSERT--. This means that you can start adding and editing text to the file. Try it out. Use the arrow keys to move around (no way to click around in the command line). Hit Esc on your keyboard to escape to command mode. Now, if you start typing, you’ll notice that text doesn’t get added to the file, and some other weird stuff might happen. Don’t worry though. You only need to hit i to enter Insert mode to type stuff, or Esc to escape from Insert mode.

Make sure you’re in the command mode. Now type :w. Notice that now the text shows up at the bottom. That’s because you’re entering a command. Hit Enter on your keyboard to execute :w which tells Vim to write the file. Now type :q to quit Vim. Now open up the text file again in Vim, as before. Enter into Insert mode using i and change the line my name is steven to my name is <your_name>. You can also write and quit at the same time using the command :wq.

The way to search files in Vim is by using the command / while in command mode. Type /<your_name> to find your name in the text file. Type /i to find each occurrence of the letter ‘i’ in the text file. Hit n on your keyboard to move forward to the next occurrence of ‘i’ and type Shift+n to move to the previous occurrence of ‘i’.

Vim might seem a bit confusing and strange at first, but I promise you it is one of the fastest, most efficient, and most convenient ways to edit text files within the command line, so try to use it! If you hate Vim, there are alternatives like Emacs and nano, which can be used the same way as Vim from the command line:

emacs <your_name>.txt

and

nano <your_name>.txt

But in my opinion, they are way harder to use without much benefit over good old Vim.

Slightly More Advanced Stuff

screen

There’s a neat command called screen which lets you run multiple terminal “screens”. Try it out:

screen

This should open a new, fresh command line for you to use. Now type Ctrl+A then D to get out of the screen. Notice you get a message saying detached from .... That other bash session we opened is still alive! Type

screen -ls

to see a list of all of the current screens that are running. You may notice that the names of the screens are just numbers followed by characters. Those characters after the number are the session name, which we can and should specify. For our purposes, you should start new screens using the command

screen -S <your_id>

Let’s get back to the one you just created though. Type screen -ls to list the current screen and find the one you just opened. You can return to that screen using

screen -x <session_id>

You don’t have to type the entirety of the session ID, you just need to provide enough characters to differentiate it from all of the others. Since we will be using our user IDs to start new screens, use instead

screen -x <your_id>

PATH and Environment Variables

You may have wondered what these strange commands we are running are. Well, most of them are not actually part of Bash itself; they’re just programs that Bash knows how to execute. (A few, such as cd, are built into Bash.) So where are these programs located and how does Bash know where to find them? The answer is the PATH. Type

echo $PATH

You should get as output /usr/local/sbin:/usr/local/bin:/usr/sbin:..., which is a list of directory names separated by colons. Each of these is the name of a folder that contains programs that you want Bash to be able to find. Notice that most of the folders have ‘bin’ in their name. This is a convention: on Linux systems, executable programs are usually installed to a folder called ‘bin’. Let’s try confusing Bash. Type

export PATH=""

Now try using our old friend ls. It doesn’t work! You should get an error: -bash: ls: No such file or directory. Now type echo $PATH again. Nothing gets printed out! That’s because in the previous export command we told Bash that the PATH is nothing (""), therefore Bash has no idea where to look to find any of our programs (including ls). Let’s get things back to normal by typing

export PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/usr/local/cuda/bin:/usr/local/tempo2/bin:/usr/local/presto/bin:/usr/local/casapy

PATH is what is known in Bash as an ‘environment variable’. An environment variable is just a placeholder for a location (in the case of PATH) or some information that another program or Bash might use. You can set your own environment variables by using the command

export <var_name>=<var_definition>

For example, let’s try

export <your_id>=/home/uvastudent/uva_students/<your_id>/

Now try

cd $<your_id>

The $, which we saw before with echo, just means “replace the environment variable with what it contains”. You should find that the previous command took you to your own folder, since $<your_id> was replaced with the path you specified before.
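If you would rather not risk breaking the PATH of the shell you are working in, the same experiment can be run inside parentheses, which start a subshell with its own copy of the environment. This is a sketch, not the method used above: MY_DIR and sandbox are made-up names, and command -v is used here simply to ask Bash where it finds ls.

```shell
# Work inside parentheses: a subshell gets its own copy of the environment,
# so the PATH experiment cannot affect the shell you are typing in.
(
    PATH=""
    # With an empty PATH, Bash cannot locate external programs such as ls.
    ls >/dev/null 2>&1 || echo "ls not found while PATH is empty"
)

# Back outside the subshell, PATH is untouched and ls resolves again.
command -v ls

# Define a variable holding a folder, then use $ to expand it.
sandbox=$(mktemp -d)
export MY_DIR="$sandbox"    # MY_DIR is a made-up name for illustration
cd "$MY_DIR"
pwd
```

Variables set with export last only for the current session, so they need to be set again after you log back in.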