Shell: A powerful first computer language

On a recent phone call I was asked what I would recommend for a first computer language and I quickly responded: “Python”. While I believe many computer languages have places in modern day computing, “Python” is a good beginner language.

 

However, after a few seconds, I added: “But for Unix and Linux the first language should be 'shell'”.

 

The people on the call were incredulous. “Shell?” they said and added: “Our students in systems administration grumble and ask why they should learn 'Shell'.”

 

“Shell” is the first thing you learn after you get past the graphics of the Unix or Linux system. As soon as you touch the command line you are programming in shell-- even if you do not realize it.

 

Whether you put two commands on a line to be executed one after the other, or whether you use a pipe symbol to tell the computer to take the output of the first command and put it into the second one, you are typing a language that will be interpreted by a command interpreter called “the shell”.
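
As a minimal sketch of those two ideas (the words and the fruit names are only placeholders), a semicolon runs two commands one after the other, and a pipe feeds the output of one command into the next:

```shell
# Two commands on one line, separated by a semicolon, run in order.
# $( ... ) captures what they print so it can be shown again below.
first=$(echo "hello"; echo "world")

# A pipe: the three lines printed by printf become the input of sort(1).
sorted=$(printf 'pear\napple\nfig\n' | sort)

echo "$first"
echo "$sorted"
```

Typing the same lines at a prompt, without the variable assignments, behaves the same way; the variables are only there so the results can be inspected.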

 

Getting the computer to do things for you from the command line forces you to learn the real commands of Unix or GNU/Linux.

 

These days the commands you use are (for the most part) the GNU part of “GNU/Linux”: re-implementations released under the GPL so that the software could be Free and so that people could never hide the source code again. Yes, there are also a lot of commands borrowed from the original AT&T and Berkeley BSD systems, just as there was a lot borrowed from MIT and a thousand other contributors, but GNU systematized the creation of the “GNU coreutils”, as they are called. Since a lot of the code comes from one place, protected under the GPL, there is little incentive to create offshoots that are “different”. You improve the ones that are there, if you have a reason to improve them. Most people do not have that reason.

 

These commands are quite powerful. Most of them do one thing and do it efficiently; in fact, very fast. Since most of them were written when computer memories were small, and tend to “pass through” their data, they also have a start-up time that is breathtaking compared to the start-up time of large graphical programs. In many cases a shell command or shell script will have finished its work before the large graphical program can even initialize itself.

 

“Section (1)” of the Unix man(1)ual pages lists these commands. Some commands are for navigating the file structure, such as cd(1) and pwd(1). Some are for finding lost files, such as the find(1) command. Some are for finding data inside of files, such as grep(1), and others are for manipulating data inside of files, such as cut(1), paste(1) and sed(1). These programs do their jobs very well. If you hold much of your data in flat, character-oriented files (with file names that follow certain conventions), then these commands work extremely well.

 

By going through the man(1) pages of section (1) you could go on to learn how to use these commands in concert.
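
To give a feel for what “in concert” can look like, here is a small sketch that runs in a scratch directory. The file names and the colon-separated record layout are assumptions made up for the illustration, not anything taken from a real system:

```shell
# Work in a scratch directory so nothing real is touched.
workdir=$(mktemp -d)
printf 'alice:x:1001:staff\nbob:x:1002:wheel\ncarol:x:1003:staff\n' > "$workdir/users.txt"
printf 'nothing to see here\n' > "$workdir/notes.txt"

# find(1) locates the .txt files; grep -l names the ones that mention "staff".
matches=$(find "$workdir" -name '*.txt' -exec grep -l 'staff' {} +)

# grep(1) selects the lines, cut(1) keeps the first field,
# and sed(1) rewrites each line by adding a prefix.
staff=$(grep ':staff$' "$workdir/users.txt" | cut -d: -f1 | sed 's/^/user=/')

echo "$matches"
echo "$staff"

rm -r "$workdir"
```

Each command does one small job, and the pipe symbols are what turn three small jobs into one useful one.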

 

One night when I was a beginning systems administrator for Unix systems I had a job of taking a large number of text files and updating them with other data. I was going through each file line by line with a text editor. Finally I stopped, realizing what I was trying to do would take hours. I said to myself: “I do not know that Unix has commands to do this, but I am willing to bet that it does!” I started going through the manual and soon found cut(1) with its associated (“see also”) paste(1) command. Twenty minutes later I left the building with the task completed.
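
The files from that night are long gone, so the following is only a sketch of the general technique, with invented file names and data: cut(1) pulls a column out of one file, and paste(1) glues it, line by line, to the contents of another.

```shell
workdir=$(mktemp -d)

# Hypothetical input: an item list and a separate list of new prices.
printf 'widget,4,blue\ngadget,7,red\n' > "$workdir/items.csv"
printf '9.99\n24.50\n' > "$workdir/prices.txt"

# cut(1) extracts the first comma-separated column...
cut -d, -f1 "$workdir/items.csv" > "$workdir/names.txt"

# ...and paste(1) joins it with the prices, one line from each file at a time.
merged=$(paste -d, "$workdir/names.txt" "$workdir/prices.txt")
echo "$merged"

rm -r "$workdir"
```

This only works when both files list their records in the same order, which is the kind of convention flat text files tend to rely on.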

 

Shell scripting is the one tool that every Unix or GNU/Linux system has. There may not be Python installed, or PHP, or even “C”; but the shell will be there, whether it be the Bourne Shell “sh”, the “C” shell (csh), the “trusted” shell (tsh), the Korn shell (ksh, named after David Korn) or the “Bourne Again” shell (bash), the last of which is used in most GNU/Linux systems today. The shells typically were fairly upward compatible so people could write code that was useful from system to system.

 

Shell programming is made up of two main parts:

 

  • A command interpreter

  • many tiny (and a few “not so tiny”) programs that tend to do one specific thing very well

     

The command interpreter controls the environment that you are programming in, re-directs the input and output of the programs, supplies some control structures (while, if-then, until, case, etc.), and either reads in one line at a time as you type it or reads in a file into which you have written many shell commands (called a “shell script”, in effect a program).
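
A short sketch of the interpreter's side of the work, using a throwaway file whose name and contents are invented for the example. It shows output redirection, a while loop reading redirected input, an if-then test and a case statement:

```shell
workdir=$(mktemp -d)

# Redirect output into a file (>) and then append to it (>>).
echo "alpha" > "$workdir/list.txt"
echo "beta" >> "$workdir/list.txt"

# A while loop that reads the file one line at a time via input redirection (<).
count=0
while read -r line; do
    count=$((count + 1))
done < "$workdir/list.txt"

# if-then-else acting on the result.
if [ "$count" -eq 2 ]; then
    verdict="two lines"
else
    verdict="unexpected"
fi

# case matches a string against patterns.
case "$workdir/list.txt" in
    *.txt) kind="text" ;;
    *)     kind="other" ;;
esac

echo "$count $verdict $kind"
rm -r "$workdir"
```

Saved into a file and run with sh, these same lines are a shell script; typed one at a time at the prompt, they do the same thing interactively.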

 

Some modern command interpreters also allow you to use “history” to re-execute commands you have executed in the past-- perhaps editing those commands before you re-execute them. However, even if they do not do this, the only real tool you need to write and debug a shell script is a simple text editor and the liberal use of the “echo” command to output variables. No compiler or special debugger is needed.
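
As an illustration of debugging with echo (the loop and the numbers are arbitrary), printing the variables at each step shows how a value is built up. Sending the trace to standard error keeps it separate from the script's real output:

```shell
total=0
for n in 3 4 5; do
    total=$((total + n))
    # Debugging aid: show the variables on every pass through the loop.
    echo "debug: n=$n total=$total" >&2
done

# The script's actual result goes to standard output.
echo "$total"
```

Once the script behaves, the debug line can simply be deleted or commented out.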

 

Many years ago Digital Equipment Corporation had to update a number of customers on service contracts with the proper license keys for their software. The customers on service contracts were listed in one printed report, and the customers who had purchased license keys were listed in another. We were relying on a clerk to physically match up these reports to ship each customer the proper new key, and the estimated time to do this was nine months.

 

By obtaining the printed reports as two electronic files and by using simple shell commands such as sed(1), cut(1), paste(1) and awk(1), I was able to write a short (ten line) shell script to extract the data from both magnetic tapes, cross-check the two tapes to line up the information, print out which customer received which key and create a “mailing label” for the package. Total development time was three hours.
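
The original script is not reproduced here, and the report layouts below are pure invention (a customer id, a vertical bar, then a name or a key). Under those assumptions, a cross-check of this kind might be sketched with awk(1) like this:

```shell
workdir=$(mktemp -d)

# Hypothetical report 1: customers on service contracts (id|name).
printf 'C001|Acme Corp\nC002|Globex\nC003|Initech\n' > "$workdir/contracts.txt"
# Hypothetical report 2: license keys purchased (id|key), in a different order.
printf 'C003|KEY-333\nC001|KEY-111\n' > "$workdir/keys.txt"

# awk reads keys.txt first (NR == FNR is true only for the first file) and
# remembers each key by customer id; it then walks contracts.txt and prints
# every contract customer that has a key.
matched=$(awk -F'|' 'NR == FNR { key[$1] = $2; next }
                     $1 in key { print $2 " -> " key[$1] }' \
          "$workdir/keys.txt" "$workdir/contracts.txt")

echo "$matched"
rm -r "$workdir"
```

A customer with a contract but no key (Globex in this made-up data) is simply left out of the list, which is the kind of mismatch a clerk would otherwise have to spot by eye.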

 

Some people think that a shell script does not run as fast as a program written in a compiled language, but these days it takes a lot of processing and a lot of data for the difference to have a significant impact. My little program done so many years ago for Digital would probably be finished before I could lift my finger from the “Enter” key.

 

Can “shell” do everything that “C” or some other modern languages can do? No. Certainly some things, such as manipulating binary data, are easier to do in another language or with a graphical tool, but shell scripting is used extensively in systems administration and textual control, and it is a good first language for teaching programming techniques.

 

For those of you who are interested in learning more about shell programming, I can recommend these books:

A Practical Guide to Linux Commands, Editors and Shell Programming by Mark G. Sobell

Learning the bash Shell: Unix Shell Programming In a Nutshell (O'Reilly) by Cameron Newham

 

and as a final treat, a classic book by an old friend:

 

Linux and the UNIX Philosophy by Mike Gancarz