The Third Language

[Warning: Highly biased and opinionated text follows. Faint of heart, please skip.]

 

A short time ago I blogged about how “shell” should be the first language taught, due to its usefulness for systems administration and its prevalence in *x (Unix, Linux, OS X, etc.) systems of every type.

 

In the article I mentioned that I almost indicated “Python” as a first language, and just that mention of Python started a potential flame war about whether Python or some other language should be a first language (“white space as delimiters, horror!”).

 

While I still believe that Python would make a good “second language”, I will acknowledge that there are lots of good “second languages” -- each with its own fans and uses. Therefore I will skip the abuse of suggesting a “second language” and go directly to the third language: Assembly/Machine language. Since assembly language is almost the same as machine language, I will call both by the term “assembly language” for the rest of this blog.

 

I can hear the screams already (even before I post this). “No one codes in assembly language any more”. “It is too complicated...too hard”. “What about 'C'?”

 

First of all, when I suggest assembly as a third language, I would also suggest a good course or text in computer architecture as something you study along with learning assembly language, for this (to me) is one of the main reasons for learning assembly language: finding out how the machine actually works.

 

“Who needs to know that?” I will hear people say. “The virtual machine and Java hide all of that”.

 

That is only partially true, and should not be true at all. All professional programmers should be introduced to machine architecture. Even people who only work with databases need to know that disk drives spin, and that transfer of data from disk to memory is millions of times slower than transfer of data from one register to another in a CPU. I once had a Cobol programmer in a class on “C” who could not understand pointers. I am sure that if he had taken a course in computer architecture and assembly language programming it would have been a “piece of cake”.

 

Young people who overclock their CPUs need to understand the tradeoffs versus the benefits, and programmers who use floating point should understand what “significant digits” in a mantissa really mean.
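To make the “significant digits” point concrete, here is a minimal sketch in Python. It assumes the usual case where a Python float is an IEEE 754 double with a 53-bit mantissa; the variable names are only for illustration.

```python
import sys

# Assumption: Python floats are IEEE 754 doubles, as on mainstream platforms.
# mant_dig is the number of bits in the mantissa (significand).
MANTISSA_BITS = sys.float_info.mant_dig

# The first power of two beyond which not every whole number fits.
big = 2.0 ** MANTISSA_BITS

# big + 1 would need one more significant bit than the mantissa holds,
# so the addition is rounded away and the sum compares equal to big.
one_was_lost = (big + 1.0) == big

# Decimal fractions such as 0.1 have no exact binary representation,
# so the rounding errors show up in an ordinary-looking comparison.
sum_is_inexact = (0.1 + 0.2) != 0.3

print(MANTISSA_BITS, one_was_lost, sum_is_inexact)
```

Nothing here is a bug in the language. It is simply what a fixed-width mantissa does, and it is much less surprising once you have seen the bits.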

 

Once you have learned assembly language for one machine, a lot of the issues in computer science become much easier to understand. Concepts like re-entrancy, recursion, thread safety, multi-threading, atomic operations, and a variety of other issues that spring up in even application-level programming, much less device driver or kernel programming, are more understandable after you have learned assembly language programming.

 

Of course I am also an advocate for all programmers (and systems administrators) taking a higher-level interest in operating system design and compiler design and theory. If you work with the machines every day of your life, you should at least have a rough idea of how they work (or don't work), why viruses can attack your system, and other such issues. You still may not have the skills to fix the problems, but at least you will know a bit about how they happen.

 

Knowledge of assembly language is also useful if you think your compiler has a bug. Many times a programmer writing in Cobol, Fortran or some other high level language would come to me and complain that they could not see what was wrong with their code. Looking at the machine language, we could see what the compiler had generated and often re-write the application code to remove the error.

 

Learning assembly language is also a lot easier than it was when I learned it. I learned my first assembly language with an ASR-33 teletype, flipping switches and reading lights on the front of a PDP-8. The editor took five minutes to read in from paper tape and the assembler took fifteen minutes to read in. You had to pass your source code through the assembler at least twice, and three times if you wanted a listing. Running your program typically wiped out the editor, assembler and even the boot loader.

 

These days you can run your assembly language program in a virtual machine, develop it using an IDE and debug it with an interactive debugger. Using an emulator, you can choose from lots of different machine language architectures to study and learn.

 

I have to admit that the old PDP-8 was a great “first (machine) language” for me to learn. It had only one “accumulator”, and the machine not only was incapable of multiplying and dividing, it could not even subtract! You had to take the two's complement of the subtrahend and add that to the minuend. Before you people echo Scotty in “Star Trek IV: The Voyage Home” and say “how quaint”, remember that this machine was built before the days of microprocessors, and all the logic of the machine was made up of components on the boards. The machine also used core memory.
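For anyone who has never had to subtract by adding, here is a rough sketch of the idea in Python, using a twelve-bit word. The function names are made up for illustration and this is not PDP-8 code, only the same arithmetic.

```python
# Twelve bits, written in octal. The mask keeps every result in one word.
WORD_MASK = 0o7777


def twos_complement(value):
    """Flip all twelve bits, then add one, discarding any carry out."""
    return ((value ^ WORD_MASK) + 1) & WORD_MASK


def subtract(minuend, subtrahend):
    """Compute minuend - subtrahend using only complement and add."""
    return (minuend + twos_complement(subtrahend)) & WORD_MASK


# 21 - 7: the carry out of the twelfth bit is thrown away, leaving 14.
difference = subtract(0o25, 0o7)

# 7 - 21: the answer is the twelve-bit pattern that stands for -14.
negative = subtract(0o7, 0o25)

print(oct(difference), oct(negative))
```

The second result is why the same adder serves for signed and unsigned numbers: the hardware only ever adds, and it is up to the programmer to decide what the bit pattern means.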

 

What made the PDP-8 so great was that every instruction was the same length (twelve bits) and most of the data was also oriented around twelve bits, so it made the address calculations fairly simple.

 

In a lot of ways, the second computer I ever programmed (DEC's PDP-8) was a RISC computer.

 

Recently I was chatting with a student studying computer science and he told me that he wanted to get a Raspberry Pi so he could teach himself machine language. He then told me that in the future his university was only going to be teaching Java and that they were no longer going to be teaching machine language, computer architecture or operating system design because these courses were “too hard”. He also lamented that the best computer science teachers were leaving his university in droves.

 

I do not blame those professors for leaving. When a university waters down a field of study as beautiful as computer science because some of the students find it “too hard”, I think it is time for the university to really think about what they are doing, and who they are trying to educate.