Languages and Programming

So far we have covered mostly hardware. Certainly virtualization and systems architecture involves a lot of software, programs to run on general purpose hardware, but the languages are all machine word formats for low level instructions. You could program that way, but I haven't seen it since the hobby days in the '70s when we didn't have compilers and assemblers and the machines were too small to run them anyway. You certainly can't write very big programs this way. At the lowest level there is assembly language that maps more or less one to one to machine instructions, but has mnemonic keywords and symbolic names for program and data addresses to perform some of the low level chores. At this low level we see another tool to handle low level accounting, the link editor. This program takes enhanced machine code files, object files produced by the assembler and compilers which come up next. The link editor finalizes the locations of code and data and updates the internal links to point to the finalized location values. Many modern operating systems (OS) defer this linking until load time. This is called dynamic linking, and it allows shared library code to be stored in object form in files called dynamic libraries. This saves a lot of space because the library code doesn't have to be copied into each executable making them smaller. On systems that can use virtual memory features to share the code in memory, you also save a lot of runtime memory because a library used by many programs only needs to be in memory once.

That was just a little preview of a more complete toolchain for going from source code to running programs. Languages come in two basic types, compiled and interpreted. Actually it is a bit of a continuum, on one end with completely compiled code that is pretty mush like machine or assembler code, you load and go (maybe after linking), and on the other end where a program reads each byte or line of a program is parsed and executed. Many utility programs use the same basic "read-eval-print" loop even when the language isn't general purpose. Let's skip the digression where all programs are interpreters for some domain specific language and stick to complete languages. Formally, you might want to specify Turing machine equivalence, but that isn't probably necessary. Languages vary quite a bit on their expressive power even though they all share a lot of structure and concepts as well.

Among interpreted languages, the grandaddy is probably BASIC althogh LISP is older. BASIC isn't too important today, but new variants of LISP show up every day, for example Cloture. LISP has its enthusiasts and some of the new variants are probably great languages, but I don't know them well enough to say more. There is another group of languages that have come around more recently, I would like Perl, Python and Ruby, plus maybe PHP, but it is more a variant of Perl. Of these, only Ruby has a good enough type system with classes and some decent forms of inheritance.