The technology supporting the World Wide Web enables computations and data to be distributed over networks of computers, and this lets you use programs that you don't have to install or maintain and gain access to information that you don't have to store or worry about backing up. The World Wide Web is essentially a global computer with a world-wide file system.
Many of the documents in this huge file system are accessible to anyone with a browser and a network connection. Indeed, their authors really want you to take a look at what they have to offer in terms of advice, information, merchandise and services. Some of these documents, we tell ourselves, contain exactly what we're looking for, but it isn't always easy to find them.
It's difficult enough to search through files full of your own personal stuff – now there are billions of files generated by millions of authors. Some of those authors are likely to be somewhat misleading in promoting whatever they're offering, others may be downright devious, and some will just be clueless, ignorant about whatever they're talking about. How can we sift through this morass of data and find what we're looking for?
SPIDERS IN THE WEB
Billions of files are stored on web servers all over the world and you don't need a browser to access them.
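Just to make that concrete, here's a rough sketch, in Python, of fetching a page straight from a web server with no browser involved; the URL is only a placeholder, not a page the chapter refers to.

    # Fetch a web page without a browser, using only the standard library.
    # The URL is a placeholder; substitute any page you're curious about.
    from urllib.request import urlopen

    with urlopen("http://example.com/index.html") as response:
        page = response.read().decode("utf-8", errors="replace")

    print(page[:200])   # show the first few hundred characters of the document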
In Daniel Dennett's Darwin's Dangerous Idea: Evolution and the Meanings of Life, the dangerous idea is natural selection, Charles Darwin's name for the mechanism governing the evolution of species that he codiscovered with Alfred Russel Wallace. Natural selection is the “process by which individual characteristics that are more favorable to reproductive success are ‘chosen,’ because they are passed on from one generation to the next, over characteristics that are less favorable”.
Natural selection explains how characteristics that promote reproductive success persist and under certain circumstances can come to dominate less favorable characteristics, but it doesn't explain how those characteristics are passed on or how new characteristics come into being. It was Gregor Mendel, living around Darwin's time, who suggested that heritable characteristics are packaged in the discrete units we now call genes and that offspring inherit a combination of their parents' genes.
When you combine Darwin's and Mendel's ideas, you have the basis for genetic algorithms, an interesting class of algorithms generally attributed to John Holland that take their inspiration from evolutionary biology. These algorithms simulate some aspects of biological evolution – while ignoring others – to tackle a wide range of problems from designing circuits to scheduling trains. Genetic algorithms allow the algorithm designer to specify the criterion for reproductive success by supplying a fitness function, alluding to the idea that in a competitive environment only the fittest survive to reproduce.
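To give a flavor of how such an algorithm looks in code, here is a minimal sketch in Python; the fitness function (count the 1 bits in a bit string), the population size and the mutation rate are arbitrary choices of mine for illustration, not anything prescribed by Holland.

    # A minimal genetic algorithm: evolve bit strings toward all ones.
    # Fitness function, population size and mutation rate are toy choices.
    import random

    LENGTH, POP_SIZE, GENERATIONS, MUTATION_RATE = 20, 30, 50, 0.02

    def fitness(individual):
        # Reproductive success here is simply the number of 1 bits.
        return sum(individual)

    def mutate(individual):
        return [bit ^ 1 if random.random() < MUTATION_RATE else bit
                for bit in individual]

    def crossover(mom, dad):
        point = random.randrange(1, LENGTH)      # single-point crossover
        return mom[:point] + dad[point:]

    population = [[random.randint(0, 1) for _ in range(LENGTH)]
                  for _ in range(POP_SIZE)]

    for generation in range(GENERATIONS):
        # The fittest half of the population survives to reproduce.
        population.sort(key=fitness, reverse=True)
        parents = population[:POP_SIZE // 2]
        population = [mutate(crossover(random.choice(parents),
                                       random.choice(parents)))
                      for _ in range(POP_SIZE)]

    print(max(fitness(individual) for individual in population))

Swapping in a fitness function that scores circuit layouts or train schedules, rather than counting bits, is what turns this toy into the sort of application mentioned above.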
On the morning I began this chapter, I was driving into work and passed a sign near a grade school advertising a “Bug Safari”. When I got to work, I found out from the Web that “Butterfly Walks”, “Spider Safaris”, “Pond Dips” and similar excursions are regular fare at summer camps and neighborhood activity centers. This intrigued me: who would lead these tours? I initially thought that the most likely candidates would be trained entomologists or specialists such as lepidopterists. An entomologist studies bugs, beetles, butterflies, spiders and the like (not to be confused – as I usually do – with the similar-sounding etymologist, someone who studies the derivation of words), while lepidopterists specialize in lepidoptera, butterflies and moths.
But then I realized that it's a rare specialist who can resist being distracted by his or her own specialty. I thought about what this tendency might portend for my attempt to speak to a general audience about concepts in computer science. By most accounts, I'm an expert in computer science and a specialist in artificial intelligence. But just as a trained and practicing physicist is not an expert in all matters spanning the length and breadth of physics, from quantum electrodynamics to cosmology, so I'm not an expert in all areas of computer science.
In this chapter, I'll be talking about programming languages, an area of computer science that I know a fair bit about but am certainly not an expert in.
Have you ever wondered what's going on when you click your mouse on some underlined text in a browser window and suddenly the screen fills with text and graphics that clearly come from some other faraway place? It's as if you've been transported to another location, as if a window has opened up on another world. If you're on a fast cable or DSL (“digital subscriber line”, the first of many acronyms in this chapter) connection, the transformation is almost instantaneous; if you're using a slow modem, then updating the screen can take several seconds or even minutes, but in any case the behind-the-scenes machinations making this transformation possible leave little evidence. Occasionally, however, you'll catch fleeting glimpses of the machinery through little cracks in the user interface.
If you use a dial-up connection and modem to connect with your Internet service provider, you may hear an awful squawking as your modem and the service provider's modem initiate two-way communication. Similar noisy exchanges can occur when one fax machine attempts to communicate with a second. In both cases, computer programs at each end of a telephone connection are validating, handshaking, synchronizing and otherwise handling the transmission of information. As smarter modems and fax machines replace older technology, these noisy accompaniments are being silenced, since a human need no longer overhear them to check that things are proceeding as desired.
There is a reason I liken computations to evanescent spirits. They are called into being with obscure incantations, spin filaments of data into complex webs of information, and then vanish with little trace of their ephemeral calculations. Of course, there are observable side effects, say when a computation activates a physical device or causes output to be sent to a printer, displayed on a screen or stored in a file. But the computations themselves are difficult to observe in much the way your thoughts are difficult to observe.
I'm exaggerating somewhat; after all, unlike our brains, computers are designed by human beings. We can trace every signal and change that occurs in a computer. The problem is that a complete description of all those signals and changes does not help much in understanding what's going on during a complex computation; in particular, it doesn't help figure out what went wrong when the results of a computation don't match our expectations.
An abstraction is a way of thinking about problems or complex phenomena that ignores some aspects in order to simplify thinking about others. The idea of a digital computer is itself an abstraction. The computers we work with are electrical devices whose circuits propagate continuously varying signals; they don't operate with discrete integers 0, 1, 2, …, or for that matter with binary digits 0 and 1.
Often when you're trying to solve a problem, you pull out a pen or pencil and grab a handy piece of paper to write down a column of numbers, draw a diagram, or make a list. In solving the problem, you might sum the numbers, trace lines in the diagram, or match items of the list with items of a second list. These auxiliary scribblings are used to organize information – as with the list or column of numbers – or represent objects so they can be easily manipulated – as with the diagram. Data structures are the algorithmic analog of these handy pieces of paper.
In Chapter 11 we saw that browsers and web servers in a computer network exchange information by bouncing small packages of data from one computer to another until they arrive at their final destination. How can we represent a computer network so that an algorithm can figure out what sequence of computers to use in transferring a data package? For that matter, how do we represent airline schedules, circuit diagrams, computer file systems, road maps and medical records so that they can be manipulated algorithmically?
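One common answer is to represent the network as a graph. The sketch below, with made-up machine names, stores each computer's neighbors in a table and uses a breadth-first search to find a short sequence of hops; it's meant only to make the question concrete.

    # Represent a small network as an adjacency list: each machine maps to
    # the machines it is directly connected to. The names are made up.
    from collections import deque

    network = {
        "alpha":   ["bravo", "charlie"],
        "bravo":   ["alpha", "delta"],
        "charlie": ["alpha", "delta"],
        "delta":   ["bravo", "charlie", "echo"],
        "echo":    ["delta"],
    }

    def route(start, goal):
        # Breadth-first search returns a shortest sequence of hops, if any.
        frontier = deque([[start]])
        visited = {start}
        while frontier:
            path = frontier.popleft()
            if path[-1] == goal:
                return path
            for neighbor in network[path[-1]]:
                if neighbor not in visited:
                    visited.add(neighbor)
                    frontier.append(path + [neighbor])
        return None

    print(route("alpha", "echo"))   # ['alpha', 'bravo', 'delta', 'echo']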
Many algorithms use special data structures to represent their inputs and outputs or to perform intermediate calculations. In Chapter 7 we used lists and vectors to keep track of journal entries.
Hardly a day goes by that I don't write at least one short computer program: a few lines of code to explore an idea or help organize my thoughts. I think of it as simply talking with my computer, and more and more often there is a computer available to talk with, often several of them joining in the conversation simultaneously. Each time you click on a link in a browser, you cause a sequence of computations involving dozens if not hundreds of computers scattered all over the world.
Making a computation happen is not, however, the same thing as programming. There are lots of powerful programs written by talented programmers that you can call up with a click of a mouse or few keystrokes. These programs animate computers, breathing life and spirit into lumps of metal and plastic. Even if you know what's going on inside computers and computer programs, it's easy to imagine that programs are spells and the programmers who create them are sorcerers. When you click on the icon for a program, you invoke these spells and the spells conjure up spirits in the machine. But this book isn't about invoking the spells of others; it's about creating your own spells and conjuring spirits of your own design.
This is not to say I won't encourage you to use code written by other programmers. Quite the contrary: an important part of the power of computing is that good spells can be reused as often as needed.
When I use the term “hacker” I mean someone who enjoys programming and is good at it. Hackers in my experience tend to be an opinionated and individualistic lot and they tend to appreciate strong opinions and independence of thought in others. The slogan of the Perl language is “There's more than one way to do it”, and most hackers are motivated to explore different ways of doing the same job. That said, if adopting a standard way of doing something provides leverage for building better software, then most hackers will agree to adopt the standard (after much dickering about the details, of course).
If you write code because you want other people to use it, it behooves you to use a language that others are familiar with, adopt standard conventions for input and output so that others can interact with your code without learning some new set of conventions, and provide your code in a format so that others can use it as a component of a larger project without having to understand all the details of your implementation. The struggle to meet these basic criteria requires the hacker to negotiate, make concessions and, generally, work within a community of potential users to produce, adopt and adhere to reasonable standards.
Building great software requires a great deal of discipline and interpersonal skill – in sharp contrast with the stereotype of a hacker as an unkempt, uncommunicative obsessive-compulsive lacking basic hygiene and addicted to highly caffeinated drinks.
Most physicists believe that the speed of light is a fundamental limit on how quickly we can move through space. This claim is based on the predictions of mathematical theories and the results of experiments that appear to support them. According to theory, it doesn't matter whether you move through space with a pogo stick or an anti-matter drive, you're still subject to the rules governing all matter in the universe and thus unable to exceed the speed of light.
What if there are limits on what you can compute? Pharmaceutical companies simulate interactions at the atomic level in searching for molecules to cure diseases. There could be viruses for which it will take years to find a vaccine – there is simply no way to speed up the necessary computations. Software developers who write the programs that keep airplanes flying and emergency rooms functioning would like to prove that their code won't malfunction and put lives at risk. But maybe it's impossible to provide such assurances.
In some cases, computational limitations can work to our advantage. Some programs exploit the difficulty of computing answers to particular problems; for example, the most popular encryption schemes for transferring information securely on the World Wide Web rely on the difficulty of computing the prime factors of large composite integers. Of course, if someone figures out how to factor large numbers efficiently, our privacy will be seriously threatened.
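As a toy illustration of the asymmetry these schemes exploit, the sketch below multiplies two primes in an instant and then recovers them by naive trial division; the primes are far too small to be secure and are chosen purely for demonstration.

    # Multiplying two primes is easy; recovering them from the product is
    # the hard direction that factoring-based encryption relies on. These
    # primes are tiny; real keys use primes hundreds of digits long.
    p, q = 10007, 10009            # two small primes
    n = p * q                      # the product is easy to compute

    def factor(n):
        # Naive trial division: fine for an eight-digit number, hopeless
        # for a 600-digit one, which is the whole point.
        d = 2
        while d * d <= n:
            if n % d == 0:
                return d, n // d
            d += 1
        return n, 1

    print(factor(n))               # prints (10007, 10009)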
The first computers were used primarily to manage information for large companies and perform numerical calculations for the military. Only a few visionaries saw computing as something for everyone or imagined it could become a basic service like the telephone or electric power. This failure of imagination was due in large part to the fact that the people who controlled computing in the early years weren't the ones actually programming computers. If you worked for a large corporation or industrial laboratory, then you might have had strictly limited access to a computer, but otherwise you were pretty much out of luck.
In the early years of computing, users submitted their programs to computer operators to be run in batches. You would hand the operator a stack of cards or a roll of paper tape punched full of holes that encoded your program. An operator would schedule your program to be run (possibly in the middle of the night) with a batch of other programs and at some point thereafter you would be handed a printout of the output generated by your program. You didn't interact directly with the computer and if your program crashed and produced no output, you'd have very little idea what had gone wrong.
The people who ran computer facilities were horrified at the idea of having users interact directly with their precious computers.
Programming languages come in all shapes and sizes and some of them hardly seem like programming languages at all. Of course, that depends on what you count as a programming language; as far as I'm concerned, a programming language is a language for specifying computations. But that's pretty broad and maybe we should narrow our definition to include only languages used for specifying computations to machines, that is, languages for talking with computers. Remember, though, that programmers often communicate with one another by sharing code and the programming language used to write that code can significantly influence what can or can't be easily communicated.
C, Java and Scheme are so-called general-purpose, high-level programming languages. Plenty of other programming languages were designed to suit particular purposes, among them the languages built into mathematical programming packages like Maple, Matlab and Mathematica. There are also special-purpose languages called scripting languages built into most word-processing and desktop-publishing programs that make it easier to perform repetitious tasks like personalizing invitations or making formatting changes throughout a set of documents.
Lots of computer users find themselves constantly doing routine housecleaning tasks like identifying and removing old files and searching for documents containing specific pieces of information. Modern operating systems generally provide nice graphical user interfaces to make such housecleaning easier, but many repetitive tasks are easy to specify yet tedious to carry out with these fancy interfaces.
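For instance, a few lines of script can do the sort of housecleaning just described; the directory, the one-year cutoff and the search phrase in this sketch are placeholders you'd change to suit your own files.

    # Report files older than a year and files mentioning a given phrase.
    # The directory, cutoff and phrase are placeholders; adjust to taste.
    import os, time

    ROOT = os.path.expanduser("~/Documents")
    CUTOFF = time.time() - 365 * 24 * 60 * 60   # one year ago, in seconds
    PHRASE = "digital camera"

    for dirpath, dirnames, filenames in os.walk(ROOT):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                if os.path.getmtime(path) < CUTOFF:
                    print("old file:", path)
                with open(path, errors="ignore") as f:
                    if PHRASE in f.read():
                        print("mentions '%s':" % PHRASE, path)
            except OSError:
                pass                             # skip unreadable files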
Programming languages, like natural languages, have a vocabulary (lexicon) and rules of syntax (grammar) that you have to learn in order to communicate. Just as unfamiliar grammatical conventions can make learning a new natural language difficult, unfamiliar programming-language syntax can make learning to program difficult. English speakers learning Japanese have to get used to the fact that Japanese verbs generally come at the end of the sentence. With computer languages, the problem is made worse by the fact that computers are much less adept at handling lexically and syntactically mangled programs than humans are at grasping the meaning of garbled speech.
If you want to talk with computers, however, you're going to have to learn a programming language. Just as you learn new natural languages to communicate with other people and experience other cultures, you learn a programming language to communicate with computers and other programmers and to express computational ideas concisely and clearly. The good news is that learning one programming language makes it a lot easier to learn others.
When you start learning to program, you may find yourself consumed with sorting out the lexical and syntactic minutiae of the programming language. You'll have to look up the names of functions and operators and memorize the particular syntax required to invoke them correctly. You may end up spending obscene amounts of time tracking down obscure bugs caused by misplaced commas or missing parentheses.
While writing the previous chapter, I got to thinking about concepts in computer science that connect the microscopic, bit-level world of logic gates and machine language to the macroscopic world of procedures and processes we've been concerned with so far. In listing concepts that might be worth mentioning, I noticed that I was moving from computer architecture, the subdiscipline of computer science concerned with the logical design of computer hardware, to operating systems, the area dealing with the software that mediates between the user and the hardware.
In compiling my list, I was also struck by how many “computerese” terms and phrases have slipped into the vernacular. Interrupt handling (responding to an unexpected event while doing something else) and multitasking (the concurrent performance of several tasks) are prime examples. The common use of these terms concerns not computers but human information processing. I don't know what you'd call the jargon used by psychologists and cognitive scientists to describe how humans think. The word “mentalese” is already taken: the philosopher Jerry Fodor postulates that humans represent the external world in a “language of thought” that is sometimes called “mentalese.” Fodor's mentalese is more like machine language for minds. I'm interested in the language we use to describe how we think, how our thought processes work – a metalanguage for talking about thinking.
One consequence of inexpensive computer memory and storage devices is that much less gets thrown out. People who normally wouldn't characterize themselves as packrats find themselves accumulating megabytes of old email messages, news articles, personal financial data, digital images, digital music in various formats and, increasingly, animations, movies and other multimedia presentations. For many of us, digital memory serves to supplement the neural hardware we were born with for keeping track of things; the computer becomes a sort of neural prosthetic or memory amplifier.
However reassuring it may be to know that every aspect of your digital lifestyle is stored on your computer's hard drive, storing information doesn't do much good if you can't get at what you need when you need it. How do you recall the name of the restaurant your friend from Seattle mentioned in email a couple of years back when she told you about her new job? Or perhaps you're trying to find the recommendation for a compact digital camera that someone sent you in email or you saved from a news article. It's tough remembering where you put things and you'd rather not look through all your files each time you want to recall a piece of information.
In 1999, when NASA launched the first of its Earth Observing System (EOS) satellites, it knew it would have to do something with the terabytes (a terabyte is a trillion bytes) of data streaming down from these orbiting observers.
With all my mumbo-jumbo about conjuring up spirits and casting spells, it's easy to lose track of the fact that computers are real and there is a very precise and concrete connection between the programs and fragments of code you run on your computer and the various electrical devices that make up the hardware of your machine. Interacting with the computer makes the notions of computing and computation very real, but you're still likely to feel shielded from the hardware – as indeed you are – and to be left with the impression that the connection to the hardware is all very difficult to comprehend.
For some of you, grabbing a soldering iron and a handful of logic chips and discrete components is the best path to enlightenment. I used to love tinkering with switching devices scavenged from the local telephone company, probing circuit boards to figure out what they could do and then making them do something other than what they were designed for. Nowadays, it's easier than ever to “interface” sensors and motors to computers, but it still helps to know a little about electronics even if you're mainly interested in the software side of things.
I think it's a good experience for every computer scientist to learn a little about analog circuits (for example, build a simple solid-state switch using a transistor and a couple of resistors) and integrated circuits for memory, logic and timing (build a circuit to add two binary numbers out of primitive logic gates).
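For readers who would rather simulate than solder, here is a software sketch of that last exercise: a full adder built from nothing but AND, OR and XOR gates, chained to add two four-bit numbers. The four-bit width is an arbitrary choice of mine.

    # A full adder built from primitive logic gates, chained into a 4-bit
    # adder: simulating in software what the soldering-iron version does
    # in hardware.
    def AND(a, b): return a & b
    def OR(a, b):  return a | b
    def XOR(a, b): return a ^ b

    def full_adder(a, b, carry_in):
        # Two half adders plus an OR gate make a full adder.
        s1 = XOR(a, b)
        total = XOR(s1, carry_in)
        carry_out = OR(AND(a, b), AND(s1, carry_in))
        return total, carry_out

    def add4(x_bits, y_bits):
        # x_bits and y_bits are lists of four bits, least significant first.
        carry, result = 0, []
        for a, b in zip(x_bits, y_bits):
            total, carry = full_adder(a, b, carry)
            result.append(total)
        return result + [carry]

    # 6 + 3 = 9, with bits written least significant first.
    print(add4([0, 1, 1, 0], [1, 1, 0, 0]))    # [1, 0, 0, 1, 0]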