We've all seen it before: a program suddenly stops working, possibly causing you to lose a lot of unsaved data.

Why does this happen?

The simplistic but probably not very satisfying answer is that there was an error in the program.

There are 3 fundamental types of computer program errors, distinguished by when and how the error is discovered: compile-time errors, runtime errors, and "silent" errors.  Runtime errors are the ones that result in the crashes you see as a user.

To illustrate the difference, let's pretend you have a robot that you want to program to fill up the gas tanks on your 5 cars.  A program is essentially a set of instructions for a computer to follow.  Let's say your instructions are the following (yes, the typos are intentional):

  1. Take the keys, they're color-coded so each key works on the car of the same color.
  2. Start the car.
  3. Drive 3 blcks nort
  4. Stop to the left of the pump
  5. Put the nozzle in the gas receptacle on the right side of the car
  6. Start pumping
  7. Pay for the gas
  8. Drive back
  9. Start the next car and repeat from step 2 until all the cars are filled

Compile-time errors are essentially syntax errors.  The analogue in natural languages is a spelling or grammar error.  It's something that the computer can tell is incorrect just by looking at it, and it will inform the programmer that it has no idea what they're talking about.

In simplistic terms, "compiling" is the act of translating the code that a programmer wrote into something the computer understands.  Contrary to what some non-programmers may think, computers do not understand code as written; it must be translated first.  When the compiler doesn't know how to translate something, it gives a compiler error, which the programmer sees right away and can fix.

In the example above, the robot's compiler would say "There's an error on line 3, I don't know what 'blcks' and 'nort' mean."  That's simple enough to fix.  This is the most desirable type of error, because it's noticed before any damage is done.
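To make this concrete, here is a minimal sketch in Python.  It's only an analogy: Python's built-in compile() translates source text without running it, so a syntax mistake is reported before any instruction is followed.  The robot scripts and the names check_instructions, start_car and drive are made up for illustration.

```python
def check_instructions(source):
    """Try to translate the source without running it.

    Returns None if translation succeeds, or a short message describing
    the first syntax error found.
    """
    try:
        compile(source, "<robot-instructions>", "exec")
    except SyntaxError as err:
        return f"There's an error on line {err.lineno}: {err.msg}"
    return None


# Hypothetical robot scripts.  start_car and drive are never defined or
# called, because compile() only translates the text; it does not execute it.
good_script = 'start_car()\ndrive(3, "north")\n'
bad_script = 'start_car()\ndrive 3 blcks nort\n'

print(check_instructions(good_script))
print(check_instructions(bad_script))
```

The second script is rejected before the robot ever starts a car, which is exactly why this kind of error is the cheapest to fix.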

Let's fix that error and go on to the next type of error.

  1. Take the keys, they're color-coded so each key works on the car of the same color.
  2. Start the car.
  3. Drive 3 blocks north
  4. Stop to the left of the pump
  5. Put the nozzle in the gas receptacle on the right side of the car
  6. Start pumping
  7. Pay for the gas
  8. Drive back
  9. Start the next car and repeat from step 2 until all the cars are filled

A runtime error is one that is only discovered when trying to follow the instructions.  You of course want to test your robot's program, so you ride along for the first car.  Everything goes fine, so you leave and have the robot finish up.

The 3rd car has the fuel door on the left side of the car.  When the robot gets to this point, it can't do what you told it to do ("Put the nozzle in the gas receptacle on the right side of the car") because there is no receptacle on the right side.  It will stop everything and essentially "crash".  Programs will sometimes have a way to send an error report to the programmer describing what the program was trying to do when it failed.
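Here is a rough sketch of that crash in Python.  The Car class, the car colors, and the fill_up and fill_all functions are all invented for this example.  The try/except at the bottom just stands in for the crash report; in a real program the error would usually go uncaught and end the program.

```python
from dataclasses import dataclass


@dataclass
class Car:
    color: str
    fuel_door_side: str  # "left" or "right"
    tank_full: bool = False


def fill_up(car):
    # The instructions only cover a fuel door on the right side.
    if car.fuel_door_side != "right":
        raise RuntimeError(
            f"no gas receptacle on the right side of the {car.color} car"
        )
    car.tank_full = True


def fill_all(cars):
    for car in cars:
        fill_up(car)  # an error here stops the whole loop


cars = [
    Car("red", "right"),
    Car("blue", "right"),
    Car("green", "left"),  # the 3rd car: fuel door on the left
    Car("white", "right"),
    Car("black", "right"),
]

try:
    fill_all(cars)
except RuntimeError as err:
    # Stand-in for the crash report a real program might send.
    print(f"Crashed: {err}")

print([car.tank_full for car in cars])
```

The first two cars get filled, and everything from the 3rd car onward is left untouched because the robot stopped.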

As extreme as it sounds to completely stop everything, it's often the safest thing to do in an unexpected situation.  A program is incapable of "thinking" for itself and can only do what it is instructed to do.  It could be instructed to ignore anything it can't do, but that leads us to the 3rd type of error.

"Silent" errors are the ones where the program continues executing despite something going wrong.  It could be the case that the programmer haphazardly tells the program to ignore all errors and skip to the next step.  In this situation, as you can see, it would be bad to not know an error happened.  The robot would try to "Put the nozzle in the gas receptacle on the right side of the car" but fail because the tank is on the other side.

Now what happens?  Well, the robot starts pumping.  That goes fine; the gas just doesn't go in the car, but the robot wasn't asked to make sure the gas goes in the car.  Pay for the gas: sure, it's on the floor, but it was still pumped, so now it's paid for.  Drive back, and no one is the wiser.
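A sketch of the "ignore all errors" version in Python might look like the following.  Again, the Car class and the function names are hypothetical; the important part is the bare "except Exception: pass", which swallows the failure so the remaining steps run anyway.

```python
from dataclasses import dataclass


@dataclass
class Car:
    color: str
    fuel_door_side: str  # "left" or "right"
    tank_full: bool = False


def insert_nozzle(car):
    if car.fuel_door_side != "right":
        raise RuntimeError(
            f"no gas receptacle on the right side of the {car.color} car"
        )


def fill_all_ignoring_errors(cars):
    tanks_paid_for = 0
    for car in cars:
        nozzle_in_tank = False
        try:
            insert_nozzle(car)
            nozzle_in_tank = True
        except Exception:
            pass  # the haphazard "ignore all errors" instruction

        # Start pumping regardless; gas only reaches the tank if the
        # nozzle actually went in.
        if nozzle_in_tank:
            car.tank_full = True

        # Pay for the gas either way; it was pumped, after all.
        tanks_paid_for += 1
    return tanks_paid_for


cars = [
    Car("red", "right"),
    Car("blue", "right"),
    Car("green", "left"),  # gas ends up on the floor
    Car("white", "right"),
    Car("black", "right"),
]

paid = fill_all_ignoring_errors(cars)
empty = [car.color for car in cars if not car.tank_full]
print(paid, empty)
```

The robot reports paying for every car, and nothing in its output hints that one tank is still empty unless someone goes and checks.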

Ultimately, program crashes are almost always due to programming errors (though sometimes they are due to hardware or operating system errors), but a crash at least lets the programmer know that there was in fact an error.  As annoying as a crash may be, it is often the lesser of two evils.

As for why there are errors in programs, it's ultimately the fact that humans are fallible and programming is inherently complicated.  If you actually see the program error reports, you will sometimes see an error referred to as an "Unhandled Exception".  The reason is that the errors are almost always "exceptional" circumstances that the programmer didn't think to account for, or something that should never happen unless something went wrong.

In our previous example, we could add a step after filling up to make sure the gas meter shows the tank is full.  If it doesn't, we have an exceptional situation which, if the programmer didn't account for it, would be "unhandled".  Usually, the earlier the error is detected, the easier it is to fix.
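As a small sketch of that extra check in Python: the names TankNotFullError, pump_gas, fill_and_verify and fill_with_handler are made up, and the tank level is simplified to a number between 0.0 (empty) and 1.0 (full).  The "try the other side" recovery is just one assumed way a programmer might handle the situation.

```python
class TankNotFullError(Exception):
    """Raised when the gas meter does not show a full tank after pumping."""


def pump_gas(tank_level, nozzle_in_tank):
    # Gas only reaches the tank if the nozzle is actually in it.
    return 1.0 if nozzle_in_tank else tank_level


def fill_and_verify(tank_level, nozzle_in_tank):
    new_level = pump_gas(tank_level, nozzle_in_tank)
    # The added step: check the meter instead of assuming success.
    if new_level < 1.0:
        raise TankNotFullError(f"meter shows {new_level:.0%} after pumping")
    return new_level


def fill_with_handler(tank_level, nozzle_in_tank):
    try:
        return fill_and_verify(tank_level, nozzle_in_tank)
    except TankNotFullError:
        # Handled: assume the fuel door is on the other side and retry.
        return fill_and_verify(tank_level, nozzle_in_tank=True)


print(fill_with_handler(0.2, nozzle_in_tank=False))
```

Calling fill_and_verify(0.2, nozzle_in_tank=False) directly, with nothing around it to catch the error, is the "unhandled" case: the exception travels all the way up and the program crashes.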

The difference between errors in programs and errors in many other fields is how noticeable the errors are.  If a waiter gets an order wrong once, it affects that one person's order (and possibly the restaurant's reputation, but often it is simply forgotten).  If a program has an error, it affects everyone who uses the program.  In a program with millions of users, each mistake is that much more costly.