In any discussion about computers, an understanding of numbers becomes essential at some point, since a computer works numerically. Even the letters of the alphabet are represented by numbers, enabling them to be processed and manipulated in the same way as a number. So before proceeding further a few things need explaining. A computer can only understand ones and noughts, i.e. the electrical states ON and OFF. But a long string of these can make your mind whirl, and it is much easier to have something shorter. Suffice it to say there are ways to represent a binary number in an easier-to-read form.
The binary system (base 2) uses only ones and noughts, so there are only two digits. It is the system most easily adapted to computer circuitry and is thus used internally in most computers. In computer jargon a binary digit is known as a bit. As with all number systems it is positional, i.e. a symbol in one position has a different meaning than if the same symbol were in a different position. (In decimal 0002 has a different meaning to 0020, for example.) The right hand digit is the unit ($2^0$, two to the power zero) since the base is two. At first sight binary needs more digits to represent a certain value than, say, decimal with its higher base. But before looking at this we shall see how to convert between systems.
In any system the right hand digit has the weighting of one unit. In the decimal system (with which we are most familiar) this is 1. The next digit to the left has a weighting of ten. For each shift to the left the weighting increases by a factor of ten (note: this is the base or RADIX). Thus $345_{10}$ means 5 units plus 4 tens plus 3 hundreds. If this were binary the subscript 10 would be replaced by a 2. The least significant digit (LSD) is the one furthest to the right while the most significant digit (MSD) is the one furthest to the left.
$100_{10} = 64_{16} = 1100100_2$
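The positional weighting described above can be checked with a short Python sketch. The helper name `positional_value` is an assumption made for illustration only; it simply adds up each digit multiplied by the base raised to the power of its position.

```python
def positional_value(digits, base):
    """Sum digit * base**position, counting positions from the right hand digit."""
    total = 0
    for position, digit in enumerate(reversed(digits)):
        total += digit * base ** position
    return total

# Three ways of writing one hundred: decimal 100, hex 64, binary 1100100.
decimal_form = positional_value([1, 0, 0], 10)
hex_form = positional_value([6, 4], 16)
binary_form = positional_value([1, 1, 0, 0, 1, 0, 0], 2)
print(decimal_form, hex_form, binary_form)  # 100 100 100
```

Each digit list is written MSD first, as it would be on paper.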
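To convert a number, take the original number and, working in its own base, divide it repeatedly by the new base, writing down the remainder generated by each division. The remainders are the digits of the number in the new base: the first remainder is the least significant digit and the last is the most significant.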
Example: Convert $1023_{10}$ into base 2, 5 and 7 numbers.
| Divide by 2 | Remainder | Divide by 5 | Remainder | Divide by 7 | Remainder |
|---|---|---|---|---|---|
| 2 ) 1023 | - | 5 ) 1023 | - | 7 ) 1023 | - |
| 2 ) 511 | 1 (LSD) | 5 ) 204 | 3 (LSD) | 7 ) 146 | 1 (LSD) |
| 2 ) 255 | 1 | 5 ) 40 | 4 | 7 ) 20 | 6 |
| 2 ) 127 | 1 | 5 ) 8 | 0 | 7 ) 2 | 6 |
| 2 ) 63 | 1 | 5 ) 1 | 3 | 0 | 2 |
| 2 ) 31 | 1 | 0 | 1 | - | $= 2661_7$ |
| etc | $= 1111111111_2$ (check this for yourself) | - | $= 13043_5$ | - | - |
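The method of continual division can be sketched in Python as follows. The function name `to_base` and the `DIGITS` string are assumptions for illustration, and the sketch only handles non-negative whole numbers in bases 2 to 16.

```python
DIGITS = "0123456789ABCDEF"

def to_base(number, base):
    """Convert a non-negative integer to a digit string by repeated division."""
    if number == 0:
        return "0"
    remainders = []
    while number > 0:
        number, remainder = divmod(number, base)
        remainders.append(DIGITS[remainder])  # the first remainder is the LSD
    return "".join(reversed(remainders))      # so read them back MSD first

for base in (2, 5, 7):
    print(base, to_base(1023, base))
# 2 1111111111
# 5 13043
# 7 2661
```

The remainders are collected LSD first and reversed at the end, which mirrors reading the remainder column of the worked example from the bottom up.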
As you can see the larger the base the easier it is to work with and recognise a number. However, problems arise in any base greater than ten as we shall see below. Because we are not familiar with working in any system except decimal, the conversion from one non-decimal base to another is best carried out by adding the weighted value of each digit in decimal and then carrying out a second conversion by continual division (as shown above).
Example: Convert $1234_5$ to base 9
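| Digit of $1234_5$ | Weighted value | Decimal |
|---|---|---|
| 4 | $4 \times 5^0$ | 4 |
| 3 | $3 \times 5^1$ | 15 |
| 2 | $2 \times 5^2$ | 50 |
| 1 | $1 \times 5^3$ | 125 |
| - | Total | $194_{10}$ |

| Divide by 9 | Remainder |
|---|---|
| 9 ) 194 | - |
| 9 ) 21 | 5 |
| 9 ) 2 | 3 |
| 0 | 2 |

so $1234_5 = 235_9$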
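The two-stage conversion (weighted sum into decimal, then continual division) might be sketched like this. The function names are hypothetical, and the sketch assumes both bases are 10 or less so that every digit is an ordinary numeral.

```python
def digits_to_decimal(text, base):
    """Add up the weighted value of each digit, working from the right."""
    total = 0
    for position, char in enumerate(reversed(text)):
        # int(char, base) also rejects digits that are not valid in this base
        total += int(char, base) * base ** position
    return total

def decimal_to_digits(number, base):
    """Continual division, collecting remainders from LSD to MSD."""
    digits = ""
    while number > 0:
        number, remainder = divmod(number, base)
        digits = str(remainder) + digits
    return digits or "0"

def convert(text, old_base, new_base):
    """Go from one non-decimal base to another by way of decimal."""
    return decimal_to_digits(digits_to_decimal(text, old_base), new_base)

print(digits_to_decimal("1234", 5))  # 194
print(convert("1234", 5, 9))         # 235
```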
Binary numbers, especially large ones, are awkward to visualise and difficult for us to work with. If we group the bits (binary digits) in a binary number into threes, starting from the least significant bit (LSB), then each group can be given a value between nought and seven and represents one digit of a number to the base of eight.
e.g.

| Binary groups | 11 | 011 | 100 |
|---|---|---|---|
| Octal digits | 3 | 3 | 4 |

so $11011100_2 = 334_8$
This octal system is used by some computer manufacturers but most use the hexadecimal (HEX) system instead. Here the bits are grouped in fours (nibbles) from the LSB and each group has a value between 0 and $15_{10}$ (base 16). Eight bits are termed a byte and two bits a gulp. How, then, are we to represent numbers greater than 9 since we have run out of numerical symbols? The answer is to use alphabetical symbols, i.e. A to F.
e.g.

| Binary groups | 1101 | 1100 |
|---|---|---|
| Hex digits | D | C |

so $11011100_2 = \mathrm{DC}_{16}$
This neatly gets round a number of problems which are beyond the scope of this book, but some of the reasons for using hex will be seen in later chapters. It is also more convenient for those programming in machine code (i.e. instructing the computer in a raw form as opposed to something more like English).
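Grouping bits into threes or fours can be illustrated with a small Python sketch. The helper names `group_bits` and `regroup` are assumptions for illustration; the bit string is padded with leading noughts so that the grouping starts from the LSB.

```python
def group_bits(bits, size):
    """Split a bit string into groups of `size` bits, starting from the LSB."""
    padded_length = -(-len(bits) // size) * size   # round up to a whole group
    padded = bits.zfill(padded_length)             # pad with leading noughts
    return [padded[i:i + size] for i in range(0, len(padded), size)]

def regroup(bits, size):
    """Give each group its single digit: size 3 gives octal, size 4 gives hex."""
    return "".join("0123456789ABCDEF"[int(group, 2)]
                   for group in group_bits(bits, size))

print(group_bits("11011100", 3))  # ['011', '011', '100']
print(regroup("11011100", 3))     # 334
print(regroup("11011100", 4))     # DC
```

The same eight bits give three octal digits but only two hex digits, which is one reason hex suits byte-sized values.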
The numbers dealt with so far have been unsigned and by implication this means they are positive. It is necessary, though, to also be able to represent negative numbers. If we assign one bit to represent the sign (the sign bit) we can then represent both types of number. The usual convention is for 0 to represent positive and 1 negative. The simplest representation is for the size of the number to follow its sign. The size itself is thus always positive.
e.g. $10000010_2 = -2_{10}$
This does not always suit the computer. So a second method is usually employed: that of ones or twos complement.
The ones complement is formed by subtracting the positive magnitude representation from 111... (the number of bits is the number required for the magnitude plus one sign bit). In other words, replace 0 by 1 and 1 by 0 for each bit of the positive number, including the sign bit. The result is converted to twos complement by adding 1 to the least significant bit.
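The three representations can be compared with a short Python sketch. The function names and the default width of eight bits are assumptions for illustration, and no check is made that the value actually fits in the chosen number of bits.

```python
def sign_magnitude(value, bits=8):
    """Sign bit (0 positive, 1 negative) followed by the positive magnitude."""
    sign = "1" if value < 0 else "0"
    return sign + format(abs(value), "0{}b".format(bits - 1))

def ones_complement(value, bits=8):
    """For a negative number, swap 0 and 1 in every bit of the positive form."""
    positive = format(abs(value), "0{}b".format(bits))
    if value >= 0:
        return positive
    return "".join("1" if bit == "0" else "0" for bit in positive)

def twos_complement(value, bits=8):
    """For a negative number, add 1 to the ones complement."""
    if value >= 0:
        return format(value, "0{}b".format(bits))
    ones = int(ones_complement(value, bits), 2)
    return format((ones + 1) % (1 << bits), "0{}b".format(bits))

print(sign_magnitude(-2))   # 10000010
print(ones_complement(-2))  # 11111101
print(twos_complement(-2))  # 11111110
```

Positive numbers look the same in all three forms; only the negative ones differ.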
A computer can apparently handle information other than numbers. If you use a word-processing program, for example, to type a letter or report, you will be typing the letters of words and symbols. But we have said that the computer can only recognise binary numbers. However, we can use a coding system to represent non-numerical data and thus enable information other than numbers to be entered into the system. A secret agent will use a code to represent his or her message, so we might adopt some of the same principles, except that we need our messages to be easily understood. The most usual coding system used is ASCII (American Standard Code for Information Interchange). All systems contain codes for the decimal digits as well as alphabetic characters.
ASCII uses a 7 bit code to indicate the character or numeral and sometimes uses a leading eighth bit for error checking. Thus ASCII is capable of defining 128 different characters. Codes 0-31 are used as control characters and vary from computer to computer. They control various actions such as sounding the bell. Characters higher than 127 also vary from system to system and tend to be used for special symbols and graphics. The error (or parity) bit will be decided by the number of ones in the seven bit ASCII code and which type of parity is being used. Parity is a method of checking for a single error in transmitting data (see the section on COMMUNICATIONS). There are several different types of parity. For ODD parity the number of ones in the eight bit byte must be ODD.
e.g. G = $47_{16}$ = $01000111_2$
This has four ones and so the leading bit must be 1 to make an odd number of ones. Thus:
G = $\mathrm{C7}_{16}$ = $11000111_2$ using ODD parity
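Setting the parity bit can be sketched in Python as below. The function name `add_parity` is hypothetical, and the `odd=False` option (even parity, where the total number of ones must be even) is included as an assumption about one of the other types of parity mentioned.

```python
def add_parity(code, odd=True):
    """Return the 8 bit value for a 7 bit ASCII code with a leading parity bit."""
    ones = bin(code).count("1")
    # ODD parity: set the leading bit when the seven bits hold an even number
    # of ones. Even parity (assumed here): set it when they hold an odd number.
    needs_bit = (ones % 2 == 0) if odd else (ones % 2 == 1)
    return (code | 0x80) if needs_bit else code

g = ord("G")
print(format(g, "02X"), format(g, "08b"))                      # 47 01000111
with_parity = add_parity(g)
print(format(with_parity, "02X"), format(with_parity, "08b"))  # C7 11000111
```

A receiver would count the ones in the whole byte in the same way and treat an even count under ODD parity as a transmission error.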
| LSD\MSD | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
| 0 | NUL | DLE | SP | 0 | @ | P | ` | p |
| 1 | SOH | DC1 | ! | 1 | A | Q | a | q |
| 2 | STX | DC2 | " | 2 | B | R | b | r |
| 3 | ETX | DC3 | # | 3 | C | S | c | s |
| 4 | EOT | DC4 | $ | 4 | D | T | d | t |
| 5 | ENQ | NAK | % | 5 | E | U | e | u |
| 6 | ACK | SYN | & | 6 | F | V | f | v |
| 7 | BEL | ETB | ' | 7 | G | W | g | w |
| 8 | BS | CAN | ( | 8 | H | X | h | x |
| 9 | HT | EM | ) | 9 | I | Y | i | y |
| A(10) | LF | SUB | * | : | J | Z | j | z |
| B(11) | VT | ESC | + | ; | K | [ | k | { |
| C(12) | FF | FS | , | < | L | \ | l | \| |
| D(13) | CR | GS | - | = | M | ] | m | } |
| E(14) | SO | RS | . | > | N | ^ | n | ~ |
| F(15) | SI | US | / | ? | O | _ | o | DEL |
Note: $95_{10}$ ($\mathrm{5F}_{16}$) can be _ (the underline character), and the hash (#) character can be made to appear as a £, depending on how you set your system up.
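To find a character in the table, its code can be split into the most and least significant hex digits. The following Python sketch, with the hypothetical helper `table_position`, shows the idea using Python's built-in `ord` to obtain the code.

```python
def table_position(char):
    """Return the (MSD, LSD) hex digits that locate a character in the table."""
    code = ord(char)
    if code > 127:
        raise ValueError("not a 7 bit ASCII character")
    msd, lsd = divmod(code, 16)   # column (MSD) and row (LSD) of the table
    return format(msd, "X"), format(lsd, "X")

for char in ("G", "a", "0", " "):
    print(repr(char), ord(char), table_position(char))
# 'G' 71 ('4', '7')
# 'a' 97 ('6', '1')
# '0' 48 ('3', '0')
# ' ' 32 ('2', '0')
```

So G is found in column 4, row 7, which agrees with its hex code of 47.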
| Code | Meaning | Code | Meaning |
|---|---|---|---|
| NUL | null or nil | SOH | start of heading |
| STX | start of text | EOT | end of transmission |
| ENQ | enquiry | ACK | acknowledge |
| BEL | bell | BS | backspace |
| HT | horizontal tabulation | LF | linefeed |
| VT | vertical tabulation | FF | formfeed |
| CR | carriage return | SO | shift out (black to red print ribbon) |
| SI | shift in (red to black print ribbon) | ETX | end of text |
| DLE | data link escape | DC | device control |
| NAK | negative acknowledge | SYN | synchronous idle |
| ETB | end of transmission block | EM | end of medium |
| CAN | cancel | SUB | substitute |
| ESC | escape | FS | file separator |
| GS | group separator | RS | record separator |
| US | unit separator | SP | space |
| DEL | delete |
It can be seen that many of these codes are for use in communications and will be referred to again in the chapter on communications. The escape character ($27_{10}$, $\mathrm{1B}_{16}$) is used to send a sequence of control commands to a printer to get it to print in different typestyles (fonts) or pitches, to control print quality, the size of print and various other effects. Backspace and delete are usually used together: for example, when you press the delete key (sometimes marked with a back arrow, ←) the cursor will move one space backwards and erase the character currently shown in that position. The term carriage return goes back to the days of typewriters, when the carriage (which could nowadays be likened to a print head) reached the right hand edge and the typist had to manually pull a handle to return the carriage to the left hand edge. A key to perform this function can be found on all computers, although it is now usually labelled "Enter", as pressing the key enters information you have typed into the computer's working memory. You may still see or hear it referred to as the return key. Word processors normally only require a carriage return at the end of a paragraph.
In passing it should be noted that some computers, especially business oriented ones, and calculators store numbers in a special format called Binary Coded Decimal (BCD), where each decimal digit is represented by its own separate binary code of four bits. There are many different versions of this, but its disadvantage is that it is inefficient: the range of values that a given number of bits can represent is much smaller than in pure binary, and when performing arithmetic the results have to be corrected.
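As a rough illustration, the simplest version of BCD (each digit stored as its plain four bit binary value, which is an assumption about which version is meant) can be sketched in Python. The function name `to_bcd` is hypothetical.

```python
def to_bcd(number):
    """Encode each decimal digit of a non-negative integer as four bits."""
    return " ".join(format(int(digit), "04b") for digit in str(number))

print(to_bcd(93))        # 1001 0011
print(format(93, "b"))   # 1011101 (pure binary, for comparison)
print(to_bcd(1023))      # 0001 0000 0010 0011
```

Eight bits of BCD can only hold 00 to 99, whereas eight bits of pure binary can hold 0 to 255, which shows the inefficiency mentioned above.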
Computers can only recognise the two states on and off which can be represented as binary digits. All alphanumeric characters can be stored as binary numbers by representing them in a code. The most common code is ASCII. For humans binary numbers are awkward to work with so the hexadecimal system is used. Various control codes can also be represented by the ASCII code.