Andrew Hewson returns to the program area of RAM to find the six bytes which are hidden after every number in a program
LAST MONTH I explained how line numbers are held in both the ZX-81 and the Spectrum and showed how they could be studied using a short routine which investigated the program area in RAM. This month I shall study the program area again but for a different purpose.
I am doing so in response to the following question from Alan Sheldon. He asks: It would appear that numbers in the program area of memory are followed by additional information which does not appear in listings. Is that so and if so why?
Sheldon is correct as can be seen by entering the Spectrum program listed in table one. The program will also work on the ZX-81 if line 15 is altered to read:
15 LET S=16509
Line 5 is a dummy line, the purpose of which is to allow the user to study the appearance of numbers in programs. When the program is RUN it looks at the contents of the first 21 bytes in the program area as I explained last month and displays them on the screen. The results for the Spectrum and the ZX-81 are shown in tables two and three respectively.
The first two bytes contain the line number (5) and the next two bytes specify the length of the remainder of the line (11 bytes). The next four bytes hold the character codes for the four items in the first line of the program: LET, A, = and 1.
The character codes vary slightly between the two machines. For example, the code for the letter 'A' is 65 on the Spectrum and 38 on the ZX-81, although the code for the keyword LET is 241 on both machines. The full list of character codes is given in appendix A of the manual supplied with each computer.
On the Spectrum the next byte contains 14. That is not the code for the end of a line, as might be expected, but instead it is described in appendix A of the manual as "number". In fact, the byte acts as a signal to the LIST and other commands to ignore the byte and the contents of the five locations which follow it. Hence there is no indication in listings of the program that those additional locations are used. The line is terminated by the next byte which contains 13 - the ENTER character.
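The layout just described can be sketched in a few lines of Python, which is used here only because it is easy to run on a modern machine. The list LINE_5 is a hand-built copy of what the Spectrum would be expected to hold for the dummy line, not a dump from a real machine, and the byte order of the line number (high byte first) and of the length (low byte first) is an assumption which the text above does not state. The names parse_line, NUMBER_MARKER and ENTER are invented for the illustration.

```python
NUMBER_MARKER = 14  # Spectrum code described as "number" in appendix A
ENTER = 13          # code which terminates a program line


def parse_line(memory):
    """Split one program line into its visible codes and hidden bytes.

    Assumes the line number is held high byte first and the length
    low byte first. A byte of 14 inside a string would confuse this
    simple sketch.
    """
    line_number = memory[0] * 256 + memory[1]
    length = memory[2] + 256 * memory[3]
    body = memory[4:4 + length]

    visible = []  # codes which would appear in a listing
    hidden = []   # five-byte forms which follow each marker
    i = 0
    while i < len(body):
        code = body[i]
        if code == NUMBER_MARKER:
            # Skip the marker and keep the five bytes after it.
            hidden.append(body[i + 1:i + 6])
            i += 6
        else:
            if code != ENTER:
                visible.append(code)
            i += 1
    return line_number, length, visible, hidden


# Hand-built bytes for "5 LET A=1" on the Spectrum (an assumption).
LINE_5 = [0, 5, 11, 0, 241, 65, 61, 49, 14, 0, 0, 1, 0, 0, 13]

print(parse_line(LINE_5))
```

The four visible codes are those for LET, A, = and 1. The marker and the five bytes which follow it, together with the final ENTER, make up the rest of the 11 bytes.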
On the ZX-81 the character codes are different but the effect is the same. The location containing the code for '1' is followed by six "hidden" bytes, which do not appear in program listings.
Some clue as to the purpose of those hidden bytes can be gained by replacing line 5, the dummy line, by another line. Try, for example, RUNning the program with
5 LET A=2.7
as the dummy line. The characters for the number "2.7" occupy three bytes, not one as for the number "1", but again the number is followed by six hidden bytes. A few minutes' experimentation will show that whenever a number appears within a program six hidden bytes follow.
The reason for the use of the hidden bytes is that the ZX-81 and the Spectrum do not store and manipulate numbers in the character form in which they are displayed. They are converted into a "calculation" format and all additions, multiplications and so on are undertaken on the numbers in this format. When the result of a calculation is PRINTed it must be converted into characters for display on the screen. Similarly, the character form of a number entered by the user must be converted to the calculation format before a calculation can be executed.
All such conversions take time. To accelerate the execution of programs the conversion to calculation format is undertaken immediately a number in a program line is entered from the keyboard. The resulting five-byte form is stored in the hidden bytes. The use of this technique enables a considerable saving to be made in the time taken to execute a program, particularly if numbers are included within FOR loops, in which case the same conversion would otherwise be undertaken many times. Of course, the time taken to deal with a program line entered from the keyboard is lengthened but not to an unacceptable extent.
The next question, from Hugo Cassidy, follows from the previous one. He asks: Can you explain the method of encoding numbers on the Spectrum?
Before explaining the form of encoding used it is useful to explain why it is necessary to encode numbers. The decimal system of counting has become universally established for everyday purposes because people have 10 fingers and thumbs so we can conveniently count in tens, hundreds, thousands and higher powers of 10. Digital computers, however, count using bits which can be in one of only two states. It is as if they had many hands but each hand had only two fingers. Therefore they can count conveniently in twos, fours, eights, sixteens and higher powers of two. The primary reason for encoding is to convert decimal numbers to binary.
Unfortunately binary, written as a string of zeros and ones, is cumbersome for mere human beings to handle because large numbers of digits are often required. The number 1,000, for example, occupies ten digits when converted to binary. In the ZX-81 and the Spectrum the bits are grouped in bytes containing eight bits each so that the computers can hold a single positive integer number in the range 0 to 255 decimal in each memory location.
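As a small illustration of the conversion, the Python sketch below builds the binary digits of a whole number by repeated division by two. The function name to_binary is invented for the example. It shows that 255 is the largest number which fits into the eight bits of one byte.

```python
def to_binary(n):
    """Return the binary digits of a non-negative whole number."""
    if n == 0:
        return "0"
    digits = ""
    while n > 0:
        # The remainder on division by two is the next digit,
        # working from the right-hand end of the number.
        digits = str(n % 2) + digits
        n //= 2
    return digits


for number in (200, 255, 256):
    print(number, to_binary(number), len(to_binary(number)), "digits")
```

The numbers 200 and 255 each need eight digits and so fit into a single byte, whereas 256 needs nine and must be spread over two.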
Hence it is usually convenient to consider bytes to be the fundamental unit of memory and ignore the constituent bits. Hexadecimal notation - numbers written in base sixteen - is often used to represent bytes because only two hexadecimal digits are required. I think, however, that most readers have enough trouble understanding decimal-to-binary conversions without introducing a further complication. I shall therefore continue to use the decimal version.
Given that it is necessary to convert numbers from decimal to binary, it is logical to use a binary format which is efficient and therefore fast for the computer to use. Two separate formats are used on the Spectrum, a special format for integers, or whole numbers, lying in the range -65535 to 65535 and a floating point format for all other numbers. The ZX-81 uses the floating point format only.
The integer format is the simplest to understand and so I shall explain it first. A suitable number, N, is converted to the five-byte form by setting the first and fifth bytes to zero and using the second byte to indicate the sign of the number, 0 for positive, 255 for negative. If the number is positive the value is stored in the third and fourth byte as:
Third byte = N-256*INT (N/256)
Fourth byte = INT (N/256)
If N is negative the two bytes contain:
Third byte = 65536+N-256*INT ((65536+N)/256)
Fourth byte = INT ((65536+N)/256)
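The integer format can be tried out with the following Python sketch. It is an illustration of the rules given above rather than a copy of the ROM routine, and the function name encode_small_integer is invented. For a negative number the value placed in the third and fourth bytes is taken to be 65536 less the size of the number, ignoring its sign.

```python
def encode_small_integer(n):
    """Return the five-byte integer form of n as a list of bytes."""
    if n != int(n) or not -65535 <= n <= 65535:
        raise ValueError("number is not suitable for the integer format")
    n = int(n)

    sign = 0 if n >= 0 else 255                 # second byte
    value = n if n >= 0 else 65536 - abs(n)     # always 0 to 65535

    third = value - 256 * (value // 256)        # low byte
    fourth = value // 256                       # high byte
    return [0, sign, third, fourth, 0]          # first and fifth are zero


for number in (47, 1000, -1):
    print(number, encode_small_integer(number))
```

The number 47 gives 0,0,47,0,0 and 1000 gives 0,0,232,3,0 because 1000 = 232 + 3*256. The number -1 gives 0,255,255,255,0.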
The principal advantage of the use of integer format is that for positive integers the third and fourth bytes are in the form the Z-80A microprocessor uses when addressing locations in memory. Commands such as PEEK and POKE are executed much faster than they would otherwise be if the more complex floating point form were used to store the addresses to which they refer. The format also enables the calculator routines in the ROM to execute much more quickly when calculations involving integers only are performed.
The program in table one can be used to inspect the positive integer form by varying the first line. For example, entering:
5 LET A=47
will show that 47 is held as 0,0,47,0,0. The negative version cannot be inspected using this program because all numbers are stored in their positive form in the hidden bytes. If a number is preceded by a negative sign it is negated when the line is executed.
The program in table four gives the five-byte form of any number, positive or negative, entered from the keyboard. The program PRINTs the contents of the first item in the variables area, that is the number N entered by the user from the keyboard, because it is the first variable declared in the program. Note that the program should be initiated by entering RUN rather than GO TO 10 because RUN causes the variables area to be CLEARed, thus ensuring that N is the first variable.
The floating point form is designed to provide the computer with a systematic method of retaining as much accuracy as possible in any given calculation. Some numbers cannot be completely specified in decimal form. The fraction one-third in decimal form consists of 0.3 followed by an infinite number of threes so that expressing it as 0.3333, for example, is almost, but not exactly, correct. The same problem occurs when binary arithmetic is used.
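The difficulty with binary fractions can be seen from the Python sketch below, which produces the digits after the binary point by repeated doubling. The function name binary_places is invented for the example, and exact fractions are used so that the demonstration does not itself suffer from rounding.

```python
from fractions import Fraction


def binary_places(value, places):
    """Return the first binary places of a fraction between 0 and 1.

    The second item returned is the residue left over. A residue of
    zero means the fraction has been specified completely.
    """
    digits = ""
    for _ in range(places):
        value *= 2
        if value >= 1:
            digits += "1"
            value -= 1
        else:
            digits += "0"
    return digits, value


print(binary_places(Fraction(1, 4), 8))   # terminates
print(binary_places(Fraction(1, 3), 8))   # repeats for ever
print(binary_places(Fraction(1, 10), 8))  # also repeats
```

One-quarter is exact after two places. One-third gives the repeating pattern 0101... and leaves a residue of one-third however many places are taken. One-tenth, which is exact in decimal, also fails to terminate in binary.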
The solution is to retain only the most significant digits at each stage in a calculation. Provided more significant digits are retained than are required in the answer then in all but the most exceptional circumstances the calculated result will be accurate enough for practical purposes.
The program listed in table five calculates and PRINTs the floating point form of a number entered by the user. The line numbers have been set so that it can be placed in memory at the same time as the inspection program in table four. By entering the same number into both programs the user will see that the calculation is correct.
The program has two parts. The first stores the sign, S, of the number, X, entered by the user. It then multiplies the absolute value of X successively by 2 until the result is at least 2 raised to the power 31, or 2147483648. The number of multiplications executed is stored in N. The new value of X then lies necessarily between 2 to the power 31 and 2 to the power 32 and so its integer part occupies exactly 32 bits, of which the most significant is always 1 and need not be stored. Thus by discarding the fractional residue the number can be stored in four bytes, each containing eight bits, with the most significant bit free to hold the sign of the number. The four bytes together are called the mantissa.
The second part of the program calculates the values held in each of the four bytes and stores them in the variables A, B, C and D and then PRINTs the variables. An adjustment is made to the value of A depending on the sign of the original number. In effect A is less than 128 for positive numbers and greater than or equal to 128 for negative numbers.
The remaining byte of the floating point form, which is held in memory ahead of the four mantissa bytes, is used to store the exponent, that is the number of times that the mantissa must be divided or multiplied by 2 to place the binary point in the correct position in the number. The program calculates the exponent by subtracting N, the number of multiplications made originally, from 160. As a result numbers of one-half or more have exponents greater than or equal to 128 and smaller numbers have exponents less than 128.
If a number with an absolute value of 4294967296, or 2 raised to the power 32, or more is entered into the program in table five the result will be incorrect because there is no provision for successive division by 2 to yield a number in the required range. It is easy to adjust the program to perform such successive divisions but that is left as an exercise for the reader.
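For readers who wish to experiment away from the computer, the method described for table five can be sketched in Python as follows. It is a reconstruction from the description in the text and not a translation of the listing itself. The names floating_point_form and FloatForm are invented, and the exponent is taken to be 160 less the number of multiplications. Like the program it follows, the sketch makes no provision for numbers of 2 to the power 32 or more. It also rejects zero, for which the doubling would never end.

```python
from collections import namedtuple

# The exponent and the four mantissa bytes, named as in the text.
FloatForm = namedtuple("FloatForm", "exponent a b c d")


def floating_point_form(x):
    """Return the exponent and mantissa bytes A, B, C and D for x."""
    if x == 0 or abs(x) >= 2 ** 32:
        raise ValueError("number is outside the range handled here")

    negative = x < 0
    x = abs(x)

    # First part: double x until it is at least 2 to the power 31.
    n = 0
    while x < 2 ** 31:
        x *= 2
        n += 1

    # Second part: discard the fractional residue and split the
    # 32-bit integer into four bytes.
    mantissa = int(x)
    a = mantissa // 2 ** 24
    b = mantissa // 2 ** 16 % 256
    c = mantissa // 2 ** 8 % 256
    d = mantissa % 256

    # The top bit of A is always set at this point, so it is
    # replaced by the sign: clear for positive, set for negative.
    if not negative:
        a -= 128

    return FloatForm(160 - n, a, b, c, d)


for number in (1, -1, 0.5, 10):
    print(number, floating_point_form(number))
```

The number 1 gives an exponent of 129 with A, B, C and D all zero, and -1 differs only in that A is 128. The number 0.5 gives an exponent of 128 and 10 gives an exponent of 132 with A equal to 32. The Spectrum itself would hold whole numbers such as 1 and 10 in the integer format, so the results for those apply to the floating point format only.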