helpline |
Andrew Hewson considers the basics of machine code
HAVING RECEIVED a number of letters asking about the fundamental ideas of machine code programming, I have devoted most of the column to the topic. John Stevens writes:
I am trying to learn how to write machine code programs but I am finding it difficult to understand the meaning of some of the words which are used. Can you explain as fully as possible what is the difference between a bit and a byte, and between a register and a variable?
A bit is the fundamental building block of computer memory and can exist in only one of two states. The two states can be thought of as representing ON or OFF; TRUE or FALSE; YES or NO; UP or DOWN; MALE or FEMALE or any other pair of logically opposite conditions. The mechanism by which a computer memory works is not really important to us but in the Sinclair computers the state of a bit is memorised by setting a microscopic solid state switch either ON or OFF as appropriate.
The usual notation is to think of one state as the ZERO state and the other as the ONE state. A bit is considered to be set when it is in the state representing ONE and to be reset otherwise. That notation allows us to speak of a given pattern of bits in terms of its binary equivalent and by converting the binary number to a decimal; each bit pattern can be given an exceptional positive integer decimal number.
For example, consider eight bits of which the right-most four are set and the left-most four are reset as illustrated in table one. The binary pattern of the eight bits can be converted to a decimal if it is remembered that, in a binary number, the right-most column is the units column; the next column to the left is the twos column; the next to the left again is the fours column and so on, doubling at each move to the left. The decimal equivalent of 00001111 is therefore:
0*128+0*64+0*32+0*16+1*8+1*4+1*2+1*1=15
Obviously it is inconvenient to refer to bits as the right-most or the third from the right and so the convention is adopted of numbering the bits from the right, starting at zero as shown in table one. When that convention is used the number of each bit is also the power to which 2 must be raised to give the value of the column. That is:
2bit number = column value
Bit 3, for example, is in the eights column because 23 = 8.
|
I chose to consider a group of eight bits together because the Z-80A microprocessor at the heart of the Sinclair computers is designed to operate on eight bits at a time. The term operates covers all the types of task which the Z80A can perform directly such as addition, subtraction, rotation, logical AND, and the like. Thus although a bit is the fundamental unit of computer memory, bits are usually manipulated together in groups of eight, so a group of eight bits is called a byte - pronounced bite.
There are 256 ways of arranging the contents of a group of eight bits. The first is 00000000, the second is 00000001, the third is 00000010. Thus each of the bytes in RAM can be used to hold a single positive whole number lying between 0 and 255 inclusive by setting or resetting the eight bits in the byte according to the binary equivalent of the number.
The Z-80A does not alter the contents of memory directly when it is executing a program; rather it copies the contents of a location in memory into one of several special locations in the microprocessor called a register and then operates on the contents of the register. The Z-80A is a powerful microprocessor because it has many registers and so it can hold several numbers at once, thereby reducing the need to make time-consuming transfers between the processor and memory.
Most of the registers have one or more special features. The most important one is the 'a' register or accumulator, so-called because the results of most arithmetic or logical instructions are accumulated in the 'a' register. Some instructions use a second register as a second source of data together with the 'a' register.
For example the instruction:
add a,b
means add the contents of the 'a' register to the contents of the 'b' register and leave the result in 'a'.
Thus a register is a dedicated location in the microprocessor which has specific attributes and functions. A variable is a location or group of locations in RAM which are used by a particular program. If the program is written in Basic or another high-level language, the variable is given a name and all references to the variable are made using the name.
The next question, from Alan Bermingham, follows from the previous one. He asks:
What do the following programs do - an assembler, a disassembler, an interpreter, a compiler?
A machine code routine consists of a sequence of instructions which the Z80A understands directly with no need for prior interpretation. The simpler instructions are held in one byte of memory but the more complicated instructions can occupy as many as four bytes.
Generally, the instructions are executed in the order in which they are encountered, although there are exceptions. The Z-80A keeps a note of from where the next instruction is to come by means of a special register pair called the program counter. Thus if the location pointed to by the program counter contains the number 128 in decimal - 80 in hexadecimal - the Z-80A will add the contents of the 'a' register to the contents of the 'b' register and leave the result in the accumulator, because 128 is the decimal machine code instruction for
add a,b
The decimal or hexadecimal codes for all the 600 or so instructions in the Z-80A instruction set are difficult to remember and so for that and other reasons machine code programs are almost always written using an assembler program. An assembler converts instructions like add a,b to the correct code. It also allows the programmer to name variables, add comments and give labels to various points in the program and to call subroutines using the labels. A good assembler will have other facilities as well, all aimed at making the programmer's job as straightforward as possible.
A disassembler performs the opposite function to an assembler; it converts a sequence of numbers into a sequence of mnemonics which are easier to understand than the original code. A list of the more important mnemonics is given in the Sinclair manuals in Appendix A. A disassembler is of use when analysing code written by somebody else to discover how it works.
The output from an assembler is a program which the microprocessor can understand directly because it consists of machine code instructions. In contrast, a program written using an interpreter, such as Sinclair Basic, is held in RAM in more or less the form in which it was entered by the programmer.
'A disassembler is of use when analysing code' |
Interpreters are high-level languages which bear little or no relationship to the instruction set of the processor on which they are run. Every time the program is executed, however, each line must be analysed by the processor before the required action can be taken. The principal disadvantage of the system is that the programs can be slow to execute, because the processor spends most of its time determining what each program line means.
A compiler circumvents the problem by analysing each program line once only and then storing a sequence of machine code instructions which are equivalent to the original program. Thus the speed of a machine code program is obtained without losing the convenience of a high-level language. The machine code produced by a compiler can be somewhat tortuous and inflexible and so when efficiency is essential an assembler is used instead.
Robert King asks: I have a program which works well on my friend's Spectrum but always crashes on mine. I suspect a fault in the RAM. Have you a program which checks each RAM location in turn?
Checking RAM involves setting every bit of every byte, checking that it remains set, and then resetting every bit and checking that it then remains reset. Setting every bit in a byte is equivalent to POKEing 255 into that byte. Similarly, resetting every bit is equivalent to POKEing in 0.
Obviously it is not possible to POKE numbers into every memory location while the machine is running, because the computer will crash, but a partial check can be made by testing every location in the spare area of memory.
The program in table two runs such a check. It can be loaded using an assembler or using the simple hexadecimal loader listed in table three. The program checks every location up to the bottom of the stack and returns the address at which it stops - about 32575 in the 16K machines and 65343 in the 48K machine when they are working correctly.
|
|
Finally, I must apologise for an error in the Kaleidoscope program which appeared in the June issue. Line 100 should read:
100 POKE S+32-31*I+32*J,K
My thanks to all those who pointed out the error.