This is a very old article written by me (around 2001, I guess) and it is just here for record.
I assume that you all know at least what the processor is all about. Then start here:
Part 1
The processor and the memory are two important parts of the computer that you need to know about when programming. I/O is also an important part, but it is sometimes merged with memory, since it can be memory mapped and accessing I/O is little different from accessing memory. Before starting on memory addressing techniques and modes, I ask you to read the tutorial on DEF SEG. Read that first, in order to understand how memory is accessed.
Now, for buses. Everybody knows that electric current can only be transported through wires. Well, wireless exists, but that wouldn’t be used in your processor or motherboard. To access memory, the processor requires a set of wires to interact with the memory unit. This set of wires is called a bus. In general, a bus is a set of wires that is used to send information to another device.
There are 3 types of buses on your Pentium processor: the address bus, the data bus and the control bus. The address bus carries the address of the memory or I/O location to all the devices in the system. This bus is 32 bits wide on a Pentium and 36 bits wide on a Pentium Pro. With a 32-bit address bus, one can access 2^32 bytes, or 4 GB, of memory; with 36 bits, one can access 2^36 bytes, or 64 GB. So, your processor can only support that much memory and nothing more. The recently released IA-64 will address 2^64 bytes, i.e. 4 GB * 4 GB (really huge).
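The arithmetic above can be checked with a few lines of Python. This is only a sketch; the function name addressable_bytes is made up for illustration, and "GB" is taken in the binary sense (2^30 bytes) that the article uses.

```python
def addressable_bytes(address_bits):
    """Number of distinct byte addresses an address bus of this width can select."""
    return 2 ** address_bits

GB = 2 ** 30  # one gigabyte, in the binary sense used in this article

for bits in (32, 36, 64):
    print(bits, "address bits ->", addressable_bytes(bits) // GB, "GB")

# 2^64 really is 4 GB * 4 GB, as stated for IA-64
print(addressable_bytes(64) == (4 * GB) * (4 * GB))
```

The first two lines printed are 4 GB and 64 GB, matching the figures for the Pentium and the Pentium Pro.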
Now, for the data bus. The data bus is used to send information between the processor and the memory or I/O. It is usually 64 bits wide on the Pentium and upwards. It carries the data that has to be sent to memory for writing, or holds the data that has been read from memory.
Now, for the control bus. After the last two paragraphs, you may have wondered how the memory and the I/O know whether memory or I/O was requested, or whether they have to read or write the data on the data bus. Here’s your answer: the control bus tells them all about it. The control bus carries signals that tell the memory and I/O all they need to know. It has wires labeled IORD, IOWR, MRD and MWR (not the official names). Suppose the processor wants some data from memory. It puts the address on the address bus and sets MRD (the Memory Read line) to zero (these lines are active when they are zero). The memory activates as a result, and the I/O ignores the signal. The memory places the data stored at that address on the data bus and then keeps quiet.
Suppose instead that data is to be written to memory. Then MWR (the Memory Write line) is activated, the address to write to is placed on the address bus, and the data to be written is placed on the data bus. The memory unit sees this and writes the data from the data bus to the address specified by the address bus. I/O follows the same method; the only difference is that IORD and IOWR are activated instead.
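The read and write sequences described above can be imitated with a toy simulation. This is a rough sketch, not a model of real hardware: the bus is just a Python dictionary, the class and function names (Device, idle_bus, bus_cycle) are invented for illustration, and only the unofficial line names IORD, IOWR, MRD and MWR come from the text. A line is active when it holds 0.

```python
class Device:
    """Toy memory or I/O unit that reacts only to its own pair of control lines."""
    def __init__(self, read_line, write_line):
        self.read_line = read_line
        self.write_line = write_line
        self.cells = {}  # address -> stored value

    def respond(self, bus):
        if bus[self.read_line] == 0:       # active low: a read was requested
            bus["data"] = self.cells.get(bus["address"], 0)
        elif bus[self.write_line] == 0:    # active low: a write was requested
            self.cells[bus["address"]] = bus["data"]
        # otherwise the request is for the other unit, so stay quiet

def idle_bus():
    """All control lines inactive (1), nothing on the address or data bus."""
    return {"address": 0, "data": 0, "IORD": 1, "IOWR": 1, "MRD": 1, "MWR": 1}

def bus_cycle(devices, line, address, data=0):
    """The processor drives one bus cycle and returns what ends up on the data bus."""
    bus = idle_bus()
    bus["address"] = address
    bus["data"] = data
    bus[line] = 0  # activate exactly one control line
    for device in devices:
        device.respond(bus)
    return bus["data"]

memory = Device("MRD", "MWR")
io = Device("IORD", "IOWR")
devices = [memory, io]

bus_cycle(devices, "MWR", 0x1234, 99)     # write 99 to memory
print(bus_cycle(devices, "MRD", 0x1234))  # read it back -> 99
print(bus_cycle(devices, "IORD", 0x1234)) # same address, but I/O space -> 0
```

Both units see every cycle, yet only the one whose control line is active does anything, which is the whole job of the control bus.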
Part 2
In the previous section, I talked about the address bus, which is used to send the address. That will be the subject of this part.
The method you suggested works for humans, since we don’t think about time when it comes to dividing a number by 4. But for constant memory or I/O read/write operations, this would be very slow. That is why another method is used.
When the processor sends an address to memory over the address bus, it takes the last 3 bits (the least significant bits) away from the address. These 3 bits decide whether the access is aligned or not. This was done to divide the memory in such a way that the byte remains the basic unit of data transfer while wider word sizes are also possible. So, the memory is divided into banks. For 8-byte aligned memory, the number of banks is 8. The last 3 bits, which are taken away from the address, specify which bank to write to, and the remaining part of the address locates the position within the banks.
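The split into a bank number and a remaining address is plain bit arithmetic. The sketch below assumes 8 banks (3 low bits), as in the text; the names split_address and is_aligned are made up for illustration.

```python
BANK_BITS = 3  # 8 banks, so the low 3 bits of the address select the bank

def split_address(address, bank_bits=BANK_BITS):
    """Return (remaining address part, bank number) for a byte address."""
    bank = address & ((1 << bank_bits) - 1)  # keep only the low bits
    rest = address >> bank_bits              # drop the low bits
    return rest, bank

def is_aligned(address, size):
    """True if the address is a multiple of size (size is a power of two)."""
    return address & (size - 1) == 0

rest, bank = split_address(0x1234)
print(hex(rest), bank)         # 0x246 4
print(is_aligned(0x1234, 4))   # True:  the last 2 bits are 00
print(is_aligned(0x1234, 8))   # False: the last 3 bits are 100
```

Notice that no division is needed: masking and shifting do the same job, and that is cheap to do in hardware.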
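How does it make the program slow? Well, consider this example. Suppose you want to write 4 bytes of data to location 1234h on a processor that has an alignment of 8 bytes. In binary, 1234h is 0001 0010 0011 0100, so the address is split as follows: the last 3 bits (100) select the bank, and the remaining bits (0001 0010 0011 0) are placed on the address bus.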
Remember, the last 3 bits are separated on an 8-byte alignment machine. Since we only need to write 4 bytes, and the above address is aligned to 4 bytes (for 4-byte alignment, only the last two bits are separated), the data transfer is done in one cycle (a bus cycle, not a clock cycle). Suppose we wanted to write 8 bytes of data at the same address. The above address is not aligned to 8 bytes (the last 3 bits are not 000, which is required for an address to be 8-byte aligned), so the processor will not be able to write all 8 bytes in one cycle, because the other 4 bytes lie in the banks at another location (try to work it out logically). You need 2 cycles for this data transfer: one for the first 4 bytes and another for the second 4 bytes (the location is changed internally). This doubles the time the instruction spends on the data transfer. Things can become even worse when the address is an odd number: it may take several cycles to write the data to memory.
As an exercise, try to determine how many cycles it will take to transfer 8 bytes of data to the location 0A6F257h. If you cannot do this and are confused, then read this again until you get it.
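The cycle counting above can be sketched in Python under a deliberately simplified model. The assumption (mine, not something a real Pentium is guaranteed to follow) is that every bus cycle moves 1, 2, 4 or 8 bytes and that each piece must start at an address that is a multiple of its own size. The function name split_into_cycles is hypothetical.

```python
def split_into_cycles(address, size, bus_width=8):
    """Split an access into naturally aligned pieces, one per bus cycle.

    Simplified model: each piece is a power of two no larger than bus_width
    (itself a power of two) and starts at a multiple of its own size.
    """
    pieces = []
    while size > 0:
        piece = bus_width
        # shrink the piece until it is aligned and fits in what is left
        while address % piece != 0 or piece > size:
            piece //= 2
        pieces.append((address, piece))
        address += piece
        size -= piece
    return pieces

print(len(split_into_cycles(0x1234, 4)))  # 1: 4 bytes at a 4-byte aligned address
print(len(split_into_cycles(0x1234, 8)))  # 2: first 4 bytes, then the second 4 bytes
print(len(split_into_cycles(0x1235, 8)))  # 4: an odd address makes things worse
```

The first two results match the walk-through above. Real hardware may split a misaligned access differently, so treat the exact counts as a property of this model only.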