HW4 - Greatest Common Divisor
This is the second of two assignments in which you will write binary code for the simple machine you created a simulator for in lab. In this assignment, we will be using our modulo code from hw3 and extending our ISA to provide better support for function calls.
Adding a Stack
Our initial machine did not directly support a stack. While we could have managed a stack in our ISA using icodes 1, 2, and 3, we would like to provide the functionality at the hardware level. To do that, we decided to introduce a new register: the stack pointer rsp. This register will contain the memory address of the top of our stack.
Aside: In our machine’s version of the stack, we will only store 8-bit values. However, in practice we will want to store data of different sizes in our stack. We will see more of this as we move to x86-64.
The (Updated) Instructions
Remember that when our reserved bit (the 7th bit) is 0, we used the following icodes:
icode | operation
---|---
0 |
1 |
2 |
3 |
4 | write
5 | do different things for different values of
6 | do different things for different values of
  | In all 4 cases, increase
7 | Compare
One more icode
Now, we will add one more icode when the reserved bit is set to 1:
icode | operation
---|---
0 | do different things for different values of
  | If
The first 2 actions (b = 0, 1) allow us to interact with the stack. When we push a value onto the stack, we decrement rsp to make space, then write the value of register A to memory at the address stored in rsp. When a value is popped off the stack, we read the value from memory at position rsp into register A and then increment rsp. Note, this does still leave the old value in memory, and that’s okay. Also note that our stack grows downward in memory.
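The push and pop behavior described above can be sketched in Python as follows. This is only a minimal model, not the code from sim_base.py: the function names push and pop, the use of module-level globals, and the 8-bit wraparound of rsp are assumptions made for illustration, and "register A" is taken to mean the register selected by the instruction's a field.

```python
# Minimal model of the stack operations; names are illustrative only.
M = [0] * 256      # 256 bytes of memory
R = [0] * 4        # four general-purpose registers
rsp = 0xFF         # stack pointer, initialized late in memory


def push(a):
    """Decrement rsp to make space, then store register a at M[rsp]."""
    global rsp
    rsp = (rsp - 1) & 0xFF   # assumption: rsp wraps around within 8 bits
    M[rsp] = R[a]


def pop(a):
    """Load M[rsp] into register a, then increment rsp."""
    global rsp
    R[a] = M[rsp]
    rsp = (rsp + 1) & 0xFF


R[0] = 0x2A
push(0)            # rsp goes from 0xFF to 0xFE, and M[0xFE] becomes 0x2A
R[0] = 0
pop(0)             # R[0] is 0x2A again, and rsp is back at 0xFF
print(hex(R[0]), hex(rsp), hex(M[0xFE]))   # the old value stays in memory
```

After the pop, M[0xFE] still holds 0x2A, which matches the note that popping leaves the old value in memory.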
The next two actions (b = 2, 3) provide additional support for function calls.
- The first (b=2) is the function call. It adds 2 to the current pc and writes that value to the stack (also updating rsp). Consider: why might we write pc+2 to the stack instead of pc? It then performs an unconditional jump to the address given as the immediate value following the instruction (i.e., the next byte of our program). This is essentially calling the function.
- The second (b=3) is the function return. It reads the top value from the stack (also updating rsp) and sets that value as the new pc. This is an unconditional jump to the address stored at the top of the stack. Use this at the end of the function to return back to the code that called it.
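One way to picture the call and return actions is the Python sketch below. It is a hedged illustration rather than the simulator's actual code: the names call and ret, the global variables, and the example addresses 0x10 and 0x40 are all made up for this example.

```python
# Illustrative model of call (b=2) and return (b=3); names are hypothetical.
M = [0] * 256
rsp = 0xFF
pc = 0x10


def call():
    """Push pc + 2 onto the stack, then jump to the immediate at M[pc + 1]."""
    global pc, rsp
    return_address = (pc + 2) & 0xFF   # address just past the call's two bytes
    rsp = (rsp - 1) & 0xFF
    M[rsp] = return_address
    pc = M[(pc + 1) & 0xFF]            # the immediate byte is the jump target


def ret():
    """Pop the top of the stack into pc."""
    global pc, rsp
    pc = M[rsp]
    rsp = (rsp + 1) & 0xFF


M[0x10] = 0x82     # call instruction at address 0x10 ...
M[0x11] = 0x40     # ... with the function's address as its immediate
M[0x40] = 0x83     # the "function" is just a return instruction

call()
pc_after_call = pc     # 0x40: now executing inside the function
ret()
print(hex(pc_after_call), hex(pc), hex(rsp))
```

In this run the call jumps to 0x40 and the return resumes at 0x12, the byte after the call's immediate.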
Aside: One thing to note here is that we are assuming that the top value of the stack, when executing instruction 83, is the return address, i.e., the address of the instruction following the call to the current function. If the called function pushes any values onto the stack, it must pop them before returning, or else we may return to an unexpected position in memory. We will see more of how the stack works, including the handling of return addresses, parameters, and return values, as we discuss x86-64 assembly in greater detail.
Function Call Syntax
We need more than just the address of our function code and the address to return when the function is complete. We also need parameters. Consider the following function call (in python-like syntax):
z = modulo(a, n)
- We need to know:
  - The location of our function modulo, i.e., the address in memory where that code starts
  - The location of where a and n are stored
  - A place to put the result, z
Our ISA will define the following calling conventions for functions. Calling conventions determine where the parameters and return values are located and also include any requirements on the registers that the function call is not using.
- Functions may have up to 2 arguments. They will be placed in registers 2 and 3, with the first parameter in register 2 and the second in register 3.
- The return value for the function is stored in register 0.
- Registers 0, 2, and 3 are considered volatile; that means that if the code calling the function wants to keep those values, it must save them to memory or push them onto the stack before calling the function (using instruction 82).
- Register 1 is considered non-volatile; that means that if the function being called wants to use register 1, it must save the value to memory (i.e., push onto the stack) and restore it (i.e., pop from the stack) before returning to the caller (using instruction 83).
Updating our example above with these conventions, our function call would look something like the following (in python syntax):
r0 = modulo(r2, r3)
Or more specifically:
r2 = a
r3 = n
jump to (address of modulo() code)
z = r0
Therefore, when we want to make a function call, we will place the parameters in registers 2 and 3 before making the call with instruction 82. We cannot assume that the values in registers 2 and 3 will be the same when the function has completed and returns. When the function completes, it will jump back (using instruction 83) to the next instruction in our code to continue execution. Immediately after the function returns, we can safely assume that the function’s return value has been stored in register 0 (overwriting the value previously stored in that register before the function call).
What about r1? What does non-volatile mean? This means that if we were using r1 before we jumped to our modulo() code, then it should still be the same afterwards. So, for example, if we wanted to pre-load our destination address 0xD0:
r1 = 0xD0
r2 = a
r3 = n
jump to (address of modulo() code)
z = r0
// r1 should still have 0xD0
It appears that the function’s code is not allowed to use register 1. But it can! It must simply stash the value away somewhere for safe-keeping, then bring it back out before the function returns. We also now have a place to store values like this: the stack. So our function can do something like:
push r1
// all the code to calculate modulo, r0 = r2 % r3,
// using r1 as needed
pop r1 // put r1's value back
return // use reserved=1, icode=0, b=3 to jump back
So remember: register 1 will have its same value after the function returns, but registers 0, 2, and 3 may have changed.
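The calling convention can be modeled in a few lines of Python. This is a sketch under assumptions, not the binary you will write: modulo_function is a hypothetical stand-in for the callee, the registers are a plain list, the stack is a Python list, and the repeated-subtraction loop assumes the second argument is greater than 0.

```python
def modulo_function(R, stack):
    """Hypothetical model of the callee: r0 = r2 mod r3, preserving r1."""
    stack.append(R[1])        # push r1: it is non-volatile, so save it first
    R[1] = R[2]               # r1 is now free to use as scratch space
    while R[1] >= R[3]:       # repeated subtraction; assumes r3 > 0
        R[1] -= R[3]
    R[0] = R[1]               # the return value goes in r0
    R[1] = stack.pop()        # pop r1: restore the caller's value
    # the real binary would finish with instruction 83 to return


# Caller side: pre-load r1, place the arguments in r2 and r3, then "call".
R = [0, 0xD0, 17, 5]
stack = []
modulo_function(R, stack)
print(R[0], hex(R[1]))        # r0 holds the result; r1 is unchanged
```

The caller may rely only on r0 (the result) and r1 (preserved) afterwards; r2 and r3 happen to be untouched here, but the convention does not promise that.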
Your tasks
This homework is divided into two separate tasks:
- Update the simulator from Lab 4 to provide the stack and new instructions
- Use the new instructions to write a binary program that computes the greatest common divisor of the input values
Updating Our Simulators
We must first update our simulators from Lab 5 to support the stack and this new instruction.
- Add the rsp register to your code. In our simulators, we will initialize the stack pointer to a location late in memory (0xFF) and create a stack that grows downward in memory. There are trade-offs for this choice (if our stack gets too big, it might overwrite our program!).
  - If you are using Java, update the following section of SimBase.java:

        // memory and registers
        public byte[] M;
        public byte[] R;
        public byte rsp = (byte) 0xFF; // new!

  - If you are using Python, update the following section of sim_base.py:

        # initialize memory and registers
        R = [0 for i in range(4)]
        M = [0 for i in range(256)]
        rsp = 0xFF # new!
- Add the new instruction defined above (when the reserved bit is 1 and the icode is 0) to your execute() method. Note: it is up to the binary that our simulators execute to properly read the function parameters from memory and write return values to register 0. Your simulator should correctly update rsp and provide the jumps.
- If the reserved bit is 1 and the icode is not 0, then set the next pc to the current pc instead of advancing it and do nothing else.
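The dispatch logic for the reserved bit might be organized along the lines of the sketch below. The bit layout used here (reserved bit on top, then a 3-bit icode, a 2-bit a, and a 2-bit b) is an assumption that is consistent with the instruction bytes 82 and 83 used in this writeup; check it against the layout from lab. The helper names decode and classify are hypothetical and do not come from sim_base.py.

```python
def decode(byte):
    """Split an instruction byte into (reserved, icode, a, b).

    Assumed layout, high bit to low bit: reserved(1) icode(3) a(2) b(2).
    """
    reserved = (byte >> 7) & 1
    icode = (byte >> 4) & 7
    a = (byte >> 2) & 3
    b = byte & 3
    return reserved, icode, a, b


def classify(byte):
    """Name the action an execute() method would take for this byte."""
    reserved, icode, a, b = decode(byte)
    if reserved == 0:
        return "original"             # handled by the existing icode 0-7 logic
    if icode != 0:
        return "stay"                 # next pc = current pc, do nothing else
    return ["push", "pop", "call", "return"][b]


for byte in (0x80, 0x85, 0x82, 0x83, 0x90, 0x70):
    print(hex(byte), decode(byte), classify(byte))
```

Separating decoding from the action keeps the new branch in execute() small: only the four b cases touch rsp, and every other byte with the reserved bit set leaves pc where it is.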
Greatest Common Divisor
Next, you will write a binary program gcd.binary using this new icode and associated operations.
- Modify your modulo code from HW3 to be used as a function in a new program. You will include the updated version of this code at the end of your program below.
  - Instead of needing to read the immediate values at bytes 0x01 and 0x03, it can assume the two values are already stored in registers 2 and 3 (respectively).
  - You should save the contents of register 1, since it should be unchanged when your function returns. Just before returning, restore the saved value to register 1.
  - Instead of storing the result at memory location 0xD0, return the result to the caller. That is, store the result in register 0 and use the return instruction (83).
- Write a new binary program that computes the greatest common divisor of the inputs (i.e., gcd((value at 0x01), (value at 0x03))) and stores the result at address 0xE0. Your new code should:
  - Load the value in memory at address 0x01 into a register
  - Load the value in memory at address 0x03 into a register
  - Calculate the greatest common divisor, gcd(a, b), defined as:
    - gcd(a,b) is the largest positive integer that divides both a and b. See Calculating GCD below for more details; a more complete definition of the Euclidean algorithm for GCD is also available on Wikipedia
    - Note, you must call your original modulo code to compute a mod b. (hint: you will need the address where your modulo() code starts)
  - Store the result (the GCD) at address 0xE0
  - Halt once it is done.
- Include your updated modulo code (from step 1 above) at the end of your new instructions. Note: you will need to know the starting address of your modulo code to call it, that is, to jump to it.
You may assume that the input values will not be negative, but they may be 0. We also define the greatest common divisor of x and 0 to be x. You may also assume that the first value (in 0x01) is always larger than the second value (in 0x03) except in the case of 0.
Thus, if gcd.binary begins __ 10 __ 04 then when it is finished it should have 04 in address 0xE0; that is, gcd(0x10, 0x04) = gcd(16, 4) = 4.
Note: We should be able to change the second and fourth bytes of your program to compute the greatest common divisor of other values as well. That is, consider the values in bytes 0x01 and 0x03 of your program to be immediate values. The autograder will overwrite these bytes with other literal values to test your program.
Calculating GCD
The Euclidean algorithm for calculating greatest common divisor uses the modulo operation. Assuming a ≥ b, this algorithm makes use of the following observation:
gcd(a,b) = gcd(b, modulo(a,b))
The algorithm repeatedly applies this observation until the second argument is 0. The greatest common divisor is then the first argument, since gcd(x, 0) = x; this is the last nonzero value produced. For example:
gcd(66, 18) = gcd(18, modulo(66, 18)) = gcd(18, 12)
= gcd(12, modulo(18, 12)) = gcd(12, 6)
= gcd(6, modulo(12, 6)) = gcd(6, 0)
= 6
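The same computation can be written as a short Python loop. This is a sketch of the algorithm's control flow, not a translation of the binary: the modulo function below is a stand-in for your HW3 code (using repeated subtraction and assuming its second argument is greater than 0), and the loop never calls it with 0.

```python
def modulo(a, n):
    """Stand-in for the HW3 modulo function; assumes n > 0."""
    while a >= n:
        a -= n
    return a


def gcd(a, b):
    """Euclidean algorithm: replace (a, b) with (b, a mod b) until b is 0."""
    while b != 0:
        a, b = b, modulo(a, b)
    return a                  # gcd(x, 0) is defined to be x


print(gcd(66, 18))   # 6, matching the worked example above
print(gcd(16, 4))    # 4
print(gcd(7, 0))     # 7
```

Note that the loop test b != 0 comes before the call to modulo, which is how the case of a 0 input is handled without ever dividing by 0.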
Testing Your GCD Binary
To test your code, do one of
python3 sim_base.py gcd.binary
or
java SimBase gcd.binary
or go to our updated online simulator and click the file upload button at the top of the page to load your gcd.binary into the simulator’s memory.
Hints, tips, and suggestions
How to write binary
We suggest following these steps, carefully, saving the result of each in a file so you can go back and fix them if they were wrong:
- Write pseudocode that does the desired task
- Convert any for loops to while loops with explicit counters
- Change any if or while guards to the form something <= 0
  - a <= b becomes a-b <= 0
  - a < b becomes a+1 <= b becomes a+1-b <= 0
  - a >= b becomes 0 >= b-a becomes b-a <= 0
  - a > b becomes 0 > b-a becomes b+1-a <= 0
  - a == b becomes a-b == 0 becomes !(a-b) == 1 becomes !!(a-b) <= 0
  - a != b becomes a-b != 0 becomes !(a-b) == 0 becomes !(a-b) <= 0
- Add more variables to split multi-operation lines into a series of single-operation lines
- Add more operations to convert ones not in the instruction set into ones in the instruction set
- Change each loop into a pair of instructions, opening with “spot1 = pc” and closing with “if …, goto spot1”
- Count the number of variables needed
- Pick a memory address for each variable. Make these big enough your code is unlikely to get that big; for example, you might pick 0x80 through 0x80 + number of variables
- Convert each statement that uses variables into
  - register ← load variable’s memory
  - original statement
  - store variable’s memory ← register
- Translate each instruction into numeric (icode, a, b) triples, possibly followed by a M[pc+1] immediate value
- Turn (icode, a, b) into hex
- Write all the hex into gcd.binary
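The guard rewrites in the steps above can be sanity-checked with a small Python script. It is a sketch under assumptions: lnot is a hypothetical helper that models ! as a logical not returning 0 or 1, and the check uses ordinary Python integers over a small range, so it says nothing about overflow in 8-bit registers.

```python
def lnot(x):
    """Logical not: 1 if x is 0, otherwise 0 (assumed meaning of ! here)."""
    return 1 if x == 0 else 0


checked = 0
for a in range(-8, 9):
    for b in range(-8, 9):
        # Each original guard must agree with its "something <= 0" form.
        assert (a <= b) == (a - b <= 0)
        assert (a < b) == (a + 1 - b <= 0)
        assert (a >= b) == (b - a <= 0)
        assert (a > b) == (b + 1 - a <= 0)
        assert (a == b) == (lnot(lnot(a - b)) <= 0)
        assert (a != b) == (lnot(a - b) <= 0)
        checked += 1

print("all guard rewrites agree on", checked, "pairs")
```

Running a check like this before translating to binary can catch a sign error in a guard while it is still easy to fix.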
Debugging binary is hard. That’s part of why we don’t generally write code in binary. If you get stuck, you should probably try pulling just the part you are stuck on separate from the rest and test it until it works, then put it back in the main solution.
Submit
Submit both your gcd.binary and SimBase.java or sim_base.py files via Gradescope.