2025-03-10

assembly

01001101 01100001 01100011 01101000 01101001 01101110 01100101 01110011 00100000 01110100 01101000 01101001 01101110 01101011 00100000 01101001 01101110 00100000 01100010 01101001 01101110 01100001 01110010 01111001 00101100 00100000 01110111 01101000 01101001 01100011 01101000 00100000 01101001 01110011 00100000 01100100 01101001 01100110 01100110 01101001 01100011 01110101 01101100 01110100 00100000 01100110 01101111 01110010 00100000 01101000 01110101 01101101 01100001 01101110 01110011 00100000 01110100 01101111 00100000 01110101 01101110 01100100 01100101 01110010 01110011 01110100 01100001 01101110 01100100 00101110

(ENG) Machines think in binary, which is difficult for humans to understand. Each instruction is a set of bits that determines which action the computer executes. The meaning of these bits is defined by the chip designers, as we saw when implementing the ALU.

In the early days, programmers had to manually flip switches, each representing a one or a zero, into the correct configuration to load an instruction into the computer’s instruction memory. Debugging must have been hell, so these programmers came up with a solution: use readable symbols to represent parts of each instruction.

assembly

An example of an assembly instruction is

ADD AX, 2

This instruction adds 2 to the value currently stored in the AX register. Though it might not be as intuitive as high-level code like

AX += 2

it’s still a major improvement over 1s and 0s. A program called the assembler then converts the assembly code into binary for the machine to execute.

[asm_to_bin.png]

basics of Hack Assembly

We will focus on assembly in this chapter. In particular, we will showcase a simple, barebones assembly language: the Hack Assembly Language from nand2tetris.

[hack_platform.png]

The Hack platform has:

Data memory, which is a block of RAM
Instruction memory, which is read-only and is also known as ROM
A data register D, convenient for storing intermediate values during a series of instructions
An address register A

The address register acts as a pointer into memory. Suppose the address register is set to 5 with @5. Then M refers to memory location 5 (RAM[5]), and we can read or change the value stored there. For example, M=1 sets RAM[5] to 1. A Hack instruction can only use the constants 0, 1 and -1 directly in a computation, so storing a larger value such as 12 takes four instructions: @12, D=A, @5, M=D.

D can be set with an instruction such as D=A, which copies the value of A into D. This shows that A serves multiple functions:

To point to locations in data memory and instruction memory
As an extra data register, capable of storing values
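
The relationship between A, D and M can be modelled in a few lines of Python. This is only an illustrative sketch: the HackState class and its attribute names are made up for this post and are not part of nand2tetris. It shows A being used first as a data register and then as a pointer.

```python
class HackState:
    """Toy model of the Hack registers and data memory (hypothetical helper)."""

    def __init__(self, ram_size=32):
        self.a = 0                  # address register A
        self.d = 0                  # data register D
        self.ram = [0] * ram_size   # data memory

    @property
    def m(self):
        # M is not a separate register: it is the RAM cell that A points to.
        return self.ram[self.a]

    @m.setter
    def m(self, value):
        self.ram[self.a] = value


cpu = HackState()
cpu.a = 12         # @12
cpu.d = cpu.a      # D=A  (A used as a plain data register)
cpu.a = 5          # @5   (A used as a pointer)
cpu.m = cpu.d      # M=D  (writes 12 into RAM[5])
print(cpu.ram[5])  # 12
```

Changing A changes which cell M refers to, without touching the value that was written earlier.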

branching

By default, instructions in the instruction memory execute sequentially. But how can we jump to other instructions non-sequentially? This is crucial for implementing control flow in programs.

unconditional jump

The 0;JMP instruction performs an unconditional jump to the instruction whose address is stored in A. To jump to any location xxx of your liking, you can execute

@xxx
0;JMP

The next instruction executed will be at location xxx.

conditional jump

A simple conditional jump can be performed with the instruction D;JEQ. This means: jump to the instruction at the address stored in register A if the value of D is 0. Otherwise, execution continues with the next instruction.
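
Both kinds of jump come down to one decision: should the program counter take the value of A, or simply advance by one? The Python sketch below models that decision. The function name next_pc and its parameters are hypothetical, and the mnemonics other than JMP, JEQ and JGT come from the Hack specification, not from the text above.

```python
def next_pc(pc, a, comp_value, jump=None):
    """Return the address of the next instruction (hypothetical helper).

    pc is the current instruction address, a is the value of the A register,
    comp_value is the result of the comp part, and jump is a mnemonic or None.
    """
    conditions = {
        None: False,               # no jump field: fall through
        "JGT": comp_value > 0,
        "JEQ": comp_value == 0,
        "JGE": comp_value >= 0,
        "JLT": comp_value < 0,
        "JNE": comp_value != 0,
        "JLE": comp_value <= 0,
        "JMP": True,               # unconditional
    }
    return a if conditions[jump] else pc + 1


print(next_pc(pc=3, a=10, comp_value=0, jump="JMP"))  # 10
print(next_pc(pc=3, a=10, comp_value=0, jump="JEQ"))  # 10
print(next_pc(pc=3, a=10, comp_value=7, jump="JEQ"))  # 4
```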

variables

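Apart from the A and D registers, values can be stored in named variables.

@x
D=M

This sets the data register D to the value of the variable x: @x loads the address of x into A, and D=M reads the memory at that address. (D=A would instead copy the address of x into D.) To create a variable, reference it for the first time, then set M to the desired value: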

// x = -1
@x
M=-1

Deciding the actual memory address of each variable is the assembler’s job, a process we’ll discuss in a later chapter.
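
As a rough preview, here is a simplified Python sketch of how an assembler might assign addresses to variables. It relies on the Hack convention that R0 to R15 name RAM[0] to RAM[15] and that new variables are allocated from address 16 upward. The function name resolve_variables is hypothetical, and the sketch ignores labels such as (LOOP) and the other predefined symbols.

```python
def resolve_variables(symbols_in_order):
    """Map each new variable name to a RAM address (simplified sketch)."""
    table = {f"R{i}": i for i in range(16)}   # predefined R0..R15
    next_free = 16                            # first address used for variables
    for name in symbols_in_order:
        if name not in table:                 # first time we see this symbol
            table[name] = next_free
            next_free += 1
    return table


table = resolve_variables(["i", "R2", "i", "R1", "sum"])
print(table["i"], table["sum"], table["R2"])  # 16 17 2
```

Under these assumptions, every later @i in the same program would be translated as @16.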

mapping assembly to binary

It helps to demystify how an assembler translates assembly to binary. Its most basic task is to translate each part of an instruction into its own group of bits.

Each computer architecture, such as x86 or ARM, has its own instruction set, defined by its hardware designers. The Hack language specification is much simpler than real-world specifications, but it provides good intuition for how assemblers work.

A-instruction

An A-instruction, in the form @x, loads the value x into the A register. In binary, the most significant bit is 0, and the remaining 15 bits hold the value to be loaded.

[a_instruction.png]

The above example is equivalent to @7.
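
The A-instruction encoding is simple enough to express as a one-line format call. The Python sketch below assumes the value is already a number (symbols such as variable names would have to be resolved first). The function name encode_a_instruction is made up for illustration.

```python
def encode_a_instruction(value):
    """Encode @value as a 16-bit string: a leading 0 followed by 15 value bits."""
    if not 0 <= value < 2 ** 15:
        raise ValueError("value must fit in 15 bits")
    return format(value, "016b")


print(encode_a_instruction(7))  # 0000000000000111
```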

C-instruction

The C-instruction is used for everything else. It is multi-purpose and takes the syntax

dest=comp;jump

where dest names the destination that stores the result, which can be any combination of A, D and M (optional),
comp specifies which computation to perform, and
jump specifies the branching condition, if any (optional).

[c_instruction_bits.png]

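The 16-bit instruction is split into:

1 opcode bit, set to 1 to mark a C-instruction
2 unused bits that are set to 1 by convention
7 comp bits (at most 128 possible computations, of which the Hack specification defines only a subset)
3 dest bits, one each for A, D and M
3 jump bits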
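
To make the bit layout concrete, here is a minimal Python sketch of a C-instruction encoder. The bit patterns are taken from the Hack specification, but the COMP table is deliberately partial and lists only a handful of computations. The function name encode_c_instruction is hypothetical, and the sketch does no error handling.

```python
# Partial lookup table: only some comp mnemonics are listed here.
COMP = {
    "0": "0101010", "1": "0111111", "-1": "0111010",
    "D": "0001100", "A": "0110000", "M": "1110000",
    "D+A": "0000010", "D+M": "1000010",
    "D-A": "0010011", "D-M": "1010011",
    "M+1": "1110111",
}
DEST = {"": "000", "M": "001", "D": "010", "MD": "011",
        "A": "100", "AM": "101", "AD": "110", "AMD": "111"}
JUMP = {"": "000", "JGT": "001", "JEQ": "010", "JGE": "011",
        "JLT": "100", "JNE": "101", "JLE": "110", "JMP": "111"}


def encode_c_instruction(text):
    """Translate 'dest=comp;jump' (dest and jump optional) into 16 bits."""
    text = text.replace(" ", "")
    dest, rest = text.split("=") if "=" in text else ("", text)
    comp, jump = rest.split(";") if ";" in rest else (rest, "")
    # 1 opcode bit + 2 unused bits, then 7 comp, 3 dest and 3 jump bits.
    return "111" + COMP[comp] + DEST[dest] + JUMP[jump]


print(encode_c_instruction("D=A"))    # 1110110000010000
print(encode_c_instruction("0;JMP"))  # 1110101010000111
print(encode_c_instruction("D;JGT"))  # 1110001100000001
```

A dictionary lookup per field is enough because the three fields are encoded independently of one another.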

leetcode (extra) easy

multiplication

Here’s an example of a program that performs multiplication using repeated addition. We will calculate R0 * R1, where R0 and R1 refer to the values stored in RAM[0] and RAM[1]. Assume that both values are non-negative and that the result fits in a single 16-bit memory word.

pseudocode

i = 1
sum = 0
while i <= R1 {
	sum = sum + R0
	i = i + 1
}
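
For readers who want to run the algorithm before reading the assembly, here is a Python sketch of the same loop. It is restructured the way assembly has to express it: compute i - R1 and leave the loop when the difference is positive. The function name multiply is made up for illustration.

```python
def multiply(r0, r1):
    """Multiply two non-negative integers by repeated addition."""
    i = 1
    total = 0
    while True:
        if i - r1 > 0:       # same test as "if i > R1: break"
            break
        total = total + r0   # sum = sum + R0
        i = i + 1
    return total


print(multiply(6, 7))  # 42
print(multiply(5, 0))  # 0
```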

mult.asm

// This file is part of www.nand2tetris.org
// and the book "The Elements of Computing Systems"
// by Nisan and Schocken, MIT Press.
// File name: projects/4/Mult.asm

// Multiplies R0 and R1 and stores the result in R2.
// (R0, R1, R2 refer to RAM[0], RAM[1], and RAM[2], respectively.)
// The algorithm is based on repetitive addition.

// i = 1
@i
M=1

// sum (R2) = 0
@R2
M=0

(LOOP)
// if i > R1: break
@i
D=M
@R1
D=D-M
@END
D;JGT

// sum (R2) = sum + R0
@R0
D=M
@R2
M=D+M

// i = i + 1
@i
M=M+1
@LOOP
0;JMP

(END)
@END
0;JMP

In practice, a more optimized multiplication program could be implemented with bit shifting, but the point of this example was to explore variables, loops and branching. While high-level languages offer convenience, understanding assembly sharpens our grasp of how computers truly execute instructions.