vid,
2007-11-20 | Revision: 1.1 |

"Big number" arithmetics requires special care on x86 architecture. You need to use some properties of arithmetic instructions, which are not immediately apparent.

*
I expect reader to understand binary representation of numbers, basics
of binary arithmetics, and hexadecimal number notation.*

Big numbers are numbers that consists of more bits than machine word contains. For example 1024-bit number is considered big number. Usually, size of big number is multiple of size of machine word, so we can say big number consists of multiple machine words.

Processor doesn't have instructions for direct manipulation with big number, but it provides instructions that can manipulate with machine words, which are part of big number.

I will provide examples for x86-32, where size of machine word is 32 bits. When I will refer to machine word, I mean 32-bit number, even though it is customary to refer to 32-bit number as double word or dword on x86.

Before continuing, download Intel Manual Volume 2 (2A and 2B), which contains instruction set reference. We will use it through the article.

As I stated previosly, big numbers usually consist of several machine words.

Ordering of machine words within big number is a problem similar to ordering bits within single machine words. There are two possibilities: either the most significant word comes first (analogous to big endian) or the least significant word comes first (analogous to little endian). The x86 architecture uses little endian for bit order within machine word, and we will use it for order of machine words in this examples too. There is no practical advantage in either way, it is only matter of agreement.

For example 64-bit number with value 1, would consists of dword with value 1 followed by dword with value 0.

dd 1 ;low order dword dd 0 ;high order dword

A 96-bit number containing (hexadecimal) value `0x0102030412345678ABCDEF00`

would be:

dd 0xABCDEF00 ;low order dword dd 0x12345678 dd 0x01020304 ;high order dword

First operation we will demonstrate is addition. Look up
`add`

instruction in manual.

We will start with adding value 1 to 96-bit
number `bignum`

(3 dwords). We simply add 1 to low
order (first) dword:

bignum dd 15 dd 0 dd 0 ... add dword [bignum], 1

There is a problem with this though. If low order number contains value 0xFFFFFFFF, then it will overflow to 0 after adding number. We must then add 1 to higher dword. We say that we carry 1 from lower dword to higher dword in case of overflow.

How to detect overflow? When `add`

causes overflow, it sets CF (carry flag).
This is one way:

add dword [bignum], 1 jnc done add dword [bignum+4], 1 done:

Still, same thing can happen during second addition. If first two dwords contained value 0xFFFFFFFF, then we must carry 1 to third dword:

add dword [bignum], 1 jnc done add dword [bignum+4], 1 jnc done add dword [bignum+8], 1 done:

If last `add`

overflows, then entire operation on big number has overflown.
Depending on what we use bignum for, we can ignore or signal this
error:

add dword [bignum], 1 jnc done add dword [bignum+4], 1 jnc done add dword [bignum+8], 1 jc overflow done:

There is also a "nicer" way. We have `adc`

instruction, which does same as `add`

,
and if CF is set, then it adds extra 1 to destination. We use it like
this:

add dword [bignum], 1 adc dword [bignum+4], 0 adc dword [bignum+8], 0

First `add`

does the addition, and sets CF is overflow occured. Following `adc`

does nothing if CF is 0 (no overflow during first operation), or adds 1 to
second dword if CF is 1. It also sets CF if overflow occurs, or clears it if no
overflow happened, just like `add`

. Another `adc`

does same thing as first one.

We can use this way for adding any value, not just 1. No more than one bit ever
needs to be carried during addition (0xFFFFFFFF + 0xFFFFFFFF = 0x1FFFFFFFE).
Following is generic algorithm that adds `EAX`

to big number pointed by
`EDI`

:

add dword [edi], eax adc dword [edi+4], 0 adc dword [edi+8], 0

We can also add numbers bigger than 32 bits, by using `adc`

to add to higher dwords.
This example adds 0x12345678AABBCCDD to bignum:

add dword [edi], 0xAABBCCDD adc dword [edi+4], 0x1234567 adc dword [edi+8], 0

This way, we can add big numbers together. This example adds 128-bit (4 dword)
`bignum1`

to `bignum2`

:

mov eax, dword [bignum1] add dword [bignum2], eax mov eax, dword [bignum1+4] adc dword [bignum2+4], eax mov eax, dword [bignum1+8] adc dword [bignum2+8], eax mov eax, dword [bignum1+12] adc dword [bignum2+12], eax jc overflow

Subtraction is analogous to addition, we just use `sub`

and `sbb`

instructions. In this case, we say that we
borrow 1 from lower dword to higher dword in case of
overflow.

mov eax, dword [bignum1] sub dword [bignum2], eax mov eax, dword [bignum1+4] sbb dword [bignum2+4], eax mov eax, dword [bignum1+8] sbb dword [bignum2+8], eax mov eax, dword [bignum1+12] sbb dword [bignum2+12], eax jc underflow

Interesting problem is how to negate big number. We have `neg`

instruction, but
it doesn't work with big numbers so easily. That's why we will start with method
which we already know - subtraction. We will simply subtract big number from 0:

mov eax, 0 sub eax, dword [bignum] mov dword [bignum], eax mov eax, 0 sbb eax, dword [bignum+4] mov dword [bignum+4], eax mov eax, 0 sbb eax, dword [bignum+8] mov dword [bignum+8], eax

Negation can be also performed by inverting all bits (0->1, 1->0), and then add 1 to it. I won't explain why, it can be task for reader. You need to understand how are negative numbers represented in x86 architecture.

To invert all bits, we have `not`

instruction. Adding 1 to big number already was
described. So negating big number looks like this:

not dword [bignum] not dword [bignum+4] not dword [bignum+8] add dword [bignum], 1 adc dword [bignum+4], 0 adc dword [bignum+8], 0

After understanding this, as a bonus, we can explain how to use `neg`

instruction directly to negate big number. As we already know, `neg`

first inverts all bits, and then adds 1 to result. So we negate first (low order)
dword. If value of dword before negating is 0, after inverting bits it becomes
0xFFFFFFFF, and after adding 1 it is 0 again. But this is also only case when
addition needs to be carried to upper dword, and only case when `neg`

sets CF. In all other cases, CF is clear, and other dwords need to be just
bit-inverted. In practice, code to do this is not particulary nice:

neg dword [bignum] jnc not1 neg dword [bignum+4] jnc not2 neg dword [bignum+8] jmp done not1: not dword [bignum+4] not2: not dword [bignum+8] done:

Bitwise shifting is sometimes useful for big numbers (for example when they are used as floating point numbers).

First we will shift by 1 bit.

When we want to bitwise shift ordinary (not big) number, we use `shl`

and `shr`

instructions. When we use them to shift by 1, bit that is
shifted out of number is saved in CF. Problem is then, how to shift this bit into
next dword.

In this case, we can use `rcl`

and `rcr`

instructions.
They "rotate" number "through" CF. When they are used to rotate by 1, they shift
number, insert CF as new bit, and then save bit which comes out in CF.
Look these up in manual. This is how rotation looks then:

;shift left shl dword [bignum], 1 rcl dword [bignum+4], 1 rcl dword [bignum+8], 1 jc overflow ;shift right shr dword [bignum+8], 1 rcr dword [bignum+4], 1 rcr dword [bignum], 1 jc overflow ;often not needed

Shifting becomes interesting when we want to shift by more than 1 bit. In this
case, we need to carry more bits between dword operations. A 386 provided
useful instructions to do this, `shld`

and `shrd`

. They shift number like
`shl`

and `shr`

/`sar`

, but insert bits from another operand instead of
using fixed value.

Shifting right:

mov eax, [bignum+4] shrd [bignum], eax, 5 mov eax, [bignum+8] shrd [bignum+4], eax, 5 shr [bignum+8], 5

Shifting left:

mov eax, [bignum+4] shld [bignum+8], eax, 5 mov eax, [bignum] shld [bignum+4], eax, 5 shl [bignum], 5

Testing for overflow is trickier in this case. We need to test if upper/lower N bits are all zero before shifting. For constant value (like 5), we can use mask:

Shifting right:

test [bignum+8], 0x0000001F ;00000000 00000000 00000000 00011111 binary jnz overflow

Shifting left:

test [bignum+8], 0xF8000000 ;11111000 00000000 00000000 00000000 binary jnz overflow

But with non-constant number of bits to shift, you would need to construct this mask.
Instead, you can use `shld`

and `shrd`

again:

Shifting right:

xor eax, eax ;eax = 0 mov ebx, [bignum] shrd eax, ebx, 5 test eax, eax ;eax ?= 0 jnz overflow

Shifting left:

xor eax, eax ;eax = 0 mov ebx, [bignum+8] shld eax, ebx, 5 test eax, eax ;eax ?= 0 jnz overflow

To multiply number, there is `mul`

instruction. We will
only use `mul`

with one operand. Forms with 2
or 3 operands are not usable for big numbers.

Instruction `mul`

multiplies 32-bit number in
`EAX`

, and saves 64-bit result in
`EDX:EAX`

. That means we have part that which is
"carried" to upper dword in `EDX`

:

mov ebx, 10 ;EBX = multiplier mov eax, [bignum] mul ebx ;EDX:EAX = EAX*EBX mov [bignum], eax ;save result mov ecx, edx ;save carried part in ECX mov eax, [bignum+4] mul ebx add eax, ecx ;add carried part from previous multiplication mov [bignum+4], eax mov ecx, edx mov eax, [bignum+8] mul ebx add eax, ecx mov [bignum+8], eax cmp edx, 0 jne overflow

Multiplying with numbers which don't fit 32 bits (like multiplying two bignums) can't be done in any straightforward way.

Division is similar to multiplication. We will use one-operand form
of `div`

instruction. It takes 64-bit number in
`EDX:EAX`

, stores result of division in `EAX`

and remainder in `EDX`

. If result is bigger than 32-bits,
it throws exception, so we must make sure that this never happens.

We will start from "top" of number (high order dword). After every division, remainder in EDX is used as high 32bits of divident for subsequent division. Last vale of EDX is remainder from entire big number division.

mov ebx, 10 ;divisor mov edx, 0 mov eax, [bignum+8] ;EDX:EAX = number to divide div ebx mov [bignum+8], eax mov eax, [bignum+4] div ebx mov [bignum+4], eax mov eax, [bignum] div ebx mov [bignum], eax ;edx holds remainder from division

Again, dividing by values greater than 32 bits cannot be done "easy way".

Continue to discussion board.

You can contact the author using e-mail vid@x86asm.net.

Visit author's home page.

2007-11-20 | 1.1 | Partial rewrite and additions | vid |

2007-11-06 | 1.0 | First public version | vid |

(dates format correspond to ISO 8601)