Working with Big Numbers Using x86 Instructions

vid, 2007-11-20

Revision: 1.1

"Big number" arithmetics requires special care on x86 architecture. You need to use some properties of arithmetic instructions, which are not immediately apparent.

I expect reader to understand binary representation of numbers, basics of binary arithmetics, and hexadecimal number notation.

Big numbers are numbers that consists of more bits than machine word contains. For example 1024-bit number is considered big number. Usually, size of big number is multiple of size of machine word, so we can say big number consists of multiple machine words.

Processor doesn't have instructions for direct manipulation with big number, but it provides instructions that can manipulate with machine words, which are part of big number.

I will provide examples for x86-32, where size of machine word is 32 bits. When I will refer to machine word, I mean 32-bit number, even though it is customary to refer to 32-bit number as double word or dword on x86.

Before continuing, download Intel Manual Volume 2 (2A and 2B), which contains instruction set reference. We will use it through the article.

Layout of Big Numbers

As I stated previosly, big numbers usually consist of several machine words.

Ordering of machine words within big number is a problem similar to ordering bits within single machine words. There are two possibilities: either the most significant word comes first (analogous to big endian) or the least significant word comes first (analogous to little endian). The x86 architecture uses little endian for bit order within machine word, and we will use it for order of machine words in this examples too. There is no practical advantage in either way, it is only matter of agreement.

For example 64-bit number with value 1, would consists of dword with value 1 followed by dword with value 0.

 dd 1   ;low order dword
 dd 0   ;high order dword

A 96-bit number containing (hexadecimal) value 0x0102030412345678ABCDEF00 would be:

 dd 0xABCDEF00   ;low order dword
 dd 0x12345678
 dd 0x01020304   ;high order dword

Addition

First operation we will demonstrate is addition. Look up add instruction in manual.

We will start with adding value 1 to 96-bit number bignum (3 dwords). We simply add 1 to low order (first) dword:

 bignum dd 15
        dd 0
        dd 0
 ...
 add dword [bignum], 1

There is a problem with this though. If low order number contains value 0xFFFFFFFF, then it will overflow to 0 after adding number. We must then add 1 to higher dword. We say that we carry 1 from lower dword to higher dword in case of overflow.

How to detect overflow? When add causes overflow, it sets CF (carry flag). This is one way:

 add dword [bignum], 1
 jnc done
 add dword [bignum+4], 1
done:

Still, same thing can happen during second addition. If first two dwords contained value 0xFFFFFFFF, then we must carry 1 to third dword:

 add dword [bignum], 1
 jnc done
 add dword [bignum+4], 1
 jnc done
 add dword [bignum+8], 1
done:

If last add overflows, then entire operation on big number has overflown. Depending on what we use bignum for, we can ignore or signal this error:

 add dword [bignum], 1
 jnc done
 add dword [bignum+4], 1
 jnc done
 add dword [bignum+8], 1
 jc overflow
done:

There is also a "nicer" way. We have adc instruction, which does same as add, and if CF is set, then it adds extra 1 to destination. We use it like this:

 add dword [bignum], 1
 adc dword [bignum+4], 0
 adc dword [bignum+8], 0

First add does the addition, and sets CF is overflow occured. Following adc does nothing if CF is 0 (no overflow during first operation), or adds 1 to second dword if CF is 1. It also sets CF if overflow occurs, or clears it if no overflow happened, just like add. Another adc does same thing as first one.

We can use this way for adding any value, not just 1. No more than one bit ever needs to be carried during addition (0xFFFFFFFF + 0xFFFFFFFF = 0x1FFFFFFFE). Following is generic algorithm that adds EAX to big number pointed by EDI:

 add dword [edi], eax
 adc dword [edi+4], 0
 adc dword [edi+8], 0

We can also add numbers bigger than 32 bits, by using adc to add to higher dwords. This example adds 0x12345678AABBCCDD to bignum:

 add dword [edi], 0xAABBCCDD
 adc dword [edi+4], 0x1234567
 adc dword [edi+8], 0

This way, we can add big numbers together. This example adds 128-bit (4 dword) bignum1 to bignum2:

 mov eax, dword [bignum1]
 add dword [bignum2], eax
 mov eax, dword [bignum1+4]
 adc dword [bignum2+4], eax
 mov eax, dword [bignum1+8]
 adc dword [bignum2+8], eax
 mov eax, dword [bignum1+12]
 adc dword [bignum2+12], eax
 jc overflow

Subtraction

Subtraction is analogous to addition, we just use sub and sbb instructions. In this case, we say that we borrow 1 from lower dword to higher dword in case of overflow.

 mov eax, dword [bignum1]
 sub dword [bignum2], eax
 mov eax, dword [bignum1+4]
 sbb dword [bignum2+4], eax
 mov eax, dword [bignum1+8]
 sbb dword [bignum2+8], eax
 mov eax, dword [bignum1+12]
 sbb dword [bignum2+12], eax
 jc underflow

Negation

Interesting problem is how to negate big number. We have neg instruction, but it doesn't work with big numbers so easily. That's why we will start with method which we already know - subtraction. We will simply subtract big number from 0:

 mov eax, 0
 sub eax, dword [bignum]
 mov dword [bignum], eax
 mov eax, 0
 sbb eax, dword [bignum+4]
 mov dword [bignum+4], eax
 mov eax, 0
 sbb eax, dword [bignum+8]
 mov dword [bignum+8], eax

Negation can be also performed by inverting all bits (0->1, 1->0), and then add 1 to it. I won't explain why, it can be task for reader. You need to understand how are negative numbers represented in x86 architecture.

To invert all bits, we have not instruction. Adding 1 to big number already was described. So negating big number looks like this:

 not dword [bignum]
 not dword [bignum+4]
 not dword [bignum+8]
 add dword [bignum], 1
 adc dword [bignum+4], 0
 adc dword [bignum+8], 0

After understanding this, as a bonus, we can explain how to use neg instruction directly to negate big number. As we already know, neg first inverts all bits, and then adds 1 to result. So we negate first (low order) dword. If value of dword before negating is 0, after inverting bits it becomes 0xFFFFFFFF, and after adding 1 it is 0 again. But this is also only case when addition needs to be carried to upper dword, and only case when neg sets CF. In all other cases, CF is clear, and other dwords need to be just bit-inverted. In practice, code to do this is not particulary nice:

  neg dword [bignum]
  jnc not1
  neg dword [bignum+4]
  jnc not2
  neg dword [bignum+8]
  jmp done
  not1:
  not dword [bignum+4]
  not2:
  not dword [bignum+8]
  done:

Bitwise Shifting

Bitwise shifting is sometimes useful for big numbers (for example when they are used as floating point numbers).

First we will shift by 1 bit.

When we want to bitwise shift ordinary (not big) number, we use shl and shr instructions. When we use them to shift by 1, bit that is shifted out of number is saved in CF. Problem is then, how to shift this bit into next dword.

In this case, we can use rcl and rcr instructions. They "rotate" number "through" CF. When they are used to rotate by 1, they shift number, insert CF as new bit, and then save bit which comes out in CF. Look these up in manual. This is how rotation looks then:

 ;shift left
 shl dword [bignum], 1
 rcl dword [bignum+4], 1
 rcl dword [bignum+8], 1
 jc overflow

 ;shift right
 shr dword [bignum+8], 1
 rcr dword [bignum+4], 1
 rcr dword [bignum], 1
 jc overflow    ;often not needed

Shifting becomes interesting when we want to shift by more than 1 bit. In this case, we need to carry more bits between dword operations. A 386 provided useful instructions to do this, shld and shrd. They shift number like shl and shr/sar, but insert bits from another operand instead of using fixed value.

Shifting right:

 mov  eax, [bignum+4]
 shrd [bignum], eax, 5
 mov  eax, [bignum+8]
 shrd [bignum+4], eax, 5
 shr  [bignum+8], 5

Shifting left:

 mov  eax, [bignum+4]
 shld [bignum+8], eax, 5
 mov  eax, [bignum]
 shld [bignum+4], eax, 5
 shl  [bignum], 5

Testing for overflow is trickier in this case. We need to test if upper/lower N bits are all zero before shifting. For constant value (like 5), we can use mask:

Shifting right:

 test [bignum+8], 0x0000001F  ;00000000 00000000 00000000 00011111 binary
 jnz  overflow

Shifting left:

 test [bignum+8], 0xF8000000  ;11111000 00000000 00000000 00000000 binary
 jnz  overflow

But with non-constant number of bits to shift, you would need to construct this mask. Instead, you can use shld and shrd again:

Shifting right:

 
 xor  eax, eax   ;eax = 0
 mov  ebx, [bignum]
 shrd eax, ebx, 5
 test eax, eax   ;eax ?= 0
 jnz  overflow

Shifting left:

 
 xor  eax, eax   ;eax = 0
 mov  ebx, [bignum+8]
 shld eax, ebx, 5
 test eax, eax  ;eax ?= 0
 jnz  overflow

Multiplication

To multiply number, there is mul instruction. We will only use mul with one operand. Forms with 2 or 3 operands are not usable for big numbers.

Instruction mul multiplies 32-bit number in EAX, and saves 64-bit result in EDX:EAX. That means we have part that which is "carried" to upper dword in EDX:

 mov ebx, 10            ;EBX = multiplier
 mov eax, [bignum]	
 mul ebx                ;EDX:EAX = EAX*EBX
 mov [bignum], eax	;save result
 mov ecx, edx           ;save carried part in ECX

 mov eax, [bignum+4]
 mul ebx
 add eax, ecx           ;add carried part from previous multiplication
 mov [bignum+4], eax
 mov ecx, edx
	
 mov eax, [bignum+8]
 mul ebx
 add eax, ecx
 mov [bignum+8], eax

 cmp edx, 0
 jne overflow

Multiplying with numbers which don't fit 32 bits (like multiplying two bignums) can't be done in any straightforward way.

Division

Division is similar to multiplication. We will use one-operand form of div instruction. It takes 64-bit number in EDX:EAX, stores result of division in EAX and remainder in EDX. If result is bigger than 32-bits, it throws exception, so we must make sure that this never happens.

We will start from "top" of number (high order dword). After every division, remainder in EDX is used as high 32bits of divident for subsequent division. Last vale of EDX is remainder from entire big number division.

 mov ebx, 10   ;divisor

 mov edx, 0
 mov eax, [bignum+8]   ;EDX:EAX = number to divide
 div ebx
 mov [bignum+8], eax

 mov eax, [bignum+4]
 div ebx
 mov [bignum+4], eax

 mov eax, [bignum]
 div ebx
 mov [bignum], eax

 ;edx holds remainder from division

Again, dividing by values greater than 32 bits cannot be done "easy way".

Comments

Continue to discussion board.

You can contact the author using e-mail vid@x86asm.net.

Visit author's home page.

Revisions

2007-11-20	1.1	Partial rewrite and additions	vid
2007-11-06	1.0	First public version	vid

(dates format correspond to ISO 8601)