Card deck used to compute SHA-256 hashes on IBM 1401 mainframe. Behind the card deck is the line printer output showing the input to the algorithm and the resulting hash.
Line printer and IBM 1401 mainframe at the Computer History Museum. This is the computer I used to run my program. The console is in the upper left. Each of the dark rectangular panels on the computer is a "gate" that can be folded out for maintenance.
Mining requires a task that is very difficult to perform, but easy to verify. Bitcoin mining uses cryptography, with a hash function called double SHA-256. A hash takes a chunk of data as input and shrinks it down into a smaller hash value (in this case 256 bits). With a cryptographic hash, there's no way to get a hash value you want without trying a whole lot of inputs. But once you find an input that gives the value you want, it's easy for anyone to verify the hash. Thus, cryptographic hashing becomes a good way to implement the Bitcoin "proof-of-work".
In more detail, to mine a block, you first collect the new transactions into a block. Then you hash the block to form an (effectively random) block hash value. If the hash starts with 16 zeros, the block is successfully mined and is sent into the Bitcoin network. Most of the time the hash isn't successful, so you modify the block slightly and try again, over and over billions of times. About every 10 minutes someone will successfully mine a block, and the process starts over. It's kind of like a lottery, where miners keep trying until someone "wins". It's hard to visualize just how difficult the hashing process is: finding a valid hash is less likely than finding a single grain of sand out of all the sand on Earth. To find these hashes, miners have datacenters full of specialized hardware to do this mining.
I've simplified a lot of details. For in-depth information on Bitcoin and mining, see my articles
Bitcoins the hard way
and Bitcoin mining the hard way.
SHA-256 round, from Wikipedia created by kockmeyer, CC BY-SA 3.0.
The dark blue boxes mix up the values in non-linear ways that are hard to analyze cryptographically.
(If you could figure out a mathematical shortcut to generate successful hashes, you could take over Bitcoin mining.)
The Ch"choose" box chooses bits from F or G, based on the value of input E.
The Σ "sum" boxes rotate the bits of A (or E) to form three rotated versions, and then sums them together modulo 2.
The Ma"majority" box looks at the bits in each position of A, B, and C, and selects 0 or 1, whichever value is in the majority.
The red boxes perform 32-bit addition, generating new values for A and E.
The input Wtis based on the input data, slightly processed. (This is where the input block gets fed into the algorithm.)
The input Ktis a constant defined for each round.
As can be seen from the diagram above, only A and E are changed in a round. The other values pass through unchanged, with the old A value becoming the new B value, the old B value becoming the new C value and so forth.
Although each round of SHA-256 doesn't change the data much, after 64 rounds the input data will be completely scrambled, generating the unpredictable hash output.
This shows a rack (called a "gate") folded out of the IBM 1401 mainframe. The photo shows the SMS cards used to implement the circuits. This specific rack controls the tape drives.
Internally, the computer was very different from modern computers. It didn't use 8-bit bytes, but 6-bit characters based on binary coded decimal (BCD).
Since it was a business machine, the computer used decimal arithmetic instead of binary arithmetic and each character of storage held a digit, 0 through 9.
The computer came with 4000 characters of storage in magnetic core memory; a dishwasher-sized memory expansion box provided 12,000 more characters of storage.
The computer was designed to use punched cards as input, with a card reader that read the program and data. Output was printed on a fast line printer or could be punched on more cards.
The Computer History Museum in Mountain View has two working IBM 1401 mainframes. I used one of them to run the SHA-256 hash code.
For more information on the IBM 1401, see my article
Fractals on the IBM 1401.
job bitcoin
* SHA-256 hash
* Ken Shirriff //righto.com
ctl 6641
org 087
X1 dcw @000@
org 092
X2 dcw @000@
org 097
X3 dcw @000@
org 333
start cs 299
r
sw 001
lca 064, input0
mcw 064, 264
w
* Initialize word marks on storage
mcw +s0, x3
wmloop sw 0&x3
ma @032@, x3
c +h7+32, x3
bu wmloop
mcw +input-127, x3 * Put input into warr[0] to warr[15]
mcw +warr, x1
mcw @128@, tobinc
b tobin
* Compute message schedule array w[0..63]
mcw @16@, i
* i is word index 16-63
* x1 is start of warr[i-16], i.e. bit 0 (bit 0 on left, bit 31 on right)
mcw +warr, x1
wloop c @64@, i
be wloopd
* Compute s0
mcw +s0, x2
za +0, 31&x2 * Zero s0
* Add w[i-15] rightrotate 7
sw 7&x2 * Wordmark at bit 7 (from left) of s0
a 56&x1, 31&x2 * Right shifted: 32+31-7 = bit 24 of w[i-15], 31 = end of s0
a 63&x1, 6&x2 * Wrapped: 32+31 = end of w[i-15], 7-1 = bit 6 of s0
cw 7&x2 * Clear wordmark
* Add w[i-15] rightrotate 18
sw 18&x2 * Wordmark at bit 18 (from left) of s0
a 45&x1, 31&x2 * Right shifted: 32+31-18 = bit 13 of w[i-15], 31 = end of s0
a 63&x1, 17&x2 * Wrapped: 32+31 = end of w[i-15], 18-1 = bit 17 of s0
cw 18&x2 * Clear wordmark
* Add w[i-15] rightshift 3
sw 3&x2 * Wordmark at bit 3 (from left) of s0
a 60&x1, 31&x2 * Right shifted: 32+31-3 = bit 28 of w[i-15], 31 = end of s0
cw 3&x2 * Clear wordmark
* Convert sum to xor
mcw x1, x1tmp
mcw +s0+31, x1 * x1 = right end of s0
mcw @032@, x2 * Process 32 bits
b xor
sw s0 * Restore wordmark cleared by xor
mcw x1tmp, x1
* Compute s1
mcw +s1, x2
za +0, 31&x2 * Zero s1
* Add w[i-2] rightrotate 17
sw 17&x2 * Wordmark at bit 17 (from left) of s1
a 462&x1, 31&x2 * Right shifted: 14*32+31-17 = bit 14 of w[i-2], 31 = end of s1
a 479&x1, 16&x2 * Wrapped: 14*32+31 = end of w[i-2], 17-1 = bit 16 of s1
cw 17&x2 * Clear wordmark
* Add w[i-2] rightrotate 19
sw 19&x2 * Wordmark at bit 19 (from left) of s1
a 460&x1, 31&x2 * Right shifted: 14*32+31-19 = bit 12 of w[i-2], 31 = end of s1
a 479&x1, 18&x2 * Wrapped: 14*32+31 = end of w[i-2], 19-1 = bit 18 of s1
cw 19&x2 * Clear wordmark
* Add w[i-2] rightshift 10
sw 10&x2 * Wordmark at bit 10 (from left) of s1
a 469&x1, 31&x2 * Right shifted: 14*32+31-10 = bit 21 of w[i-2], 31 = end of s1
cw 10&x2 * Clear wordmark
* Convert sum to xor
mcw +s1+31, x1 * x1 = right end of s1
mcw @032@, x2 * Process 32 bits
b xor
sw s1 * Restore wordmark cleared by xor
* Compute w[i] := w[i-16] + s0 + w[i-7] + s1
mcw x1tmp, x1
a s1+31, s0+31 * Add s1 to s0
a 31&x1, s0+31 * Add w[i-16] to s0
a 319&x1, s0+31 * Add 9*32+31 = w[i-7] to s0
* Convert bit sum to 32-bit sum
mcw +s0+31, x1 * x1 = right end of s0
mcw @032@, x2 * Process 32 bits
b sum
sw s0 * Restore wordmark cleared by sum
mcw x1tmp, x1
mcw s0+31, 543&x1 * Move s0 to w[i]
ma @032@, x1
a +1, i
mz @0@, i
b wloop
x1tmp dcw #5
* Initialize: Copy hex h0init-h7init into binary h0-h7
wloopd mcw +h0init-7, x3
mcw +h0, x1
mcw @064@, tobinc * 8*8 hex digits
b tobin
* Initialize a-h from h0-h7
mcw @000@, x1
ilp mcw h0+31&x1, a+31&x1
ma @032@, x1
c x1, @256@
bu ilp
mcw @000@, bitidx * bitidx = i*32 = bit index
mcw @000@, kidx * kidx = i*8 = key index
* Compute s1 from e
mainlp mcw +e, x1
mcw +s1, x2
za +0, 31&x2 * Zero s1
* Add e rightrotate 6
sw 6&x2 * Wordmark at bit 6 (from left) of s1
a 25&x1, 31&x2 * Right shifted: 31-6 = bit 25 of e, 31 = end of s1
a 31&x1, 5&x2 * Wrapped: 31 = end of e, 6-1 = bit 5 of s1
cw 6&x2 * Clear wordmark
* Add e rightrotate 11
sw 11&x2 * Wordmark at bit 11 (from left) of s1
a 20&x1, 31&x2 * Right shifted: 31-11 = bit 20 of e, 31 = end of s1
a 31&x1, 10&x2 * Wrapped: 31 = end of e, 11-1 = bit 10 of s1
cw 11&x2 * Clear wordmark
* Add e rightrotate 25
sw 25&x2 * Wordmark at bit 25 (from left) of s1
a 6&x1, 31&x2 * Right shifted: 31-25 = bit 6 of e, 31 = end of s1
a 31&x1, 24&x2 * Wrapped: 31 = end of e, 25-1 = bit 24 of s1
cw 25&x2 * Clear wordmark
* Convert sum to xor
mcw +s1+31, x1 * x1 = right end of s1
mcw @032@, x2 * Process 32 bits
b xor
sw s1 * Restore wordmark cleared by xor
* Compute ch: choose function
mcw @000@, x1 * x1 is index from 0 to 31
chl c e&x1, @0@
be chzero
mn f&x1, ch&x1 * for 1, select f bit
b chincr
chzero mn g&x1, ch&x1 * for 0, select g bit
chincr a +1, x1
mz @0@, x1
c @032@, x1
bu chl
* Compute temp1: k[i] + h + S1 + ch + w[i]
cs 299
mcw +k-7, x3 * Convert k[i] to binary in temp1
ma kidx, x3
mcw +temp1, x1
mcw @008@, tobinc * 8 hex digits
b tobin
mcw @237@, x3
mcw +temp1, x1
mcw @008@, tobinc
b tohex
a h+31, temp1+31 * +h
a s1+31, temp1+31 * +s1
a ch+31, temp1+31 * +ch
mcw bitidx, x1
a warr+31&x1, temp1+31 * + w[i]
* Convert bit sum to 32-bit sum
mcw +temp1+31, x1 * x1 = right end of temp1
b sum
* Compute s0 from a
mcw +a, x1
mcw +s0, x2
za +0, 31&x2 * Zero s0
* Add a rightrotate 2
sw 2&x2 * Wordmark at bit 2 (from left) of s0
a 29&x1, 31&x2 * Right shifted: 31-2 = bit 29 of a, 31 = end of s0
a 31&x1, 1&x2 * Wrapped: 31 = end of a, 2-1 = bit 1 of s0
cw 2&x2 * Clear wordmark
* Add a rightrotate 13
sw 13&x2 * Wordmark at bit 13 (from left) of s0
a 18&x1, 31&x2 * Right shifted: 31-13 = bit 18 of a, 31 = end of s0
a 31&x1, 12&x2 * Wrapped: 31 = end of a, 13-1 = bit 12 of s0
cw 13&x2 * Clear wordmark
* Add a rightrotate 22
sw 22&x2 * Wordmark at bit 22 (from left) of s0
a 9&x1, 31&x2 * Right shifted: 31-22 = bit 9 of a, 31 = end of s0
a 31&x1, 21&x2 * Wrapped: 31 = end of a, 22-1 = bit 21 of s0
cw 22&x2 * Clear wordmark
* Convert sum to xor
mcw +s0+31, x1 * x1 = right end of s0
mcw @032@, x2 * Process 32 bits
b xor
sw s0 * Restore wordmark cleared by xor
* Compute maj(a, b, c): majority function
za +0, maj+31
a a+31, maj+31
a b+31, maj+31
a c+31, maj+31
mz @0@, maj+31
mcw @000@, x1 * x1 is index from 0 to 31
mjl c maj&x1, @2@
bh mjzero
mn @1@, maj&x1 * majority of the 3 bits is 1
b mjincr
mjzero mn @0@, maj&x1 * majority of the 3 bits is 0
mjincr a +1, x1
mz @0@, x1
c @032@, x1
bu mjl
* Compute temp2: S0 + maj
za +0, temp2+31
a s0+31, temp2+31
a maj+31, temp2+31
* Convert bit sum to 32-bit sum
mcw +temp2+31, x1 * x1 = right end of temp1
b sum
mcw g+31, h+31 * h := g
mcw f+31, g+31 * g := f
mcw e+31, f+31 * f := e
za +0, e+31 * e := d + temp1
a d+31, e+31
a temp1+31, e+31
mcw +e+31, x1 * Convert sum to 32-bit sum
b sum
mcw c+31, d+31 * d := c
mcw b+31, c+31 * c := b
mcw a+31, b+31 * b := a
za +0, a+31 * a := temp1 + temp2
a temp1+31, a+31
a temp2+31, a+31
mcw +a+31, x1 * Convert sum to 32-bit sum
b sum
a @8@, kidx * Increment kidx by 8 chars
mz @0@, kidx
ma @032@, bitidx * Increment bitidx by 32 bits
c @!48@, bitidx * Compare to 2048
bu mainlp
* Add a-h to h0-h7
cs 299
mcw @00000@, x1tmp
add1 mcw x1tmp, x1
a a+31&x1, h0+31&x1
ma +h0+31, x1 * Convert sum to 32-bit sum
b sum
ma @032@, x1tmp
c @00256@, x1tmp
bu add1
mcw @201@, x3
mcw +h0, x1
mcw @064@, tobinc
b tohex
w
mcw 280, 180
p
p
finis h
b finis
* Converts sum of bits to xor
* X1 is right end of word
* X2 is bit count
* Note: clears word marks
xor sbr xorx&3
xorl c @000@, x2
be xorx
xorfix mz @0@, 0&x1 * Clear zone
c 0&x1, @2@
bh xorok
sw 0&x1 * Subtract 2 and loop
s +2, 0&x1
cw 0&x1
b xorfix
xorok ma @I9I@, x1 * x1 -= 1
s +1, x2 * x2 -= 1
mz @0@, x2
b xorl * loop
xorx b @000@
* Converts sum of bits to sum (i.e. propagate carries if digit > 1)
* X1 is right end of word
* Ends at word mark
sum sbr sumx&3
suml mz @0@, 0&x1 * Clear zone
c 0&x1, @2@ * If digit is <2, then ok
bh sumok
s +2, 0&x1 * Subtract 2 from digit
bwz suml, 0&x1, 1 * Skip carry if at wordmark
a @1@, 15999&x1 * Add 1 to previous position
b suml * Loop
sumok bwz sumx,0&x1,1 * Quit if at wordmark
ma @I9I@, x1 * x1 -= 1
b suml * loop
sumx b @000@ * return
* Converts binary to string of hex digits
* X1 points to start (left) of binary
* X3 points to start (left) of hex buffer
* X1, X2, X3 destroyed
* tobinc holds count (# of hex digits)
tohex sbr tohexx&3
tohexl c @000@, tobinc * check counter
be tohexx
s @1@, tobinc * decrement counter
mz @0@, tobinc
b tohex4
mcw hexchr, 0&x3
ma @004@, X1
ma @001@, X3
b tohexl * loop
tohexx b @000@
* X1 points to 4 bits
* Convert to hex char and write into hexchr
* X2 destroyed
tohex4 sbr tohx4x&3
mcw @000@, x2
c 3&X1, @1@
bu tohx1
a +1, x2
tohx1 c 2&X1, @1@
bu tohx2
a +2, x2
tohx2 c 1&x1, @1@
bu tohx4
a +4, x2
tohx4 c 0&x1, @1@
bu tohx8
a +8, x2
tohx8 mz @0@, x2
mcw hextab-15&x2, hexchr
tohx4x b @000@
* Converts string of hex digits to binary
* X3 points to start (left) of hex digits
* X1 points to start (left) of binary digits
* tobinc holds count (# of hex digits)
* X1, X3 destroyed
tobin sbr tobinx&3
tobinl c @000@, tobinc * check counter
be tobinx
s @1@, tobinc * decrement counter
mz @0@, tobinc
mcw 0&X3, hexchr
b tobin4 * convert 1 char
ma @004@, X1
ma @001@, X3
b tobinl * loop
tobinx b @000@
tobinc dcw @000@
* Convert hex digit to binary
* Digit in hexchr (destroyed)
* Bits written to x1, ..., x1+3
tobin4 sbr tobn4x&3
mcw @0000@, 3+x1 * Start with zero bits
bwz norm,hexchr,2 * Branch if no zone
mcw @1@, 0&X1
a @1@, hexchr * Convert letter to value: A (1) -> 2, F (6) -> 7
mz @0@, hexchr
b tob4
norm c @8@, hexchr
bl tob4
mcw @1@, 0&X1
s @8@, hexchr
mz @0@, hexchr
tob4 c @4@, hexchr
bl tob2
mcw @1@, 1&X1
s @4@, hexchr
mz @0@, hexchr
tob2 c @2@, hexchr
bl tob1
mcw @1@, 2&X1
s @2@, hexchr
mz @0@, hexchr
tob1 c @1@, hexchr
bl tobn4x
mcw @1@, 3&X1
tobn4x b @000@
* Message schedule array is 64 entries of 32 bits = 2048 bits.
org 3000
warr equ 3000
s0 equ warr+2047 *32 bits
s1 equ s0+32
ch equ s1+32 *32 bits
temp1 equ ch+32 *32 bits
temp2 equ temp1+32 *32 bits
maj equ temp2+32 *32 bits
a equ maj+32
b equ a+32
c equ b+32
d equ c+32
e equ d+32
f equ e+32
g equ f+32
h equ g+32
h0 equ h+32
h1 equ h0+32
h2 equ h1+32
h3 equ h2+32
h4 equ h3+32
h5 equ h4+32
h6 equ h5+32
h7 equ h6+32
org h7+32
hexchr dcw @0@
hextab dcw @0123456789abcdef@
i dcw @00@ * Loop counter for w computation
bitidx dcw #3
kidx dcw #3
* 64 round constants for SHA-256
k dcw @428a2f98@
dcw @71374491@
dcw @b5c0fbcf@
dcw @e9b5dba5@
dcw @3956c25b@
dcw @59f111f1@
dcw @923f82a4@
dcw @ab1c5ed5@
dcw @d807aa98@
dcw @12835b01@
dcw @243185be@
dcw @550c7dc3@
dcw @72be5d74@
dcw @80deb1fe@
dcw @9bdc06a7@
dcw @c19bf174@
dcw @e49b69c1@
dcw @efbe4786@
dcw @0fc19dc6@
dcw @240ca1cc@
dcw @2de92c6f@
dcw @4a7484aa@
dcw @5cb0a9dc@
dcw @76f988da@
dcw @983e5152@
dcw @a831c66d@
dcw @b00327c8@
dcw @bf597fc7@
dcw @c6e00bf3@
dcw @d5a79147@
dcw @06ca6351@
dcw @14292967@
dcw @27b70a85@
dcw @2e1b2138@
dcw @4d2c6dfc@
dcw @53380d13@
dcw @650a7354@
dcw @766a0abb@
dcw @81c2c92e@
dcw @92722c85@
dcw @a2bfe8a1@
dcw @a81a664b@
dcw @c24b8b70@
dcw @c76c51a3@
dcw @d192e819@
dcw @d6990624@
dcw @f40e3585@
dcw @106aa070@
dcw @19a4c116@
dcw @1e376c08@
dcw @2748774c@
dcw @34b0bcb5@
dcw @391c0cb3@
dcw @4ed8aa4a@
dcw @5b9cca4f@
dcw @682e6ff3@
dcw @748f82ee@
dcw @78a5636f@
dcw @84c87814@
dcw @8cc70208@
dcw @90befffa@
dcw @a4506ceb@
dcw @bef9a3f7@
dcw @c67178f2@
* 8 initial hash values for SHA-256
h0init dcw @6a09e667@
h1init dcw @bb67ae85@
h2init dcw @3c6ef372@
h3init dcw @a54ff53a@
h4init dcw @510e527f@
h5init dcw @9b05688c@
h6init dcw @1f83d9ab@
h7init dcw @5be0cd19@
input0 equ h7init+64
org h7init+65
dc @80000000000000000000000000000000@
input dc @00000000000000000000000000000100@ * 512 bits with the mostly-zero padding
end start
I punched the executable onto a deck of about 85 cards, which you can see at the beginning of the article. I also punched a card with the input to the hash algorithm. To run the program, I loaded the card deck into the card reader and hit the "Load" button. The cards flew through the reader at 800 cards per minute, so it took just a few seconds to load the program. The computer's console (below) flashed frantically for 40 seconds while the program ran. Finally, the printer printed out the resulting hash (as you can see at the top of the article) and the results were punched onto a new card. Since Bitcoin mining used double SHA-256 hashing, hashing for mining would take twice as long (80 seconds).
The console of the IBM 1401 shows a lot of activity while computing a SHA-256 hash.

On the left, SMS cards inside the IBM 1401. Each card has a handful of components and implements a circuit such as a gate. The computer contains more than a thousand of these cards.
On the right, the Bitfury ASIC chip for mining Bitcoins does 2-3 Ghash/second. Image from zeptobars (CC BY 3.0 license)
IBM 1009 Data Transmission Unit. This dishwasher-sized modem was introduced in 1960 and can transmit up to 300 characters per second over phone lines. Photo from Introduction to IBM Data Processing Systems.
Subscribe to get updates by email.