Lab 2: Bad randomness
Instructions on how to submit Lab 2: Please download all the required files from the lab2 github repo.
-
Problem 0: Please complete the Problem 0 theory questions in the Lab 2 Questions gradescope assignment.
-
Code: Place your code answers in the template
ecdsa/sol.py
for ecdsa questions andwep/attacker.py
. Please include all code necessary to generate your solution in each of the respective methods. Do not just hard code working answers! -
Theory: Please answer each attack’s corresponding theory questions in the Lab 2 Questions gradescope assignment.
Upload all files (sol.py
, attacker.py
) to the Lab 2 Code gradescope assignment.
Running the Lab on Windows
make check
and make venv
do not natively work on Windows.
If you are using a windows machine, please see the Windows Instructions.
Gradescope autograder: Your code will be graded with the Gradescope autograder with a total timeout of 40 minutes.
There is a STRICT 6.0GB memory limit on Gradescope. Reasonable solutions to this lab should not come close to approaching this memory limit.
Plagiarism: Gradescope automatically runs a surprisingly effective plagiarism-detection tool on your submissions. Please do not copy code from your fellow students. Refer to the “Collaboration” section of the course info document for details on what types of collaboration are and aren’t allowed in 6.1600. If you are having trouble completing an assignment for whatever reason, please ask the course staff for help. We are often happy to give help and, in many cases, extensions too! We are not happy when we find copied code.
Problem 0: True/False
Please complete the True/False questions in the Lab 2 Questions gradescope assignment.
Problem 1: Bad randomness in key generation
The file
ecdsa/keygen.py
generates an ECDSA signing keypair and prints the
resulting public key to standard output.
-
The public keys generated using this script are insecure, in the sense that it is possible to recover the secret signing key given only the public verification key.
Your task is to write a program,
problem_1a
insol.py
, that:-
takes a date (formatted as
YYYY-MM-DD
) as a single command-line argument, -
reads a public key, generated by
ecdsa/keygen.py
, from standard input, and -
outputs the corresponding secret key, formatted as a base-10 integer.
For example, on September 24, 2024 we generated a key…
$ python keygen.py > key.txt -----BEGIN PUBLIC KEY----- MFkwEwYHKoZIzj0CAQYIKoZIzj0DAQcDQgAE/UsMU+m9RnORLVXJakagaKLcXtPDo8RCw4KqhsYG hYsFJsQws+nQk0qdbm5G4bJIAINhav/wPNT7W26NY3CLFQ== -----END PUBLIC KEY-----
Later on, we should be able to use
problem_1a
to recover the secret key…$ with open('key.txt', 'r') as file: $ print(problem_1a("2024-09-24", VerifyingKey.from_pem(file.read()))) 57cc1bca1dccf1f222d1ea6ac390d2e331d4650add7e5e8d85743e9e6ce5e7a1
Run
make check
to test your solution.Hint: Your program should not require more than a few minutes of compute time.
Hint: If you are running on a multicore machine, use as many cores as you can!
-
-
Your friend proposes instead reading
N
bytes of randomness from/dev/random
and using these bytes as the seed used to generate a 256-bit ECDSA keypair.For which values of
N
is this approach secure? (Indicate all that apply on the Lab 2 Questions gradescope assignment)-
4 bytes
-
16 bytes
-
32 bytes
-
256 bytes
-
Problem 2: Bad randomness in ECDSA
ECDSA is one of the most widely used digital-signature schemes. ECDSA is a randomized signature scheme; generating each ECDSA signature requires the signer to sample a fresh random signing nonce.
See Chapter 19.3 of Boneh-Shoup book for a detailed description of ECDSA. You should think of an ECDSA signature as being computed as:
\(s \gets(H(m) + f(g^{\alpha_\text{t}}) \cdot \alpha)/\alpha_\text{t} \in \mathbb{Z}_q\),
where:
-
\(q \in \mathbb{Z}\) is a fixed 256-bit prime,
-
\(m \in \{0,1\}^*\) is the message to be signed,
-
\(H\colon \{0,1\}^* \to \mathbb{Z}_q\) is a hash function,
-
\(g\) is the generator of an order-\(q\) group \(\mathbb{G}\),
-
\(\alpha_\text{t} \in \mathbb{Z}_q\) is the signing nonce,
-
\(\alpha \in \mathbb{Z}_q\) is the signer’s secret key, and
-
\(f \colon \mathbb{G} \to \mathbb{Z}_q\) is some function fixed in the ECDSA standard.
Throughout this problem, we assume that ECDSA signatures are always computed using the P256 elliptic curve, where the order \(q\) is a 256-bit integer.
-
Say that an attacker can obtain two valid ECDSA signatures \((r_0, s_0)\), \((r_1, s_1)\) that were generated using the same secret key \(\alpha\) and signing nonce \(\alpha_\text{t}\).
Determine how an attacker can use these signatures to recover the signer’s secret key \(\alpha\). In particular, derive an expression for \(\alpha\) in terms of the other quantities.
There is nothing to turn in for this question, but your answer will be used in part
b
.Hint: Your attack should not need to use any properties of elliptic curves.
Hint: Appendix A.2 of the Boneh-Shoup book has helpful background about arithmetic modulo \(q\).
-
Write a program, in
problem_2b()
that takes as input two ECDSA signatures, signed using the same nonce, the hashes of the two messages signed, and outputs the signer’s secret key.The input to problem_2b will be two tuple signatures (in the form
(r,s)
) as well asH(m1)
andH(m2)
(converted to an integer) for example:sig1 = (3373270495537608166420301124031645059552155087339817978895, 4866115514576831317719439267655910857343196914135233616904) sig2 = (3373270495537608166420301124031645059552155087339817978895, 1026436076375142414773366823398026947727009880581933863772) Hm1 = 549937035807590235590408220127762782653536091071 Hm2 = 111625161468258865202361239710433310078751980605
Each
(r,s)
pair is one ECDSA signature. The orderq
of the NIST curve is listed here for your convenience.q=6277101735386680763835789423176059013767194773182842284081
-
In the ECDSA specification, \(\alpha_\text{t}\) is a uniformly random integer in the range \(\{1, 2, 3, \dots, q-1\}\), where \(q \approx 2^{256}\) is the group order. Since ECDSA in this setting is only supposed to provide 128-bit security, you might think that it would be safe to instead sample the signing nonce \(\alpha_\text{t}\) as a random number in the range \(\{1, \dots, 2^{128}\}\). Call this modified system “BadECDSA”.
We claim that BadECDSA can have at most 64-bit security. Why is it that an attacker can recover the signer’s secret key with constant probability after an attacker obtains \(2^{64}\) BadECDSA signatures? Please answer in the Lab 2 Questions gradescope assignment.
Problem 3: Security issues in the WEP encryption scheme
The early versions of wifi used the WEP standard to encrypt wireless network traffic. In this problem, we will explore a few weaknesses of the WEP standard.
WEP encrypts each packet (“data frame”) using the RC4 stream cipher. Given a secret key \(k\), the RC4 cipher generates a long sequence of pseudorandom bytes – the “keystream”. To encrypt a message we XOR these bytes with the message. To decrypt, we XOR these bytes with the ciphertext.
-
For the RC4 secret key, the WEP endpoints use a long-term secret (typically of 40 or 104 bits), concatenated with a 24-bit random initialization vector (IV). The long-term secret is fixed – it is the wifi password – but the IV changes with each data frame.
Each 802.11b data frame is at most 2312 bytes and an 802.11b network has a theoretical maximum throughput of 11 megabits per second. Roughly how long will an attacker have to wait to see two frames encrypted using the same IV, assuming a busy network operating at maximum capacity?
What information can an attacker learn when this occurs?
Please answer in the Lab 2 Questions gradescope assignment.
-
WEP uses an insecure “hash-then-encrypt” scheme for integrity protection. In particular, the sender sends the RC4 encryption of \((m \| \text{Hash}(m))\) to the recipient. To check message integrity, the recipient decrypts the frame to get \((m \| h)\) and accepts the packet if \(\text{Hash}(m) = h\).
Show that if the attacker intercepts a data frame with the encryption of a known plaintext \(m\), the attacker can trick the recipient into accepting a message (of length \(|m|\)) of the attacker’s choosing.
Implement your attack as
attack_one
inwep/attacker.py
. The precise encryption scheme used by WEP is implemented bysend_packet()
inwep/victim.py
. -
WEP uses the CRC32 non-cryptographic hash function to compute the message-integrity hash.
The CRC32 hash function has the property that \(\text{CRC32}(x) \oplus \text{CRC32}(y) \oplus \text{CRC32}(z) = \text{CRC32}(x \oplus y \oplus z)\). Explain how an attacker can abuse this property to XOR bytes of its choice into a WEP-encrypted data frame, even without knowing the message that the frame encrypts.
Implement your attack as
attack_two
inwep/attacker.py
. -
Optional Extra credit (challenging): A WEP recipient who receives a data frame with an invalid integrity hash will complain, while a recipient who receives a valid data frame will not. Explain how an attacker can use this information to extract the entire contents of an encrypted frame.
Implement your attack as
attack_three
inwep/attacker.py
.Hint: Think about resizing the packet that the attacker intercepts.
Hint: The fastest attack will exploit some structural properties of the CRC32 checksum. You will have to do some research to figure it out.
Optional (no extra credit): Bad randomness in GMAC
Read Chapter 9.7 of Boneh-Shoup book, which is about the AES-GCM mode of operation. AES is by far the most widely used cipher today and GCM is a popular modern mode of operation, used in TLS 1.3 and many other places.
This problem will demonstrate that bad randomness is catastrophic for AES-GCM: if the sender of a message reuses an encryption nonce even once, an attacker can break the CCA security of the encryption scheme.
-
Say that an attacker intercepts a GCM ciphertext \((c,t)\) and is somehow able to obtain the GHASH key \(k_\text{m}\) used to generate this ciphertext.
Explain how the attacker can use its knowledge of \(k_\text{m}\) to break the CCA security of AES-GCM.
-
Say that an attacker intercepts a pair of distinct GCM ciphertexts \((c_0, t_0)\) and \((c_1, t_1)\), both encrypted with the same secret key and the same 96-bit encryption nonce \(\mathcal{N}\). Show how the attacker can recover the GHASH key \(k_\text{m}\) from these two ciphertext pairs.
Hint: The function GHASH is defined using arithmetic in \(GF(2^{128})\), but you can think of GHASH as using arithmetic modulo a 128-bit prime \(p\). In this setting, all of the “nice” algebraic properties hold: a polynomial of degree \(d < p\) has at most \(d\) distinct roots (and there is an efficient algorithm to find them), every element has an additive and multiplicative inverse, etc.
-
Propose one way to modify AES-GCM so that this integrity attack is not possible, even if the sender reuses the nonce. In one sentence, speculate on why the GCM designers did not incorporate your fix into their design.