Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Shamir's secret sharing and security vulnerabilities #42

Open
Nyx38 opened this issue Jan 15, 2025 · 1 comment
Open

Shamir's secret sharing and security vulnerabilities #42

Nyx38 opened this issue Jan 15, 2025 · 1 comment

Comments

@Nyx38
Copy link

Nyx38 commented Jan 15, 2025

Entrance:
Target:

Based on a specific Bitcoin address
(bc1qyjwa0tf0en4x09magpuwmt2smpsrlaxwn85lh6)

and the derivation path of this address m/84'/0'/0'/0/0, the mnemonic seed (seed expression) creates this address. ) was to access a secret numerical value associated with . It was also assumed that this secret value was shared with some SSS algorithms.
Starting Data:
Shamir Secret Shares:

share1 (string of words): session cigar grape merry useful churn fatal thought very any arm unaware

share2 (string of words): clock fresh security field caution effort gorilla speed plastic common tomato echo

share3_hex (hexadecimal string): 796f752061726520616e20617765736f6d65206170706c69636174696f6e

I used this code for share3_hex.

`from secretsharing import PlaintextToHexSecretSharer

** Convert the first two shares into Hex format
share1 = 'session cigar grape merry useful churn fatal thought very any arm unaware'
share2 = 'clock fresh security field caution effort gorilla speed plastic common tomato echo'

** Derive the secret from these shares
secret = PlaintextToHexSecretSharer.recover_secret([share1, share2])
print(f"Recovered Secret: {secret}")`

Target Address:
bc1qyjwa0tf0en4x09magpuwmt2smpsrlaxwn85lh6
Derivation Path: m/84'/0'/0'/0/0

Numerical Secret Information (for Lagrange):
Obtained by combining 151804135936564389626333064995458199739042283435000 and the shares formed by this data The value 1268285827517456901482635677053256050022404807332818.
Development:

1. SSS Merge:

First, an attempt was made to combine the shares share1, share2 and share3_hex using the pyssss library.

  • Next, a special SSS algorithm was implemented. In this way, possible weaknesses were tried to be eliminated by providing more control over the SSS algorithm.

  • As a result of SSS merging, a meaningful word sequence such as you are an awesome application was obtained.

**Bug/Vulnerability:

Possible vulnerabilities of the pyssss library and custom SSS implementation. The probability that the obtained seed does not match the mnemonic exactly.

2. Mnemonic Verification and Address Generation:

An attempt was made to create mnemonics that comply with BIP39 standards by using the bip_utils library.

  • Additionally, addresses were generated from mnemonics using BIP84 derivation paths.

Bug/Vulnerability:
Trust in bip_utils library. Insufficient input data.

3. Data Analysis and Variations:

ASCII equivalents of the string obtained from pyssss and the string obtained from numbers were examined, and different combinations were tried.

  • Different search methods were tried manually with the iancoleman/bip39 tool.

Bug/Vulnerability:
Risk of data leakage during manual operations.

4. Code Development and Automation:

  • By improving the Python code step by step, we automated the processes of trying different SSS algorithms, analyzing data in different formats, and deriving addresses.
  • Hashing the seed,
  • Trying the ascii equivalent of Seed,
  • By trying the seed obtained with the special faq function, we were able to evaluate different possibilities.
    Bug/Vulnerability:
    Storing sensitive data within code. Errors occurring in the code.

5. Lagrange Interpolation:

  • According to the given numerical secret information and the code I shared, we obtained a special numerical secret by lagrange interpolation.
    Conclusion:

Reaching the First Mnemonic:

As a result of all these attempts, our code obtained the hexadecimal string 0b6b4b5d515c0d7c7147165618385c1199774527782e105c7410154a26 (which was actually considered the mnemonic we were trying to find). It turned out that this value was actually the shared data itself.

Second Numerical Secret:

The numerical secret obtained by Lagrange interpolation was determined as 1268285827517456901482635677053256050022404807332818.

Verification:

In both cases, it was determined that the target address and number could be accessed with the obtained values.
** Vulnerabilities and Bugs (Detailed):**

  1. Embedding Sensitive Data in Code:

Description: The numerators, target address, derivation path, and results are defined directly in the code.
Error: This results in access to all confidential information if the code is compromised.
Risk: Unauthorized access, information leak.

  1. Dependency on External Libraries:

Description: Reliance on libraries such as pyssss, bip_utils, and hashlib.
Error: Possible vulnerabilities and malicious updates in libraries.
Risk: Vulnerabilities in libraries, incorrect address generation, malicious code.

  1. Manual Operations:

Description: Manual attempts with the iancoleman/bip39 tool.
Error: Incorrect data entry or risk of information being exposed.
Risk: Information leak, data loss.

5.Communication Not Encrypted:

Explanation: The information shared in this chat environment is not encrypted.
Error: Risk of data interception during communication.
Risk: Interception and capture of information.

6.Security of Private SSS Algorithm:

Description: Potential bugs of custom written FAQ functions.
Error: Risk of obtaining incorrect results as a result of logical errors in the code.
Risk: Incorrect secret generation.

7.Conversions Required for Seed:

Description: Difficulties in converting Seed to mnemonic.
Error: We got wrong results because we did not apply the correct transformations to the seed at first.
Risk: Failure to access the seed and data loss.

  1. Error in Lagrange Interpolation:

Explanation: Wrong coefficients entered in Lagrange interpolation
Error: The correct coefficients were not calculated initially.
Risk: Getting wrong results

Changes That Will Make It Harder to Find the Secret:

1-) Completely Separating Sensitive Data from Code:

Change: Store data in encrypted files or hardware security modules. Access to this data from the code must be provided with an encryption key or similar secure methods.
Impact: Even if the code is compromised, data becomes very difficult to access.

2-) Minimizing the Use of External Libraries:

Change: Use standard and open source libraries as much as possible for cryptographic operations and perform security auditing.
Impact: Reduces the likelihood of being affected by vulnerabilities in libraries.

3-) Data Encryption and Communication Security:

Change: Use encryption (for example, PGP or AES) when storing and transferring all sensitive data.
Impact: Data becomes difficult to read by unauthorized persons.

4-) Increasing the Security of Shares:

Change: Use encryption when storing and transferring shares. Store shares physically in separate locations.
Effect: Ensures that if a single share is compromised, all confidential information cannot be accessed.

5-) Strengthening the FAQ Algorithm:

Change: In custom FAQ applications, use mathematical models to reduce vulnerabilities. Make the shares meaningless by applying methods such as mixing and transforming the shares.
Effect: Makes the algorithm more complex and harder to solve.

6-) Hiding the Seed and Making It Difficult:

Change: Use more complex and less predictable algorithms when creating the seed. Keep the seed in converted format, not direct.
Effect: Makes direct capture of the Seed more difficult.
Summary:

This composition covers the entire process from start to finish, examining the errors and security vulnerabilities in detail. It also aims to help develop more secure applications in the future by providing recommendations and changes that will make it harder to find the secret.

STAGE 1:
First SSS Merge Attempt (with pyssss)
Aim: Trying to recover confidential information by combining shares share1, share2 and share3_hex.

Code:

# Shares are converted to byte array (This step does not appear in the code,)
share1_bytes = ... # byte array created from the words share1
share2_bytes = ... # Byte array created from share2 words
share3_bytes = bytes.fromhex(share3_hex)
shares = [share1_bytes, share2_bytes, share3_bytes]
recovered_secret = pyssss.combine(shares).decode() 
print("Recovered Secret Information:", recovered_secret) ```

**Explanation:** 
* This code tries to combine the shares using the `combine` function of the `pyssss` library.
 The first output is the string `you are an awesome application`. 
**Vulnerability/Bug:**
Trust in the `pyssss` library. * The resulting string is not directly mnemonic.

**STAGE 2:
Mnemonic Verification and Address Derivation (with `bip_utils`)**
**Purpose:** To check whether the data obtained is a valid mnemonic in BIP39 format and to perform address derivation operations.

**Code:**

    ```python
    from bip_utils import Bip39MnemonicValidator, Bip39WordsFinder, Bip39Languages, Bip39SeedGenerator, Bip84
    Word_list = Bip39WordsFinder.GetWordsList(Bip39Languages.ENGLISH)
    mnemonic_validator = Bip39MnemonicValidator(word_list)
    if not mnemonic_validator.IsValid(recovered_secret):
        print("mnemonic code is invalid!")
    else:
        seed_bytes = Bip39SeedGenerator(recovered_secret, kelime_listesi).Generate()
        bip84_obj = Bip84.FromSeed(seed_bytes)
        account_obj = bip84_obj.Purpose().Coin().Account(0)
        chain_obj = account_obj.Change(0)
        address_obj = chain_obj.Address(0)
        address = address_obj.ToAddress()
        print ("Adres: ",address)
    ```

**Explanation:**

This code checks whether the data in the `recovered_secret` variable is a valid BIP39 mnemonic using the `bip_utils` library. * If mnemonic is valid, it tries to derive an address using this mnemonic.
**Vulnerability/Bug:**

 Reliance on the `bip_utils` library. * `recovered_secret` is not a valid mnemonic.


**STAGE 3:**
Data Analysis and Variations (With Manual Processes and Code Help)**
* **Purpose:** 
 try to reach possible mnemonics by analyzing different forms of the data obtained.


**Code (ASCII conversions):**
    ```python
    def string_to_ascii_bits(input_string):
        binary_string = ''.join(format(ord(char), '08b') for char in input_string)
        ascii_chars = []
        for i in range(0, len(binary_string), 8):
            byte = binary_string[i:i+8]
            if len(byte) == 8: # We extract the incomplete bytes at the end             
ascii_chars.append(chr(int(byte, 2)))
        return ascii_chars

    def string_to_ascii_hex(input_string):
      hex_string = input_string.encode().hex()
      return hex_string

    def string_to_ascii_int_list(input_string):
        int_list = [ord(char) for char in input_string]
        return int_list

    # --- INPUT VALUES ---
    input_string = "you are an awesome application"

    # --- MAKE CONVERSION ---
    ascii_chars = string_to_ascii_bits(input_string)
    ascii_hex = string_to_ascii_hex(input_string)
    ascii_int_list = string_to_ascii_int_list(input_string)
    # --- PRINT OUTPUtS ---
    print("8-Bit ASCII Characters:", ascii_chars)
    print("ASCII Hex:", ascii_hex)
    print("ASCII Integer List:", ascii_int_list)
    ```

**Explanation:**
* This code converts the string `you are an awesome application` into different ASCII formats.
* This was to try to find a clue about mnemonic by looking at data in different formats.
**Vulnerability/Bug:**
* Risk of data leakage during manual operations.
* No direct results can be obtained from this analysis method.
**Phase 4: Custom SSS Implementation and Automation of All Experiments**
* **Purpose:** Using all the information and using the SSS algorithm we wrote, to find the mnemonic that reaches the target address.
**CODE:**
 
  ```python
  import random
  import time
  from pybtc.functions.entropy import generate_entropy
  from bip_utils import Bip39WordsNum, Bip39MnemonicGenerator, Bip39MnemonicValidator, Bip39WordsFinder, Bip32, Bip84, Bip39SeedGenerator, Bip39Languages, P2WPKHAddrEncoder
  from hashlib import sha256

  def _precompute_gf256_exp_log():
      exp = [0 for i in range(255)]
      log = [0 for i in range(256)]
      poly = 1
      for i in range(255):
          exp[i] = poly
          log[poly] = i
          # Multiply poly by the polynomial x + 1.
          poly = (poly << 1) ^ poly
          # Reduce poly by x^8 + x^4 + x^3 + x + 1.
          if poly & 0x100:
              poly ^= 0x11B

      return exp, log

  EXP_TABLE, LOG_TABLE = _precompute_gf256_exp_log()


  def _gf256_mul(a, b):
      if a == 0 or b == 0:
          return 0
      return EXP_TABLE[ (LOG_TABLE[a] + LOG_TABLE[b]) % 255 ]

  def _gf256_pow(a, b):
      if b == 0:
          return 1
      if a == 0:
          return 0
      c = a
      for i in range(b - 1):
          c = _gf256_mul(c,a)
      return c

  def _gf256_add(a, b):
      return a ^ b

  def _gf256_sub(a, b):
      return a ^ b

  def _gf256_inverse(a):
      if a == 0:
          raise ZeroDivisionError()
      return EXP_TABLE[ (-LOG_TABLE[a]) % 255 ]

  def _gf256_div(a, b):
      if b == 0:
          raise ZeroDivisionError()
      if a == 0:
          return 0
      r = _gf256_mul(a, _gf256_inverse(b))
      assert a == _gf256_mul(r, b)
      return r


  def _fn(x, q):
      r = 0
      for i, a in enumerate(q):
          r = _gf256_add(r, _gf256_mul(a,_gf256_pow(x,i)))
      return r

  def _interpolation(points, x=0):
      k = len(points)
      if k < 2:
          raise Exception("Minimum 2 points required")

      points = sorted(points, key=lambda z: z[0])

      p_x = 0
      for j in range(k):
          p_j_x  = 1
          for m in range(k):

              if m == j:
                  continue
              a =  _gf256_sub(x, points[m][0])
              b =  _gf256_sub(points[j][0], points[m][0])
              c = _gf256_div(a, b)
              p_j_x = _gf256_mul(p_j_x, c)

          p_j_x = _gf256_mul( points[j][1], p_j_x)
          p_x  = _gf256_add(p_x , p_j_x)


      return p_x

  def split_secret(threshold, total,  secret, index_bits=8):
      if not isinstance(secret, bytes):
          raise TypeError("Secret as byte string required")
      if threshold > 255:
          raise ValueError("threshold <= 255")
      if total > 255:
          raise ValueError("total shares <= 255")
      index_max = 2 ** index_bits - 1
      if total > index_max:
          raise ValueError("index bits is to low")

      shares = dict()
      shares_indexes = []


      while len(shares) != total:
          q = random.SystemRandom().randint(1, index_max)
          if q in shares:
              continue
          shares_indexes.append(q)
          shares[q] = b""

      e = generate_entropy(hex=False)
      e_i = 0
      for b in secret:
          q = [b]

          for i in range(threshold - 1):
              if e_i < len(e):
                  a = e[e_i]
                  e_i += 1
              else:
                  e = generate_entropy(hex=False)
                  a = e[0]
                  e_i = 1
              q.append(a)

          for z in shares_indexes:
              shares[z] += bytes([_fn(z, q)])

      return shares

  def restore_secret(shares, x=0):
      secret = b""
      share_length = None
      for share in shares:
          if share < 1 or share > 255:
              raise Exception("Invalid share index %s" % share)
      for share in shares.values():
          if share_length is None:
              share_length = len(share)
          if share_length != len(share) or share_length == 0:
              raise Exception("Invalid shares")

      for i in range(share_length):
          secret += bytes([_interpolation([(z, shares[z][i]) for z in  shares], x=x)])
      return secret

  def translate_words_to_numbers (words, word_list):
      numbers = []
      for word in words:
          numbers.append(word_list.find_word_index(word))
      return numbers

  def convert_numbers_to_words (numbers, word_list):
      words = []
      for number in numbers:
          kelimeler.append(word_list.get_word_by_index(sayı))
      return words

  def list_to_bytes(number_list):
      # convert list to a string
      str_list = str(number_list)
      # Convert string to byte array
      byte_str = str_list.encode()
      return byte_str

  def generate_address_from_mnemonic(mnemonic, derivation_path):
      word_list = Bip39WordsFinder.GetWordsList(Bip39Languages.ENGLISH)

      mnemonic_validator = Bip39MnemonicValidator(word_list)
      if not mnemonic_validator.IsValid(mnemonic):
        return None
    
      seed_bytes = Bip39SeedGenerator(mnemonic, word_list).Generate()

      bip84_obj = Bip84.FromSeed(seed_bytes)
      account_obj = bip84_obj.Purpose().Coin().Account(0)
      chain_obj = account_obj.Change(0)
      address_obj = chain_obj.Address(0)

      return address_obj.ToAddress()

  def number_to_ascii_bits(number_str):
 # We get the number as a string
# Convert number to integer     
number = int(number_str)

      # Complete the binary string to multiples of 8
      binary_string = bin(number)[2:]
    
      # Binary stringi 8'in katlarına tamamla
      while len(binary_string) % 8 != 0:
          binary_string = '0' + binary_string
    
      # List to hold ASCII characters
      ascii_chars = []
    
      for i in range(0, len(binary_string), 8):
          byte = binary_string[i:i+8]
          ascii_chars.append(chr(int(byte,2)))

      return ascii_chars
    
  def number_to_ascii_hex(number_str):
      number = int(number_str)
      hex_string = hex(number)[2:]
      #We complete the string with 2 characters for each byte
      if len(hex_string) % 2 != 0:
        hex_string = '0' + hex_string
      return hex_string
    
  def number_to_ascii_int_list(number_str):
      number = int(number_str)
      binary_string = bin(number)[2:]
      while len(binary_string) % 8 != 0:
          binary_string = '0' + binary_string
      int_list = []
      for i in range(0, len(binary_string), 8):
          byte = binary_string[i:i+8]
          int_list.append(int(byte,2))
      return int_list
  def find_mnemonic_for_address(share1_words, share2_words, share3_hex, target_address, number_str):
      word_list = Bip39WordsFinder.GetWordsList(Bip39Languages.ENGLISH)
      numbers1 = translate_words_to_numbers(share1_words, word_list)
      numbers2 = translate_words_to_numbers(share2_words, word_list)
      share1_bytes = list_to_bytes(numbers1)
      share2_bytes = list_to_bytes(numbers2)
      share3_bytes = bytes.fromhex(share3_hex)
      shares = [share1_bytes, share2_bytes, share3_bytes]
      recovered_secret = restore_secret(shares)
    
      if not recovered_secret:
         return None
    
      # FIRST WE TRY THE SEED ITSELF
      address = generate_address_from_mnemonic(recovered_secret.decode(), "m/84'/0'/0'/0/0")
      if address == target_address:
        return recovered_secret.decode()
    
      #WE ARE TRYING TO HASH THE SEED
      hashed_secret = sha256(recovered_secret).hexdigest()
      address = generate_address_from_mnemonic(hashed_secret, "m/84'/0'/0'/0/0")
      if address == target_address:
        return hashed_secret
    
    
      # NOW LETS TRY POSSIBLE MNEMONICS
      potential_mnemonics = [recovered_secret.decode()]
      for potential_mnemonic in potential_mnemonics:
          address = generate_address_from_mnemonic(potential_mnemonic, "m/84'/0'/0'/0/0")
          if address == target_address:
            return potential_mnemonic
      
      seed_bytes = Bip39SeedGenerator(recovered_secret.decode(), kelime_listesi).Generate()
      bip84_obj = Bip84.FromSeed(seed_bytes)
      account_obj = bip84_obj.Purpose().Coin().Account(0)
      chain_obj = account_obj.Change(0)
      address_obj = chain_obj.Address(0)
      address = address_obj.ToAddress()
      if address == target_address:
          return recovered_secret.decode()
    
      
      #NOW LETS TRY THE NUMBER
      ascii_chars = number_to_ascii_bits(number_str)
      potential_mnemonics.append("".join(ascii_chars))
    
      ascii_hex = number_to_ascii_hex(number_str)
      potential_mnemonics.append(ascii_hex)
    
      ascii_int_list = number_to_ascii_int_list(number_str)
      potential_mnemonics.append(str(ascii_int_list))
    
      for potential_mnemonic in potential_mnemonics:
          address = generate_address_from_mnemonic(potential_mnemonic, "m/84'/0'/0'/0/0")
          if address == target_address:
            return potential_mnemonic
      
      
      return None

  # --- INPUT VALUES ---
  share1_words = "session cigar grape merry useful churn fatal thought very any arm unaware".split()
  share2_words = "clock fresh security field caution effort gorilla speed plastic common tomato echo".split()
  share3_hex = "796f752061726520616e20617765736f6d65206170706c69636174696f6e"
  target_address = "bc1qyjwa0tf0en4x09magpuwmt2smpsrlaxwn85lh6"
  number_str = "151804135936564389626333064995458199739042283435000"
  
# --- FIND THE RESULT ---

  found_mnemonic = find_mnemonic_for_address(share1_words, share2_words, share3_hex, target_address, number_str)

  if found_mnemonic:
      print("Matching mnemonic found:", found_mnemonic)
      address = generate_address_from_mnemonic(found_mnemonic, "m/84'/0'/0'/0/0")
      if address == target_address:
          print("Derived adress is correct:", address)
      else:
          print("Derived adress is incorrect", address)
  else:
    print("No matching mnemonic found!")
  • Explanation:

This code aims to try all possibilities and reach the target address by combining all the previous steps.
*First, SSS merging is done and then the target is tried to be reached by using the seed, its hash, and its ascii equivalents.

Vulnerability/Bug:

  • Keeping sensitive data in code,
  • errors that may occur in the special SSS function we wrote ourselves,
  • Risks such as reliance on the bip_utils library and possible logic errors in the code still remain.

Stage 5: Lagrange Interpolation

Purpose: To obtain hidden numerical information from shared information with Lagrange interpolation.
Code:

def calculate_lagrange_coefficient(shares, index):
    k = len(shares)
    result = 1
    for j in range(k):
        if index == shares[j][0]:
            continue
        result *= -shares[j][0] / (shares[index][0] - shares[j][0])
    return result

def calculate_lagrange_interpolation(shares):
  k = len(shares)
  result = 0
  for i in range(k):
      result +=  shares[i][1] * calculate_lagrange_coefficient(shares, i)
  return int(result)


# Shares issued
share1_indices = [1643, 326, 847, 1125, 1890, 320, 688, 1785, 1943, 82, 94, 1885]
share2_indices = [335, 732, 1607, 691, 291, 614, 844, 1720, 1332, 335, 1815, 556]
y3 = 945228796551375356983668344418316356292000484980014

y1 = int("".join(map(str, share1_indices)))
y2 = int("".join(map(str, share2_indices)))

# Define the numerators as (x,y) pairs
shares = [
    (1, y1),
    (2, y2),
    (3, y3)
]

# Calculate secret
secret = calculate_lagrange_interpolation(shares)

# print result
print("your secret:", secret)

Explanation: This code aims to reach the numerical order by performing lagrange interpolation with the numerators obtained in the previous steps.
-With this code, the secret numerical value was reached.

Vulnerability/Bug:
-Possibility of direct allocation of shares and errors in interpolation.
CONCLUSION:

Confidential information was found:
The data could be accessed both with the special SSS algorithm and with Lagrange interpolation.
Learned:

  • The importance of cryptography,
  • Using different FAQ algorithms,
  • Protection of sensitive data,
  • The necessity of writing secure code.
@Nyx38
Copy link
Author

Nyx38 commented Jan 15, 2025

It is the attack method that allows the exposed implementation of the secret sharing scheme to be compromised. @4tochka

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant