Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Huffman Coding 💯 lossless data compression algorithm. #2664

Open
Ash914027 opened this issue Oct 9, 2024 · 1 comment
Open

Huffman Coding 💯 lossless data compression algorithm. #2664

Ash914027 opened this issue Oct 9, 2024 · 1 comment

Comments

@Ash914027
Copy link

Huffman Coding Problem

Problem Description

Huffman Coding is a lossless data compression algorithm. The idea is to assign variable-length codes to input characters based on their frequencies. The more frequent a character appears, the shorter its code. The goal is to reduce the total size of data while preserving all information.

You are given a set of characters and their corresponding frequencies. Your task is to build a Huffman Tree and find the Huffman codes for each character.

Input

  • A set of unique characters.
  • A corresponding frequency for each character.

Output

  • The Huffman codes for each character, represented as binary strings.

Constraints

  • The number of characters, n, will be such that 1 ≤ n ≤ 100.
  • Frequency values are positive integers.

Approach

  1. Create a Min Heap:

    • Each character and its frequency are stored as nodes in a priority queue (min-heap).
  2. Build the Huffman Tree:

    • While there is more than one node in the heap:
      • Extract the two nodes with the lowest frequency.
      • Create a new internal node with these two nodes as children, and the frequency as the sum of their frequencies.
      • Insert this internal node back into the heap.
    • The remaining node in the heap is the root of the Huffman Tree.
  3. Generate Huffman Codes:

    • Perform a depth-first traversal of the Huffman Tree, assigning a binary digit to each left (0) and right (1) edge.
    • When a leaf node is reached, the path taken from the root to the leaf forms the character's code.

Code Implementation in C++

#include <iostream>
#include <queue>
#include <unordered_map>
#include <vector>

using namespace std;

// A Huffman tree node
struct Node {
    char ch;
    int freq;
    Node* left;
    Node* right;

    Node(char c, int f) : ch(c), freq(f), left(nullptr), right(nullptr) {}
};

// Comparison object to order the heap based on frequency
struct Compare {
    bool operator()(Node* left, Node* right) {
        return left->freq > right->freq;
    }
};

// Function to print the Huffman Codes from the root of the tree
void printHuffmanCodes(Node* root, string str) {
    if (!root)
        return;

    // If this is a leaf node, then it contains one of the input characters
    if (!root->left && !root->right)
        cout << root->ch << ": " << str << "\n";

    printHuffmanCodes(root->left, str + "0");
    printHuffmanCodes(root->right, str + "1");
}

// Main function to build Huffman Tree and decode characters
void huffmanCoding(const unordered_map<char, int>& freqMap) {
    // Create a priority queue (min heap)
    priority_queue<Node*, vector<Node*>, Compare> pq;

    // Create a leaf node for each character and add it to the priority queue
    for (auto pair : freqMap)
        pq.push(new Node(pair.first, pair.second));

    // Iterate until the heap contains only one node (root of the Huffman tree)
    while (pq.size() != 1) {
        // Remove the two nodes of highest priority (lowest frequency)
        Node* left = pq.top();
        pq.pop();
        Node* right = pq.top();
        pq.pop();

        // Create a new internal node with these two nodes as children
        // The frequency of the new node is the sum of their frequencies
        Node* internalNode = new Node('$', left->freq + right->freq);
        internalNode->left = left;
        internalNode->right = right;

        // Add this node to the heap
        pq.push(internalNode);
    }

    // Print the Huffman codes by traversing the tree
    printHuffmanCodes(pq.top(), "");
}

int main() {
    // Input: Frequency of characters
    unordered_map<char, int> freqMap = {
        {'a', 5}, {'b', 9}, {'c', 12}, {'d', 13}, {'e', 16}, {'f', 45}
    };

    // Build Huffman Tree and display the codes
    huffmanCoding(freqMap);

    return 0;
}
@Ash914027
Copy link
Author

PLEASE ASSIGN ME THIS ISSUE AS HACKTOBERBER ACCEPTED AND ACCEPT MY PULL REQUEST

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant