博客
关于我
强烈建议你试试无所不能的chatGPT,快点击我
哈夫曼树与哈夫曼编码和解码实现
阅读量:4048 次
发布时间:2019-05-25

本文共 6972 字,大约阅读时间需要 23 分钟。

今天在复习哈夫曼树的时候,发现了一个不错的B站视频,讲解的非常清晰直观。。有讲到了哈夫曼树的原理及哈夫曼编码解码的过程。原理讲解直接移步上述链接,就不再画蛇添足。作为巩固随手练习一道题:

给定一篇文章,统计里面的各个字符出现的频次,并构建哈夫曼树,实现哈夫曼编码和解码的过程。并且计算哈夫曼编码和定长编码的空间节省了多少?

输入的文本:

each year, the american heart association (aha), in conjunction with the centers for disease control and prevention, the national institutes of health, and other government agencies, brings together the most up-to-date statistics on heart disease, stroke, other vascular diseases, and their risk factors and presents them in its heart disease and stroke statistical update. the statistical update is a valuable resource for researchers, clinicians, healthcare policy makers, media professionals, the lay public, and many others who seek the best national data available on disease morbidity and mortality and the risks, quality of care, medical procedures and operations, and costs associated with the management of these diseases in a single document. indeed, since 1999, the statistical update has been cited more than 8700 times in the literature (including citations of all annual versions). in 2009 alone, the various statistical updates were cited 1600 times (data from isi web of science). in recent years, the statistical update has undergone some major changes with the addition of new chapters and major updates across multiple areas. for this year s edition, the statistics committee, which produces the document for the aha, updated all of the current chapters with the most recent nationally representative data and inclusion of relevant articles from the literature over the past year and added a new chapter detailing how family history and genetics play a role in cardiovascular disease (cvd) risk. also, the 2011 statistical update is a major source for monitoring both cardiovascular health and disease in the population, with a focus on progress toward achievement of the aha s 2020 impact goals. below are a few highlights from this year s update. mortality data are presented according to the underlying cause of death. any-mention mortality means that the condition was nominally selected as the underlying cause or was otherwise mentioned on the death certificate. for many deaths classified as attributable to cvd, selection of the single most likely underlying cause can be difficult when several major comorbidities are present, as is often the case in the elderly population. it is useful, therefore, to know the extent of mortality due to a given cause regardless of whether it is the underlying cause or a contributing cause (ie, its any-mention status). the number of deaths in 2007 with any mention of specific causes of death was tabulated by the nhlbi from the nchs public-use electronic files on mortality. the first set of statistics for each disease in this update includes the number of deaths for which the disease is the underlying cause. two exceptions are chapter 7 (high blood pressure) and chapter 9 (heart failure). high bp, or hypertension, increases the mortality risks of cvd and other diseases, and hf should be selected as an underlying cause only when the true underlying cause is not known. in this update, hypertension and hf death rates are presented in 2 ways: (1) as nominally classified as the underlying cause and (2) as any-mention mortality. national and state mortality data presented according to the underlying cause of death were computed from the mortality tables of the nchs world wide web site, the health data interactive data system of the nchs, or the cdc compressed mortality file. any-mention numbers of deaths were tabulated from the electronic mortality files of the nchs world wide web site and from health data interactive.

step1:统计文章中每个字符出现的次数,并保存在map中

step2:将map中的每一项转化为一个单节点树,构成深林
step3:构建霍夫曼树
step4:获得每个字符的编码,保存在map中
step5:编码文章的字符,并将编码的结果输出
step6:依据编码的文档,加上霍夫曼树的信息,解码文档输出
在这里插入图片描述

#include
#include
#include
#include
using namespace std;struct TreeNode {
int count; char charcate; TreeNode* left; TreeNode* right; TreeNode(int n, char c) {
count = n; charcate = c; left = nullptr; right = nullptr; }};bool compare(TreeNode* s1, TreeNode* s2) {
return s1->count < s2->count;}class Solution {
private: unordered_map
codingMap;public: //step1:统计文章中每个字符出现的次数,并保存在map中 unordered_map
countC(string& s) { ifstream ifs; ifs.open(s); _ASSERT(ifs.is_open()); ifs >> noskipws; char temp; unordered_map
data; while (!ifs.eof()) { ifs >> temp; if (data.find(temp) == data.end()) data[temp] = 1; else data[temp]++; } ifs.close(); return data; } //step2:将map中的每一项转化为一个单节点树,构成深林 vector
mapToNode(unordered_map
& data) { vector
out; for (auto ptr = data.begin(); ptr != data.end(); ptr++) { TreeNode* temp = new TreeNode(ptr->second, ptr->first); out.push_back(temp); } return out; } //step3:构建霍夫曼树 TreeNode* creatTree(vector
& Node) { while (Node.size() > 1) { sort(Node.begin(), Node.end(), compare); TreeNode* left = Node[0]; TreeNode* right = Node[1]; Node.erase(Node.begin()); Node.erase(Node.begin()); TreeNode* new_Node = new TreeNode(left->count + right->count, '#'); new_Node->left = left; new_Node->right = right; Node.push_back(new_Node); } return Node[0]; } //step4:获得每个字符的编码,保存在map中 void codeString(TreeNode* root, string code) { if (root == nullptr) return; if (root->left == nullptr&&root->right == nullptr) { codingMap[root->charcate] = code; return; } codeString(root->left, code + "0"); codeString(root->right, code + "1"); } //step5:编码文章的字符,并将编码的结果输出 void coding(string& input, string& output) { ifstream ifs; ifs.open(input); _ASSERT(ifs.is_open()); ifs >> noskipws; char temp; ofstream ofs; ofs.open(output); _ASSERT(ofs.is_open()); while (!ifs.eof()) { ifs >> temp; string codeCharacte = codingMap[temp]; for (auto single : codeCharacte) ofs << single; } ofs.close(); } //step6:依据编码的文档,加上霍夫曼树的信息,解码文档输出 void encoding(string& input_path, string& output_path, TreeNode* root) { ifstream ifs; ifs.open(input_path); _ASSERT(ifs.is_open()); ifs >> noskipws; char temp; ofstream ofs; ofs.open(output_path); _ASSERT(ofs.is_open()); TreeNode* ptr = root; while (!ifs.eof()) { ifs >> temp; if (temp == '0' && ptr->left != nullptr) ptr = ptr->left; else if (temp == '1' && ptr->right != nullptr) ptr = ptr->right; else { ofs << ptr->charcate; if (temp == '0') ptr = root->left; if (temp == '1') ptr = root->right; } } }};int main() { Solution sol; string input_path = "C:\\Users\\426-4\\Desktop\\artic_input.txt"; //step1:统计文章中每个字符出现的次数,并保存在map中 unordered_map
map = sol.countC(input_path); //step2:将map中的每一项转化为一个单节点树,构成深林 vector
node = sol.mapToNode(map); //step3:构建霍夫曼树 TreeNode* root = sol.creatTree(node); //step4:获得每个字符的编码,保存在map中 sol.codeString(root, ""); //step5:编码文章的字符,并将编码的结果输出 string coding_path = "C:\\Users\\426-4\\Desktop\\artic_coding.txt"; sol.coding(input_path, coding_path); //step6:依据编码的文档,加上霍夫曼树的信息,解码文档输出 string encoding_path = "C:\\Users\\426-4\\Desktop\\artic_encoding.txt"; sol.encoding(coding_path, encoding_path, root); return 0;}

转载地址:http://ityci.baihongyu.com/

你可能感兴趣的文章
做个男人
查看>>
转:S3C2410 bootloader ----VIVI阅读笔记
查看>>
转:嵌入式系统 Boot Loader 技术内幕
查看>>
ARM 的宏定义
查看>>
SIGN UP BEC2
查看>>
S3C2440中对LED驱动电路的理解
查看>>
《天亮了》韩红
查看>>
Windows CE下USB摄像头驱动开发(以OV511为例,附带全部源代码以及讲解) [转]
查看>>
关于货币符号以及发音、币别码
查看>>
关于预处理器的学习
查看>>
ARM,S3C2410中脉宽调制定时器
查看>>
Zebra Bar-One 不能批量打印离散号码
查看>>
Platform创建WinCE内核时的编译错误
查看>>
玻璃杯
查看>>
柳永 《雨霖铃》
查看>>
MD2410开发板通过仿真器烧Bootloader简单流程
查看>>
MD2410仿真器烧Bootloader补充[1]:JTAG
查看>>
Meav《One I Love》
查看>>
林锐《高质量C++/C 编程指南》附录之《C++/C 代码审查表》
查看>>
林锐《高质量C++/C 编程指南》附录之《C++/C 编程质量试题》
查看>>