encoding - How to differentiate between optimal prefix codes and Huffman codes? - Stack Overflow

QuestionWhile studying encoding theory, I encountered a set of code tables where I need to determine w

Question While studying encoding theory, I encountered a set of code tables where I need to determine whether they belong to one of the following categories:

  • A:Prefix codes
  • B:Huffman codes
  • C:Optimal prefix codes
  • D:None of the above

I understand the basic properties of prefix codes and Huffman codes:

Here is the specific issue I’m facing: I have the following code table (CODE4):

Symbol Frequency CODE4
A 3 100
B 4 101
C 8 01
D 5 110
E 6 111
F 10 00

Question While studying encoding theory, I encountered a set of code tables where I need to determine whether they belong to one of the following categories:

  • A:Prefix codes
  • B:Huffman codes
  • C:Optimal prefix codes
  • D:None of the above

I understand the basic properties of prefix codes and Huffman codes:

Here is the specific issue I’m facing: I have the following code table (CODE4):

Symbol Frequency CODE4
A 3 100
B 4 101
C 8 01
D 5 110
E 6 111
F 10 00

My Question:

Why is CODE4 classified as an optimal prefix code but not a Huffman code?

Is there a systematic way to determine whether a code is an optimal prefix code but not a Huffman code?

Share Improve this question edited Nov 18, 2024 at 14:26 JosefZ 30.3k6 gold badges51 silver badges91 bronze badges asked Nov 18, 2024 at 13:18 459zyt459zyt 751 silver badge7 bronze badges 0
Add a comment  | 

2 Answers 2

Reset to default 2

Try to run Huffman algorithm on the frequencies - you would see, that those codes could not be produced by a Huffman algorithm. Huffman always merges nodes with smallest frequencies, but to get those exact codes, you would have to merge nodes that are not smallest.

To get those codes, the following steps must be performed:

  1. A merged with B (both have a prefix of 10) for a total frequency of 7
  2. D merged with E (both have a prefix of 11) for a total frequency of 11
  3. AB merged with DE (both have a prefix of 1). But that's impossible for Huffman, because we are merging frequencies of 7 and 11, while C (frequency 8 ) is still present!

Note, however, that this code still gives each symbol the same number of bits as Huffman would. Because we know that Huffman is always an optimal prefix code - this code is also optimal.

To answer the title question, it might take a bit of work to tell whether a given optimal prefix code can be generated by Huffman's algorithm. Not all can. Also I'm not sure what exactly is meant by a "Huffman code" in this case. More about that later. But, yes, there is a systematic way.

It is straightforward to see if a code is a prefix code. If not, then you're done (D).

If it is a prefix code, which this one is (reading left to right), then you can run Huffman's algorithm and compute the total number of bits to encode the given probabilities, x. Then compare that to the total number of bits using the prefix code, y. If x < y, then the prefix code is not optimal. If x == y, then it is optimal. (It is not possible to have x > y.)

If the prefix code is optimal, which this one is, now you would need to generate all possible Huffman trees for the given probability. There can be many if there are ties when deciding which branches or leaves to merge next. For all ties, make all possible choices to proliferate the generated trees. Then you would need to see if any of those trees match the given prefix code.

In this case, it does not. For these frequencies there are no ties, so there is only one Huffman tree. And it does not match the CODE4 tree. The first tree here is the Huffman code, and the second tree is the given CODE4 prefix code:

They are topologically distinct.

We will consider the case where one of the trees does match. Now you can assign 0's and 1's to each branch of the tree generated by Huffman's algorithm (with the given tie decisions) to match the given optimal prefix code. But then what algorithm exactly would just so happen to make those choices? For that matter, what is the algorithm that would happen to make the same tie decisions? Does the algorithm need to be "simple" in some sense, where it doesn't have a bunch of code stuck in just to match the tie and 0/1 decisions for a particular prefix code?

I would assert that the question is ambiguous for an optimal prefix code that does not "naturally" fall out using some straightforward implementation of Huffman's algorithm.

By the way, there is a sneaky shortcut here. If you know that only one answer is correct, then it cannot be a Huffman code, since then answers B and C would both be correct. B could never be the answer.

发布者:admin,转转请注明出处:http://www.yc00.com/questions/1745615183a4636190.html

相关推荐

发表回复

评论列表(0条)

  • 暂无评论

联系我们

400-800-8888

在线咨询: QQ交谈

邮件:admin@example.com

工作时间:周一至周五,9:30-18:30,节假日休息

关注微信