Archive.fm

The Bold Blueprint Podcast

The Bold Blueprint with Avideh Zakhor

However, success is achieved through daily actions—small steps that accumulate over time.

Broadcast on:
09 Oct 2024
Audio Format:
other

Okay, let's start. My name is Wei. I'm a graduate student of Professor Zakhor, and because Professor Zakhor is out of town today, I will teach this lecture. Today we talk about image coding. The first objective of image coding is to compress an image. By compress, we mean that we represent an image with as few bits as possible while still preserving the required quality. So we do not lose much quality, but we reduce the bits needed to represent the image. There are three main components of an image coding system.

The first is what we code for an image. There are several possibilities. The first is that we just code the image directly, which means we code the image intensities, the pixel values. A more popular approach is that we first apply a transform to the image and then code the transform coefficients. And in today's more advanced image coding techniques, we first model the image and then code the model parameters. It depends on the model: if we try to code a circle, for example, all we need to code is the position of the center and the radius R.

The second component of an image coding system is quantization. The problem that quantization solves is how to assign a reconstruction level to each value. In reality the values are analog, or they require many bits to store. Quantization reduces the bits needed to store the image, but at the cost of losing some information, so quantization generally means lossy compression. There are two basic approaches to quantization. The first is that we just assign uniform spacing between quantization levels; this is very simple. The second is that we try to optimize performance by minimizing some error criterion, so that the spacing between quantization intervals is non-uniform. I'll cover these two approaches in more detail.

The third component of an image coding system is bit assignment, which we also call codeword assignment: after quantization we have some integers, but we need to know how many bits to assign to each quantization level. This also has two approaches. The first is that we just assign equal-length codewords to each quantization level. The second is that we assign unequal lengths, which is more complex but more efficient. I'll cover these two in more detail in this lecture.
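To make the model-based option concrete (the circle example mentioned under the first component), here is a tiny Python sketch; the function names and image size are my own illustration, not anything prescribed in the lecture:

```python
import numpy as np

# Model-based coding of a circle: instead of coding every pixel of the
# bitmap, we code only the model parameters (center position and radius R).
def encode_circle(cx, cy, r):
    return (cx, cy, r)                        # only 3 numbers to transmit

def decode_circle(params, height=256, width=256):
    """Re-render the binary image from the model parameters."""
    cx, cy, r = params
    y, x = np.mgrid[0:height, 0:width]
    return ((x - cx) ** 2 + (y - cy) ** 2 <= r ** 2).astype(np.uint8)

# A 256x256 bitmap (65,536 pixels) is reconstructed from just 3 parameters.
img = decode_circle(encode_circle(128, 128, 40))
```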
Now I'll talk about quantization, first uniform quantization. Here we make two assumptions to simplify the explanation. The first is that we directly encode the image intensity. The second is that we use the simple equal-length bit assignment at this stage. In quantization we have scalar quantization and also vector quantization. By scalar quantization we mean that we encode each value independently, without considering the correlation between different values. Vector quantization means that we encode different scalars together, jointly, considering the correlation between different values. We could possibly improve the efficiency of quantization in the vector case, but it's much more complicated. In this class we only cover the scalar case.

The uniform case is actually very simple. We assume the image intensity goes from 0 to 255, and we let L be the number of reconstruction levels. First we compute the spacing between two decision boundaries, delta, which is the maximum of f minus the minimum of f divided by the number of reconstruction levels:

  delta = (f_max - f_min) / L

Using this we can get the decision boundaries and the reconstruction levels. The decision boundaries satisfy the simple relation d_{j+1} = d_j + delta, which means the spacing between two neighboring decision boundaries is equal to delta. And the reconstruction level in each region is just the midpoint of the two neighboring decision boundaries. Let me explain with an example. We have the range 0 to 255 and four levels, so we divide the interval evenly and get 64, 128, and 192. So the decision boundaries, from d0 to d4, are 0, 64, 128, 192, and 255, and the reconstruction levels are just the midpoints of two neighboring decision boundaries: r0 is 32, r1 is 96, r2 is 160, and r3 is 224.

So this is uniform quantization. It's very simple and very easy to implement. But the shortcoming of uniform quantization is also very obvious: it's not optimal in most cases. Actually it's only optimal when the distribution of the image intensity is uniform.
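Before moving on, here is a minimal uniform quantizer in Python following exactly the formulas above (the variable names are mine):

```python
import numpy as np

def uniform_quantize(f, f_min=0.0, f_max=256.0, L=4):
    """Uniform scalar quantizer with L levels over [f_min, f_max]."""
    delta = (f_max - f_min) / L                     # spacing between boundaries
    j = np.clip((f - f_min) // delta, 0, L - 1)     # index of the decision region
    return f_min + (j + 0.5) * delta                # reconstruction = midpoint

# With f_max = 256 and L = 4 the boundaries are 0, 64, 128, 192, 256 and the
# reconstruction levels are 32, 96, 160, 224, as in the example above.
print(uniform_quantize(np.array([10, 100, 150, 250])))  # [ 32.  96. 160. 224.]
```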
By intuition, consider an example where the x-axis is the pixel value and the curve above it gives the frequency with which each value appears. We can see that the values appear more frequently in some regions, so it's reasonable to assign more reconstruction levels in those regions and fewer reconstruction levels elsewhere. By doing this we can reduce the mean square error between the reconstructed value and the original value. So people proposed doing the quantization based on some error criterion; this is the non-uniform case.

The error criterion most used by the community is the MSE, the mean square error. In order to formulate this problem we first need to introduce several variables. The first is the value to be coded: we model it as a random variable f which has a probability density function p_f. We use r_i to represent the reconstruction levels and d_i to represent the decision boundaries, and we use L to represent the number of reconstruction levels. These are the variables I use to formulate the problem. First I define the MSE between the original value and the reconstructed value as epsilon, which is the expected value of (f - f_hat) squared, where f_hat is the reconstructed value. We can write it as an integral from the minimum value of f to the maximum value of f:

  epsilon = E[(f - f_hat)^2] = integral from f_min to f_max of (f - f_hat)^2 p_f(f) df

The problem I want to solve is to minimize epsilon by choosing different r_i and d_i. We know that we have several decision regions, and inside each decision region the reconstructed value f_hat is just a constant. So we can further expand this equation into a sum from the first decision region to the last decision region, where inside each decision region we integrate from d_j to d_{j+1}, and inside those boundaries the reconstructed value is just r_j:

  epsilon = sum over j of the integral from d_j to d_{j+1} of (f - r_j)^2 p_f(f) df

In order to optimize this equation we take derivatives. We first take the derivative with respect to d_j and set it to zero, and we get one relationship between d_j and r_j:

  d_j = (r_j + r_{j-1}) / 2

So the decision boundary d_j is actually the midpoint of two neighboring reconstruction levels. We can also take the derivative with respect to r_j and set it to zero, and we get another equation:

  r_j = (integral from d_j to d_{j+1} of f p_f(f) df) / (integral from d_j to d_{j+1} of p_f(f) df)

This shows that r_j is actually the centroid of the probability mass inside its decision region. I think this agrees with our intuition, in that we try to put more reconstruction levels in the places where the values are more likely to appear, in order to minimize the mean square error. There are roughly 2L of these equations, and unfortunately they are not linear equations, so they are not very easy to solve.
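These two conditions suggest an alternating iteration, which is exactly where the lecture goes next. Here is a minimal numerical sketch in Python; the grid-based integration is my own simplification:

```python
import numpy as np

def lloyd_max(pdf, f_min, f_max, L, iters=200):
    """Alternate the two optimality conditions until they settle:
    boundaries = midpoints of levels, levels = centroids of their regions."""
    x = np.linspace(f_min, f_max, 10_000)   # dense grid for numerical integrals
    p = pdf(x)
    r = np.linspace(f_min, f_max, L)        # initial reconstruction levels
    for _ in range(iters):
        d = np.concatenate(([f_min], (r[:-1] + r[1:]) / 2, [f_max]))
        for j in range(L):                  # centroid condition per region
            m = (x >= d[j]) & (x < d[j + 1])
            if p[m].sum() > 0:
                r[j] = (x[m] * p[m]).sum() / p[m].sum()
    return d, r

# Gaussian density (unnormalized is fine; the centroids normalize), L = 4:
# converges to levels near +/-0.45 and +/-1.51 with inner boundaries near
# +/-0.98, the numbers quoted from figure 10.5 below.
d, r = lloyd_max(lambda x: np.exp(-x**2 / 2), -5.0, 5.0, 4)
```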
In 1960 Max, and in 1957 Lloyd, these two people actually proposed a solution for these equations independently. They solved the equations in an iterative way, so we call the solution of these equations the Lloyd-Max algorithm. In the book, on page 595, there is a table that lists the solution for three distributions: the uniform, Gaussian, and Laplacian. We can see that for the uniform distribution, uniform quantization is already the optimal way to quantize. But for the Gaussian and Laplacian, the optimal quantizer is not uniform; it's non-uniform. In figure 10.5 we can see how the non-uniform quantizer quantizes a Gaussian distribution. Basically, all the values below -0.98 are assigned the value -1.5, and all the values between -0.98 and 0 are assigned the value -0.45. So the spacing between the different levels is non-uniform, and the spacing between the decision boundaries is also non-uniform. Figure 10.6 compares the performance of the two schemes, the Lloyd-Max quantizer and the uniform quantizer. We can see that in terms of the MSE metric, Lloyd-Max quantization outperforms uniform quantization. When the number of bits assigned to each value is 4 bits, so the number of quantization levels is 16, the gain can be 1 or 2 dB, which is considered significant in the compression community.

By taking a look at the table, we see that for the uniform distribution, uniform quantization is already optimal. This provides some insight, and we can possibly find an alternative way to do the quantization for a non-uniform distribution. The idea is this. First, f has a non-uniform distribution, but we apply some nonlinear function to map f into g, such that g is uniformly distributed. Then we can simply apply the uniform quantizer to g and get g_hat; we know this is optimal for g. After we get g_hat, we just apply the inverse of the original mapping function, and we get f_hat. The two mappings do not lose information, and the uniform quantizer in the middle is optimal, so overall we get a non-uniform quantizer for f. There are several choices of how to select the nonlinear function. One possibility is to choose the CDF of f: the mapping function can be

  g = (integral from -infinity to f of p_f(x) dx) - 1/2

where p_f is the probability density function of f. In this way we can map f to a uniform distribution.
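Here is a small sketch of this compand-then-uniform-quantize idea in Python, using an empirical CDF built from samples as the forward mapping (my own illustration, not the book's construction):

```python
import numpy as np

def compand_quantize(f, samples, L=16):
    """Non-uniform quantization via companding: map f through the CDF
    (which uniformizes it), quantize uniformly, then map back."""
    s = np.sort(samples)                         # samples drawn from p_f
    g = np.searchsorted(s, f) / len(s) - 0.5     # empirical CDF, shifted to (-1/2, 1/2)
    delta = 1.0 / L                              # uniform quantizer on [-1/2, 1/2]
    g_hat = (np.floor((g + 0.5) / delta).clip(0, L - 1) + 0.5) * delta - 0.5
    return np.quantile(s, g_hat + 0.5)           # inverse CDF back to f's scale

rng = np.random.default_rng(0)
data = rng.normal(size=100_000)                  # a Gaussian source
print(compand_quantize(np.array([-2.0, -0.5, 0.3]), data))
```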
Okay, now I'll talk about codeword design. The problem that codeword design solves is this. Here is our image coding system: we have an image, and we apply some transform, generally the DCT, and we get some values, and we need to quantize them. After that we get some integer values, and then we do the codeword design, the bit assignment, to get bits, a string of bits like 0, 0, 1, 0.

So now I'll talk about the bit assignment. For the bit assignment we also have two options. The first is uniform: we just assign an equal number of bits to each quantization level. But we can also do it in a non-uniform way. The intuition is that some symbols appear more often than other symbols, so we can possibly assign shorter codewords to the symbols which are more likely to appear, and assign longer codewords to the symbols which are less likely to appear. By doing that, on average, we can possibly shorten the bit length, which achieves the effect of compression.

Here is one example. We have four symbols, each with a probability of appearing: a has 1/2, b has 1/4, c has 1/8, and d has 1/8. The uniform way is very simple: we assign two bits to each symbol, say a is 00, b is 01, c is 10, and d is 11. The average bits per symbol we get is two. But we can try another way, which is non-uniform. We can assign a as 0, b as 1, 0, c as 1, 1, 0, and d as 1, 1, 1. By doing that, we can compute the average bits per symbol by just computing the expectation, which is 1 times 1/2 plus 2 times 1/4 plus 3 times 1/8 plus 3 times 1/8, which is equal to 1.75 bits per symbol. So by doing this we actually save 0.25 bits per symbol.

However, using the non-uniform bit assignment, we need to be careful with the codeword design; otherwise there may be some ambiguity on the decoder side. For example, suppose that instead of 1, 1, 1 for d we use 0, 1, 0. Then when the decoder sees these three bits, it can either explain them as a then b, because a is 0 and b is 1, 0, or explain them as d. So this has some ambiguity, and we cannot design the codewords like this. There is a rule, the prefix rule: we require that no codeword can be a prefix of another codeword. A code which satisfies this requirement is called a prefix code.
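To see the prefix property in action, here is a tiny Python sketch that decodes a bit string under the code above by greedy matching; it is unambiguous precisely because no codeword is a prefix of another:

```python
code = {"a": "0", "b": "10", "c": "110", "d": "111"}   # the prefix code above
decode_table = {bits: sym for sym, bits in code.items()}

def decode(bitstring):
    """Greedy left-to-right decoding, valid only for prefix-free codes."""
    out, cur = [], ""
    for bit in bitstring:
        cur += bit
        if cur in decode_table:            # a complete codeword can never
            out.append(decode_table[cur])  # continue into a longer one
            cur = ""
    return "".join(out)

msg = "abadcab"
bits = "".join(code[s] for s in msg)
assert decode(bits) == msg                 # round-trips with no ambiguity
```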
In order to introduce the codeword design schemes, I first need to talk a little bit about information theory. In information theory they define the entropy. I think most of you might have heard about it. The entropy H is just the expectation of log2(1/p_i), where p_i is the probability that a symbol is equal to symbol i. So we can write it as

  H = - sum from i = 1 to L of p_i log2 p_i

where L is the number of symbols we have in the system. One inequality for the entropy is that H is smaller than or equal to log2 L, and also larger than or equal to 0. The entropy is a metric measuring how much information is carried by a message. If the entropy is 0, we can say there is no information in the message, and when the entropy reaches log2 L, the message carries the most information.

One example: when there are only two symbols in the system, and we assign probability 1 to the first symbol and probability 0 to the second symbol, we get the entropy as 0. We can see this message actually has no information, because we already know what the message contains before we receive it. And if the message assigns probability 1/2 to the first symbol and probability 1/2 to the second symbol, the entropy reaches the maximum, which is 1, because before receiving this message we have no idea which symbol is more likely to appear. For this two-symbol case, the entropy as a function of the probability p1 is an arch-shaped curve: it is 0 at p1 equal to 0 and at p1 equal to 1, and the maximum of 1 is at the middle point, p1 equal to 1/2. So when the probabilities of the two symbols are equal, the entropy reaches the maximum, and when they move away from each other, the entropy decreases. This means that if p1 is near 1, then before we receive this message we already have some idea that it is more likely to contain symbol 1.

From information theory, we know that H is actually the lower bound for the bit rate of a compression scheme, which means that if you do not want to lose information, your scheme's bit rate cannot be lower than H. This theorem does not tell us how to compress the content, but it provides a guideline for compression schemes. If a compression scheme already reaches H, we know we do not need to put more effort into designing some fancy new scheme; but if it does not, there is some room for us to improve our performance.

One example is that we have L symbols, where the number of symbols is a power of 2, and each symbol appears equally likely, so the probability of each symbol is 1/L. Then we get H equal to log2 L. In this case we can see that uniform-length coding actually reaches the optimum performance, so we do not need to find another scheme to compress this kind of content.
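A quick numerical check of these claims in Python (the distributions are just the examples from above):

```python
import numpy as np

def entropy(p):
    """H = -sum p_i log2 p_i, taking 0 * log 0 as 0."""
    p = np.asarray(p, dtype=float)
    nz = p > 0
    return float(-(p[nz] * np.log2(p[nz])).sum())

print(entropy([1.0, 0.0]))           # 0.0  : a certain message carries no information
print(entropy([0.5, 0.5]))           # 1.0  : the two-symbol maximum
print(entropy([0.25] * 4))           # 2.0  : L = 4 equally likely symbols -> log2 L
print(entropy([1/2, 1/4, 1/8, 1/8])) # 1.75 : exactly the prefix code's average length
```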
Okay, I think we can break for five minutes. [break]

Okay, let's go on. Before I talk about Huffman coding, any questions about the stuff I just talked about? No questions? Okay, I'll talk about Huffman coding. Huffman coding is actually a practical method which can achieve near-optimum performance in coding; the coding performance is pretty close to the entropy. I'll use one example to explain how Huffman coding actually works. Here we have six symbols, from S1 to S6, and each symbol has a probability of appearing: S1 has 5/8, S2 has 3/32, S3 has 3/32, S4 has 1/32, S5 has 1/8, and S6 has 1/32.

Huffman coding builds a tree. It first selects the two symbols which are least likely to appear and combines these two symbols together to form a new node, and the new node has the probability which is the combination of the two selected symbols. In this case S4 and S6 have the least probability to appear, so we select these two and form a new node S7, and the probability of S7 is 1/16. Then it assigns a bit to each branch: for one branch it assigns the bit 0, and for the other it assigns the bit 1. Then it picks the next two least likely nodes and combines them into a new node. In this case S2 and S7 have the two smallest probabilities, so we pick S2 and S7 and combine these two nodes together to form a new node S8, which is 5/32. And this process just goes on: we pick S3 and S5 (assigning a bit to each branch, 0 and 1, each time) and combine them into S9, whose probability is 7/32. Then we pick S9 and S8; the combined node is S10, and again we assign a bit to each branch; the combined probability is 3/8. The process ends when we reach probability one, which means it covers all the symbols in the system.

In order to get the codewords, we just travel from the root to each node and read off the branch bits. For S1, we travel from the root to S1, and it is just 0. For S2, we travel from the root the other way, and it should be 110. S3 is 100, S4 is 1110, S5 is 101, and S6 is 1111. So we notice that by doing Huffman coding we automatically obey the prefix code rule: no codeword is the prefix of another codeword. The reason is
that each symbol sits at its own leaf, at the end of its own final branch. Although a codeword shares its earlier bits with other codewords, the last branch differentiates the node from the others, so no codeword can be a prefix of another codeword.

The performance of Huffman coding is actually pretty good. In this case we have six symbols, so if we used equal-length bits for each symbol, we would have to assign three bits per symbol. But using Huffman coding, the average bits per symbol we can compute as the expected value, which is 1 times 5/8 plus 3 times 3/32 plus 3 times 3/32 plus 4 times 1/32 plus 3 times 1/8 plus 4 times 1/32, and the result is 1.8125 bits per symbol. We can see that applying Huffman coding saves more than one bit per symbol in this case. And we can also compute the entropy, which is minus the sum of 5/8 times log2(5/8), two terms of 3/32 times log2(3/32), two terms of 1/32 times log2(1/32), and 1/8 times log2(1/8), and the result is actually 1.752 bits per symbol. So Huffman coding is pretty close to the entropy in this case. Huffman coding is used by today's commercial video coding standards, like MPEG-4.

Of course the average rate R has to be larger than or equal to the entropy, but it is guaranteed to be smaller than the entropy plus one:

  H <= R < H + 1

This is a result from information theory. There is also a tighter upper bound. We use p_max to denote the maximum probability of a symbol. When p_max is smaller than 0.5, the upper bound is H plus p_max plus 0.086; when p_max is larger than or equal to 0.5, the upper bound is H plus p_max. From these upper and lower bounds we can see that if the entropy is high, say 10 bits per symbol, the overhead guaranteed by this upper bound is relatively small, so Huffman coding is quite efficient.
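Here is a compact Huffman construction in Python mirroring the merge procedure just described (my own sketch; the particular 0/1 labeling can come out differently from the tree drawn in lecture, but the codeword lengths and the average agree):

```python
import heapq

def huffman(probs):
    """Repeatedly merge the two least likely nodes, prepending a branch bit."""
    # Heap entries: (probability, tie-breaker, {symbol: codeword-so-far}).
    heap = [(p, i, {s: ""}) for i, (s, p) in enumerate(probs.items())]
    heapq.heapify(heap)
    count = len(heap)
    while len(heap) > 1:
        p1, _, c1 = heapq.heappop(heap)            # least likely node
        p2, _, c2 = heapq.heappop(heap)            # second least likely node
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (p1 + p2, count, merged))
        count += 1
    return heap[0][2]

probs = {"S1": 20/32, "S2": 3/32, "S3": 3/32, "S4": 1/32, "S5": 4/32, "S6": 1/32}
code = huffman(probs)
avg = sum(probs[s] * len(code[s]) for s in probs)
print(code)   # codeword lengths 1, 3, 3, 4, 3, 4
print(avg)    # 1.8125 bits per symbol, as computed above
```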
But if the entropy is very low, much less than one bit per symbol, the upper bound cannot guarantee good performance. There is one example where we have three symbols, and the probability of a1 is very high, 0.95, while the probabilities of the other two are very low, 0.025 each. Using Huffman coding, we assign one bit to symbol 1 and two bits each to symbols 2 and 3, and we get 1.05 bits per symbol. But the entropy in this case is actually only 0.335 bits per symbol. So although Huffman coding is still within the bound, because the entropy is much smaller than one bit per symbol, Huffman coding is not very efficient in this case.

Also, Huffman coding actually assumes that the source is independent and identically distributed. If the source symbols are correlated, like ordinary text, Huffman coding might not be very efficient. So here I'll talk about one more scheme, which exploits the correlation between different symbols, and which we call the dictionary scheme. The basic idea of the dictionary scheme is that we first build a dictionary, in other words a list of patterns, and these patterns must occur repeatedly in the message. Then we can just transmit the index of each pattern in this list. So the dictionary exploits the fact that certain patterns appear again and again in certain messages. And there are two kinds of dictionary: the first is the static dictionary, and the second is the dynamic, adaptive dictionary.

I'll use an example to explain how the static dictionary works. We have five symbols to be coded, say a, b, c, d, and r, and we know the statistics of this source. Of course, for the static dictionary, we have to have an entry for each single symbol itself, so we first include all the single symbols in the dictionary: a as 000, b as 001, c as 010, d as 011, and r as 100. Because we use three bits for each entry, we have three more entries available, and we can find the common patterns in the source using some training sequences. For example, if we find that the pairs ab, ac, and ad appear more often than other patterns, we can include these three two-symbol combinations in the dictionary and assign them the codes 101, 110, and 111.

To encode, take the sequence abracadabra as a general example. Because we know the longest entry has length two, we first read in two symbols, ab. We look up ab in the dictionary and find that ab is 101. Then we read two more, ra, and we look up ra in the dictionary, but we cannot find an entry for ra, so we just look up r instead, and r is 100. Then we read ac, which has an entry, and we encode it, and it just goes on.
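Here is this static dictionary encoder as a short Python sketch (the exact pair entries ab, ac, ad and their codes are the ones assumed in the example above):

```python
# Static dictionary for the five-symbol source: all single symbols, plus the
# three training-selected pairs, each entry coded with a fixed 3 bits.
DICT = {"a": "000", "b": "001", "c": "010", "d": "011", "r": "100",
        "ab": "101", "ac": "110", "ad": "111"}

def encode(msg):
    """Greedy longest-match lookup (the longest entry has length two)."""
    out, i = [], 0
    while i < len(msg):
        pair = msg[i:i + 2]
        if len(pair) == 2 and pair in DICT:    # try the two-symbol entry first
            out.append(DICT[pair]); i += 2
        else:                                  # fall back to the single symbol
            out.append(DICT[msg[i]]); i += 1
    return "".join(out)

# "abracadabra" -> ab, r, ac, ad, ab, r, a: 7 entries * 3 bits = 21 bits,
# versus 11 symbols * 3 bits = 33 bits with single-symbol entries only.
print(encode("abracadabra"))
```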
It is interesting to note that the dictionary scheme is actually a kind of opposite of Huffman coding. In Huffman coding we use fixed-length symbols, but the bits representing each symbol are variable: we assign a different length of bits to each symbol. For the dictionary code, the bits for each entry are fixed, but the fixed-length bits actually represent a variable-length string of symbols. For the static dictionary, I think the decoder side is also very straightforward: because the number of bits representing each entry is fixed, we can just read the bits at the decoder side, look them up in the dictionary, and decode the sequence.

A more interesting part of the dictionary scheme is the adaptive dictionary. The adaptive dictionaries are known as the LZ schemes. We have LZ77 and LZ78, proposed by Ziv and Lempel in 1977 and 1978 respectively. LZ77 is also known as LZ1, and LZ78 as LZ2. Actually, the more widely deployed family is LZ78: the GIF format, for example, uses LZW, a variant of LZ78. But we cover LZ77 in this class.

The basic idea of the adaptive dictionary is that we do not use a fixed dictionary; the dictionary changes. We use a portion of the previously encoded sequence as the dictionary, with a kind of sliding-window technique. So we have two buffers. One we call the search buffer: this is a portion of the previously encoded sequence, and it serves as the dictionary. The other buffer we call the look-ahead buffer: these are the symbols which are not yet encoded. We search the search buffer for matches to the look-ahead buffer, and that is how we encode the look-ahead buffer.

I'll also use an example to explain how LZ77 works. We have a long sequence of symbols, with more symbols on both sides. Here is the search buffer, and here is the look-ahead buffer; in this picture there are eight symbols in the search buffer and seven symbols in the look-ahead buffer, but in general the buffer sizes are much larger than this, in order to capture the recurring patterns; this is just for explanation. We also have a pointer, which we call the match pointer. It moves from the last symbol of the search buffer back toward the head of the search buffer. First, in the look-ahead buffer, we have the symbol a. We move the match pointer back and we find there is a match. In the adaptive dictionary scheme we actually compute a triple, which is the offset, the length of the match, and the next symbol after the match. So for the match pointer, the first symbol of the look-ahead buffer is a; we move back and we have a match, but the match length is only one.
It may not be the longest match in the search buffer, so the match pointer records it and just moves on. It reaches another place where the offset is four and the length of the match is one, so it continues to move on. And there it finds a match where the offset is seven and the length of the match is two. It compares all these matches and finds that the longest match actually happens there, so it encodes these two symbols into the triple: seven as the offset of the match pointer, two as the length of the match, and it also encodes the next symbol, so the third element is the codeword for the next symbol.

You may wonder why we need to code the next symbol, since these two numbers are already enough to decode the matched symbols. The reason for encoding the next symbol after the match into the triple is that sometimes there is no match in the search buffer for the look-ahead buffer. Maybe some symbol simply does not appear in the search buffer, so we cannot find any match for it, and we need the format to cover this case. When there is no match, we represent the symbol as the triple zero, zero, and the codeword of the symbol itself. So we include the next-symbol field in the format in order to cover this special case.

We can also compute the number of bits for each encoded triple. We use S for the length of the search buffer and W for the length of the window; what I mean by window is the search buffer plus the look-ahead buffer. And we use A to denote the size of the symbol alphabet. Then to transmit a triple, the number of bits we need is

  ceil(log2 S) + ceil(log2 W) + ceil(log2 A)

You may find it strange again: why do we need log2 W for the match length? Why can't we just use W minus S instead, in order to reduce the number of bits we transmit, given that W minus S is the size of the look-ahead buffer? People use W instead of W minus S because they want to cover another case: sometimes the match begins in the search buffer but ends in the look-ahead buffer.
That means the length of the match can be longer than the previously encoded data it starts in. Let me cover one example to see how this can be true. This is one more example. We have a window size of 13; the size of the look-ahead buffer is six, and the size of the search buffer is seven. The sequence in this example is cabracadabrarrarrad: the first seven symbols, cabraca, are our search buffer, and the next six, dabrar, are our look-ahead buffer. This example actually covers all three cases.

First, we try to find a match for d, but we can see that there is actually no match for d, so we just encode d as zero, zero, and the codeword of d itself. This covers only one symbol, so we move both the search buffer and the look-ahead buffer forward by one symbol, and we have a new search buffer and a new look-ahead buffer.

Next we have the common case: we have a match for a; actually we have several matches for a here. As I have explained, we move the match pointer back from the end of the search buffer toward the head of the search buffer to find the longest match. In this case the longest match has offset seven, and we can match a, b, r, a in the search buffer, so the length of the match is four. We also code the next symbol after the match, which is the codeword of r. In total we encode five symbols in this triple, so we move the search buffer forward by five, and we also move the look-ahead buffer forward by five.

Now, in this case, we have a match for r close by, but it's not very long, so we move the match pointer further back, and we see that at offset three there is r, a, r. The previously encoded data stops there, but actually we can continue the match, because the next symbols in the look-ahead buffer also keep matching as we copy forward. Let me make this clear: the five symbols r, a, r, r, a actually match starting at offset three. The match starts in the search buffer, but it ends in the look-ahead buffer; this is the third case, in which a match begins in the search buffer but ends in the look-ahead buffer. We still use one triple to encode these five symbols: the offset is three, we match five symbols, and the next symbol after the match is the codeword for d. In this way we can further compress the symbol sequence.

The decoding is actually very easy. The first two triples are very easy to decode, so I just want to show how to decode the third case. On the decoder side, these symbols are already decoded, and we get the triple three, five, and the codeword of d. We know the match pointer moves back three, so it points to r, and we first copy r into the output. Then it moves forward and copies a, and moves forward and copies r. When the match pointer arrives at the end of the previously decoded symbols, it does not stop; it just continues into the newly decoded symbols, so it copies this r, and then copies this a, and then it decodes the symbol d. So actually we can see that the decoder is also very simple: if we use the C language to program it, it is just a for-loop copying from the previously decoded buffer forward.
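Here is that decoder as a Python sketch (the triples and the sequence are the ones from the walk-through above; copying one symbol at a time is what makes the match-extends-into-the-look-ahead case work):

```python
def lz77_decode(triples, history=""):
    """Decode LZ77 triples (offset, match length, next symbol): copy `length`
    symbols starting `offset` back in the already-decoded sequence, then
    append the next symbol."""
    out = list(history)
    for offset, length, sym in triples:
        start = len(out) - offset
        for k in range(length):
            out.append(out[start + k])   # may read symbols this loop just wrote
        out.append(sym)
    return "".join(out)

# The triples from the example, decoded against the already-sent "cabraca":
print(lz77_decode([(0, 0, "d"), (7, 4, "r"), (3, 5, "d")], history="cabraca"))
# -> cabracadabrarrarrad
```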
The dictionary scheme actually works pretty well when there is correlation between symbols. And the authors of the LZ schemes proved that, asymptotically, the performance of LZ77 can approach the entropy, that is, the best compression ratio that could be obtained by knowing all the statistics of the source. In the dictionary scheme there is no assumption about the statistics of the source, and they showed that their scheme can do as well as the best scheme that uses full knowledge of the source statistics. However, the LZ77 scheme has one assumption: the repeating patterns have to be close to each other, because the search buffer and the look-ahead buffer are adjacent to each other. If a pattern repeats itself at a point far away from the look-ahead buffer, LZ77 may not be able to take advantage of it.

I think I have ten more minutes; any questions about the stuff I just talked about? Is the LZ77 scheme clear? [Student question.] Yes, LZW is a variation of LZ78, and LZW is what is used in the GIF format.

So I can talk a little bit about another coding scheme, which we call arithmetic coding, but I may not finish it today. The basic idea of arithmetic coding is that it codes a sequence of symbols together: it assigns a real number to the symbol sequence, and then it codes the real number. So there are two steps in arithmetic coding. The first is to generate a tag, which is actually a real number representing a sequence of symbols, and the second is to assign a binary code to the tag.

I'll use an example to explain how arithmetic coding generates a tag. In this example we have three symbols, A1, A2, A3, and each symbol has a probability of appearing: the probability of A1 is 0.7, the probability of A2 is 0.1, and the probability of A3 is 0.2. Suppose we encode the sequence A1, A2, A3 together. We draw one line from 0 to 1. The segment from 0 to 0.7 represents A1, because A1's probability is 0.7; the segment from 0.7 to 0.8 represents A2, because A2 has probability 0.1; and the segment from 0.8 to 1 represents A3. The first symbol is A1, so we take its segment and enlarge it; this segment starts at 0 and ends at 0.7. Still using the same technique, 70% of the segment goes to A1, so A1's part ends at 0.49; A2's part ends at 0.56; and the last part, A3's, runs from 0.56 to 0.7. Here we want to code A2, so we select its segment and enlarge it; this segment starts at 0.49 and ends at 0.56. Again, 70% goes to A1, which ends at 0.539; A2's part ends at 0.546; and we choose A3's part, which starts at 0.546 and ends at 0.56.

So we have coded three symbols, and we can pick any real number inside this final segment to represent these three symbols without any ambiguity on the decoder side. Generally we can just take the middle point of the segment, which is 0.553.
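Here is a minimal sketch of both directions in Python, reproducing the numbers of this example (the interval bookkeeping follows the procedure just described; variable names are mine):

```python
def arithmetic_tag(sequence, probs):
    """Shrink [low, high) around each symbol's sub-interval; the tag is the
    midpoint of the final interval."""
    cdf, acc = {}, 0.0
    for sym, p in probs.items():            # cumulative sub-intervals on [0, 1)
        cdf[sym] = (acc, acc + p)
        acc += p
    low, high = 0.0, 1.0
    for sym in sequence:
        c_lo, c_hi = cdf[sym]
        span = high - low
        low, high = low + span * c_lo, low + span * c_hi   # zoom in
    return (low + high) / 2, (low, high)

def arithmetic_decode(tag, probs, n):
    """Invert the process: find which sub-interval the tag falls in, emit that
    symbol, zoom into that sub-interval, and repeat n times."""
    out, low, high = [], 0.0, 1.0
    for _ in range(n):
        pos, acc = (tag - low) / (high - low), 0.0
        for sym, p in probs.items():
            if acc <= pos < acc + p:
                out.append(sym)
                span = high - low
                low, high = low + span * acc, low + span * (acc + p)
                break
            acc += p
    return out

probs = {"A1": 0.7, "A2": 0.1, "A3": 0.2}
print(arithmetic_tag(["A1", "A2", "A3"], probs))  # tag 0.553, interval (0.546, 0.56)
print(arithmetic_decode(0.553, probs, 3))         # ['A1', 'A2', 'A3']
```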
Why is there no ambiguity? Consider the decoder side. We receive the number 0.553. In the first step, we know that A1 belongs to 0 to 0.7, A2 belongs to 0.7 to 0.8, and A3 belongs to 0.8 to 1. This real number falls into the category of A1, so we decode the first symbol as A1. Then we select this range and enlarge it. From here we know that if the second symbol is A1, the number will fall into 0 to 0.49; if A2, into 0.49 to 0.56; and if A3, into 0.56 to 0.7. Again we check which category this real number falls into, and we can see it falls into the category of A2, so the second symbol must be A2. Then we further expand A2's category for the third symbol; we have a range for each symbol, we check again, and we see that this real number falls into the category of A3, so we decode the third symbol as A3.

After getting this real number, we have to assign bits to it. The scheme used by arithmetic coding first decides the length of the bits representing this real number. Here 0.553 represents A1, A2, A3. Arithmetic coding assumes the source is i.i.d., so it is easy to compute the probability that this sequence appears, which is just the product of the probabilities of the three symbols; we call this number p(x). The number of bits assigned to this real number is then

  ceil(log2(1 / p(x))) + 1

It has actually been proved that using this many bits, the generated bit sequence for each tag is uniquely decodable. In our example, p(x) is equal to 0.7 times 0.1 times 0.2, which is 0.014, so the number of bits we compute is 8. We write 0.553 in binary and keep eight places: 0.553 is approximately 1/2 plus 1/2^5 plus 1/2^6 plus 1/2^8, which is 0.10001101 in binary. So the bits we use to represent 0.553 are just 10001101: only eight bits for this real number, and hence for the whole sequence.

The theorem also tells us that the average length of arithmetic coding is, of course, larger than or equal to the entropy, but it is smaller than or equal to the entropy plus 2/m, where m is the length of the symbol sequence:

  H <= R <= H + 2/m

So we can see that if we increase the length of the symbol sequence, we asymptotically approach the entropy lower bound, which is what makes arithmetic coding interesting. Any questions about arithmetic coding? Okay, I think that's it for today. Thank you for your attendance.