Archive.fm

The Bold Blueprint Podcast

The Bold Blueprint Avideh Zakhor Celebrate Your Progress

As you work toward your dreams, take time to celebrate the small victories.

Broadcast on:
09 Oct 2024
Audio Format:
other

I believe Cindy is going to present the solution toward the end of the class. I did some soul-searching about the presentations and the last day of classes, and I think the best thing to do is just stick to the original plan. I'm going to skip the faculty retreat on May 5th, at least the morning portion. We have people who did projects rather than just literature surveys, and at this point I forget who's doing what; I think a lot of you are doing projects, but some are doing pure literature surveys. Let me just do a quick show of hands. Just about everybody here is doing a project, right? Yeah. You're doing a survey. And David, are you doing a survey? Okay. So we'll just stick to that plan. We have about two hours. I think the only issue is with you, Casey, right? Well, that's okay, you can just hand in your PowerPoint together with your report, unless you can attend the Friday, May 5th class. That might be possible? Oh, if you're here, by all means; I was just under the impression you couldn't make it. I don't mean to exclude you; it's just that if you can't make it, you can hand it in as part of your report. Any questions? Yeah, whatever I said in the original handout, which was May 15th. Basically, the plan is we have about an hour and a half; we might go over, because on Fridays there's no class right after this. I might also reserve a different room in Cory, because they have an LCD projector there; if I change the room, I'll let you know ahead of time. There are about 10 projects and about 90 minutes, so it's about 10 minutes per project. Any other questions? Yes, there are some groups with more than one student. Actually, at some point I'd like an update on that: since I received your proposals there have been some changes in the groupings; some students merged into one group, and I was going to say some split, but I don't think anybody split. What's the best way for me to figure out exactly how many groups there are? Why don't you just send me an email, and I'll keep track of it. One email per group, please, no more than one person writing per group, because otherwise I'm going to get confused. Okay, send an email.
Please make the subject line EE225B, because when I go searching among my 7,000 emails at the end of the day, if you put EE225B in the subject line I can quickly locate it; otherwise it's going to take me five hours to find all your emails, which is another big problem. So: subject line EE225B, tell me the number of people in your project and the title. Let me make an announcement for the people who came late: May 5th, as planned, is the presentation day for projects. And if you don't have everything done by May 5th and you get more done between May 5th and May 15th, that's fine; I don't expect the project to be in its final form on May 5th. Give an overview of what you've done and what you hope to accomplish by the 15th. Any other questions or comments? Okay, so, yeah, pick up the solution on your way out. Yeah, why don't you CC C70 as well? That's a good idea. And once we collect the titles of the projects, I'll come up with a schedule and announce it, so that, you know, this group goes first, this group second, with the timing and so on. Okay, any other questions? All right, so let me get going on today's lecture. What we talked about last time was essentially variations of waveform coding. We talked about pulse code modulation, PCM, which is the simple idea of taking the pixels, quantizing them uniformly, allocating bits to each quantization bin, and sending that over. Then we talked about delta modulation, where you essentially subtract successive pixels: you subtract from the current pixel under consideration the previous pixel you've encoded, but the quantized version of it, which the receiver has access to, and then quantize that difference. It's not f(n) minus f(n-1) that you quantize; rather, it's f(n) minus the quantized version of f(n-1), simply because the encoder and decoder need to maintain this loop. They have to maintain the same state: for prediction, the encoder can only work with quantities that the decoder has access to, and the decoder, unfortunately, only has access to the quantized versions of the previously decoded pixels, not the actual uncompressed ones. In any event, if I show it on the camera, this was the guts of delta modulation from last time. This was regular PCM right here, this was PCM with Roberts noise, this is delta modulation, and we ended last time's lecture with DPCM, differential pulse code modulation. Just as a reminder of how it worked: you start with the pixels. You have some prediction for f(n1, n2), the pixel at location (n1, n2); this is the pixel you want to encode. You come up with some prediction, and I deliberately don't tell you how: it could be linear, it could be nonlinear, it could be fancy, any complicated function you can dream of. This prediction is a function of all the previously encoded pixels. In this case I've only drawn three of them, but it could be all of them, if that's what you want.
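As a quick aside on the delta modulation recap above, here is a minimal one-dimensional sketch of that loop (my own illustration, not from the lecture; the step size and initial state are arbitrary choices). The point it shows is that the predictor uses the previously reconstructed sample, so encoder and decoder share the same state.

```python
import numpy as np

def delta_modulate(x, step=4.0):
    """1-D delta modulation sketch: 1-bit quantization of the prediction error.

    The predictor is simply the previously *reconstructed* sample, so the
    encoder and the decoder stay in the same state.
    """
    bits = np.zeros(len(x), dtype=np.int8)
    recon = np.zeros(len(x), dtype=float)
    prev = 0.0                        # both sides agree on the initial state
    for n in range(len(x)):
        e = x[n] - prev               # error against the reconstructed past
        bits[n] = 1 if e >= 0 else -1
        prev = prev + bits[n] * step  # exactly what the decoder will compute
        recon[n] = prev
    return bits, recon

def delta_demodulate(bits, step=4.0):
    # the decoder just accumulates the signed steps
    return np.cumsum(bits.astype(float) * step)
```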
You compute the error, and the PCM block here means you quantize that error to get ê. This ê is what gets sent to the receiver. Meanwhile, at the transmitter, you want to keep track of what the receiver is doing. So what do you do? You add ê to the prediction and compute what the receiver would have computed, which is f̂(n1, n2), so that later on, in all your predictions, you only use values that the decoder has access to. I use decoder and receiver interchangeably. So you only use the pixel values that the decoder has access to. You store that in the bank of all the previously encoded pixels, so that you can use it later for the prediction of the next pixel, and the next pixel. Is that concept more or less clear? Yep. And for different kinds of data you do different things. For example, if you come up with an autoregressive model for images, you can make a prediction which is a linear combination of previous pixels. What Cindy is working on is compressing VLSI layouts, and VLSI layouts have a very nice Manhattan geometry: straight horizontal edges, straight vertical edges. So you build your predictor to match that kind of data. The better you make your predictor, and the more you exploit knowledge of the domain you're working in, the closer f′ is to f, and therefore the smaller the variance, or energy, of this error signal is; and the smaller the energy of this signal, the fewer bits you need to spend to represent it with a given fidelity. Fidelity meaning distortion, mean square error. If I have a fixed mean square error in mind for encoding this error signal, I spend the fewest bits representing it when the variance of e is as small as possible. So you want f′ to come as close as possible to f, and you can be as clever as you want in the prediction process. This is, by the way, also true when we get to video coding, where the prediction box predicts between successive frames; that's where motion vectors come in. Anyway, the receiver is pretty straightforward: it gets ê, adds f′ to it, and because f′ was built from a known prediction function of the previously encoded pixels, all of which are available to the receiver, the sum is what the receiver displays as f̂(n1, n2).
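To make the encoder/decoder bookkeeping concrete, here is a minimal DPCM sketch in Python (my own illustration; the one-tap predictor, uniform quantizer step, and raster-scan order are simplifying assumptions, not the lecture's exact setup). The key point is that the encoder predicts from the reconstructed pixels f̂, never from the originals, so it stays in lockstep with the decoder.

```python
import numpy as np

def dpcm_encode_decode(img, a=0.95, step=8.0):
    """Minimal DPCM sketch with a one-tap predictor f'(n1,n2) = a * fhat(n1, n2-1).

    Returns the quantized errors (what would be transmitted) and the
    reconstruction fhat, which is identical at encoder and decoder.
    """
    img = img.astype(float)
    fhat = np.zeros_like(img)   # reconstructed pixels (the decoder's state)
    ehat = np.zeros_like(img)   # quantized prediction errors
    rows, cols = img.shape
    for n1 in range(rows):
        for n2 in range(cols):
            # prediction from the previously *reconstructed* pixel to the left
            fprime = a * fhat[n1, n2 - 1] if n2 > 0 else 0.0
            e = img[n1, n2] - fprime
            ehat[n1, n2] = step * np.round(e / step)   # uniform quantizer stand-in
            fhat[n1, n2] = fprime + ehat[n1, n2]       # same value the decoder computes
    return ehat, fhat
```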
So what are some variations? Before I get into the equations and show you at least one example of how to do the prediction, I want to talk about two things: what are some pitfalls of this system, and what are some variations of it? How about if we change the prediction as a function of the region of the image we're dealing with? Some regions of the image are flat, others are more textured, so you might say, let's just adapt the prediction. What would be the pitfall in doing that? Yeah: every time you adapt you gain something, but what do you lose? You have to send extra bits to the receiver so that the receiver also knows you switched to a different prediction mode. Say the transmitter and the receiver each have four or five prediction modes. The transmitter has access to the actual raw uncompressed data, so it can choose the best prediction model; the problem is that you still have to send the overhead specifying which of the four or five prediction models you're using, and we'll talk about that a little more shortly. So every time you do adaptation, in all of these compression techniques, you gain some, but you have to make sure the gain isn't offset by the extra overhead you have to send. And looking at these two loops, the transmitter loop and the receiver loop, what else could go wrong? Anybody? Suppose you're sending this over a wireless channel where packets get dropped. Suppose one particular ê(n1, n2) was in a packet that, due to channel conditions, got lost, and we weren't using an end-to-end protocol like TCP; it was UDP, and it was a low-delay application, so we didn't even have a chance to retransmit. Boom, one of these values is gone. That means the receiver can't build f̂(n1, n2). What do you think happens? Yeah, that's exactly right: they get out of sync. It actually depends on whether or not the receiver knows the pixel was lost; let's say, for the sake of discussion, that it knows. If it doesn't know, things really get massacred. But assume it does know that it didn't receive a particular ê(n1, n2). It can't compute its f̂, it gets out of sync, and the error just propagates. For that reason, in almost all prediction-based schemes operating over error-prone channels, where things are likely to get lost, you have to have re-sync points where things get re-synchronized. There are, of course, also error concealment techniques; if you read the image processing and video communication literature, you'll find techniques where, to keep all hell from breaking loose, the receiver uses some other previously decoded value, for example f̂(n1 - 1, n2), as an estimate of the missing f̂. So what I want you to understand is that these are two very tight loops, and they have to stay in sync with each other. Now what I want to talk about is what the possibilities are for coming up with a prediction box like this, and I'm going to start on that now. For the sake of completeness, let me just redraw the DPCM diagram from last time.
You start with f(n1, n2) and you subtract from it your prediction, f′(n1, n2). Here is your quantizer, and out comes ê(n1, n2); this is what gets sent to the receiver. But locally at the transmitter you still want to know what the receiver is doing, so you add ê to f′ in order to come up with f̂(n1, n2). And this f′ signal is really the output of some prediction box, and the prediction is based on previously encoded pixels: f̂(n1 - 1, n2), et cetera. I'm just repeating the same diagram so it's fresh on the new sheet. Having drawn that, the question we're going to start answering is: how do we do the prediction? Okay. So, how to predict? Let's say we do what's called linear prediction. We write the predicted value for pixel (n1, n2) as a linear combination of the previously encoded pixels:

f′(n1, n2) = Σ Σ a(k1, k2) · f̂(n1 - k1, n2 - k2),

where the double sum runs over (k1, k2) in some region of support R_a. All you're saying is: linearly combine these previously encoded pixels to come up with the prediction. So now the question is how to choose the a's, and this is where we ended last time's lecture. Well, we had better choose them in an optimal way, such that we maximize or minimize something. And one way to do that, as I told you earlier: this is e(n1, n2), and the smaller the energy of this signal, or its variance in a statistical sense, the fewer bits we spend to represent it. So the natural thing to optimize is the energy, or variance, of this signal. A good thing to do, as I said, is to minimize E[e²(n1, n2)]. And what is that? Well, e is f minus f′, and f′ is the summation above, so

E[e²(n1, n2)] = E[( f(n1, n2) - Σ Σ a(k1, k2) · f̂(n1 - k1, n2 - k2) )²].

That's what we want to minimize.
In general this is hard to minimize, because f and f̂ are related in a nonlinear way through the quantizer up here and the various other things going on in this system. Ideally, every time you minimize something with respect to some variable, in this case the a's, you want the objective to be quadratic in that variable, so that when you take the derivative you get a linear system of equations, solve it with known techniques, and boom, you're done. This is not quadratic, because f and f̂ are related through the quantizer, which is nonlinear. So what do we do? Well, we kludge our way through. The way you kludge it, you say: if you're doing a good job of quantizing, and this is a really good compression system, then f and f̂ are very close to each other. So f is approximately equal to f̂, and I'm just going to shamelessly replace every f̂ with f. Then I do get a quadratic problem:

E[e²(n1, n2)] = E[( f(n1, n2) - Σ Σ a(k1, k2) · f(n1 - k1, n2 - k2) )²],

which is quadratic in the a(k1, k2). Our goal is to minimize this with respect to the a's, so take the derivative with respect to each a and set it to zero. If we do that, we get a linear system of equations, which we can solve with any technique we like, and the solution satisfies

R_f(l1, l2) = Σ Σ a(k1, k2) · R_f(l1 - k1, l2 - k2),   with (k1, k2) ∈ R_a.

Well, what is R_f? We're assuming f is a stationary random process, and R_f(n1, n2) is its correlation function. So once again, not only in restoration problems but also in compression, you come up with a statistical model of your image: you assume it's stationary and has a correlation function. We've already dashed this stationarity assumption many, many times, but we keep coming back to it; images aren't strictly stationary, but you can still do interesting things if you assume they are. So if you come up with some model for R_f and assume some region of support R_a, then you can solve for the a(k1, k2). Jae Lim's book shows an example of exactly this in figure 10.30.
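Here is the normal-equations recipe above written out as a small sketch (my own illustration, not the book's code): estimate the autocorrelation R_f empirically from an image, build the linear system for a hypothetical three-tap causal support, and solve for the predictor coefficients.

```python
import numpy as np

def optimal_predictor(img, support=((0, 1), (1, 0), (1, 1))):
    """Sketch: solve R_f(l) = sum_k a(k) R_f(l - k) for the linear predictor a.

    R_f is estimated empirically from the mean-removed image; the support
    is a hypothetical 3-tap causal neighborhood.
    """
    x = img.astype(float) - img.mean()
    rows, cols = x.shape

    def R(l1, l2):
        # empirical autocorrelation at lag (l1, l2)
        a = x[max(l1, 0):rows + min(l1, 0), max(l2, 0):cols + min(l2, 0)]
        b = x[max(-l1, 0):rows - max(l1, 0), max(-l2, 0):cols - max(l2, 0)]
        return (a * b).mean()

    taps = list(support)
    A = np.array([[R(l1 - k1, l2 - k2) for (k1, k2) in taps] for (l1, l2) in taps])
    b = np.array([R(l1, l2) for (l1, l2) in taps])
    return dict(zip(taps, np.linalg.solve(A, b)))
```

This is the two-dimensional analogue of the classic Yule-Walker equations for an autoregressive model.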
Figure 10.30 shows the three-bit-per-pixel DPCM quantization of the Lena picture, with a signal-to-noise ratio of 16.6. I want you to compare this three-bit-per-pixel result with — unfortunately, Jae Lim doesn't have a very good set of comparisons; he has a delta modulation picture with two curves. I'll switch to Gonzalez to give you a better feel for the different prediction methods and various other things. But for now, for figure 10.30 in Jae Lim, you assume a(1, 0) = a(0, 1) = 0.95, which presumably corresponds to some particular model of R_f — he doesn't state precisely which — and a(1, 1) = -0.95. Now, as I said, a much cleaner and more detailed example is in Gonzalez and Woods. He actually uses different prediction equations to produce different images, and I'll show you some of that too; I think it's a lot more insightful than what you have in Jae Lim's book. So, bottom line, here are some possible prediction equations.

Prediction model 1: f′(n1, n2) = 0.97 · f(n1, n2 - 1). Just a one-tap predictor.

Prediction model 2: f′(n1, n2) = 0.5 · f(n1, n2 - 1) + 0.5 · f(n1 - 1, n2).

Prediction model 3: f′(n1, n2) = 0.75 · f(n1, n2 - 1) + 0.75 · f(n1 - 1, n2) - 0.5 · f(n1 - 1, n2 - 1).

Notice that in all these examples the coefficients sum to 1: if all the pixels are between 0 and 255 and the coefficients sum to 1, the predicted pixels stay in the same range. We don't want to increase or decrease the energy of the signal. And finally, the fourth one is a little more complicated; it doesn't necessarily do better, but I'll write it down anyway.

Prediction model 4: f′(n1, n2) = 0.97 · f(n1, n2 - 1) if Δh ≤ Δv, and 0.97 · f(n1 - 1, n2) otherwise, where I'll define Δh and Δv in just a second: Δh is f(n1 - 1, n2) - f(n1 - 1, n2 - 1),
and Δv is f(n1, n2 - 1) - f(n1 - 1, n2 - 1). So basically, Δh and Δv denote the horizontal and vertical gradients at the point (n1, n2): Δh is the horizontal gradient and Δv is the vertical gradient. If the horizontal gradient is smaller than the vertical gradient, then for this pixel (n1, n2) you copy the one from below — well, 0.97 times it, almost the same thing — and in the other case you copy from the left. So here are some prediction schemes that people have been playing with. Let me show you the resulting images: let me log in here, fire up Internet Explorer, go to my web page, chapter 8, and we want figure 8.23. Okay, if you switch back to the computer: this is the favorite picture Gonzalez uses for all these examples; I don't know what he calls it. It's a 512 by 512 monochromatic image. If you go to this page now, what you see is the prediction error you obtain using each of these four predictors. And ideally, what do you want your prediction error to look like? Very low energy, white noise. You don't want to be able to discern any of the features of the original image in there, because if you do, it means you didn't decorrelate the pixels well enough; there's correlation between the pixels left to be exploited. On the upper left is scheme number one, which, if you switch back to the paper, is just the one-tap predictor: 0.97 times the previously encoded pixel below, and that's the prediction. And you can see, switching back to the camera, it does terribly. Next is the two-tap filter, one half f(n1, n2 - 1) plus one half f(n1 - 1, n2); it's a little better, but you can still discern a lot of the features. This one is the three-tap filter, and you can see there's considerably less energy than the other two; the variance of this signal is smaller. And this is the one with Δh and Δv, the horizontal and vertical gradients. I thought Gonzalez put that example in because it would be much better than the others, but in fact it's not. If I give you the standard deviation of the prediction error — the standard deviation, not the variance — the first image is 4.9, very high; the second is 3.7, lower; the third is 3.3, lower still; and the fourth is 4.1. So the fourth one, with Δh and Δv looking at the horizontal and vertical gradients, didn't do better than the three-tap filter; the one with the smallest energy is the three-tap filter. Any questions on this? Yeah — actually, I was thinking about that as I was writing it; you could just use one of them, and I don't think it makes any difference.
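Here is a small sketch of those four predictors (my own implementation; it runs open loop on the original pixels rather than on reconstructed ones, which is enough to reproduce the kind of prediction-error images and standard deviations being compared, and the gradient comparison uses magnitudes, which is an assumption):

```python
import numpy as np

def prediction_error(img, model=3):
    """Open-loop prediction error for the four predictors discussed above.

    model 1: 0.97*f(n1, n2-1)
    model 2: 0.5*f(n1, n2-1) + 0.5*f(n1-1, n2)
    model 3: 0.75*f(n1, n2-1) + 0.75*f(n1-1, n2) - 0.5*f(n1-1, n2-1)
    model 4: 0.97*f(n1, n2-1) if dh <= dv else 0.97*f(n1-1, n2)
    """
    f = img.astype(float)
    left = np.roll(f, 1, axis=1)                        # f(n1, n2-1)
    up = np.roll(f, 1, axis=0)                          # f(n1-1, n2)
    diag = np.roll(np.roll(f, 1, axis=0), 1, axis=1)    # f(n1-1, n2-1)

    if model == 1:
        pred = 0.97 * left
    elif model == 2:
        pred = 0.5 * left + 0.5 * up
    elif model == 3:
        pred = 0.75 * left + 0.75 * up - 0.5 * diag
    else:
        dh = np.abs(up - diag)    # horizontal gradient (magnitude; an assumption)
        dv = np.abs(left - diag)  # vertical gradient (magnitude; an assumption)
        pred = np.where(dh <= dv, 0.97 * left, 0.97 * up)

    e = f - pred
    e = e[1:, 1:]                 # drop the wrapped-around first row and column
    return e, e.std()
```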
So let's go one step further and combine this. Assume for a second that you now have this prediction error; how do you end up quantizing it? So: quantize, and then allocate bits. The prediction error is not the signal you send as-is. Every pixel in these prediction-error images is still an integer; in fact, you're subtracting two quantities that are each between 0 and 255, so the dynamic range has roughly doubled: the error can be anywhere from about -255 all the way up to +255. So it's quite large, and we still have quantization to do. What are some approaches for quantization and coding? I realize I could have covered this last time, but we'll cover it now. You can use non-uniform quantizers, and the optimal one is Lloyd-Max, which we talked about to some extent two lectures ago. What does Lloyd-Max do? It says: I have a random variable with a given PDF, probability density function, and I want to quantize it to N levels; find me the quantization bin boundaries and the reconstruction level for each bin so that I minimize the mean square error. Let me just remind you of Lloyd-Max one more time. It's the optimum non-uniform quantizer. Basically: assume a PDF for your random variable and say we want to quantize to N levels; then what are the decision boundaries and what are the reconstruction levels? You all remember that from last time. So this is the minimum, this is the maximum, and this is some probability density function. You put it into a numerical algorithm and it says: these are the decision boundaries, d1, d2, d3; for this quantization bin the optimum reconstruction level is here, for this one it's here — I'm exaggerating the picture — for this one it's here, and for this one it's all the way at the end. So think of the Lloyd-Max quantizer as a nonlinear box that takes the probability density function and N as inputs and produces the d1, d2, d3, the decision boundaries, and what I'll call r1, r2, r3, r4, the reconstruction levels.
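As a concrete illustration, here is a minimal Lloyd-Max iteration in Python (my own sketch; a sample-based formulation stands in for the integral form with an analytic PDF). It alternates the two optimality conditions: decision boundaries at the midpoints of neighboring reconstruction levels, and each reconstruction level at the centroid of its bin.

```python
import numpy as np

def lloyd_max(samples, n_levels=4, iters=50):
    """Sample-based Lloyd-Max quantizer design (minimum mean-square error)."""
    x = np.sort(samples.astype(float))
    # initialize the reconstruction levels on quantiles of the data
    r = np.quantile(x, (np.arange(n_levels) + 0.5) / n_levels)
    for _ in range(iters):
        d = 0.5 * (r[:-1] + r[1:])           # boundaries = midpoints of levels
        idx = np.searchsorted(d, x)          # assign each sample to a bin
        r = np.array([x[idx == k].mean() if np.any(idx == k) else r[k]
                      for k in range(n_levels)])  # levels = bin centroids
    return d, r

# example: design a 4-level quantizer for Laplacian-looking prediction errors
errors = np.random.laplace(scale=5.0, size=100_000)
boundaries, levels = lloyd_max(errors, n_levels=4)
```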
So now, to apply that, we need to know the probability distribution of these prediction errors to begin with. How do you find it? You take a few images — images have tons of pixels — you plot the histogram of the prediction error, and that gives you an idea of what the distribution is. And what is the distribution of prediction-error pixels? I don't expect you to know it, so I'll just tell you: it's Laplacian. So, assuming the PDF of the prediction error is Laplacian, I can quantize it to two levels, four levels, eight levels, and so on. Jae Lim's book has a table — I don't remember if we showed it or not — that tells you, for a given kind of PDF, for example Gaussian, Laplacian, or uniform, each normalized to unit variance (these distributions all have parameters; you have to specify the variance), the solution of the Lloyd-Max quantization problem: the optimal d's and r's. So let's look at Table 10.1 in Jae Lim's book; zoom in, please. Great. So: uniform, Gaussian, and Laplacian. With one bit you have two levels, with two bits four levels, with three bits eight, with four bits sixteen; that's how the table is organized. For the uniform case — and here we're assuming the density is between minus one and plus one — the reconstruction levels make sense: minus one half and plus one half, with decision boundaries at minus one, zero, and one. If you go to more levels, these are the corresponding reconstruction levels. Did we go through this table two lectures ago? Okay. Anyway, if you do all of that, you can go back and see the DPCM results using one, two, and three bits per pixel on this image, which he doesn't give a name. I don't know how well it comes across on the TV screen, but the one-bit image is quite blurry, the edges are not really sharp, and at two and three bits it gets better and better. Now let me explain what these other pictures are, and to do that, let's come back to the paper. What if the probability distribution of the random variable you're dealing with does not have unit variance? The tables I just showed you for the Laplacian and the Gaussian were constructed for unit variance. What if the variance is five or ten or twenty? It's very simple: you just scale these quantities, the decision boundaries and the reconstruction levels, by the corresponding factor. So it turns out you can make DPCM adaptive by, for example, doing the following: look at each 16 by 16 block and compute its variance. If the variance is one, good, just use the unit-variance Laplacian table. But more generally, you can classify the variance of these blocks into, say, four classes, so essentially you have four scaled quantizers, where the scale factors in this case are 0.5, 1, 1.75, and 2.5. Ideally, if you wanted true Lloyd-Max quantization, you would compute the variance of each block and scale the ideal Laplacian quantizer by exactly that value, but then —
— if you do that, you have to send the variance value of each block across, and that's expensive; that's full adaptation. So, in order not to spend a lot of bits specifying the variance, you compromise: you assume the variance falls into one of these four groups, scale your unit-variance Laplacian quantizer by one of the four values, and classify each of your 16 by 16 blocks into one of the four classes. Now you're doing what's called adaptive quantization DPCM, and as you can see, the quality of the picture is quite a bit improved. And if you can't see that, the error pictures show it even better. So let me go back here. This is — let's just call it the lady image; I don't know what to call this woman; it's Queen Elizabeth's birthday today, but anyway — this is one bit per pixel, this is two bits per pixel, and this is three bits per pixel with adaptive quantization. Now I have to specify two additional bits for each block, to say which of the four variance classes it falls into. So it's two bits divided by 16 pixels, which is one eighth, or 0.125 bits per pixel of extra overhead; so these are really 1.125, 2.125, and 3.125 bits per pixel. Better yet, look at the error after you've quantized and encoded: this is the error you get with one bit per pixel and no adaptation, this is 1.125 with adaptation, this is 2 versus 2.125, and 3 versus 3.125. The right column is all with adaptation, the left column is without: one, two, three bits versus 1.125, 2.125, 3.125. As you can see, as you go down, the error gets reduced. And by the way, this is how you present results in compression papers. The same problem we have showing things on the screen here happens when you publish: you can submit your paper electronically, but by the time it gets onto hard copy, all the things you observed in real life in your lab just evaporate. So a better thing to show is these error signals, which pretty much tell the whole story, and if you're still not satisfied, you look at the actual mean square error values. For the different prediction equations, this is the kind of table you get: if you use the simple prediction equation, these are the mean square error values for the two-level, four-level, and eight-level Lloyd-Max quantizers, and you can see that when you move to adaptive quantization, the mean square error drops dramatically, by a factor of four or so. Again, as you go down the non-adaptive column the error keeps decreasing, except for the Δh/Δv predictor, which never did anything useful — I don't even know why he includes that example — and the same is more or less true in the other columns. So as you go down, the prediction error goes down, and as you go from non-adaptive to adaptive, it also goes down. These results are in line with our intuition about how DPCM should work.
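A minimal sketch of that adaptive scheme (my own illustration; the lecture's scale factors 0.5, 1, 1.75, 2.5 are used as given, while the class-selection rule and the base quantizer are placeholder choices — `base_d`, `base_r` could come from the unit-variance Laplacian table or from the `lloyd_max` sketch above):

```python
import numpy as np

SCALES = np.array([0.5, 1.0, 1.75, 2.5])   # per-class scale factors from the lecture

def classify_and_quantize(err_block, base_d, base_r):
    """Pick the scaled quantizer whose scale best matches the block's std,
    quantize the block with it, and return the 2-bit class index as side info."""
    sigma = err_block.std()
    cls = int(np.argmin(np.abs(SCALES - sigma)))        # 2-bit class index (side info)
    d, r = base_d * SCALES[cls], base_r * SCALES[cls]   # scaled unit-variance quantizer
    idx = np.searchsorted(d, err_block.ravel())         # bin index per pixel
    recon = r[idx].reshape(err_block.shape)             # dequantized block
    return cls, idx.reshape(err_block.shape), recon
```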
So let me just write it down: two bits of overhead per block to do the classification, so two bits over 16 by 16 pixels, 0.125 bits per pixel of overhead. Okay, any questions on this scheme? Okay, so I'm going to continue now; we're done with waveform coding. It's not really used much anymore — it's the simplest of techniques, one of the earliest image compression techniques designed — but you use it in situations where you don't have much compute power. The next thing on our list, after waveform coding, is transform coding. How many of you can tell me why we do transform coding, or what transform coding is? Anybody? Any transforms you've seen in your life? It can be in the frequency domain — DCT, DFT — yeah, okay, excellent. Before that, a question: you said the overhead is two over 16; shouldn't it be two over 16 squared? I wrote down 16 squared but said two over 16 — ah, good point: it's not 16 by 16 blocks, it's 4 by 4 blocks; sorry, thank you. So it's two bits over 4 by 4, which is 16 pixels, giving 0.125. Okay. So how does transform coding work? You start with your signal and take its transform; then you quantize the transform coefficients — the transform could be DCT, DFT, et cetera — and then you do codeword assignment, the bit allocation. After quantization, which could be uniform or non-uniform, you have to assign bits to each of the quantization bins; this bit allocation could be Huffman, arithmetic coding, anything you like. At the other end you receive this and undo the steps: inverse entropy coding — bit allocation is also called entropy coding; for example, if you used a Huffman code you have to undo it — then inverse quantization, then the inverse transform, and out comes f̂. If you didn't do any quantization, this would be a lossless scheme, but because you quantize, you're introducing errors.
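Here is a skeletal version of that pipeline in Python (my own sketch: forward transform, quantize, dequantize, inverse transform; the entropy-coding stage is omitted, and the 8×8 DCT block size and uniform quantizer step are placeholder choices):

```python
import numpy as np
from scipy.fft import dctn, idctn

def transform_code_block(block, step=16.0):
    """Toy transform coder for one 8x8 block: DCT -> uniform quantize -> inverse.

    In a real coder the integer indices `q` would then be entropy coded
    (Huffman, arithmetic, ...); here we just reconstruct from them.
    """
    coeffs = dctn(block.astype(float), norm='ortho')   # forward 2-D DCT
    q = np.round(coeffs / step)                        # quantization indices (transmitted)
    rec_coeffs = q * step                              # inverse quantization
    return idctn(rec_coeffs, norm='ortho')             # reconstructed block

block = np.random.randint(0, 256, (8, 8))
reconstructed = transform_code_block(block)
```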
So my first question is: why do we do transform coding? What's the number one motivation? Waveform coding looked pretty nice, right? What do we hope to gain? Because some transforms compact energy — exactly; a lot of coefficients are close to zero, exactly. The transforms do two things we're interested in. Number one, if you take an 8 by 8 block and take its transform, you get 64 numbers, and a large number of those 64 coefficients are very close to zero and can be discarded. We already saw that at the beginning of the semester, when we took the DFT of images, threw away, I don't know, 99 percent of the coefficients, took the inverse transform, and boom, it still looked like the original. This is called the compaction property: the transform compacts the energy of the spatial-domain signal into very few coefficients. So ideally you're interested in finding transforms with very good compaction properties. And number two — of the coefficients that didn't become zero, what can we say about them? They are largely uncorrelated. The problem with the darn pixels is that they're very correlated, and we had to find those a coefficients to decorrelate them; sometimes we do well, sometimes badly — it's almost hit and miss. But if you have a good transform, statistically the coefficients of the transform are uncorrelated with each other. So let me write this down. Why transforms? Two things: number one, they achieve energy compaction into a small number of coefficients; and number two, they reduce, or with a fantastic transform eliminate, the correlation between transform coefficients. Knowing the fifth transform coefficient doesn't tell me anything about the sixth one, so there's nothing left to exploit statistically; in the transform domain I don't have to do prediction anymore, I can toss that out, which is a big headache — you saw all the different predictors. So: no need for prediction. Now, what are some constraints we have to consider in choosing a transform? First, it has to be invertible: if you use a transform you can't invert, you send the coefficients across and the decoder can't reproduce the original image. So the transform must be invertible. Second, it has to be computationally tractable; it shouldn't be so compute-intensive that it kills us. This was an issue in the 60s, when there wasn't much compute power; it disappeared in the 80s, because by then we had fancy computers; and it's reappearing in the 2000s, because now you need to do it on a low-power cell phone or portable device without burning a hole in your pocket. So before I show you the basis functions for some of the popular transforms, I'd like to talk about the optimal transform, which is the KLT. How many of you have heard of it or seen it in your other classes? Is it in 225A? Did Michael Gastpar cover it? Okay, so you know it from some other class, or just heard of it. So the optimal transform, from a statistical point of view, is the KLT, the Karhunen-Loève transform — oh boy, now I have to spell that.
Are there any Finnish people in the class? I think Sergei comes closest, but you're not Finnish, right? Actually, I had some Finnish classmates when I was studying in the UK, and the biggest insult you could give them was to call them Russians; they'd get so mad — "you're just an extension of Russia, a colony of Russia" — they'd get furious. They all spoke Russian, though; this was between '78 and '80, before the Berlin Wall fell, while the Cold War was still going on — you probably weren't even born then — and Finland is right next to Russia, and they taught Russian as a second language. Anyway: the Karhunen-Loève transform. Karhunen, as you can see, is a Finnish name. How does this thing work? First, let me tell you what properties it achieves. Number one, it's optimal in the sense that the coefficients are completely uncorrelated, and that's exactly what we wanted. Number two, on average, in a statistical sense, the first m coefficients, for any m, have more energy than the first m coefficients of any other transform. In other words, you start with a bunch of pixels, pass them through the KLT, and out come a bunch of coefficients: if you have a thousand pixels, you get a thousand coefficients, rank-ordered, c1, c2, all the way to c1000. If I draw the line anywhere — if I pick the first three coefficients — they have more energy than the three highest-magnitude coefficients of any other transform; if I pick the first five, same thing. Note that these are all energy-preserving transforms: the sum over (n1, n2) of f(n1, n2) squared — this is very important — equals the sum over (k1, k2) of F(k1, k2) squared; the energy is preserved. So if I add up c1 squared plus c2 squared all the way to c1000 squared, for any of the transforms — DCT, KLT, DWT, DHT, anything — the sum of squares is the same number, because they all equal the energy of the signal. But the beautiful thing about the KLT is that if you pick the top three coefficients, they contain a higher percentage of the energy than the top three of the DCT or DFT or any other transform; and for any m, the top m coefficients contain the largest percentage of the energy compared to any other transform. And, like many other things in signal processing and information theory, when you're first presented with this you say: oh, I'm in seventh heaven — where is it, let's use it; let me see what it is.
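To put numbers on that energy-compaction claim, here is a small sketch (my own illustration, using the DCT as the transform since we have not built a KLT yet): take the 2-D DCT of a block, keep only the largest-magnitude coefficients, and measure how much of the energy they carry and how well they reconstruct the block.

```python
import numpy as np
from scipy.fft import dctn, idctn

def compaction_demo(block, keep=8):
    """Keep only the `keep` largest-magnitude DCT coefficients of a block."""
    c = dctn(block.astype(float), norm='ortho')
    thresh = np.sort(np.abs(c).ravel())[-keep]     # magnitude of the keep-th largest
    c_kept = np.where(np.abs(c) >= thresh, c, 0)   # zero out everything smaller
    energy_fraction = (c_kept ** 2).sum() / (c ** 2).sum()
    recon = idctn(c_kept, norm='ortho')
    mse = ((recon - block) ** 2).mean()
    return energy_fraction, mse
```

Because the transform is orthonormal, the total coefficient energy equals the pixel energy (Parseval), so `energy_fraction` directly measures compaction.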
Unfortunately, we have to pour cold water on it: just like I told you the brick-wall filter doesn't really exist, this thing doesn't really work very well either. Why? I'll tell you why as soon as I tell you what it is. So how do we build the KLT? Assume f(n1, n2) belongs to a stationary random process — and you can already see the beginnings of the trouble: as soon as you assume a stationary random process, you have to assume a correlation function. Then take the covariance function, defined as

K_f(n1, n2; l1, l2) = E[ (f(n1, n2) - f̄(n1, n2)) · (f(l1, l2) - f̄(l1, l2)) ].

This is the covariance function of the random process. Then you have to come up with the basis functions for the transform. Assuming all of this, the basis functions, which I'll write as A(n1, n2; k1, k2), satisfy the following eigen-equation — and I won't have time to show why this is true, and everything looks complicated because there are two dimensions everywhere, but here it is:

λ(k1, k2) · A(n1, n2; k1, k2) = Σ_{l1=0}^{N1-1} Σ_{l2=0}^{N2-1} K_f(n1, n2; l1, l2) · A(l1, l2; k1, k2),

where λ(k1, k2) is the eigenvalue. So A is the eigenfunction — or, in the discrete case, the eigenvector — of this covariance function K_f; it satisfies an eigen-equation of this nature. Have you seen something like this in 225A, with whitening filters and things like that? Anybody? No? Anyway. So why do I care about the basis functions? Because the basis functions are what I use to compute my KLT coefficients. Let's wrap up the whole discussion: the coefficients F(k1, k2) are just the projection of the signal onto these basis functions,

F(k1, k2) = Σ_{n1} Σ_{n2} f(n1, n2) · A(n1, n2; k1, k2).

So what does all of this amount to? You start with the signal f; you assume it belongs to some stationary random process with this kind of covariance function, which you have to compute, know, or estimate somehow; once you have that, you solve the eigen-equation to find the basis functions, the A's; and once you know the A's, you project your signal f onto them to get the KLT coefficients.
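In practice this recipe is usually carried out on small blocks, with the covariance estimated empirically. Here is a minimal sketch of that (my own illustration: estimate the covariance from the 8×8 blocks of one image and take the KLT basis as its eigenvectors, sorted by decreasing eigenvalue):

```python
import numpy as np

def klt_basis_from_image(img, b=8):
    """Estimate a block-KLT basis: eigenvectors of the empirical block covariance."""
    h, w = img.shape
    # collect all non-overlapping b-by-b blocks as length b*b vectors
    blocks = np.array([img[i:i + b, j:j + b].ravel()
                       for i in range(0, h - b + 1, b)
                       for j in range(0, w - b + 1, b)], dtype=float)
    blocks -= blocks.mean(axis=0)                 # remove the mean
    K = blocks.T @ blocks / len(blocks)           # empirical covariance (b*b x b*b)
    eigvals, eigvecs = np.linalg.eigh(K)          # symmetric eigendecomposition
    order = np.argsort(eigvals)[::-1]             # sort by decreasing energy
    return eigvecs[:, order], eigvals[order]

def klt_forward(block, basis):
    """KLT coefficients = projection of the block onto the basis vectors."""
    return basis.T @ block.ravel()
```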
Okay, now that we've described it, why is this process full of bad things? Well, images aren't stationary. Even if you overlook that, computing this K_f is hell. And even if you say, fine, I'll do it anyway — in fact Roz Picard, who was my classmate at MIT and is now a professor there, spent an entire summer job at Bell Labs evaluating K_f, this kernel function, for every image, to come up with image-adaptive basis functions, so that for each image the coefficients would be completely uncorrelated. But what's the problem with that approach? You have to send the A's, the basis functions, to the receiver, and that already kills the entire compression scheme. So, okay, she spent the summer proving that. Then you can think about it and say, okay, that was a bad approach; instead, let's take a whole class of images, compute K_f for all of them at once, and come up with some sort of universal basis functions that work for a lot of images, and that's what we use. To some extent, that's what the DCT is all about; let me explain what I mean by that. So, in practice, the KLT is just hell — actually, I shouldn't write "hell" on paper; let's say "difficult". This stuff becomes a permanent record, it goes on the web pages, and it's actually being used in many courses in China — it's true, they came and asked for permission. So: suppose I assume my image has a Markov model. What do I mean by that? It means the correlation function is

E[ f(n1, n2) · f(n1 - i, n2 - j) ] = σ² · ρ_v^i · ρ_h^j,

where ρ_h is the horizontal correlation coefficient and ρ_v is the vertical correlation coefficient. If you assume something like that — and let's further assume these two are equal to each other and just call them ρ — then you can show that as ρ tends to 1, the KLT approaches the DCT, which is fantastic. And that brings us to the next page; if you can roll up, please. What is the DCT? There are four types of DCT, but type 2, which is the one we're most interested in, is given by the following equation. It's the discrete cosine transform, and the DCT coefficients are

S(k1, k2) = sqrt(4 / N²) · C(k1) · C(k2) · Σ_{n1=0}^{N-1} Σ_{n2=0}^{N-1} s(n1, n2) · cos( π (2 n1 + 1) k1 / (2N) ) · cos( π (2 n2 + 1) k2 / (2N) )

(and by the way, my posted notes have a typo here: the 2 belongs inside the parenthesis, multiplying n1 and n2). I apologize for switching to s for the signal now, but that's okay. This is the forward DCT. You might ask what these C's are: C(k) is defined as 1 over the square root of 2 if k = 0, and 1 otherwise. That's how you take the type-2 DCT. In Wednesday's lecture I'm going to explain how the DCT is related to the DFT, but for now what I'd like to do is show you the basis functions — these cosine terms — pictorially, so you can see how they look. So here we go: figure 8.30. This is the discrete cosine basis for N = 4, so you have 16 basis functions, as you'd expect for 4 by 4 blocks. This one is the DC term, this one is the highest frequency, this one has vertical bars, this one has horizontal bars; these are the 16 DCT basis functions you end up with.
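Here is the type-2 DCT formula above written out directly in Python (a straightforward, unoptimized sketch; in practice you would use a fast library routine such as scipy.fft.dctn, which computes the same thing under the orthonormal convention):

```python
import numpy as np
from scipy.fft import dctn

def dct2_type2(s):
    """Direct 2-D type-2 DCT as written on the board (N-by-N input)."""
    N = s.shape[0]
    n = np.arange(N)
    k = np.arange(N)
    # 1-D cosine basis matrix: B[k, n] = cos(pi * (2n+1) * k / (2N))
    B = np.cos(np.pi * np.outer(k, 2 * n + 1) / (2 * N))
    C = np.where(k == 0, 1 / np.sqrt(2), 1.0)       # C(0) = 1/sqrt(2), else 1
    return np.sqrt(4 / N**2) * np.outer(C, C) * (B @ s.astype(float) @ B.T)

# sanity check against the library's orthonormal DCT-II
s = np.random.rand(8, 8)
assert np.allclose(dct2_type2(s), dctn(s, norm='ortho'))
```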
Anyway, there are four kinds of DCTs, and I'll get into the details a little more next time. Just for comparison, Gonzalez also shows the Walsh-Hadamard transform; I'll write down the expression for it and everything next time. But let me just go to this picture and show you what happens if you approximate that woman picture with the Fourier transform up here, the Hadamard transform here, and the cosine transform here. Why are you laughing — they all look the same, right? Let me see if I can find something better; okay, this one is much better. So, as a matter of practice — you know what, I think I'm going to stop here, because once I start on that, there's too much and I'll never be able to finish the argument. Let's pick up the DCT discussion next time: we'll talk about block sizes, energy compaction, how it compares to Walsh-Hadamard, and threshold coding. There's too much to start now, so let's wrap up today. Okay, yeah, sure — and office hours, by the way, are out of business this week, so don't look for them.