Archive.fm

The Bold Blueprint Podcast

The Bold Blueprint Avideh Zakhor Believe in Your Own Power

The journey to unlocking your potential begins with a belief in yourself.

Broadcast on:
09 Oct 2024
Audio Format:
other

Let me first start by taking care of some logistics. Okay, so the presentations are on Friday from 9 to 11 in the Wang Room, which is on the fifth floor of Cory Hall, in front of the elevator. I received, I think, email from a lot of you. The only one about whom I have a question is Howard. Are you doing a project or a literature review? Okay, so a project. So then, based on that, the total number of presentations is 10, and let me write down who those are. I won't write, I'll just read it. So I got Jimmy Wang, is he here? No, but he mentioned that it's a project. Robert Feld, that's you, right? Okay. Casey is not doing anything; he is attending a conference. Howard, Chris, Chris Anderson, Alain, Prasanth Jivon together with Patrick Gork together with David Lin, and then Sergei together with Min together with Mary, and then Vijay together with Galen together with, oh, Eric Battenberg.
I don't have his name down here. So together with Eric Battenberg. So those are the groups that I have, or the individuals. There are four names that I don't have any information on. Two of them are from nuclear engineering, two of them are from mechanical engineering. Specifically, I have Shima and Jiang from nuclear engineering. Is he here? Mingshan Sun from nuclear engineering. That's you. Are you taking the course for credit? And are you doing a project or a literature review? Did you submit the proposal? Okay, and what's your project on? Okay, so I didn't read your name, so you will be giving a presentation. Okay, so with Howard we had 10, and with Mingshan that's 11. Okay, and then Suon Fan, mechanical engineering, and then Xi Xing Zheng, mechanical engineering. So I assume those guys are not in. Do you guys want to come up with a tentative order in which you give your talks? Any chance? Okay, so roughly, we will do the ordering in real time in the class from 9 to 11. But roughly speaking I have, let's see, yeah, I don't want to spend time doing that. I'm trying to count the number. So Jimmy Wang, Robert Hill, Howard, Chris, Allen, Prasanth and the team, Mingshan, Khatrya and the team, Sergey and the team, and Galen Reeves and the team, and Samuel. Did I miss anyone? So I think it's 11. Okay, so we have 11 presentations, we have two hours, and we have three multi-person projects. So I was just going to give a little bit more time to the multi-person presentations than to the other ones. Two hours is 120 minutes, so 120 minutes divided by 11 projects. So yeah, we'll give a tiny bit more to the multi-person projects. Let's stick to 10 minutes per project presentation. Okay, so that will take us to 110 minutes, and then we have 10 minutes for question and answer, for slipping, and various other things. Okay, question.
Yeah, yeah, but if you're showing video then that's more problematic. It's better to bring your laptop. I don't think many people are showing videos, are they? Okay, any other questions or comments? Okay, very good. And actually, another important thing: I want you to bring along a hard copy of your presentation, or make 15 copies of it, but it doesn't have to be one slide per page, that's pretty wasteful. You can do four per page or two per page, something like that. Actually, is that overly imposing on you guys? Is that difficult? Just tell me if it is. I won't hold it against you if you tell me, "That's a pain in the neck, don't ask us to do that." Okay, no, I want one copy at least for myself. Let's just do that. That's a good intermediate solution. Bring along one hard copy of your presentation, because I want to make comments and notes on it and pass it along to you and keep it for myself. So bring one copy of your PowerPoint, and then later on you can email it to me. One hard copy of your PowerPoint for the instructor. Okay, and then obviously the papers or the reports are due on May 15th. Okay, so in that one hard copy, do make it one slide per page so that I can write comments as you talk, so that at the end of your talk I can say, okay, here, you know, comments, feedback, questions, whatever. Okay, any other questions or comments?

All right, so let me backtrack as to where we were last time. We talked about JPEG, and we talked about the discrete cosine transform and ways that one could quantize and use the DCT together with Huffman codes and various other things to do still image compression. What I'm going to talk about today is multi-resolution coding, subband coding, and wavelets. All of those are interrelated topics, and I will begin by talking about what's called pyramid coding, which was introduced by Burt and Adelson in the 80s.
Then I will move on to talk about subband coding, and then from subband coding I'll proceed to talk about wavelets. So wavelets have been around since the late 80s, and they've been connected and related to different disciplines by various people. Indeed, in the signal processing community people had been doing subband analysis and subband coding for a long time, so when wavelets were introduced the reaction of that community was, "Oh wow, we've been doing that for a long time," and the reaction of the computer vision community was, "We've been doing multi-resolution analysis for a long time." But basically what happened was Stéphane Mallat put the concept of wavelets in a rigorous mathematical framework. And then I'll talk a little bit about quantization and coding of wavelet-decomposed images, because I think that's just as important as the decomposition. Remember, the three basic questions that we deal with in compression are what to
code, how to quantize, and how to do bit allocation. So for the problem of what to code, we've talked about how we could code the pixels in the spatial domain in a predictive way or differential way or whatever, or you could code the DCT coefficients, or, now today we're talking about the third technique, which is wavelets, I can code subband coefficients or wavelet coefficients, etc. And for the DCT coefficients we've already talked about quantization and bit allocation. We've decided we're going to be doing uniform quantization, but using perceptual cues and human visual characteristics we might want to apply different quantization steps to different coefficients of the DCT. And then for bit allocation for those we've decided to use Huffman coding. So that is a combination of those three things: DCT coefficients, uniform quantization but with a different step size for different coefficients, and Huffman coding. That's a bundle that defines the JPEG baseline system. And today we're going to talk about wavelets, quantization of wavelet coefficients, and we will see in a little bit that one of the most popular techniques for doing bit allocation for wavelet coefficients is arithmetic coding. So these are all kind of interrelated, and a combination of these three things makes up a compression package.

Just a few words, if there's a little bit of time, about video compression. Basically in video what you try to do is combine some of these techniques. Essentially you remove the redundancy between successive frames by trying to predict a frame from the previous frame using a series of motion vectors, so from that point of view it's a predictive technique. What to code is the error of that prediction between successive frames, and as for how to code it, people have applied DCTs and wavelets to that the same way as they apply DCT and wavelets to regular still images. Okay, so if there's time I'll talk a little bit about one of the fundamental ideas
behind video compression.

So let me start talking about multi-resolution coding, and in particular pyramid coding. I'm going to use pictures from both Jae Lim and also Gonzalez and Woods. Okay, so let me number the pages really quickly, sorry. Okay, so multi-resolution, subband, pyramid, wavelet coding, all those in one shot. I'm going to start with the notion of pyramid coding, and before I even do that, since we've already to some extent talked about multi-resolution, let me just explain what the word multi-resolution means. We talked about it a little bit last time. It essentially means looking at the same picture at different resolutions, a different number of pixels. It could be different spatial resolution or it could be different temporal resolution. So if you're talking about video, it refers to, for example, having different frame rates; that generates a multi-resolution temporal video sequence. If you're talking about still images, it just refers to the number of pixels. If we can switch to the screen on the computer for just a second, this just reminds you what we mean by multi-resolution. This picture has been coded at three different spatial resolutions, so this is a multi-resolution kind of image. As I said earlier, this notion of multi-resolution was introduced by Burt and Adelson, and it was in 1986 or so. Interestingly enough, when their paper got reviewed, if you can switch to the screen one more time, this is the comment that the reviewers made about their paper. They said, "This manuscript is okay if compared to some of the weaker papers. However, I doubt that anyone will ever use this algorithm again." So this was in 1982, and by far that whole concept of multi-resolution is one of the most powerful techniques today, in 2002. So don't get discouraged if at times you submit a paper to a conference and you get discouraging comments like that. Okay, so the
reviewers of the 1982 paper for these guys were very negative. Nevertheless, these guys pushed forward, and their main motivation was that they were trying to compute the motion correspondence between successive frames of video. By the way, Adelson is now at MIT, and I don't really know where Burt is. Adelson is in the psychology department at MIT, and I think the work might have been done at MIT, although I have a hunch it was done at Sarnoff Labs; maybe Adelson at that time was at Sarnoff Labs. But their main motivation was that they had successive frames of video, and for every little block in this picture they wanted to find out where it came from in the previous frame, and it was just taking too long to do that. So the idea was to come up with a multi-resolution representation of both images. You have an N by N version of each image, and instead of solving the correspondence between the two N by N pixel images, I now come up with N/2 by N/2, and then N/4 by N/4, and N/8 by N/8 versions of those pictures. So here I'm building a pyramid; that's why it's called the pyramid coding technique. If I build a pyramid in this manner, I solve the correspondence problem on top of the pyramid by finding the motion vectors between the N/8 by N/8 pictures, because it's much simpler, there are fewer pixels, it's faster, etc. And then when I get those roughly estimated motion vectors, I'll just keep refining them as I come down the pyramid. So their main motivation was a purely computational one. By the time I get to the bottom of the pyramid, I have found my motion vectors, but I've done it in a coarse-to-fine manner, and now I have an answer. It might not be the exact answer I would have gotten if I had done an exhaustive search between the N by N pictures, but it's
close enough. Okay, so the initial motivation for these guys was to compute motion estimation with reduced complexity. As I said, the basic idea was you have two N by N images and you build a pyramid, N/2 by N/2 and N/2 by N/2, let's say N/4 by N/4 and N/4 by N/4, and you do the matching between these guys first, and then you propagate the motion vectors down and refine the motion vectors as you go from here to here, etc. So the question is, how do you build this pyramid? Well, right off the bat, when you look at this picture, you would say, oh, what I have to do is probably just subsample the N by N image to get the N/2 by N/2 image, right? And that's probably correct, except, what's wrong with that approach? If I just subsample an audio signal, if I just subsample an image, what happens? You
get aliasing, right? So instead of doing just subsampling, what do I have to do before subsampling? Low-pass filter, to get rid of aliasing. So the question we're now going to address is how to build the pyramid, and this pyramid that these guys built is more or less the same pyramid as we build for pyramid coding, so it's kind of the same idea. The basic idea is really to do successive low-pass filtering and subsampling. What do I mean by that? Well, I start with this signal $f(n_1, n_2)$, I do low-pass filtering on it, and then I subsample. What goes in I call $f_i(n_1, n_2)$, and what comes out is $f_{i+1}(n_1, n_2)$. So as I keep subsampling this thing, I'm traversing up the pyramid, right? So in this picture, this is probably $f_0$, this is $f_1$, this is $f_2$: low-pass filter and subsample this one to get this one, low-pass filter and subsample this to get that one. So this shows the process of generating the $(i+1)$th level from the $i$th level. Okay, so how do I do the filtering? Well, you can use any kind of filter that your heart desires. $f_i^L(n_1, n_2)$, where the $L$ means the low-pass version, is just $f_i$, the image at the $i$th level, convolved with some filter $h(n_1, n_2)$. We generally want to use FIR filters because we are interested in linear phase, and it's easy to design FIR filters with linear phase. So that's it. And the subsampling is pretty straightforward: we say $f_{i+1}(n_1, n_2) = f_i^L(2n_1, 2n_2)$ for $n_1$ and $n_2$ in the range of the image, let me write down here, for $n_1$ and $n_2$ between 0 and 2 to the $m$ minus 1, where this $m$ is the $m$th level of the pyramid, and it's 0 otherwise. And what filter I use here, what $h$ I use, basically determines the kind of pyramid. So this $h$ guy, if I were to use my red pen here, determines the kind of pyramid. So in particular, I
can build what's called a Gaussian pyramid if I use the following separable filter: $h(n_1, n_2) = h(n_1)\,h(n_2)$, where $h(n)$ approximates some sort of Gaussian. Let's say it's equal to $a$ when $n = 0$, it's equal to $1/4$ when $n = \pm 1$, and it's equal to $1/4 - a/2$ for $n = \pm 2$, and usually people pick $a$ between like 0.3 and 0.6 or some number like that. Okay, so let me, just for the sake of completeness, show you an example of the set of images that we get if we do that, a set of Gaussian pictures in a Gaussian pyramid. That's Figure 10.36. So let me show two things. Can you zoom in, please? Okay, great. So this is $h(n)$ when $a$ is equal to 0.3; it's kind of a more flat-looking Gaussian. This is $h(n)$ for $a$ equal to 0.4, 0.5, and by the time you get to 0.6 it's kind of more narrow. And this shows the one-dimensional graphical representation of how we build this Gaussian pyramid. At the bottom you have $f_0(n_1, n_2)$. If you can zoom in now on this bottom picture, that would be great. So down here you have $f_0(n_1, n_2)$, the pixels, at this level you have $f_1$, and at this one you have $f_2$. So for the case that your $h(n)$ is four points long, well, actually it's five points, look at this picture, one, two, three, four, five. So these five points contribute to this point, and then these five points contribute to this point, etc. So you're building a pyramid by combining the pixels in that way. And if you start with, actually I don't know the name of this picture, it's not the Vegas, okay, some other lady picture, 512 by 512, and if $K$ is 4, that means you build four levels of your pyramid, then this is the Gaussian pyramid that you get. So this is $f_0$, $f_1$, $f_2$, $f_3$, and $f_4$. Okay, so let me just quickly write down here: show Figures 10.34 to 10.36 of Jae Lim.
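The filter-and-subsample recipe described above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the function names (`generating_kernel`, `lowpass`, `gaussian_pyramid`) are made up for this sketch, the image borders are handled by edge replication (the lecture does not say how borders are treated), and a random array stands in for a real image.

```python
import numpy as np

def generating_kernel(a=0.4):
    """5-tap filter from the lecture: h(0)=a, h(+-1)=1/4, h(+-2)=1/4 - a/2."""
    return np.array([0.25 - a / 2, 0.25, a, 0.25, 0.25 - a / 2])

def lowpass(img, h):
    """Separable low-pass filtering: filter every row, then every column, with h.
    Border handling (edge replication) is an assumption of this sketch."""
    pad = len(h) // 2
    padded = np.pad(img, pad, mode="edge")
    rows = np.apply_along_axis(lambda r: np.convolve(r, h, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda c: np.convolve(c, h, mode="valid"), 0, rows)

def gaussian_pyramid(f0, levels, a=0.4):
    """Return [f_0, f_1, ..., f_levels]; each level is the previous one,
    low-pass filtered and then subsampled by 2 in each direction."""
    h = generating_kernel(a)
    pyramid = [np.asarray(f0, dtype=float)]
    for _ in range(levels):
        smoothed = lowpass(pyramid[-1], h)
        pyramid.append(smoothed[::2, ::2])
    return pyramid

# Stand-in image (random values), not the picture shown in the lecture.
f0 = np.random.default_rng(0).uniform(0, 255, size=(64, 64))
pyramid = gaussian_pyramid(f0, levels=4)
print([level.shape for level in pyramid])
# [(64, 64), (32, 32), (16, 16), (8, 8), (4, 4)]
```

Note that the five taps sum to 1 for any value of $a$, so a flat region of the image keeps its brightness from one level to the next.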
Okay, and so now the question is, how do we apply this concept to pyramid coding? Okay, let me get a fresh sheet here. Great. So how do we do pyramid coding? The basic idea is essentially to code successive images down the pyramid, from top to bottom, from the ones above them. So in other words, let me come back to this picture here. You somehow code the picture on top of the pyramid, the N/4 by N/4 one, with either DCT or whatever your favorite technique is, and then once you've coded that, try to use the coded version of this top signal to code the next one, and then use this one in order to code this one. So it has kind of a domino effect: start from the top, code the top one, and then use successive levels to code the ones below. Okay, so how do we do this successive thing that I'm talking about? Well, you interpolate $f_{i+1}$, the $(i+1)$th level image, in order to come up with a prediction for $f_i$, then subtract that prediction of $f_i$ from the actual $f_i$, which generates what's called the prediction error, and then quantize and code that prediction error using whatever favorite technique you have, and send it to the receiver so that the receiver can use that error. The receiver does the same thing: essentially it receives that error and tries to build the $f_i$ level, and you just keep doing that until you hit the bottom.

So the basic idea is that you interpolate $f_{i+1}(n_1, n_2)$ in order to get a prediction for $f_i$. So this is the top, this is higher up than this one. Basically the prediction for $f_i$ is given by the interpolated version of $f_{i+1}(n_1, n_2)$, and you can use linear interpolation, non-linear interpolation, or any fancy scheme that you want. And then you code the prediction error. So $e_i(n_1, n_2)$ is just the actual signal at the $i$th level minus the prediction that you came up with. And this is really the story of our life in all compression schemes: making a prediction, and trying to make that prediction as good as possible so that the energy of this error signal, its variance, is small, so it can be coded with very few bits. And then you just repeat the above process until the bottom level image, that is, the original, is reconstructed. So in this scheme of things, the sequence $f_i$ is called the Gaussian pyramid, which I already showed you, and the sequence $e_i(n_1, n_2)$ is called the Laplacian pyramid. Okay, so just for the sake of completeness, let me show you a
picture from Figure 10.38, which is an example of a Laplacian pyramid, so that you know what these $e$ signals look like. Can you zoom in, please? Okay, so at the bottom, I know it's very, very small, you've somehow coded the top-level image using any kind of intraframe coding scheme. Then you interpolate that, subtract it from the next-level signal, which is this signal here, and you get this error, and then keep on going until the final error that you have is this. So you have to code this error signal, this error signal, this error signal, and this error signal. Is that clear to everyone, how it works? Yeah?

And so right off the bat, what's one of the difficulties or disadvantages of this pyramid coding technique? Anyone? How many pixels did we start off with in this Lena signal, or whatever we want to call it? It had this many, right? It was 512 by 512 originally. How many error values do we have to code now? Well, there's 512 by 512, 256, 128, and 64 by 64. So the total number of pixels that we have to code has increased. It's true that they have smaller variance, and therefore it takes fewer bits to code each of these pixels than the original, but the sheer fact that the total number of random variables we have to code has increased is a bad thing. Remember, one of the best advantages of transform coding is that it has a compaction property: you have 64 pixels, and by the time you transform them with the DCT you only really have to code five or six of them. That's awesome; you can just throw away 59 or 50 or whatever, a huge number. Here it's doing the reverse. And that's exactly why pyramid coding is not used today. It's a basis for what came after, which is subband coding and wavelets, which I'll talk about in just a second, but per se it's not a very good coding scheme. And just so that you can be aware of this, this is the example of coding that 512 by 512 image at
half a bit per pixel, which is very high. I mean, I showed you last time color pictures using JPEG 2000 that were like 0.1 bits per pixel. So to wrap up this discussion, I'd like to just show the flow diagram of this technique. You can zoom out just a little bit. Okay, can you read those? I guess a little bit; it's not too hard. I mean, these boxes are interpolations. So how do we build this Laplacian pyramid, how do we generate it? Well, we've built the Gaussian pyramid by filtering and subsampling, filter, subsample, filter, subsample, just to get to the end of it. Now at the end we're going to code this thing and send it somehow to the receiver. That's $f_K$, the $K$th level of the pyramid; that's the top of the pyramid. So you start from that $f_K$ signal and you interpolate it, right? This is the prediction for $f_{K-1}$, and then you add to it the error signal that you generated at the encoder, to make sure that the prediction together with the error signal now matches the $(K-1)$th level of the pyramid. And I repeat this process until I get to $f_2$ and then $f_1$, and then this error gets added and I get $f_0$. So what it takes is that these error signals need to be sent from the sender to the receiver so that they can both, in sync, successfully reconstruct this $f_0$ signal. Okay, I wish that Jae Lim's book had the encoder side, but it doesn't. But it's pretty straightforward. Okay, next, any questions on pyramid coding?

Okay, so let me show you one other example of pyramid coding here. Not here. Yeah, I'm sure, Figure 7.2. Oh, I'm afraid I have to download this. Let me just quickly do that. [silence] What's it called, Digital Image Processing?
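Going back to the pyramid coding flow diagram discussed a moment ago, the encoder and decoder loops can be sketched as follows. This is a rough illustration, not the scheme from the book: the interpolator is plain pixel replication, a uniform quantizer stands in for "quantize and code", border handling is edge replication, and all function names are hypothetical. The encoder predicts each level from the decoded level above it, so that sender and receiver stay in sync.

```python
import numpy as np

def lowpass(img, a=0.4):
    """Separable 5-tap low-pass filter with taps (1/4 - a/2, 1/4, a, 1/4, 1/4 - a/2)."""
    h = np.array([0.25 - a / 2, 0.25, a, 0.25, 0.25 - a / 2])
    padded = np.pad(img, 2, mode="edge")  # border handling is an assumption
    rows = np.apply_along_axis(lambda r: np.convolve(r, h, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda c: np.convolve(c, h, mode="valid"), 0, rows)

def interpolate(img):
    """Assumed interpolator: pixel replication to twice the size in each direction."""
    return np.repeat(np.repeat(img, 2, axis=0), 2, axis=1)

def quantize(e, step):
    """Uniform quantizer standing in for the 'quantize and code' box."""
    return step * np.round(e / step)

def encode(f0, levels, step):
    # Gaussian pyramid f_0 ... f_K: filter, subsample, filter, subsample, ...
    gauss = [np.asarray(f0, dtype=float)]
    for _ in range(levels):
        gauss.append(lowpass(gauss[-1])[::2, ::2])
    # Code the top level, then walk down the pyramid. Each level is predicted
    # from the decoded level above it, and only the prediction error is kept.
    top = quantize(gauss[-1], step)
    recon = top
    errors = []  # Laplacian pyramid, ordered e_{K-1}, ..., e_0
    for f_i in reversed(gauss[:-1]):
        prediction = interpolate(recon)
        e_i = quantize(f_i - prediction, step)
        errors.append(e_i)
        recon = prediction + e_i
    return top, errors

def decode(top, errors):
    # Receiver: interpolate, add the error, repeat until f_0 is rebuilt.
    recon = top
    for e_i in errors:
        recon = interpolate(recon) + e_i
    return recon

# Stand-in 512 by 512 image (random values), with K = 4.
rng = np.random.default_rng(0)
f0 = rng.uniform(0, 255, size=(512, 512))
step = 8.0
top, errors = encode(f0, levels=4, step=step)
f0_hat = decode(top, errors)

n_coded = top.size + sum(e.size for e in errors)
print(n_coded / f0.size)            # about 1.33: more values to code than pixels
print(np.max(np.abs(f0_hat - f0)))  # no larger than step / 2
```

The first printed number illustrates the expansion of the number of samples mentioned in the lecture: the error images plus the top level add up to roughly a third more values than the original image has pixels.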
[silence] What's the publisher? Prentice Hall. Let's just add one more. I wanted it to take me to the book site. Oh, I misspelled it. Oh, goddamn. Oh my God. You got it, thank you. So let's just quickly download the, um, faculty, and then instructor's manual, PowerPoint presentation, and we want chapter seven. I forgot to put it on my own website, so we'll just quickly download it here. The funny thing about Google is that even if you misspell something, it sometimes comes up with a thousand answers for the misspelled thing, and the only way to know is when you say, oh, I know that topic should have returned, you know, a hundred thousand answers, why is it only giving me five, and those five are all misspelled too. Okay, so if I go to Figures 7.1 and 7.2 I can show you. Okay, so this is an image, and this is how we build the pyramid; we already talked about that. And this is the pyramid representation of that original image, the vase image. Or how do you say it in American English, is it "vase" or "vahz"? Okay. So this is the Gaussian pyramid, and this is the Laplacian pyramid at the end.

Okay, so now let me switch gears and talk about subband coding. If you can zoom out just a little bit. So how does subband coding work? Sorry, can you shift a little? Thank you. Well, the idea again was introduced many years ago in the speech world, right? For example, where's this other PowerPoint? So for example, if I look up the speech literature, I think it's Esteban, that's right, Esteban and Galand and Croisier were the first guys. So it was initially introduced in speech coding, by, I think, these French guys, Croisier, Esteban, and Galand. It was in 1976, so nearly 30 years ago. And the basic idea is to start with your signal $x(n)$, pass it through two filters $h_0(n)$ and $h_1(n)$, and subsample both of those. Once again, you can see it's very similar to pyramid coding, but here $h_0$ is kind of a low-pass filter and $h_1$ is kind of a high-pass filter, so you're dividing into two bands, right? But you can see that because we're subsampling both of those by a factor of two, there's no expansion of the number of samples. One thing about pyramid coding that I forgot to write down, but we talked about it, is that there's an expansion of the number of samples one has to code. Here there's no expansion. So subband: no expansion in the number of samples. So then you quantize and code this and somehow send it across some sort of a channel, and then at the other end you upsample by a factor of two and then pass it through $g_0(n)$ and $g_1(n)$, another set of filters, and out comes a new signal, which we call $\hat{x}(n)$. And we're kind of hoping and praying that $\hat{x}$ comes very close to $x$. Okay, and you can roll up just a little bit. And intuitively, roughly speaking, well, you can already see how the system
would work if H0 were a brick-wall low-pass filter and H1 were a brick-wall high-pass filter, right? From a spectral point of view, you start with the signal, you decompose it into a low-pass component and a high-pass component, and now you can downsample without introducing any aliasing. Then you upsample, you pass it through some other filters, and you can get perfect reconstruction. But the beauty of subband coding is that neither H0 and H1 nor G0 and G1 have to be brick-wall anything. You can have filters with smooth roll-off and still do a fairly good job. So I'm going to plot that. If I were to plot the magnitude of H0(ω) or the magnitude of H1(ω), it's going to be something like this: this is zero, this is π, and the middle point is π/2. This is H1 and this is H0, and this is the low band and this is the high band.
So let's do a little bit of analysis, or just write down the expression relating the z-transform of this signal to this signal. We already know what happens to the z-transform of a signal when we subsample it, and we know what happens when we upsample it; all of those we remember from the beginning of the course, or EE 120 or EE 20, etc. The bottom line, after a bit of analysis, is that X-hat(z), the z-transform of this signal, is given by
X-hat(z) = 1/2 [H0(z) G0(z) + H1(z) G1(z)] X(z) + 1/2 [H0(-z) G0(z) + H1(-z) G1(z)] X(-z).
So this second term here is called the aliasing term. Why? Because it has something times X(-z). Therefore, in order to cancel aliasing, we have to make sure that our filters satisfy the condition that this thing inside the square bracket is equal to zero. Okay, so let me write that down. Let me roll up a little bit. Okay, therefore,
to cancel aliasing we must have, essentially, H0(-z) G0(z) + H1(-z) G1(z) = 0. Furthermore, if I wanted X-hat(z) to be perfectly equal to X(z), the term inside the first bracket, the thing multiplied by one-half, should be equal to two. So to get what's called perfect reconstruction, we must have H0(z) G0(z) + H1(z) G1(z) = 2.
It's possible to write these things as matrices. If I define a matrix Hm(z), made of H0(z) and H0(-z) in the first row and H1(z) and H1(-z) in the second row, I can combine these two conditions as a matrix equation. And I can say that to cancel aliasing and achieve PR (PR doesn't mean public relations, just perfect reconstruction) we must have the following matrix equation: the two-dimensional row vector [G0(z) G1(z)] times this Hm guy has got to be equal to [2 0], 2 because of this and 0 because of that.
It turns out people have spent a great amount of time, energy and resources in order to come up with conditions that these filters have to satisfy for these to hold true, and in the signal processing literature this is referred to as a filter bank. Why is it a bank? Because there's four of them: there's G0, G1, H0, H1. So there's a lot of stuff going on. Let me just give a preview of what's coming. There are three broad categories of design techniques people have come up with in order to satisfy these things. One is called QMF, quadrature mirror filters; the other one is called CQF, conjugate quadrature filters; and the third class is called orthonormal. I'll talk about each one of those in just a second, but before I do that I want to say one more thing about
this set of conditions. But let me first introduce the terms analysis and synthesis. In this diagram here, these filters, H0 and H1, are called the analysis filters, and these, G0 and G1, are called the synthesis filters. What we're going to show is that for these analysis and synthesis filters to satisfy these conditions, the aliasing cancellation and perfect reconstruction, they are going to have to satisfy the biorthogonality condition, and I'll tell you what that means in just a second. So H0 and H1 are called analysis filters, G0 and G1 are called synthesis filters. Let's call this equation star. We can show that if star is satisfied, then the analysis and synthesis filters are biorthogonal, and what I mean by that is that the inner product between h_i(2n - k) and g_j(k) is
equal to δ(i - j) δ(n), for i and j being zero and one. And the inner product is just defined as the summation over k of h_i(2n - k) g_j(k); that's all the inner product means, this summation. So this is called the biorthogonality condition. Why is it called biorthogonal? Because there's two sets of functions we're talking about, the h's and the g's, right? So that means the only time this inner product is nonzero is the case where both i is equal to j and n is equal to zero.
Okay, so having said that, before talking about the three cases, let me just talk about a famous example of a two-channel filter bank with perfect reconstruction, and then I'll talk about the three classes. Can you roll up, please? Okay, so here's an example of a two-channel filter bank with perfect reconstruction. On the surface you might say, wow, this thing is impossible. For my analysis low-pass filter I'm going to use -1/4, 1/2, 3/2, 1/2, -1/4. These are the tap coefficients; this is an FIR filter. For my analysis high-pass I'm going to use 1/4, -1/2 and 1/4. And for my synthesis (can you roll down, please?) my low-pass is 1/4, 1/2, 1/4 and my high-pass is 1/4, 1/2, -3/2, 1/2 and 1/4. This set of filters is referred to as the biorthogonal 5/3 filters. It's called 5/3 because it's got five taps here and three taps here, and three taps here and five taps here. As you can see, these are highly related to each other, and that's not by accident; we'll show in just a second that that's kind of the way to build subband filters. And this guy, Didier Le Gall, at the time he did this (can you roll down, please?) was a
researcher, but then later on he founded C-Cube, which is a VLSI company building MPEG chips in the Valley. He put patents on these things, which at the time kind of seemed ridiculous, but it ended up being an important patent, and it was enforceable. Why did he put patents on them? Because these have low complexity: they have few taps, and they have very simple coefficients. One-fourths and one-halves are very easy to multiply by, because it's just a bit shift, right? In any event, just so that you can see what they look like, let me show you the frequency response of those. I have to go to page 22. Here we go. So this shows H0, which is low-pass, and this shows H1; this is the frequency axis, ω going from zero to π; this is one, this is two. H0, H1, G0 and G1, all in the Fourier domain.
So having convinced you that there are some filters that do this, let me now talk about the three classes of subband filters. There are three main classes of perfect reconstruction filter families. Case one, the most famous one, is QMF, quadrature mirror filters. In these filters, to achieve aliasing cancellation, which is this term here, to make this whole term equal to zero, people impose the following conditions: H1(z) = H0(-z) = -G1(z) = G0(-z). So you're putting constraints on all four of these quantities, and you can show that if you do that, then you cancel aliasing. This was derived by Esteban et al. in 1976. So essentially, the reason it's called quadrature mirror filters is that the high-pass band is the mirror image of the low-pass band
in the frequency domain. One advantage of having a system like this is that instead of having to design four filters, you now have to design just one filter, right? Before, we had to design H0, H1, G0 and G1; now we design one, and from that one we derive the other three. And we'd better design that one in order to achieve perfect reconstruction, in other words to make this thing equal to 2. So essentially you only need to design one filter for aliasing cancellation, and then you design that one filter in such a way as to achieve PR. Let's say the one that we end up designing is H0. Then the condition becomes H0^2(z) - H0^2(-z) = 2. And just so that you can see what QMF filters look like, here's an example, if you can switch back to the computer. Hello, can you switch back to the computer, please? So here's an example of a 16-tap QMF filter. This is an H0,
H1 of z that satisfies this condition, and you can see that it has this kind of response. Now, it might not be possible to satisfy this perfect reconstruction condition on H0^2(z) precisely, but we can certainly approximate it. And then of course you can translate these conditions between H0 and H1 from the z domain into the impulse response domain.
So that was QMF filters. Now we're going to talk about conjugate quadrature filters, in other words CQF; that's the second class. What can we say about those kinds of filters? There you achieve aliasing cancellation a different way, by letting H0(z) be equal to G0(z^-1), which we define as this function F(z), and at the same time letting H1(z) be equal to G1(z^-1), which we now define as z F(-z^-1). This result was derived by Smith and Barnwell, both at Georgia Tech. I think Smith has now left Georgia Tech, is that correct? Okay, so you didn't overlap with Smith. Smith was there for many years; he's now a dean or chair at Purdue. Do we have anything from Purdue? No. Anyway, Smith and Barnwell, 1986. By the way, this guy Smith is a runner, and it's very rare to find an academic professor type who's also good at athletics. He was part of the group that lit the Olympic torch, now I have to remember what year, like eight years ago or twelve
years ago or something like that. In any event, to his credit he also developed these famous filters. So these are the conditions under which you achieve aliasing cancellation, and in the impulse response domain, the time domain if you will, we end up having h0(k) = g0(-k), which we define as f(k), and h1(k) = g1(-k) = (-1)^(k+1) f(-k+1). So these conditions achieve aliasing cancellation, and again we only have to design one filter, F(z); once we know F(z), all four filters are known. And to achieve PR we need the following condition: H0(z) H0(z^-1) + H0(-z) H0(-z^-1) = 2. Or in the frequency domain, for the F guy, which is the same as H0, we have to make sure that |H0(ω)|^2 + |H0(ω ± π)|^2 = 2. Those are both the same condition.
And finally I'm going to talk about the third class of subband filters, which are called orthonormal, which is the beginning of what wavelets are. So here's the link between subbands and wavelets, here's how we're going to jump from subband coding to describing what wavelets are: if our set of analysis filters satisfies the orthonormality condition, then we've designed wavelets. So a special kind of subband coding, if you will, is wavelets. So class three, orthonormal. In that case we have the following set of conditions: H0(z) = G0(z^-1); H1(z) we pick as G1(z^-1); and G1(z) we pick as -z^(-2K+1) G0(-z^-1), where I'll tell you in just a second
what the K stands for: 2K is the number of filter taps in each filter. And not only do we have to have these, but to achieve perfect reconstruction we also have to have G0(z) G0(z^-1) + G0(-z) G0(-z^-1) = 2. So these are the aliasing cancellation and perfect reconstruction conditions. So orthonormal subbands are essentially the same thing as wavelets. Now, for the same orthonormal filters in the time domain, what we have is one step beyond biorthogonal. Remember, biorthogonality was equivalent to the condition of this being equal to 2 and this being equal to 0, and all of these filters already satisfy biorthogonality. But these guys are orthonormal, which means that the inner product between g_i(n) and g_j(n + 2m) is equal to δ(i - j) δ(m). Furthermore, for orthonormal filters we can also show in the time domain that g1(n) = (-1)^n g0(2K - 1 - n), and h_i(n) = g_i(2K - 1 - n) for i = 0 and 1. So there are a lot of relationships between these various kinds of filters. Okay, so now let me move on and, just as an example, let's look at Figure 7.6 in Gonzalez and Woods.
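The time-domain orthonormal relations just stated can be checked numerically. The following is a minimal sketch, assuming the 4-tap Daubechies low-pass filter as g0 (so 2K = 4); the helper names `shifted_inner`, `analysis` and `synthesis` are hypothetical and are not from the lecture or the textbook. It builds g1, h0 and h1 from g0 using the relations above, checks the inner-product condition, and runs a signal through the two-channel bank to confirm it comes back delayed by 2K - 1 samples.

```python
import numpy as np

# Assumed example filter: Daubechies 4-tap low-pass, used as the synthesis filter g0.
s3 = np.sqrt(3.0)
g0 = np.array([1 + s3, 3 + s3, 3 - s3, 1 - s3]) / (4 * np.sqrt(2.0))
L = len(g0)                      # L = 2K taps
n = np.arange(L)

# Relations from the lecture: g1(n) = (-1)^n g0(2K-1-n), h_i(n) = g_i(2K-1-n).
g1 = (-1.0) ** n * g0[::-1]
h0 = g0[::-1]
h1 = g1[::-1]

def shifted_inner(a, b, shift):
    """sum_n a[n] * b[n + shift], with samples outside the filter taken as zero."""
    total = 0.0
    for k in range(len(a)):
        if 0 <= k + shift < len(b):
            total += a[k] * b[k + shift]
    return total

# Orthonormality: <g_i(n), g_j(n + 2m)> = delta(i - j) * delta(m).
synth = [g0, g1]
orthonormal = all(
    np.isclose(shifted_inner(synth[i], synth[j], 2 * m),
               1.0 if (i == j and m == 0) else 0.0)
    for i in range(2) for j in range(2) for m in (-1, 0, 1)
)

def analysis(x, h0, h1):
    """Filter with h0 and h1, then keep every other sample."""
    return np.convolve(x, h0)[::2], np.convolve(x, h1)[::2]

def synthesis(y0, y1, g0, g1):
    """Insert zeros between samples, filter with g0 and g1, and add."""
    u0 = np.zeros(2 * len(y0)); u0[::2] = y0
    u1 = np.zeros(2 * len(y1)); u1[::2] = y1
    return np.convolve(u0, g0) + np.convolve(u1, g1)

rng = np.random.default_rng(0)
x = rng.standard_normal(16)
y0, y1 = analysis(x, h0, h1)
x_hat = synthesis(y0, y1, g0, g1)

delay = L - 1                    # output is the input delayed by 2K - 1 samples
max_err = np.max(np.abs(x_hat[delay:delay + len(x)] - x))
print("orthonormal:", orthonormal, "max reconstruction error:", max_err)
```

Note that the two subband signals together hold about as many samples as the input (plus a few from the filter tails), which is the "no expansion" property mentioned earlier.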
So bring this down; this is figure eight. Okay, so if I go back here, this is the basic subband filter bank, with the QMF, CQF and orthonormal cases. This is how we do subband image coding. You start with your signal and pass it through these two filters along the rows. In other words, if the image is x(m, n), where m is the first index and n is the second index, first you pass it through a one-dimensional filter h0(m) to deal with the rows and subsample along the rows; then you pass it through a vertical filter h0(n) and do another subsampling along the columns. And so you end up getting four different kinds of signals: this is low-pass in both x and y, this is low-pass in one dimension and high-pass in the other, this is the other way around, and this is high-pass in both dimensions. Anyway, since we're talking about orthonormal filters, let's not get distracted. So this is an example of Daubechies orthonormal filters, H0, H1, G0 and G1. This is the first set of wavelet filters that was invented. These H0, H1, G0 and G1 indeed satisfy the orthonormality condition that we've been talking about, and they're referred to as Daubechies orthonormal filters. She's a mathematician by training. Actually,
at one of the conferences in the late 80s I was roommates with her; I forget which conference it was. I'm going to skip a lot of these things and now talk about how the wavelet transform actually works. I don't think this guy is going to show any more images. So the way the wavelet transform actually works is kind of the following. You start with a signal and you pass it through these filters that are designed. Let's say this is the low-pass filter; you subsample it and get a signal. You pass it through a high-pass filter (this is H0, this is H1) and subsample it. Then you keep the high-pass output, and the low-pass output you further decompose into low-pass and high-pass. I don't think this is the best picture for me to show you, so let me go back. No, this doesn't have it either. This might have it. Okay, this is a good picture, excellent; then I don't have to draw it.
So here's a discrete wavelet transform. Not only do these H0, H1 and G0, G1 satisfy the orthonormality condition we talked about, but typically here's how we do it. You start with your signal; if it's analog you sample it, but I don't think we need to worry about that for now. H0 is your low-pass filter, H1 is the high-pass filter; you subsample, send it over, upsample and filter. But for the low-pass signal, because it has a lot of energy, what you end up doing is, after you downsample it, you repeat that process again: filter and downsample. And then with this signal here you can repeat the process again, in a nested way, so that the low-pass signal keeps getting filtered and processed. So what is the final picture that you end up getting? For one thing, there's no signal expansion going on. But for another, this is kind of what you get. So depending upon how you design your H0 and H1, you could
have different kinds of wavelets. I already showed you the Daubechies filters, which is this one, 8 and 8, and then these Cohen-Daubechies-Feauveau filters, which are 17/11, meaning H0 has 17 taps and H1 has 11; for this one H0 and H1 both have eight. Haar, which was a well-known filter even before Daubechies, has two and two taps, and symlets have eight and eight taps. But basically this shows the fundamental idea: you start with a big image, you first do a low-pass operation, and you have a low-pass signal here and three high-pass signals; then you take that low-pass signal and again generate a low-pass and three high-pass signals, and then you take that low-pass signal and generate low, high, high, high. Is everybody clear how this works? And this shows the Daubechies orthonormal eight-tap filters, H0, H1, G0, G1; further down, symlets, H0, H1, G0, G1; and the biorthogonal Cohen-Daubechies-Feauveau 17/11 filters.
I'd like to go to the beginning of this presentation, because I think this thing might make it a little bit clearer how wavelet coding works. Bear with me. Okay, so the switch from 1D to 2D might not have been terribly clear, but here's the idea. This is ω_x, this is ω_y. The low-low component is this gray area that's shown here; this gray component is high frequency along x, low frequency along y; this one is high frequency along y, low frequency along x; and these four little regions in the frequency domain are what the high-high version has. And then what you do is you start with this low-low guy and further decompose it into low-low, high-low, low-high and high-high. Is everybody clear about how this thing is working? If you do that, then here's an example of a 2D wavelet transform. You start with the Lena image, which we all have seen and know. The first step is just this:
you have the Lena here which is low-passed in both the x and y dimensions; this one is low-passed in x, high-passed in y; this is the other way around; and this is high-passed in both of them. Then, starting from this picture, you decompose this LL component, the low-low component, further in order to get this picture. These three pieces are the same as what they were before, but that big low-low picture now has low-low, low-high, high-low and high-high again. So this is a two-level wavelet decomposition, and you can use your favorite wavelet; you can use Daubechies or whatever, it doesn't really matter. And then you can further decompose this signal again into low-low, low-high, high-low, high-high, etc., and what you get is something that looks like this.
And now, to end the lecture, this is the very last thing I want to say: how do you quantize and code these guys? When wavelets were initially introduced, it took a long time, it took 10 years, before image processing people figured out how to efficiently quantize and code them. If you just do uniform quantization and apply Huffman coding, it doesn't beat DCT; it never did. So this smart guy, Jerome Shapiro, who actually was classmates with me at MIT and who was at Sarnoff Labs when he invented this, came up with this concept of zerotrees. This is almost at the end of the presentation here, and here's how it works. The observation this guy made is, if you go back to this picture, you can see first of all that these high-high components have very little energy. So wavelets have achieved one of the things that transform coding achieves, the energy compaction, right? The number of samples hasn't increased, unlike pyramid coding; it's constant. But the number of pixels with high energy is very small; these signals have close to zero energy. But here's the observation
Shapiro made, this guy from MIT who was at Sarnoff when he came up with this: if there's a nonzero coefficient at a particular location, it's also nonzero in the other bands at the same corresponding locations. So for example, this location corresponds to this location, which corresponds to this location.
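The correspondence between locations across bands can be sketched in code. This is a minimal sketch, not Shapiro's full coder: it assumes the usual dyadic layout, in which a coefficient at (r, c) in a coarser band covers the 2x2 block starting at (2r, 2c) in the next finer band of the same orientation. The function names `children` and `is_zerotree`, the threshold and the toy arrays are all hypothetical.

```python
import numpy as np

def children(r, c):
    """Positions in the next finer band that cover the same spatial location."""
    return [(2 * r, 2 * c), (2 * r, 2 * c + 1),
            (2 * r + 1, 2 * c), (2 * r + 1, 2 * c + 1)]

def is_zerotree(bands, level, r, c, threshold):
    """True if the coefficient at (r, c) in bands[level] and every descendant
    at the corresponding locations in the finer bands is below threshold."""
    if abs(bands[level][r, c]) >= threshold:
        return False
    if level + 1 == len(bands):          # finest band: no children left
        return True
    return all(is_zerotree(bands, level + 1, rr, cc, threshold)
               for rr, cc in children(r, c))

# Toy bands of one orientation (say high-high), ordered coarse to fine.
bands = [np.zeros((2, 2)), np.zeros((4, 4)), np.zeros((8, 8))]
bands[2][5, 2] = 9.0                     # one large coefficient in the finest band

root_with_big_descendant = is_zerotree(bands, 0, 1, 0, threshold=1.0)
root_all_small = is_zerotree(bands, 0, 0, 0, threshold=1.0)
print(root_with_big_descendant, root_all_small)
```

In this toy example the coefficient at (5, 2) in the finest band sits under (2, 1) in the middle band and under (1, 0) in the coarsest band, so the tree rooted at (1, 0) is not all small, while the tree rooted at (0, 0) is.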