Talk:Cepstrum

WikiProject Statistics (Rated C-class, Mid-importance)

This article is within the scope of the WikiProject Statistics, a collaborative effort to improve the coverage of statistics on Wikipedia. If you would like to participate, please visit the project page or join the discussion.

This article has been rated as C-Class on the quality scale.
This article has been rated as Mid-importance on the importance scale.

What do you take the log of?

If you do an FFT, say in MATLAB, you get the frequencies on the X axis and an associated amplitude on the Y axis. Let's say the amplitude is 27 at 40 Hz. When I take the log of the FT, is it the log of the 27 or of the 40? The same goes for the absolute value and the squaring. I know this must seem like a stupid question... and I think it would be the 27, but I would appreciate any corroboration on this. The article doesn't seem to answer this.

Also, I thought that if you take an inverse FFT of the FFT of a signal you get the signal back, so it effectively does the opposite operation. So how can both definitions be right: FT->log->IFT and FT->log->FT?


—Preceding unsigned comment added by 165.124.112.31 (talk) 09:46, 15 March 2010 (UTC)
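
A minimal sketch of the question above, assuming NumPy rather than MATLAB; the sampling rate, signal length and tone scaling are made up so that the 40 Hz bin has magnitude 27. It illustrates that the absolute value and the logarithm act on the Y values (one complex number per bin), while the frequency axis is only a label for the bins.

```python
import numpy as np

fs = 1000                      # sampling rate in Hz (assumed for this example)
n = 1000                       # number of samples, i.e. one second of signal
t = np.arange(n) / fs
x = 0.054 * np.sin(2 * np.pi * 40 * t)   # 40 Hz tone scaled so its FFT magnitude is 27

spectrum = np.fft.rfft(x)              # complex values, one per frequency bin
freqs = np.fft.rfftfreq(n, d=1 / fs)   # the X axis: bin frequencies in Hz
magnitude = np.abs(spectrum)           # the Y axis: amplitude per bin

peak = int(np.argmax(magnitude))
peak_freq = freqs[peak]                # the X position (40 Hz); it never goes through the log
peak_mag = magnitude[peak]             # the Y value (about 27)
log_peak = np.log(peak_mag)            # the log is taken of the amplitude, i.e. log(27)
```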

Lots of sources claim that the cepstrum is FT->log->IFT, while others say "not only is the cepstrum FT->log->FT, but FT->log->IFT is wrong", whereas I don't see FT->log->IFT sources claiming that "FT->log->FT" is wrong. What is it? -Jay Kominek

The two methods are functionally equivalent. In the real case, when we use the IFT there is more data in the end result; however, half of it is redundant due to symmetry. The IFT will double the number of points, while the FT will halve the number of points. So, using the IFT will yield twice the frequency resolution. In this respect, using the IFT is equivalent to using the FT with zero padding to double the length of the signal. Using the IFT may also be desirable for purposes of mathematical analysis, which is probably why most DSP references prefer the IFT. From the intuitionist's perspective, using the FT 'makes more sense'; however, using the IFT is certainly NOT wrong. In fact, it is always possible to interchange IFT and FT operators, as long as you know what you're doing. ;) --andy

Andy, this is misleading. The only difference between the Fourier transform and the inverse Fourier transform (or DFT and inverse DFT) is the sign of the exponent in the transform (and possibly the normalization), not the number of outputs. I think you are confusing this with the fact that there are often specialized versions of the DFT (FFT) for real inputs, which only store half of the complex outputs, because the other half is redundant (and conversely, there are specialized inverse FFTs that take conjugate-symmetric outputs to real inputs), but this is an optimization of the implementation, and does not change the logical mathematical transform being computed. —Steven G. Johnson 19:44, Apr 21, 2004 (UTC)
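
The point about sign, normalization and output length can be checked numerically. This is a small sketch assuming NumPy's `numpy.fft` conventions (the 1/N factor sits on the inverse transform); the test signal is arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 8
x = rng.standard_normal(n) + 1j * rng.standard_normal(n)

forward = np.fft.fft(x)        # n complex outputs
inverse = np.fft.ifft(x)       # also n complex outputs

# The inverse DFT is the forward DFT with the exponent sign flipped
# (conjugate in, conjugate out) plus the 1/n normalization.
inverse_via_forward = np.conj(np.fft.fft(np.conj(x))) / n

# Real-input FFTs store only the non-redundant half; that is an
# implementation optimization, not a different transform.
x_real = x.real
full = np.fft.fft(x_real)      # n outputs, conjugate-symmetric
half = np.fft.rfft(x_real)     # n // 2 + 1 outputs
```
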
In [1] Oppenheim defines the complex cepstrum as complex Fourier transform -> complex logarithm -> inverse Fourier transform. Using this definition, the complex cepstrum is invertible, and can be used for homomorphic filtering, i.e. deconvolution of a signal and a filter. In their original paper, Bogert et al. studied seismic echoes, for which the definition "the power spectrum of the log power spectrum" is more intuitive, and need not be invertible. Usually the cepstrum is discussed in the context of deconvolution, so it is natural to use Oppenheim's definition of the complex cepstrum, even if the system doesn't need to be invertible. The important thing is the logarithm. Without the logarithm, the second spectral transform would essentially transform back to the time domain. For the purposes of speech recognition features, either definition could be used. Mel-frequency cepstral coefficients are calculated using the DCT as the second transform, because it works just as well, and also approximately decorrelates the features.
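
The remark that the second transform would essentially go back to the time domain without the logarithm can be illustrated with a short sketch, assuming NumPy's DFT conventions and an arbitrary real test signal: applying the forward DFT twice returns the time-reversed signal scaled by N, whereas inserting log(abs(...)) between the two transforms gives something different.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.standard_normal(16)
n = len(x)

# Forward DFT applied twice: n * x[(-m) mod n], i.e. the time-reversed signal.
twice = np.fft.fft(np.fft.fft(x)) / n
time_reversed = np.roll(x[::-1], 1)     # x[0], x[n-1], ..., x[1]

# With the logarithm in between, the result is no longer the signal itself.
with_log = np.fft.fft(np.log(np.abs(np.fft.fft(x))))
```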

I removed the following note (in bold) from the article: "There are many ways to calculate the cepstrum, some of them need a phase-warping algorithm, others do not. (fixme which one one is?)" Dori 21:16, Nov 18, 2003 (UTC)


Many texts incorrectly state that the process is FT->log->IFT, i.e. that the cepstrum is the "inverse Fourier transform of the log of the spectrum". This is not the definition given in the original paper, but unfortunately is widespread.

Note that taking the IFT is equivalent to an FT if you have a purely real signal as input and you are computing the "real cepstrum" (log of magnitudes), which gives a real-symmetric signal as input to the final IFT or FT. So, maybe the IFT version is simply a generalization of the original definition to complex signals or complex cepstrums, and is not "incorrect?"
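
A quick numerical check of this equivalence, as a sketch assuming NumPy's conventions (1/N on the inverse) and an arbitrary real signal: the log magnitude spectrum of a real signal is real and symmetric, so FT and IFT of it differ only by the factor N.

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.standard_normal(64)                 # purely real input signal
n = len(x)

log_mag = np.log(np.abs(np.fft.fft(x)))     # real and symmetric: log_mag[k] == log_mag[n - k]

via_fft = np.fft.fft(log_mag)               # FT -> log -> FT
via_ifft = np.fft.ifft(log_mag)             # FT -> log -> IFT
# For this input, via_fft equals n * via_ifft and both are real up to rounding.
```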

I don't know much about cepstrums (although I have a lot of experience with FFTs), but this article strikes me as a bit weak from a mathematical standpoint. --Steven G. Johnson 19:44, Apr 21, 2004 (UTC)

"...unfortunately is widespread." It's more than widespread; it's almost the only definition used in the signal processing community. By the way, the definition in the mel-frequency domain (MFCC) uses a DCT (discrete cosine transform) instead of an IFFT or FFT. Celsius813 17 May 2005

---

This is a joke, right? "saphe-cracking" is the giveaway. Nominate for deletion, as well as anything that links here. --Wtshymanski 21:48, 20 Dec 2004 (UTC)
My mistake... this really is legitimate, though in my defense Wikipedia is the first place I've ever heard of this. --Wtshymanski 18:31, 21 Dec 2004 (UTC)

---

Shouldn't the first sentence be "A cepstrum (pronounced "kepstrum") is the result of taking the Fourier transform (FT) of the logarithm of the decibel spectrum as if it were a signal."?

please get a WP username and sign your posts. anyway, i think the sentence could be improved but (even though i don't use the term), the "decibel spectrum" has already had a logarithm applied to it. you don't want to apply the log to the log of the spectrum. r b-j 03:08, 11 January 2006 (UTC)

Hey... I have read about cepstrums as homomorphic systems. So, if 2 signals are combined in the time domain by a non-additive operation such as a convolution, taking the FFT would result in the multiplication of the signals in the frequency domain. A log maps the multiplication into an addition. Any filtering action can now be performed to remove/smooth any of the signals. But the signals still remain in the frequency domain, and an inverse FFT would give us the transformed signal in the time domain. I think an IFFT is more appropriate. But this is the only place where I have seen this definition. --S.Sriram 11:28, 25 January 2006 (UTC)
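
The homomorphic chain described above (convolution, then multiplication, then addition) can be sketched as follows. This assumes NumPy, uses the real cepstrum with an IFFT as the last step, and zero-pads both inputs to the length of the linear convolution so that the product of the spectra is exact; the helper name `real_cepstrum` and the two test signals are hypothetical.

```python
import numpy as np

def real_cepstrum(x, n):
    """Real cepstrum via FT -> abs -> log -> IFT, with x zero-padded to n points."""
    return np.fft.ifft(np.log(np.abs(np.fft.fft(x, n)))).real

rng = np.random.default_rng(3)
a = rng.standard_normal(32)            # arbitrary "source" signal
b = np.array([1.0, 0.5, 0.25])         # short "filter" with no zeros on the unit circle

y = np.convolve(a, b)                  # time domain: convolution
n = len(y)

c_sum = real_cepstrum(a, n) + real_cepstrum(b, n)   # cepstral domain: addition
c_conv = real_cepstrum(y, n)                        # cepstrum of the convolved signal
```

Because the two contributions add in the cepstral domain, subtracting or zeroing one of them (liftering) and transforming back is what makes deconvolution possible with the invertible complex cepstrum.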


Is the etymology section some kind of joke? It's completely useless and should be taken out. One part of a sentence is all that is needed to explain that cepstrum is a rearrangement of spectrum.

"Correctness" of cepstrum definition

I know this has been discussed above, but a lot of the above discussion is old, and I don't want my contribution to be lost, especially since it concerns an edit I made. Moreover, I'm going to take this from a slightly different angle: what really makes one definition of "cepstrum" more "correct" than the other? Yes, FFT -> log -> FFT was the original definition, but FFT -> log -> IFFT is obviously useful to many people, and is what a lot of people mean when they say "cepstrum". Why can't it be both? Why does the original definition have to be the One True Definition? This is obviously a somewhat subjective matter. Because of this, I decided that it is not NPOV to declare that a given definition is "correct" and have reworded the article accordingly. - furrykef (Talk at me) 18:11, 27 March 2006 (UTC)

"Real cepstrum"

The real cepstrum uses the logarithm function defined for real values, while the complex cepstrum uses the complex logarithm function defined for complex values also.

I'm not quite sure what this means. Is it saying that the real cepstrum is real FFT -> log -> real FFT? In other words, throwing away the imaginary part after the first FFT as well as the second? I've also seen "real cepstrum" defined as the complex cepstrum with the real components thrown away at the end; which is it? - furrykef (Talk at me) 18:21, 27 March 2006 (UTC)

(Rewritten) Components are not usually thrown away, but one may choose to reduce each real+imaginary pair of values to a single real value by rotating the vector to an angle of zero, where the imaginary part becomes zero. That yields the absolute value of the complex value. Only the phase information is discarded; the new real value keeps the magnitude information. Cuddlyable3 11:29, 25 June 2007 (UTC)
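
The rotation described above can be written out in a few lines. This is only an illustration with an arbitrary complex value, assuming NumPy:

```python
import numpy as np

z = 3.0 + 4.0j                              # one real+imaginary pair from a spectrum

rotated = z * np.exp(-1j * np.angle(z))     # rotate the vector onto the positive real axis
magnitude = np.abs(z)                       # same number without the explicit rotation

# rotated has (numerically) zero imaginary part and real part equal to |z|;
# the phase np.angle(z) is the information that was discarded.
```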

Absolute value

I've also seen a cepstrum defined such that the absolute value is taken before the logarithm. This is not mentioned anywhere in the article; which way is it? Or is it another matter of debate like whether or not taking the IFFT can be considered "correct"? - furrykef (Talk at me) 18:51, 27 March 2006 (UTC)

(Rewritten) The absolute value is taken before the logarithm. Do not take an absolute value of a logarithmic (e.g. decibel) value! Cuddlyable3 11:29, 25 June 2007 (UTC)
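
A small sketch of why the order matters, assuming NumPy and three made-up spectral values: the logarithm of a magnitude below 1 is negative, and taking an absolute value after the logarithm would make it indistinguishable from a magnitude above 1.

```python
import numpy as np

spectrum = np.array([0.5 + 0.0j, 0.0 - 2.0j, -4.0 + 0.0j])   # made-up complex bins

log_of_abs = np.log(np.abs(spectrum))   # abs first, then log: about [-0.69, 0.69, 1.39]
abs_of_log = np.abs(log_of_abs)         # abs again after the log: the sign is lost

# In abs_of_log the bins with magnitude 0.5 and 2.0 can no longer be told apart.
```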

The article describes the definitions for the power cepstrum:

--

mathematically: power cepstrum of signal
  • algorithmically: signal → FT → abs() → square → log → FT → abs() → square → power cepstrum

--

However, the formula does not agree with the other definitions. It would be


mathematically: power cepstrum of signal = abs(FT(log(abs(FT(signal))^2)))^2

Otherwise, the square may even return numbers smaller than zero, because the argument is a complex number; remember that i^2 = -1. --62.159.14.3 (talk) 10:55, 10 June 2009 (UTC)
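
The concern about squaring a complex number can be shown directly. The sketch below assumes NumPy and follows the algorithmic chain quoted above (FT, abs, square, log, FT, abs, square); the function name `power_cepstrum` is only illustrative.

```python
import numpy as np

z = 1j
square_only = z ** 2              # squaring a complex number can give a negative value
abs_then_square = abs(z) ** 2     # taking the absolute value first cannot

def power_cepstrum(signal):
    """signal -> FT -> abs() -> square -> log -> FT -> abs() -> square."""
    power_spectrum = np.abs(np.fft.fft(signal)) ** 2
    return np.abs(np.fft.fft(np.log(power_spectrum))) ** 2

rng = np.random.default_rng(4)
pc = power_cepstrum(rng.standard_normal(128))   # real and non-negative everywhere
```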

Another note, the definition of the complex cepstrum is messed up as well. It says:

signal → FT → abs() → log → phase unwrapping → FT → cepstrum


However, this does not make sense: the absolute value of the Fourier transform is real, and the following logarithm always yields a real result. Therefore, the logarithm does not have any imaginary part, and phase unwrapping would not be necessary. Actually, in the most reliable definition of the complex cepstrum that I have available here, Digital Signal Processing by Alan Oppenheim and Ronald W. Schafer, Prentice Hall, 1975 edition, page 500, which is part of chapter 10, "Homomorphic Signal Processing", there is no absolute value. To me, the absolute value in the article's chain does not make any sense. Please get some books, verify and fix up the article. --62.159.14.3 (talk) 11:31, 10 June 2009 (UTC)
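
A rough sketch of the chain without the absolute value, i.e. FT -> complex log (with unwrapped phase) -> IFT, assuming NumPy. The function names are hypothetical, the test signal is a decaying exponential whose spectrum has no zeros, and the linear-phase removal that practical implementations often add is left out.

```python
import numpy as np

def complex_cepstrum(x):
    """FT -> complex log (log magnitude + j * unwrapped phase) -> IFT."""
    spectrum = np.fft.fft(x)
    log_spectrum = np.log(np.abs(spectrum)) + 1j * np.unwrap(np.angle(spectrum))
    return np.fft.ifft(log_spectrum)

def inverse_complex_cepstrum(c):
    """FT -> exp -> IFT undoes the chain above."""
    return np.fft.ifft(np.exp(np.fft.fft(c)))

x = 0.5 ** np.arange(16)                     # test signal (assumed, minimum phase)
c = complex_cepstrum(x)
recovered = inverse_complex_cepstrum(c).real

# With abs() in front of the log, the result is purely real: the phase is gone,
# there is nothing left to unwrap, and the signal cannot be recovered.
log_abs = np.log(np.abs(np.fft.fft(x)))
```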

Added convolution note

The property that convolutions turn into additions is very important and is indeed a large part of the motivation behind the cepstral domain in the first place. I'm not too good at the math latex stuff so maybe someone can fix that up. --Speedplane 05:14, 5 March 2007 (UTC)
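
One possible way to typeset the property is sketched below. The symbols are assumptions for illustration: x_1 and x_2 are the two signals, X their Fourier transforms, and c the real cepstrum taken with an inverse transform as the last step.

```latex
\begin{align*}
x[n] &= (x_1 * x_2)[n] \\
X(\omega) &= X_1(\omega)\, X_2(\omega) \\
\log\lvert X(\omega)\rvert &= \log\lvert X_1(\omega)\rvert + \log\lvert X_2(\omega)\rvert \\
c_x[n] &= \mathcal{F}^{-1}\{\log\lvert X(\omega)\rvert\}
        = c_{x_1}[n] + c_{x_2}[n]
\end{align*}
```

The last step uses only the linearity of the (inverse) Fourier transform.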

How to edit "References" section?

New at this. The word "Analysis" is misspelled in the first (BP Bogert, MJR Healy, JW Tukey) reference. Smoo222 (talk) 21:28, 29 June 2009 (UTC)

No, it's not. It's a playful alteration of the word in the same way that "cepstrum" is an alteration of "spectrum". - furrykef (Talk at me) 04:38, 30 June 2009 (UTC)
Oops! You're right. About 1/3 of the references to that article online have "corrected" the spelling, but indeed, the original was spelled 'alanysis'. Thank you. Smoo222 (talk) 16:07, 30 July 2009 (UTC)

Cepstral concepts

>For example, if the sampling rate of an audio signal is 44100 Hz and there is a large peak in the cepstrum whose quefrency is 100 samples, the peak indicates the presence of a pitch that is 44100/100 = 441 Hz.

To me, this seems incorrect at first glance. The peak indicates there are some pitches with frequency differences of 440 Hz. Couldn't this, for example, also be a 220 Hz symmetric square wave (220 Hz, 660 Hz, 1100 Hz, 1540 Hz...)? I have no means to test this right now without endless calculations, but this is what I'd guess from what I believe I have understood. It could even be something completely inharmonic, as when you take this square wave and frequency-shift it by 100 Hz. The 440 Hz base frequency would then just be the "most probable guess", as the other examples are more or less unlikely for "real world signals". You might have to take into account the saphe to separate such cases. Jssr67 (talk) 20:03, 2 January 2010 (UTC)

I think that using the word "pitch" doesn't make the case clearer. One reason is that Pitch (psychophysics) is actually a psychoacoustical quantity. The perceived pitch can certainly be related to periodicities in the signal, but it can also be quite different. I think it is safer to say that the cepstrum shows marked peaks when harmonic structures are present in the spectrum, and is therefore usually related to periodicities in the time signal, which in themselves can be perceived as a pitch. In the example above, the perceived pitch would be the fundamental frequency, or F0 (but alas: a pitch can also be perceived for an F0 which isn't there at all, for example when the signal is high-pass filtered, e.g. telephone speech). The physical periodicity in the time signal is also, clearly, 1 / (220 Hz), and one would expect the highest peak in the cepstrum at this position, at a "quefrency" of 4.54545454... ms.
Another thing that might well be worth noting is that when some physical body is periodically excited in a nonlinear manner, a harmonic spectrum appears, and the position of the maximum in the cepstrum (ignoring the maximum that always exists at zero) will indeed show the fundamental frequency. Therefore, the cepstrum is quite important in the analysis of machine vibrations, e.g. of gears, because damage to toothed wheels will usually cause such nonlinearities and harmonic noises, which are readily detectable in the cepstrum.
--62.159.14.3 (talk) 18:15, 17 February 2010 (UTC)
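
The relation between a cepstral peak and the periodicity of the time signal can be tried out with a synthetic example. This is a sketch under several assumptions: NumPy, the real cepstrum with an IFFT as the last step, a made-up periodic pulse signal with a period of exactly 100 samples, a frame containing a whole number of periods, and a small floor `eps` that keeps the logarithm finite at the empty bins between the harmonics.

```python
import numpy as np

fs = 44100                    # sampling rate in Hz, as in the quoted example
period = 100                  # samples per cycle, so the fundamental is fs / period
n = 4000                      # frame length, chosen as a whole number of periods

# Periodic test signal: a decaying pulse repeated every `period` samples,
# which gives a harmonic spectrum with lines every fs / period Hz.
one_cycle = 0.8 ** np.arange(period)
x = np.tile(one_cycle, n // period)

eps = 1e-6                    # small floor so the log stays finite at empty bins
log_mag = np.log(np.abs(np.fft.fft(x)) + eps)
cepstrum = np.fft.ifft(log_mag).real

# Search only quefrencies that correspond to roughly 300 Hz .. 1000 Hz.
q_min = int(fs / 1000)        # 44 samples
q_max = int(fs / 300)         # 147 samples
peak_q = q_min + int(np.argmax(cepstrum[q_min:q_max]))
f0 = fs / peak_q              # periodicity implied by the peak quefrency
```

The search range is restricted on purpose: a periodic signal also produces peaks at multiples of the fundamental period (200, 300, ... samples), so an unrestricted maximum search is ambiguous. The peak says that the spectrum has regularly spaced lines; whether that spacing is perceived as a pitch is a separate question, as discussed above.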

Confusion - two definitions

In engineering signal analysis the cepstrum is the spectrum of the spectrum, i.e. Ecepstrum=FFT(log(abs(FFT(timesignal)))). In speech analysis it is something else involving an IFFT, something like Scepstrum=IFFT(log(abs(FFT(timesignal)))).

I very much doubt the two are equivalent and I think it would be a good idea to have sections on each. At the same time I have no great enthusiasm for doing so since Ecepstrum is not very useful. Greglocock (talk) 03:22, 18 September 2013 (UTC)

In the lecture notes "Speech Processing by Computer" (Notes on Computing the IDFT and Cepstrum) by Professor Leah H. Jamieson there is an explanation that those two definitions are interchangeable for real even functions, which is the case for the log(abs()) function of a real signal's spectrum. --46.215.36.139 (talk) 17:17, 21 January 2014 (UTC)
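
The argument for real even functions can be written out in a few lines. This is a sketch, not a quotation from the lecture notes; it assumes an N-point DFT with the 1/N factor on the inverse, a real time signal, and the symbol L[k] = log|X[k]| for the log magnitude spectrum, which is then real and satisfies L[N-k] = L[k] (indices taken modulo N).

```latex
\begin{align*}
\mathrm{FFT}\{L\}[n]
  &= \sum_{k=0}^{N-1} L[k]\, e^{-j 2\pi k n / N} \\
  &= \sum_{k=0}^{N-1} L[N-k]\, e^{+j 2\pi k n / N}
     && \text{(substituting } k \to N-k\text{)} \\
  &= \sum_{k=0}^{N-1} L[k]\, e^{+j 2\pi k n / N}
     && \text{(using } L[N-k] = L[k]\text{)} \\
  &= N \cdot \mathrm{IFFT}\{L\}[n]
\end{align*}
```

So for this input the two chains, Ecepstrum and Scepstrum in the notation above, differ only by the constant factor N.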

Minor Ambiguity

The picture located here: http://en.wikipedia.org/wiki/File:Cepstrum_signal_analysis.png is very useful for seeing how a cepstrum 'works'; however, it's located within the paragraph defining the four types of cepstra and doesn't have any mention of which type the picture contains. Can someone clarify? — Preceding unsigned comment added by 75.155.204.180 (talk) 16:13, 9 April 2015 (UTC)

It's the one I have called an engineering cepstrum elsewhere on this talk page. Greglocock (talk) 00:22, 10 April 2015 (UTC)

  1. ^ Oppenheim, Alan (1968). "Homomorphic Analysis of Speech". IEEE Transactions on Audio and Electroacoustics.