Discrete Fourier Transform

Noise vs Tone

If a puff of air hits your eardrum, you hear a noise. If a puff of air vibrates back and forth very quickly against your eardrum, you hear a tone. If the vibration happens to be at 440 cycles per second (440 Hertz or Hz), you hear middle A (the A above middle C on the piano).

Graphing a Tone

A line on a graph can represent the vibrations of the air. The rising and falling of the line as time proceeds is called the signal. In audio signals, the line represents the puff of air as it approaches or recedes away from your eardrum. If the air is approaching your ear, the line is above zero. If receeding, it is below zero. The distance from the signal to the zero line is called the amplitude.

Fourier Transform

The Fourier Transform depends on a theory that says that a signal can be represented by a infinite summation of sine waves (also called sinusoids). To recreate any given signal, you take an infinite number of sine waves of the right kind, add their amplitudes at all times, and they will faithfully reproduce the original signal.

Sinusoids

Each of the sinusoids has three properties that defines them:
  1. Frequency
  2. Amplitute
  3. Phase

The Frequency is how many times the sine wave crosses the zero line per second. The Amplitude is how high above or below the zero line the sine wave rises to. The Phase indicates where the sine wave was at time = 0. If it was at zero amplitude and rising it has Phase 0 by definition. If it is at it's maximum amplitude and falling it has Phase 90 degrees (pi/2).

Continuous vs Discrete

The vibrations of the puff of air are continuous. That is, between any two points in time, there is another point of time where the air is doing something. On the other hand, if you wish to represent a sound in a digital way, discretely that is, you must sample the sound periodically and represent the amplitude at those times as a number, positive or negative.

FT and IFT

The Fourier Transform (FT) takes a continuous signal (the time domain) and tells you which sinusoids are required to represent it (the frequency domain). The Inverse Fourier Transform (IFT) goes from the frequency domain to the time domain, that is it takes a set of sinusoids and tells you what signal they represent.

    F(f)=+∞-∞s(t) e -j 2 f t dt    (1)

F(f) is the transform at frequency f. s(t) is the amplitude of the signal at time t. e is the natural exponent. j is the square root of -1 as used by complex numbers. The IFT is

    s(t)=+∞-∞F(f) e j 2 f t df    (2)

DFT and IDFT

To convert formula (1) to its discrete form, the DFT, we do the following:
    F[f] = 1/N S  s[k] e -j k p f / N  for k=0..N-1, for f= 0..N-1

The Inverse DFT (IDFT) is similar:

    s[t] = S  F[f] e j k p f / N  for k=0..N-1, for f= 0..N-1

i.e. there's no 1/N coefficient and the -1 sign is removed.

This identity -- e^j*theta = cos(theta) + j sin(theta)

    ejq = cos(q) + j sin(q)

gets rid of the exponent:
    F[f] = 1/N S   s[k]  cos( -j k p f / N)  + j sin( -j k p f / N)   for k=0..N-1, for f= 0..N-1

Converting to C++

To convert to C++, start with the innermost and common pieces. This piece:

 -j k p f / N

is common in two places and if we move the varying parts out:
         #define _USE_MATH_DEFINES
         #include 
         double arg = -1 * 2.0 * M_PI / (double) NumSamples();
Where NumSamples() returns the number of samples, i.e. N. Substitute back into the equation: F[f] = 1/N Z s[k]*(cos(k * f * arg) + j*sin(k * f * arg)) for k=0..N-1, for f=0..N-1

    F[f] = 1/N S   s[k]  cos( k f arg)  + j sin( k f arg)   for k=0..N-1, for f= 0..N-1

Multiply by the signal component assuming its samples are stored in an array mData:
   mData[k] * Complex(cos(k * f * arg), sin(k * f * arg));
If the result is held in an array result[]:
  result[f] += mData[k] * Complex(cos(k * f * arg), sin(k * f * arg));

The indicies

Now add the loops for the two indicies:
  for (long f = 0; f < NumSamples(); f++)
    for (long k = 0; k < NumSamples(); k++)
      result[f] += mData[k] * Complex(cos(k * f * arg), sin(k * f * arg));
The indicies range from 0 to N-1 and are integers. Because of this the multiplication k*f will take on the same values as k and f range across 0..N-1. For example if N = 4:
 f  k  f*k
 0  0   0
 0  1   0
 0  2   0
 0  3   0
 1  0   0
 1  1   1
 1  2   2
 1  3   3
 2  0   0
 2  1   2
 2  2   4
 2  3   6
 3  0   0
 3  1   3
 3  2   6
 3  3   9
In fact, k*f only has 7 values 0, 1, 2, 3, 4, 6, 9. Here's a table comparing N to the number of unique values generated by k*f:
     #unique
  N   values
  2     2
  3     4
  4     7
  5    10
  6    15
  7    19
  8    26
  9    31
A large number of k and f combinations result in k*f equal to 0. We can speed up the algorithm by detecting when f or k is zero, and therefore k*f is zero, and using:

    cos(0) = 1
    sin(0) = 0


      if (f == 0 || k == 0)
         result[f] += mData[k] * Complex(1.0, 0.0);
      else
         result[f] += mData[k] * Complex(cos(k * f * arg), sin(k * f * arg));
Further increases in speed can be found by using the Fast Fourier Transform (see links below).

The rest

Assuming that we can manipulate the result array as an entity, divide by the number of samples:
    result /= (double) NumSamples();
Using a parameter we can switch between the forward and inverse DFT's.
  double arg = -direction * 2.0 * M_PI / (double) NumSamples();
  //...
  if (direction == 1)
    result /= (double) NumSamples();

Putting it Together

The whole function looks like this:
//direction =  1 => forward Transform
//direction = -1 => inverse Transform
void Signal::DFT(short int direction)
  {
  ComplexVector result;
  result.resize(NumSamples());

  double arg = -direction * 2.0 * M_PI / (double) NumSamples();
  for (long f = 0; f < NumSamples(); f++)
    for (long k = 0; k < NumSamples(); k++)
      if (f == 0 || k == 0)
         result[f] += mData[k] * Complex(1.0, 0.0);
      else
         result[f] += mData[k] * Complex(cos(k * f * arg), sin(k * f * arg));

  if (direction == 1)
    result /= (double) NumSamples();

  mData = result;
  }

What the results mean

The value of result[0] is the amplitude of the bias or DC (direct current) component, to use electronic terms. It is the offset above or below zero that the signal's vibrates around. If the signal is all pure sine waves, the offset is zero and result[0] will be 0.0. But if the signal is offset slightly then result[0] will be non-zero.

The remainder of the array is information about the frequencies in the original signal. The frequency information is duplicated so only the first half of the array needs to be examined.

For each array item:

Note that the frequencies in the result array are an arithmetic progression. Each index i is related to i+1 by adding a constant value to it. The relationship between the input signal samples and the resulting frequencies in the result array is via the sampling rate.

Converting to C++

The following code assumes that there is a class Complex that implements a complex number, and that there is a function fmtdouble() that formats a double ready to print.

double Complex::length() const
  {
  return sqrt(real() * real() + imag() * imag());
  }
double Complex::direction() const
  {
  return atan(imag() / real());
  }

void Signal::printFreqDomain(const string& title)
  {
  cout << title << endl;
  cout << "Bias: ampl="
    << fmtdouble(mData[0].real())
    << endl;
  double arg = (double) NumSamples() / (double)mSampleRate;
  for(long i = 1; i < NumSamples() / 2 ; ++i)
    {
    double ampl = 2.0 * mData[i].length();
    double phase = mData[i].direction();
    cout << fmtdouble(i  / arg)
      << " "
      << fmtdouble(ampl)
      << " "
      << fmtdouble(phase)
      << endl;
    }
  }

Links

The source code above can be found here:
http://www.arrizza.com/downloads/signalprocessing/signalprocessing.html

A good reference for the DFT is here:
http://astronomy.swin.edu.au/~prourke/analysis/dft