## Python: Peak finding for FFTd 1D data

This peak finding method has been developed for the kind of data that is the result of a fast Fourier transform (FFT). It may also work for other data but hasn't been tested that way.

## Demonstration

Let's assume datafftok is a one dimensional vector (a list in Python) and represents the fast Fourier transform of e.g. a recorded sound.

Here as an example we've got the sound of a coin dropping on the floor. In the frequency spectrum (which conforms to the Fourier transform of the original sound file) you can clearly see the two peaks on the right which mark the main resonant frequencies of the coin.

The goal of the program is to identify those two distinctive peaks automatically. This will be achieved by the following Python code structured into 4 parts annotated at the bottom. As the left side of the spectrum turns out to be chaotic and irrelevant it should be ignored. That's why we set the variable leftborder to about 20000.

#*** Settings ***
# Read the functionality section in order to understand what the variables mean.
leftborder = 20000 # 0 when the whole spectrum is analysed

front1 = 0
back1 = 1
size1 = 10000

front2 = 3
back2 = 4
size2 = 350000
tolerance = 2

attempts = 10000
mindistance = 100

minheight = 2000000

#*** Peak finding ***
peaks = list()
peaksfreq = list()

#STEP 1 ----------------------------------------------------------------------
for i in range(front1 + leftborder,len(datafftok) - 1 - back1):
fronttrue1 = True;
for a in range(i - front1, i + 1):
if (datafftok[a] < datafftok[a - 1]):
fronttrue1 = False;
backtrue1 = True;
for a in range(i + 1, i + back1 + 1):
if (datafftok[a] > datafftok[a - 1]):
backtrue1 = False;
if (backtrue1 == True and fronttrue1 == True and((abs(datafftok[i - front1] - datafftok[i]) + abs(datafftok[i + back1] - datafftok[i]))/2) > size1):
peaks.append(datafftok[i])
peaksfreq.append(i)

#STEP 2 ----------------------------------------------------------------------
peaks2 = list()
peaksfreq2 = list()
for i in range(front2,len(peaks) - 1 - back2):
fronttrue2 = 0;
for a in range(i - front2, i + 1):
if (peaks[a] < peaks[a - 1]):
fronttrue2 = fronttrue2 + 1;
backtrue2 = 0;
for a in range(i + 1, i + back2 + 1):
if (peaks[a] > peaks[a - 1]):
backtrue2 = backtrue2 + 1;
if (backtrue2 <= tolerance and fronttrue2 <=tolerance and((abs(peaks[i - front2] - peaks[i]) + abs(peaks[i + back2] - peaks[i]))/2) > size2):
peaks2.append(peaks[i])
peaksfreq2.append(peaksfreq[i])

#STEP 3 ----------------------------------------------------------------------
for i in range(0,attempts):
r = random.randint(zero_offset, len(datafftok) - mindistance)
fmin = r
fmax = fmin + mindistance
candidates = list()
for h in range(0,len(peaks2)):
if peaksfreq2[h] > fmin and peaksfreq2[h] <= fmax:
candidates.append(h)
maxindex = 0
if (len(candidates) > 1):
for a in range(0, len(candidates) ):
if peaks2[candidates[a]] > peaks2[maxindex]:
maxindex = candidates[a]
for a in range(0, len(candidates)):
if candidates[a] != maxindex:
peaks2[candidates[a]] = 0
peaksfreq2[candidates[a]] = 0
peaks3 = peaks2
peaksfreq3 = peaksfreq2

#STEP 4 ----------------------------------------------------------------------
peaks4 = list()
peaksfreq4 = list()
for i in range(0,len(peaks3)):
if peaks3[i] > minheight:
peaks4.append(peaks3[i])
peaksfreq4.append(peaksfreq3[i])

#Output
for h in range(0,len(peaks4)):
print str(peaksfreq4[h]) + "   " + str(peaks4[h])

#Clean up
peaks = peaks4
pekasfreq = peaksfreq4


The output looks like this and proves that the the peak finding has been successful.

27499   3455469.95281
28102   2055843.92659

## Functionality

This peak finding method takes the Fourier transformed data and performs four steps on it in total. Zooming into the spectrum shown above you'll recognize that it consists out of narrow variously sized micro-peaks. This kind of texture is typical for FFTs and can be systematically utilized for the detection of peaks.

Step 1

Sorts out all data points which don't have neighbours ordered monotonically increasing on the left and monotonically decreasing on the right. The number of neighbouring points needed matching this criterium is specified with the variables back1 respectively front1size1 presets a minimum averaged height between the central data point and the farthermost data point taken into account on both sides.

Step 2

This step finds macro-peaks consisting of micro-peaks. It's equivalent to step 1 and performs the sorting again, this time on the data points left after step 1. The corresponding variables are back2, front2 and size2. An additional variable named tolerance defines the number of exceptions allowed in monotonicity without the data point being sorted out.

Step 3

Represents a Monte Carlo method and aims to bundle macro-peaks lying close together into something that can be considered as a real peak. The minimum distance between two peaks is given by the variable mindistance. Step 3 randomly chooses an interval with length mindistance and looks which macro-peak found within ist the tallest one. This macro-peak survives, the others are set to zero and will be removed in step 4. This optimization sequence has to be repeated plenty of times depending on the size of the data set. The variable attempts specifies the number of repetitions.

Step 4

Removes all remaining candidates whose height falls below the value of the variable minheight.

2016-02-25 12:28:17