Harma Syllable Segmentation

Segments a signal stored in a WAV file into individual syllables.
Updated 17 Jan 2014

HARMASYLLABLESEG - Segments a signal stored in a WAV file into individual
syllables. Also graphs the spectrogram and signal with syllables
highlighted in red to show what parts of the signal contain syllables.

INPUT:

- SIGNAL: mono signal in a one dimensional array
- FS: Sampling frequency of mono signal
The following arguments are passed to the spectrogram function. Type
'help spectrogram' for more information on WINDOW, NOVERLAP, and NFFT.
- WINDOW: Either an integer value N or the coefficients of a window function
stored in a length-N vector. If an integer is passed, a default Hamming
window of length N is applied to each segment of the signal.
- NOVERLAP: Number of samples by which adjacent segments of the signal overlap.
- NFFT: Number of points used to calculate the DFT (discrete Fourier
transform) of each segment. This may be greater than the window length,
in which case each segment is zero padded to the NFFT length.
- MINDB: Stopping criterion T (in dB) as defined in the original paper by Harma.
A good default value for this parameter is ~20 dB.
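
As a rough guide to how WINDOW, NOVERLAP, and NFFT shape the spectrogram, the sketch below computes the expected number of time segments and frequency bins for a real-valued signal, following the sizing rules given in the MATLAB spectrogram documentation. The signal length and sampling rate are made-up values for illustration; the window, overlap, and DFT lengths mirror the usage example.

```matlab
% Illustrative values only (the signal length and rate are assumptions).
fs       = 44100;      % sampling frequency in Hz
nSamples = 2 * fs;     % a hypothetical 2-second mono signal
winLen   = 512;        % window length N, e.g. kaiser(512)
noverlap = 128;        % samples shared by adjacent segments
nfft     = 1024;       % DFT length; each 512-sample segment is zero padded to 1024

% Each new segment starts (winLen - noverlap) samples after the previous one.
hop = winLen - noverlap;

% Number of segments, i.e. columns of the spectrogram matrix S.
numSegments = fix((nSamples - noverlap) / hop);

% One-sided spectrum of a real signal: nfft/2 + 1 bins when nfft is even.
if mod(nfft, 2) == 0
    numFreqBins = nfft/2 + 1;
else
    numFreqBins = (nfft + 1)/2;
end

% Spacing between frequency bins and between consecutive segments.
freqResolutionHz = fs / nfft;
timeStepSec      = hop / fs;

fprintf('%d segments, %d frequency bins\n', numSegments, numFreqBins);
fprintf('%.2f Hz per bin, %.2f ms per segment\n', freqResolutionHz, 1000*timeStepSec);
```

A larger NFFT gives finer frequency spacing without changing the number of segments, while a larger NOVERLAP shortens the time step and so increases the number of segments.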

OUTPUT:
- SYLLABLES: A struct array. Each struct represents a single syllable and contains the following fields:
    - SIGNAL: A 1-dimensional array of doubles holding the values of the signal over the range of this syllable.
    The following fields are stored in the order:
    [Peak Peak-1 Peak-2...Peak+1 Peak+2...]
    - SEGMENTS: The spectrogram index of each segment in this syllable.
    - TIME: The time domain values of this syllable.
    - FREQS: Peak frequency found in each segment.
    - AMPS: Amplitude of each peak frequency.
- FS: Sampling frequency of signal in the WAV file.
- S: The spectrogram of the signal in the WAV file.
- F: Frequency bins used in FFT.
- T: Time domain values of each segment in the spectrogram.
- P: Power spectral density of each segment in the spectrogram.
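
The peak-first ordering of the syllable fields follows from how the segmentation described in Harma's paper proceeds: find the loudest remaining point in the spectrogram, trace backward and then forward until the level drops MINDB below that peak, remove the traced segments, and repeat until the best remaining peak is MINDB below the first one. The following is a minimal sketch of that loop on a tiny invented magnitude matrix. It is not the submission's code: the variable names are hypothetical, and tracing the per-column maximum is a simplifying assumption.

```matlab
% Synthetic magnitude "spectrogram": rows are frequency bins, columns are segments.
% Two bursts of energy over a quiet floor; all values are invented.
mag = 0.001 * ones(8, 12);
mag(3, 2:4)  = [0.5 1.0 0.6];    % louder syllable around segment 3
mag(6, 8:10) = [0.2 0.4 0.3];    % quieter syllable around segment 9

minDB = 20;                       % stopping criterion T in dB
syllables = struct('segments', {}, 'freqs', {}, 'amps', {});
firstPeakDB = [];

while true
    % Strongest bin in each segment, then the strongest segment overall.
    [colMax, rowIdx] = max(mag, [], 1);
    [peakMag, t0] = max(colMax);
    peakDB = 20*log10(peakMag);

    if isempty(firstPeakDB)
        firstPeakDB = peakDB;     % the first (loudest) peak sets the reference level
    end
    % Stop once the best remaining peak is more than T dB below the first peak.
    if peakDB < firstPeakDB - minDB
        break;
    end

    segs = t0;  freqs = rowIdx(t0);  amps = peakDB;

    % Trace backward (Peak-1, Peak-2, ...) while within T dB of this peak.
    t = t0 - 1;
    while t >= 1 && 20*log10(colMax(t)) >= peakDB - minDB
        segs(end+1) = t;  freqs(end+1) = rowIdx(t);  amps(end+1) = 20*log10(colMax(t));
        t = t - 1;
    end

    % Trace forward (Peak+1, Peak+2, ...) under the same condition.
    t = t0 + 1;
    while t <= size(mag, 2) && 20*log10(colMax(t)) >= peakDB - minDB
        segs(end+1) = t;  freqs(end+1) = rowIdx(t);  amps(end+1) = 20*log10(colMax(t));
        t = t + 1;
    end

    syllables(end+1) = struct('segments', segs, 'freqs', freqs, 'amps', amps);

    % Delete this syllable's segments so the next pass finds a different peak.
    mag(:, segs) = 0;
end

fprintf('Found %d syllables\n', numel(syllables));
```

In this sketch, raising minDB lets quieter peaks qualify as syllables and extends each trace further into the surrounding signal, while lowering it keeps only the loudest, most compact events.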

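Usage Example (SIGNAL and FS are read from the WAV file first, here with audioread):
[signal,fs] = audioread('[Path To WAV File]');
[syllables,FS,S,F,T,P] = harmaSyllableSeg(signal,fs,kaiser(512),128,1024,20);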
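
Because the per-segment fields are stored peak first, a likely follow-up step is to put them in chronological order before plotting or extracting features. The sketch below does this for one hand-made struct. The lower-case field names are assumed from the OUTPUT list and the values are invented, so the struct returned by harmaSyllableSeg may name or shape its fields differently.

```matlab
% Hypothetical syllable in the documented order [Peak Peak-1 Peak-2 Peak+1 Peak+2].
syl.segments = [10 9 8 11 12];               % spectrogram column indices
syl.time     = [0.50 0.45 0.40 0.55 0.60];   % seconds
syl.freqs    = [3200 3100 3000 3300 3250];   % Hz
syl.amps     = [-5 -12 -20 -9 -18];          % dB

% Sorting the segment indices yields the permutation to apply to every field.
[sortedSegments, order] = sort(syl.segments);

chrono.segments = sortedSegments;
chrono.time     = syl.time(order);
chrono.freqs    = syl.freqs(order);
chrono.amps     = syl.amps(order);

% Time spanned by the syllable, and where the peak lands in chronological order.
durationSec  = chrono.time(end) - chrono.time(1);
peakPosition = find(order == 1);             % element 1 of the original was the peak

fprintf('Syllable spans %.2f s; peak is segment %d of %d\n', ...
        durationSec, peakPosition, numel(order));
```

Keeping the permutation in a separate variable means the same reordering can be applied to any other per-segment field without sorting again.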

References:
1) Harma, A., "Automatic identification of bird species based on sinusoidal modeling of syllables,"
Proceedings of the 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '03),
vol. 5, pp. V-545-8, 6-10 April 2003.
doi: 10.1109/ICASSP.2003.1200027
URL: http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=1200027&isnumber=26996
2) Lee, C. H., "Automatic Recognition of Bird Songs Using Cepstral Coefficients,"
Journal of Information Technology and Applications, vol. 1, no. 1, pp. 17-23, 2006.
URL: http://140.126.5.184/Jita_web/publish/vol1_num1/05-20050044-text-sec.pdf

Script By: Michael Lindemuth at the University of South Florida

The code below may be licensed according to the BSD license.
For a copy see: http://www.opensource.org/licenses/bsd-license.php

Cite As

Michael Lindemuth (2024). Harma Syllable Segmentation (https://www.mathworks.com/matlabcentral/fileexchange/29261-harma-syllable-segmentation), MATLAB Central File Exchange. Retrieved .

MATLAB Release Compatibility
Created with R2010a
Compatible with any release
Platform Compatibility
Windows macOS Linux

Version    Release Notes
1.3.0.0    Updated file to make it independent from how the sound file is read.
1.0.0.0