Kun-Ching Wang
Scientific Tracks Abstracts: J Inform Tech Soft Engg
This paper presents a wavelet-based speech/music classification method using the spectrogram image feature (SIF). The SIF can
efficiently reflect the visual signature characteristics from the sound’s time-frequency representation. First, the input audio/
speech sound is decomposed by the wavelet packet transform into different subband levels. Through subband selection, we
keep only the subbands that contain rich texture information and use them as features for this discrimination problem. Finally,
a support vector machine (SVM) is used to classify each segment as a speech segment or an audio segment.
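The decomposition and subband selection steps described above can be sketched roughly as follows. This is a minimal illustration only, not the author's implementation: the Haar filter, the energy-based selection criterion, and the function names haar_step, wavelet_packet, and select_subbands are all assumptions, since the abstract does not specify the wavelet or how texture-rich subbands are chosen. The SVM stage is omitted.

```python
import math

def haar_step(signal):
    """Split an even-length signal into Haar approximation and detail halves."""
    s = 1.0 / math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal) - 1, 2)]
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal) - 1, 2)]
    return approx, detail

def wavelet_packet(signal, levels):
    """Full wavelet packet tree: every band is split again at each level.

    Returns the 2**levels leaf subbands in natural (not frequency) order.
    """
    bands = [list(signal)]
    for _ in range(levels):
        next_bands = []
        for band in bands:
            approx, detail = haar_step(band)
            next_bands.extend([approx, detail])
        bands = next_bands
    return bands

def select_subbands(bands, keep):
    """Return indices of the `keep` highest-energy subbands, plus all energies.

    Energy is an assumed stand-in for the abstract's "rich texture" criterion.
    """
    energies = [sum(x * x for x in band) for band in bands]
    ranked = sorted(range(len(bands)), key=lambda i: energies[i], reverse=True)
    return sorted(ranked[:keep]), energies
```

In a fuller pipeline, features computed from the selected subbands would form the vector passed to the SVM classifier.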
Kun-Ching Wang is a faculty member in the Department of Information Technology & Communication at Shih Chien University, Taiwan.