A free tool to transform modal voice into irregular voice

Version 0.9.1 - released on June 26, 2008

Overview
Download
Getting started

Overview

Glottalizer is a program that can introduce irregular pitch periods (also known as glottalization, laryngealization, creak, vocal fry, etc.) into a speech signal. The graphical user interface was developed by Nicolas Audibert (GIPSA-lab Speech & Cognition Dept) while the transformation method is the work of Tamás Bőhm (BME TMIT SpeechLab). Glottalizer is freely available to download as a binary for academic and personal use. If you would like to use it for other purposes or you would like to get the source code, please contact us by e-mail. Please also send us any feedback, questions or bug reports:
bohm /ae t/ tmit /d aw t/ bme /d aw t/ hu
Nicolas /d aw t/ Audibert /ae t/ gipsa-lab /d aw t/ inpg /d aw t/ fr

You can find more information about the transformation method itself and its evaluation in our Acoustics08 paper "Transforming modal voice into irregular voice by amplitude scaling of individual glottal cycles" (authors: Tamás Bőhm, Nicolas Audibert, Stefanie Shattuck-Hufnagel, Géza Németh and Véronique Aubergé). Detailed instructions on how to use the program can be found here and in the download package.

Please refer to this program as:
Bőhm, Tamás and Audibert, Nicolas (2008). Glottalizer (Version 0.9.1) [Computer program]. Retrieved June 26, 2008 [update to the date when you downloaded it], from http://www.bohm.hu/glottalizer.html

Glottalizer is licensed under the Creative Commons Attribution-Noncommercial-Share Alike 2.5 License (by-nc-sa-2.5). Visit http://creativecommons.org/licenses/by-nc-sa/2.5/hu/deed.en for details. There is absolutely no warranty for Glottalizer. The program was written and compiled in MATLAB®, © 1984 - 2008 The MathWorks, Inc. The Matlab Component Runtime bundled with the application is the property of The MathWorks, Inc. and must not be distributed separately.

Download

Glottalizer was written in Matlab and compiled for Windows using the Matlab Compiler, so that it can be used without a Matlab licence. To run it, the Matlab Component Runtime (MCR) must be installed on your computer.

If you already have MCR installed on your computer, you will only need to download the Glottalizer light install package (552 KB ZIP archive) and unpack it to a separate folder. You will then be ready to run Glottalizer (see Getting started).

If you have not installed MCR yet, you have to download the Glottalizer full install package (173 MB) and install MCR as follows:

  • Download the .zip package and unpack it into a separate folder.
  • Run MCRInstaller.exe to install the Matlab Component Runtime (MCR). You can use the default settings everywhere.
  • If the installer is interrupted (saying something like "The wizard was interrupted before MATLAB Component Runtime 7.7 could be completely installed." or "Error 1304 writing to file MWArray.dll" in the last dialog), install the Microsoft Visual C++ Redistributable first and then try installing MCR again. If the MCR installer asks about the .NET framework, just select OK (Glottalizer does not require it).
  • If the MCR installation was successful, you should restart the computer and then you are ready to run Glottalizer (see Getting started).

We have to warn you that:

  • This is only a preliminary version, meaning that it has not been thoroughly tested yet and thus some features may not work properly. If you find such bugs, please let us know (see our e-mail addresses above in the Overview).
  • The documentation is not comprehensive. For any question about how to use Glottalizer, please contact us by email.
  • As of now, the compiled version is available for Windows only. To run Glottalizer on another operating system, please contact us to get the source code (you will need to have MATLAB installed).

Getting started

The bottom panel displays the waveform of the recording to be manipulated. The top panel depicts the model waveform that can be used to guide the transformation (either manually or by copying its pulse pattern); for creating irregular phonation, a model recording that contains irregular phonation can be loaded into this panel. Note that the model recording cannot be manipulated.

In order to open a wave file in either of the two panels, a corresponding pitch mark file must also be available (e.g. in Praat .PointProcess format). Pitch marks should be positioned approximately at the glottal closure instants, i.e. near the largest peak in a glottal cycle. This peak is usually negative but, if the waveform is inverted, it is positive. The pitch marks can be overlaid on the waveform. After switching to ‘Edit pitchmarks’ mode with the corresponding button, pitch marks can also be edited (added by left mouse clicks and removed by right clicks) and saved.

In the bottom panel, individual periods can be removed (scaled to zero) by right-clicking with the mouse around the corresponding pitch mark. A second right-click brings the period back to its original form (i.e. resets the scaling factor to 1). Periods can be boosted and attenuated by left-clicking around the pitch mark, with the vertical position of the mouse pointer determining the new peak value (and thus the scaling factor). The applied scaling factors are shown above the manipulated waveform, and can be saved in a separate file that can be reloaded later. The first and last periods of a voiced region cannot be manipulated for technical reasons and thus the corresponding pitch marks are drawn in red.
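The per-period scaling described above can be illustrated with a minimal Python sketch. It is not the program's actual code (Glottalizer is written in Matlab). The function names, the choice of period boundaries (midpoints between neighbouring pitch marks), the search window around a pitch mark and the use of absolute peak values are all assumptions made for illustration. The sketch also applies the factors abruptly, without any smoothing at the period boundaries.

```python
def scale_periods(samples, pitch_marks, factors):
    """Return a copy of `samples` with each interior period scaled.

    samples     -- list of float sample values
    pitch_marks -- sorted sample indices of the pitch marks
    factors     -- one factor per pitch mark (1.0 = unchanged, 0.0 = removed)

    Assumption: the period of pitch mark i runs from the midpoint between
    marks i-1 and i to the midpoint between marks i and i+1. The first and
    last marks lack a neighbour on one side, so this sketch leaves them as is.
    """
    if len(factors) != len(pitch_marks):
        raise ValueError("need exactly one scaling factor per pitch mark")
    out = list(samples)
    for i in range(1, len(pitch_marks) - 1):
        start = (pitch_marks[i - 1] + pitch_marks[i]) // 2
        end = (pitch_marks[i] + pitch_marks[i + 1]) // 2
        for n in range(start, end):
            out[n] = samples[n] * factors[i]
    return out


def factor_for_target_peak(samples, mark, target_peak, half_window):
    """Scaling factor that brings the peak near `mark` to `target_peak`.

    Assumption: the peak is the sample with the largest absolute value
    within `half_window` samples of the pitch mark, and only magnitudes
    are compared, so the polarity of the period is preserved.
    """
    lo = max(0, mark - half_window)
    hi = min(len(samples), mark + half_window + 1)
    peak = max(samples[lo:hi], key=abs)
    if peak == 0:
        return 1.0  # nothing to scale in a silent period
    return abs(target_peak) / abs(peak)
```

In this sketch, a factor of 0.0 corresponds to removing a period with a right-click, and resetting the factor to 1.0 restores the original samples because the scaling is always computed from the unmodified input.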

The transformation can also be carried out by copying a ‘stylized’ pulse pattern. For details, see this excerpt from our Acoustics08 paper:

When the scaling factors are set by pattern copying, one has to select both the regular region to be manipulated in the signal and the irregular region to be copied. Then a ‘stylized’ pulse pattern is extracted from the selected irregular region, consisting of the scaling factors to be used in transforming the regular region (i.e. not the absolute pulse positions and amplitudes). This stylized pulse pattern is initially constructed as a vector containing the relative amplitudes of the glottal pulses in the sample irregular region. The amplitude of each period is measured as the peak amplitude (either positive or negative) around the pitch mark. The values in the stylized pulse pattern are expressed relative to the mean amplitude of some regular periods preceding the irregular region.

When an irregular cycle is substantially longer than a reference cycle length (e.g. two or three times or more than the reference, T0_ref, that is calculated as the mean of some preceding regular cycles), zeros are inserted in the stylized pulse pattern since, at these points, periods need to be removed from the regular recording. The number of zeros to be inserted between two consecutive scaling values (i.e. the number of periods to be removed) is determined by the rounded ratio of the actual cycle length to the reference: n_i = round(T0_i / T0_ref). Cycle lengths are measured as time differences between consecutive pitch marks. The number of periods used to calculate the reference cycle length and reference amplitude is 5 by default, but can be set as a parameter.

To copy a pattern in the program, one has to select the region in the model waveform that the target pulse pattern is to be extracted from, and the region in the bottom panel where the pattern is to be applied. When there is a selection in both panels and there are enough pitch marks preceding the model selection to calculate the reference values, the pattern can be copied automatically with the corresponding pushbutton.
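The extraction of a stylized pulse pattern, as described in the excerpt above, can be sketched in Python as follows. This is an illustrative reading of the text, not the program's own code. The function names, the `half_window` search range around each pitch mark, and the way the reference periods are indexed are assumptions. In particular, the excerpt only states that the number of zeros is determined by n_i = round(T0_i / T0_ref). The sketch assumes that a long cycle spans n_i regular periods, so that n_i - 1 zeros are inserted.

```python
def peak_amplitude(samples, mark, half_window):
    """Largest absolute sample value within `half_window` of a pitch mark."""
    lo = max(0, mark - half_window)
    hi = min(len(samples), mark + half_window + 1)
    return max(abs(s) for s in samples[lo:hi])


def stylized_pattern(samples, pitch_marks, first, last, n_ref=5, half_window=2):
    """Extract scaling factors from the irregular region of a model signal.

    pitch_marks[first] .. pitch_marks[last] (inclusive) are the pulses of
    the selected irregular region. The `n_ref` pulses just before `first`
    serve as the regular reference (5 by default, as in the text).
    """
    if first <= n_ref:
        raise ValueError("not enough pitch marks before the selection")

    ref = range(first - n_ref, first)
    # Reference amplitude: mean peak amplitude of the preceding regular pulses.
    amp_ref = sum(peak_amplitude(samples, pitch_marks[k], half_window)
                  for k in ref) / n_ref
    # Reference cycle length: mean distance between consecutive pitch marks.
    t0_ref = sum(pitch_marks[k] - pitch_marks[k - 1] for k in ref) / n_ref

    pattern = []
    for k in range(first, last + 1):
        amp = peak_amplitude(samples, pitch_marks[k], half_window)
        pattern.append(amp / amp_ref)
        if k < last:
            t0 = pitch_marks[k + 1] - pitch_marks[k]
            n = int(t0 / t0_ref + 0.5)  # round half up
            # Assumption: a cycle spanning n regular periods removes n - 1.
            pattern.extend([0.0] * max(n - 1, 0))
    return pattern
```

The returned list would then be used as the scaling factors for consecutive periods of the selected regular region, with each 0.0 marking a period to be removed.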

To ease the transformation process, the program also has standard sound editor features, such as zooming and scrolling in the waveforms, playing either the original or the manipulated recording, saving the transformed sound, and undo/redo.

More comprehensive documentation is included in the downloadable Glottalizer package, and can also be found here.
