The Superposition of Functional Contours (SFC) model is based on neural network contour generators (CGs), each responsible for encoding one linguistic function. Four rhythmic unit (RU) position ramps are input to the CGs that describe the absolute and relative position of the current RU within the function’s scope. The CGs then output the prosodic contour for that RU, as shown in Fig. 1.
In the standard SFC, and in all of the ProsoDeep models, the prosody includes a number of pitch targets spread across the vowel nucleus of the RU, and a duration expansion coefficient for the whole RU [sfc]. The prosody contour of the utterance is then obtained by overlapping and adding these elementary contours. CGs are trained using an analysis-by-synthesis loop that distributes the error at each iteration, shown in Fig. 2.
Fig. 2 — The SFC analysis-by-synthesis loop used to train the contour generators by redistributing the error for each RU among the contributing CGs. Taken from [sfc].
PySFC is a Python implementation of the SFC model that was created with two goals: i) to make the SFC more accessible to the scientific community, and ii) to serve as a foundation for future improvements of the prosody model. The PySFC also implements a minimum set of tools necessary to make the system self-contained and fully functional [code].
Here are a few example plots with PySFC just to show case what it can do. The plotted files are included as examples in the
examples/ directory. All of them are taken from our PySFC paper [pysfc].
We validated the usability of the PySFC through confirming the established carry-over phenomenon present in Chinese tones [pysfc-tones]. We observed that including the succeeding rhythmic unit within the tonal function scope, makes a statistically significant impact on modelling performance. On the other hand, including the preceding unit, does not. The carry-over effect can clearly be seen in the juxtaposition of the four tonal function contours in Fig. 7.