Catching Words in a Stream of Speech
Computational simulations of segmenting transcribed child-directed speech
Segmenting continuous speech into lexical units is one of the early tasks an infant needs to tackle during language acquisition. This thesis investigates this particular problem, segmentation, through computational modeling and simulation.
The segmentation problem is more difficult than it may appear at first sight. Children need to find words in a continuous stream of speech, with no knowledge of words to start with. Fortunately, experimental studies reveal that children and adults exploit a number of cues in the input, using simple strategies, to segment speech. More interestingly, some of these cues are language-independent, allowing a learner to segment the continuous input before knowing any words.
Two major aspects set the models presented in this thesis apart from other computational models in the literature. First, the models presented here use simple local strategies---as opposed to global optimization---that rely on cues known to be used by children, namely predictability statistics, phonotactics, and lexical stress. Second, these cues are combined using an explicit cue-combination model which can easily be extended to include more cues.
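To make the predictability cue concrete, the sketch below illustrates one simple local strategy of the kind described above: posit a word boundary wherever the forward transitional probability between adjacent syllables drops below a threshold. The threshold value, the syllable representation, and the function name are illustrative assumptions for this sketch, not the actual parameters or implementation of the thesis's models.

```python
from collections import Counter

def segment_by_transitional_probability(utterances, threshold=0.5):
    """Segment syllable sequences by positing a boundary wherever the
    forward transitional probability P(next | current) falls below a
    threshold. Threshold and representation are illustrative choices."""
    # Estimate syllable unigram and bigram counts from the whole corpus.
    unigrams = Counter()
    bigrams = Counter()
    for utt in utterances:
        unigrams.update(utt)
        bigrams.update(zip(utt, utt[1:]))

    segmented = []
    for utt in utterances:
        words, current = [], [utt[0]]
        for a, b in zip(utt, utt[1:]):
            tp = bigrams[(a, b)] / unigrams[a]  # P(b | a)
            if tp < threshold:  # low predictability suggests a boundary
                words.append(current)
                current = [b]
            else:
                current.append(b)
        words.append(current)
        segmented.append(words)
    return segmented
```

On a toy corpus where syllables within a word always co-occur but word order varies, within-word transitions get high probability and cross-word transitions get low probability, so boundaries fall between words.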
The models are tested using real-world transcribed child-directed speech. The simulation results show that the performance of the individual strategies is comparable to that of state-of-the-art computational models of segmentation. Furthermore, combinations of individual cues provide a consistent increase in performance. The combined model performs on a par with the reference state-of-the-art model, while employing only mechanisms more similar to those available to humans performing the same task.