Abstract
Assessing child growth in terms of speech and language is a crucial indicator of long-term learning ability and lifelong progress. Because the preschool classroom offers a rich opportunity to monitor growth in young children's interactions, the analysis of such data has gained prominence among early childhood researchers. The first task in any analysis of such naturalistic recordings is parsing and tagging the interactions between adults and young children. An automated tagging system provides child interaction metrics and is important for any further processing. This study investigates the language environment of 3-to-5-year-old children using a CRSS-based diarization strategy with an i-vector-based baseline that captures rapid adult-to-child and child-to-child conversational turns in a naturalistic, noisy early childhood setting. We analyze various loss functions and learning algorithms for Deep Neural Networks that separate child speech from adult speech. Performance is measured in terms of diarization error rate and Jaccard error rate, and the system shows good results in tagging adult versus child speech. Distinguishing the primary child from secondary children is useful for monitoring a given child, and we provide an analysis of this distinction as well. Our diarization system provides insights into the direction for pre-processing and analyzing challenging naturalistic daylong child speech recordings.
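The two metrics named in the abstract can be illustrated with a minimal frame-level sketch. This is not the authors' scoring pipeline: it assumes one label per fixed-length frame, no overlapping speech, no scoring collar, and class labels (for example `"adult"` and `"child"`) that are already aligned between reference and hypothesis, so no speaker mapping step is needed. The function names `frame_der` and `frame_jer` are hypothetical.

```python
from typing import Optional, Sequence

# None marks a non-speech frame; any string is a speaker class label.
Label = Optional[str]


def frame_der(ref: Sequence[Label], hyp: Sequence[Label]) -> float:
    """Frame-level diarization error rate (simplified, no overlap, no collar).

    DER = (missed speech + false alarm + speaker confusion) / reference speech.
    """
    if len(ref) != len(hyp):
        raise ValueError("reference and hypothesis must have the same length")
    pairs = list(zip(ref, hyp))
    missed = sum(r is not None and h is None for r, h in pairs)
    false_alarm = sum(r is None and h is not None for r, h in pairs)
    confusion = sum(
        r is not None and h is not None and r != h for r, h in pairs
    )
    total_speech = sum(r is not None for r in ref)
    if total_speech == 0:
        raise ValueError("reference contains no speech frames")
    return (missed + false_alarm + confusion) / total_speech


def frame_jer(ref: Sequence[Label], hyp: Sequence[Label]) -> float:
    """Frame-level Jaccard error rate, averaged over reference speakers.

    For each reference speaker: 1 - |ref ∩ hyp| / |ref ∪ hyp|.
    Assumes hypothesis labels already use the reference label set.
    """
    if len(ref) != len(hyp):
        raise ValueError("reference and hypothesis must have the same length")
    speakers = sorted({r for r in ref if r is not None})
    if not speakers:
        raise ValueError("reference contains no speech frames")
    pairs = list(zip(ref, hyp))
    errors = []
    for spk in speakers:
        intersection = sum(r == spk and h == spk for r, h in pairs)
        union = sum(r == spk or h == spk for r, h in pairs)
        errors.append(1.0 - intersection / union)
    return sum(errors) / len(errors)
```

Unlike DER, which is dominated by whoever speaks most, this per-speaker averaging weights each class equally, which matters when adult speech far outweighs child speech in a recording.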
Original language | English |
---|---|
Pages | 89-93 |
Number of pages | 5 |
State | Published - 2019 |
Event | 8th ISCA Workshop on Speech and Language Technology in Education, SLaTE 19 - Graz, Austria; Duration: Sep 20 2019 → Sep 21 2019 |
Conference
Conference | 8th ISCA Workshop on Speech and Language Technology in Education, SLaTE 19 |
---|---|
Country/Territory | Austria |
City | Graz |
Period | 9/20/19 → 9/21/19 |
Bibliographical note
Publisher Copyright: © SLaTE 2019. All rights reserved.
Keywords
- Deep Neural Networks
- TO-Combo SAD
- child speech diarization
- i-Vectors
- naturalistic environment
- speech activity detection
ASJC Scopus subject areas
- Computer Science (miscellaneous)
- Computational Mathematics
- Education