Tagging child-adult interactions in naturalistic, noisy, daylong school environments using i-vector based diarization system

Prasanna V. Kothalkar, Dwight Irvin, Ying Luo, Joanne Rojas, John Nash, Beth Rous, John H.L. Hansen

Research output: Contribution to conference › Paper › peer-review

5 Scopus citations

Abstract

Assessing child growth in terms of speech and language is a crucial indicator of long-term learning ability and lifelong progress. Since the preschool classroom provides a potent opportunity for monitoring growth in young children's interactions, analyzing such data has come into prominence among early childhood researchers. The foremost task in any analysis of such naturalistic recordings is parsing and tagging the interactions between adults and young children. An automated tagging system would provide child interaction metrics and would be important for any further processing. This study investigates the language environment of 3-5 year old children using a CRSS-based diarization strategy employing an i-vector-based baseline that captures adult-to-child or child-to-child rapid conversational turns in a naturalistic, noisy early childhood setting. We provide an analysis of various loss functions and learning algorithms using Deep Neural Networks to separate child speech from adult speech. Performance is measured in terms of diarization error rate and Jaccard error rate, and shows good results for tagging adult vs. children's speech. Distinguishing between the primary and secondary child would be useful for monitoring a given child, and an analysis of this distinction is provided. Our diarization system provides insights into the direction for pre-processing and analyzing challenging naturalistic daylong child speech recordings.

Original language: English
Pages: 89-93
Number of pages: 5
State: Published - 2019
Event: 8th ISCA Workshop on Speech and Language Technology in Education, SLaTE 19 - Graz, Austria
Duration: Sep 20 2019 → Sep 21 2019

Conference

Conference: 8th ISCA Workshop on Speech and Language Technology in Education, SLaTE 19
Country/Territory: Austria
City: Graz
Period: 9/20/19 → 9/21/19

Bibliographical note

Publisher Copyright:
© SLaTE 2019. All rights reserved.

Keywords

  • Deep Neural Networks
  • TO-Combo SAD
  • child speech diarization
  • i-Vectors
  • naturalistic environment
  • speech activity detection

ASJC Scopus subject areas

  • Computer Science (miscellaneous)
  • Computational Mathematics
  • Education
