Missing data in cross-sectional networks – An extensive comparison of missing data treatment methods

Robert W. Krause, Mark Huisman, Christian Steglich, Tom Snijders

Research output: Contribution to journal › Article › peer-review

44 Scopus citations

Abstract

This paper compares several treatment methods for missing network data on a diverse set of simulated networks under several missing data mechanisms. We focus the comparison on three outcomes: descriptive statistics, link reconstruction, and model parameters. The results indicate that the often-used methods (analysis of available cases and null-tie imputation) lead to considerable bias in descriptive statistics when the proportion of missing data is moderate or large. Multiple imputation using sophisticated imputation models based on exponential random graph models (ERGMs) leads to acceptable biases in descriptive statistics and model parameters even under large amounts of missing data. For link reconstruction, multiple imputation by simple ERGMs performed well on small data sets, while missing data was imputed more accurately in larger data sets by multiple imputation with complex Bayesian ERGMs (BERGMs).
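
As a rough illustration of the two often-used methods named in the abstract, the following sketch computes network density on one simulated directed network under available-case analysis and under null-tie imputation. Everything in it is an assumption made for illustration and is not taken from the paper: the random-graph generator, the network size, the tie probability, the choice of 30% nonrespondents missing completely at random, and all function names. It covers a single descriptive statistic only and does not reproduce the paper's simulation design or its ERGM-based multiple imputation.

```python
import random


def simulate_network(n, p, rng):
    """Directed random graph as an adjacency matrix without self-ties (illustrative)."""
    return [[1 if i != j and rng.random() < p else 0 for j in range(n)]
            for i in range(n)]


def remove_respondents(adj, nonrespondents):
    """Mark all outgoing ties of nonrespondents as missing (None)."""
    return [[None if i in nonrespondents and i != j else v
             for j, v in enumerate(row)]
            for i, row in enumerate(adj)]


def null_tie_impute(adj):
    """Null-tie imputation: every missing tie is treated as absent (0)."""
    return [[0 if v is None else v for v in row] for row in adj]


def density(adj):
    """Share of present ties among the off-diagonal entries that are not missing."""
    ties = dyads = 0
    for i, row in enumerate(adj):
        for j, v in enumerate(row):
            if i == j or v is None:
                continue
            ties += v
            dyads += 1
    return ties / dyads


rng = random.Random(1)  # fixed seed so the sketch is repeatable
n = 50
true_net = simulate_network(n, 0.1, rng)
nonrespondents = set(rng.sample(range(n), 15))  # assumed: 30% of actors do not respond
observed = remove_respondents(true_net, nonrespondents)

true_density = density(true_net)
available_case_density = density(observed)            # ignores missing entries
null_tie_density = density(null_tie_impute(observed))  # missing entries counted as 0

print("true:", true_density)
print("available cases:", available_case_density)
print("null-tie imputation:", null_tie_density)
```

In this sketch null-tie imputation can only lower the density, because the observed ties are divided by all possible ties, including the missing ones. Available-case density is not distorted in that mechanical way when nonresponse is completely random, which is why the bias reported in the abstract has to be judged on a broader set of statistics and missing data mechanisms than this toy case covers.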

Original language: English
Pages (from-to): 99-112
Number of pages: 14
Journal: Social Networks
Volume: 62
State: Published - Jul 2020

Bibliographical note

Publisher Copyright:
© 2020

Funding

This article was funded by the University of Groningen through the employment of the authors. While the manuscript was being written, the first author, Robert Krause, was an employee of the University of Groningen.

Funders: University of Groningen (Rijksuniversiteit Groningen)

    Keywords

    • Bayesian ERGM
    • ERGM
    • Missing data
    • Multiple imputation
    • Social networks

    ASJC Scopus subject areas

    • Anthropology
    • Sociology and Political Science
    • General Social Sciences
    • General Psychology
