Hadoop-EDF: Large-scale Distributed Processing of Electrophysiological Signal Data in Hadoop MapReduce

  • Yuanyuan Wu
  • Xiaojin Li
  • Jinze Liu
  • Licong Cui

Research output: Conference contribution, peer-reviewed

5 Citations (Scopus)

Abstract

A rapidly growing volume of electrophysiological signals has been generated for clinical research on neurological disorders. European Data Format (EDF) is a standard format for storing electrophysiological signals. However, existing signal analysis tools load large EDF files sequentially before performing any analysis, and this sequential loading is the bottleneck when handling large-scale datasets. To overcome this, we develop Hadoop-EDF, a distributed signal processing tool that loads EDF data in parallel using Hadoop MapReduce. Hadoop-EDF uses a robust data partition algorithm that makes EDF data processable in parallel. We evaluate Hadoop-EDF's scalability and performance using two datasets from the National Sleep Research Resource and running experiments on Amazon Web Services clusters. On a 20-node cluster, Hadoop-EDF was about 26 times faster than sequential processing for 200 small files and about 47 times faster for 200 large files. The results indicate that Hadoop-EDF is particularly suitable and effective for processing large EDF files.
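
The abstract does not describe the partition algorithm itself. As a minimal sketch of why EDF data can be split for parallel loading, assume the standard EDF layout: a 256-byte fixed header, 256 header bytes per signal, then fixed-size data records of 16-bit samples. Under that assumption, the Python below computes byte ranges that each cover a whole number of data records. The function names are hypothetical, and this is an illustration rather than the authors' algorithm.

```python
from typing import List, Tuple

FIXED_HEADER_BYTES = 256        # fixed-length part of the EDF header
FIELDS_BEFORE_SAMPLES = 216     # per-signal header bytes preceding "samples per record"
BYTES_PER_SAMPLE = 2            # EDF stores samples as 16-bit integers


def parse_edf_layout(header: bytes) -> Tuple[int, int, int]:
    """Return (header_bytes, n_records, record_bytes) from raw EDF header bytes."""
    header_bytes = int(header[184:192].decode("ascii"))
    n_records = int(header[236:244].decode("ascii"))
    n_signals = int(header[252:256].decode("ascii"))
    # Signal header fields are stored field by field, so the per-signal
    # "samples per data record" values sit after n_signals * 216 bytes.
    offset = FIXED_HEADER_BYTES + n_signals * FIELDS_BEFORE_SAMPLES
    samples = [
        int(header[offset + 8 * i: offset + 8 * (i + 1)].decode("ascii"))
        for i in range(n_signals)
    ]
    record_bytes = sum(samples) * BYTES_PER_SAMPLE
    return header_bytes, n_records, record_bytes


def record_aligned_splits(
    header_bytes: int, n_records: int, record_bytes: int, n_splits: int
) -> List[Tuple[int, int]]:
    """Return (byte_offset, byte_length) ranges that never cut a data record."""
    base, extra = divmod(n_records, n_splits)
    splits = []
    first_record = 0
    for i in range(n_splits):
        count = base + (1 if i < extra else 0)
        if count == 0:
            continue  # more splits requested than records available
        splits.append((header_bytes + first_record * record_bytes,
                       count * record_bytes))
        first_record += count
    return splits
```

In a setup like this, each range could be handed to a separate worker together with the parsed header, so that every worker decodes only its own records. How Hadoop-EDF actually maps such ranges onto MapReduce input splits is not stated in the abstract.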

Original language: English
Title of host publication: Proceedings - 2019 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2019
Editors: Illhoi Yoo, Jinbo Bi, Xiaohua Tony Hu
Pages: 2265-2271
Number of pages: 7
ISBN (electronic): 9781728118673
DOI
Status: Published - Nov 2019
Event: 2019 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2019 - San Diego, United States
Duration: Nov 18 2019 to Nov 21 2019

Publication series

Name: Proceedings - 2019 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2019

Conference

Conference: 2019 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2019
Country/Territory: United States
City: San Diego
Period: 11/18/19 to 11/21/19

Bibliographical note

Publisher Copyright:
© 2019 IEEE.

Funding

This work was supported by the US National Institutes of Health under grants R24HL114473 and U01NS090408. Correspondence: [email protected]

Funder: National Institutes of Health (NIH)
Funder numbers: U01NS090408, R24HL114473

    UN Sustainable Development Goals

    This output contributes to the following Sustainable Development Goals:

    1. Good health and well-being

    ASJC Scopus subject areas

    • Biochemistry
    • Biotechnology
    • Molecular Medicine
    • Modeling and Simulation
    • Health Informatics
    • Pharmacology (medical)
    • Public Health, Environmental and Occupational Health
