Imbalanced data classification using MapReduce and relief

Joanna Jędrzejowicz, Robert Kostrzewski, Jakub Neumann, Magdalena Zakrzewska

Abstract

Classification of imbalanced data has been reported to require modifications of standard classification algorithms, and it has lately attracted a lot of attention due to practical applications in industry, banking and finance. The aim of the paper is to examine algorithms known from the literature when two modifications are introduced: MapReduce to parallelize computations and Relief to select the most valuable attributes. Both modifications are needed in the Big Data area. Two new algorithms are also considered.
Authors:
- Joanna Jędrzejowicz (FMPI / II), Institute of Informatics
- Robert Kostrzewski (FMPI), Faculty of Mathematics, Physics and Informatics
- Jakub Neumann (FMPI / II), Institute of Informatics
- Magdalena Zakrzewska (FMPI / II), Institute of Informatics
Journal series: Journal of Information and Telecommunication, ISSN 2475-1839, e-ISSN 2475-1847 (0 points)
Issue year: 2018
Vol: 2
No: 2
Pages: 217-230
Publication size in sheets: 0.65
Keywords in English: imbalanced data, classification, parallelization, feature selection
DOI: 10.1080/24751839.2018.1440454
URL: https://www.tandfonline.com/doi/full/10.1080/24751839.2018.1440454
Language: en (English)
License: Journal (articles only); published final; Attribution (CC-BY); with publication
Score (nominal): 5
Score: Ministerial score = 0.0, 09-05-2018, ArticleFromJournal
Ministerial score (2013-2016) = 5.0, 09-05-2018, ArticleFromJournal - foreign journal outside the lists