Imbalanced data classification using MapReduce and relief

Joanna Jędrzejowicz, Robert Kostrzewski, Jakub Neumann, Magdalena Zakrzewska

Abstract

Classification of imbalanced data has been reported to require modification of standard classification algorithms, and it has lately attracted a lot of attention due to practical applications in industry, banking and finance. The aim of the paper is to examine algorithms known from the literature when two modifications are introduced: MapReduce to parallelize computations and Relief to select the most valuable attributes. Both modifications are needed in the Big Data area. Two new algorithms are also considered.
Authors:
Joanna Jędrzejowicz (II) - Institute of Informatics
Robert Kostrzewski (WMFiI) - Faculty of Mathematics, Physics and Informatics
Jakub Neumann (II) - Institute of Informatics
Magdalena Zakrzewska (II) - Institute of Informatics
Journal series: Journal of Information and Telecommunication, ISSN 2475-1839, e-ISSN 2475-1847
Issue year: 2018
Vol: 2
No: 2
Pages: 217-230
Publication size in sheets: 0.65
Keywords in English: imbalanced data, classification, parallelization, feature selection
DOI: 10.1080/24751839.2018.1440454
URL: https://www.tandfonline.com/doi/full/10.1080/24751839.2018.1440454
Language: en (English)
License: Journal (articles only); published final; Attribution (CC-BY); with publication
Score (nominal): 5
Score: Ministerial score = 0.0, 09-05-2018, ArticleFromJournal
Ministerial score (2013-2016) = 5.0, 09-05-2018, ArticleFromJournal - foreign journal not on the lists
Citation count*: 0

* The presented citation count is obtained through Internet information analysis and is close to the number calculated by the Publish or Perish system.