The paper “Evaluating MapReduce for Profiling Application Traffic”, authored by Thiago P. B. Vieira, Stenio F. L. Fernandes and Vinícius C. Garcia, has been accepted for publication at the 1st Workshop on High Performance and Programmable Networking (HPPN’13), co-located with the 22nd International Symposium on High-Performance Parallel and Distributed Computing (HPDC). The conference will be held on June 17-21, 2013 in New York City.
A brief overview of the paper is given next.
The use of MapReduce for distributed data processing has been growing and bringing benefits across different workloads. MapReduce can be used for distributed traffic analysis, although network traces have characteristics that differ from the data types commonly processed through MapReduce. Motivated by the use of MapReduce for profiling application traffic, by the lack of evaluation of MapReduce for network traffic analysis, and by the peculiarity of this kind of data, this paper evaluates the performance of MapReduce in packet-level analysis and deep packet inspection (DPI), analysing its scalability, speed-up, and the behavior of the MapReduce phases. The experiments provide evidence of the predominant phases in this kind of job and show the impact of input size, block size, and number of nodes on MapReduce completion time and scalability.