P. Hu, H. Li, H. Fu, D. Cansever, and P. Mohapatra, Dynamic defense strategy against advanced persistent threat with insiders, IEEE Conference on Computer Communications (INFOCOM), pp.747-755, 2015.

V. Paxson, Bro: a System for Detecting Network Intruders in Real-Time, Computer Networks, vol.31, issue.23-24, pp.2435-2463, 1999.

M. Roesch, Snort: Lightweight Intrusion Detection for Networks, Proceedings of the 13th USENIX Conference on System Administration, USENIX Association, pp.229-238, 1999.

M. Vallentin, R. Sommer, J. Lee, C. Leres, V. Paxson et al., The NIDS Cluster: Scalable, Stateful Network Intrusion Detection on Commodity Hardware, Recent Advances in Intrusion Detection, pp.107-126, 2007.

A. Bär, A. Finamore, P. Casas, L. Golab, and M. Mellia, Large-scale network traffic monitoring with DBStream, a system for rolling big data analysis, 2014 IEEE International Conference on Big Data (Big Data), pp.165-170, 2014.

M. Stonebraker, U. Çetintemel, and S. Zdonik, The 8 requirements of real-time stream processing, ACM SIGMOD Record, vol.34, issue.4, pp.42-47, 2005.

M. Mayhew, M. Atighetchi, A. Adler, and R. Greenstadt, Use of machine learning in big data analytics for insider threat detection, IEEE Military Communications Conference, MILCOM, pp.915-922, 2015.

D. Mladenić, C. Saunders, M. Grobelnik, S. Gunn, and J. Shawe-Taylor, Feature Selection for Dimensionality Reduction, Subspace, Latent Structure and Feature Selection (SLSFS): Statistical and Optimization Perspectives Workshop, pp.84-102, 2006.

A. Bifet and G. D. Morales, Big data stream learning with SAMOA, 2014 IEEE International Conference on Data Mining Workshop, pp.1199-1202, 2014.

I. Khamassi, M. Sayed-Mouchaweh, M. Hammami, and K. Ghédira, Discussion and review on evolving data streams and concept drift adapting, Evolving Systems, vol.9, issue.1, pp.1-23, 2018.

E. Rahm and H. H. Do, Data cleaning: Problems and current approaches, IEEE Bulletin of the Technical Committee on Data Engineering, vol.23, issue.4, pp.3-13, 2000.

S. García, J. Luengo, and F. Herrera, Data preprocessing in data mining, 2016.

M. Robnik-Šikonja and I. Kononenko, Theoretical and Empirical Analysis of ReliefF and RReliefF, Machine Learning, vol.53, pp.23-69, 2003.

B. Schölkopf, A. J. Smola, and K. Müller, Kernel principal component analysis, Advances in kernel methods, pp.327-352, 1999.

S. García, J. Luengo, and F. Herrera, Tutorial on practical tips of the most influential data preprocessing algorithms in data mining, Knowledge-Based Systems, vol.98, pp.1-29, 2016.

S. Zhang, C. Zhang, and Q. Yang, Data preparation for data mining, Applied Artificial Intelligence, vol.17, issue.5-6, pp.375-381, 2003.

S. Tan, Neighbor-weighted k-nearest neighbor for unbalanced text corpus, Expert Systems with Applications, vol.28, issue.4, pp.667-671, 2005.

S. Ramírez-Gallego, B. Krawczyk, S. García, M. Woźniak, and F. Herrera, A survey on data preprocessing for data stream mining: Current status and future directions, Neurocomputing, 2017.

L. van der Maaten, E. Postma, and J. van den Herik, Dimensionality reduction: a comparative review, Journal of Machine Learning Research, vol.10, pp.66-71, 2009.

J. C. Ang, A. Mirzal, H. Haron, and H. N. Hamed, Supervised, Unsupervised, and Semi-Supervised Feature Selection: A Review on Gene Selection, IEEE/ACM Transactions on Computational Biology and Bioinformatics, vol.13, issue.5, pp.971-989, 2016.

G. Chandrashekar and F. Sahin, A survey on feature selection methods, Computers & Electrical Engineering, vol.40, issue.1, pp.16-28, 2014.

I. Guyon, J. Weston, S. Barnhill, and V. Vapnik, Gene selection for cancer classification using support vector machines, Machine learning, vol.46, issue.1-3, pp.389-422, 2002.

M. A. Hall, Correlation-based Feature Selection for Machine Learning, 1999.

A. Kumar, M. Sung, J. J. Xu, and J. Wang, Data streaming algorithms for efficient and accurate estimation of flow size distribution, ACM SIGMETRICS Performance Evaluation Review, vol.32, issue.1, pp.177-188, 2004.

Y. Ben-Haim and E. Tom-Tov, A streaming parallel decision tree algorithm, Journal of Machine Learning Research, vol.11, pp.849-872, 2010.

G. I. Webb, Contrary to popular belief incremental discretization can be sound, computationally efficient and extremely useful for streaming data, IEEE International Conference on Data Mining (ICDM), IEEE, pp.1031-1036, 2014.

M. Tavallaee, E. Bagheri, W. Lu, and A. A. Ghorbani, A detailed analysis of the KDD CUP 99 data set, 2009 IEEE Symposium on Computational Intelligence for Security and Defense Applications, pp.1-6, 2009.

A. Lobato, M. Lopez, I. J. Sanz, A. Cárdenas, O. C. Duarte et al., An adaptive Real-Time architecture for Zero-Day threat detection, IEEE ICC 2018 Next Generation Networking and Internet Symposium (ICC'18 NGNI), 2018.
URL : https://hal.archives-ouvertes.fr/hal-02099022

A. Lopez, M. Silva, R. S. Alvarenga, I. D. Rebello, G. A. Sanz et al., Collecting and characterizing a real broadband access network traffic dataset, IEEE/IFIP 1st Cyber Security in Networking Conference (CSNet), pp.1-8, 2017.
URL : https://hal.archives-ouvertes.fr/hal-02099033

H. Hu and M. Kantardzic, Smart preprocessing improves data stream mining, 49th Hawaii International Conference on System Sciences (HICSS), pp.1749-1757, 2016.

A. Buczak and E. Guven, A Survey of Data Mining and Machine Learning Methods for Cyber Security Intrusion Detection, IEEE Communications Surveys & Tutorials, issue.99, pp.1-26, 2015.

V. B. Prasath, H. A. Alfeilat, O. Lasassmeh, and A. B. Hassanat, Distance and Similarity Measures Effect on the Performance of K-Nearest Neighbor Classifier - A Review, CoRR, 2017.

T. Zhang, Solving large scale linear prediction problems using stochastic gradient descent algorithms, Proceedings of the twenty-first international conference on Machine learning, p.116, 2004.

N. V. Chawla, K. W. Bowyer, L. O. Hall, and W. P. Kegelmeyer, SMOTE: synthetic minority over-sampling technique, Journal of artificial intelligence research, vol.16, pp.321-357, 2002.

S. Perkins and J. Theiler, Online feature selection using grafting, Proceedings of the 20th International Conference on Machine Learning (ICML-03), pp.592-599, 2003.

J. Zhou, D. P. Foster, R. A. Stine, and L. H. Ungar, Streamwise feature selection, Journal of Machine Learning Research, vol.7, pp.1861-1885, 2006.

X. Wu, K. Yu, W. Ding, H. Wang, and X. Zhu, Online feature selection with streaming features, IEEE transactions on pattern analysis and machine intelligence, vol.35, pp.1178-1192, 2013.