As part of its commitment to the education and development of young talent, 3DS OUTSCALE regularly runs initiatives in schools. Visits by 3DS OUTSCALE employees are an opportunity to present career opportunities in the digital industry. The OUTSCALE for Women program, aimed at young women, provides information about digital careers and thus promotes diversity in employment. In addition, numerous collaborative projects aim to offer students a high-quality practical framework in which to conduct research. One such project is the collaboration between 3DS OUTSCALE and the ISEP engineering school, which I present in this article along with the benefits it brings to both industry and the higher education institution.
Historically, a logbook was a set of registers used by ship crews to record a variety of events in chronological order, such as a change of course or the loading of goods. Inspectors then used these registers to trace operations in search of potential fraud or errors.
Ubiquitous Traceability in IT
Event logging also exists in the IT world: a timestamped record is produced each time a significant action occurs. The resulting records, called log files, are a valuable source of information for system analysis and monitoring. Inspecting these files makes it possible to retrace the stages of a process in search of an anomaly that may be at the root of a system failure, an application crash or any other event affecting quality of service, and thus to improve performance and customer experience.
Processing Hundreds of Thousands of Events per Second: Mission (Almost) Impossible
In the context of Cloud computing, processing log files can be challenging. First, because of the sheer volume: hundreds of thousands of log lines are generated each second. Second, because of the heterogeneity of the messages: each source has its own log structure, and the types of messages generated are likely to vary over time.
A Vast Technical and Technological Challenge
These issues make purely manual inspection impossible. To cope with them, administrators search for predefined patterns corresponding to known abnormal behaviors. However, this requires prior knowledge of the errors in question, which is not possible in the case of new anomalies.
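The pattern-based approach and its blind spot can be illustrated with a minimal Python sketch. The pattern names, regular expressions and sample log lines below are hypothetical and chosen only for illustration; they are not taken from 3DS OUTSCALE's systems.

```python
import re

# Hypothetical signatures of known abnormal behaviors (illustrative only).
KNOWN_PATTERNS = {
    "disk_full": re.compile(r"no space left on device", re.IGNORECASE),
    "auth_failure": re.compile(r"authentication fail(ed|ure)", re.IGNORECASE),
}


def match_known_anomalies(line):
    """Return the names of every known pattern found in a log line."""
    return [name for name, pattern in KNOWN_PATTERNS.items() if pattern.search(line)]


# Invented sample lines: two known errors and one unfamiliar message.
sample_lines = [
    "2020-01-01T10:00:00Z storage-01 write error: No space left on device",
    "2020-01-01T10:00:01Z api-02 authentication failed for user alice",
    "2020-01-01T10:00:02Z net-03 unexpected state 0x3f in scheduler loop",
]

results = [match_known_anomalies(line) for line in sample_lines]
for line, matches in zip(sample_lines, results):
    print(matches or "no known pattern", "|", line)
```

The third line matches no pattern even though it may well describe a fault: a rule set of this kind can only flag what someone has already anticipated.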
Hence the importance of developing autonomous systems capable of detecting anomalies accurately and in near-real time (on the order of milliseconds), while maintaining performance over time. This last criterion is particularly crucial to guaranteeing the autonomy of the solution and its suitability in the long run. Such systems must also be able to identify the context and type of an anomaly with great accuracy in order to notify the appropriate team for action.
As part of my dissertation, I am working on log-based anomaly detection in Cloud infrastructure. My aim is to contribute to the design of an autonomous system, and in particular to ensure that it can adapt to the high volume and variability of 3DS OUTSCALE’s logs.
Deep Learning and Neural Networks in the Service of Anomaly Detection
To achieve this, I am taking a close look at deep learning. Convolutional neural networks have paved the way for facial recognition and the development of autonomous cars. LSTM (long short-term memory) neural networks have been fundamental to the rise of spell checkers that rely on the context of a word or a sentence.
Deep learning for anomaly detection has produced its first results with stacked LSTM networks. However, this work is relatively recent and leaves room for improvement: in my opinion, the pre-processing of logs is still underexploited, as is the analysis of an anomaly’s context.
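To give an idea of what log pre-processing can involve, here is a minimal Python sketch. It is one common way of preparing logs for a sequence model, not the method developed in this research. It assumes that variable fields (IP addresses, hexadecimal values, numbers) can be masked with simple regular expressions, and the sample messages are invented. Each message is reduced to a template, each template receives an integer identifier, and the resulting sequence is cut into windows paired with the event that follows, which is the kind of input a sequence model such as an LSTM could be trained on.

```python
import re


def to_template(message):
    """Replace variable fields with placeholders so similar messages share one template."""
    message = re.sub(r"\b\d{1,3}(?:\.\d{1,3}){3}\b", "<IP>", message)
    message = re.sub(r"\b0x[0-9a-fA-F]+\b", "<HEX>", message)
    message = re.sub(r"\b\d+\b", "<NUM>", message)
    return message


def encode_events(messages):
    """Map each message to the integer ID of its template, in order of first appearance."""
    template_ids = {}
    sequence = []
    for message in messages:
        template = to_template(message)
        event_id = template_ids.setdefault(template, len(template_ids))
        sequence.append(event_id)
    return sequence, template_ids


def sliding_windows(sequence, size):
    """Pair each window of `size` consecutive event IDs with the event that follows it."""
    return [
        (sequence[i:i + size], sequence[i + size])
        for i in range(len(sequence) - size)
    ]


# Invented messages for illustration.
messages = [
    "connection from 10.0.0.5 port 4412",
    "volume 17 attached",
    "connection from 10.0.0.9 port 5120",
    "volume 42 attached",
    "volume 42 detached",
]

sequence, template_ids = encode_events(messages)
windows = sliding_windows(sequence, size=2)
print(sequence)
print(windows)
```

In this toy sequence, the window [0, 1] is followed once by event 0 and once by event 2. A model trained on many such windows could flag a next event it considers unlikely, which is the general idea behind predicting log sequences.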
A Winning Collaboration for the Industry and the Scientific World
The practical framework provided by 3DS OUTSCALE is ideal for conducting research. Access to large volumes of data and to substantial computing power is key in deep learning research. It also means being in regular contact with domain experts and making the most of their knowledge. The work will be carried out as part of a collaboration between 3DS OUTSCALE and the LISITE-ISEP research laboratory. The latter has already made anomaly detection one of its main concerns and will be able to leverage the partnership to develop its research activity.