REAL TIME MACHINE LEARNING

Saed Sayad

Published: 24 Jul 2020, Last Modified: 10 Aug 2024 | ResearchGate | Everyone | CC BY 4.0

Abstract: Although machine learning algorithms are widely used in extremely diverse situations, in practice one or more major limitations almost invariably appear and significantly constrain successful machine learning applications. Frequently, these problems are associated with large increases in the rate of data generation, the quantity of data, and the number of attributes (variables) to be processed. Increasingly, the data situation is beyond the capabilities of conventional machine learning methods. The term “Real Time” is used to describe how well a machine learning algorithm can accommodate an ever-increasing data load instantaneously. However, such real time problems are usually closely coupled with the fact that conventional machine learning algorithms operate in batch mode, where having all of the relevant data at once is a requirement. Thus, Real Time Machine Learning is defined here as having all of the following characteristics, independent of the amount of data involved:

1. Incremental learning (Learn): immediately updating a model with each new observation, without the necessity of pooling new data with old data.
2. Decremental learning (Forget): immediately updating a model by excluding observations identified as adversely affecting model performance, without forming a new dataset that omits these observations and returning to the model formulation step.
3. Attribute addition (Grow): adding a new attribute (variable) on the fly, without the necessity of pooling new data with old data.
4. Attribute deletion (Shrink): immediately discontinuing use of an attribute identified as adversely affecting model performance.
5. Scenario testing: rapid formulation and testing of multiple and diverse models to optimize prediction.
6. Real Time operation: instantaneous data exploration, modeling, and model evaluation.
7. In-Line operation: processing that can be carried out in situ (e.g., in a mobile device or in a satellite).
8. Distributed processing: separately processing distributed data or segments of large data (that may be located in diverse geographic locations) and recombining the results to obtain a single model.
9. Parallel processing: carrying out processing extremely rapidly in parallel on multiple conventional processing units (multiple threads, multiple processors, or a specialized chip).
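
To make the Learn, Forget, and Distributed processing characteristics concrete, the following is a minimal sketch rather than the method of this paper. It assumes a model that can be summarized entirely by additive sums, here a one-attribute least-squares line. The class `RunningLinearFit` and its method names are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class RunningLinearFit:
    """Least-squares line y = a + b*x stored as additive sums (illustrative only)."""
    n: int = 0
    sx: float = 0.0
    sy: float = 0.0
    sxx: float = 0.0
    sxy: float = 0.0

    def learn(self, x: float, y: float) -> None:
        # Incremental learning: fold one observation into the sums.
        self.n += 1
        self.sx += x
        self.sy += y
        self.sxx += x * x
        self.sxy += x * y

    def forget(self, x: float, y: float) -> None:
        # Decremental learning: subtract an observation that was learned earlier.
        self.n -= 1
        self.sx -= x
        self.sy -= y
        self.sxx -= x * x
        self.sxy -= x * y

    def merge(self, other: "RunningLinearFit") -> "RunningLinearFit":
        # Distributed processing: sums from separate data segments simply add.
        return RunningLinearFit(
            self.n + other.n,
            self.sx + other.sx,
            self.sy + other.sy,
            self.sxx + other.sxx,
            self.sxy + other.sxy,
        )

    def coefficients(self) -> Tuple[float, float]:
        # Closed-form least squares computed from the sums alone.
        denom = self.n * self.sxx - self.sx ** 2
        if denom == 0:
            raise ValueError("need at least two distinct x values")
        slope = (self.n * self.sxy - self.sx * self.sy) / denom
        intercept = (self.sy - slope * self.sx) / self.n
        return intercept, slope


# Learn three points on y = 1 + 2x, then learn and forget an outlier.
fit = RunningLinearFit()
for x, y in [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]:
    fit.learn(x, y)
fit.learn(3.0, 100.0)
fit.forget(3.0, 100.0)
intercept, slope = fit.coefficients()

# Two segments processed separately, then recombined into one model.
part_a, part_b = RunningLinearFit(), RunningLinearFit()
part_a.learn(0.0, 1.0)
part_a.learn(1.0, 3.0)
part_b.learn(2.0, 5.0)
merged_intercept, merged_slope = part_a.merge(part_b).coefficients()
```

Because every update touches only five numbers, the cost of learning or forgetting one observation does not depend on how much data has already been seen. One caveat of this sketch is that repeated subtraction in floating point can accumulate rounding error, so a production design might need a numerically more careful update.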