Abstract: With the emergence of mobile devices that have sufficient resources to execute real-time ML inference, deployment opportunities arise on the device itself, keeping privacy-sensitive data close to the source and reducing server load. Conversely, offloading inference to a cloud server facilitates the deployment of neural network-based applications on resource-constrained devices. Depending on the application's goals and execution context, the optimal deployment, on either the cloud server or the mobile device, varies over the application's lifetime. In this paper, we propose a context-aware middleware that enables optimization of deployed application software to satisfy the application's functional goals in accordance with changing execution context and environmental conditions. We facilitate system design by abstracting deployed software components as states, and we use finite state machines with contextual triggers to model the reconfiguration of the system. We evaluate our framework using a real-world nutritional monitoring application based on food image recognition, deployed in a two-tier mobile and cloud architecture. We compare the proposed solution with various static deployments of the application and show that our approach can react to changing application goals at run-time, reducing server load and thereby increasing scalability.
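The state-machine abstraction described above can be illustrated with a minimal sketch: deployment configurations are modeled as FSM states, and contextual triggers drive transitions between them. The specific state and trigger names (`ON_DEVICE`, `CLOUD`, `"high_server_load"`, etc.) are hypothetical placeholders, not taken from the paper.

```python
from enum import Enum

class Deployment(Enum):
    # Hypothetical deployment states: each state stands for a
    # configuration of where the inference component runs.
    ON_DEVICE = "on_device"
    CLOUD = "cloud"

# Transition table: (current state, contextual trigger) -> next state.
# Trigger names are illustrative assumptions, not the paper's actual triggers.
TRANSITIONS = {
    (Deployment.ON_DEVICE, "low_battery"): Deployment.CLOUD,
    (Deployment.CLOUD, "high_server_load"): Deployment.ON_DEVICE,
    (Deployment.CLOUD, "privacy_required"): Deployment.ON_DEVICE,
}

class ReconfigurationFSM:
    """Finite state machine reacting to contextual triggers at run-time."""

    def __init__(self, initial: Deployment = Deployment.ON_DEVICE):
        self.state = initial

    def on_trigger(self, trigger: str) -> Deployment:
        # Stay in the current state if the trigger is not relevant to it.
        self.state = TRANSITIONS.get((self.state, trigger), self.state)
        return self.state

# Usage: server load rises, so inference is moved back onto the device.
fsm = ReconfigurationFSM(initial=Deployment.CLOUD)
fsm.on_trigger("high_server_load")
print(fsm.state)  # Deployment.ON_DEVICE
```

In such a design, the middleware observes context (battery level, server load, privacy settings), maps observations to triggers, and redeploys components whenever the FSM changes state.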