Imagine turning your smartwatch into an interactive boxing instructor that can analyze and assess your movements as you progress through different combinations of punches. Or perhaps instead, a virtual opponent that tests your speed and stealth by trying to predict what punches you're going to throw before you land them. The aim of my Cortex project was to see if deep learning could make this possible and the steps needed to do so.
My name is Tom Montgomery, and I'm a Senior AI Developer with Tessella. Like many people, I mainly use my Apple Watch for fitness. The watch utilizes data from its inbuilt accelerometers and gyroscopes in conjunction with a heart rate sensor to better understand activity levels during different types of exercise.
After using my Apple Watch while swimming, I realised the device was able to count lengths and even determine which strokes I was using, providing a richer context for monitoring, motivation, and self-improvement.
Another workout routine I do regularly is boxing using the freestanding boxing bag in my garage. In sharp contrast to swimming, where there are longer intervals of repeated movements, boxing consists of quick, jarring movements with sometimes several punches landing per second.
I was intrigued to see if the vast amount of sensor data being recorded could be utilized via deep learning to create a more engaging and interactive boxing workout experience.
On-Device Machine Learning
Although our mobile phones have been dubbed the supercomputers in our pockets, they are still limited in compute power, memory, and battery life compared to dedicated machine learning hardware in the cloud. However, since mobile phones can be connected to the internet via Wi-Fi or cellular networks, it may seem sensible to use the cloud's effectively unlimited compute power to crunch through the large number of complex calculations involved in deep learning.
Below are two reasons I wanted to look at performing machine learning directly on the device rather than handing it off to the cloud.
Apple takes privacy, and the security of personal data, very seriously. By enabling complex modeling and machine learning on tightly integrated hardware and software, Apple avoids the need for personal data to leave the device.
When comparing cloud to on-device performance, there's more to think about than the machine learning models themselves. By executing machine learning models directly on-device, the latency introduced by sending large amounts of data between the phone and the cloud is removed entirely. This is particularly important when machine learning models consume continuous streams of data and we want to give real-time feedback to the user. This was exactly the case for my boxing application, and thus a great incentive to take this approach.
Why This Project?
When starting this project there was an element of ‘this would be cool’, but solving this problem would also genuinely improve my fitness regime, so I was highly motivated to make it work. Just thinking about how to approach the problem led to some exciting conceptual and technical challenges that I was keen to tackle. Many of these questions seemed relevant to the use of smart wearables for activity monitoring in general, an emerging field in life sciences, with applications such as detecting when patients with chronic eczema scratch in their sleep.
This project offered an opportunity to explore some interesting elements of developing machine-learning-driven software, leveraging recent hardware and software advancements within Apple’s mobile ecosystem. From data collection through modeling to deployment, there were many interesting challenges to overcome.
The Challenges to Overcome
What is a punch?
The prerequisite to creating a smarter boxing workout is an algorithm that can decide which punches you have thrown, or are about to throw, and when each punch lands. In this regard a punch is both an event (the moment your fist hits the bag) and an activity (the characteristic motions leading up to and following the impact). It was important to incorporate both of these aspects when modeling the problem, particularly to handle scenarios where the same type of punch is repeated in quick succession, or periods where no punches are being thrown at all.
An example of a boxing workout with Apple Watch data collected. Dashed lines show annotated punches, with lines of variable colour intensity showing the model predictions.
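One common way to frame this dual event/activity view is to slice the continuous sensor stream into short, overlapping windows and classify each one, so that no punch falls awkwardly on a window boundary. The sketch below illustrates the idea in Python with plain (x, y, z) accelerometer tuples; the 50 Hz sample rate, window length, and function names are my own illustrative choices, not details taken from the project.

```python
from typing import List, Sequence, Tuple

Sample = Tuple[float, float, float]  # one (x, y, z) accelerometer reading

def sliding_windows(
    samples: Sequence[Sample],
    window_size: int = 50,  # e.g. 1 s of data at an assumed 50 Hz rate
    stride: int = 10,       # overlap windows so punches aren't split in two
) -> List[List[Sample]]:
    """Split a stream of accelerometer samples into overlapping windows,
    each a candidate 'activity' for the classifier to label."""
    windows = []
    for start in range(0, len(samples) - window_size + 1, stride):
        windows.append(list(samples[start:start + window_size]))
    return windows

def peak_magnitude(window: Sequence[Sample]) -> float:
    """A crude 'event' feature: the largest acceleration magnitude in the
    window, which spikes at the moment the fist hits the bag."""
    return max((x * x + y * y + z * z) ** 0.5 for x, y, z in window)
```

The overlap (stride smaller than the window size) is what lets the model see each punch in several slightly shifted contexts, which matters when punches land several times per second.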
For an AI to learn the motions associated with different types of punches, I had to give it many examples. When multiple punches are thrown over a short period of time, the movements associated with each punch become correlated, so it wasn’t enough to train the algorithm to learn, for example, what a left hook looked like in isolation, because it could look different when thrown after a right uppercut or a left jab.
Boxers come in different shapes and sizes, have different styles and levels of experience, and it wasn’t clear how well a model trained only on my own workouts would perform for others.
Because my machine learning models could be trained directly on a phone, the idea was to create a personalized model for each user. This would start as a model trained on just my workouts, then be fine-tuned on examples taken directly from the user through an interactive calibration step within the boxing workout app itself. I explored this transfer learning method with two colleagues of different experience levels, and the results were very encouraging. In practice, starting with a model trained on many more users and combining it with on-device transfer learning would likely produce better results, but even so, I found the model's ability to personalize itself to a new user very interesting.
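To make the transfer learning pattern concrete, here is a deliberately tiny Python sketch: a fixed feature extractor stands in for the frozen layers of a pre-trained network, and only a small classification head is re-trained on a handful of the user's own examples. Everything here (the hand-crafted features, the logistic head, the hyperparameters) is illustrative, not the actual model from the project.

```python
import math

def extract_features(window):
    """Stand-in for the frozen layers of a pre-trained network: this
    part is kept fixed during on-device personalization."""
    mags = [(x * x + y * y + z * z) ** 0.5 for x, y, z in window]
    return [max(mags), sum(mags) / len(mags)]  # peak and mean magnitude

def fine_tune_head(examples, labels, epochs=200, lr=0.1):
    """Re-train only the small classification head (a logistic unit
    here) on the user's own calibration punches, via plain SGD."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for window, y in zip(examples, labels):
            f = extract_features(window)
            z = w[0] * f[0] + w[1] * f[1] + b
            p = 1.0 / (1.0 + math.exp(-z))   # sigmoid
            err = p - y                       # gradient of log-loss
            w = [wi - lr * err * fi for wi, fi in zip(w, f)]
            b -= lr * err
    return w, b

def predict(params, window):
    """Classify a new window with the personalized head."""
    w, b = params
    f = extract_features(window)
    return 1 if w[0] * f[0] + w[1] * f[1] + b > 0 else 0
```

The design choice this illustrates is the split itself: the expensive, data-hungry part of the model is learned once offline, while the cheap head is all that needs to adapt on the watch or phone.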
Optimizing for deployment on a device
When considering the different deep neural network architectures that did the best job of detecting the type of punches being thrown, I also had to look at which architectures would run fast enough on the mobile hardware. Say your virtual boxing instructor tells you to throw a left hook: it would be a poor user experience if the AI had to think for even a few seconds to decide whether you had completed the command correctly.
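A useful back-of-envelope check when comparing candidate architectures is to count the multiply-accumulate operations (MACs) each one needs per window, since on-device latency scales roughly with that figure. The Python sketch below does this for small 1-D convolutional stacks; the layer sizes and the 50-sample, 6-channel input are illustrative assumptions, not the architecture the project settled on.

```python
def conv1d_macs(input_len, in_ch, out_ch, kernel):
    """Multiply-accumulate count for one 1-D convolution layer
    (stride 1, 'valid' padding)."""
    out_len = input_len - kernel + 1
    return out_len * out_ch * in_ch * kernel

def network_macs(input_len, in_ch, layers):
    """Total MACs for a stack of conv layers, each described as an
    (out_channels, kernel_size) tuple."""
    total, length, ch = 0, input_len, in_ch
    for out_ch, kernel in layers:
        total += conv1d_macs(length, ch, out_ch, kernel)
        length = length - kernel + 1
        ch = out_ch
    return total

# A 1 s window at 50 Hz with 6 sensor channels (accelerometer + gyroscope)
small = network_macs(50, 6, [(16, 5), (32, 5)])
large = network_macs(50, 6, [(64, 5), (128, 5)])
# the wider network needs roughly 14x more MACs per window,
# a first hint at its relative inference latency on the watch
```

Counts like these are only a proxy (memory traffic and hardware acceleration matter too), but they make it cheap to rule out architectures that could never meet a real-time budget before profiling on the device itself.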
In the end I was able to take some important steps towards creating a smarter, more interactive boxing workout experience for myself, and for others in the future, using my Apple Watch. This brought together a deep learning neural network optimized to run on mobile hardware, integrated into a mobile boxing app that personalizes the model to a user’s particular style as they begin to use the app.
The project helped me to realize the utility of smart wearables and IoT devices in general, not just as data-gathering devices but also as platforms capable of running intelligent, interactive applications themselves.
I am currently looking at how to use federated learning to take the personalized models created on each user’s device and use them to improve the base model that is deployed before personalization.
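In its simplest form (federated averaging, or FedAvg), this amounts to averaging the personalized parameters across users, weighted by how much calibration data each user contributed, so that raw workout data never has to leave anyone's device. A toy Python sketch of just the aggregation step, assuming each user's model weights can be flattened into a list:

```python
def federated_average(client_weights, client_sizes):
    """FedAvg-style aggregation: combine each user's personalized
    parameters into an updated base model, weighting each user by the
    number of local examples they trained on."""
    total = sum(client_sizes)
    return [
        sum(w * n for w, n in zip(params, client_sizes)) / total
        for params in zip(*client_weights)  # i-th parameter across users
    ]
```

Only these parameter lists (or parameter updates) would travel to the server, which is what makes the approach attractive alongside the on-device privacy stance described earlier.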