Machine learning (ML) is a technique for using computers to make predictions based on past observations.
The following sections discuss the basic machine learning workflow on microcontrollers:
- Identification
- Datasets
- Model Architecture
- Training
- Model Conversion
- Run Inference
- Evaluate and Troubleshoot
Identification
Based on the end problem, the user needs to decide what to predict, what data to collect, and what type of machine learning problem is to be solved (classification, regression, ranking, and so on).
Datasets
Once the problem statement is identified, the next step is to determine what data is needed. Dataset preparation involves selecting, collecting, and labeling the data.
Selection:
- Include information that is relevant to solving the problem
- Use a combination of domain expertise and experimentation to decide which signals to include
- Apply statistical techniques to identify the most significant data
Collection:
- More data is generally better
- Collect data that represents the full range of conditions and events that can occur in the system
Labeling:
- The process of annotating each piece of data with the output the model should learn to predict (its label)
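As an illustration only, the sketch below shows one way labeled sensor data might be organized for a classification problem. The recording format, window length, and label names are hypothetical and not prescribed by the workflow above.

```python
import numpy as np

# Hypothetical example: accelerometer recordings labeled with the gesture they contain.
# The window length and label names are placeholders, not part of the workflow itself.
WINDOW_LENGTH = 128                  # samples per training example
LABELS = ["idle", "shake", "tap"]    # the classes the model should learn

def load_labeled_windows(recordings):
    """Turn (samples, label) recordings into fixed-size, labeled training examples."""
    examples, targets = [], []
    for samples, label in recordings:
        # Split each recording into non-overlapping windows of WINDOW_LENGTH samples.
        for start in range(0, len(samples) - WINDOW_LENGTH + 1, WINDOW_LENGTH):
            examples.append(samples[start:start + WINDOW_LENGTH])
            targets.append(LABELS.index(label))   # store the label as an integer index
    return np.array(examples, dtype=np.float32), np.array(targets, dtype=np.int32)
```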
Model Architecture
The machine learning algorithm builds a system model based on the input data through a process called training. The model is a type of computer program. Data is run through this model to make predictions in a process called inference.
The following factors should be considered when designing a model architecture:
- Type of problem being solved
- Type of data you have access to
- Ways to transform the data before feeding it into a model
- Constraints of the end device
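As a sketch under the assumptions of the hypothetical example above, a small model might be defined with TensorFlow's Keras API as follows. The layer sizes are illustrative and would need to be tuned to the data and to the flash and RAM limits of the target microcontroller.

```python
import tensorflow as tf

# Illustrative architecture: a small dense network sized with a
# microcontroller's memory constraints in mind.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(128,)),            # one window of 128 samples
    tf.keras.layers.Dense(16, activation="relu"),   # small hidden layers keep the
    tf.keras.layers.Dense(16, activation="relu"),   # parameter count (and model size) low
    tf.keras.layers.Dense(3, activation="softmax")  # one output score per class
])
model.summary()
```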
Training
Training is how a model learns to produce the correct output for a given set of inputs. It involves feeding data through a model and making small adjustments until it makes the most accurate predictions possible.
An ML model can be a network of simulated neurons, represented as arrays of numbers organized in layers.
These numbers are called parameters (or weights and biases). When data is fed into the network, it is transformed by a sequence of mathematical operations involving the weights and biases of each layer.
Training is an iterative process in which the dataset is passed through the model repeatedly (each full pass is known as an epoch). It continues until the user decides to stop, typically when the model's performance stops improving and it makes predictions as accurately as it can.
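Continuing the same hypothetical sketch, training with Keras might look like the following. The optimizer, loss, and epoch count are assumptions, and the early-stopping callback reflects the practice of halting once performance stops improving; `train_examples`, `train_targets`, `val_examples`, and `val_targets` are placeholder arrays produced by a dataset-preparation step such as the one sketched earlier.

```python
import tensorflow as tf

# Compile: choose how the weights and biases are adjusted (optimizer)
# and how prediction error is measured (loss).
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# Each epoch is one full pass of the training data through the model.
# Early stopping halts training once validation performance stops improving.
history = model.fit(
    train_examples, train_targets,
    validation_data=(val_examples, val_targets),
    epochs=50,
    callbacks=[tf.keras.callbacks.EarlyStopping(patience=5,
                                                restore_best_weights=True)],
)
```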
Model Conversion
One of the most widely used tools for building and running ML models is TensorFlow. TensorFlow's interpreter is designed to run models on powerful desktop computers and servers: a user loads the model into memory and executes it through the TensorFlow interpreter.
ML models intended for embedded applications need to run on much smaller devices (microcontrollers), and therefore a different interpreter is required. TensorFlow provides an interpreter and accompanying tools to run models on small, low-powered devices; this set of tools is called TensorFlow Lite. Before TensorFlow Lite can run a model, the model must first be converted into the TensorFlow Lite format and saved to disk as a file. This conversion is done with a tool named the TensorFlow Lite Converter.
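A typical conversion step uses the TensorFlow Lite Converter's Python API, as sketched below for the `model` defined earlier. The output file name is arbitrary, and the optimization flag (which enables quantization-style size reductions) is optional but common for microcontroller targets.

```python
import tensorflow as tf

# Convert the trained Keras model to the TensorFlow Lite format.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
# Optional: apply default optimizations (e.g. quantization) to shrink the model.
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_model = converter.convert()

# Save the converted model to disk; on a microcontroller it is usually
# embedded in the firmware as a C array instead.
with open("model.tflite", "wb") as f:
    f.write(tflite_model)
```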
Run Inference
Once the model is converted to the target device format, it's ready to be deployed.
For models converted using the TensorFlow Lite Converter, the TensorFlow Lite for Microcontrollers C++ library runs the model and makes predictions.
The application implements code that takes raw input data from the sensors and transforms it into the same form the model was trained on. The transformed data is passed to the model and inference is run, producing output data that contains the predictions.
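On the device itself, the TensorFlow Lite for Microcontrollers C++ library carries out these steps. The Python sketch below mirrors the same sequence (load the converted model, preprocess the raw input, run inference, read the predictions) using TensorFlow Lite's desktop interpreter, which can be a convenient way to sanity-check the converted model before deploying it. The preprocessing shown is a placeholder that must match whatever was done at training time.

```python
import numpy as np
import tensorflow as tf

# Load the converted model and allocate its tensors.
interpreter = tf.lite.Interpreter(model_path="model.tflite")
interpreter.allocate_tensors()
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

def predict(raw_samples):
    """Transform raw sensor samples and run one inference."""
    # Placeholder preprocessing: reshape to the (1, 128) input the model expects.
    window = np.asarray(raw_samples, dtype=np.float32).reshape(1, -1)
    interpreter.set_tensor(input_details[0]["index"], window)
    interpreter.invoke()
    # The output contains one score per class.
    return interpreter.get_tensor(output_details[0]["index"])[0]
```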
Evaluate and Troubleshoot
The real-world behavior of the model is observed to see whether it performs as expected, since performance may not match expectations. Such a mismatch can have several causes; one common cause is that the data used in training is not exactly representative of the data encountered in real operation.
Troubleshooting is required when the model is not performing as expected.
- Rule out any hardware problems (like faulty sensors or unexpected noise) that might be affecting the data that reaches the model
- Capture the data seen during inference and compare it with the training data to see how they differ (a comparison sketch appears at the end of this section)
After a cause is established, the model needs to be retrained with more data and the evaluation rerun.
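For the second troubleshooting step above, a minimal way to compare the data captured during inference with the training data is to look at basic statistics of each, as sketched below; a large mismatch in mean or spread suggests the field data differs from what the model was trained on. The variable names (`train_examples`, `captured_examples`) are placeholders.

```python
import numpy as np

def summarize(name, data):
    """Print basic statistics so training data and field data can be compared."""
    data = np.asarray(data, dtype=np.float32)
    print(f"{name}: mean={data.mean():.3f}  std={data.std():.3f}  "
          f"min={data.min():.3f}  max={data.max():.3f}")

# Compare the examples used for training with examples captured on the device.
summarize("training data", train_examples)
summarize("inference data", captured_examples)
```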