Some time ago I wrote an article comparing the performance of the TensorFlow runtime with the ONNX runtime. In that article I showed how to greatly improve inference performance by simply converting a TensorFlow neural network to ONNX and running it with the ONNX runtime. What I didn’t explain is what ONNX actually is; this article will go into that. ONNX, which stands for Open Neural Network Exchange, is more than just a runtime that makes your neural networks run faster: it is an open ecosystem for training, running and sharing neural networks. It is also a community project backed by big companies like Microsoft and Facebook.
But what does “open ecosystem” mean? The term ecosystem has always been vague to me, so I will explain how I see ONNX. ONNX simply makes it possible to use your neural network across different tools and libraries. This means that you can, for example, create your model in tool A, train it in tool B and deploy it with tool C. The ONNX ecosystem makes this possible: it provides a common format and converter tools, so you can always import and export ONNX files from your favorite neural network tool or library.
What is ONNX used for?
Basically, ONNX ensures interoperability between all of your favorite tools. But there is more: ONNX also provides a system for optimizing your models so that they run more efficiently. This happens largely automatically, and it is generally independent of the eventual runtime. The optimizations are quite technical, but in simple terms the optimizers remove redundant and unneeded parts of your network, making it more performant. The performance comparison in my earlier article shows the difference clearly: without any effort, simply converting my network already yielded big performance gains (the TensorFlow converter applies optimizations automatically).
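To make “removing redundant and unneeded parts of your network” concrete, here is a toy sketch of two classic graph optimizations: constant folding (precomputing nodes whose inputs are all constants) and dead-node elimination (dropping nodes that never feed the graph outputs). This is not the real ONNX optimizer, and the node representation is invented for the example; real ONNX models are protobuf graphs of typed operators.

```python
def optimize(nodes, outputs):
    """Fold constant Add nodes and drop nodes that don't feed the outputs.

    A node is a dict: {"op", "inputs", "output", "value"} (toy format,
    invented for this illustration).
    """
    # Constant folding: evaluate Add nodes whose inputs are all constants.
    constants = {n["output"]: n["value"] for n in nodes if n["op"] == "Const"}
    folded = []
    for n in nodes:
        if n["op"] == "Add" and all(i in constants for i in n["inputs"]):
            value = sum(constants[i] for i in n["inputs"])
            constants[n["output"]] = value
            folded.append({"op": "Const", "inputs": [],
                           "output": n["output"], "value": value})
        else:
            folded.append(n)

    # Dead-node elimination: keep only nodes reachable from the outputs,
    # walking the graph backwards from the outputs to their inputs.
    needed = set(outputs)
    kept = []
    for n in reversed(folded):
        if n["output"] in needed:
            kept.append(n)
            needed.update(n["inputs"])
    return list(reversed(kept))


graph = [
    {"op": "Const", "inputs": [], "output": "a", "value": 2},
    {"op": "Const", "inputs": [], "output": "b", "value": 3},
    {"op": "Add", "inputs": ["a", "b"], "output": "c", "value": None},   # foldable
    {"op": "Add", "inputs": ["x", "c"], "output": "y", "value": None},   # depends on real input x
    {"op": "Add", "inputs": ["a", "a"], "output": "u", "value": None},   # dead: u is never used
]
optimized = optimize(graph, outputs=["y"])
```

After optimization, the five-node graph shrinks to two nodes: a precomputed constant `c = 5` and the one Add that depends on the runtime input `x`. The same principle, applied to hundreds of operators, is what makes an optimized network cheaper to run.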
In practice, you will probably not use ONNX directly; the ONNX format itself is complicated, and you do not want to write models in it by hand. Instead, you will use a library of your preference, for example Keras or PyTorch, to create and experiment with your neural networks, maybe on your local machine or somewhere else. Then, when you are satisfied with the performance of the network, you can deploy it on a platform like UbiOps. Between these two systems, ONNX serves as the intermediate format. If in the future one of these systems gets replaced, a switch from Keras to PyTorch for example, there will be no hassle, because ONNX is supported by all of them.
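The value of an intermediate format is that the consuming side needs no knowledge of the tool that produced the model. The following toy sketch illustrates just that principle: “tool A” serializes a tiny linear model to an exchange string, and “tool B” loads and runs it. In reality, exporters such as tf2onnx (for Keras/TensorFlow) or PyTorch’s built-in ONNX export produce a protobuf graph, and the ONNX runtime consumes it; the JSON format and function names here are invented for illustration.

```python
import json

def export_model(weights, bias):
    """'Tool A': serialize a linear model y = w.x + b to an exchange string."""
    return json.dumps({"op": "linear", "weights": weights, "bias": bias})

def run_model(serialized, x):
    """'Tool B': load the exchanged model and run inference, knowing
    nothing about the tool that produced it."""
    model = json.loads(serialized)
    assert model["op"] == "linear"
    return sum(w * xi for w, xi in zip(model["weights"], x)) + model["bias"]

exchanged = export_model(weights=[0.5, -1.0], bias=2.0)
prediction = run_model(exchanged, x=[4.0, 1.0])  # 0.5*4 - 1.0*1 + 2 = 3.0
```

Swapping out “tool A” for another exporter changes nothing on the consuming side, which is exactly why replacing Keras with PyTorch later causes no hassle.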
How to get started with ONNX?
Finally, I want to give some pointers on how to get started with ONNX yourself. There is of course the article I wrote some time ago that explains how to convert a Keras model to ONNX and run it on our platform UbiOps. But there are also great examples and tutorials on the ONNX GitHub page. Furthermore, I think the list of ONNX converters and the model zoo (a collection of popular pre-trained models in the ONNX format) are especially useful for getting started.
Raoul Fasel, UbiOps