Add Microsoft to the list of companies declaring they’re all in for AI. At its Developer Day we even heard that the company is going to be an AI-first platform, although I’m not quite sure what that is supposed to mean. However, there were plenty of announcements to put some meat behind the hype. We’ll take you through some of the most important ones and what they’re likely to mean for the future of AI-enabled Windows applications.
Microsoft Parries Google’s CloudML With Its Own ML Tools
Google has made it remarkably easy to develop a model locally, especially in TensorFlow; train it on the Google Cloud using CloudML; and then run it just about anywhere using TensorFlow, TensorFlow Lite, or the Nvidia-optimized TensorRT. That effort has close ties to Nvidia GPUs, so it wasn’t too surprising that Nvidia’s new GPU foe, Intel, and its Movidius VPU, were front and center as Microsoft launched an array of new AI-friendly development and runtime offerings at its Developer Day.
Microsoft’s offerings start with the Azure Machine Learning Workbench and AI Tools for Visual Studio. The ML Workbench lets you use your choice of several machine learning frameworks, including TensorFlow and Caffe, along with a container framework like Docker, to develop ML systems that can be trained in the Azure Cloud and then deployed throughout the Windows ecosystem as ONNX models. It also includes a Studio application that supports drag-and-drop creation of models. After playing with IBM’s similar tool and being disappointed, I’m curious whether the Studio environment is powerful enough to be a tool of choice in real-world situations. Certainly the Workbench will be helpful for Windows developers who need large-scale computing for training models.
Training, Validation, and Inferencing
Training is the most processor-intensive part of building a machine learning system. Typically a massive amount of pre-labeled data is fed into a prototype model, and a machine learning tool tries to optimize the parameters of the model to closely match its own results to the supplied labels. (Essentially you give the ML system a bunch of questions along with the correct answers and have it tune itself until it gets a great score.) Serious model builders leave some of the training data out, and then use it to validate the model, in parallel with training.
Validation helps detect a condition called over-fitting, where the model is basically just learning all the supplied data (think of it as memorizing the test results instead of learning anything about the subject). Once the model succeeds in becoming accurate enough for the intended use, it’s ready for deployment. If it can’t be trained successfully, it’s back to the drawing board, with either the model’s design or the way features are pulled from the data needing to be changed. In the case of gesture recognition for the Kinect, it took many months of iterations before the developers figured out the right way to look at the camera’s data and build a successful model.
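To make the over-fitting idea concrete, here is a minimal sketch in Python. It is not taken from any Microsoft tool: the synthetic data, the 700/300 split, and the two toy "models" (a least-squares line and a lookup table that memorizes every training point) are all assumptions chosen for illustration. The thing to watch is the gap between training error and validation error.

```python
import random

def make_data(n, rng):
    """Noisy samples of y = 2x + 1 (synthetic data, an assumption for this sketch)."""
    xs = [rng.uniform(0, 10) for _ in range(n)]
    return [(x, 2 * x + 1 + rng.gauss(0, 0.5)) for x in xs]

def mse(predict, data):
    """Mean squared error of a prediction function over (x, y) pairs."""
    return sum((predict(x) - y) ** 2 for x, y in data) / len(data)

def fit_line(train):
    """Ordinary least squares for a straight line: learns the overall trend."""
    n = len(train)
    mean_x = sum(x for x, _ in train) / n
    mean_y = sum(y for _, y in train) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in train)
             / sum((x - mean_x) ** 2 for x, _ in train))
    return lambda x: slope * (x - mean_x) + mean_y

def fit_memorizer(train):
    """1-nearest-neighbour lookup: memorizes every training answer, noise included."""
    return lambda x: min(train, key=lambda p: abs(p[0] - x))[1]

rng = random.Random(0)
data = make_data(1000, rng)
train, val = data[:700], data[700:]   # hold 300 points out for validation

line = fit_line(train)
memo = fit_memorizer(train)
results = {
    "line": (mse(line, train), mse(line, val)),
    "memorizer": (mse(memo, train), mse(memo, val)),
}
for name, (train_err, val_err) in results.items():
    print(f"{name:10s} train MSE = {train_err:.3f}   validation MSE = {val_err:.3f}")
```

The memorizer scores perfectly on the data it was trained on and noticeably worse on the held-out points, which is the signature of over-fitting. The simple line has similar error on both sets, which suggests it learned the subject rather than the answer key.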
Microsoft execs used the term “evaluation” quite a bit to refer to what I’ve more typically heard described as inferencing (or prediction), which is where the rubber meets the road. It’s when actual data is fed to the model and it makes some decision or creates some output — when your camera or phone tries to detect a face, for example, or perhaps a specific face, when looking at a scene.
Inferencing doesn’t need the same horsepower as training, although it certainly benefits from both GPUs and custom silicon like Intel’s Movidius VPU and Google’s TPU. Typically you also want inferencing to happen very quickly, and the results are used locally, so having it available right on your computer, phone, or IoT appliance is optimal. To make this happen, Microsoft has collaborated with Facebook, Amazon, and others on ONNX, a standard format for model interchange. ONNX models can be created with Microsoft’s new AI development tools and deployed on upcoming versions of Windows using WinML.
As someone who develops neural networks in Visual Studio, I was excited to hear about the AI Tools for Visual Studio. Unfortunately, the only new piece seems to be tighter integration with Azure and its new AI-specific VMs. That’s pretty cool, and if you need to scale training up quickly, it’ll save you some manual labor, but it doesn’t seem to add any new capabilities. The Azure AI VMs also aren’t cheap. A single P40 GPU is $2/hour unless you make a large commitment. For one relatively simple audio classification model I’m working on, that means roughly $10 to $12 for each full training pass, which currently takes about six hours on my over-clocked Nvidia GTX 1080 GPU.
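The cost arithmetic is simple enough to sketch. The snippet below assumes flat per-hour billing at the $2 rate quoted above, with no commitment discount. The function names are illustrative and not part of any Azure API.

```python
def training_cost(hours, rate_per_hour=2.0):
    """Dollar cost of one training pass at a flat hourly rate (assumed billing model)."""
    return hours * rate_per_hour

def hours_for_budget(budget, rate_per_hour=2.0):
    """How many GPU hours a given dollar budget buys at that rate."""
    return budget / rate_per_hour

print(training_cost(6))       # 12.0: a pass that still takes six hours on the P40
print(hours_for_budget(10))   # 5.0: a $10 pass implies about five hours of GPU time
```

In other words, a pass only comes in at $10 if the P40 finishes it in about five hours. If it takes the same six hours as the GTX 1080, the bill is $12, and a few dozen experimental passes add up quickly either way.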
Pre-trained Models Are a Big Deal
Training models sucks. You either wait forever or spend a ton renting many GPUs in the cloud and running a parallelized version of your model. Traditionally, every modeling effort trained its model from scratch. Then developers noticed something really interesting: a model trained for one task might be really good at a bunch of other tasks. For example, one project at Stanford uses a standard image recognition model for evaluating camera designs. The advantage of this is that you skip the headaches of organizing labeled data, and the time and expense of training the model, which can run to days or weeks.
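The reuse pattern can be sketched in a few lines of Python. Everything here is a stand-in: `pretrained_features` is a hand-written function playing the role of a frozen, already-trained network, and the two-ring dataset is made up. The point is only that the expensive part is left untouched while a tiny "head" is fitted on a handful of labeled examples for the new task.

```python
import math

def pretrained_features(point):
    """Hypothetical frozen feature extractor; it is never retrained here.
    Maps a raw (x, y) input to two norm-like features."""
    x, y = point
    return (math.hypot(x, y), abs(x) + abs(y))

def ring(radius, angles):
    """Points on a circle of the given radius (toy data for the new task)."""
    return [(radius * math.cos(a), radius * math.sin(a)) for a in angles]

def fit_head(labeled):
    """Fit the only trainable part: one centroid per label in feature space."""
    sums, counts = {}, {}
    for point, label in labeled:
        feats = pretrained_features(point)
        acc = sums.setdefault(label, [0.0] * len(feats))
        for i, value in enumerate(feats):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: tuple(v / counts[label] for v in acc)
            for label, acc in sums.items()}

def predict(head, point):
    """Classify by the nearest centroid in the frozen feature space."""
    feats = pretrained_features(point)
    return min(head, key=lambda label: math.dist(feats, head[label]))

# Only four labeled examples per class are needed for the new task.
few_angles = [0.3, 1.7, 3.1, 4.5]
labeled = ([(p, "inner") for p in ring(1.0, few_angles)]
           + [(p, "outer") for p in ring(3.0, few_angles)])
head = fit_head(labeled)

# Evaluate on points the head never saw.
test_angles = [k * 2 * math.pi / 50 for k in range(50)]
test_set = ([(p, "inner") for p in ring(1.0, test_angles)]
            + [(p, "outer") for p in ring(3.0, test_angles)])
accuracy = sum(predict(head, p) == label for p, label in test_set) / len(test_set)
print(f"accuracy with a frozen extractor and 8 labeled points: {accuracy:.2f}")
```

With a real pre-trained network the extractor would be far larger and the head would usually be a small trainable layer rather than a centroid lookup, but the division of labor is the same.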
Whether you train a model from scratch or are able to use one that’s already trained, having access to a library of models in a standard interchange format will be a great productivity boost for Windows developers.
It’s Not Just About the Cloud Anymore: Local Deployment
WinML is the new runtime layer that will allow deployment of ONNX models on every edition of Windows by the end of 2018. It can be used from both Win32 and Windows Store apps, and relies on DirectX 12 to implement acceleration on the GPU. That’s an interesting difference from many machine learning systems, which rely heavily on Nvidia’s CUDA, and of course makes it easier to partner closely with Intel. Microsoft gave a compelling demo of using an already-trained model in a Visual Studio project. It looks straightforward, at least as long as you’re using C++ or C#. Using the Movidius chip, or perhaps high-end SoCs, Microsoft is also looking forward to running ONNX models on IoT devices, starting with HoloLens but also including embedded camera systems and other appliances.
With Microsoft locked in a battle for cloud supremacy with Google and Amazon, and counting Windows developers as one of its biggest assets in the fight, it makes perfect sense for it to make a massive push into state-of-the-art AI development tools that integrate with both Windows and Azure. Similarly, as Microsoft works to advance its own Windows and Cloud services like photo sharing, it will benefit from having a high-performance AI toolset for its own developers.