Using Azure ML Workbench
Azure Machine Learning Workbench is a front-end for a variety of tools and services, including the Azure Machine Learning Experimentation and Model Management services.
Workbench is a relatively open toolkit. First, you can use almost any Python-based machine learning framework, such as TensorFlow or scikit-learn. Second, you can train and deploy your models either on-premises or on Azure.
Workbench also includes a great data-preparation module. It has a drag-and-drop interface that makes it easy to use, but its features are surprisingly sophisticated.
In this course, you will learn how Workbench interacts with the Experimentation and Model Management services, and then you will follow hands-on examples of preparing data, training a model, and deploying a trained model as a predictive web service.
- Prepare data for use by an Azure Machine Learning Workbench experiment.
- Train a machine learning model using Azure Machine Learning Workbench.
- Deploy a model trained in Azure Machine Learning Workbench to make predictions.
- Anyone interested in Azure’s machine learning services
- Introduction to Azure Machine Learning Studio course or basic machine learning experience.
- Python experience.
- Azure account recommended (sign up for a free trial if you don’t already have an account).
Once you’ve trained a model and you’re happy with its accuracy, you’ll likely want to deploy it as a predictive service.
Fortunately, the iris model we trained in the previous lesson was already saved by the training script.
In the Run List, open your last run. In the Outputs section, you should see three files: the confusion matrix, the trained model, and the ROC graph. The model was saved as a pickle file, which is a serialized Python object structure. Check the box next to “model.pkl” and click “Download”. Put it in your project directory.
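To make "serialized Python object structure" concrete, here is a minimal sketch of a pickle round trip. It uses a plain dictionary as a hypothetical stand-in for the trained model (the real model.pkl holds a scikit-learn estimator), and the names `fake_model` and `round_trip` are made up for illustration.

```python
import pickle

# Hypothetical stand-in for a trained model: any picklable Python object works.
fake_model = {"threshold": 2.5, "labels": ["Iris-setosa", "other"]}

def round_trip(obj):
    """Serialize an object to bytes, then rebuild an equal object from them."""
    payload = pickle.dumps(obj)   # roughly what a training script writes to model.pkl
    return pickle.loads(payload)  # roughly what a scoring script reads back

restored = round_trip(fake_model)
print(restored == fake_model)
```

One caution that applies to any pickle file: only unpickle files you trust, because loading a pickle can execute arbitrary code.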
To deploy your model, you also need two other files: a schema file that tells the service what input it needs to get to make a prediction, and a scoring script that knows how to use the model to make a prediction. The scoring script needs to include two functions: init and run. The init function should load the trained model. The run function should take a new data record as input and then return the model’s prediction for that data.
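The init/run contract described above can be sketched roughly like this. It is not the template's score_iris.py: the "model" is a hypothetical dictionary with a petal-length threshold, and the record format is an assumption chosen to keep the example self-contained.

```python
import json
import os
import pickle
import tempfile

model = None  # filled in by init()

def init(model_path="model.pkl"):
    """Load the trained model once, when the service starts."""
    global model
    with open(model_path, "rb") as f:
        model = pickle.load(f)

def run(record):
    """Score one record and return the prediction as a JSON string."""
    # Stand-in logic: the hypothetical "model" is just a petal-length threshold.
    label = "Iris-setosa" if record["petal length"] < model["threshold"] else "other"
    return json.dumps(label)

# Local smoke test: write a stand-in model file, load it, and score one record.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "model.pkl")
    with open(path, "wb") as f:
        pickle.dump({"threshold": 2.5}, f)
    init(path)
    print(run({"petal length": 1.3}))
```

The point of splitting the work this way is that init runs once at startup, so the cost of loading the model isn't paid on every request.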
The iris project template contains a script that takes care of both of these requirements. When you run it, it creates a schema file. The script also contains the init and run functions that the service needs.
The script is called score_iris.py. Let’s scroll down to see what the main function does. This code generates the schema file by handing some sample data to the generate_schema function. It saves the schema file to the special outputs directory, so we’ll be able to download it from the run results page.
This code tests the init and run functions. It feeds a sample data record to the model and prints the prediction so we can check it in the output of the job.
The init function basically just loads the model from the pickle file. The run function is also pretty simple, with these three lines of code getting a prediction from the model and returning it. The only complication is that because the training script added 40 random features to the data, we have to do that here too. That’s what all of this code is for.
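Here is a rough sketch of the "add 40 random features" step, so you can see why the input grows from 4 columns to 44. It uses Python's standard random module with an assumed fixed seed to stay dependency-free; the template's own code may well do this differently (for example with NumPy), and the function name is made up.

```python
import random

N_RANDOM_FEATURES = 40  # the count mentioned in the lesson
SEED = 0                # assumed fixed seed so the noise is reproducible

def add_random_features(rows, n=N_RANDOM_FEATURES, seed=SEED):
    """Append n Gaussian noise values to every row, keeping the original values first."""
    rng = random.Random(seed)
    return [list(row) + [rng.gauss(0.0, 1.0) for _ in range(n)] for row in rows]

sample = [[3.0, 3.6, 1.3, 0.25]]  # one hypothetical iris record with four measurements
widened = add_random_features(sample)
print(len(sample[0]), "->", len(widened[0]))
```

Whatever the implementation, the scoring side has to produce the same number of columns as the training side, or the model will reject the input.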
OK, let’s run it. Alright, now check the output log. You can ignore the error message. It just says that we’re running the script locally instead of in Docker mode, so data collection was disabled. It came back with a prediction and it says it generated the schema.
Now if we bring up the full output page, the schema file does, in fact, show up in the Outputs section. Check the box and download it. Now we have all three files in our project directory: the trained model, the schema file, and the scoring script.
You can deploy the prediction service either locally or in the Azure Container Service. If you deploy it locally, then you have to have Docker installed. I’m going to deploy locally in this demo, so if you don’t already have Docker installed, then please install it. Luckily, it’s quite simple to install.
Once you have Docker installed and running, there are still many steps you need to go through to do your first deployment. You need to make sure the Microsoft.ContainerRegistry resource provider is registered in your subscription, create the deployment environment, activate your Model Management account, configure the deployment environment, and finally, create the predictive web service.
First, you need to make sure the Microsoft.ContainerRegistry resource provider is registered in your subscription. In the Azure Portal, search for “Subscriptions”. Then click on your subscription and select “Resource providers” from the Settings menu. Scroll down to “Microsoft.ContainerRegistry” and make sure it’s registered. If it’s not, then click “Register”, and wait for the registration process to finish.
You need to do the rest of the tasks from the command line, so go to the File menu in Workbench and select “Open Command Prompt”.
The first task is to create the deployment environment, which includes a storage account for storing Docker images, an Azure Container Registry that lists the Docker images, and an Application Insights account that gathers telemetry.
To create the deployment environment, use the “az ml env setup” command, which you can copy from the GitHub README. The -n argument says what you want to name the environment. I called it “local”. The location argument specifies the Azure region. You should set it to the same region as you did when you created your Experimentation account. Note that only some Azure regions are supported. I set it to “westcentralus”.
If you wanted to deploy to a cluster on the Azure Container Service, then you would just add a “-c” flag, but we’re not going to do that here.
OK, it started creating the environment and it helpfully gave us a couple of commands we can use. This one will show you the information about the environment. Notice that its "Provisioning State" is "Creating". So it’s not ready yet. While we’re waiting for it to finish, we can do another task.
Set your Model Management account as active with this command. For the -n argument, put in your own Model Management account name that you created in the Installation lesson. For the -g argument, put in the resource group that you chose when you created your Model Management account.
Now run the “az ml env show” command again to see if the deployment environment is ready yet. Good, it is. Now you can configure your environment with this command. This sets the environment as the active deployment environment and it also specifies the resource group that it’s in. To verify that you configured it correctly, run the “az ml env show” command again, but without any arguments this time. Now it will show the deployment environment that’s active. It looks good.
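To recap the command-line steps so far, here is a small sketch that assembles them in order. The flag spellings are recalled from the Workbench preview CLI rather than taken from this lesson, so treat them as assumptions and check them against the README. The account and resource group names are hypothetical placeholders.

```python
import shlex

def deployment_commands(env_name, region, mm_account, mm_resource_group, env_resource_group):
    """Return the environment-related az commands, in the order the lesson runs them."""
    return [
        # 1. Create the deployment environment (storage, registry, telemetry).
        ["az", "ml", "env", "setup", "-n", env_name, "--location", region],
        # 2. Make your Model Management account the active one.
        ["az", "ml", "account", "modelmanagement", "set", "-n", mm_account, "-g", mm_resource_group],
        # 3. Once provisioning has finished, make the environment the active one.
        ["az", "ml", "env", "set", "-n", env_name, "-g", env_resource_group],
        # 4. With no arguments, show the active environment to verify.
        ["az", "ml", "env", "show"],
    ]

# All names below other than "local" and "westcentralus" are placeholders.
for cmd in deployment_commands("local", "westcentralus", "my-mm-account", "my-mm-rg", "my-env-rg"):
    print(shlex.join(cmd))
```

The sketch only prints the commands. You would still paste them into the command prompt one at a time, waiting for the environment to finish provisioning before step 3.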
There’s just one more thing we’re going to do before we create the web service. You only need to do this if you want the web service to store all the data that gets input to the service, as well as the predictions that the service returns. To make it work, you have to set an environment variable to the Azure Storage connection string for where you want to store the data.
To find out which storage account is in use, type “az ml env show -v”. Here’s the name of the account. If you’re on a Mac, type ‘export’. If you’re on Windows, type ‘set’. Then all in upper-case, type ‘AML_MODEL_DC_STORAGE=’. On a Mac, type a quote, but on Windows, don’t. To get the connection string, go to the Azure Portal and click on “Storage accounts”. Find the account that you got back from the command. Then select "Access keys" in the Settings menu. I’ve blacked out the keys in this video for security reasons. Click the copy button next to the connection string for key1. Then go back to the command line, paste the connection string, and add a trailing quote if you’re on a Mac.
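The Mac-versus-Windows quoting rule is easy to get wrong, so here is a small sketch that builds the right line for each platform. The function name is made up, and the connection string is a placeholder in the general shape Azure Storage uses, not a real one.

```python
def storage_env_command(connection_string, windows):
    """Build the shell line that sets AML_MODEL_DC_STORAGE for the given platform."""
    name = "AML_MODEL_DC_STORAGE"
    if windows:
        # cmd.exe takes everything after '=' literally, so quotes would become part of the value.
        return f"set {name}={connection_string}"
    # In a Mac shell, the semicolons in a connection string would end the command unless quoted.
    return f'export {name}="{connection_string}"'

# Placeholder only: paste your own connection string from the Access keys page.
placeholder = (
    "DefaultEndpointsProtocol=https;AccountName=<account>;"
    "AccountKey=<key>;EndpointSuffix=core.windows.net"
)
print(storage_env_command(placeholder, windows=False))
print(storage_env_command(placeholder, windows=True))
```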
OK, now we can finally create the predictive web service. Here’s the ridiculously long command to do that. The first three arguments specify the scoring script, the trained model, and the schema. Then we say what name we want the service to be called. The -r argument specifies the runtime for the web service. At the moment, there are only two possible choices: python and spark-py. The final option is where we specify that we do want to collect input and prediction data, and save it in Azure Storage.
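Since the command isn't spelled out in the transcript, here is a sketch of how its pieces fit together. The flag spellings are recalled from the Workbench preview CLI and are assumptions to verify against the README. The schema file name "service_schema.json" and the service name "irisapp" are hypothetical.

```python
import shlex

ALLOWED_RUNTIMES = ("python", "spark-py")  # the two choices named in the lesson

def service_create_command(script, model_file, schema, name, runtime="python", collect=True):
    """Assemble the realtime service creation command as an argument list."""
    if runtime not in ALLOWED_RUNTIMES:
        raise ValueError(f"runtime must be one of {ALLOWED_RUNTIMES}")
    return [
        "az", "ml", "service", "create", "realtime",
        "-f", script,                # scoring script
        "--model-file", model_file,  # trained model
        "-s", schema,                # schema file
        "-n", name,                  # service name
        "-r", runtime,               # runtime
        "--collect-model-data", "true" if collect else "false",
    ]

print(shlex.join(service_create_command(
    "score_iris.py", "model.pkl", "service_schema.json", "irisapp")))
```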
First, it registers the trained model and creates a manifest for it. Next, it creates a Docker image and puts your model, schema, and scoring script in it. This will take a while, so I’ll fast forward. Then it pushes the image to the Azure Container Registry. After that, it pulls the image down to your computer and uses it to start a Docker container. Finally, it creates an endpoint for the web service.
It also gives us a couple of commands to help us use the service. Copy and paste this one to get usage information. This is where the web service is running on your computer. If you need to do some debugging, then you can use this command to look at the logs. And this is a command you can use to feed sample data into the service and get a prediction. Copy and paste that one. It comes back quite quickly with a prediction of “Iris setosa”.
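If you'd rather build the sample input yourself than copy it, here is a sketch of what that might look like. The "input_df" key, the column names, and the sample values are assumptions based on the iris template, and the "az ml service run realtime" flags are recalled from the preview CLI, so check them against what the tool prints for you.

```python
import json
import shlex

def sample_payload(sepal_length, sepal_width, petal_length, petal_width):
    """Wrap one iris record in the JSON shape the service is assumed to expect."""
    record = {
        "sepal length": sepal_length,
        "sepal width": sepal_width,
        "petal length": petal_length,
        "petal width": petal_width,
    }
    return json.dumps({"input_df": [record]})

def service_run_command(service_id, payload):
    """Assemble the command that sends the payload to a deployed service."""
    return ["az", "ml", "service", "run", "realtime", "-i", service_id, "-d", payload]

payload = sample_payload(3.0, 3.6, 1.3, 0.25)
# "<service id>" is a placeholder for the id printed when the service was created.
print(shlex.join(service_run_command("<service id>", payload)))
```

Notice that the payload only has the four real measurements. The 40 random features are added inside the run function, not by the caller.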
If you want to run the service from an application, you’ll need to get the authentication keys, using this command. However, this command will only work if you’ve deployed the web service on Azure. It won’t work with a local deployment.
Remember how we set the collect-model-data argument to true when we created the web service? Now we can have a look at the data it collected. In the Azure Portal, go back to your storage account and select “Containers” in the Blob Service menu. There should be a container called “modeldata”. Click on it. You probably won’t see any data yet, because it can take up to 10 minutes before it shows up. I’ll fast forward.
Once it’s there, click on the folder and keep clicking until you see the inputs and prediction folders. Drill down on the inputs folder. There’s a data.csv file. Click on it and then click “Download”. This is the data we sent to the web service to test it, except that there are a bunch of extra columns. Those are the 40 random features that it added to make the input match what the model expected.
Predictions are also stored in a csv file.
And that’s it for this lesson.
About the Author
Guy launched his first training website in 1995 and he's been helping people learn IT technologies ever since. He has been a sysadmin, instructor, sales engineer, IT manager, and entrepreneur. In his most recent venture, he founded and led a cloud-based training infrastructure company that provided virtual labs for some of the largest software vendors in the world. Guy’s passion is making complex technology easy to understand. His activities outside of work have included riding an elephant and skydiving (although not at the same time).