DEMO: Creating and Configuring Your Custom Vision Model

This course explores the Azure Custom Vision service and how you can use it to create and customize vision recognition solutions. You'll get some background info on what the service is before looking at the various steps for creating image classification and object detection models, uploading and tagging images, and then training and deploying your models.

Learning Objectives

  • Use the Custom Vision portal to create new Vision models
  • Create both Classification and Object Detection models
  • Upload and tag images according to your requirements
  • Train and deploy the configured models

Intended Audience

  • Developers or architects who want to learn how to use Azure Custom Vision to tailor a vision recognition solution to their needs


To get the most out of this course, you should have:

  • Basic Azure experience, at least with topics such as subscriptions and resource groups
  • Basic knowledge of Azure Cognitive Services, especially vision-related services
  • Some developer experience, including familiarity with terms such as REST API and SDK

So, we're back here in the Azure Custom Vision portal, picking up where we left off in the previous demo. Because I've already given so many examples of landmarks, let's create a project related to them. I'll name this project ESvsETOD, and then in the description I'll paste "checks if picture is the Empire State Building or the Eiffel Tower."

For the resource, I'll select the one that we already created in the previous demo, CACustomVision. This is meant to be a classification project, but first I want to show you the differences between Object Detection and Image Classification.

So, I'll start with Object Detection here. Notice that the options for classification types disappear. As there's no Landmarks domain for object detection, I'll just leave General A1 selected and click on Create Project. After a few seconds, the new project is automatically opened for me.

Now, I'm being asked to add images. So, I'll click on the Add Images button, navigate to the Eiffel Tower pictures, and then add them. Then I'll click on the button to upload the files, and after a few seconds, I'll click on Done and here are my pictures. Let's select the first one, and if I mouse over the picture, notice that it automatically selects the Eiffel Tower for me. That's neat. I'll just click on the selected region, and in the dialog box that opens, I'll type Eiffel Tower and then click on the Add button. Then I'll close this dialog box, and now notice that the image has disappeared from this list. That's because the image moved from the Untagged area to the Tagged one.

If I click here on Tagged, here's the image that I have just tagged. See? I'll now switch back to the Untagged tab, click on the next image, and again click on the selected region. But this time I'll try to add Negative as the tag. Notice that this is not recognized by Custom Vision. That's because Negative tags apply only to Classification projects. The absence of the Negative tag, the ability to create bounding boxes around the selected objects, and the small differences in the available domains are probably the most noticeable differences between object detection and classification. As I mentioned before, this was supposed to be a classification model anyway.
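If you ever script this tagging step instead of clicking in the portal, the bounding box is usually expressed as fractions of the image size rather than pixels. Here's a minimal sketch of that conversion. The `NormalizedRegion` type, the `normalize_box` helper, and the pixel values are all hypothetical and exist only for illustration, and I'm assuming the service expects left, top, width, and height in the 0 to 1 range, so check the API reference before relying on it.

```python
from typing import NamedTuple


class NormalizedRegion(NamedTuple):
    """A tagged bounding box as fractions of the image size (hypothetical type)."""
    tag: str
    left: float
    top: float
    width: float
    height: float


def normalize_box(tag, x, y, box_w, box_h, image_w, image_h):
    """Convert a pixel bounding box into fractions of the image dimensions."""
    if image_w <= 0 or image_h <= 0:
        raise ValueError("image dimensions must be positive")
    if x < 0 or y < 0 or box_w <= 0 or box_h <= 0:
        raise ValueError("box must have a non-negative origin and positive size")
    if x + box_w > image_w or y + box_h > image_h:
        raise ValueError("box must lie inside the image")
    return NormalizedRegion(
        tag, x / image_w, y / image_h, box_w / image_w, box_h / image_h
    )


# Made-up numbers: a 400x600 pixel box inside an 800x1000 pixel photo.
region = normalize_box(
    "Eiffel Tower", x=200, y=100, box_w=400, box_h=600, image_w=800, image_h=1000
)
print(region)
```

Working in fractions means the same region stays valid if the image is resized, which is one reason this representation is common for detection labels.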

So, let's create another model with the correct settings. I'll close this dialog box, click on the eye button at the top left to go back to my projects list, and delete this Object Detection project, as I no longer need it. Then I'll click to create another one. I'll name this project ESvsET, paste the same description, and leave the same Cognitive Services resource selected as before. But this time, I'll set this as a Classification project. And as it's very unlikely that we'll ever have the Empire State Building and the Eiffel Tower in the same picture, I'll set this as Multiclass, which has a single tag per image.

Also, notice that I have a Landmarks domain. So, I'll select this one, click on Create Project, and after a few seconds, my newly created project appears on the screen. I'll click again on Add Images, select all my Eiffel Tower pictures, and click Open. But notice that there's already a difference in the interface. Because this is a Classification model, there's no need for me to draw bounding boxes. So, the interface can ask me for the tag for these images right away.
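To keep the two sets of choices straight, here's a minimal sketch that captures the rules we just saw in the dialog: classification types only exist for Classification projects, and there's no Landmarks domain for Object Detection. The `ProjectSettings` class is hypothetical and is not part of any Azure SDK. It only mirrors the portal's options as they appear in this demo.

```python
from dataclasses import dataclass
from typing import Optional

DESCRIPTION = "checks if picture is the Empire State Building or the Eiffel Tower"


@dataclass(frozen=True)
class ProjectSettings:
    """Hypothetical record of the choices made in the project creation dialog."""
    name: str
    description: str
    project_type: str                           # "Classification" or "ObjectDetection"
    domain: str                                 # e.g. "Landmarks" or "General A1"
    classification_type: Optional[str] = None   # "Multiclass" or "Multilabel"

    def __post_init__(self):
        if self.project_type not in ("Classification", "ObjectDetection"):
            raise ValueError(f"unknown project type: {self.project_type!r}")
        if self.project_type == "ObjectDetection":
            # The classification type options disappear for object detection.
            if self.classification_type is not None:
                raise ValueError("classification type does not apply to object detection")
            # As seen in the demo, object detection has no Landmarks domain.
            if self.domain == "Landmarks":
                raise ValueError("there is no Landmarks domain for object detection")
        elif self.classification_type not in ("Multiclass", "Multilabel"):
            raise ValueError("classification projects need Multiclass or Multilabel")


# The two projects created in this demo.
detection = ProjectSettings("ESvsETOD", DESCRIPTION, "ObjectDetection", "General A1")
classification = ProjectSettings(
    "ESvsET", DESCRIPTION, "Classification", "Landmarks", "Multiclass"
)
print(detection)
print(classification)
```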

Also, when I click on the text box, notice another difference from Object Detection. I get the option to tag these images as Negative. See? But these are not negative images. So, let's type Eiffel Tower here and click on the upload button. After a few seconds, I'll click again on Add Images and do the same operation with the Empire State Building pictures, tagging them appropriately.

Finally, I'll click on Add Images once again, and this time I'll add pictures of a few other landmarks and tag them as Negative, to give the model examples of pictures that are neither the Eiffel Tower nor the Empire State Building.
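If you had many more pictures than I do here, you'd probably prepare the uploads in code. Here's a minimal sketch of the bookkeeping for the three uploads above: it checks the Multiclass rule of one tag per image and then splits the work into batches. The file names are invented, both helper functions are hypothetical, and the batch size of 64 is my assumption about the per-request upload limit, so confirm it against the current service limits.

```python
from itertools import islice

BATCH_SIZE = 64  # assumption: maximum number of images per upload request


def build_upload_plan(images_by_tag):
    """Return (filename, tag) pairs, enforcing a single tag per image (Multiclass)."""
    tag_for_image = {}
    for tag, filenames in images_by_tag.items():
        for filename in filenames:
            if filename in tag_for_image:
                raise ValueError(
                    f"{filename} is tagged both {tag_for_image[filename]!r} and {tag!r}"
                )
            tag_for_image[filename] = tag
    return list(tag_for_image.items())


def batches(items, size=BATCH_SIZE):
    """Yield consecutive chunks of at most `size` items."""
    iterator = iter(items)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


# Invented file names standing in for the three groups uploaded in the demo.
plan = build_upload_plan({
    "Eiffel Tower": ["eiffel_01.jpg", "eiffel_02.jpg"],
    "Empire State Building": ["empire_01.jpg", "empire_02.jpg"],
    "Negative": ["colosseum_01.jpg", "big_ben_01.jpg"],
})
for batch in batches(plan):
    print(len(batch), "images in this batch")
```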

Okay. So, my project is pretty much ready to be trained. I'll click here on the Train button at the top to start the process. Now, as you can see, the dialog box asks me if I want to use Quick Training or Advanced Training. Quick Training is perfectly fine, but let me switch to Advanced Training for a minute to show you the options.

Notice that I can set the training budget, which is the maximum number of hours to spend on this training, and choose to be notified when the process is done. But let me switch back to Quick Training and click on the Train button. After a few more seconds, we have the results of this iteration, called Iteration 1, on the Performance tab. Don't worry too much about these evaluation metrics, as we'll cover them in the next video.
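The same Quick versus Advanced choice shows up if you start training from code. Here's a minimal sketch that turns the dialog's options into a dictionary of settings. The `training_options` helper is hypothetical, and the key names and the "Regular" and "Advanced" values are modeled on how I recall the Python training SDK naming them, so treat them as assumptions and verify them before use.

```python
def training_options(advanced=False, budget_hours=1, notify_email=None):
    """Build training settings that mirror the portal's training dialog."""
    if not advanced:
        # Quick Training takes no extra options in this sketch.
        return {"training_type": "Regular"}
    if budget_hours < 1:
        raise ValueError("the training budget must be at least one hour")
    options = {
        "training_type": "Advanced",
        "reserved_budget_in_hours": budget_hours,  # maximum hours to spend
    }
    if notify_email:
        # Optional notification when the training run finishes.
        options["notification_email_address"] = notify_email
    return options


quick = training_options()
advanced = training_options(
    advanced=True, budget_hours=2, notify_email="me@example.com"
)
print(quick)
print(advanced)
```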

Here on the left, I have a list of all the iterations created by my training runs, and I can switch between them to compare the performance of each version of the model. There's just one last thing that I would like to show you, which is this Smart Labeler feature.

Let's switch back to the Training Images tab and click again on Add Images. Then I'll select another picture of the Empire State Building. But this time I'm not going to add a tag. I'll just click on the upload button and then click on Done. Now I have this Get Suggested Tags button available in the interface. See? I'll click on it, and now Custom Vision is telling me that there's a cost associated with it. That's because Custom Vision is using the trained iteration that we have just created to make predictions on new images.

Predictions are generally quite cheap, so I'm fine with leaving this as All Untagged Images. However, you can set a maximum limit if you want. Then I'll click on Get Started to enable the feature, and I'm taken back to the Untagged Images area with the picture that I have just uploaded. If I click on it, and I need to make sure that this Suggested Tags option is On, notice that Custom Vision is predicting this tag with over 99% certainty. I just need to click on the Empire State Building tag, click on Save and Close, and now my picture is properly tagged and moved to the Tagged Images section. This is huge! It allows you to upload several images to Custom Vision and have the model automatically tag them, making the process considerably easier for you.
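When you have a lot of suggested tags to review, a simple rule is to accept the confident ones and look at the rest by hand. Here's a minimal sketch of that triage. The `Suggestion` class, the `split_by_confidence` helper, and the 0.9 threshold are hypothetical, and the sample values are invented for illustration (only the idea of a suggestion above 99% comes from the demo).

```python
from dataclasses import dataclass


@dataclass
class Suggestion:
    """A suggested tag for one untagged image (hypothetical structure)."""
    image: str
    tag: str
    probability: float


def split_by_confidence(suggestions, threshold=0.9):
    """Separate suggestions to accept as-is from those needing manual review."""
    accepted = [s for s in suggestions if s.probability >= threshold]
    needs_review = [s for s in suggestions if s.probability < threshold]
    return accepted, needs_review


# Invented sample data, not real service output.
suggestions = [
    Suggestion("empire_new.jpg", "Empire State Building", 0.99),
    Suggestion("unknown_tower.jpg", "Eiffel Tower", 0.55),
]
accepted, needs_review = split_by_confidence(suggestions)
print("accept:", [s.image for s in accepted])
print("review:", [s.image for s in needs_review])
```

Even with a high threshold, it's worth spot-checking accepted tags, because a wrong label goes straight into the next training iteration.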

But I'm pretty sure that you're curious about those evaluation metrics on the Performance tab. So, let's see what they mean in the next video.

About the Author

Emilio Melo has been involved in IT projects in over 15 countries, with roles ranging across support, consultancy, teaching, project and department management, and sales, mostly focused on Microsoft software. After 15 years of on-premises experience in infrastructure, data, and collaboration, he became fascinated by Cloud technologies and the incredible transformation potential they bring. His passion outside work is to travel and discover the wonderful things this world has to offer.