Learn about the many forms and formats data can take in the second course of the Data and Machine Learning series.
Traditionally, machine learning has worked very well with structured data, but it is not as effective at solving problems that involve unstructured data. Deep learning works very well with both structured and unstructured data, and it has had notable successes in fields such as translation and image classification, among many others. In this course you will learn to explain why deep learning is so popular, explore the many types and formats of data, and analyze the key libraries that allow us to explore and organize data.
This course is made up of 8 lectures, accompanied by 5 engaging exercises and their solutions. It is part of the Data and Machine Learning learning path from Cloud Academy.
Learning Objectives
- Understand how machine learning performs when confronted with structured and unstructured data
- Be able to explain the importance of deep learning
Prerequisites
- We recommend completing the Introduction to Data and Machine Learning course before starting this one.
Resources
The GitHub repo for this course, including code and datasets, can be found here.
Hello, and welcome to this video on feature engineering. In this video, we will talk about what feature engineering is, the problems with traditional approaches to feature engineering, and the advantages of using deep learning for feature engineering. As we've seen, unstructured data does not look like tabular data, and the traditional solution to connect the two is feature engineering. However, feature engineering comes with some limitations. In feature engineering, we need an expert who uses his or her domain knowledge to create features that correctly encapsulate the relevant information from the unstructured data.
Feature engineering is fundamental to the application of machine learning, but it is both difficult and expensive. Let's see a few examples. If we're training a machine learning model on a face recognition task, we could encode the information in the face by using a well-tested method that consists of identifying key points in the face and measuring the distances between those points. In this way, we go from a 2-D image of a face to features that represent key distances between elements of that face. These distances are then the features passed to the machine learning model that will recognize the faces. In a similar way, in the domain of speech recognition, we can build features based on wavelet transforms, Fourier transforms, or short-time Fourier transforms, and these were the standard until not very long ago. Deep learning has disrupted feature engineering by introducing a way to learn the best features directly from the raw, unstructured data.
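To make the idea concrete, here is a minimal sketch in Python (not taken from the course notebooks) of hand-crafted features in these two domains: pairwise distances between assumed facial landmark coordinates, and a short-time Fourier transform of a raw waveform using SciPy. The landmark coordinates and the waveform are made up purely for illustration.

```python
# Hand-crafted feature extraction: illustrative sketch only.
import numpy as np
from scipy.spatial.distance import pdist
from scipy.signal import stft

# Face example: assume a landmark detector has already given us (x, y)
# coordinates of a few key points (eyes, nose tip, mouth corners).
landmarks = np.array([
    [30.0, 40.0],   # left eye
    [70.0, 40.0],   # right eye
    [50.0, 60.0],   # nose tip
    [35.0, 80.0],   # left mouth corner
    [65.0, 80.0],   # right mouth corner
])
# The engineered features are the pairwise distances between key points.
face_features = pdist(landmarks)
print(face_features.shape)        # (10,) distances for 5 points

# Speech example: a short-time Fourier transform turns a raw waveform into
# a time-frequency representation that a classical model can consume.
fs = 16000                                 # sampling rate in Hz (assumed)
t = np.arange(fs) / fs
waveform = np.sin(2 * np.pi * 440 * t)     # one second of a 440 Hz tone
freqs, times, Z = stft(waveform, fs=fs, nperseg=512)
speech_features = np.abs(Z)                # spectrogram magnitudes as features
print(speech_features.shape)               # (257, number_of_frames)
```

In both cases, a human has decided in advance which summary of the raw data the model gets to see, which is exactly where the required time and domain expertise come in.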
This approach is not only very powerful, but also much, much faster. It's a real paradigm shift: a more versatile technique that takes over the role of the domain expert. So, in this video, we've discussed traditional feature engineering, how it takes time and expertise, and we've highlighted the fact that deep learning is automated feature engineering; a minimal sketch of that idea follows below. Thank you for watching, and see you in the next video.
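As a companion to the transcript, here is a minimal sketch of automated feature engineering, assuming a Keras-style workflow (which may differ from the course's own notebooks): a small convolutional network that consumes raw pixels and learns its own features end to end.

```python
# Illustrative sketch: a deep network learns features from raw images
# instead of relying on hand-crafted ones.
import tensorflow as tf
from tensorflow.keras import layers, models

# The convolutional layers play the role the domain expert used to play:
# during training they learn which patterns in the raw pixels matter.
model = models.Sequential([
    layers.Input(shape=(64, 64, 1)),          # raw grayscale images
    layers.Conv2D(16, 3, activation="relu"),  # learned low-level features
    layers.MaxPooling2D(),
    layers.Conv2D(32, 3, activation="relu"),  # learned higher-level features
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(10, activation="softmax"),   # e.g. 10 identities/classes
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
# model.fit(raw_images, labels, epochs=5) would then learn the features and
# the classifier together, end to end, directly from the raw data.
```

The key design difference from the hand-crafted approach above is that nobody tells the network which distances or frequencies to compute; it discovers useful representations as part of training.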
I am a Data Science consultant and trainer. With Catalit, I help companies acquire skills and knowledge in data science and harness machine learning and deep learning to reach their goals. With Data Weekends, I train people in machine learning, deep learning, and big data analytics. I served as lead instructor in Data Science at General Assembly and The Data Incubator, and I was Chief Data Officer and co-founder at Spire, a Y Combinator-backed startup that invented the first consumer wearable device capable of continuously tracking respiration and activity. I earned a joint PhD in biophysics from the University of Padua and Université de Paris VI and graduated from the Singularity University summer program in 2011.