Vectorised comparisons to generate many results in R

Developed with
QA

Overview
Difficulty
Intermediate
Duration
25m
Students
382
Ratings
4/5
Description

Course Description 

This module looks at more operators, and introduces conditional statements in R.  

Learning Objectives 

The objectives of this module are to provide you with an understanding of: 

  • How to compare the values of two expressions  
  • How to compare the values of two Boolean expressions  
  • How to compare values of vectors  

Intended Audience 

Aimed at all who wish to learn the R programming language. 

Pre-requisites 

No prior knowledge of R is assumed. 

Delegates should already be familiar with basic programming concepts such as variables, scope, and functions. 

Experience of another scripting language such as Python or Perl would be an advantage. 

Understanding mathematical concepts will be beneficial.

 Feedback 

We welcome all feedback and suggestions - please contact us at qa.elearningadmin@qa.com to let us know what you think. 

Transcript

- [Instructor] A vectorized form of the if/else statement in R is the ifelse function. This allows us to generate many results, one result per element tested. Its structure is a condition, an action to return if the condition is true, and an alternative action to return where the condition is false. Let's take the example of a simple data structure, such as a variable known as hope with minus three assigned to it, so we can ask the question, "Is hope less than zero?" Here we are using a comparison operator. We get back the answer: if the condition is true, this is the action that will be returned. I can make the notation a bit simpler, just to reduce the number of characters that are output. Where previously you would have used the if/else statement, you can use the ifelse function to see the same output in this case of a very simple data structure. But imagine we had a more complicated structure, such as many numbers. Here, the ifelse function returns an output based on each of the comparisons against our condition. So each test returns its own output, but the whole output is stored in one place, so we generate many results within one statement.
Next, we can look at the match function, which allows us to compare the values of vectors. Let me create a couple of vectors to help us understand that. Here we have subjects, holding abbreviations for, say, programming, mathematics, statistics, and so on. Let's ask the question, "How many subjects do we have?" Six. We can create a different set of subjects called, say, advanced, holding mathematics, statistics, and so on. And we can ask for the length of this to see that we have four of these.
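The on-screen code is not captured in the transcript. As a minimal sketch of the ifelse comparison described in this passage: the variable name hope and the value -3 come from the narration, while the label strings, the names single, same, hopes and many, and the values in the longer vector are assumptions made for illustration.

```r
# 'hope' and -3 are taken from the narration
hope <- -3

# Classic if/else statement: tests exactly one condition
if (hope < 0) {
  single <- "negative"        # label text is an assumption
} else {
  single <- "non-negative"    # label text is an assumption
}

# ifelse(test, yes, no) gives the same answer for a single value
same <- ifelse(hope < 0, "negative", "non-negative")

# With several numbers (values assumed), ifelse() tests every element
# and returns one result per element, all stored in a single vector
hopes <- c(-3, 5, 0, -1)
many <- ifelse(hopes < 0, "negative", "non-negative")
print(many)
```

The if/else statement only accepts a single condition, which is why the vectorised ifelse function is the natural choice once the input holds more than one value.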
So we have vectors of different lengths, where some of the elements overlap, and we can ask for the match between these two. For each value in the first vector, match returns its index in the second vector, which here is the advanced subjects, the smaller vector in this case. As you can see, MT occurs at the first position of the smaller vector, and hence we see an index of one. ST in the larger vector also exists in the smaller one, and hence we see a two. ML, and hence we see a three. AI is the fourth entry, and hence is shown as a four. We see NA where there is no match. So this helps us understand that there are fewer subjects in advanced than in the six-subject vector. The next question we might ask is, "Which courses are not advanced?" and the answer would be the NAs, so we can use a simple ifelse call with the is.na function to check which items in my output here are NA, and return "not advanced" or "advanced" accordingly. Instead of this less informative output, I could use, in the same function, subjects as the action returned for those items where is.na is true, meaning that the non-advanced subjects, such as PR and DS, are output by name. Where we have advanced subjects, the keyword "advanced" is output.
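The match and is.na steps described above can be sketched as follows. The abbreviations PR, MT, ST, ML, AI and DS and the order of the advanced vector come from the narration; the order of the six-element subjects vector and the names idx, status and named are assumptions.

```r
# Six subjects; the order is assumed, the abbreviations are from the narration
subjects <- c("PR", "MT", "ST", "ML", "AI", "DS")
# Four advanced subjects, in the order implied by the indices 1 to 4
advanced <- c("MT", "ST", "ML", "AI")

length(subjects)
length(advanced)

# For each element of subjects, match() returns the position of its
# first occurrence in advanced, or NA when there is no match
idx <- match(subjects, advanced)
print(idx)

# Label each subject according to whether a match was found
status <- ifelse(is.na(idx), "not advanced", "advanced")
print(status)

# Return the subject's own name where there is no match,
# and the keyword "advanced" otherwise
named <- ifelse(is.na(idx), subjects, "advanced")
print(named)
```

With this assumed ordering, PR and DS are the two elements with no match, so they are the ones returned by name in the last step.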

About the Author
Students
2074
Labs
1
Courses
11
Learning Paths
3

Kunal has worked with data for most of his career, ranging from diffusion Markov chain processes to migrating reporting platforms.

Kunal has helped clients with early-stage engagement and designed multi-week training programme curricula.

Kunal has a passion for statistics and data; he has delivered training relating to Hypothesis Testing, Exploring Data, Machine Learning Algorithms, and the Theory of Visualisation. 

Data Scientist at a credit management company; applied statistical analysis to distressed portfolios. 

Business Data Analyst at an investment bank; project to overhaul the legacy reporting and analytics platform. 

Statistician within the Government Statistical Service; quantitative analysis and publishing statistical findings of emerging levels of council tax data. 

Structured Credit Product Control at an investment bank; developing, maintaining, and deploying a PnL platform for the CVA Hedging trading desk.