Vectorised comparisons to generate one result in R
Fundamentals of R
This module looks at more operators and introduces conditional statements in R.
The objectives of this module are to provide you with an understanding of:
- How to compare the values of two expressions
- How to compare the values of two Boolean expressions
- How to compare values of vectors
Aimed at all who wish to learn the R programming language.
No prior knowledge of R is assumed.
Delegates should already be familiar with basic programming concepts such as variables, scope and functions.
Experience of another scripting language such as Python or Perl would be an advantage.
An understanding of mathematical concepts will be beneficial.
We welcome all feedback and suggestions - please contact us at firstname.lastname@example.org to let us know what you think.
- [Narrator] Imagine a scenario where you have a series of expenses. These are just arbitrary numbers that I've decided to put into a vector to help us understand an example. Suppose I ask the question: is expenses, meaning the vector ranging from one to 103, minus 10, greater than zero? So imagine that 10 is a threshold. What happens in this statement is that we subtract 10 from each of the entries in the expenses vector and then compare each of those results to zero. The result is a logical vector of TRUEs and FALSEs.
We might then ask: can we reduce this sequence of TRUEs and FALSEs down to one value, one summary result? I'd like to add a variable called expense.threshold to make life a bit more general. Then, to get to just one result, we can ask: are any of our expenses high? So I take the calculation above, replace the 10 with expense.threshold, wrap the subtraction in a pair of brackets, and ask whether any of these items are greater than zero. This returns a single TRUE. Whenever we run the any function, we receive TRUE if at least one TRUE value is found. In this instance, we saw three TRUE values out of seven. Had they all been FALSE, we would have received FALSE. But if any of them is TRUE, any returns a single TRUE, which is sometimes useful for testing.
Another question you might ask about this expenses vector is: are all of the expenses high? We can run the same calculation in the same format as above, replacing any with all. The all function is very useful and also returns just one result: TRUE if every comparison on the bracketed term, expenses minus expense.threshold, is TRUE.
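The comparison and the any and all calls described above can be sketched as follows. The actual numbers used in the video are not shown in the transcript, so the values in expenses here are hypothetical: only the range (one to 103), the length (seven) and the count above the threshold (three) come from the narration.

```r
# Hypothetical expenses: the exact values are an assumption, chosen so that
# the vector runs from 1 to 103, has seven entries, and three exceed 10.
expenses <- c(1, 3, 5, 7, 12, 58, 103)
expense.threshold <- 10

# Vectorised comparison: subtract the threshold from every entry,
# then compare each result to zero. The result is a logical vector.
high <- (expenses - expense.threshold) > 0
print(high)       # four FALSE values followed by three TRUE values

# Reduce the logical vector to a single value.
any.high <- any(high)   # TRUE: at least one expense is above the threshold
all.high <- all(high)   # FALSE: not every expense is above the threshold
print(any.high)
print(all.high)
```

The names high, any.high and all.high are illustrative only. In practice the comparison is often written inline, for example any((expenses - expense.threshold) > 0).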
In this instance, if we look at our initial output above, we can see that not all the values in the vector are TRUE. There are some FALSEs. Some of the expenses were low, and hence all returns a single FALSE. We can see how this might be used, even with this very simple expenses vector, through a natural use of the if statement. We can say: if any expenses are high, print "some of the expenses are high"; otherwise, print "there are no high expenses". Your manager might want to know this. Think of this arbitrary, fictitious set as, for example, lunchtime meals: you spent one unit on the first day, slowly increased your spending every weekday, and then on the weekend decided to start spending more money. So some of the expenses are high, which might need to be flagged.
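The if statement described above might look like the following minimal sketch. The expenses values are again hypothetical, and the status variable is introduced here only for illustration; the narration simply prints one of the two messages.

```r
# Hypothetical expenses (same assumed values as a seven-entry vector from 1 to 103).
expenses <- c(1, 3, 5, 7, 12, 58, 103)
expense.threshold <- 10

# any() collapses the logical vector to one TRUE or FALSE,
# which is what an if condition expects.
if (any((expenses - expense.threshold) > 0)) {
  status <- "some of the expenses are high"
} else {
  status <- "there are no high expenses"
}
print(status)   # [1] "some of the expenses are high"
```

An if condition in R needs a single logical value, which is why the vectorised comparison is wrapped in any (or all) rather than passed to if directly.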
About the Author
Kunal has worked with data for most of his career, ranging from diffusion Markov chain processes to migrating reporting platforms.
Kunal has helped clients with early-stage engagement and formed multi-week training programme curricula.
Kunal has a passion for statistics and data; he has delivered training relating to Hypothesis Testing, Exploring Data, Machine Learning Algorithms, and the Theory of Visualisation.
Data Scientist at a credit management company; applied statistical analysis to distressed portfolios.
Business Data Analyst at an investment bank; project to overhaul the legacy reporting and analytics platform.
Statistician within the Government Statistical Service; quantitative analysis and publishing statistical findings of emerging levels of council tax data.
Structured Credit Product Control at an investment bank; developing, maintaining, and deploying a PnL platform for the CVA Hedging trading desk.