Subsetting matrices in R

The course is part of this learning path

Fundamentals of R
Overview

Difficulty: Intermediate
Duration: 40m
Students: 74
Ratings: 5/5
Description

Course Description

This module looks at more complex data structures, building on what was covered in the beginner data structures module.

Learning Objectives

The objectives of this module are to provide you with an understanding of:

• Different data types
• Integers
• How to coerce elements, and force coercion
• How to construct a matrix
• How to construct an array
• How to construct a list

Intended Audience

Aimed at all who wish to learn the R programming language.

Pre-requisites

No prior knowledge of R is assumed

Delegates should already be familiar with basic programming concepts such as variables, scope and functions

Experience of another scripting language such as Python or Perl would be an advantage

Understanding mathematical concepts will be beneficial

Transcript

- [Instructor] To understand subsetting of matrices, I'd like to first construct a vector and explain how to access a single element in that vector using index notation. If I wanted to grab several entries from this vector, I could use a vector of indices, or I could use a vector of logicals.

With a matrix, we access each element by row followed by column. So let me create a matrix on the screen. I can access a single element in here by giving the index for the row and the index for the column. For example, if I wanted to see the number 24 from this matrix, I would go to the fourth row and the second column, and I would get back 24.

How can I return an entire vector? We can do this by omitting one of the indices. So this will return the entire second column, and I can do the same for the entire second row. Just to bring c back up onto the screen to show you: as a reminder, the second row looks like that, and the second column has 21 all the way through to 30 in it. In other words, blank represents all.

Matrices also support more complex subsetting. In the same way as I can take the first element from here, I can ask for it using vector notation. If I expand that vector to include one and two, I get a 2x2 matrix taken from c. I can repeat the same thing for a 3x3 matrix, which takes the first three rows and the first three columns. If I leave out both indices, I am asking for all rows and all columns. I can take any combination of rows and columns by subsetting with a vector of the indices I want. I can use a vector of logicals to help me filter elements as well.
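The on-screen code is not part of the transcript, so the following is only a sketch of the steps described above. The vector `v` and its values are hypothetical. The matrix `m` is an assumption as well: the transcript only tells us that row 4, column 2 holds 24 and that the second column runs from 21 to 30, which is consistent with a 10-row matrix filled column by column from 11:40. The instructor calls the matrix c; it is named `m` here to avoid confusion with the `c()` function.

```r
# Hypothetical vector; the actual values shown on screen are unknown.
v <- c(5, 10, 15, 20, 25)

v[2]                                   # single element by position
v[c(1, 3)]                             # several elements by a vector of indices
v[c(TRUE, FALSE, TRUE, FALSE, FALSE)]  # the same elements by a vector of logicals

# Assumed matrix: 10 rows, 3 columns, filled column by column with 11:40.
m <- matrix(11:40, nrow = 10)

m[4, 2]      # row 4, column 2: 24
m[, 2]       # row index left blank: the whole second column, 21 to 30
m[2, ]       # column index left blank: the whole second row, 12 22 32
m[1, 1]      # the first element: 11
m[1:2, 1:2]  # first two rows and columns: a 2x2 matrix
m[1:3, 1:3]  # first three rows and columns: a 3x3 matrix
m[, ]        # both indices left blank: all rows and all columns
```

Note that leaving an index blank for a single row or column returns a plain vector rather than a one-row or one-column matrix, which is R's default behaviour for `[`.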
So bringing c back up onto the screen, I'm now creating a vector of TRUEs and FALSEs using the rep function. These mark the rows that I would like to see: every other row, so the second, fourth, sixth, eighth, and 10th, meaning all of the even-numbered rows. For the columns, I'd like to grab the odd columns, the first and the third. I can show you that on the screen. And if I use the index notation to access these elements, I see that I've picked up the even rows and the odd columns. Complex subsetting also recycles vectors automatically, as part of the R infrastructure, so I don't need to use the rep function. The recycling happens in the background, and it returns the same result whether I use c or whether I use rep.
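As a rough illustration of the logical filtering and recycling described here, the sketch below reuses an assumed 10-row, 3-column matrix built from 11:40. The names `m`, `rows_wanted`, and `cols_wanted` are hypothetical and do not come from the video.

```r
# Assumed matrix, matching the values mentioned in the transcript.
m <- matrix(11:40, nrow = 10)

# Explicit logical vectors, one entry per row and per column.
rows_wanted <- rep(c(FALSE, TRUE), times = 5)  # even rows: 2, 4, 6, 8, 10
cols_wanted <- c(TRUE, FALSE, TRUE)            # odd columns: 1 and 3

explicit <- m[rows_wanted, cols_wanted]

# Recycling: the length-2 logical vector is repeated to cover all 10 rows,
# so rep() is not needed.
recycled <- m[c(FALSE, TRUE), cols_wanted]

identical(explicit, recycled)  # TRUE
```

Both calls select the same 5x2 block, which is why the result is the same with or without `rep()`.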

Kunal Haria
Data Science Trainer

Kunal has worked with data for most of his career, ranging from diffusion Markov chain processes to migrating reporting platforms.

Kunal has helped clients with early-stage engagement and has designed multi-week training programme curricula.

Kunal has a passion for statistics and data; he has delivered training relating to Hypothesis Testing, Exploring Data, Machine Learning Algorithms, and the Theory of Visualisation.

Data Scientist at a credit management company; applied statistical analysis to distressed portfolios.

Business Data Analyst at an investment bank; project to overhaul the legacy reporting and analytics platform.

Statistician within the Government Statistical Service; quantitative analysis and publishing statistical findings of emerging levels of council tax data.

Structured Credit Product Control at an investment bank; developing, maintaining, and deploying a PnL platform for the CVA Hedging trading desk.