Missing data in R

Overview
Difficulty: Intermediate
Duration: 38m
Students: 427
Ratings: 4.8/5

Course Description

This module introduces you to some of the basic data structures that can be used in R.

Learning Objectives

The objectives of this module are to provide you with an understanding of:

• What a vector is in R
• How to create a sequence
• How to create a vector using a repetition
• How to pull elements out of vectors
• Vectorised operations
• Logical comparisons
• Strings in R
• Undefined situations in mathematics
• 0, NA, NaN, and NULL

Intended Audience

Aimed at all who wish to learn the R programming language.

Pre-requisites

No prior knowledge of R is assumed.

Delegates should already be familiar with basic programming concepts such as variables, scope, and functions.

Experience of another scripting language such as Python or Perl would be an advantage.

Having an understanding of mathematical concepts will be beneficial.

Transcript

Imagine you have just conducted an experiment in your study, your classroom, or your bedroom: you are measuring the temperature, and you have a week's worth of data, which you store in a vector called temp. There are a few interesting readings here. One is an anomaly: a reading of 70, perhaps taken in Fahrenheit, alongside 16, which is probably in centigrade. Another is zero, which is a genuine value: the freezing point, if we are measuring our temperature in Celsius.

There are also three interesting readings, NA, NaN, and NULL, which we should take a look at. NA stands for Not Available: the data is not available. It's useful as a placeholder, and it's usually an indicator of a missing value. NaN stands for Not a Number, and it would appear in a case where perhaps the thermometer was broken. The expectation is that a numerical calculation should result in a number, and here it has not.
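As a rough illustration of the difference between NA and NaN, here is a minimal sketch in base R. The readings in `temp` are made up for the example and are not the values shown in the video.

```r
# Hypothetical week of readings; the values are illustrative only.
temp <- c(16, 70, 0, NA, NaN, 15, 17)

# is.na() flags both NA and NaN; is.nan() flags only NaN.
missing_flags <- is.na(temp)
nan_flags <- is.nan(temp)

# An undefined calculation such as 0 / 0 produces NaN.
undefined <- 0 / 0

# Zero is an ordinary value, not a missing one.
zero_is_missing <- is.na(0)

# A summary over a vector containing missing values is itself missing,
# unless the missing values are dropped with na.rm = TRUE.
mean_with_na <- mean(temp)
mean_without_na <- mean(temp, na.rm = TRUE)
```

Note that `na.rm = TRUE` drops the NaN as well as the NA, so the second mean is taken over the five remaining readings, including the zero.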

So, this is more of an indicator that we have an error. The last point I'd like to make is that NULL represents Not Yet Calculated. It's an item that does not appear within the data structure. If I look at the output for temp, I don't see the value included in the output. In this case, for our thermometer experiment, it's usually an indication that something has not yet been properly initialised.
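The way NULL disappears from a vector can be sketched as follows. Again, the readings and the variable names are assumptions made for the example, not taken from the video.

```r
# Hypothetical readings; NULL is dropped when the vector is built.
temp <- c(16, NULL, 0, NA)

# NULL does not occupy a slot in the vector, whereas NA does.
n_readings <- length(temp)

# NULL itself has length zero.
null_length <- length(NULL)

# is.null() tests whether an object is NULL, for example a variable
# that has not yet been given a proper value.
not_yet_set <- NULL
needs_init <- is.null(not_yet_set)
```

Printing `temp` here shows three elements (16, 0, and NA), which matches the observation that NULL is not included in the output.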

Kunal has worked with data for most of his career, ranging from diffusion Markov chain processes to migrating reporting platforms.

Kunal has helped clients with early-stage engagement and has developed curricula for multi-week training programmes.

Kunal has a passion for statistics and data; he has delivered training relating to Hypothesis Testing, Exploring Data, Machine Learning Algorithms, and the Theory of Visualisation.

Data Scientist at a credit management company; applied statistical analysis to distressed portfolios.

Business Data Analyst at an investment bank; project to overhaul the legacy reporting and analytics platform.

Statistician within the Government Statistical Service; quantitative analysis and publishing statistical findings of emerging levels of council tax data.

Structured Credit Product Control at an investment bank; developing, maintaining, and deploying a PnL platform for the CVA Hedging trading desk.