This module looks at how to control data in R, through reading, writing, and loading objects.
Learning Objectives
The objectives of this module are to provide you with an understanding of:
- How to bring in data from a file in R
- Saving and loading objects in R
- Interacting with the clipboard
- How to connect to files in R
- How to read from a file in R
- How to write to a file in R
Intended Audience
Aimed at anyone who wishes to learn the R programming language.
Prerequisites
No prior knowledge of R is assumed. You should already be familiar with basic programming concepts such as variables, scope, and functions. Experience of another scripting language such as Python or Perl would be an advantage. An understanding of mathematical concepts would be beneficial.
Feedback
We welcome all feedback and suggestions - please contact us at qa.elearningadmin@qa.com to let us know what you think.
[Instructor] When bringing data from a file into R, we rely on the working directory. The working directory sets the default location of any files you read into R or save out of R. To understand it, let us first think about what a file path is. We can ask for the working directory to be printed to the screen by using the get working directory command, getwd().
There are two types of path: an absolute path and a relative path. Here, getwd() has returned an absolute path, which is entirely unambiguous and consists of a series of directories: here we have the Users directory, the KUNAL directory, and the Fundamentals of R directory. It begins from the base of your system, here C:/, because I'm using Windows. On Windows, the file path separator R uses is the forward slash.
Relative paths are context specific: they tell R where to begin from. Wherever the current working directory is, that is where other files are accessed from. Say, for example, I wished to save a series of R commands to a file and hand it over, and those commands required the data.csv file shown here in the Files tab. I could keep the file that stores my R code and the data.csv file in the same folder, and allow the system to infer that data.csv is in the present working directory. That means that if I distributed my R source code together with the data files, I wouldn't need to worry about where the files are located on another system; they'd be in the same place relative to one another.
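As a minimal sketch of the idea above, the following R code prints the working directory and then reads the same file through a relative and an absolute path. The temporary folder, the contents of data.csv, and the use of setwd() and normalizePath() are assumptions made for illustration; they are not part of the lecture.

```r
# Show the current working directory as an absolute path.
wd <- getwd()
print(wd)

# Hypothetical setup: create a throwaway folder holding a small data.csv,
# then make that folder the working directory.
demo_dir <- tempfile("wd_demo_")
dir.create(demo_dir)
old_wd <- setwd(demo_dir)  # setwd() returns the previous working directory

write.csv(data.frame(id = 1:3, value = c(10, 20, 30)),
          "data.csv", row.names = FALSE)

# A relative path is resolved against the working directory...
relative_path <- "data.csv"
# ...whereas the absolute path spells out every directory from the root.
absolute_path <- normalizePath(relative_path, winslash = "/")

from_relative <- read.csv(relative_path)
from_absolute <- read.csv(absolute_path)

# Restore the original working directory.
setwd(old_wd)
```

Both calls to read.csv() load the same file; only the relative form keeps working unchanged when the script and its data are moved together to another machine.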
Another useful command is file.choose(), in case I wish to let the user select a data file manually. For example, I could run this command here, which causes a pop-up to appear on the screen, and I can click on the data.csv file. As you can see, if I print the result to the screen, the file path has been stored in the appropriate format for R. I can then use the read.csv function to bring the data into the RStudio session.
There are two functions that help me split up a file path: one is basename, which gives the actual file name, and the other is dirname, which gives the directory where the file is located. I could then take this directory name and combine it with a different file name, say data.txt rather than data.csv, using the appropriate separator for the operating system I'm on. I can see that I have created a file path. In the same way as earlier, I can read this file in if I want to, or I can point a user to it to use in whichever way they'd like.
I can also create file paths manually. Rather than having to find a file and then look for another file in the same folder, I could have built the path up myself, gluing the arguments together using the file.path function. Here, for example, I could have typed in the pieces of the path, and this returns a path that I can use later on. It is operating system specific; here it's using forward slashes. By default, file.path uses the file separator for the platform you're on.
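The path functions described above can be sketched as follows. Because file.choose() needs an interactive session, a temporary data.csv stands in for the file the user would pick; that stand-in file, its contents, and the directory names passed to file.path() are hypothetical and used only for illustration.

```r
# Hypothetical stand-in for a file picked with file.choose().
demo_dir <- tempfile("paths_demo_")
dir.create(demo_dir)
csv_path <- file.path(demo_dir, "data.csv")
write.csv(data.frame(id = 1:2), csv_path, row.names = FALSE)

# In an interactive session you could instead run:
# csv_path <- file.choose()

# Read the chosen file into the session.
chosen_data <- read.csv(csv_path)

# Split the path into its file name and its containing directory.
file_name <- basename(csv_path)  # the file name only
folder <- dirname(csv_path)      # the directory holding the file

# Combine the directory with a different file name.
txt_path <- file.path(folder, "data.txt")

# Build a path from scratch by gluing components together.
built_path <- file.path("Users", "KUNAL", "Fundamentals of R", "data.csv")
print(built_path)
```

Note that txt_path is only a character string at this point; no data.txt file exists until something is written to that location.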
Kunal has worked with data for most of his career, ranging from diffusion Markov chain processes to migrating reporting platforms.
Kunal has helped clients with early-stage engagement and developed multi-week training programme curricula.
Kunal has a passion for statistics and data; he has delivered training relating to Hypothesis Testing, Exploring Data, Machine Learning Algorithms, and the Theory of Visualisation.
Data Scientist at a credit management company; applied statistical analysis to distressed portfolios.
Business Data Analyst at an investment bank; project to overhaul the legacy reporting and analytics platform.
Statistician within the Government Statistical Service; quantitative analysis and publishing statistical findings of emerging levels of council tax data.
Structured Credit Product Control at an investment bank; developing, maintaining, and deploying a PnL platform for the CVA Hedging trading desk.