Fundamentals of R
This module looks at how to control data in R, through reading, writing, and loading objects.
The objectives of this module are to provide you with an understanding of:
- How to bring in data from a file in R
- Saving and loading objects in R
- Interacting with the clipboard
- How to connect to files in R
- How to read from a file in R
- How to write to a file in R
Aimed at anyone who wishes to learn the R programming language.
No prior knowledge of R is assumed. You should already be familiar with basic programming concepts such as variables, scope, and functions. Experience of another scripting language such as Python or Perl would be an advantage. An understanding of mathematical concepts would be beneficial.
We can connect to files in R to write to them or to read from them. To have some data to write, I'd like to create a vector. Here I have created a character vector, Kunal, which I would like to put in a file called Kunal_data.txt. Here I am creating a connection object and storing it in a variable called connec; the connection holds the relevant metadata about the file. I then open this connection, and how it is opened is chosen by the second argument, which can be w for writing, r for reading, or a for appending. I then write the text that I have stored in the character vector Kunal with the writeLines function, using the connection connec. I must remember to close the connection as soon as it is no longer required, before doing anything else. If a connection is not closed, it can use up system resources and create unpredictable issues when other programs try to read the same file.
Let us try to read in what we have just written by creating another connection, connec_2, to the same file. I open this connection for reading and use the readLines function on connec_2. I can see that I have created the lines variable. I must now close the connection because it is no longer required. I can then view the data that I have brought in, and it is exactly the same as the character vector that I defined at the start.
I can also read in from a URL. For example, I can create a connection to this URL here, which contains lots of fake metadata. I can open this connection using the read option. I can use the readLines function to read from this connection into the lines object. As I'm finished with it, I should close the connection before I look at the data, to avoid any issues. As I know that this is a large file, I'll have a look at just a small section of it to show you what I have brought in.
I can delete files that are no longer required.
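The write-then-read round trip described above can be sketched roughly as follows. This is a minimal sketch, not the exact code from the video: the contents of the Kunal vector are made up, and the file is placed in a temporary directory (an assumption) rather than the working directory.

```r
# Character vector to write; the contents are invented for illustration.
Kunal <- c("first line", "second line", "third line")

# Assumption: a temporary directory stands in for the working directory.
path <- file.path(tempdir(), "Kunal_data.txt")

# Create the connection object, then open it in write mode ("w").
connec <- file(path)
open(connec, "w")
writeLines(Kunal, connec)
close(connec)  # close as soon as the connection is no longer needed

# A second connection to the same file, this time opened in read mode ("r").
connec_2 <- file(path)
open(connec_2, "r")
lines <- readLines(connec_2)
close(connec_2)

# Check that what was read back matches what was written.
identical(lines, Kunal)
```

The same create, open, read, close pattern should apply to a web address if the connection is created with url() instead of file().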
Say, for example, that Kunal_data.txt is no longer required. I can click on the delete button in the file explorer, and a pop-up then asks whether I'm sure I would like to delete it. Rather than doing that, I can also use the unlink function, which deletes whatever file you name, without asking for confirmation. So you must be very careful when you're interacting with the operating system. And it is now deleted.
There are a couple of other commands that are useful at this point. You can find files: I can have a look and see what files I have in my current working directory, which is indicated by a single full stop in quotes. I can see that I have a csv, a text file, and a directory. I could also look at the directories independently of the files. By default, this is recursive, meaning it looks through the subfolders. I can amend that to be non-recursive, so that it looks at the top folder only and shows me just year, as opposed to all those folders that I have created within year.
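The narration does not name the listing functions, so the sketch below assumes they are list.files and list.dirs from base R. The scratch folder, data.csv, and the 2020 subfolder are hypothetical stand-ins for the working directory shown in the video.

```r
# Assumption: a scratch folder with made-up contents stands in for the
# working directory from the narration.
root <- file.path(tempdir(), "file_demo")
dir.create(file.path(root, "year", "2020"), recursive = TRUE, showWarnings = FALSE)
writeLines("a,b", file.path(root, "data.csv"))
writeLines("some text", file.path(root, "Kunal_data.txt"))

# List the files and folders at the top level of root.
list.files(root)

# unlink() deletes the named file without asking for confirmation.
unlink(file.path(root, "Kunal_data.txt"))
file.exists(file.path(root, "Kunal_data.txt"))

# list.dirs() is recursive by default, so it also descends into year.
all_dirs <- list.dirs(root, full.names = FALSE)

# recursive = FALSE restricts the listing to the top folder only.
top_dirs <- list.dirs(root, full.names = FALSE, recursive = FALSE)
```

Passing "." instead of root would list the current working directory, as in the narration.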
Kunal has worked with data for most of his career, ranging from diffusion Markov chain processes to migrating reporting platforms.
Kunal has helped clients with early-stage engagement and has developed multi-week training programme curricula.
Kunal has a passion for statistics and data; he has delivered training relating to Hypothesis Testing, Exploring Data, Machine Learning Algorithms, and the Theory of Visualisation.
Data Scientist at a credit management company; applied statistical analysis to distressed portfolios.
Business Data Analyst at an investment bank; project to overhaul the legacy reporting and analytics platform.
Statistician within the Government Statistical Service; quantitative analysis and publishing statistical findings of emerging levels of council tax data.
Structured Credit Product Control at an investment bank; developing, maintaining, and deploying a PnL platform for the CVA Hedging trading desk.