Getting started with R LanguageData framesReading and writing tabular data in plain-text files (CSV, TSV, etc.)Pipe operators (%>% and others)Linear Models (Regression)data.tableboxplotFormulaSplit functionCreating vectorsFactorsPattern Matching and ReplacementRun-length encodingDate and TimeSpeeding up tough-to-vectorize codeggplot2ListsIntroduction to Geographical MapsBase PlottingSet operationstidyverseRcppRandom Numbers GeneratorString manipulation with stringi packageParallel processingSubsettingDebuggingInstalling packagesArima ModelsDistribution FunctionsShinyspatial analysissqldfCode profilingControl flow structuresColumn wise operationJSONRODBClubridateTime Series and Forecastingstrsplit functionWeb scraping and parsingGeneralized linear modelsReshaping data between long and wide formsRMarkdown and knitr presentationScope of variablesPerforming a Permutation TestxgboostR code vectorization best practicesMissing valuesHierarchical Linear ModelingClassesIntrospection*apply family of functions (functionals)Text miningANOVARaster and Image AnalysisSurvival analysisFault-tolerant/resilient codeReproducible RUpdating R and the package libraryFourier Series and Transformations.RprofiledplyrcaretExtracting and Listing Files in Compressed ArchivesProbability Distributions with RR in LaTeX with knitrWeb Crawling in RArithmetic OperatorsCreating reports with RMarkdownGPU-accelerated computingheatmap and heatmap.2Network analysis with the igraph packageFunctional programmingGet user inputroxygen2HashmapsSpark API (SparkR)Meta: Documentation GuidelinesI/O for foreign tables (Excel, SAS, SPSS, Stata)I/O for database tablesI/O for geographic data (shapefiles, etc.)I/O for raster imagesI/O for R's binary formatReading and writing stringsInput and outputRecyclingExpression: parse + evalRegular Expressions (regex)CombinatoricsPivot and unpivot with data.tableInspecting packagesSolving ODEs in RFeature Selection in R -- Removing Extraneous FeaturesBibliography in RMDWriting functions in RColor schemes for graphicsHierarchical clustering with hclustRandom Forest AlgorithmBar ChartCleaning dataRESTful R ServicesMachine learningVariablesThe Date classThe logical classThe character classNumeric classes and storage modesMatricesDate-time classes (POSIXct and POSIXlt)Using texreg to export models in a paper-ready wayPublishingImplement State Machine Pattern using S4 ClassReshape using tidyrModifying strings by substitutionNon-standard evaluation and standard evaluationRandomizationObject-Oriented Programming in RRegular Expression Syntax in RCoercionStandardize analyses by writing standalone R scriptsAnalyze tweets with RNatural language processingUsing pipe assignment in your own package %<>%: How to ?R Markdown Notebooks (from RStudio)Updating R versionAggregating data framesData acquisitionR memento by examplesCreating packages with devtools

Variables

Other topics

Variables, data structures and basic Operations

In R, data objects are manipulated using named data structures. The names of the objects might be called "variables" although that term does not have a specific meaning in the official R documentation. R names are case sensitive and may contain alphanumeric characters(a-z,A-z,0-9), the dot/period(.) and underscore(_). To create names for the data structures, we have to follow the following rules:

  • Names that start with a digit or an underscore (e.g. 1a), or names that are valid numerical expressions (e.g. .11), or names with dashes ('-') or spaces can only be used when they are quoted: `1a` and `.11`. The names will be printed with backticks:

     list( '.11' ="a")
     #$`.11`
     #[1] "a"
    
  • All other combinations of alphanumeric characters, dots and underscores can be used freely, where reference with or without backticks points to the same object.

  • Names that begin with . are considered system names and are not always visible using the ls()-function.

There is no restriction on the number of characters in a variable name.

Some examples of valid object names are: foobar, foo.bar, foo_bar, .foobar

In R, variables are assigned values using the infix-assignment operator <-. The operator = can also be used for assigning values to variables, however its proper use is for associating values with parameter names in function calls. Note that omitting spaces around operators may create confusion for users. The expression a<-1 is parsed as assignment (a <- 1) rather than as a logical comparison (a < -1).

> foo <- 42
> fooEquals = 43

So foo is assigned the value of 42. Typing foo within the console will output 42, while typing fooEquals will output 43.

> foo
[1] 42
> fooEquals
[1] 43

The following command assigns a value to the variable named x and prints the value simultaneously:

> (x <- 5) 
[1] 5
# actually two function calls: first one to `<-`; second one to the `()`-function
> is.function(`(`)
[1] TRUE  # Often used in R help page examples for its side-effect of printing.

It is also possible to make assignments to variables using ->.

> 5 -> x
> x
[1] 5
> 

Types of data structures

There are no scalar data types in R. Vectors of length-one act like scalars.

  • Vectors: Atomic vectors must be sequence of same-class objects.: a sequence of numbers, or a sequence of logicals or a sequence of characters. v <- c(2, 3, 7, 10), v2 <- c("a", "b", "c") are both vectors.
  • Matrices: A matrix of numbers, logical or characters. a <- matrix(data = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12), nrow = 4, ncol = 3, byrow = F). Like vectors, matrix must be made of same-class elements. To extract elements from a matrix rows and columns must be specified: a[1,2] returns [1] 5 that is the element on the first row, second column.
  • Lists: concatenation of different elements mylist <- list (course = 'stat', date = '04/07/2009', num_isc = 7, num_cons = 6, num_mat = as.character(c(45020, 45679, 46789, 43126, 42345, 47568, 45674)), results = c(30, 19, 29, NA, 25, 26 ,27) ). Extracting elements from a list can be done by name (if the list is named) or by index. In the given example mylist$results and mylist[[6]] obtains the same element. Warning: if you try mylist[6], R wont give you an error, but it extract the result as a list. While mylist[[6]][2] is permitted (it gives you 19), mylist[6][2] gives you an error.
  • data.frame: object with columns that are vectors of equal length, but (possibly) different types. They are not matrices. exam <- data.frame(matr = as.character(c(45020, 45679, 46789, 43126, 42345, 47568, 45674)), res_S = c(30, 19, 29, NA, 25, 26, 27), res_O = c(3, 3, 1, NA, 3, 2, NA), res_TOT = c(30,22,30,NA,28,28,27)). Columns can be read by name exam$matr, exam[, 'matr'] or by index exam[1], exam[,1]. Rows can also be read by name exam['rowname', ] or index exam[1,]. Dataframes are actually just lists with a particular structure (rownames-attribute and equal length components)

Common operations and some cautionary advice

Default operations are done element by element. See ?Syntax for the rules of operator precedence. Most operators (and may other functions in base R) have recycling rules that allow arguments of unequal length. Given these objects:

Example objects

> a <- 1
> b <- 2
> c <- c(2,3,4)
> d <- c(10,10,10)
> e <- c(1,2,3,4)
> f <- 1:6
> W <- cbind(1:4,5:8,9:12)
> Z <- rbind(rep(0,3),1:3,rep(10,3),c(4,7,1))

Some vector operations

> a+b # scalar + scalar
[1] 3
> c+d # vector + vector
[1] 12 13 14
> a*b # scalar * scalar 
[1] 2
> c*d # vector * vector (componentwise!)
[1] 20 30 40
> c+a # vector + scalar
[1] 3 4 5
> c^2 # 
[1]  4  9 16
> exp(c) 
[1]  7.389056 20.085537 54.598150

Some vector operation Warnings!

> c+e # warning but.. no errors, since recycling is assumed to be desired.
[1] 3 5 7 6
Warning message:
In c + e : longer object length is not a multiple of shorter object length

R sums what it can and then reuses the shorter vector to fill in the blanks... The warning was given only because the two vectors have lengths that are not exactly multiples. c+f # no warning whatsoever.

Some Matrix operations Warning!

> Z+W # matrix + matrix #(componentwise)
> Z*W # matrix* matrix#(Standard product is always componentwise)

To use a matrix multiply: V %*% W

> W + a # matrix+ scalar is still componentwise
     [,1] [,2] [,3]
[1,]    2    6   10
[2,]    3    7   11
[3,]    4    8   12
[4,]    5    9   13

> W + c # matrix + vector... : no warnings and R does the operation in a column-wise manner
     [,1] [,2] [,3]
[1,]    3    8   13
[2,]    5   10   12
[3,]    7    9   14
[4,]    6   11   16

"Private" variables

A leading dot in a name of a variable or function in R is commonly used to denote that the variable or function is meant to be hidden.

So, declaring the following variables

> foo <- 'foo'
> .foo <- 'bar'

And then using the ls function to list objects will only show the first object.

> ls()
[1] "foo"

However, passing all.names = TRUE to the function will show the 'private' variable

> ls(all.names = TRUE)
[1] ".foo"          "foo"   

Contributors

Topic Id: 9013

Example Ids: 18649

This site is not affiliated with any of the contributors.