Getting started with R LanguageData framesReading and writing tabular data in plain-text files (CSV, TSV, etc.)Pipe operators (%>% and others)Linear Models (Regression)data.tableboxplotFormulaSplit functionCreating vectorsFactorsPattern Matching and ReplacementRun-length encodingDate and TimeSpeeding up tough-to-vectorize codeggplot2ListsIntroduction to Geographical MapsBase PlottingSet operationstidyverseRcppRandom Numbers GeneratorString manipulation with stringi packageParallel processingSubsettingDebuggingInstalling packagesArima ModelsDistribution FunctionsShinyspatial analysissqldfCode profilingControl flow structuresColumn wise operationJSONRODBClubridateTime Series and Forecastingstrsplit functionWeb scraping and parsingGeneralized linear modelsReshaping data between long and wide formsRMarkdown and knitr presentationScope of variablesPerforming a Permutation TestxgboostR code vectorization best practicesMissing valuesHierarchical Linear ModelingClassesIntrospection*apply family of functions (functionals)Text miningANOVARaster and Image AnalysisSurvival analysisFault-tolerant/resilient codeReproducible RUpdating R and the package libraryFourier Series and Transformations.RprofiledplyrcaretExtracting and Listing Files in Compressed ArchivesProbability Distributions with RR in LaTeX with knitrWeb Crawling in RArithmetic OperatorsCreating reports with RMarkdownGPU-accelerated computingheatmap and heatmap.2Network analysis with the igraph packageFunctional programmingGet user inputroxygen2HashmapsSpark API (SparkR)Meta: Documentation GuidelinesI/O for foreign tables (Excel, SAS, SPSS, Stata)I/O for database tablesI/O for geographic data (shapefiles, etc.)I/O for raster imagesI/O for R's binary formatReading and writing stringsInput and outputRecyclingExpression: parse + evalRegular Expressions (regex)CombinatoricsPivot and unpivot with data.tableInspecting packagesSolving ODEs in RFeature Selection in R -- Removing Extraneous FeaturesBibliography in RMDWriting functions in RColor schemes for graphicsHierarchical clustering with hclustRandom Forest AlgorithmBar ChartCleaning dataRESTful R ServicesMachine learningVariablesThe Date classThe logical classThe character classNumeric classes and storage modesMatricesDate-time classes (POSIXct and POSIXlt)Using texreg to export models in a paper-ready wayPublishingImplement State Machine Pattern using S4 ClassReshape using tidyrModifying strings by substitutionNon-standard evaluation and standard evaluationRandomizationObject-Oriented Programming in RRegular Expression Syntax in RCoercionStandardize analyses by writing standalone R scriptsAnalyze tweets with RNatural language processingUsing pipe assignment in your own package %<>%: How to ?R Markdown Notebooks (from RStudio)Updating R versionAggregating data framesData acquisitionR memento by examplesCreating packages with devtools

Writing functions in R

Other topics

Named functions

R is full of functions, it is after all a functional programming language, but sometimes the precise function you need isn't provided in the Base resources. You could conceivably install a package containing the function, but maybe your requirements are just so specific that no pre-made function fits the bill? Then you're left with the option of making your own.

A function can be very simple, to the point of being being pretty much pointless. It doesn't even need to take an argument:

one <- function() { 1 }
one()
[1] 1

two <- function() { 1 + 1 }
two()
[1] 2

What's between the curly braces { } is the function proper. As long as you can fit everything on a single line they aren't strictly needed, but can be useful to keep things organized.

A function can be very simple, yet highly specific. This function takes as input a vector (vec in this example) and outputs the same vector with the vector's length (6 in this case) subtracted from each of the vector's elements.

vec <- 4:9
subtract.length <- function(x) { x - length(x) }
subtract.length(vec)
[1] -2 -1  0  1  2  3

Notice that length() is in itself a pre-supplied (i.e. Base) function. You can of course use a previously self-made function within another self-made function, as well as assign variables and perform other operations while spanning several lines:

vec2 <- (4:7)/2

msdf <- function(x, multiplier=4) {
    mult <- x * multiplier
    subl <- subtract.length(x)
    data.frame(mult, subl)
}

msdf(vec2, 5)
  mult subl
1 10.0 -2.0
2 12.5 -1.5
3 15.0 -1.0
4 17.5 -0.5

multiplier=4 makes sure that 4 is the default value of the argument multiplier, if no value is given when calling the function 4 is what will be used.

The above are all examples of named functions, so called simply because they have been given names (one, two, subtract.length etc.)

Anonymous functions

An anonymous function is, as the name implies, not assigned a name. This can be useful when the function is a part of a larger operation, but in itself does not take much place. One frequent use-case for anonymous functions is within the *apply family of Base functions.

Calculate the root mean square for each column in a data.frame:

df <- data.frame(first=5:9, second=(0:4)^2, third=-1:3)

apply(df, 2, function(x) { sqrt(sum(x^2)) })
    first    second     third 
15.968719 18.814888  3.872983 

Create a sequence of step-length one from the smallest to the largest value for each row in a matrix.

x <- sample(1:6, 12, replace=TRUE)
mat <- matrix(x, nrow=3)

apply(mat, 1, function(x) { seq(min(x), max(x)) })

An anonymous function can also stand on its own:

(function() { 1 })()
[1] 1

is equivalent to

f <- function() { 1 })
f()
[1] 1

RStudio code snippets

This is just a small hack for those who use self-defined functions often.
Type "fun" RStudio IDE and hit TAB.

enter image description here

The result will be a skeleton of a new function.

name <- function(variables) {
        
}

One can easily define their own snippet template, i.e. like the one below

name <- function(df, x, y) {
        require(tidyverse)
        out <- 
        return(out)
}

The option is Edit Snippets in the Global Options -> Code menu.

Passing column names as argument of a function

Sometimes one would like to pass names of columns from a data frame to a function. They may be provided as strings and used in a function using [[. Let's take a look at the following example, which prints to R console basic stats of selected variables:

basic.stats <- function(dset, vars){
    for(i in 1:length(vars)){
        print(vars[i])
        print(summary(dset[[vars[i]]]))
    }
}

basic.stats(iris, c("Sepal.Length", "Petal.Width"))

As a result of running above given code, names of selected variables and their basic summary statistics (minima, first quantiles, medians, means, third quantiles and maxima) are printed in R console. The code dset[[vars[i]]] selects i-th element from the argument vars and selects a corresponding column in declared input data set dset. For example, declaring iris[["Sepal.Length"]] alone would print the Sepal.Length column from the iris data set as a vector.

Contributors

Topic Id: 7937

Example Ids: 12720,12721,25731,27283

This site is not affiliated with any of the contributors.