Getting started with R LanguageData framesReading and writing tabular data in plain-text files (CSV, TSV, etc.)Pipe operators (%>% and others)Linear Models (Regression)data.tableboxplotFormulaSplit functionCreating vectorsFactorsPattern Matching and ReplacementRun-length encodingDate and TimeSpeeding up tough-to-vectorize codeggplot2ListsIntroduction to Geographical MapsBase PlottingSet operationstidyverseRcppRandom Numbers GeneratorString manipulation with stringi packageParallel processingSubsettingDebuggingInstalling packagesArima ModelsDistribution FunctionsShinyspatial analysissqldfCode profilingControl flow structuresColumn wise operationJSONRODBClubridateTime Series and Forecastingstrsplit functionWeb scraping and parsingGeneralized linear modelsReshaping data between long and wide formsRMarkdown and knitr presentationScope of variablesPerforming a Permutation TestxgboostR code vectorization best practicesMissing valuesHierarchical Linear ModelingClassesIntrospection*apply family of functions (functionals)Text miningANOVARaster and Image AnalysisSurvival analysisFault-tolerant/resilient codeReproducible RUpdating R and the package libraryFourier Series and Transformations.RprofiledplyrcaretExtracting and Listing Files in Compressed ArchivesProbability Distributions with RR in LaTeX with knitrWeb Crawling in RArithmetic OperatorsCreating reports with RMarkdownGPU-accelerated computingheatmap and heatmap.2Network analysis with the igraph packageFunctional programmingGet user inputroxygen2HashmapsSpark API (SparkR)Meta: Documentation GuidelinesI/O for foreign tables (Excel, SAS, SPSS, Stata)I/O for database tablesI/O for geographic data (shapefiles, etc.)I/O for raster imagesI/O for R's binary formatReading and writing stringsInput and outputRecyclingExpression: parse + evalRegular Expressions (regex)CombinatoricsPivot and unpivot with data.tableInspecting packagesSolving ODEs in RFeature Selection in R -- Removing Extraneous FeaturesBibliography in RMDWriting functions in RColor schemes for graphicsHierarchical clustering with hclustRandom Forest AlgorithmBar ChartCleaning dataRESTful R ServicesMachine learningVariablesThe Date classThe logical classThe character classNumeric classes and storage modesMatricesDate-time classes (POSIXct and POSIXlt)Using texreg to export models in a paper-ready wayPublishingImplement State Machine Pattern using S4 ClassReshape using tidyrModifying strings by substitutionNon-standard evaluation and standard evaluationRandomizationObject-Oriented Programming in RRegular Expression Syntax in RCoercionStandardize analyses by writing standalone R scriptsAnalyze tweets with RNatural language processingUsing pipe assignment in your own package %<>%: How to ?R Markdown Notebooks (from RStudio)Updating R versionAggregating data framesData acquisitionR memento by examplesCreating packages with devtools

Fault-tolerant/resilient code

Other topics

Remarks:

tryCatch

tryCatch returns the value associated to executing expr unless there's a condition: a warning or an error. If that's the case, specific return values (e.g. return(NA) above) can be specified by supplying a handler function for the respective conditions (see arguments warning and error in ?tryCatch). These can be functions that already exist, but you can also define them within tryCatch (as we did above).

Implications of choosing specific return values of the handler functions

As we've specified that NA should be returned in case of an error in the "try part", the third element in y is NA. If we'd have chosen NULL to be the return value, the length of y would just have been 2 instead of 3 as lapply will simply "ignore/drop" return values that are NULL. Also note that if you don't specify an explicit return value via return, the handler functions will return NULL (i.e. in case of an error or a warning condition).

"Undesired" warning message

When the third element of our urls vector hits our function, we get the following warning in addition to the fact that an error occurs (readLines first complains that it can't open the connection via a warning before actually failing with an error):

Warning message:
    In file(con, "r") : cannot open file 'I'm no URL': No such file or directory

An error "wins" over a warning, so we're not really interested in the warning in this particular case. Thus we have set warn = FALSE in readLines, but that doesn't seem to have any effect. An alternative way to suppress the warning is to use

suppressWarnings(readLines(con = url))

instead of

readLines(con = url, warn = FALSE)

Using tryCatch()

We're defining a robust version of a function that reads the HTML code from a given URL. Robust in the sense that we want it to handle situations where something either goes wrong (error) or not quite the way we planned it to (warning). The umbrella term for errors and warnings is condition

Function definition using tryCatch

readUrl <- function(url) {
    out <- tryCatch(

        ########################################################
        # Try part: define the expression(s) you want to "try" #
        ########################################################

        {
            # Just to highlight: 
            # If you want to use more than one R expression in the "try part" 
            # then you'll have to use curly brackets. 
            # Otherwise, just write the single expression you want to try and 

            message("This is the 'try' part")
            readLines(con = url, warn = FALSE) 
        },

        ########################################################################
        # Condition handler part: define how you want conditions to be handled #
        ########################################################################

        # Handler when a warning occurs:
        warning = function(cond) {
            message(paste("Reading the URL caused a warning:", url))
            message("Here's the original warning message:")
            message(cond)

            # Choose a return value when such a type of condition occurs
            return(NULL)
        },

        # Handler when an error occurs:
        error = function(cond) {
            message(paste("This seems to be an invalid URL:", url))
            message("Here's the original error message:")
            message(cond)

            # Choose a return value when such a type of condition occurs
            return(NA)
        },

        ###############################################
        # Final part: define what should happen AFTER #
        # everything has been tried and/or handled    #
        ###############################################

        finally = {
            message(paste("Processed URL:", url))
            message("Some message at the end\n")
        }
    )    
    return(out)
}

Testing things out

Let's define a vector of URLs where one element isn't a valid URL

urls <- c(
    "http://stat.ethz.ch/R-manual/R-devel/library/base/html/connections.html",
    "http://en.wikipedia.org/wiki/Xz",
    "I'm no URL"
)

And pass this as input to the function we defined above

y <- lapply(urls, readUrl)
# Processed URL: http://stat.ethz.ch/R-manual/R-devel/library/base/html/connections.html
# Some message at the end
#
# Processed URL: http://en.wikipedia.org/wiki/Xz
# Some message at the end
#
# URL does not seem to exist: I'm no URL 
# Here's the original error message:
# cannot open the connection
# Processed URL: I'm no URL
# Some message at the end
#
# Warning message:
# In file(con, "r") : cannot open file 'I'm no URL': No such file or directory

Investigating the output

length(y)
# [1] 3

head(y[[1]])
# [1] "<!DOCTYPE html PUBLIC \"-//W3C//DTD HTML 4.01 Transitional//EN\">"      
# [2] "<html><head><title>R: Functions to Manipulate Connections</title>"      
# [3] "<meta http-equiv=\"Content-Type\" content=\"text/html; charset=utf-8\">"
# [4] "<link rel=\"stylesheet\" type=\"text/css\" href=\"R.css\">"             
# [5] "</head><body>"                                                          
# [6] ""    

y[[3]]
# [1] NA

Parameters:

ParameterDetails
exprIn case the "try part" was completed successfully tryCatch will return the last evaluated expression. Hence, the actual value being returned in case everything went well and there is no condition (i.e. a warning or an error) is the return value of readLines. Note that you don't need to explicilty state the return value via return as code in the "try part" is not wrapped insided a function environment (unlike that for the condition handlers for warnings and error below)
warning/error/etcProvide/define a handler function for all the conditions that you want to handle explicitly. AFAIU, you can provide handlers for any type of conditions (not just warnings and errors, but also custom conditions; see simpleCondition and friends for that) as long as the name of the respective handler function matches the class of the respective condition (see the Details part of the doc for tryCatch).
finallyHere goes everything that should be executed at the very end, regardless if the expression in the "try part" succeeded or if there was any condition. If you want more than one expression to be executed, then you need to wrap them in curly brackets, otherwise you could just have written finally = <expression> (i.e. the same logic as for "try part".

Contributors

Topic Id: 4060

Example Ids: 14150

This site is not affiliated with any of the contributors.