tidyverse
tidyverse is a fast and elegant way to turn base R into an enhanced tool, redesigned by Hadley Wickham and RStudio. The development of all packages included in tidyverse follows the principles of The tidy tools manifesto. But first, let the authors describe their work:
The tidyverse is a set of packages that work in harmony because they share common data representations and API design. The tidyverse package is designed to make it easy to install and load core packages from the tidyverse in a single command.
The best place to learn about all the packages in the tidyverse and how they fit together is R for Data Science. Expect to hear more about the tidyverse in the coming months as I work on improved package websites, making citation easier, and providing a common home for discussions about data analysis with the tidyverse.
(source)
Just as with ordinary R packages, you need to install and load the package.
install.packages("tidyverse")
library("tidyverse")
The difference is that a single command installs or loads a couple of dozen packages at once. As a bonus, you can rest assured that all the installed and loaded packages are of compatible versions.
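To see which packages the single install covers, the tidyverse package provides a helper, tidyverse_packages(). The following is a minimal sketch that assumes tidyverse may or may not be installed on the machine; the variable name pkgs is only illustrative, and the exact list depends on the installed tidyverse version.

```r
# List the packages that belong to the tidyverse, if it is installed.
# Falls back to an empty character vector instead of failing.
pkgs <- if (requireNamespace("tidyverse", quietly = TRUE)) {
  tidyverse::tidyverse_packages()
} else {
  message("tidyverse is not installed; run install.packages(\"tidyverse\") first")
  character(0)
}

# Number of packages and the first few names
print(length(pkgs))
print(head(pkgs))
```

Calling library("tidyverse") attaches only the core subset of these packages; the others are installed but have to be loaded individually with library().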
The commonly known and widely used packages:
Packages for manipulating specific data formats:
Data import:
And modelling:
Finally, tidyverse suggests the use of:
A tbl_df (pronounced tibble diff) is a variation of a data frame that is often used in tidyverse packages. It is implemented in the tibble package.
Use the as_tibble function (called as_data_frame in older tibble releases, where it is now deprecated) to turn a data frame into a tbl_df:
library(tibble)
mtcars_tbl <- as_tibble(mtcars)
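A quick way to confirm that the conversion produced a tbl_df is to inspect the class of the result. This sketch uses as_tibble(), the current name for this conversion in the tibble package, and assumes tibble is installed.

```r
library(tibble)

# Convert the built-in mtcars data frame to a tibble
mtcars_tbl <- as_tibble(mtcars)

# A tibble is still a data frame, with extra classes layered on top
print(class(mtcars_tbl))
print(is.data.frame(mtcars_tbl))

# Dimensions are unchanged by the conversion
print(dim(mtcars_tbl))
```

Because tbl_df inherits from data.frame, most code written for ordinary data frames keeps working on the converted object.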
One of the most notable differences between data.frames and tbl_dfs is how they print:
# A tibble: 32 x 11
     mpg   cyl  disp    hp  drat    wt  qsec    vs    am  gear  carb
 * <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
 1  21.0     6 160.0   110  3.90 2.620 16.46     0     1     4     4
 2  21.0     6 160.0   110  3.90 2.875 17.02     0     1     4     4
 3  22.8     4 108.0    93  3.85 2.320 18.61     1     1     4     1
 4  21.4     6 258.0   110  3.08 3.215 19.44     1     0     3     1
 5  18.7     8 360.0   175  3.15 3.440 17.02     0     0     3     2
 6  18.1     6 225.0   105  2.76 3.460 20.22     1     0     3     1
 7  14.3     8 360.0   245  3.21 3.570 15.84     0     0     3     4
 8  24.4     4 146.7    62  3.69 3.190 20.00     1     0     4     2
 9  22.8     4 140.8    95  3.92 3.150 22.90     1     0     4     2
10  19.2     6 167.6   123  3.92 3.440 18.30     1     0     4     4
# ... with 22 more rows
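The printed output includes a summary of the table's dimensions (32 x 11).
It shows the type of each column (dbl).
It prints only a limited number of rows (to change this, use options(tibble.print_max = [number])).
Many functions in the dplyr package work naturally with tbl_dfs, such as group_by().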
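The row limit and the group_by() integration mentioned above can be tried out with a short sketch like the one below. It assumes the tibble and dplyr packages are installed; the names mpg_by_cyl and mean_mpg are illustrative only. The n argument of print() is used here as a one-off alternative to changing the global tibble.print_max option.

```r
library(tibble)
library(dplyr)

mtcars_tbl <- as_tibble(mtcars)

# Print only the first 5 rows for this single call,
# without touching the global tibble options
print(mtcars_tbl, n = 5)

# Group by number of cylinders and compute the mean mpg per group
mpg_by_cyl <- mtcars_tbl %>%
  group_by(cyl) %>%
  summarise(mean_mpg = mean(mpg))

# One row per distinct value of cyl
print(mpg_by_cyl)
```

The result of summarise() is itself a tibble, so it prints in the same compact style as the input.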