Source: vignettes/glue.Rmd
glue.Rmd
The glue package contains functions for string interpolation: gluingtogether character strings and R code.
library(glue)
Gluing and interpolating
glue()
can be used to glue together pieces of text:
glue("glue ", "some ", "text ", "together")#> glue some text together
But glue’s real power comes with {}
: anything inside of{}
is evaluated and pasted into the string. This makes iteasy to interpolate variables:
name <- "glue"glue("We are learning how to use the {name} R package.")#> We are learning how to use the glue R package.
As well as more complex expressions:
release_date <- as.Date("2017-06-13")glue("Release was on a {format(release_date, '%A')}.")#> Release was on a Tuesday.
Control of line breaks
glue()
honors the line breaks in its input:
glue(" A formatted string Can have multiple lines with additional indention preserved ")#> A formatted string#> Can have multiple lines#> with additional indention preserved
The example above demonstrates some other important facts about thepre-processing of the template string:
- An empty first or last line is automatically trimmed.
- Leading whitespace that is common across all lines is trimmed.
The elimination of common leading whitespace is advantageous, becauseyou aren’t forced to choose between indenting your code normally andgetting the output you actually want. This is easier to appreciate whenyou have glue()
inside a function body (this example alsoshows an alternative way of styling the end of a glue()
call):
foo <- function() { glue(" A formatted string Can have multiple lines with additional indention preserved")}foo()#> A formatted string#> Can have multiple lines#> with additional indention preserved
On the other hand, what if you don’t want a line break in the output,but you also like to limit the length of lines in your source code to,e.g., 80 characters? The first option is to use \\
to breakthe template string into multiple lines, without getting line breaks inthe output:
release_date <- as.Date("2017-06-13")glue(" The first version of the glue package was released on \\ a {format(release_date, '%A')}.")#> The first version of the glue package was released on a Tuesday.
This comes up fairly often when an expression to evaluate inside{}
takes up more characters than its result,i.e.format(release_date, '%A')
versusTuesday
. A second way to achieve the same result is tobreak the template into individual pieces, which are thenconcatenated.
glue( "The first version of the glue package was released on ", "a {format(release_date, '%A')}.")#> The first version of the glue package was released on a Tuesday.
If you want an explicit newline at the start or end, include an extraempty line.
# no leading or trailing newlinex <- glue(" blah ")unclass(x)#> [1] "blah"# both a leading and trailing newliney <- glue(" blah ")unclass(y)#> [1] "\nblah\n"
We use unclass()
above to make it easier to see theabsence and presence of the newlines, i.e.to reveal the literal\n
escape sequences. glue()
and friendsgenerally return a glue object, which is a character vector with the S3class "glue"
. The "glue"
class existsprimarily for the sake of a print method, which displays the naturalformatted result of a glue string. Most of the time this isexactly what the user wants to see. The example above happensto be an exception, where we really do want to see the underlying stringrepresentation.
Here’s another example to drive home the difference between printinga glue object and looking at its string representation.as.character()
is a another way to do this that is arguablymore expressive.
x <- glue(' abc " } xyz')class(x)#> [1] "glue" "character"x#> abc#> " }#> #> xyzunclass(x)#> [1] "abc\n\"\t}\n\nxyz"as.character(x)#> [1] "abc\n\"\t}\n\nxyz"
Delimiters
By default, code to be evaluated goes inside {}
in aglue string. If want a literal curly brace in your string, doubleit:
glue("The name of the package is {name}, not {{name}}.")#> The name of the package is glue, not {name}.
Sometimes it’s just more convenient to use different delimitersaltogether, especially if the template text comes from elsewhere or issubject to external requirements. Consider this example where we want tointerpolate the function name into a code snippet that defines afunction:
fn_def <- " <<NAME>> <- function(x) { # imagine a function body here }"glue(fn_def, NAME = "my_function", .open = "<<", .close = ">>")#> my_function <- function(x) {#> # imagine a function body here#> }
In this glue string, {
and }
have veryspecial meaning. If we forced ourselves to double them, suddenly itdoesn’t look like normal R code anymore. Using alternative delimiters isa nice option in cases like this.
Where glue looks for values
By default, glue()
evaluates the code inside{}
in the caller environment:
glue(..., .envir = parent.frame())
So, for a top-level glue()
call, that means the globalenvironment.
x <- "the caller environment"glue("By default, `glue()` evaluates code in {x}.")#> By default, `glue()` evaluates code in the caller environment.
But you can provide more narrowly scoped values by passing them toglue()
in name = value
form:
x <- "the local environment"glue( "`glue()` can access values from {x} or from {y}. {z}", y = "named arguments", z = "Woo!")#> `glue()` can access values from the local environment or from named arguments. Woo!
If the relevant data lives in a data frame (or list or environment),use glue_data()
instead:
mini_mtcars <- head(cbind(model = rownames(mtcars), mtcars))rownames(mini_mtcars) <- NULLglue_data(mini_mtcars, "{model} has {hp} hp.")#> Mazda RX4 has 110 hp.#> Mazda RX4 Wag has 110 hp.#> Datsun 710 has 93 hp.#> Hornet 4 Drive has 110 hp.#> Hornet Sportabout has 175 hp.#> Valiant has 105 hp.
glue_data()
is very natural to use with the pipe:
mini_mtcars |> glue_data("{model} gets {mpg} miles per gallon.")#> Mazda RX4 gets 21 miles per gallon.#> Mazda RX4 Wag gets 21 miles per gallon.#> Datsun 710 gets 22.8 miles per gallon.#> Hornet 4 Drive gets 21.4 miles per gallon.#> Hornet Sportabout gets 18.7 miles per gallon.#> Valiant gets 18.1 miles per gallon.
Returning to glue()
, recall that it defaults toevaluation in the caller environment. This has happy implications insidea dplyr::mutate()
pipeline. The data-masking feature ofmutate()
means the columns of the target data frame are “inscope” for a glue()
call:
library(dplyr)mini_mtcars |> mutate(note = glue("{model} gets {mpg} miles per gallon.")) |> select(note, cyl, disp)#> note cyl disp#> 1 Mazda RX4 gets 21 miles per gallon. 6 160#> 2 Mazda RX4 Wag gets 21 miles per gallon. 6 160#> 3 Datsun 710 gets 22.8 miles per gallon. 4 108#> 4 Hornet 4 Drive gets 21.4 miles per gallon. 6 258#> 5 Hornet Sportabout gets 18.7 miles per gallon. 8 360#> 6 Valiant gets 18.1 miles per gallon. 6 225
SQL
glue has explicit support for constructing SQL statements. Usebackticks to quote identifiers. Normal strings and numbers are quotedappropriately for your backend.
con <- DBI::dbConnect(RSQLite::SQLite(), ":memory:")colnames(iris) <- gsub("[.]", "_", tolower(colnames(iris)))DBI::dbWriteTable(con, "iris", iris)var <- "sepal_width"tbl <- "iris"num <- 2val <- "setosa"glue_sql(" SELECT {`var`} FROM {`tbl`} WHERE {`tbl`}.sepal_length > {num} AND {`tbl`}.species = {val} ", .con = con)#> <SQL> SELECT `sepal_width`#> FROM `iris`#> WHERE `iris`.sepal_length > 2#> AND `iris`.species = 'setosa'
glue_sql()
can be used in conjunction with parameterizedqueries using DBI::dbBind()
to provide protection for SQLInjection attacks.
sql <- glue_sql(" SELECT {`var`} FROM {`tbl`} WHERE {`tbl`}.sepal_length > ?", .con = con)query <- DBI::dbSendQuery(con, sql)DBI::dbBind(query, list(num))DBI::dbFetch(query, n = 4)#> sepal_width#> 1 3.5#> 2 3.0#> 3 3.2#> 4 3.1DBI::dbClearResult(query)
glue_sql()
can be used to build up more complex querieswith interchangeable sub queries. It returns DBI::SQL()
objects which are properly protected from quoting.
sub_query <- glue_sql(" SELECT * FROM {`tbl`} ", .con = con)glue_sql(" SELECT s.{`var`} FROM ({sub_query}) AS s ", .con = con)#> <SQL> SELECT s.`sepal_width`#> FROM (SELECT *#> FROM `iris`) AS s
If you want to input multiple values for use in SQL IN statements put*
at the end of the value and the values will be collapsedand quoted appropriately.
glue_sql("SELECT * FROM {`tbl`} WHERE sepal_length IN ({vals*})", vals = 1, .con = con)#> <SQL> SELECT * FROM `iris` WHERE sepal_length IN (1)glue_sql("SELECT * FROM {`tbl`} WHERE sepal_length IN ({vals*})", vals = 1:5, .con = con)#> <SQL> SELECT * FROM `iris` WHERE sepal_length IN (1, 2, 3, 4, 5)glue_sql("SELECT * FROM {`tbl`} WHERE species IN ({vals*})", vals = "setosa", .con = con)#> <SQL> SELECT * FROM `iris` WHERE species IN ('setosa')glue_sql("SELECT * FROM {`tbl`} WHERE species IN ({vals*})", vals = c("setosa", "versicolor"), .con = con)#> <SQL> SELECT * FROM `iris` WHERE species IN ('setosa', 'versicolor')