This is an alternative to utils::download.file() and a convenience wrapper for GET() + httr::write_disk() to perform file downloads.

download_file(url, path, overwrite = FALSE, ...)

Arguments

url

The URL(s) of the file(s) to retrieve. If multiple URLs are provided, the same number of paths must also be provided.

path

Path(s) to save content to. If more than one path is specified, the same number of URLs must also be provided. This parameter will be path.expand()ed.

overwrite

Will only overwrite existing path if TRUE.

...

passed on to GET()

Value

A data frame containing the url(s), path(s), cache status, and HTTP status code(s). If there was an error downloading a file, the path, status code, and HTTP status columns will be NA. If the file was not re-downloaded, the status code will be 399.

Details

Since this function uses GET(), callers can pass in httr configuration options to customize the behaviour of the download process (e.g. specify a User-Agent via user_agent(), set proxy config via use_proxy(), etc.).
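
As a rough illustration of the pattern this function builds on, the sketch below calls httr directly. The helper name `fetch_to_disk` and the User-Agent string are assumptions made for illustration; this is not the function's actual implementation.

```r
# Hypothetical helper: a plain httr call that writes a response body to disk.
# It only illustrates the GET() + write_disk() pattern that download_file() wraps.
fetch_to_disk <- function(url, path, ...) {
  res <- httr::GET(
    url,
    httr::write_disk(path.expand(path), overwrite = TRUE),
    ...  # extra httr config, e.g. user_agent() or timeout()
  )
  httr::status_code(res)
}

# Example call (requires network access, so it is left commented out):
# fetch_to_disk(
#   "https://google.com", tempfile(),
#   httr::user_agent("my-downloader/0.1"),  # placeholder User-Agent string
#   httr::timeout(10)
# )
```

Any configuration object accepted by GET() can be supplied through `...` in the same way.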

The function is also "cache-aware" in the sense that you have to deliberately specify overwrite = TRUE to force a re-download. This has the potential to save bandwidth for both the caller and the site hosting the files.
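
The cache check described above could be approximated as follows. This is a minimal sketch that assumes the decision depends only on whether the target file already exists; `needs_download` is a hypothetical name, not part of the package.

```r
# Hypothetical helper: decide whether a download is needed for a target path.
needs_download <- function(path, overwrite = FALSE) {
  overwrite | !file.exists(path.expand(path))
}

existing <- tempfile()
writeLines("cached content", existing)  # simulate a previously downloaded file
missing_path <- tempfile()              # a path with nothing on disk yet

needs_download(existing)                    # FALSE: file exists, overwrite not requested
needs_download(existing, overwrite = TRUE)  # TRUE: re-download forced
needs_download(missing_path)                # TRUE: nothing cached at this path
```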

Note

While this function supports specifying multiple URLs and download paths, it does not perform concurrent downloads.
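
To make the sequential behaviour concrete, here is a simplified sketch of processing url/path pairs one at a time. The names `download_each` and `fake_fetch`, and the column names of the result, are assumptions for illustration; the stand-in fetcher is used so the example runs without network access.

```r
# Hypothetical sketch: process url/path pairs one after another.
# `fetch` is any function(url, path) that returns a status code.
download_each <- function(urls, paths, fetch) {
  stopifnot(length(urls) == length(paths))  # one path per URL
  status <- vapply(seq_along(urls), function(i) {
    # A failure in one download becomes NA instead of aborting the rest
    tryCatch(as.integer(fetch(urls[[i]], paths[[i]])),
             error = function(e) NA_integer_)
  }, integer(1))
  data.frame(url = urls, path = paths, status = status,
             stringsAsFactors = FALSE)
}

# Stand-in fetcher that fails for one URL
fake_fetch <- function(url, path) {
  if (url == "badurl") stop("could not resolve host")
  200L
}

res <- download_each(c("https://google.com", "badurl"),
                     c(tempfile(), tempfile()), fake_fetch)
res$status  # 200 NA
```

Each pair is handled in order, so total run time grows with the number of URLs.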

See also

GET(); write_disk()

Examples

# NOT RUN {
tmp1 <- tempfile()
tmp2 <- tempfile()
tmp3 <- tempfile()

download_file("https://google.com", tmp1) # downloads fine
download_file("https://google.com", tmp1) # doesn't re-download since it's cached
download_file("https://google.com", tmp1, overwrite = TRUE) # re-downloads (overwrites file)
download_file("https://google.com", tmp2) # re-downloads (new file)
download_file("badurl", tmp3) # handles major errors gracefully

# multi-file example with no-caching
download_file(
  c(rep("https://google.com", 2), "badurl"),
  c(tmp1, tmp2, tmp3),
  overwrite = TRUE
)

# multi-file example with caching
download_file(
  c(rep("https://google.com", 2), "badurl"),
  c(tmp1, tmp2, tmp3),
  overwrite = FALSE
)
# }