Download file from the Internet (cache-aware)

An alternative to utils::download.file(): a convenience wrapper around httr::GET() and httr::write_disk() for downloading files.

download_file(url, path, overwrite = FALSE, ...)

Arguments

url	The URL(s) of the file(s) to retrieve. If multiple URLs are provided, the same number of `path`s must also be provided.
path	Path(s) to save content to. If more than one `path` is specified, the same number of `url`s must also be provided. Each path is passed through `path.expand()`.
overwrite	If `TRUE`, an existing file at `path` is overwritten; otherwise an existing file is left untouched.
...	passed on to `GET()`

Value

A data frame containing the url(s), path(s), cache status, and HTTP status code(s). If there was an error downloading a file, the path, status code, and HTTP status columns will be NA. If the file was not re-downloaded, the status code will be 399.

Details

Since this function uses GET(), callers can pass in httr configuration options to customize the behaviour of the download process (e.g. specify a User-Agent via user_agent(), set proxy config via use_proxy(), etc.).
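As a rough illustration of how options given in `...` could reach GET(), here is a minimal sketch. The helper name `fetch_to_disk` is hypothetical and is not part of this package; only the httr calls (GET(), write_disk(), user_agent(), timeout(), status_code()) are real httr functions.

```r
# Hypothetical helper (assumed name) showing `...` being forwarded to httr::GET().
# This is a sketch of the idea, not the package's actual implementation.
fetch_to_disk <- function(url, path, overwrite = FALSE, ...) {
  httr::GET(
    url,
    httr::write_disk(path.expand(path), overwrite = overwrite),
    ...  # any httr config objects, e.g. user_agent(), use_proxy(), timeout()
  )
}

# Usage (needs the httr package and network access, so it is left commented out):
# res <- fetch_to_disk(
#   "https://google.com", tempfile(),
#   httr::user_agent("my-downloader/0.1"),  # example User-Agent string (assumed)
#   httr::timeout(10)                       # give up after 10 seconds
# )
# httr::status_code(res)
```

Because the config objects are ordinary arguments, the same call pattern should apply to download_file() itself, e.g. passing user_agent() after the `overwrite` argument.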

The function is also "cache-aware": an existing file is reused, and you must explicitly specify overwrite = TRUE to force a re-download. This can save bandwidth for both the caller and the site hosting the files.
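The cache check described above might look roughly like the following sketch. The function name `cached_fetch` and the result column names (`status_code`, `cache_used`) are assumptions made for illustration; the real function's internals and column names may differ.

```r
# Hypothetical sketch of a cache-aware download (assumed names throughout).
cached_fetch <- function(url, path, overwrite = FALSE, ...) {
  path <- path.expand(path)

  # File already present and no overwrite requested: skip the network entirely
  # and report the 399 "not re-downloaded" status mentioned in the Value section.
  if (file.exists(path) && !overwrite) {
    return(data.frame(
      url = url, path = path, status_code = 399L, cache_used = TRUE,
      stringsAsFactors = FALSE
    ))
  }

  # Otherwise download, turning hard failures (e.g. a malformed URL) into NULL.
  res <- tryCatch(
    httr::GET(url, httr::write_disk(path, overwrite = TRUE), ...),
    error = function(e) NULL
  )

  # On error, report NA for the path and status code instead of stopping.
  if (is.null(res)) {
    return(data.frame(
      url = url, path = NA_character_, status_code = NA_integer_,
      cache_used = FALSE, stringsAsFactors = FALSE
    ))
  }

  data.frame(
    url = url, path = path, status_code = httr::status_code(res),
    cache_used = FALSE, stringsAsFactors = FALSE
  )
}
```

Note that this notion of caching is based purely on whether the file exists on disk; the sketch does no freshness check against the remote server.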

Note

While this function supports multiple URLs and download paths, it does not perform concurrent downloads.
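One way to picture the sequential, non-concurrent behaviour is a plain loop over matched URL/path pairs. In this sketch `fetch_many` and its `fetch_one` argument are hypothetical names; `fetch_one` stands in for any function that downloads a single file and returns a one-row data frame.

```r
# Hypothetical sketch: process URL/path pairs one after another (no concurrency).
fetch_many <- function(urls, paths, fetch_one) {
  # The url and path vectors must line up one-to-one.
  stopifnot(length(urls) == length(paths))

  # Map() calls fetch_one(url, path) for each pair, in order.
  results <- Map(fetch_one, urls, paths)

  # Stack the one-row data frames into a single result.
  do.call(rbind, unname(results))
}

# Usage with a stand-in downloader that only records what it was asked to do:
fake_fetch <- function(url, path) {
  data.frame(url = url, path = path, stringsAsFactors = FALSE)
}
fetch_many(c("https://google.com", "badurl"), c(tempfile(), tempfile()), fake_fetch)
```

Since each download finishes before the next begins, total time grows with the number of files.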

Examples

# NOT RUN {
tmp1 <- tempfile()
tmp2 <- tempfile()
tmp3 <- tempfile()

download_file("https://google.com", tmp1) # downloads fine
download_file("https://google.com", tmp1) # doesn't re-download since it's cached
download_file("https://google.com", tmp1, overwrite = TRUE) # re-downloads (overwrites file)
download_file("https://google.com", tmp2) # re-downloads (new file)
download_file("badurl", tmp3) # handles major errors gracefully

# multi-file example with no-caching
download_file(
  c(rep("https://google.com", 2), "badurl"),
  c(tmp1, tmp2, tmp3),
  overwrite = TRUE
)

# multi-file example with caching
download_file(
  c(rep("https://google.com", 2), "badurl"),
  c(tmp1, tmp2, tmp3),
  overwrite = FALSE
)
# }