Summarise the contents at a URL to essential bits — summarise_url • crux

Fetches the HTML from the URL x and returns its essential components, including:

url
original_url
title
description
site_name
theme_color
amp_url
canonical_url
image_url
video_url
feed_url
favicon_url
reading_time
text (the reduced, plain text)

If any components cannot be derived from the contents of the URL, they will be NA.
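
Because underivable components come back as NA, it can help to check which ones are missing before using the result. The sketch below is illustrative only: `res` is a hypothetical stand-in built by hand, not real output from summarise_url(), its values are invented, and it assumes the result can be treated as a named list with one value per component (a one-row data frame would behave the same way here).

```r
# Hypothetical stand-in for a summarise_url() result.
# Names follow the component list above; all values are invented.
res <- list(
  url          = "https://example.com/post",
  title        = "An example post",
  description  = NA_character_,
  amp_url      = NA_character_,
  reading_time = 3L,
  text         = "Plain text of the post."
)

# TRUE for every component that holds a single NA value.
is_missing <- vapply(
  res,
  function(v) length(v) == 1L && is.na(v),
  logical(1)
)

# Names of the components that could not be derived.
missing_fields <- names(res)[is_missing]
print(missing_fields)

# Keep only the components that were found.
found <- res[!is_missing]
```

With a real result, the same check could be applied to the value returned by summarise_url(x), provided the assumption about its shape holds.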

summarise_url(x)

Arguments

x	URL

Examples

# NOT RUN {
ex_url <- "https://techcrunch.com/2019/02/28/thailand-passes-controversial-cybersecurity-law/"
str(summarise_url(ex_url), 1)
# }