Philly’s Center City District posted a list of restaurants and bars participating in Philly’s 2022 CCD Sips. CCD Sips is a series of summer Wednesday evenings (4:30-7pm) filled with happy hour specials, between June 1st and August 31st.
I prefer to take in this information as a map instead of a list, so I scraped some information from the website and made one! You can click or tap on the circle map markers to see information about each restaurant/bar along with a direct link to their posted happy hour specials.
Check out the link at the top of this post for a larger version of the interactive map below. And jump down to the tutorial if you’d like to learn how I used R to build the interactive map!
Tutorial start
Aside from the tidyverse and here packages, I used a handful of R packages to bring this map project together.
Package | Purpose | Version |
---|---|---|
robotstxt | Check website for scraping permissions | 0.7.13 |
rvest | Scrape the information off of the website | 1.0.1 |
ggmap | Geocode the restaurant addresses | 3.0.0 |
leaflet | Build the interactive map | 2.0.4.1 |
leaflet.extras | Add extra functionality to map | 1.0.0 |
Scraping the data
Checking site permissions
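I checked the site’s crawling permissions using the robotstxt package, which downloads and parses the site’s robots.txt file.
I wanted to see whether any pages were off-limits to bots/scrapers. In my case there weren’t any, as indicated by Allow: /. One caveat: the output below reports a 404 for the requested path and an overwrite by on_suspect_content, so the Allow: / rule shown may be the package’s permissive fallback rather than the site’s own file. Passing only the domain to get_robotstxt() is the more usual call.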
get_robotstxt("https://centercityphila.org/explore-center-city/ccd-sips/sips-list-view")
Output
[robots.txt]
--------------------------------------
# robots.txt overwrite by: on_suspect_content
User-agent: *
Allow: /
[events]
--------------------------------------
requested: https://centercityphila.org/explore-center-city/ccd-sips/sips-list-view/robots.txt
downloaded: https://centercityphila.org/explore-center-city/ccd-sips/sips-list-view/robots.txt
$on_not_found
$on_not_found$status_code
[1] 404
$on_file_type_mismatch
$on_file_type_mismatch$content_type
[1] "text/html; charset=utf-8"
$on_suspect_content
$on_suspect_content$parsable
[1] FALSE
$on_suspect_content$content_suspect
[1] TRUE
[attributes]
--------------------------------------
problems, cached, request, class
Harvesting data from the first page
Then I used the rvest package to scrape the information from the tables of restaurants/bars participating in CCD Sips.
I’ve learned that ideally you would only scrape each page once, so I checked my approach with the first page before I wrote a function to scrape the remaining pages.
# define the page
url <- "https://centercityphila.org/explore-center-city/ccd-sips/sips-list-view?page=1"
# read the page html
html1 <- read_html(url)
# extract table info
table1 <-
  html1 |>
  html_node("table") |>
  html_table()
table1 |> head(3) |> kableExtra::kable()
Name | Address | Phone | CCD SIPS Specials |
---|---|---|---|
1225 Raw Sushi and Sake Lounge | 1225 Sansom St, Philadelphia, PA 19102 | 215.238.1903 | CCD SIPS Specials |
1518 Bar and Grill | 1518 Sansom St, Philadelphia, PA 19102 | 267.639.6851 | CCD SIPS Specials |
Air Grille Garden at Dilworth Park | 1 S 15th St, Philadelphia, PA 19102 | 215.587.2761 | CCD SIPS Specials |
# extract hyperlinks to specific restaurant/bar specials
links <-
  html1 |>
  html_elements(".o-table__tag.ccd-text-link") |>
  html_attr("href") |>
  as_tibble()
links |> head(3) |> kableExtra::kable()
value |
---|
#1225-raw-sushi-and-sake-lounge |
#1518-bar-and-grill |
#air-grill-garden-dilworth-park |
# add full hyperlinks to the table info
table1Mod <-
  bind_cols(table1, links) |>
  mutate(Specials = paste0(url, value)) |>
  select(-c(`CCD SIPS Specials`, value))
table1Mod |> head(3) |> kableExtra::kable()
Name | Address | Phone | Specials |
---|---|---|---|
1225 Raw Sushi and Sake Lounge | 1225 Sansom St, Philadelphia, PA 19102 | 215.238.1903 | https://centercityphila.org/explore-center-city/ccd-sips/sips-list-view?page=1#1225-raw-sushi-and-sake-lounge |
1518 Bar and Grill | 1518 Sansom St, Philadelphia, PA 19102 | 267.639.6851 | https://centercityphila.org/explore-center-city/ccd-sips/sips-list-view?page=1#1518-bar-and-grill |
Air Grille Garden at Dilworth Park | 1 S 15th St, Philadelphia, PA 19102 | 215.587.2761 | https://centercityphila.org/explore-center-city/ccd-sips/sips-list-view?page=1#air-grill-garden-dilworth-park |
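If you’d like to try the extraction steps without hitting the live site, here is a minimal sketch that runs the same rvest calls on a small inline HTML snippet. The snippet, the example.com URL, and the names page, base_url, tbl, hrefs, and result are made up for illustration; only the CSS selector and the column name mirror the ones used above.

```r
library(rvest)
library(dplyr)

# toy HTML standing in for one page of the list view (invented for this sketch)
page <- minimal_html('
  <table>
    <tr><th>Name</th><th>Address</th><th>CCD SIPS Specials</th></tr>
    <tr><td>Bar A</td><td>1 Main St</td>
        <td><a class="o-table__tag ccd-text-link" href="#bar-a">CCD SIPS Specials</a></td></tr>
    <tr><td>Bar B</td><td>2 Main St</td>
        <td><a class="o-table__tag ccd-text-link" href="#bar-b">CCD SIPS Specials</a></td></tr>
  </table>
')

# hypothetical page URL that the anchors get appended to
base_url <- "https://example.com/list?page=1"

# the visible table text
tbl <- page |> html_element("table") |> html_table()

# the href behind each "CCD SIPS Specials" link
hrefs <- page |>
  html_elements(".o-table__tag.ccd-text-link") |>
  html_attr("href")

# one link per row is assumed; stop early if that does not hold
stopifnot(nrow(tbl) == length(hrefs))

result <- tbl |>
  mutate(Specials = paste0(base_url, hrefs)) |>
  select(-`CCD SIPS Specials`)
```

The row-count check is there because binding columns assumes every table row has exactly one link; if a row were missing its link, the hrefs would silently shift onto the wrong restaurants.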
Harvesting data from the remaining pages
Once I could confirm that the above approach harvested the information I needed, I adapted the code into a function that I could apply to pages 2-3 of the site.
getTables <- function(pageNumber) {
  # pause between requests to go easy on the server
  Sys.sleep(2)
  url <- paste0("https://centercityphila.org/explore-center-city/ccd-sips/sips-list-view?page=", pageNumber)
  html <- read_html(url)
  # visible table text
  table <-
    html |>
    html_node("table") |>
    html_table()
  # hyperlinks to each restaurant/bar's specials
  links <-
    html |>
    html_elements(".o-table__tag.ccd-text-link") |>
    html_attr("href") |>
    as_tibble()
  # return the combined table (no global assignment needed)
  bind_cols(table, links) |>
    mutate(Specials = paste0(url, value)) |>
    select(-c(`CCD SIPS Specials`, value))
}
I used my getTables() function and the purrr::map_df() function to harvest the table of restaurants/bars from pages 2 and 3. Then I combined all the data frames together and saved the complete data frame as an .Rds object so that I wouldn’t have to scrape the data again.
# get remaining tables
table2 <- map_df(2:3, getTables)
# combine all tables
table <- bind_rows(table1Mod, table2)
table |> head(3) |> kableExtra::kable()
Name | Address | Phone | Specials |
---|---|---|---|
1028 Yamitsuki Sushi & Ramen | 1028 Arch Street, Philadelphia, PA 19107 | 215.629.3888 | https://centercityphila.org/explore-center-city/ccd-sips/sips-list-view?page=1#1028-yamitsuki-sushi-ramen |
1225 Raw Sushi and Sake Lounge | 1225 Sansom St, Philadelphia, PA 19102 | 215.238.1903 | https://centercityphila.org/explore-center-city/ccd-sips/sips-list-view?page=1#1225-raw-sushi-and-sake-lounge |
1518 Bar and Grill | 1518 Sansom St, Philadelphia, PA 19102 | 267.639.6851 | https://centercityphila.org/explore-center-city/ccd-sips/sips-list-view?page=1#1518-bar-and-grill |
# save full table to file
write_rds(
  table,
  file = here("content/blog/2022-05-31-ccd-sips/specialsScraped.Rds")
)
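The “scrape once, then read from disk” habit can be wrapped in a small helper. This is only a sketch of the idea, not what I actually ran: cache_rds(), fake_scrape(), and the temporary file path are hypothetical names, and it uses base R’s saveRDS()/readRDS(), which readr’s write_rds()/read_rds() wrap.

```r
# Hypothetical helper: return the cached object if the file exists,
# otherwise build it once, save it, and return it.
cache_rds <- function(path, build) {
  if (file.exists(path)) {
    return(readRDS(path))
  }
  result <- build()
  saveRDS(result, path)
  result
}

# Stand-in for the scraping step; counts how many times it actually runs
counter <- new.env()
counter$n <- 0
fake_scrape <- function() {
  counter$n <- counter$n + 1
  data.frame(Name = c("Bar A", "Bar B"))
}

path <- tempfile(fileext = ".Rds")
first  <- cache_rds(path, fake_scrape)  # file is missing, so fake_scrape() runs
second <- cache_rds(path, fake_scrape)  # file exists, so it is read instead
```

In a real script, build would be a function that calls the scraper, and deleting the .Rds file is how you would force a fresh scrape.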
Geocoding addresses
The next step was to use geocoding to convert the restaurant/bar addresses to geographical coordinates (longitude and latitude) that I could map. I used the ggmap package and the Google Geocoding API service because this was a small project (59 addresses/requests) which wouldn’t make a dent in the free credit available on the platform.
The last time I geocoded addresses was for an almost identical project in 2019 and I had issues using the same API key from back then, so I made a new one. I restricted my new key to the Geocoding and Geolocation APIs.
# register my API key
# ggmap::register_google(key = "[your key]")
# geocode addresses
specials_ggmap <-
  table |>
  mutate_geocode(Address)
# rename new variables
specials <-
  specials_ggmap |>
  rename(Longitude = lon,
         Latitude = lat)
specials |> head(3) |> kableExtra::kable()
Name | Address | Phone | Specials | Longitude | Latitude |
---|---|---|---|---|---|
1028 Yamitsuki Sushi & Ramen | 1028 Arch Street, Philadelphia, PA 19107 | 215.629.3888 | https://centercityphila.org/explore-center-city/ccd-sips/sips-list-view?page=1#1028-yamitsuki-sushi-ramen | -75.15746 | 39.95354 |
1225 Raw Sushi and Sake Lounge | 1225 Sansom St, Philadelphia, PA 19102 | 215.238.1903 | https://centercityphila.org/explore-center-city/ccd-sips/sips-list-view?page=1#1225-raw-sushi-and-sake-lounge | -75.16149 | 39.95004 |
1518 Bar and Grill | 1518 Sansom St, Philadelphia, PA 19102 | 267.639.6851 | https://centercityphila.org/explore-center-city/ccd-sips/sips-list-view?page=1#1518-bar-and-grill | -75.16665 | 39.95020 |
I made sure to save the new data frame with geographical coordinates as an .Rds object so I wouldn’t have to geocode the data again! This would be particularly important if I were working on a large project.
# save table with geocoded addresses to file
write_rds(
  specials,
  file = here("content/blog/2022-05-31-ccd-sips/specialsGeocoded.Rds")
)
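Geocoders can return missing or far-away coordinates for an address they don’t recognize, so a quick sanity check before mapping can be worthwhile. The sketch below is an optional extra rather than a step from my original workflow. The first three rows come from the table above; the fourth row and the bounding box around Center City are rough assumptions made up for illustration.

```r
# three geocoded rows from the table above plus one invented failed lookup
geocoded <- data.frame(
  Name = c("1028 Yamitsuki Sushi & Ramen",
           "1225 Raw Sushi and Sake Lounge",
           "1518 Bar and Grill",
           "Made-up failed lookup"),
  Longitude = c(-75.15746, -75.16149, -75.16665, NA),
  Latitude  = c(39.95354, 39.95004, 39.95020, NA)
)

# rough, assumed bounding box around Center City Philadelphia
lon_range <- c(-75.20, -75.13)
lat_range <- c(39.93, 39.97)

# TRUE only when both coordinates exist and fall inside the box
in_box <- !is.na(geocoded$Longitude) & !is.na(geocoded$Latitude) &
  geocoded$Longitude >= lon_range[1] & geocoded$Longitude <= lon_range[2] &
  geocoded$Latitude  >= lat_range[1] & geocoded$Latitude  <= lat_range[2]

# rows worth checking by hand before they go on the map
needs_review <- geocoded[!in_box, ]
```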
Building the map
To build the map, I used the leaflet package. Some of the resources I found helpful, in addition to the package documentation:
- Scrape website data with the new R package rvest (+ a postscript on interacting with web pages with RSelenium) · Hollie at ZevRoss – how to style pop-ups
- Leaflet Map Markers in R · Jindra Lacko – how to customize marker icons
- A guide to basic Leaflet accessibility · Leaflet – accessibility considerations. Though it’s unclear to me how these features built into the Leaflet library translate over to the leaflet R package. For example, I couldn’t find an option for adding alt-text or a title to each marker, but maybe I wasn’t looking in the right place within the documentation.
Customizing map markers
# style pop-ups for the map with inline css styling
# marker for the restaurants/bars
popInfoCircles <- paste("<h2 style='font-family: Red Hat Text, sans-serif; font-size: 1.6em; color:#43464C;'>", "<a style='color: #00857A;' href=", specials$Specials, ">", specials$Name, "</a></h2>","<p style='font-family: Red Hat Text, sans-serif; font-weight: normal; font-size: 1.5em; color:#9197A6;'>", specials$Address, "</p>")
# marker for the center of the map
popInfoMarker <- paste("<h1 style='padding-top: 0.5em; margin-top: 1em; margin-bottom: 0.5em; font-family: Red Hat Text, sans-serif; font-size: 1.8em; color:#43464C;'>", "<a style='color: #00857A;' href='https://centercityphila.org/explore-center-city/ccdsips'>", "Center City District Sips 2022", "</a></h1><p style='color:#9197A6; font-family: Red Hat Text, sans-serif; font-size: 1.5em; padding-bottom: 1em;'>", "Philadelphia, PA", "</p>")
# custom icon for the center of the map
awesome <-
  makeAwesomeIcon(
    icon = "map-pin",
    iconColor = "#FFFFFF",
    markerColor = "darkblue",
    library = "fa"
  )
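Because the pop-up text is raw HTML, names containing characters like & (for example “1028 Yamitsuki Sushi & Ramen”) are safer when escaped first. Here is a minimal sketch of a vectorized pop-up builder, assuming the htmltools package (which leaflet already depends on) is available. make_popup() and the example.com URLs are hypothetical, and the inline CSS is left out to keep the example short.

```r
library(htmltools)

# Hypothetical helper: build one pop-up string per restaurant/bar.
# Text is escaped so characters like & and < display literally;
# the URL is escaped as an attribute value and wrapped in quotes.
make_popup <- function(name, url, address) {
  sprintf(
    "<h2><a href='%s'>%s</a></h2><p>%s</p>",
    htmlEscape(url, attribute = TRUE),
    htmlEscape(name),
    htmlEscape(address)
  )
}

popups <- make_popup(
  name    = c("1028 Yamitsuki Sushi & Ramen", "1518 Bar and Grill"),
  url     = c("https://example.com/list?page=1#a", "https://example.com/list?page=1#b"),
  address = c("1028 Arch Street, Philadelphia, PA 19107",
              "1518 Sansom St, Philadelphia, PA 19102")
)
```

The resulting character vector can be passed to the popup argument in the same way as popInfoCircles.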
Plotting the restaurants/bars
leaflet(data = specials,
width = "100%",
height = "850px",
# https://stackoverflow.com/a/42170340
options = tileOptions(minZoom = 15,
maxZoom = 19)) |>
# add map markers ----
addCircles(
lat = ~ specials$Latitude,
lng = ~ specials$Longitude,
fillColor = "#009E91", #olivedrab goldenrod
fillOpacity = 0.6,
stroke = F,
radius = 12,
popup = popInfoCircles,
label = ~ Name,
labelOptions = labelOptions(
style = list(
"font-family" = "Red Hat Text, sans-serif",
"font-size" = "1.2em")
))
Adding the map background
leaflet(data = specials,
width = "100%",
height = "850px",
# https://stackoverflow.com/a/42170340
options = tileOptions(minZoom = 15,
maxZoom = 19)) |>
# add map markers ----
addCircles(
lat = ~ specials$Latitude,
lng = ~ specials$Longitude,
fillColor = "#009E91", #olivedrab goldenrod
fillOpacity = 0.6,
stroke = F,
radius = 12,
popup = popInfoCircles,
label = ~ Name,
labelOptions = labelOptions(
style = list(
"font-family" = "Red Hat Text, sans-serif",
"font-size" = "1.2em")
)) |>
# add map tiles in the background ----
addProviderTiles(providers$CartoDB.Positron)
Setting the map view
leaflet(data = specials,
width = "100%",
height = "850px",
# https://stackoverflow.com/a/42170340
options = tileOptions(minZoom = 15,
maxZoom = 19)) |>
# add map markers ----
addCircles(
lat = ~ specials$Latitude,
lng = ~ specials$Longitude,
fillColor = "#009E91", #olivedrab goldenrod
fillOpacity = 0.6,
stroke = F,
radius = 12,
popup = popInfoCircles,
label = ~ Name,
labelOptions = labelOptions(
style = list(
"font-family" = "Red Hat Text, sans-serif",
"font-size" = "1.2em")
)) |>
# add map tiles in the background ----
addProviderTiles(providers$CartoDB.Positron) |>
# set the map view
setView(mean(specials$Longitude),
mean(specials$Latitude),
zoom = 16)
Adding a marker at the center
leaflet(data = specials,
width = "100%",
height = "850px",
# https://stackoverflow.com/a/42170340
options = tileOptions(minZoom = 15,
maxZoom = 19)) |>
# add map markers ----
addCircles(
lat = ~ specials$Latitude,
lng = ~ specials$Longitude,
fillColor = "#009E91", #olivedrab goldenrod
fillOpacity = 0.6,
stroke = F,
radius = 12,
popup = popInfoCircles,
label = ~ Name,
labelOptions = labelOptions(
style = list(
"font-family" = "Red Hat Text, sans-serif",
"font-size" = "1.2em")
)) |>
# add map tiles in the background ----
addProviderTiles(providers$CartoDB.Positron) |>
# set the map view
setView(mean(specials$Longitude),
mean(specials$Latitude),
zoom = 16) |>
# add marker at the center ----
addAwesomeMarkers(
icon = awesome,
lng = mean(specials$Longitude),
lat = mean(specials$Latitude),
label = "Center City District Sips 2022",
labelOptions = labelOptions(
style = list(
"font-family" = "Red Hat Text, sans-serif",
"font-size" = "1.2em")
),
popup = popInfoMarker,
popupOptions = popupOptions(maxWidth = 250))
Adding fullscreen control
leaflet(data = specials,
width = "100%",
height = "850px",
# https://stackoverflow.com/a/42170340
options = tileOptions(minZoom = 15,
maxZoom = 19)) |>
# add map markers ----
addCircles(
lat = ~ specials$Latitude,
lng = ~ specials$Longitude,
fillColor = "#009E91", #olivedrab goldenrod
fillOpacity = 0.6,
stroke = F,
radius = 12,
popup = popInfoCircles,
label = ~ Name,
labelOptions = labelOptions(
style = list(
"font-family" = "Red Hat Text, sans-serif",
"font-size" = "1.2em")
)) |>
# add map tiles in the background ----
addProviderTiles(providers$CartoDB.Positron) |>
# set the map view
setView(mean(specials$Longitude),
mean(specials$Latitude),
zoom = 16) |>
# add marker at the center ----
addAwesomeMarkers(
icon = awesome,
lng = mean(specials$Longitude),
lat = mean(specials$Latitude),
label = "Center City District Sips 2022",
labelOptions = labelOptions(
style = list(
"font-family" = "Red Hat Text, sans-serif",
"font-size" = "1.2em")
),
popup = popInfoMarker,
popupOptions = popupOptions(maxWidth = 250)) |>
# add fullscreen control button ----
leaflet.extras::addFullscreenControl()
Creating the map with Quarto
The first time around, I created a standalone map by first running an R script with the necessary code, and then exporting the HTML output as a webpage. This worked well enough, except that I realized:
- The title of the map webpage (the name that is displayed on a browser tab) was just “map” because the name of the HTML file was map.html. I wanted something more descriptive.
- The map wasn’t mobile-responsive. In other words, the map markers and text looked too small when viewed on a mobile device.
Changing the webpage title
The webpage title was a quick one to fix thanks to a Stack Overflow response to a question about turning off the title in an R Markdown document. The pagetitle YAML option lets you set the HTML’s title tag independently of the document title:
pagetitle: "Philly CCD Sips 2022 Map"
Fixing the mobile-responsiveness
The mobile-responsiveness issue could be solved by adding metadata to the map HTML, but I would need to be able to blend HTML with R code. I have been practicing using Quarto and figured I could make a standalone map from a Quarto document (.qmd) rather than an R Markdown one (.Rmd or .Rmarkdown). You can find the map’s Quarto document alongside this blog post.
According to the Leaflet library documentation and this Stack Overflow answer, fixing the map to be mobile-responsive required adding the following metadata to the HTML code:
<meta name="viewport" content="width=device-width, initial-scale=1.0, maximum-scale=1.0, user-scalable=no" />
I used the metathis R package to add this metadata to an R code chunk in my Quarto document using the meta_viewport() function:
# make mobile-responsive
meta_viewport(
width = "device-width",
initial_scale = "1.0",
maximum_scale = "1.0",
user_scalable = "no"
)
Update: In the process of updating this post I’m noticing that specifying the viewport metadata tag doesn’t seem to be necessary anymore, and I don’t understand why 🤔 …so I’ll leave the step as is, just in case it’s helpful to anyone 🤷🏽♀️
Making the map fullscreen
A side effect of creating the map from a Quarto (or R Markdown) document is that the output is styled by default to fit within the width of an article (in this case 900 pixels). I wanted the map to take up the whole width of the page, so I made use of the page-layout Quarto YAML option:
format:
  html:
    page-layout: custom
Another option that worked pretty well was to use the column: screen code chunk option built into Quarto. The Quarto documentation even shows an example that displays a Leaflet map this way, but it left a thin margin at the top, and I wanted the map to be flush against the top edge of the webpage.
Rendering the standalone map
Lastly, I added one more option to the YAML that renders the Quarto document into a self-contained HTML file with all of the content needed to create the map. (Newer Quarto releases document embed-resources: true for the same purpose.)
format:
  html:
    page-layout: custom
    self-contained: true