04-results.qmd

# Sensitivity Analysis Results {#sec-results}

```{r packages, file = "R/chapter_start.R", include = FALSE, cache=FALSE}
library(sf)
library(ggthemes)
library(ggspatial)
library(lhs)
library(foreign)
library(scales)
```

Each of the 100 LHS parameter draws was applied to the RVTPO model,
generating mode choice utilities, destination choice utilities, and trip
matrices for each draw. The resulting uncertainty can then be quantified
using the outputs from the trip-based model. This section will first
look at the uncertainty of trips by mode, and how the mode split changes
when the parameters vary. Then uncertainty will be quantified using the
highway assigned trips, and how link volume changes across each draw.
The results will then be summarized.

## Mode Choice Trips

Uncertainty can be evaluated by looking at how mode choices change. The
total number of trips by purpose are fixed, but the number of trips by
each mode changes as a result of mode choice, combined with the
availability of modes in the travel time skims. @tbl-mctrips lists the
base trip amount by mode and purpose. It also lists the the average
number of trips across all 100 iterations, with the corresponding
standard deviation and coefficient of variation. For HBW trips there are
103,320 auto trips. Across all 100 iterations there is a mean value of
103,298 trips with a standard deviation of 527.07. This results in a
coefficient of variation of 0.0052 or 0.52% variation in the number of
auto trips. The other modes of transportation are included and similar
patterns can be seen in HBO and NHB. The results listed in the table
show that the variation of the output trips - by mode and purpose - are
less than the input variation (as all $c_v$'s are smaller than 0.10).
This confirms previous research that the outcome variance is less than
or near the parameters variance [@zhao2002; @clay2005univariate]. In all
three purposes that were evaluated, the coefficient of variation in auto
trips are lower than transit or non-motorized trips, meaning that there
is greater confidence in the models accuracy to generate auto trips. The
input parameter variability has a smaller effect on auto trips than on
trips on the other modes.

```{r MCtrips, warning = FALSE, echo = FALSE, message = FALSE, fig.align="center", results='asis'}
#| label: tbl-mctrips
#| tbl-cap: "Coefficient of Variation of Trips by Mode"
tar_load(hbw_trips)
tar_load(hbo_trips)
tar_load(nhb_trips)

hbw <- hbw_trips %>%
  replace(is.na(.), 0) %>%
  group_by(draw) %>%
  summarise(HBW_Auto = sum(DA),
            HBW_NonMotorized = sum(NOM),
            HBW_Transit = sum(WLB))

hbo <- hbo_trips %>%
  replace(is.na(.), 0) %>%
  group_by(draw) %>%
  summarise(HBO_Auto = sum(DA),
            HBO_NonMotorized = sum(NOM),
            HBO_Transit = sum(WLB))

nhb <- nhb_trips %>%
  replace(is.na(.), 0) %>%
  group_by(draw) %>%
  summarise(NHB_Auto = sum(DA),
            NHB_NonMotorized = sum(NOM),
            NHB_Transit = sum(WLB))

trip_table1 <- merge(hbw, hbo, by = "draw")
trip_table2 <- merge(trip_table1, nhb, by = "draw") %>%
  t() 
         

trips <- trip_table2[,2:101]
base <- trip_table2[,1]
mean <- apply(trips, 1, mean) 
sd <- apply(trips, 1, sd) 

table <- data.frame(base, mean, sd) %>%
  mutate(cov = sd/mean) %>%
  filter(base > 0)%>%
  mutate(mode = c("Auto", "Non-Motorized", "Transit", "Auto", "Non-Motorized", "Transit", "Auto", "Non-Motorized", "Transit")) %>%
  select(c("mode", "base", "mean", "sd", "cov")) 

kable(table, caption = , booktabs = TRUE, 
      col.names = c("", "Base", "Mean", "SD", "$c_v$"), digits = c(0, 0, 0, 2, 4), row.names = F, escape = FALSE) %>%
  kable_styling() %>%
  pack_rows(index = c("HBW" = 3, "HBO" = 3, "NHB" = 3))
```

The variation among mode choices can be visualized graphically using a
density of a scaled change in trips by mode. @fig-modechoicetrips shows
density plots for HBW trips by mode for 12 zones -- the zones are
divided into three volume categories: low is less than 200 trips per
zone, mid is 200 to 700 trips per zone, and top is greater than 700
trips per zone -- and four zones are randomly selected from each volume
category. Zones that do not have any transit accessibility have been
excluded. Those zones have very high density in auto trips as with the
ability to choose transit was removed, the choice to choose auto was
more certain. The zones included in @fig-modechoicetrips all have
greater certainty in auto trips, as the change in trips across all 100
iterations is relatively small. This reinforces the previous claim that
the model has more confidence in auto trips than the other modes. It is
also important to note that the modes are correlated to each other. In
zones with a greater confidence in one mode, the other modes are more
confident as well. Since the number of trips by origin zone are held
constant, when there are an increase in trips on one mode there must be
a decrease in trips on one or both of the other modes. Also, the
distribution of non-motorized trips is similar for every zone suggesting
that generally, the most variable mode is non-motorized trips which you
can see in the spread of the graphic. This is also verified using
@tbl-mctrips as the $c_v$ is largest for the non-motorized mode across
all three purposes.

```{r modechoicetrips, warning = FALSE, echo = FALSE, message = FALSE, fig.env= "sidewaysfigure", fig.align="center", fig.width=9}
#| label: fig-modechoicetrips
#| fig-cap: "Trip density for coefficient of variation by mode for HBW trips."
set.seed(1)

tar_load(hbw_trips)

hbw_mc <- hbw_trips

hbw_mc[is.na(hbw_mc)] <- 0
  
hbw_mc2 <- hbw_mc %>%  
group_by(origin, draw) %>%
  summarize(Drive = sum(DA),
            Transit = sum(WLB),
            NonMotor = sum(NOM)) 

hbw_base <- hbw_mc2 %>%
  filter(draw == 0) %>%
  select(-c("draw")) %>%
  rename("baseDrive" = "Drive",
         "baseTransit" = "Transit",
         "baseNonMotor" = "NonMotor")

hbw_transit_origins <- hbw_base %>%
  group_by(origin) %>%
  summarise(transtotal = sum(baseTransit)) %>%
  filter(transtotal > 0) %>%
  pull(origin) 

hbw_mc3 <- hbw_mc2 %>%
  filter(draw > 0) 

merged <- merge(hbw_mc3, hbw_base, by = "origin") 

hbw_mc_v <- merged %>%
  mutate(dDrive = Drive - baseDrive,
         dTransit = Transit - baseTransit,
         dNonMotor = NonMotor - baseNonMotor) %>%
  mutate(Auto = dDrive/baseDrive,
         Transit = dTransit/baseTransit,
         "Non-Motorized" = dNonMotor/baseNonMotor) %>%
  select(c("origin", "draw", "Auto", "Non-Motorized", "Transit")) %>%
  pivot_longer(cols = c("Auto", "Non-Motorized", "Transit"), names_to = "Mode", values_to = "scaleddiff") 

low_ids <- hbw_base  %>%
  filter(origin %in% c(hbw_transit_origins)) %>%
  mutate(total = baseDrive + baseTransit + baseNonMotor) %>%
  filter(total > 0 ) %>%
  filter(total < 200) %>%
  slice(1) %>%
  ungroup() %>%
  sample_n(4) %>%
  pull(origin) 

mid_ids <- hbw_base %>%
    filter(origin %in% c(hbw_transit_origins)) %>%
  mutate(total = baseDrive + baseTransit + baseNonMotor) %>%
  filter(total > 201 ) %>%
  filter(total < 700) %>%
  slice(1) %>%
  ungroup() %>%
  sample_n(4) %>%
  pull(origin) 

top_ids <- hbw_base %>%
    filter(origin %in% c(hbw_transit_origins)) %>%
  mutate(total = baseDrive + baseTransit + baseNonMotor) %>%
  filter(total > 701 ) %>%
  slice(1) %>%
  ungroup() %>%
  sample_n(4) %>%
  pull(origin) 

these_ids <- c(low_ids, mid_ids, top_ids)

hbw_mc_v %>%
  filter(origin %in% these_ids) %>%
  mutate(voltype = case_when(origin %in% low_ids ~ "low",
                             origin %in% mid_ids ~ "mid",
                             origin %in% top_ids ~ "top")) %>%
  mutate(ID = paste(voltype, "vol. - Origin:", origin)) %>%
  ggplot() +
  aes(x = scaleddiff, fill = Mode, color = Mode) +
  geom_density(adjust = 1L, alpha = 0.1) +
  scale_fill_hue(direction = 1) +
  theme_bw() +
  labs(x = "Scaled Difference",
       y = "Density") +
  facet_wrap(vars(ID))
```

## Link Volume

Highway volumes are the most commonly used output of a travel model.
Uncertainty can additionally be evaluated by looking at how assigned
link volume varies across iterations. @fig-networksd displays variation
in forecast link volume spatially. This shows that the links with the
highest standard deviation in forecast volume are high-volume roads
including freeways and principal arterials where the majority of traffic
is internal to the study region. Although these links have the largest
standard deviation, when compared to the total volume of the road, the
variation is in reality very small. A standard deviation of 400 vehicles
on a road with 40,000 total vehicles corresponds to a small variation
(1%).

```{r networksd, warning = FALSE, echo = FALSE, message = FALSE, fig.align="center", fig.env= "sidewaysfigure", results='hide'}
#| label: fig-networksd
#| fig-cap: Standard deviation in daily forecast volume.
tar_load(networks)

invisible(
base <- foreign::read.dbf("data/sensitivity_out/BASE_LOADED.DBF") %>%
  mutate(link = paste(A, B, sep = "_"))
)

links_total <- networks %>%
  group_by(link) %>%
  summarize(sd = sd(TOTAL_VOL),
            mean = mean(TOTAL_VOL))

base_total <- merge(base, links_total, by = "link") %>%
  mutate(cov = sd/mean) %>%
  filter(FACTYPE < 11, mean >0 ) %>%
  mutate(FacilityType = case_when(FACTYPE == 1 ~ "Freeway",
                                  FACTYPE == 2 ~ "Freeway",
                                  FACTYPE == 3 ~ "Arterial",
                                  FACTYPE == 4 ~ "Arterial",
                                  FACTYPE == 5 ~ "Arterial",
                                  FACTYPE == 6 ~ "Collector",
                                  FACTYPE == 7 ~ "Collector",
                                  FACTYPE == 8 ~ "Local",
                                  FACTYPE == 9 ~ "Ramp",
                                  FACTYPE == 10 ~ "Ramp",))

shape <- sf::st_read("data/sensitivity_out/BASE_LOADED.shp", crs = 4326)

modified_shp <- shape %>%
  left_join(base_total, by = c("A" = "A", "B" = "B")) %>%
  filter(mean > 0)

ggplot() +
  ggspatial::annotation_map_tile("cartolight", zoom = 11) +
  geom_sf(data = modified_shp, aes(color = sd, linewidth = mean), alpha = 0.5) +
  ggthemes::theme_map() +
  scale_colour_gradientn(colours=rainbow(4)) +
  labs(color = "Standard Deviation",
       linewidth = "Average Total Volume") +
  theme(legend.position = "left")
```

The highway assignment results can be grouped by facility type to show
how the coefficient of variation compares to link volume.
@fig-totalvolume shows the coefficient of variation for the daily volume
assigned to each network link, across the 100 draws, plotted against the
mean forecast link volume for each link. The values are the volume for
100 randomly sampled links for each facility type. The plots shows that
for the high-volume roads such as major arterials and freeways, the
coefficient of variation converges to approximately 0.01, or about 1% of
the road's total forecast volume. For lower-volume links, the
coefficient of variation is more widely distributed, with some local
roads and small collectors having considerably higher values. Some links
in the model show no variation at all; these are presumably links near
the edges of the model region where the only traffic is to and from
external zones, trips which were held constant in this framework.

```{r totalvolume, echo = FALSE, warning = FALSE}
#| label: fig-totalvolume
#| fig-cap: Coefficient of variation in daily link volume by facility type for a random sample of highway links.
#| out-width: 100%
set.seed(1)


tar_load(networks)

base <- foreign::read.dbf("data/sensitivity_out/BASE_LOADED.DBF") %>%
  mutate(link = paste(A, B, sep = "_"))

## Total Volume
links_total <- networks %>%
  group_by(link) %>%
  summarize(sd = sd(TOTAL_VOL),
            mean = mean(TOTAL_VOL))

base_total <- merge(base, links_total, by = "link") %>%
  mutate(cov = sd/mean) %>%
  filter(FACTYPE < 11, mean >0 ) %>%
  mutate(FacilityType = case_when(FACTYPE == 1 ~ "Freeway",
                                  FACTYPE == 2 ~ "Freeway",
                                  FACTYPE == 3 ~ "Arterial",
                                  FACTYPE == 4 ~ "Arterial",
                                  FACTYPE == 5 ~ "Arterial",
                                  FACTYPE == 6 ~ "Collector",
                                  FACTYPE == 7 ~ "Collector",
                                  FACTYPE == 8 ~ "Local",
                                  FACTYPE == 9 ~ "Ramp",
                                  FACTYPE == 10 ~ "Ramp",))

freeway <- base_total %>%
    filter(FacilityType == "Freeway") %>%
  ungroup() %>%
  sample_n(100) %>%
  pull(link)

collector <- base_total %>%
    filter(FacilityType == "Collector") %>%
  ungroup() %>%
  sample_n(100) %>%
  pull(link)

local <- base_total %>%
    filter(FacilityType == "Local") %>%
  ungroup() %>%
  sample_n(100) %>%
  pull(link)

ramp <- base_total %>%
    filter(FacilityType == "Ramp") %>%
  ungroup() %>%
  sample_n(100) %>%
  pull(link)

arterial <- base_total %>%
    filter(FacilityType == "Arterial") %>%
  ungroup() %>%
  sample_n(100) %>%
  pull(link)

these_links <- c(freeway, collector, local, ramp, arterial)

base_plot <- base_total %>%
  filter(link %in% these_links)
  
  
ggplot(base_plot) +
  aes(x = TOTAL_VOL, y = cov, color = FacilityType) +
  geom_point(shape = "circle", size = 1.5, alpha = 0.5) +
  theme_bw() + scale_y_log10(labels = scales::comma) +
  scale_color_manual(
    values = c(Freeway = "#E77D72", Collector = "#6DB134", 
               Local = "#ED6CB9", Ramp = "#A08BF8", Arterial = "#52B4E3")
  ) +
  labs(x = "Total Link Volume",
       y = "Coefficient of Variation (Logarithmic Scale)",
       color = "Facility Type")
```

Variation among a link can also be visualized with a density plot of the
total volume across all iterations, as shown in @fig-densityplots. In
this plot, the density of forecast volumes in three randomly selected
links in each of the freeway, collector, and arterial functional types
are plotted alongside the baseline forecast and the Average Annual
Weekday Daily Traffic (AAWDT) measured by the Virginia Department of
Transportation, and to which the model estimates were calibrated. In all
cases, the error or uncertainty in the forecast is considerably narrower
than the error inherent in the model construction, as evidenced by the
fact that the AAWDT target value is well outside the bell curve created
by the statistically varied simulation forecasts.

As expected from using the base parameter values as the mean of the LHS
parameter sampling, the base results are at or near the median of the
statistical density for each link's volume. But it is notable that the
estimated volumes are not perfectly, normally distributed as might be
naively expected. In this case, the combined effects of the mode and
destination choice parameter sampling appear to be constrained by the
geographic specificity of the RVTPO model network: even when the demand
for trips changes between zone pairs, the realities of the highway
capacity, volume-delay, and static user equilibrium procedures may be
limiting the possibilities for forecast highway volumes.

```{r densityplots, warning = FALSE, echo = FALSE, message = FALSE}
#| fig-cap: Density plot of forecast volume on selected links, with default parameter results marked in red, and AAWDT values in green.
#| label: fig-densityplots
#| out.width: "100%"
set.seed(109)

tar_load(networks)

freeway_links <- networks %>%
  filter(AAWDT > 0) %>%
  filter(FACTYPE < 3) %>%
  group_by(link) %>%
  slice(1) %>%
  ungroup() %>%
  sample_n(3) %>%
  pull(link) 

arterial_links <- networks %>%
  filter(AAWDT > 0) %>%
  filter(FACTYPE > 2) %>%
  filter(FACTYPE < 6) %>%
  group_by(link) %>%
  slice(1) %>%
  ungroup() %>%
  sample_n(3) %>%
  pull(link) 

collector_links <- networks %>%
  filter(AAWDT > 0, TOTAL_VOL > 0) %>%
  filter(FACTYPE > 5) %>%
  group_by(link) %>%
  slice(1) %>%
  ungroup() %>%
  sample_n(3) %>%
  pull(link) 

these_links <- c(freeway_links, arterial_links, collector_links )

density <- networks %>%
  filter(link %in% these_links)

value <- density %>%
  filter(fileName == "BASE")

fac_density <- density %>%
   mutate(FacilityType = case_when(FACTYPE == 1 ~ "Freeway",
                                  FACTYPE == 2 ~ "Freeway",
                                  FACTYPE == 3 ~ "Arterial",
                                  FACTYPE == 4 ~ "Arterial",
                                  FACTYPE == 5 ~ "Arterial",
                                  FACTYPE == 6 ~ "Collector",
                                  FACTYPE == 7 ~ "Collector"))

 ggplot() +
  geom_density(data = fac_density, mapping = aes(x = TOTAL_VOL, fill = FacilityType), alpha = 0.5) +
  geom_vline(data = value, mapping = aes(xintercept = TOTAL_VOL), linetype = "dashed", color = "red") +
  geom_vline(data = value, mapping = aes(xintercept = AAWDT), linetype = "dashed", color = "green") +
  theme_bw() +
  facet_wrap(vars(link), scales = "free") +
   scale_x_continuous(labels = function(x) {x/1000}) +
  labs(x = "Total Link Volume (in 1,000s of Vehicles)",
       y = "Density")
```