---
title: "Using `{densityarea}`"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Using `{densityarea}`}
  %\VignetteEncoding{UTF-8}
  %\VignetteEngine{knitr::rmarkdown}
editor_options: 
  markdown: 
    wrap: 72
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  message = FALSE,
  comment = "#>",
  dev = "ragg_png",
  dpi = 300
)
```

To get started with `{densityarea}`, we'll need to load some packages
and some data to work with. `{densityarea}` is meant to play nicely
with tidyverse-style data processing, so in addition to loading the
package itself, we'll also load `{dplyr}`. We have the option of working
with the density polygons in the form of simple features from `{sf}`, so
we'll load that as well. Finally, we'll load `{ggplot2}` and
`{ggdensity}` for the sake of data visualization.

```{r setup}
# package depends
library(densityarea)
library(dplyr)
library(sf)
library(ggdensity)
```

```{r setup2, eval = FALSE}
# package suggests
library(ggplot2)
```

```{r include=F}
ggplot2_inst <- require(ggplot2)
```

The dataset `s01` is a data frame of vowel formant measurements.

```{r dataload}
data(s01, package = "densityarea")
head(s01)
```

## Initial look at the data

Let's plot the original, raw data from `s01`, with the Highest Density
Regions overlaid (thanks to the `{ggdensity}` package).

```{r eval=ggplot2_inst, fig.width=5, fig.height=3, fig.align='center', out.width="100%"}
ggplot(data = s01,
       aes(x = F2,
           y = F1)
       )+
  geom_point(alpha = 0.1)+
  stat_hdr(probs = c(0.8, 0.5),
           aes(fill = after_stat(probs)),
           color = "black",
           alpha = 0.8)+
  scale_y_reverse()+
  scale_x_reverse()+
  scale_fill_brewer(type = "seq")+
  coord_fixed()
```

The function `ggdensity::stat_hdr()` is perfect for quickly adding
interpretable densities to your plots. To work with these densities as
polygons, we can use `densityarea::density_polygons()`.
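
To make "highest density region" a little more concrete, here is a rough
base R sketch of the idea. It is not how `{ggdensity}` or
`{densityarea}` compute their regions. The toy data, the grid, the
rule-of-thumb bandwidths, and the helper `hdr_summary()` are all
assumptions made for illustration. For each probability level, it finds
the density height above which the grid cells hold at least that much
probability mass, then approximates the area of the region above that
height.

```{r}
# Toy data: independent standard normal draws (NOT the `s01` measurements)
set.seed(1)
toy_x <- rnorm(500)
toy_y <- rnorm(500)

# Evaluation grid and rule-of-thumb bandwidths (assumptions for this sketch)
grid_x <- seq(-5, 5, length.out = 201)
grid_y <- seq(-5, 5, length.out = 201)
bw_x <- bw.nrd0(toy_x)
bw_y <- bw.nrd0(toy_y)

# Product-Gaussian kernel density estimate evaluated on the grid:
# dens[i, j] is the average over observations of the x kernel at grid_x[i]
# times the y kernel at grid_y[j]
kern_x <- outer(grid_x, toy_x, function(g, obs) dnorm(g, mean = obs, sd = bw_x))
kern_y <- outer(grid_y, toy_y, function(g, obs) dnorm(g, mean = obs, sd = bw_y))
dens <- kern_x %*% t(kern_y) / length(toy_x)

# Area of one grid cell, used to turn density heights into probability mass
cell_area <- diff(grid_x)[1] * diff(grid_y)[1]

# Hypothetical helper: threshold and approximate area for one level
hdr_summary <- function(prob) {
  heights <- sort(as.vector(dens), decreasing = TRUE)
  mass <- cumsum(heights) * cell_area
  threshold <- heights[which(mass >= prob)[1]]
  c(prob = prob,
    threshold = threshold,
    area = sum(dens >= threshold) * cell_area)
}

toy_hdr <- t(sapply(c(0.5, 0.8), hdr_summary))
toy_hdr
```

The 50% region sits inside the 80% region, so it has the higher density
threshold and the smaller area.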

## Getting density areas

Per the name of the package, we can get the area within each of these
density polygons with `density_area()`.

As a first data processing step, let's log transform our `F1` and `F2`
values and flip their signs.

```{r}
s01 |> 
  mutate(
    lF1 = -log(F1),
    lF2 = -log(F2)
  ) -> 
  s01
```
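
Negating both coordinates only mirrors the data, so it should not change
any area computed afterwards. As a quick sanity check, here is a sketch
using the shoelace formula on a toy triangle. The helper
`shoelace_area()` is hypothetical and not part of `{densityarea}`, and
the triangle is made-up data.

```{r}
# Hypothetical helper: area of a simple polygon from its ordered vertices.
# The polygon is closed implicitly (the last vertex connects to the first).
shoelace_area <- function(x, y) {
  x_next <- c(x[-1], x[1])
  y_next <- c(y[-1], y[1])
  abs(sum(x * y_next - x_next * y)) / 2
}

# Toy right triangle with legs of length 4 and 3
tri_x <- c(0, 4, 0)
tri_y <- c(0, 0, 3)

area_original <- shoelace_area(tri_x, tri_y)
area_flipped  <- shoelace_area(-tri_x, -tri_y)

c(original = area_original, flipped = area_flipped)
```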

To get the area within the 80% density polygon for the entire data set,
we'll pass `s01` through `dplyr::reframe()`. Grouping by `name` first
keeps that column in the output.

```{r}
s01 |> 
  group_by(name) |> 
  reframe(
    density_area(lF2, lF1, probs = 0.8)
  ) 
```

Or, if we wanted the areas associated with subsets of the data (say, for
each `vowel`) we'd just change our `dplyr::group_by()` call.

```{r}
s01 |> 
  group_by(name, vowel) |> 
  reframe(
    density_area(lF2, lF1, probs = 0.8)
  ) ->
  vowel_areas
```

Let's rearrange the order of rows to see the largest areas first.

```{r}
vowel_areas |> 
  arrange(desc(area))
```

## Density Polygons

### Polygon Data Frames

#### A single probability level

In the simplest approach, we can use `density_polygons()` to return a
data frame for just one probability level, 60%.

```{r}
s01 |> 
  group_by(name) |> 
  reframe(
    density_polygons(lF2, lF1, probs = 0.6)
  )->
  sixty_poly_df

head(sixty_poly_df)
```

Now, it's *possible* for the HDR polygon to actually come in multiple
pieces, but in this case, there's just one polygon, so we can plot it.

```{r eval=ggplot2_inst, fig.width=4, fig.height=4, fig.align='center', out.width="80%"}
ggplot(sixty_poly_df,
       aes(lF2, lF1))+
  geom_polygon(
    aes(color = prob,
        group = prob),
    fill = NA,
    linewidth = 1
  )+
  coord_fixed()
```
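
Grouping by `prob` alone is only safe when every probability level is a
single piece. If a level were split into several pieces, drawing them as
one group would connect them with stray lines. The sketch below uses a
made-up data frame, not real `density_polygons()` output, and assumes
the pieces within a level are distinguished by a column called `id`.
That column name is an assumption worth checking against
`head(sixty_poly_df)`. It counts the pieces per level and builds a
grouping key that keeps them separate.

```{r}
# Toy polygon table (NOT real output): the 0.6 level has two pieces
toy_polys <- data.frame(
  prob = c(rep(0.6, 6), rep(0.8, 4)),
  id   = c(1, 1, 1, 2, 2, 2, 1, 1, 1, 1),
  x    = c(0, 1, 0, 3, 4, 3, -1, 5, 5, -1),
  y    = c(0, 0, 1, 0, 0, 1, -1, -1, 2, 2)
)

# Count the distinct pieces within each probability level
pieces <- tapply(toy_polys$id, toy_polys$prob, function(i) length(unique(i)))
pieces

# One grouping key per piece, usable as e.g. `aes(group = piece)`
toy_polys$piece <- interaction(toy_polys$prob, toy_polys$id, drop = TRUE)
nlevels(toy_polys$piece)
```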

#### Multiple probability levels

To get polygons associated with multiple probability levels, we simply
pass a vector of values to `probs`.

```{r}
s01 |> 
  group_by(name) |> 
  reframe(
    density_polygons(lF2, 
                     lF1, 
                     probs = c(0.6, 0.8))
  )->
  multi_poly_df

head(multi_poly_df)
```

```{r eval=ggplot2_inst, fig.width=4, fig.height=4, fig.align='center', out.width="80%"}
ggplot(multi_poly_df,
       aes(lF2, lF1))+
  geom_polygon(
    aes(color = prob,
        group = prob),
    fill = NA,
    linewidth = 1
  )+
  coord_fixed()
```

### Polygon Simple Features

We can also get `density_polygons()` to return the polygons as simple
features, as defined in the `{sf}` package, by passing it the argument
`as_sf = TRUE`.

```{r}
s01 |> 
  group_by(name) |> 
  reframe(
    density_polygons(lF2,
                     lF1,
                     probs = c(0.8, 0.6),
                     as_sf = TRUE)
  ) |> 
  st_sf()->
  multi_poly_sf
```

The final function there, `sf::st_sf()`, isn't strictly necessary, but
it makes life a little easier for plotting. Here's what the result looks
like:

```{r}
multi_poly_sf
```

And here's a plot.

```{r eval=ggplot2_inst, fig.width=4, fig.height=4, fig.align='center', out.width="80%"}
ggplot(multi_poly_sf)+
  geom_sf(aes(color = prob),
          fill = NA)
```