﻿WEBVTT

NOTE This file was exported by MacCaption version 7.0.06 to comply with the WebVTT specification dated March 27, 2017.

00:00:09.543 --> 00:00:11.778 align:center line:-1 position:50% size:34%
USGS Landslide Seminar,
which is hosted by

00:00:11.778 --> 00:00:13.814 align:center line:-1 position:50% size:42%
our Landslide Hazards Program,

00:00:13.814 --> 00:00:15.482 align:center line:-1 position:50% size:43%
and coordinated by Matt Thomas,

00:00:15.482 --> 00:00:17.851 align:center line:-1 position:50% size:30%
Jaime Kostelnik,
and Stephen Slaughter.

00:00:17.851 --> 00:00:20.487 align:center line:-1 position:50% size:37%
None of them were available
to host today,

00:00:20.487 --> 00:00:23.190 align:center line:-1 position:50% size:31%
so you're stuck with me,
Ben Mirus.

00:00:23.190 --> 00:00:25.792 align:center line:-1 position:50% size:38%
I think they have
this well-oiled machine going,

00:00:25.792 --> 00:00:27.327 align:center line:-1 position:50% size:30%
so please bear with me

00:00:27.327 --> 00:00:29.363 align:center line:-1 position:50% size:28%
if there are any errors
or mistakes.

00:00:29.363 --> 00:00:31.131 align:center line:-1 position:50% size:31%
One note
is we're under a tornado

00:00:31.131 --> 00:00:33.367 align:center line:-1 position:50% size:40%
and severe thunderstorm watch
here in Golden at the moment,

00:00:33.367 --> 00:00:37.471 align:center line:-1 position:50% size:31%
so if everything crashes,
that's what's happening.

00:00:38.906 --> 00:00:40.240 align:center line:-1 position:50% size:37%
For those of you
who are new to this meeting,

00:00:40.240 --> 00:00:42.009 align:center line:-1 position:50% size:30%
you will have the ability
to submit questions

00:00:42.009 --> 00:00:44.177 align:center line:-1 position:50% size:38%
in the chat window or using
the "Raise your hand" feature

00:00:44.177 --> 00:00:46.413 align:center line:-1 position:50% size:37%
when turning on
your microphone afterwards.

00:00:46.413 --> 00:00:49.316 align:center line:-1 position:50% size:40%
But we'll wait to take questions
until the end of the talk

00:00:49.316 --> 00:00:50.918 align:center line:-1 position:50% size:36%
and have
a managed discussion then.

00:00:50.918 --> 00:00:52.219 align:center line:-1 position:50% size:26%
So in the meantime,
please keep

00:00:52.219 --> 00:00:54.354 align:center line:-1 position:50% size:36%
your microphones muted
and your cameras turned off

00:00:54.354 --> 00:00:56.523 align:center line:-1 position:50% size:29%
to preserve bandwidth.

00:00:56.523 --> 00:00:58.358 align:center line:-1 position:50% size:36%
Just a quick announcement:

00:00:58.358 --> 00:01:00.460 align:center line:-1 position:50% size:39%
Next week
we'll be hearing from Dan Coe

00:01:00.460 --> 00:01:01.929 align:center line:-1 position:50% size:24%
of the Washington
Geological Survey

00:01:01.929 --> 00:01:04.698 align:center line:-1 position:50% size:43%
with a talk entitled,
"Communicating Landslide Risk--

00:01:04.698 --> 00:01:06.533 align:center line:-1 position:50% size:28%
Landslide Information
and Hazards

00:01:06.533 --> 00:01:07.935 align:center line:-1 position:50% size:31%
with Maps and Graphics

00:01:07.935 --> 00:01:11.371 align:center line:-1 position:50% size:25%
at the Washington
Geological Survey."

00:01:11.371 --> 00:01:13.140 align:center line:-1 position:50% size:27%
Today I am taking on
the dual role

00:01:13.140 --> 00:01:15.309 align:center line:-1 position:50% size:33%
of hosting and introducing
today's speaker,

00:01:15.309 --> 00:01:18.211 align:center line:-1 position:50% size:35%
Jacob Woodard,
who is a Mendenhall fellow

00:01:18.211 --> 00:01:21.214 align:center line:-1 position:50% size:30%
with us here
at the USGS in Golden.

00:01:21.214 --> 00:01:22.783 align:center line:-1 position:50% size:27%
Before coming to us,
Jacob pursued

00:01:22.783 --> 00:01:25.452 align:center line:-1 position:50% size:28%
his bachelor's degree
in Geology at BYU

00:01:25.452 --> 00:01:27.421 align:center line:-1 position:50% size:25%
and earned both
a master's and PhD

00:01:27.421 --> 00:01:29.923 align:center line:-1 position:50% size:28%
from the University
of Wisconsin-Madison

00:01:29.923 --> 00:01:34.094 align:center line:-1 position:50% size:41%
with an emphasis in Geophysics
and Surface Processes.

00:01:34.094 --> 00:01:36.129 align:center line:-1 position:50% size:39%
His research there
was focused on understanding

00:01:36.129 --> 00:01:38.665 align:center line:-1 position:50% size:38%
subglacial erosion processes.

00:01:38.665 --> 00:01:40.867 align:center line:-1 position:50% size:38%
And it took him
to some of my favorite places

00:01:40.867 --> 00:01:44.471 align:center line:-1 position:50% size:31%
in the world,
like Alberta and Iceland.

00:01:44.471 --> 00:01:46.606 align:center line:-1 position:50% size:34%
But promptly after
defending his dissertation,

00:01:46.606 --> 00:01:50.544 align:center line:-1 position:50% size:43%
Jacob moved with his family here
in the summer of 2020,

00:01:50.544 --> 00:01:52.746 align:center line:-1 position:50% size:38%
right in the middle
of those pandemic lockdowns

00:01:52.746 --> 00:01:55.916 align:center line:-1 position:50% size:36%
and the birth of this seminar.

00:01:55.916 --> 00:01:57.217 align:center line:-1 position:50% size:24%
And so it was
an interesting time

00:01:57.217 --> 00:01:59.219 align:center line:-1 position:50% size:28%
and challenging for us
to get research going,

00:01:59.219 --> 00:02:01.722 align:center line:-1 position:50% size:43%
but Jacob's really excelled
and has done some exciting work

00:02:01.722 --> 00:02:04.424 align:center line:-1 position:50% size:33%
on landslide susceptibility
with new data

00:02:04.424 --> 00:02:06.760 align:center line:-1 position:50% size:39%
and new modeling techniques,
and so that's

00:02:06.760 --> 00:02:08.662 align:center line:-1 position:50% size:25%
what he's gonna
present to us today.

00:02:08.662 --> 00:02:11.965 align:center line:-1 position:50% size:40%
It's my pleasure
to hand the screen off to Jacob.

00:02:11.965 --> 00:02:14.401 align:center line:-1 position:50% size:28%
-Jacob, thank you.
-All right, thanks, Ben.

00:02:14.401 --> 00:02:15.869 align:center line:-1 position:50% size:17%
Appreciate it.

00:02:15.869 --> 00:02:17.237 align:center line:-1 position:50% size:26%
Yeah, it's a pleasure
to be here

00:02:17.237 --> 00:02:20.007 align:center line:-1 position:50% size:31%
and I'm excited
to share with everybody

00:02:20.007 --> 00:02:23.744 align:center line:-1 position:50% size:40%
some work that I've been doing
here at the USGS.

00:02:23.744 --> 00:02:25.412 align:center line:-1 position:50% size:34%
Basically figuring out ways

00:02:25.412 --> 00:02:27.381 align:center line:-1 position:50% size:25%
to make meaningful
susceptibility maps

00:02:27.381 --> 00:02:30.984 align:center line:-1 position:50% size:25%
over large regions
with imperfect data.

00:02:30.984 --> 00:02:32.586 align:center line:-1 position:50% size:35%
Here are
the government disclaimers

00:02:32.586 --> 00:02:34.388 align:center line:-1 position:50% size:25%
that are mandatory.

00:02:34.388 --> 00:02:36.623 align:center line:-1 position:50% size:28%
For this talk,
first gonna go through

00:02:36.623 --> 00:02:38.458 align:center line:-1 position:50% size:28%
kinda the problems
that we're confronting,

00:02:38.458 --> 00:02:42.462 align:center line:-1 position:50% size:40%
and then we're gonna try to
answer these two big questions

00:02:42.462 --> 00:02:45.699 align:center line:-1 position:50% size:33%
that I keep on running into
as I'm trying to develop

00:02:45.699 --> 00:02:47.701 align:center line:-1 position:50% size:32%
these susceptibility maps
over large regions.

00:02:47.701 --> 00:02:49.770 align:center line:-1 position:50% size:39%
First is how to manage regions

00:02:49.770 --> 00:02:52.773 align:center line:-1 position:50% size:26%
with limited
or no landslide data,

00:02:52.773 --> 00:02:55.709 align:center line:-1 position:50% size:30%
and to evaluate this
we're gonna investigate

00:02:55.709 --> 00:02:57.310 align:center line:-1 position:50% size:28%
the influence
of the training domain

00:02:57.310 --> 00:02:59.413 align:center line:-1 position:50% size:30%
or the data that you use
to develop your models

00:02:59.413 --> 00:03:01.915 align:center line:-1 position:50% size:33%
and the limits of that data.

00:03:01.915 --> 00:03:04.017 align:center line:-1 position:50% size:36%
And the next question
that we keep on confronting

00:03:04.017 --> 00:03:06.553 align:center line:-1 position:50% size:41%
is how to manage
the imprecise data that we have

00:03:06.553 --> 00:03:09.089 align:center line:-1 position:50% size:34%
available to us for creating
these susceptibility maps.

00:03:09.089 --> 00:03:11.191 align:center line:-1 position:50% size:35%
This isn't a unique problem.

00:03:11.191 --> 00:03:14.127 align:center line:-1 position:50% size:36%
Basically any data set
that we have available to us

00:03:14.127 --> 00:03:17.631 align:center line:-1 position:50% size:34%
is imperfect and imprecise
in its very nature.

00:03:17.631 --> 00:03:19.699 align:center line:-1 position:50% size:36%
And so how can we mitigate

00:03:19.699 --> 00:03:21.334 align:center line:-1 position:50% size:29%
the imprecision
with the model output?

00:03:21.334 --> 00:03:24.004 align:center line:-1 position:50% size:43%
How to best represent the model.

00:03:24.004 --> 00:03:26.706 align:center line:-1 position:50% size:36%
And we're gonna--to do that
we're gonna dive into

00:03:26.706 --> 00:03:29.142 align:center line:-1 position:50% size:30%
different mapping units,
and the pros and cons

00:03:29.142 --> 00:03:33.013 align:center line:-1 position:50% size:32%
of the mapping units
and how they can adjust,

00:03:33.013 --> 00:03:36.883 align:center line:-1 position:50% size:33%
or play a role in displaying
the imprecise nature

00:03:36.883 --> 00:03:38.819 align:center line:-1 position:50% size:22%
of the input data.

00:03:38.819 --> 00:03:40.587 align:center line:-1 position:50% size:27%
So first, introduction.

00:03:40.587 --> 00:03:42.656 align:center line:-1 position:50% size:37%
I don't think this needs
to be said to this group here,

00:03:42.656 --> 00:03:46.193 align:center line:-1 position:50% size:39%
but landslides cause damages
and losses of life

00:03:46.193 --> 00:03:49.529 align:center line:-1 position:50% size:38%
every year in the United States
and across the world.

00:03:49.529 --> 00:03:53.400 align:center line:-1 position:50% size:28%
So here's an example
of the 1985 earth--

00:03:53.400 --> 00:03:57.137 align:center line:-1 position:50% size:36%
sorry, landslide that resulted
in over 100 fatalities.

00:03:57.137 --> 00:04:01.675 align:center line:-1 position:50% size:40%
Then more recently,
Hurricane Maria in Puerto Rico.

00:04:01.675 --> 00:04:04.644 align:center line:-1 position:50% size:24%
And so this causes
damages to life,

00:04:04.644 --> 00:04:08.515 align:center line:-1 position:50% size:21%
to infrastructure,
to residences,

00:04:08.515 --> 00:04:11.952 align:center line:-1 position:50% size:33%
and we can expect that
with the changing climate

00:04:11.952 --> 00:04:14.621 align:center line:-1 position:50% size:35%
the severity of these storms
that cause

00:04:14.621 --> 00:04:18.158 align:center line:-1 position:50% size:32%
these types of landslides
is gonna increase.

00:04:18.158 --> 00:04:21.761 align:center line:-1 position:50% size:25%
Here's an example
of a damaged road.

00:04:21.761 --> 00:04:24.264 align:center line:-1 position:50% size:40%
And this causes both
loss of life,

00:04:24.264 --> 00:04:26.933 align:center line:-1 position:50% size:39%
which is incalculable, but also,

00:04:26.933 --> 00:04:28.268 align:center line:-1 position:50% size:24%
when you damage
the infrastructure,

00:04:28.268 --> 00:04:30.770 align:center line:-1 position:50% size:40%
there's a lot of economic losses
associated with that.

00:04:30.770 --> 00:04:32.239 align:center line:-1 position:50% size:24%
If you have a road,
a pivotal road,

00:04:32.239 --> 00:04:34.374 align:center line:-1 position:50% size:24%
that's closed down
for a long time,

00:04:34.374 --> 00:04:37.444 align:center line:-1 position:50% size:37%
it's important to try to reduce
these losses as best we can,

00:04:37.444 --> 00:04:39.646 align:center line:-1 position:50% size:24%
and that's kind of
been the objective

00:04:39.646 --> 00:04:41.548 align:center line:-1 position:50% size:24%
of my project here.

00:04:41.548 --> 00:04:44.951 align:center line:-1 position:50% size:35%
And kinda the first measure
that you can do

00:04:44.951 --> 00:04:46.586 align:center line:-1 position:50% size:37%
to try to mitigate these losses

00:04:46.586 --> 00:04:49.356 align:center line:-1 position:50% size:32%
is the creation
of a susceptibility map.

00:04:49.356 --> 00:04:51.992 align:center line:-1 position:50% size:29%
So here is an example
of a susceptibility map

00:04:51.992 --> 00:04:54.027 align:center line:-1 position:50% size:23%
from Puerto Rico.

00:04:54.027 --> 00:04:55.996 align:center line:-1 position:50% size:27%
And as I said before,

00:04:55.996 --> 00:04:57.764 align:center line:-1 position:50% size:30%
this is kinda
the first line of defense.

00:04:57.764 --> 00:05:01.568 align:center line:-1 position:50% size:31%
And so what precisely is
a susceptibility map?

00:05:01.568 --> 00:05:04.104 align:center line:-1 position:50% size:36%
Well, what it does
is it measures the likelihood

00:05:04.104 --> 00:05:06.239 align:center line:-1 position:50% size:30%
of landslide occurrence
at a given location

00:05:06.239 --> 00:05:08.074 align:center line:-1 position:50% size:23%
given the local
terrain conditions,

00:05:08.074 --> 00:05:09.543 align:center line:-1 position:50% size:24%
and that's it, right?

00:05:09.543 --> 00:05:12.312 align:center line:-1 position:50% size:31%
It's just
a spatial component of--

00:05:12.312 --> 00:05:14.381 align:center line:-1 position:50% size:31%
And typically the outputs
are the probability

00:05:14.381 --> 00:05:18.685 align:center line:-1 position:50% size:42%
of observing a mapped landslide
within a given mapping unit.

00:05:18.685 --> 00:05:22.722 align:center line:-1 position:50% size:35%
And so a susceptibility map
and a susceptibility model

00:05:22.722 --> 00:05:24.591 align:center line:-1 position:50% size:32%
are inherently imprecise,

00:05:24.591 --> 00:05:26.560 align:center line:-1 position:50% size:39%
and this is because of the data
that we have available

00:05:26.560 --> 00:05:28.195 align:center line:-1 position:50% size:23%
for creating these.

00:05:28.195 --> 00:05:29.930 align:center line:-1 position:50% size:41%
If we had timing and magnitude,

00:05:29.930 --> 00:05:32.165 align:center line:-1 position:50% size:38%
we wouldn't try
to create a susceptibility map,

00:05:32.165 --> 00:05:34.467 align:center line:-1 position:50% size:41%
we'd try to create a hazard map,

00:05:34.467 --> 00:05:36.570 align:center line:-1 position:50% size:36%
which, as the name implies,

00:05:36.570 --> 00:05:39.973 align:center line:-1 position:50% size:31%
incorporates magnitude,
time, and location.

00:05:39.973 --> 00:05:42.309 align:center line:-1 position:50% size:38%
But over much of the United--

00:05:42.309 --> 00:05:44.211 align:center line:-1 position:50% size:33%
most of the United States
and the world,

00:05:44.211 --> 00:05:47.047 align:center line:-1 position:50% size:31%
we don't have the timing
and magnitude

00:05:47.047 --> 00:05:48.381 align:center line:-1 position:50% size:31%
of most of the landslides

00:05:48.381 --> 00:05:51.051 align:center line:-1 position:50% size:33%
that we have
any sort of information on.

00:05:51.051 --> 00:05:54.387 align:center line:-1 position:50% size:36%
And so what good
is really a susceptibility map

00:05:54.387 --> 00:05:56.523 align:center line:-1 position:50% size:36%
if the output's so imprecise?

00:05:56.523 --> 00:05:58.658 align:center line:-1 position:50% size:26%
Well, what it does
is it highlights areas

00:05:58.658 --> 00:06:00.327 align:center line:-1 position:50% size:26%
that are more prone
to landsliding

00:06:00.327 --> 00:06:03.830 align:center line:-1 position:50% size:25%
by using evidence
of past landslides.

00:06:05.232 --> 00:06:08.268 align:center line:-1 position:50% size:30%
And again, just kinda
emphasizing this point:

00:06:08.268 --> 00:06:12.973 align:center line:-1 position:50% size:34%
It's--we're limited
to this susceptibility output

00:06:12.973 --> 00:06:14.574 align:center line:-1 position:50% size:30%
because we don't have
the data available

00:06:14.574 --> 00:06:16.443 align:center line:-1 position:50% size:27%
for something better.

00:06:16.443 --> 00:06:19.279 align:center line:-1 position:50% size:38%
But people have been making
susceptibility maps

00:06:19.279 --> 00:06:20.981 align:center line:-1 position:50% size:19%
for a long time,

00:06:20.981 --> 00:06:22.449 align:center line:-1 position:50% size:35%
and there are
a few common approaches

00:06:22.449 --> 00:06:25.785 align:center line:-1 position:50% size:35%
that people use for creating
these susceptibility maps.

00:06:25.785 --> 00:06:28.221 align:center line:-1 position:50% size:36%
First one
is a physically-based model,

00:06:28.221 --> 00:06:30.223 align:center line:-1 position:50% size:27%
and these are,
as the name implies,

00:06:30.223 --> 00:06:32.659 align:center line:-1 position:50% size:41%
they're able to model
the actual physical mechanisms

00:06:32.659 --> 00:06:35.328 align:center line:-1 position:50% size:32%
that lead to slope failure.

00:06:35.328 --> 00:06:37.831 align:center line:-1 position:50% size:37%
The next one is data-driven
or statistical models,

00:06:37.831 --> 00:06:40.100 align:center line:-1 position:50% size:35%
and these include methods
like machine learning,

00:06:40.100 --> 00:06:43.503 align:center line:-1 position:50% size:38%
fuzzy logic, deep learning, AI,
those types of methods,

00:06:43.503 --> 00:06:47.007 align:center line:-1 position:50% size:36%
where really the output
is dictated by the input data.

00:06:47.007 --> 00:06:49.276 align:center line:-1 position:50% size:30%
And we'll dig more into
exactly how these work

00:06:49.276 --> 00:06:51.911 align:center line:-1 position:50% size:34%
because this is gonna be
kinda the focus of this talk,

00:06:51.911 --> 00:06:55.115 align:center line:-1 position:50% size:27%
are these data-driven
statistical models.

00:06:55.115 --> 00:06:57.884 align:center line:-1 position:50% size:33%
Next are heuristic models,
and what these are,

00:06:57.884 --> 00:07:00.320 align:center line:-1 position:50% size:22%
are a user would
basically define

00:07:00.320 --> 00:07:02.022 align:center line:-1 position:50% size:35%
what dictates susceptibility.

00:07:02.022 --> 00:07:04.858 align:center line:-1 position:50% size:27%
So you can imagine
somebody could say,

00:07:04.858 --> 00:07:07.260 align:center line:-1 position:50% size:30%
"Okay, we're gonna say
that susceptibility

00:07:07.260 --> 00:07:09.029 align:center line:-1 position:50% size:38%
is gonna increase with slope."

00:07:09.029 --> 00:07:10.864 align:center line:-1 position:50% size:40%
And so you could just measure
the slope of the terrain,

00:07:10.864 --> 00:07:14.334 align:center line:-1 position:50% size:27%
and voila, you'd have
a susceptibility map.

00:07:14.334 --> 00:07:16.603 align:center line:-1 position:50% size:34%
That would be an example
of a heuristic model.
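
NOTE
Sketch, not from the talk: one way the slope-only heuristic described above could look in code.
It assumes a small elevation grid with square cells. The function names, the 10 m cell size,
the 45-degree cap, and the elevation values are all hypothetical choices made for illustration.
```python
import math
def slope_degrees(dem, cell_size):
    """Slope in degrees at interior cells via central differences; edge cells stay 0."""
    rows, cols = len(dem), len(dem[0])
    slopes = [[0.0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            dz_dx = (dem[r][c + 1] - dem[r][c - 1]) / (2 * cell_size)
            dz_dy = (dem[r + 1][c] - dem[r - 1][c]) / (2 * cell_size)
            slopes[r][c] = math.degrees(math.atan(math.hypot(dz_dx, dz_dy)))
    return slopes
def heuristic_susceptibility(slopes, max_slope=45.0):
    """User-defined rule: susceptibility rises linearly with slope and is capped at 1."""
    return [[min(s / max_slope, 1.0) for s in row] for row in slopes]
# Hypothetical 4x4 elevation grid in metres that steepens toward the east.
dem = [[0.0, 2.0, 8.0, 30.0] for _ in range(4)]
slopes = slope_degrees(dem, cell_size=10.0)
susceptibility_map = heuristic_susceptibility(slopes)
for row in susceptibility_map:
    print([round(v, 2) for v in row])
```
No landslide inventory is used here: the rule is set by the user, which is what makes it heuristic.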

00:07:16.603 --> 00:07:18.838 align:center line:-1 position:50% size:33%
And then you can also do
geomorphic mapping

00:07:18.838 --> 00:07:23.310 align:center line:-1 position:50% size:40%
using GIS and field observation
to create a susceptibility map.

00:07:23.310 --> 00:07:25.545 align:center line:-1 position:50% size:38%
But as I said before,
we're gonna be focusing here

00:07:25.545 --> 00:07:27.013 align:center line:-1 position:50% size:23%
on data-driven
statistical models,

00:07:27.013 --> 00:07:28.348 align:center line:-1 position:50% size:29%
and the reason for that
is because

00:07:28.348 --> 00:07:31.351 align:center line:-1 position:50% size:33%
generally over large areas
we don't--once again,

00:07:31.351 --> 00:07:34.120 align:center line:-1 position:50% size:41%
we don't have the data available
for a physically-based model.

00:07:34.120 --> 00:07:37.090 align:center line:-1 position:50% size:27%
But thanks to GI--
or to remote sensing,

00:07:37.090 --> 00:07:38.758 align:center line:-1 position:50% size:27%
that's really
kinda gained traction

00:07:38.758 --> 00:07:40.327 align:center line:-1 position:50% size:30%
over the last few years.

00:07:40.327 --> 00:07:43.163 align:center line:-1 position:50% size:31%
We do have a lot of data
that can be used

00:07:43.163 --> 00:07:45.432 align:center line:-1 position:50% size:23%
with data-driven
statistical models.

00:07:47.634 --> 00:07:49.602 align:center line:-1 position:50% size:40%
So how do these models work?

00:07:49.602 --> 00:07:51.671 align:center line:-1 position:50% size:34%
The basic idea is,
you imagine that you have

00:07:51.671 --> 00:07:54.774 align:center line:-1 position:50% size:36%
some domain of interest
shown here in these figures,

00:07:54.774 --> 00:07:57.210 align:center line:-1 position:50% size:40%
and you have some
known distribution of landslides

00:07:57.210 --> 00:07:59.979 align:center line:-1 position:50% size:38%
shown here by the white dots.

00:07:59.979 --> 00:08:01.348 align:center line:-1 position:50% size:35%
And then you can measure
the attributes

00:08:01.348 --> 00:08:02.615 align:center line:-1 position:50% size:25%
of those landslides.

00:08:02.615 --> 00:08:04.317 align:center line:-1 position:50% size:28%
Here on the left
we have slope plotted

00:08:04.317 --> 00:08:06.453 align:center line:-1 position:50% size:24%
and on the right
we have elevation.

00:08:06.453 --> 00:08:08.355 align:center line:-1 position:50% size:30%
And so you can gather
the slope and elevation

00:08:08.355 --> 00:08:10.490 align:center line:-1 position:50% size:32%
of all these
different landslide points,

00:08:10.490 --> 00:08:13.026 align:center line:-1 position:50% size:38%
and then you can collect data
across the domain

00:08:13.026 --> 00:08:15.895 align:center line:-1 position:50% size:30%
that doesn't have
evidence of landsliding,

00:08:15.895 --> 00:08:17.597 align:center line:-1 position:50% size:30%
and collect
the slope and elevation

00:08:17.597 --> 00:08:20.467 align:center line:-1 position:50% size:33%
of all those
non-landsliding locations.

00:08:20.467 --> 00:08:22.569 align:center line:-1 position:50% size:32%
So you have
these two arrays of data.

00:08:22.569 --> 00:08:25.038 align:center line:-1 position:50% size:25%
Landslide data,
non-landslide data.

00:08:25.038 --> 00:08:27.707 align:center line:-1 position:50% size:35%
You feed that into a model
and that model is designed

00:08:27.707 --> 00:08:30.076 align:center line:-1 position:50% size:36%
to differentiate as best it can

00:08:30.076 --> 00:08:33.613 align:center line:-1 position:50% size:36%
the landslide data
from the non-landslide data.
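
NOTE
Sketch, not from the talk: a minimal example of feeding landslide and non-landslide attribute
arrays into a model that tries to separate them. Logistic regression is used here only as a
simple stand-in; the talk does not say which statistical model was used. All sample values,
function names, and settings (learning rate, epochs) are hypothetical.
```python
import math
import statistics
def standardize(rows):
    """Rescale each column to zero mean and unit variance; also return the (mean, std) pairs."""
    stats = [(statistics.mean(col), statistics.pstdev(col)) for col in zip(*rows)]
    scaled = [[(v - m) / s for v, (m, s) in zip(row, stats)] for row in rows]
    return scaled, stats
def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))
def train_logistic(features, labels, lr=0.5, epochs=2000):
    """Fit logistic regression by batch gradient descent on the mean log-loss."""
    weights, bias, n = [0.0] * len(features[0]), 0.0, len(features)
    for _ in range(epochs):
        errors = [sigmoid(bias + sum(w * x for w, x in zip(weights, row))) - y
                  for row, y in zip(features, labels)]
        weights = [w - lr * sum(e * row[j] for e, row in zip(errors, features)) / n
                   for j, w in enumerate(weights)]
        bias -= lr * sum(errors) / n
    return weights, bias
def susceptibility(location, weights, bias, stats):
    """Modeled probability that a location with these attributes looks like a mapped landslide."""
    scaled = [(v - m) / s for v, (m, s) in zip(location, stats)]
    return sigmoid(bias + sum(w * x for w, x in zip(weights, scaled)))
# Hypothetical (slope in degrees, elevation in metres) samples, invented for illustration.
landslide = [(32, 1500), (38, 1800), (29, 1650), (41, 2100), (35, 1700)]
non_landslide = [(5, 1200), (12, 1400), (8, 2000), (15, 1600), (3, 1000)]
features, stats = standardize(landslide + non_landslide)
labels = [1] * len(landslide) + [0] * len(non_landslide)
weights, bias = train_logistic(features, labels)
p_steep = susceptibility((36, 1700), weights, bias, stats)
p_gentle = susceptibility((6, 1300), weights, bias, stats)
print(f"steep site: {p_steep:.2f}, gentle site: {p_gentle:.2f}")
```
The output depends entirely on the two input arrays, which is the point made in the talk about data.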

00:08:33.613 --> 00:08:35.815 align:center line:-1 position:50% size:22%
And so hopefully
it's obvious here

00:08:35.815 --> 00:08:37.884 align:center line:-1 position:50% size:38%
the importance of data, right?

00:08:37.884 --> 00:08:40.019 align:center line:-1 position:50% size:32%
This is gonna be
kind of a common theme

00:08:40.019 --> 00:08:43.256 align:center line:-1 position:50% size:35%
or point I'm gonna
keep on driving home here.

00:08:43.256 --> 00:08:46.326 align:center line:-1 position:50% size:38%
Data reigns supreme
with these statistical methods.

00:08:46.326 --> 00:08:48.528 align:center line:-1 position:50% size:22%
And so with that,

00:08:48.528 --> 00:08:50.296 align:center line:-1 position:50% size:34%
we're running into
these two major obstacles.

00:08:50.296 --> 00:08:52.665 align:center line:-1 position:50% size:37%
What do we do if we wanna
create a susceptibility map

00:08:52.665 --> 00:08:54.334 align:center line:-1 position:50% size:32%
where we might not have

00:08:54.334 --> 00:08:56.302 align:center line:-1 position:50% size:27%
very much data
or no landslide data?

00:08:56.302 --> 00:08:59.072 align:center line:-1 position:50% size:34%
How can we create
a susceptibility map there?

00:08:59.072 --> 00:09:02.242 align:center line:-1 position:50% size:32%
And then how to manage
the imprecise nature

00:09:02.242 --> 00:09:04.177 align:center line:-1 position:50% size:26%
of the landslide data
that's available to us

00:09:04.177 --> 00:09:06.913 align:center line:-1 position:50% size:26%
to create these
susceptibility maps?

00:09:06.913 --> 00:09:09.182 align:center line:-1 position:50% size:27%
To illustrate this point
we're gonna zoom in

00:09:09.182 --> 00:09:12.085 align:center line:-1 position:50% size:34%
to the Four Corners region
of the United States.

00:09:12.085 --> 00:09:14.487 align:center line:-1 position:50% size:28%
Here I have an image

00:09:14.487 --> 00:09:18.258 align:center line:-1 position:50% size:40%
from the National Inventory
that USGS has been compiling,

00:09:18.258 --> 00:09:21.828 align:center line:-1 position:50% size:37%
and the colors here correlate
with the level of confidence

00:09:21.828 --> 00:09:23.596 align:center line:-1 position:50% size:30%
in the extent and nature

00:09:23.596 --> 00:09:26.366 align:center line:-1 position:50% size:26%
of the landslide data
that we have.

00:09:26.366 --> 00:09:28.835 align:center line:-1 position:50% size:28%
Lighter colors
being low confidence,

00:09:28.835 --> 00:09:32.005 align:center line:-1 position:50% size:32%
higher--or darker colors
being higher confidence.

00:09:32.005 --> 00:09:34.841 align:center line:-1 position:50% size:38%
So we're gonna zoom in here
to Colorado, New Mexico,

00:09:34.841 --> 00:09:36.576 align:center line:-1 position:50% size:24%
Arizona, and Utah.

00:09:36.576 --> 00:09:39.712 align:center line:-1 position:50% size:37%
And you can see that there's
a lot of heterogeneity here

00:09:39.712 --> 00:09:42.782 align:center line:-1 position:50% size:32%
both in the extent
of the mapped landslides

00:09:42.782 --> 00:09:46.019 align:center line:-1 position:50% size:22%
and the quality
of the landslides.

00:09:46.019 --> 00:09:47.620 align:center line:-1 position:50% size:37%
And so you see here in Utah

00:09:47.620 --> 00:09:50.056 align:center line:-1 position:50% size:26%
you have these
Colorado landslides

00:09:50.056 --> 00:09:53.693 align:center line:-1 position:50% size:34%
that starkly stop
at the municipal boundary,

00:09:53.693 --> 00:09:56.496 align:center line:-1 position:50% size:17%
which is odd.

00:09:56.496 --> 00:09:58.097 align:center line:-1 position:50% size:24%
And then you have
varying degrees

00:09:58.097 --> 00:10:00.033 align:center line:-1 position:50% size:27%
of confidence in this.

00:10:00.033 --> 00:10:02.135 align:center line:-1 position:50% size:42%
And so how do we manage this?

00:10:02.135 --> 00:10:03.570 align:center line:-1 position:50% size:28%
The way we're gonna

00:10:03.570 --> 00:10:05.271 align:center line:-1 position:50% size:29%
manage these regions
of no landslide data

00:10:05.271 --> 00:10:08.074 align:center line:-1 position:50% size:37%
is that we're gonna assess
the limits of the training data.

00:10:08.074 --> 00:10:09.709 align:center line:-1 position:50% size:24%
And I'll highlight
these experiments

00:10:09.709 --> 00:10:11.077 align:center line:-1 position:50% size:26%
in the coming slides,
but that's gonna be

00:10:11.077 --> 00:10:13.046 align:center line:-1 position:50% size:30%
the first half of this talk.

00:10:13.046 --> 00:10:16.149 align:center line:-1 position:50% size:43%
The next one is, "How to manage
imprecise landslide data?"

00:10:16.149 --> 00:10:18.751 align:center line:-1 position:50% size:30%
By "imprecise"--
let me define this here--

00:10:18.751 --> 00:10:20.887 align:center line:-1 position:50% size:43%
we mean heterogeneous formats.

00:10:20.887 --> 00:10:23.189 align:center line:-1 position:50% size:32%
So for instance here,
the Four Corners region.

00:10:23.189 --> 00:10:26.125 align:center line:-1 position:50% size:34%
New Mexico
predominantly uses points.

00:10:26.125 --> 00:10:28.194 align:center line:-1 position:50% size:31%
The other areas
often use a combination

00:10:28.194 --> 00:10:30.396 align:center line:-1 position:50% size:30%
of points and polygons.

00:10:30.396 --> 00:10:32.432 align:center line:-1 position:50% size:44%
And so how do we manage those?

00:10:32.432 --> 00:10:35.201 align:center line:-1 position:50% size:39%
Inaccurate landslide locations,
as we already talked about.

00:10:35.201 --> 00:10:37.270 align:center line:-1 position:50% size:35%
The colors here correlate
with the level of confidence

00:10:37.270 --> 00:10:39.372 align:center line:-1 position:50% size:35%
in the extent and location
of these landslides.

00:10:39.372 --> 00:10:41.941 align:center line:-1 position:50% size:30%
So how do we deal with
that heterogeneity?

00:10:41.941 --> 00:10:43.243 align:center line:-1 position:50% size:40%
There's no time unit component

00:10:43.243 --> 00:10:44.777 align:center line:-1 position:50% size:30%
on basically
any of these landslides

00:10:44.777 --> 00:10:46.045 align:center line:-1 position:50% size:30%
that are available to us.

00:10:46.045 --> 00:10:48.314 align:center line:-1 position:50% size:28%
These are all mapped
using predominantly

00:10:48.314 --> 00:10:51.251 align:center line:-1 position:50% size:34%
geomorphic mapping tools
and GIS.

00:10:51.251 --> 00:10:52.819 align:center line:-1 position:50% size:39%
And the level of completeness.

00:10:52.819 --> 00:10:56.589 align:center line:-1 position:50% size:34%
We can see that
there's relative differences

00:10:56.589 --> 00:10:58.258 align:center line:-1 position:50% size:36%
in the level of completeness.

00:10:58.258 --> 00:11:00.393 align:center line:-1 position:50% size:35%
And what really does--
what does a complete map

00:11:00.393 --> 00:11:02.195 align:center line:-1 position:50% size:20%
really look like?

00:11:02.195 --> 00:11:04.864 align:center line:-1 position:50% size:27%
These are all kinda
inherent imprecisions

00:11:04.864 --> 00:11:06.766 align:center line:-1 position:50% size:32%
that we need to deal with
and account for

00:11:06.766 --> 00:11:10.069 align:center line:-1 position:50% size:26%
when we're creating
a susceptibility map.

00:11:10.069 --> 00:11:14.240 align:center line:-1 position:50% size:42%
And one way that we can kind of
accommodate this imprecision

00:11:14.240 --> 00:11:15.742 align:center line:-1 position:50% size:21%
is by the choice
of mapping unit,

00:11:15.742 --> 00:11:19.345 align:center line:-1 position:50% size:27%
and I'll dive into
how that plays a role

00:11:19.345 --> 00:11:21.848 align:center line:-1 position:50% size:24%
for the second half
of this talk.

00:11:21.848 --> 00:11:23.716 align:center line:-1 position:50% size:30%
So once again,
we're first gonna tackle

00:11:23.716 --> 00:11:25.685 align:center line:-1 position:50% size:30%
how to manage regions
with limited

00:11:25.685 --> 00:11:28.221 align:center line:-1 position:50% size:31%
or no landslide data,
and then we'll talk about

00:11:28.221 --> 00:11:30.924 align:center line:-1 position:50% size:30%
how to manage
imprecision in the data.

00:11:30.924 --> 00:11:33.560 align:center line:-1 position:50% size:27%
So to evaluate
how to apply a model

00:11:33.560 --> 00:11:37.764 align:center line:-1 position:50% size:30%
to areas with poor data,
or limited or no data,

00:11:37.764 --> 00:11:39.933 align:center line:-1 position:50% size:28%
we're gonna evaluate
previous approaches

00:11:39.933 --> 00:11:43.403 align:center line:-1 position:50% size:33%
that have been done
for mapping susceptibility

00:11:43.403 --> 00:11:45.438 align:center line:-1 position:50% size:33%
over areas with poor data.

00:11:45.438 --> 00:11:48.441 align:center line:-1 position:50% size:30%
And having reviewed
the literature quite a bit,

00:11:48.441 --> 00:11:52.745 align:center line:-1 position:50% size:38%
I kinda broke 'em down into
these three main approaches.

00:11:52.745 --> 00:11:54.380 align:center line:-1 position:50% size:25%
The first one
is applying a model

00:11:54.380 --> 00:11:57.483 align:center line:-1 position:50% size:35%
trained on data-rich regions
to data-poor regions.

00:11:57.483 --> 00:11:58.952 align:center line:-1 position:50% size:32%
And the way we're gonna
evaluate this approach

00:11:58.952 --> 00:12:00.820 align:center line:-1 position:50% size:29%
is that we have
four different data sets

00:12:00.820 --> 00:12:02.789 align:center line:-1 position:50% size:39%
from across the United States,

00:12:02.789 --> 00:12:05.258 align:center line:-1 position:50% size:39%
and we're gonna train a model
on three of those four

00:12:05.258 --> 00:12:07.393 align:center line:-1 position:50% size:31%
and apply that model
and see how it performs

00:12:07.393 --> 00:12:09.128 align:center line:-1 position:50% size:24%
on that one region
that's left out,

00:12:09.128 --> 00:12:10.964 align:center line:-1 position:50% size:36%
and this is basically to mimic

00:12:10.964 --> 00:12:14.233 align:center line:-1 position:50% size:27%
what most people do

00:12:14.233 --> 00:12:16.035 align:center line:-1 position:50% size:24%
with this approach.

00:12:16.035 --> 00:12:17.537 align:center line:-1 position:50% size:32%
We're gonna do this
for each of the four areas

00:12:17.537 --> 00:12:20.573 align:center line:-1 position:50% size:38%
so we can get kind of a range
of the relative performance

00:12:20.573 --> 00:12:22.942 align:center line:-1 position:50% size:19%
of this method.

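NOTE Editor's sketch of the leave-one-region-out scheme described above: train on three of the four regions, test on the one left out, and repeat for each region. The function name is hypothetical, and the region names are taken from later in the talk. This only builds the splits; it is not the speaker's code.
```python
# Region names come from later in the talk; the function name is illustrative.
REGIONS = ["Magoffin", "Doddridge", "Macon", "Elkhorn"]
def leave_one_region_out(regions):
    """Yield (training_regions, held_out_region) pairs, one per region."""
    for held_out in regions:
        train = [r for r in regions if r != held_out]
        yield train, held_out
splits = list(leave_one_region_out(REGIONS))
for train, test in splits:
    print(f"train on {train}, test on {test}")
```
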
00:12:22.942 --> 00:12:25.178 align:center line:-1 position:50% size:38%
The second approach
that I've seen commonly used

00:12:25.178 --> 00:12:27.847 align:center line:-1 position:50% size:38%
is that they restrict
model training and application

00:12:27.847 --> 00:12:31.017 align:center line:-1 position:50% size:41%
to regions with
shared environmental attributes.

00:12:31.017 --> 00:12:34.921 align:center line:-1 position:50% size:43%
And so we selected these regions
based off of one, the quality

00:12:34.921 --> 00:12:36.556 align:center line:-1 position:50% size:26%
of the landslide data
that's available,

00:12:36.556 --> 00:12:39.926 align:center line:-1 position:50% size:26%
but also on
the relative similarity

00:12:39.926 --> 00:12:44.063 align:center line:-1 position:50% size:34%
in environmental attributes
on the continental scale.

00:12:44.063 --> 00:12:46.566 align:center line:-1 position:50% size:30%
So two of them share
physiographic province,

00:12:46.566 --> 00:12:48.968 align:center line:-1 position:50% size:39%
and one of them is outside
of that physiographic province,

00:12:48.968 --> 00:12:51.771 align:center line:-1 position:50% size:39%
but shares a similar ecoregion,

00:12:51.771 --> 00:12:54.340 align:center line:-1 position:50% size:27%
and then one region
is completely outside

00:12:54.340 --> 00:12:57.977 align:center line:-1 position:50% size:33%
of those
environmental conditions.

00:12:57.977 --> 00:13:01.180 align:center line:-1 position:50% size:33%
And so we'll train a model
on one location

00:13:01.180 --> 00:13:03.549 align:center line:-1 position:50% size:36%
that should have
high environmental similarity

00:13:03.549 --> 00:13:05.051 align:center line:-1 position:50% size:24%
and test it on
the other locations

00:13:05.051 --> 00:13:08.721 align:center line:-1 position:50% size:39%
to see if constraining by
these environmental attributes

00:13:08.721 --> 00:13:11.424 align:center line:-1 position:50% size:44%
helps improve model performance.

00:13:11.424 --> 00:13:13.192 align:center line:-1 position:50% size:34%
So we're gonna do that
with these varying degrees

00:13:13.192 --> 00:13:15.294 align:center line:-1 position:50% size:34%
of environmental similarity.

00:13:15.294 --> 00:13:17.296 align:center line:-1 position:50% size:23%
The last approach

00:13:17.296 --> 00:13:18.798 align:center line:-1 position:50% size:41%
is training and applying a model

00:13:18.798 --> 00:13:21.668 align:center line:-1 position:50% size:30%
on a very sparse,
but uniform inventory.

00:13:21.668 --> 00:13:24.370 align:center line:-1 position:50% size:35%
And so what I mean by this
is that you have

00:13:24.370 --> 00:13:27.340 align:center line:-1 position:50% size:28%
some landslide data
across the entire area

00:13:27.340 --> 00:13:29.008 align:center line:-1 position:50% size:26%
where you wanna
model susceptibility,

00:13:29.008 --> 00:13:31.444 align:center line:-1 position:50% size:35%
but it could be very sparse.

00:13:31.444 --> 00:13:36.382 align:center line:-1 position:50% size:42%
And often the way people do this
is that they'll kinda scour GIS

00:13:36.382 --> 00:13:38.184 align:center line:-1 position:50% size:36%
and just find some locations

00:13:38.184 --> 00:13:40.853 align:center line:-1 position:50% size:39%
that they know
don't have a mapped landslide

00:13:40.853 --> 00:13:43.523 align:center line:-1 position:50% size:26%
or do have evidence
of landsliding.

00:13:43.523 --> 00:13:48.428 align:center line:-1 position:50% size:38%
But they're not worried about
getting a complete repository.

00:13:48.428 --> 00:13:50.196 align:center line:-1 position:50% size:28%
And to mimic this,
what we're gonna do,

00:13:50.196 --> 00:13:52.298 align:center line:-1 position:50% size:32%
we're gonna use
these same four regions,

00:13:52.298 --> 00:13:55.234 align:center line:-1 position:50% size:36%
but only sample five percent
of the available data,

00:13:55.234 --> 00:13:57.637 align:center line:-1 position:50% size:27%
develop a model
with that five percent,

00:13:57.637 --> 00:14:00.006 align:center line:-1 position:50% size:31%
and then test that model
on the 95 percent

00:14:00.006 --> 00:14:02.008 align:center line:-1 position:50% size:28%
we didn't use
to develop the model,

00:14:02.008 --> 00:14:03.309 align:center line:-1 position:50% size:26%
and we'll repeat this
several times

00:14:03.309 --> 00:14:05.344 align:center line:-1 position:50% size:35%
so we can get distributions.

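NOTE Editor's sketch of the sparse-inventory experiment described above: draw five percent of the records for training, hold out the other 95 percent for testing, and repeat. The number of repeats (10), the seed, and the function name are assumptions; the talk only says "several times".
```python
import random
def sparse_splits(n_samples, fraction=0.05, repeats=10, seed=0):
    """Return `repeats` (train_idx, test_idx) pairs; train is a random `fraction` of the data."""
    rng = random.Random(seed)
    n_train = max(1, round(n_samples * fraction))
    splits = []
    for _ in range(repeats):
        train = set(rng.sample(range(n_samples), n_train))
        test = [i for i in range(n_samples) if i not in train]
        splits.append((sorted(train), test))
    return splits
splits = sparse_splits(n_samples=1000)
train_idx, test_idx = splits[0]
print(len(train_idx), len(test_idx))
```
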
00:14:05.344 --> 00:14:07.914 align:center line:-1 position:50% size:29%
And so now we kind of
understand the theory

00:14:07.914 --> 00:14:09.348 align:center line:-1 position:50% size:30%
of how we're gonna
approach this problem.

00:14:09.348 --> 00:14:12.185 align:center line:-1 position:50% size:37%
Here are the actual data sets
that we're looking at here.

00:14:12.185 --> 00:14:17.023 align:center line:-1 position:50% size:35%
So we have three locations
in Appalachia:

00:14:17.023 --> 00:14:19.392 align:center line:-1 position:50% size:35%
we have
Magoffin County, Kentucky,

00:14:19.392 --> 00:14:20.827 align:center line:-1 position:50% size:42%
Doddridge County, West Virginia,

00:14:20.827 --> 00:14:23.730 align:center line:-1 position:50% size:42%
which both share
the same physiographic province

00:14:23.730 --> 00:14:25.465 align:center line:-1 position:50% size:31%
and level two ecoregion.

00:14:25.465 --> 00:14:27.066 align:center line:-1 position:50% size:29%
Ecoregions are shown
by the colors.

00:14:27.066 --> 00:14:30.269 align:center line:-1 position:50% size:37%
The physiographic provinces
are shown by outlines.

00:14:30.269 --> 00:14:31.704 align:center line:-1 position:50% size:29%
And then we also have
Macon County,

00:14:31.704 --> 00:14:34.307 align:center line:-1 position:50% size:30%
which is in a different
physiographic province,

00:14:34.307 --> 00:14:36.943 align:center line:-1 position:50% size:27%
but within the same
level two ecoregion.

00:14:36.943 --> 00:14:39.245 align:center line:-1 position:50% size:38%
And then we also have
the Elkhorn Ridge Wilderness

00:14:39.245 --> 00:14:40.613 align:center line:-1 position:50% size:16%
in California.

00:14:40.613 --> 00:14:42.648 align:center line:-1 position:50% size:40%
And so these are the data sets,
and once again,

00:14:42.648 --> 00:14:44.751 align:center line:-1 position:50% size:37%
we chose these areas due to

00:14:44.751 --> 00:14:46.953 align:center line:-1 position:50% size:33%
the variation
in environmental similarity

00:14:46.953 --> 00:14:49.088 align:center line:-1 position:50% size:25%
and the high-quality
landslide data

00:14:49.088 --> 00:14:50.389 align:center line:-1 position:50% size:27%
that's available here.

00:14:50.389 --> 00:14:51.724 align:center line:-1 position:50% size:37%
And we know this because of

00:14:51.724 --> 00:14:55.762 align:center line:-1 position:50% size:39%
the high-resolution
LiDAR imagery that they used,

00:14:55.762 --> 00:14:58.664 align:center line:-1 position:50% size:39%
including field reconnaissance
for developing these catalogs

00:14:58.664 --> 00:15:00.867 align:center line:-1 position:50% size:28%
that we're gonna use.

00:15:00.867 --> 00:15:04.070 align:center line:-1 position:50% size:29%
All right, so the models
that we're gonna use,

00:15:04.070 --> 00:15:05.938 align:center line:-1 position:50% size:27%
the statistical model
that we're gonna use

00:15:05.938 --> 00:15:07.573 align:center line:-1 position:50% size:27%
is logistic regression.

00:15:07.573 --> 00:15:09.675 align:center line:-1 position:50% size:29%
And we're gonna
evaluate these models

00:15:09.675 --> 00:15:11.177 align:center line:-1 position:50% size:38%
using two primary techniques.

00:15:11.177 --> 00:15:13.179 align:center line:-1 position:50% size:33%
The first, commonly used,

00:15:13.179 --> 00:15:14.781 align:center line:-1 position:50% size:22%
receiver operating
characteristic,

00:15:14.781 --> 00:15:16.415 align:center line:-1 position:50% size:28%
area under the curve.

00:15:16.415 --> 00:15:18.151 align:center line:-1 position:50% size:30%
Kind of the go-to metric

00:15:18.151 --> 00:15:21.988 align:center line:-1 position:50% size:29%
for assessing
landslide susceptibility,

00:15:21.988 --> 00:15:24.090 align:center line:-1 position:50% size:31%
or how well the models--

00:15:24.090 --> 00:15:26.058 align:center line:-1 position:50% size:24%
But something
that's often lacking

00:15:26.058 --> 00:15:29.695 align:center line:-1 position:50% size:27%
when we evaluate
susceptibility models

00:15:29.695 --> 00:15:31.731 align:center line:-1 position:50% size:32%
is actually understanding
what leads to

00:15:31.731 --> 00:15:33.733 align:center line:-1 position:50% size:28%
that level of accuracy,

00:15:33.733 --> 00:15:37.436 align:center line:-1 position:50% size:34%
as reflected in the receiver
operating characteristic.

00:15:37.436 --> 00:15:39.705 align:center line:-1 position:50% size:30%
And we're gonna
dive into understanding

00:15:39.705 --> 00:15:41.941 align:center line:-1 position:50% size:27%
this relative accuracy
by comparing

00:15:41.941 --> 00:15:46.412 align:center line:-1 position:50% size:29%
the logistic coefficients
within the models.

00:15:46.412 --> 00:15:48.247 align:center line:-1 position:50% size:29%
And we'll get more into
what that means

00:15:48.247 --> 00:15:50.516 align:center line:-1 position:50% size:26%
in the coming slides.

00:15:50.516 --> 00:15:52.785 align:center line:-1 position:50% size:30%
It should be noted here
that we're not using

00:15:52.785 --> 00:15:55.988 align:center line:-1 position:50% size:33%
just a basic frequentist
logistic regression model,

00:15:55.988 --> 00:15:57.557 align:center line:-1 position:50% size:43%
we're using a Bayesian approach.

00:15:57.557 --> 00:15:59.392 align:center line:-1 position:50% size:27%
And the reason we're
(audio drops out)

00:15:59.392 --> 00:16:00.793 align:center line:-1 position:50% size:36%
is that
a Bayesian approach is able

00:16:00.793 --> 00:16:03.629 align:center line:-1 position:50% size:35%
to incorporate uncertainty
in the estimated distribution

00:16:03.629 --> 00:16:05.598 align:center line:-1 position:50% size:32%
of the model parameters.

00:16:05.598 --> 00:16:08.601 align:center line:-1 position:50% size:37%
In this case, we're interested
in the distribution of beta.

00:16:08.601 --> 00:16:11.370 align:center line:-1 position:50% size:40%
And a logistic regression model
is shown here,

00:16:11.370 --> 00:16:13.172 align:center line:-1 position:50% size:42%
where you have a linear function

00:16:13.172 --> 00:16:16.576 align:center line:-1 position:50% size:23%
that's within
a logistic function.

00:16:16.576 --> 00:16:18.878 align:center line:-1 position:50% size:40%
And that's basically--
and then it outputs the probability

00:16:18.878 --> 00:16:22.481 align:center line:-1 position:50% size:32%
of there being a landslide
at a given location.

00:16:22.481 --> 00:16:25.318 align:center line:-1 position:50% size:33%
And the basis
of the Bayesian approach

00:16:25.318 --> 00:16:26.853 align:center line:-1 position:50% size:27%
for logistic regression

00:16:26.853 --> 00:16:29.155 align:center line:-1 position:50% size:25%
is what's referred to
as Bayes' Law,

00:16:29.155 --> 00:16:30.890 align:center line:-1 position:50% size:31%
and what Bayes' Law is,

00:16:30.890 --> 00:16:35.127 align:center line:-1 position:50% size:39%
is that the posterior distribution
of some parameter of interest,

00:16:35.127 --> 00:16:39.298 align:center line:-1 position:50% size:31%
in this case it's the beta,
given your input data,

00:16:39.298 --> 00:16:42.435 align:center line:-1 position:50% size:23%
is proportional
to prior probability

00:16:42.435 --> 00:16:43.736 align:center line:-1 position:50% size:25%
times the
likelihood function.

00:16:43.736 --> 00:16:47.173 align:center line:-1 position:50% size:40%
The prior probability is
an assumed distribution of beta

00:16:47.173 --> 00:16:49.575 align:center line:-1 position:50% size:27%
before you even look
at the data.

00:16:49.575 --> 00:16:51.644 align:center line:-1 position:50% size:34%
And the likelihood function
is assumed

00:16:51.644 --> 00:16:52.945 align:center line:-1 position:50% size:37%
to be a Bernoulli distribution,

00:16:52.945 --> 00:16:56.382 align:center line:-1 position:50% size:30%
another logistic function
that's shown here.

00:16:56.382 --> 00:16:58.284 align:center line:-1 position:50% size:27%
So for our instances,

00:16:58.284 --> 00:17:00.319 align:center line:-1 position:50% size:23%
we're gonna use
very vague priors,

00:17:00.319 --> 00:17:02.421 align:center line:-1 position:50% size:31%
so that way the data
is really what dominates

00:17:02.421 --> 00:17:04.624 align:center line:-1 position:50% size:36%
these posterior distributions.

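NOTE Editor's sketch of the relationship described above, written out. The notation is an assumption, since the slide is not reproduced in these captions: x_i is the predictor vector at location i, y_i is 1 for a landslide and 0 otherwise, and beta is the coefficient vector. With a vague prior p(beta), the Bernoulli likelihood dominates the posterior.
```latex
p_i = \Pr(y_i = 1 \mid \mathbf{x}_i, \boldsymbol{\beta}) = \frac{1}{1 + \exp(-\mathbf{x}_i^{\top}\boldsymbol{\beta})}
\qquad
p(\boldsymbol{\beta} \mid \mathbf{X}, \mathbf{y}) \propto p(\boldsymbol{\beta}) \prod_{i=1}^{n} p_i^{\,y_i} (1 - p_i)^{1 - y_i}
```
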
00:17:04.624 --> 00:17:06.158 align:center line:-1 position:50% size:26%
And so this gives us
a distribution

00:17:06.158 --> 00:17:07.727 align:center line:-1 position:50% size:27%
of these coefficients,
and that allows us

00:17:07.727 --> 00:17:10.229 align:center line:-1 position:50% size:25%
to do more
statistical tinkering,

00:17:10.229 --> 00:17:11.731 align:center line:-1 position:50% size:33%
as we'll show you,
than you can generally do

00:17:11.731 --> 00:17:15.668 align:center line:-1 position:50% size:36%
with a frequentist approach.

00:17:15.668 --> 00:17:17.937 align:center line:-1 position:50% size:38%
So for preprocessing the data

00:17:17.937 --> 00:17:20.406 align:center line:-1 position:50% size:33%
there's a lot
that I'm not gonna go into,

00:17:20.406 --> 00:17:22.775 align:center line:-1 position:50% size:39%
but we follow
kinda the standard procedures

00:17:22.775 --> 00:17:26.345 align:center line:-1 position:50% size:31%
that people have had
success with in the past.

00:17:26.345 --> 00:17:29.282 align:center line:-1 position:50% size:37%
And we ended up evaluating
64 different predictors,

00:17:29.282 --> 00:17:31.651 align:center line:-1 position:50% size:37%
and by "predictors"
I mean aspects of the terrain,

00:17:31.651 --> 00:17:34.553 align:center line:-1 position:50% size:29%
like slope or elevation.

00:17:34.553 --> 00:17:36.689 align:center line:-1 position:50% size:23%
And after we did
our preprocessing

00:17:36.689 --> 00:17:38.858 align:center line:-1 position:50% size:34%
and a correlation analysis,
we reduced that number

00:17:38.858 --> 00:17:40.626 align:center line:-1 position:50% size:15%
down to 11.

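NOTE Editor's sketch of one common way to thin predictors by correlation. The talk does not say how its correlation analysis was done, so the greedy rule, the 0.7 threshold, and the toy columns (including "slope_smoothed") are all assumptions for illustration.
```python
import math
def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)
def drop_correlated(columns, threshold=0.7):
    """Greedily keep a predictor only if |r| with every kept predictor is below threshold."""
    kept = []
    for name in columns:
        if all(abs(pearson(columns[name], columns[k])) < threshold for k in kept):
            kept.append(name)
    return kept
# Toy values: "slope_smoothed" nearly duplicates "slope", "elevation" does not.
columns = {
    "slope": [5.0, 12.0, 20.0, 31.0, 40.0],
    "slope_smoothed": [6.0, 11.0, 21.0, 30.0, 41.0],
    "elevation": [300.0, 150.0, 420.0, 180.0, 310.0],
}
kept = drop_correlated(columns)
print(kept)
```
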
00:17:42.161 --> 00:17:44.363 align:center line:-1 position:50% size:34%
And so
that's the model approach.

00:17:44.363 --> 00:17:45.765 align:center line:-1 position:50% size:39%
How are we evaluating these?

00:17:45.765 --> 00:17:47.900 align:center line:-1 position:50% size:35%
So if you're not familiar with

00:17:47.900 --> 00:17:49.468 align:center line:-1 position:50% size:22%
receiver operating
characteristic,

00:17:49.468 --> 00:17:51.871 align:center line:-1 position:50% size:28%
area under the curve,
the basic idea is that

00:17:51.871 --> 00:17:53.806 align:center line:-1 position:50% size:30%
it's a measure
of how well your model

00:17:53.806 --> 00:17:57.543 align:center line:-1 position:50% size:38%
is able to distinguish
no landslides from landslides.

00:17:57.543 --> 00:17:58.945 align:center line:-1 position:50% size:30%
So if you have a model
that's able

00:17:58.945 --> 00:18:00.846 align:center line:-1 position:50% size:29%
to perfectly distinguish
no landslides,

00:18:00.846 --> 00:18:02.949 align:center line:-1 position:50% size:25%
shown here in blue,
from landslides,

00:18:02.949 --> 00:18:04.283 align:center line:-1 position:50% size:29%
shown here in orange,
you would get

00:18:04.283 --> 00:18:05.651 align:center line:-1 position:50% size:39%
an area-under-the-curve value
of one,

00:18:05.651 --> 00:18:08.821 align:center line:-1 position:50% size:28%
one being
perfect differentiation.

00:18:08.821 --> 00:18:10.957 align:center line:-1 position:50% size:33%
In contrast, if your model
does a very poor job

00:18:10.957 --> 00:18:12.525 align:center line:-1 position:50% size:22%
of differentiating,

00:18:12.525 --> 00:18:14.060 align:center line:-1 position:50% size:42%
such as the example down here,

00:18:14.060 --> 00:18:17.296 align:center line:-1 position:50% size:24%
where it basically
randomly chooses.

00:18:17.296 --> 00:18:18.998 align:center line:-1 position:50% size:28%
It's equivalent
to randomly choosing

00:18:18.998 --> 00:18:20.866 align:center line:-1 position:50% size:25%
between landslides
and no landslides.

00:18:20.866 --> 00:18:22.868 align:center line:-1 position:50% size:39%
You would get
an area-under-the-curve value

00:18:22.868 --> 00:18:24.203 align:center line:-1 position:50% size:8%
of 0.5,

00:18:24.203 --> 00:18:26.706 align:center line:-1 position:50% size:29%
so basically equivalent
to randomly guessing.

00:18:26.706 --> 00:18:27.940 align:center line:-1 position:50% size:36%
So you want
area-under-the-curve values

00:18:27.940 --> 00:18:29.976 align:center line:-1 position:50% size:22%
of near 1, ideally,

00:18:29.976 --> 00:18:33.245 align:center line:-1 position:50% size:36%
closer to 1, further from 0.5.

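NOTE Editor's sketch of the area-under-the-curve idea described above, using the rank interpretation: the chance that a randomly chosen landslide cell gets a higher score than a randomly chosen no-landslide cell, with ties counted as half. The labels and scores are made-up toy values.
```python
def roc_auc(labels, scores):
    """AUC as the fraction of (landslide, no-landslide) pairs ranked correctly; ties count half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
labels = [0, 0, 0, 1, 1, 1]
# Every landslide scores above every no-landslide: perfect differentiation.
perfect = roc_auc(labels, [0.1, 0.2, 0.3, 0.7, 0.8, 0.9])
# Identical scores carry no information: equivalent to random guessing.
uninformative = roc_auc(labels, [0.5] * 6)
print(perfect, uninformative)
```
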
00:18:33.245 --> 00:18:34.880 align:center line:-1 position:50% size:42%
And once again, the second way

00:18:34.880 --> 00:18:36.749 align:center line:-1 position:50% size:29%
we're gonna
evaluate these models

00:18:36.749 --> 00:18:39.952 align:center line:-1 position:50% size:36%
to understand these
area-under-the-curve values

00:18:39.952 --> 00:18:41.887 align:center line:-1 position:50% size:22%
is by assessing
the coefficients

00:18:41.887 --> 00:18:44.023 align:center line:-1 position:50% size:21%
of these models.

00:18:44.023 --> 00:18:46.325 align:center line:-1 position:50% size:32%
But because of
some inherent properties

00:18:46.325 --> 00:18:48.227 align:center line:-1 position:50% size:32%
within logistic regression,

00:18:48.227 --> 00:18:50.730 align:center line:-1 position:50% size:36%
we actually can't
directly compare coefficients

00:18:50.730 --> 00:18:53.999 align:center line:-1 position:50% size:37%
between the different models
that we've trained,

00:18:53.999 --> 00:18:56.268 align:center line:-1 position:50% size:42%
and so we have to convert these
into something called

00:18:56.268 --> 00:18:58.571 align:center line:-1 position:50% size:32%
average marginal effects,
but all these are,

00:18:58.571 --> 00:19:02.008 align:center line:-1 position:50% size:31%
it's a way of assessing
the effect of the variable

00:19:02.008 --> 00:19:04.710 align:center line:-1 position:50% size:30%
or predictor of interest
within the given model.

00:19:04.710 --> 00:19:07.346 align:center line:-1 position:50% size:28%
So an example of this
is shown here.

00:19:07.346 --> 00:19:10.383 align:center line:-1 position:50% size:39%
These different colors
correspond to a different model

00:19:10.383 --> 00:19:11.951 align:center line:-1 position:50% size:31%
or different training data
that was used

00:19:11.951 --> 00:19:13.819 align:center line:-1 position:50% size:29%
to develop that model.

00:19:13.819 --> 00:19:15.921 align:center line:-1 position:50% size:42%
And we have our predictor, slope,

00:19:15.921 --> 00:19:17.390 align:center line:-1 position:50% size:29%
and we have
these predictor effects,

00:19:17.390 --> 00:19:21.160 align:center line:-1 position:50% size:35%
or average marginal effects
on the X-axis.

00:19:21.160 --> 00:19:23.863 align:center line:-1 position:50% size:29%
And what you can see
is that there's variation

00:19:23.863 --> 00:19:28.167 align:center line:-1 position:50% size:38%
in the sign and the magnitude
of these predictor effects.

00:19:28.167 --> 00:19:30.403 align:center line:-1 position:50% size:36%
And so if we take,
for instance, Macon County,

00:19:30.403 --> 00:19:34.206 align:center line:-1 position:50% size:29%
you can see it has
a negative AME value,

00:19:34.206 --> 00:19:37.109 align:center line:-1 position:50% size:36%
and what that tells you
is that as you increase slope

00:19:37.109 --> 00:19:40.379 align:center line:-1 position:50% size:34%
at that area, the model's
gonna cause susceptibility

00:19:40.379 --> 00:19:42.148 align:center line:-1 position:50% size:26%
to actually go down.

00:19:42.148 --> 00:19:44.417 align:center line:-1 position:50% size:32%
So that's what
the negative sign means.

00:19:44.417 --> 00:19:46.218 align:center line:-1 position:50% size:27%
In contrast, for
all the other locations

00:19:46.218 --> 00:19:48.788 align:center line:-1 position:50% size:30%
or all the other models,
as you increase slope,

00:19:48.788 --> 00:19:50.723 align:center line:-1 position:50% size:25%
you get an increase
in susceptibility

00:19:50.723 --> 00:19:53.659 align:center line:-1 position:50% size:32%
or in probability,
which the model outputs.

00:19:53.659 --> 00:19:56.862 align:center line:-1 position:50% size:34%
But the magnitude
of that probability increase

00:19:56.862 --> 00:19:59.331 align:center line:-1 position:50% size:37%
is variable
depending on the magnitude

00:19:59.331 --> 00:20:01.734 align:center line:-1 position:50% size:39%
of this average marginal effect.

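NOTE Editor's sketch of an average marginal effect for a continuous predictor in a logistic model, using the standard definition: the mean over locations of coefficient times p times (1 - p). The intercepts, coefficients, and slope values are hypothetical. In a Bayesian fit this would be repeated for each posterior draw to get a distribution; that step is not shown.
```python
import math
def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))
def average_marginal_effect(intercept, coefs, rows, j):
    """Mean over rows of d p / d x_j, which is coefs[j] * p * (1 - p) for a logistic model."""
    total = 0.0
    for row in rows:
        p = sigmoid(intercept + sum(b * x for b, x in zip(coefs, row)))
        total += coefs[j] * p * (1.0 - p)
    return total / len(rows)
rows = [[10.0], [20.0], [30.0], [40.0]]  # hypothetical slope values in degrees
# Positive coefficient: probability rises with slope, so the AME is positive.
ame_pos = average_marginal_effect(-2.0, [0.08], rows, j=0)
# Negative coefficient: probability falls with slope, so the AME is negative.
ame_neg = average_marginal_effect(1.0, [-0.05], rows, j=0)
print(ame_pos, ame_neg)
```
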
00:20:01.734 --> 00:20:05.204 align:center line:-1 position:50% size:33%
So how is this useful to us
for understanding

00:20:05.204 --> 00:20:06.539 align:center line:-1 position:50% size:24%
model accuracy?

00:20:06.539 --> 00:20:10.776 align:center line:-1 position:50% size:42%
Imagine that we develop a model
only using slope,

00:20:10.776 --> 00:20:13.179 align:center line:-1 position:50% size:26%
and we only have
Macon County data,

00:20:13.179 --> 00:20:15.181 align:center line:-1 position:50% size:41%
and we're gonna train our model
on Macon County,

00:20:15.181 --> 00:20:18.484 align:center line:-1 position:50% size:37%
and we end up with
this negative predictor effect,

00:20:18.484 --> 00:20:21.187 align:center line:-1 position:50% size:28%
negative average-
marginal-effect value.

00:20:21.187 --> 00:20:22.555 align:center line:-1 position:50% size:25%
And then we try
to apply that model

00:20:22.555 --> 00:20:24.457 align:center line:-1 position:50% size:40%
to any of these other data sets.

00:20:24.457 --> 00:20:28.861 align:center line:-1 position:50% size:38%
You're gonna get a horrible fit

00:20:28.861 --> 00:20:30.596 align:center line:-1 position:50% size:36%
for these areas
where you apply that model,

00:20:30.596 --> 00:20:32.565 align:center line:-1 position:50% size:35%
because it's gonna actually
do the opposite

00:20:32.565 --> 00:20:34.900 align:center line:-1 position:50% size:34%
of what it should be
for assessing susceptibility

00:20:34.900 --> 00:20:36.836 align:center line:-1 position:50% size:23%
in those locations.

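NOTE Editor's sketch of why a wrong-signed slope effect transfers so badly. With slope as the only predictor, flipping the sign of its coefficient reverses the ranking of every location, so the area under the curve becomes one minus its original value and drops below 0.5. The slopes, labels, and coefficient are made-up toy values.
```python
def roc_auc(labels, scores):
    """AUC as the fraction of (landslide, no-landslide) pairs ranked correctly; ties count half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
slopes = [8.0, 12.0, 15.0, 22.0, 27.0, 35.0]
labels = [0, 0, 1, 0, 1, 1]  # in this toy area, steeper cells tend to fail
# Scores from a model whose slope effect has the right sign for this area.
auc_right_sign = roc_auc(labels, [0.1 * s for s in slopes])
# Scores from a model trained where the slope effect came out negative.
auc_wrong_sign = roc_auc(labels, [-0.1 * s for s in slopes])
print(auc_right_sign, auc_wrong_sign)
```
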
00:20:36.836 --> 00:20:39.738 align:center line:-1 position:50% size:39%
So in this way,
these average marginal effects

00:20:39.738 --> 00:20:41.974 align:center line:-1 position:50% size:30%
can give us information
as to whether or not

00:20:41.974 --> 00:20:44.577 align:center line:-1 position:50% size:40%
these different models
are compatible with each other,

00:20:44.577 --> 00:20:46.278 align:center line:-1 position:50% size:18%
or better said,

00:20:46.278 --> 00:20:48.647 align:center line:-1 position:50% size:43%
the locations
where these models were trained

00:20:48.647 --> 00:20:50.516 align:center line:-1 position:50% size:42%
are compatible with one another.

00:20:51.884 --> 00:20:53.919 align:center line:-1 position:50% size:38%
And so using these two tools,

00:20:53.919 --> 00:20:57.890 align:center line:-1 position:50% size:33%
average marginal effects
and area under the curve,

00:20:57.890 --> 00:21:00.993 align:center line:-1 position:50% size:38%
we're gonna assess
these three different methods.

00:21:00.993 --> 00:21:04.430 align:center line:-1 position:50% size:34%
So first is applying models

00:21:04.430 --> 00:21:06.732 align:center line:-1 position:50% size:28%
from data-rich regions
to data-poor,

00:21:06.732 --> 00:21:08.334 align:center line:-1 position:50% size:25%
and then also
we're gonna look at

00:21:08.334 --> 00:21:10.936 align:center line:-1 position:50% size:32%
restricting by
environmental attributes.

00:21:10.936 --> 00:21:12.471 align:center line:-1 position:50% size:27%
And so the next plot,
I'm gonna show you

00:21:12.471 --> 00:21:15.407 align:center line:-1 position:50% size:32%
the area-under-the-curve
distributions

00:21:15.407 --> 00:21:16.909 align:center line:-1 position:50% size:40%
for these different experiments.

00:21:16.909 --> 00:21:18.277 align:center line:-1 position:50% size:38%
So there's a lot going on here.

00:21:18.277 --> 00:21:20.713 align:center line:-1 position:50% size:29%
We're gonna
break this figure down.

00:21:20.713 --> 00:21:22.615 align:center line:-1 position:50% size:23%
All right, so first
up here at the top

00:21:22.615 --> 00:21:24.817 align:center line:-1 position:50% size:40%
we have what we're referring to
as resubstitution,

00:21:24.817 --> 00:21:27.019 align:center line:-1 position:50% size:29%
that is the training data

00:21:27.019 --> 00:21:30.623 align:center line:-1 position:50% size:33%
is then evaluated on itself.

00:21:30.623 --> 00:21:32.091 align:center line:-1 position:50% size:30%
And so this
you would expect to get

00:21:32.091 --> 00:21:34.360 align:center line:-1 position:50% size:37%
relatively high
area-under-the-curve values,

00:21:34.360 --> 00:21:37.596 align:center line:-1 position:50% size:33%
and we do indeed
get areas under the curve

00:21:37.596 --> 00:21:40.199 align:center line:-1 position:50% size:15%
around 0.7.

00:21:40.199 --> 00:21:42.067 align:center line:-1 position:50% size:35%
Lowest value is about 0.65.

00:21:42.067 --> 00:21:45.137 align:center line:-1 position:50% size:41%
But relatively high values
compared to the other examples

00:21:45.137 --> 00:21:47.072 align:center line:-1 position:50% size:27%
I'm gonna show you.

00:21:47.072 --> 00:21:49.875 align:center line:-1 position:50% size:42%
Down here is where we evaluate
the first quest--

00:21:49.875 --> 00:21:51.777 align:center line:-1 position:50% size:30%
the first method
that we're interested in.

00:21:51.777 --> 00:21:53.812 align:center line:-1 position:50% size:24%
Training a model
on data-rich areas,

00:21:53.812 --> 00:21:56.482 align:center line:-1 position:50% size:26%
and then applying
to a data-poor area.

00:21:56.482 --> 00:21:59.084 align:center line:-1 position:50% size:28%
And these are models
that were trained

00:21:59.084 --> 00:22:02.021 align:center line:-1 position:50% size:34%
on all the data,
except for a given location,

00:22:02.021 --> 00:22:04.924 align:center line:-1 position:50% size:32%
so all the data except for
the Magoffin location.

00:22:04.924 --> 00:22:06.992 align:center line:-1 position:50% size:36%
And then we test that model
on that one area

00:22:06.992 --> 00:22:09.562 align:center line:-1 position:50% size:33%
that was left out,
in this case Magoffin.

00:22:09.562 --> 00:22:10.996 align:center line:-1 position:50% size:37%
You can see here that we get

00:22:10.996 --> 00:22:14.033 align:center line:-1 position:50% size:37%
quite a bit lower
area-under-the-curve values.

00:22:14.033 --> 00:22:16.969 align:center line:-1 position:50% size:36%
So this suggests that maybe
this isn't the best approach.

00:22:16.969 --> 00:22:19.338 align:center line:-1 position:50% size:39%
Okay, how about constricting--
or restricting

00:22:19.338 --> 00:22:21.840 align:center line:-1 position:50% size:33%
based off of
environmental attributes?

00:22:21.840 --> 00:22:23.776 align:center line:-1 position:50% size:28%
That's shown up here.

00:22:23.776 --> 00:22:26.679 align:center line:-1 position:50% size:35%
So here's the two data sets
that have shared ecoregion

00:22:26.679 --> 00:22:28.714 align:center line:-1 position:50% size:37%
and physiographic provinces.

00:22:28.714 --> 00:22:30.149 align:center line:-1 position:50% size:42%
And once again,
the highest area under the curve

00:22:30.149 --> 00:22:32.551 align:center line:-1 position:50% size:33%
we're getting is about 0.6,

00:22:32.551 --> 00:22:34.887 align:center line:-1 position:50% size:30%
but we're also getting
another example of this

00:22:34.887 --> 00:22:38.824 align:center line:-1 position:50% size:42%
below 0.5,
so very poor model performance.

00:22:38.824 --> 00:22:40.993 align:center line:-1 position:50% size:40%
Shared ecoregion and different
physiographic province,

00:22:40.993 --> 00:22:42.494 align:center line:-1 position:50% size:28%
once again,
we're getting very low

00:22:42.494 --> 00:22:43.996 align:center line:-1 position:50% size:37%
area-under-the-curve values.

00:22:43.996 --> 00:22:46.198 align:center line:-1 position:50% size:35%
And really these--restricting
by these environments

00:22:46.198 --> 00:22:48.100 align:center line:-1 position:50% size:43%
are no better than when you have
different ecoregion

00:22:48.100 --> 00:22:49.468 align:center line:-1 position:50% size:37%
and physiographic provinces,

00:22:49.468 --> 00:22:52.204 align:center line:-1 position:50% size:29%
and those distributions
are shown down here.

00:22:52.204 --> 00:22:55.007 align:center line:-1 position:50% size:37%
And so it appears that
neither of these approaches,

00:22:55.007 --> 00:22:56.842 align:center line:-1 position:50% size:28%
training on data-rich,
applying to data-poor,

00:22:56.842 --> 00:22:59.245 align:center line:-1 position:50% size:33%
or restricting by
environmental similarities,

00:22:59.245 --> 00:23:01.113 align:center line:-1 position:50% size:32%
are effective in this case.

00:23:01.113 --> 00:23:02.281 align:center line:-1 position:50% size:18%
And so, why?

00:23:02.281 --> 00:23:04.250 align:center line:-1 position:50% size:27%
What's going on here
with this data

00:23:04.250 --> 00:23:05.517 align:center line:-1 position:50% size:29%
and with these models
that cause them

00:23:05.517 --> 00:23:09.888 align:center line:-1 position:50% size:33%
not to be transferable
between these locations?

00:23:09.888 --> 00:23:12.558 align:center line:-1 position:50% size:39%
Average marginal effects
are gonna give us information.

00:23:12.558 --> 00:23:15.995 align:center line:-1 position:50% size:24%
So here hopefully
the slope segment

00:23:15.995 --> 00:23:17.830 align:center line:-1 position:50% size:22%
is familiar to you,
and once again,

00:23:17.830 --> 00:23:19.898 align:center line:-1 position:50% size:24%
if we want a model
to perform well

00:23:19.898 --> 00:23:21.767 align:center line:-1 position:50% size:30%
in a different location,
you would expect these

00:23:21.767 --> 00:23:25.771 align:center line:-1 position:50% size:36%
to have similar average-
marginal-effect distributions.

00:23:25.771 --> 00:23:28.507 align:center line:-1 position:50% size:35%
And across the parameter--
or the predictors

00:23:28.507 --> 00:23:30.109 align:center line:-1 position:50% size:22%
that we include
in these models--

00:23:30.109 --> 00:23:32.011 align:center line:-1 position:50% size:33%
here's a subset of those--

00:23:32.011 --> 00:23:34.013 align:center line:-1 position:50% size:29%
you can see that
there's no consistency

00:23:34.013 --> 00:23:35.581 align:center line:-1 position:50% size:22%
in sign generally,

00:23:35.581 --> 00:23:37.883 align:center line:-1 position:50% size:41%
and definitely
no consistency in the magnitude

00:23:37.883 --> 00:23:39.852 align:center line:-1 position:50% size:32%
of these
average marginal effects.

00:23:39.852 --> 00:23:41.754 align:center line:-1 position:50% size:31%
And if we look at
the rest of the predictors

00:23:41.754 --> 00:23:44.189 align:center line:-1 position:50% size:28%
included in the model,
it's the same story.

00:23:44.189 --> 00:23:46.725 align:center line:-1 position:50% size:40%
And so because
you don't have any consistency

00:23:46.725 --> 00:23:48.627 align:center line:-1 position:50% size:32%
in these
average marginal effects,

00:23:48.627 --> 00:23:51.297 align:center line:-1 position:50% size:39%
a model trained in one location
isn't gonna perform well

00:23:51.297 --> 00:23:52.931 align:center line:-1 position:50% size:28%
in the other locations.

00:23:55.401 --> 00:23:57.803 align:center line:-1 position:50% size:36%
And so the poor
area-under-the-curve values

00:23:57.803 --> 00:23:59.505 align:center line:-1 position:50% size:36%
that we're getting are due to

00:23:59.505 --> 00:24:01.740 align:center line:-1 position:50% size:22%
these divergent
predictor effects,

00:24:01.740 --> 00:24:03.075 align:center line:-1 position:50% size:19%
as related by--

00:24:03.075 --> 00:24:07.980 align:center line:-1 position:50% size:37%
as manifest by
the average marginal effects.

00:24:07.980 --> 00:24:09.948 align:center line:-1 position:50% size:36%
How about this last method,

00:24:09.948 --> 00:24:11.216 align:center line:-1 position:50% size:37%
training and applying models

00:24:11.216 --> 00:24:13.886 align:center line:-1 position:50% size:28%
on very sparse
but uniform inventory.

00:24:13.886 --> 00:24:16.755 align:center line:-1 position:50% size:32%
Again, here, we're only
sampling 5% of the data,

00:24:16.755 --> 00:24:19.024 align:center line:-1 position:50% size:25%
developing a model
with that 5%,

00:24:19.024 --> 00:24:22.161 align:center line:-1 position:50% size:28%
and applying it
to the rest of the data.

00:24:22.161 --> 00:24:24.129 align:center line:-1 position:50% size:28%
Here's the distribution
of those

00:24:24.129 --> 00:24:26.031 align:center line:-1 position:50% size:27%
area-under-the-curve values
we're getting.

00:24:26.031 --> 00:24:28.701 align:center line:-1 position:50% size:36%
For reference, when we use
all the available data,

00:24:28.701 --> 00:24:30.936 align:center line:-1 position:50% size:35%
100% of the available data,

00:24:30.936 --> 00:24:33.872 align:center line:-1 position:50% size:40%
we're getting an
area-under-the-curve value of 0.65.

00:24:33.872 --> 00:24:35.574 align:center line:-1 position:50% size:34%
But when we only use 5%,

00:24:35.574 --> 00:24:38.510 align:center line:-1 position:50% size:38%
on average, we get an
area-under-the-curve value of 0.63.

00:24:38.510 --> 00:24:42.681 align:center line:-1 position:50% size:26%
The drop is not
that significant here.

00:24:42.681 --> 00:24:46.118 align:center line:-1 position:50% size:25%
Why is this method
of only using 5%

00:24:46.118 --> 00:24:49.555 align:center line:-1 position:50% size:26%
of the available data
effective?

00:24:49.555 --> 00:24:51.690 align:center line:-1 position:50% size:39%
Average marginal effects are
gonna give us this information.

00:24:51.690 --> 00:24:54.226 align:center line:-1 position:50% size:22%
Here, this might
look intimidating,

00:24:54.226 --> 00:24:55.594 align:center line:-1 position:50% size:34%
but we'll walk through this.

00:24:55.594 --> 00:24:57.596 align:center line:-1 position:50% size:37%
These different colored lines,

00:24:57.596 --> 00:24:58.530 align:center line:-1 position:50% size:29%
as I mentioned before,

00:24:58.530 --> 00:25:02.835 align:center line:-1 position:50% size:22%
we sampled 5%
about 100 times.

00:25:02.835 --> 00:25:04.837 align:center line:-1 position:50% size:28%
These are those
posterior distributions

00:25:04.837 --> 00:25:06.972 align:center line:-1 position:50% size:35%
for each of those iterations.

00:25:06.972 --> 00:25:08.474 align:center line:-1 position:50% size:30%
The black line here
is showing the average

00:25:08.474 --> 00:25:09.675 align:center line:-1 position:50% size:27%
of all those iterations,

00:25:09.675 --> 00:25:11.877 align:center line:-1 position:50% size:29%
and the red is showing

00:25:11.877 --> 00:25:14.146 align:center line:-1 position:50% size:35%
the average marginal effect
distribution

00:25:14.146 --> 00:25:18.083 align:center line:-1 position:50% size:34%
for a model when we used
100% of the data.

00:25:18.083 --> 00:25:19.918 align:center line:-1 position:50% size:33%
If we expect
the model to perform well,

00:25:19.918 --> 00:25:25.023 align:center line:-1 position:50% size:40%
we would expect these average
marginal effect distributions

00:25:25.023 --> 00:25:27.493 align:center line:-1 position:50% size:24%
to be well aligned
with this red curve.

00:25:27.493 --> 00:25:28.794 align:center line:-1 position:50% size:19%
And generally,

00:25:28.794 --> 00:25:33.165 align:center line:-1 position:50% size:40%
across all the predictors
that we included in our models,

00:25:33.165 --> 00:25:34.400 align:center line:-1 position:50% size:25%
that's what we see.

00:25:34.400 --> 00:25:36.502 align:center line:-1 position:50% size:33%
We see consistency here
between these red curves

00:25:36.502 --> 00:25:39.104 align:center line:-1 position:50% size:34%
and the multicolored lines.

00:25:39.104 --> 00:25:41.173 align:center line:-1 position:50% size:29%
There's consistency
in the predictor effects.

00:25:41.173 --> 00:25:42.775 align:center line:-1 position:50% size:38%
Even though we're only using

00:25:42.775 --> 00:25:47.379 align:center line:-1 position:50% size:25%
a very small subset
of the data, 5%,

00:25:47.379 --> 00:25:50.883 align:center line:-1 position:50% size:30%
it is still able
to capture the influence

00:25:50.883 --> 00:25:52.384 align:center line:-1 position:50% size:26%
of the landslide data
appropriately

00:25:52.384 --> 00:26:01.193 align:center line:-1 position:50% size:21%
to better assess
the susceptibility

00:26:01.193 --> 00:26:03.962 align:center line:-1 position:50% size:38%
across the domain of interest.

00:26:03.962 --> 00:26:06.031 align:center line:-1 position:50% size:28%
So to just kind of
drive home this point,

00:26:06.031 --> 00:26:07.699 align:center line:-1 position:50% size:27%
on the right,
we can see similarity

00:26:07.699 --> 00:26:10.903 align:center line:-1 position:50% size:33%
in the AME distributions
for slope,

00:26:10.903 --> 00:26:12.137 align:center line:-1 position:50% size:36%
and when you train a model

00:26:12.137 --> 00:26:15.107 align:center line:-1 position:50% size:29%
on each given location
individually,

00:26:15.107 --> 00:26:18.644 align:center line:-1 position:50% size:24%
you get quite a bit
of dispersion here.

00:26:18.644 --> 00:26:20.679 align:center line:-1 position:50% size:36%
To wrap up this first section,

00:26:20.679 --> 00:26:22.948 align:center line:-1 position:50% size:29%
the unique geospatial
attributes of landslides

00:26:22.948 --> 00:26:24.550 align:center line:-1 position:50% size:15%
(inaudible).

00:26:24.550 --> 00:26:26.285 align:center line:-1 position:50% size:23%
You can't really
leave an area out

00:26:26.285 --> 00:26:28.120 align:center line:-1 position:50% size:33%
and then expect
that model to perform well

00:26:28.120 --> 00:26:30.122 align:center line:-1 position:50% size:15%
in that area
you left out.

00:26:30.122 --> 00:26:31.423 align:center line:-1 position:50% size:35%
Limiting by the environment

00:26:31.423 --> 00:26:33.025 align:center line:-1 position:50% size:26%
doesn't seem
to improve anything.

00:26:33.025 --> 00:26:36.628 align:center line:-1 position:50% size:36%
You can't expect to use
Level II ecoregions and say,

00:26:36.628 --> 00:26:38.130 align:center line:-1 position:50% size:38%
we're gonna develop a model

00:26:38.130 --> 00:26:40.432 align:center line:-1 position:50% size:41%
on this area
where we have really good data

00:26:40.432 --> 00:26:41.700 align:center line:-1 position:50% size:22%
and apply it
everywhere else,

00:26:41.700 --> 00:26:46.004 align:center line:-1 position:50% size:27%
because we assume
landslide similarities.

00:26:46.004 --> 00:26:48.307 align:center line:-1 position:50% size:24%
We've shown that
that's not effective,

00:26:48.307 --> 00:26:51.376 align:center line:-1 position:50% size:32%
at least over the domains
that we analyzed.

00:26:51.376 --> 00:26:54.146 align:center line:-1 position:50% size:34%
But using a sparse
uniform landslide inventory

00:26:54.146 --> 00:26:55.747 align:center line:-1 position:50% size:27%
is relatively effective.

00:26:55.747 --> 00:26:57.516 align:center line:-1 position:50% size:32%
So even if you only have
a little bit of data

00:26:57.516 --> 00:27:00.719 align:center line:-1 position:50% size:34%
that you can collect
across your entire domain,

00:27:00.719 --> 00:27:02.588 align:center line:-1 position:50% size:37%
that's better than trying to get

00:27:02.588 --> 00:27:05.257 align:center line:-1 position:50% size:24%
a very detailed
landslide inventory

00:27:05.257 --> 00:27:11.129 align:center line:-1 position:50% size:33%
on a very small subset
of your domain of interest.

00:27:11.129 --> 00:27:13.899 align:center line:-1 position:50% size:28%
We've looked at
how to manage areas

00:27:13.899 --> 00:27:16.702 align:center line:-1 position:50% size:22%
with limited or no
landslide data.

00:27:16.702 --> 00:27:19.738 align:center line:-1 position:50% size:26%
Now we're gonna go
into how to manage

00:27:19.738 --> 00:27:21.573 align:center line:-1 position:50% size:21%
the imprecision
of landslide data

00:27:21.573 --> 00:27:26.445 align:center line:-1 position:50% size:33%
that's inherent with
these susceptibility maps.

00:27:26.445 --> 00:27:29.481 align:center line:-1 position:50% size:40%
Once again, we're gonna dive
into the choice of mapping unit.

00:27:29.481 --> 00:27:30.716 align:center line:-1 position:50% size:22%
By mapping unit,

00:27:30.716 --> 00:27:34.520 align:center line:-1 position:50% size:31%
generally, there's
two major mapping units

00:27:34.520 --> 00:27:35.888 align:center line:-1 position:50% size:21%
that people use.

00:27:35.888 --> 00:27:38.790 align:center line:-1 position:50% size:45%
The first and most common method
is the raster-based approach,

00:27:38.790 --> 00:27:41.560 align:center line:-1 position:50% size:33%
where you can
characterize susceptibility

00:27:41.560 --> 00:27:46.665 align:center line:-1 position:50% size:32%
for each pixel or grid unit
of your raster

00:27:46.665 --> 00:27:49.635 align:center line:-1 position:50% size:31%
with an output that looks
something like up here,

00:27:49.635 --> 00:27:51.603 align:center line:-1 position:50% size:21%
in the upper left.

00:27:51.603 --> 00:27:52.838 align:center line:-1 position:50% size:35%
In contrast, another method

00:27:52.838 --> 00:27:55.307 align:center line:-1 position:50% size:24%
that's been gaining
more traction

00:27:55.307 --> 00:27:57.009 align:center line:-1 position:50% size:29%
is the slope unit-based
approach.

00:27:57.009 --> 00:28:00.879 align:center line:-1 position:50% size:29%
What a slope unit is
is a way of delineating

00:28:00.879 --> 00:28:02.848 align:center line:-1 position:50% size:30%
your domain of interest

00:28:02.848 --> 00:28:05.017 align:center line:-1 position:50% size:33%
according to
drainage and divide lines.

00:28:05.017 --> 00:28:07.619 align:center line:-1 position:50% size:34%
Here we have
a watershed from Oregon.

00:28:07.619 --> 00:28:08.921 align:center line:-1 position:50% size:30%
You can see
that each of these lines

00:28:08.921 --> 00:28:15.394 align:center line:-1 position:50% size:33%
correlates with a drainage
and/or divide line.

00:28:15.394 --> 00:28:17.362 align:center line:-1 position:50% size:27%
You can then
analyze susceptibility

00:28:17.362 --> 00:28:19.698 align:center line:-1 position:50% size:20%
over the whole
area of interest.

00:28:19.698 --> 00:28:22.868 align:center line:-1 position:50% size:30%
This gives us
quite a bit of advantage

00:28:22.868 --> 00:28:25.837 align:center line:-1 position:50% size:27%
when we're facing
imprecise input data.

00:28:25.837 --> 00:28:27.039 align:center line:-1 position:50% size:37%
What are these advantages?

00:28:27.039 --> 00:28:30.409 align:center line:-1 position:50% size:32%
First, because
we're dealing with areas,

00:28:30.409 --> 00:28:32.511 align:center line:-1 position:50% size:24%
we're able to apply
more statistics

00:28:32.511 --> 00:28:34.146 align:center line:-1 position:50% size:28%
to that area of interest

00:28:34.146 --> 00:28:37.482 align:center line:-1 position:50% size:39%
instead of just taking the mean
slope value of each pixel,

00:28:37.482 --> 00:28:39.351 align:center line:-1 position:50% size:27%
which is basically
what you're limited to

00:28:39.351 --> 00:28:40.986 align:center line:-1 position:50% size:36%
with a grid-based approach.

00:28:40.986 --> 00:28:42.187 align:center line:-1 position:50% size:26%
Because you're
dealing with an area

00:28:42.187 --> 00:28:44.456 align:center line:-1 position:50% size:20%
with an array
of slope values,

00:28:44.456 --> 00:28:45.857 align:center line:-1 position:50% size:29%
you can then measure
the standard deviation

00:28:45.857 --> 00:28:47.059 align:center line:-1 position:50% size:16%
of the slope,

00:28:47.059 --> 00:28:48.293 align:center line:-1 position:50% size:26%
or the max, the min,

00:28:48.293 --> 00:28:51.263 align:center line:-1 position:50% size:37%
whatever metric
you think is most appropriate

00:28:51.263 --> 00:28:56.635 align:center line:-1 position:50% size:34%
for assessing susceptibility
over your area of interest.

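NOTE
Editor's sketch, not from the talk: one way to summarize slope per slope
unit rather than per pixel, assuming the slope raster and a matching
raster of slope-unit IDs have been flattened to equal-length lists. The
function name and toy values are hypothetical.
```python
from statistics import mean, pstdev
def slope_unit_stats(unit_ids, slopes):
    """Group per-pixel slope values by slope-unit ID and summarize each group."""
    by_unit = {}
    for uid, slope in zip(unit_ids, slopes):
        by_unit.setdefault(uid, []).append(slope)
    return {
        uid: {"mean": mean(v), "std": pstdev(v), "min": min(v), "max": max(v)}
        for uid, v in by_unit.items()
    }
# Hypothetical flattened rasters: slope-unit ID and slope (degrees) per pixel.
unit_ids = [1, 1, 1, 2, 2, 2]
slopes = [10.0, 20.0, 30.0, 5.0, 5.0, 5.0]
stats = slope_unit_stats(unit_ids, slopes)
print(stats[1])  # variable terrain: nonzero standard deviation
print(stats[2])  # uniform terrain: zero standard deviation
```
Any of these per-unit summaries could then serve as a predictor, which is
the extra flexibility an area-based mapping unit gives over a single pixel
value.
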
00:28:56.635 --> 00:28:58.437 align:center line:-1 position:50% size:17%
Additionally,

00:28:58.437 --> 00:29:00.505 align:center line:-1 position:50% size:30%
it's a better
way of accommodating

00:29:00.505 --> 00:29:02.841 align:center line:-1 position:50% size:26%
the imprecise nature
of your input data.

00:29:02.841 --> 00:29:05.043 align:center line:-1 position:50% size:26%
And once again,
by imprecise I mean

00:29:05.043 --> 00:29:08.180 align:center line:-1 position:50% size:31%
heterogeneous formats,
inaccurate locations,

00:29:08.180 --> 00:29:10.582 align:center line:-1 position:50% size:30%
or no timing component

00:29:10.582 --> 00:29:15.887 align:center line:-1 position:50% size:20%
and incomplete
landslide data.

00:29:15.887 --> 00:29:17.489 align:center line:-1 position:50% size:27%
This is gonna be
a little bit theoretical,

00:29:17.489 --> 00:29:20.192 align:center line:-1 position:50% size:29%
but I hope it kind of
drives home the point.

00:29:20.192 --> 00:29:22.027 align:center line:-1 position:50% size:25%
We have imprecise
input data.

00:29:22.027 --> 00:29:24.563 align:center line:-1 position:50% size:31%
The very nature
of the data that we have

00:29:24.563 --> 00:29:26.365 align:center line:-1 position:50% size:35%
across the U.S. for creating
susceptibility maps

00:29:26.365 --> 00:29:28.233 align:center line:-1 position:50% size:16%
is imprecise.

00:29:28.233 --> 00:29:29.701 align:center line:-1 position:50% size:33%
You feed it into the model,

00:29:29.701 --> 00:29:34.039 align:center line:-1 position:50% size:33%
you want the outputs
to reflect that imprecision.

00:29:34.039 --> 00:29:37.676 align:center line:-1 position:50% size:36%
But by using
very high resolution rasters,

00:29:37.676 --> 00:29:42.114 align:center line:-1 position:50% size:30%
you're actually showing
a precise output

00:29:42.114 --> 00:29:45.450 align:center line:-1 position:50% size:31%
that doesn't really match
the input data.

00:29:45.450 --> 00:29:48.353 align:center line:-1 position:50% size:36%
A metaphor that I thought of
is that it's kind of like

00:29:48.353 --> 00:29:51.323 align:center line:-1 position:50% size:30%
putting a go-kart engine
in a pickup truck.

00:29:51.323 --> 00:29:53.692 align:center line:-1 position:50% size:33%
There's an incompatibility
there.

00:29:53.692 --> 00:29:55.427 align:center line:-1 position:50% size:21%
If you try to use
this pickup truck

00:29:55.427 --> 00:29:57.996 align:center line:-1 position:50% size:26%
to go and tow a boat
or whatever,

00:29:57.996 --> 00:29:59.431 align:center line:-1 position:50% size:37%
it's not gonna work very well,

00:29:59.431 --> 00:30:01.900 align:center line:-1 position:50% size:20%
it's just gonna
crash and burn.

00:30:01.900 --> 00:30:03.735 align:center line:-1 position:50% size:31%
That's kind of equivalent
to trying to get

00:30:03.735 --> 00:30:07.606 align:center line:-1 position:50% size:26%
super-detailed,
very precise outputs

00:30:07.606 --> 00:30:08.874 align:center line:-1 position:50% size:38%
of these susceptibility models

00:30:08.874 --> 00:30:11.410 align:center line:-1 position:50% size:28%
when the input data
really is kind of limited

00:30:11.410 --> 00:30:15.247 align:center line:-1 position:50% size:28%
because it's imprecise
by nature.

00:30:15.247 --> 00:30:18.784 align:center line:-1 position:50% size:32%
To further dive into
explicitly how slope units

00:30:18.784 --> 00:30:21.119 align:center line:-1 position:50% size:27%
account for
these first two points,

00:30:21.119 --> 00:30:22.754 align:center line:-1 position:50% size:28%
I'm going to show
a couple of examples.

00:30:22.754 --> 00:30:26.024 align:center line:-1 position:50% size:36%
The first one is how to deal
with heterogeneous formats

00:30:26.024 --> 00:30:29.127 align:center line:-1 position:50% size:26%
and then inaccurate
landslide locations.

00:30:29.127 --> 00:30:30.962 align:center line:-1 position:50% size:35%
The grid-based approach is
the most common way

00:30:30.962 --> 00:30:33.765 align:center line:-1 position:50% size:31%
of dealing with
this variation in formats.

00:30:33.765 --> 00:30:38.170 align:center line:-1 position:50% size:31%
It's to convert
the polygons into points,

00:30:38.170 --> 00:30:40.305 align:center line:-1 position:50% size:40%
so therefore, they're compatible
with the other points.

00:30:40.305 --> 00:30:42.274 align:center line:-1 position:50% size:38%
Again, this is really applicable
for large regions

00:30:42.274 --> 00:30:46.645 align:center line:-1 position:50% size:30%
that we're talking about
through this whole talk.

00:30:46.645 --> 00:30:48.647 align:center line:-1 position:50% size:23%
You can imagine
that by converting

00:30:48.647 --> 00:30:51.717 align:center line:-1 position:50% size:34%
these whole polygons
into a representative point,

00:30:51.717 --> 00:30:55.454 align:center line:-1 position:50% size:24%
you're losing
a lot of information

00:30:55.454 --> 00:30:56.855 align:center line:-1 position:50% size:37%
that would be available to us

00:30:56.855 --> 00:30:58.690 align:center line:-1 position:50% size:28%
if we were able to use
the whole polygon,

00:30:58.690 --> 00:31:00.559 align:center line:-1 position:50% size:36%
but that's really not tractable

00:31:00.559 --> 00:31:05.063 align:center line:-1 position:50% size:26%
when you're dealing
with large domains.

00:31:05.063 --> 00:31:07.099 align:center line:-1 position:50% size:30%
By contrast,
the slope unit approach

00:31:07.099 --> 00:31:09.534 align:center line:-1 position:50% size:36%
converts everything to areas
rather than points.

00:31:09.534 --> 00:31:12.704 align:center line:-1 position:50% size:34%
That is, any landslide point
or landslide polygon

00:31:12.704 --> 00:31:15.807 align:center line:-1 position:50% size:22%
that intersects
a slope unit area

00:31:15.807 --> 00:31:20.378 align:center line:-1 position:50% size:33%
would then be mapped
as containing a landslide,

00:31:20.378 --> 00:31:23.815 align:center line:-1 position:50% size:33%
and then you can develop
your model based off that.

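NOTE
Editor's sketch, not from the talk: a simplified version of the "any
intersecting record marks the slope unit" rule. It assumes slope units are
stored as a grid of unit IDs, point records are (row, col) cells, and
polygon records are sets of covered cells. A real workflow would use a GIS
overlay; all names and data here are hypothetical.
```python
def flag_slope_units(unit_grid, landslide_points, landslide_polygons):
    """Return {unit_id: bool}, True if any landslide record touches the unit."""
    has_landslide = {uid: False for row in unit_grid for uid in row}
    # A point record flags the single unit it falls in.
    for r, c in landslide_points:
        has_landslide[unit_grid[r][c]] = True
    # A polygon record (a set of covered cells) flags every unit it overlaps.
    for cells in landslide_polygons:
        for r, c in cells:
            has_landslide[unit_grid[r][c]] = True
    return has_landslide
# Hypothetical 3x4 grid of slope-unit IDs.
unit_grid = [
    [1, 1, 2, 2],
    [1, 3, 3, 2],
    [4, 4, 3, 2],
]
points = [(0, 0)]              # one point record, inside unit 1
polygons = [{(1, 2), (1, 3)}]  # one polygon record spanning units 3 and 2
flags = flag_slope_units(unit_grid, points, polygons)
print(flags)
```
Points and polygons end up in the same presence/absence form per slope
unit, so mixed-format inventories can feed one model without collapsing
polygons to representative points.
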
00:31:23.815 --> 00:31:25.884 align:center line:-1 position:50% size:37%
As far as landslide accuracy,

00:31:25.884 --> 00:31:27.719 align:center line:-1 position:50% size:27%
at least as far
as the position goes,

00:31:27.719 --> 00:31:31.189 align:center line:-1 position:50% size:33%
Jacobs et al. back in 2020
tackled this problem,

00:31:31.189 --> 00:31:33.391 align:center line:-1 position:50% size:30%
where they compared
grid-based approaches

00:31:33.391 --> 00:31:35.360 align:center line:-1 position:50% size:32%
to slope unit approaches,

00:31:35.360 --> 00:31:39.030 align:center line:-1 position:50% size:42%
and they measured how resilient
these methods were

00:31:39.030 --> 00:31:41.867 align:center line:-1 position:50% size:29%
for imprecise locations
of the landslides.

00:31:41.867 --> 00:31:43.268 align:center line:-1 position:50% size:22%
They did this by--

00:31:43.268 --> 00:31:45.003 align:center line:-1 position:50% size:28%
they had polygons
for all their landslides,

00:31:45.003 --> 00:31:48.406 align:center line:-1 position:50% size:39%
and they then buffered
those polygons by a set amount

00:31:48.406 --> 00:31:50.475 align:center line:-1 position:50% size:21%
and randomly
sampled a point

00:31:50.475 --> 00:31:51.977 align:center line:-1 position:50% size:36%
within that buffered polygon.

00:31:51.977 --> 00:31:54.312 align:center line:-1 position:50% size:34%
That's basically replicating
what it'd be like

00:31:54.312 --> 00:31:57.415 align:center line:-1 position:50% size:30%
if you had slightly
imprecise or inaccurate

00:31:57.415 --> 00:31:59.284 align:center line:-1 position:50% size:19%
landslide data.

00:31:59.284 --> 00:32:01.253 align:center line:-1 position:50% size:35%
That's in contrast
to these two other methods

00:32:01.253 --> 00:32:06.291 align:center line:-1 position:50% size:36%
where you sampled
within the landslide polygon.

00:32:06.291 --> 00:32:07.559 align:center line:-1 position:50% size:27%
What they found was
that they got

00:32:07.559 --> 00:32:10.195 align:center line:-1 position:50% size:29%
a larger drop
in model performance.

00:32:10.195 --> 00:32:12.430 align:center line:-1 position:50% size:29%
Here we have the area
under the curve metric

00:32:12.430 --> 00:32:15.534 align:center line:-1 position:50% size:27%
using the pixel-based
approaches,

00:32:15.534 --> 00:32:17.269 align:center line:-1 position:50% size:34%
compared to the slope unit
approaches,

00:32:17.269 --> 00:32:19.704 align:center line:-1 position:50% size:33%
where, here on the right,
you can see that the drop

00:32:19.704 --> 00:32:24.509 align:center line:-1 position:50% size:28%
in model performance
is less drastic.

00:32:24.509 --> 00:32:26.645 align:center line:-1 position:50% size:27%
If slope units have
all these advantages,

00:32:26.645 --> 00:32:29.514 align:center line:-1 position:50% size:31%
hopefully, I've illustrated
the advantages to you

00:32:29.514 --> 00:32:31.550 align:center line:-1 position:50% size:38%
and you're a little bit
convinced that these might be

00:32:31.550 --> 00:32:33.618 align:center line:-1 position:50% size:38%
something worth entertaining,

00:32:33.618 --> 00:32:35.854 align:center line:-1 position:50% size:29%
why don't more people
use these?

00:32:35.854 --> 00:32:37.322 align:center line:-1 position:50% size:34%
Well, after I was convinced

00:32:37.322 --> 00:32:39.157 align:center line:-1 position:50% size:25%
that slope units are
the way forward,

00:32:39.157 --> 00:32:41.827 align:center line:-1 position:50% size:39%
at least for the size of domains
we're interested in

00:32:41.827 --> 00:32:44.629 align:center line:-1 position:50% size:28%
and the data we have
available to us,

00:32:44.629 --> 00:32:46.531 align:center line:-1 position:50% size:29%
I ran into the problem
that the current methods

00:32:46.531 --> 00:32:49.167 align:center line:-1 position:50% size:32%
for delineating these
are either very laborious,

00:32:49.167 --> 00:32:52.070 align:center line:-1 position:50% size:29%
you can manually
delineate these things,

00:32:52.070 --> 00:32:55.340 align:center line:-1 position:50% size:41%
very slow, and/or
if you use an automatic method,

00:32:55.340 --> 00:32:57.642 align:center line:-1 position:50% size:18%
most of them,

00:32:57.642 --> 00:33:01.146 align:center line:-1 position:50% size:28%
the size of the slope unit
is arbitrarily set.

00:33:01.146 --> 00:33:04.616 align:center line:-1 position:50% size:37%
There are examples of those
that are not arbitrary.

00:33:04.616 --> 00:33:07.652 align:center line:-1 position:50% size:35%
They do parameter sweeps
that are very, very slow

00:33:07.652 --> 00:33:09.688 align:center line:-1 position:50% size:30%
for optimizing the scale
of these slope units.

00:33:09.688 --> 00:33:12.123 align:center line:-1 position:50% size:25%
So, there isn't really
anything out there

00:33:12.123 --> 00:33:15.827 align:center line:-1 position:50% size:33%
that is fast and effective
at delineating slope units.

00:33:15.827 --> 00:33:19.631 align:center line:-1 position:50% size:33%
So we wrote an algorithm
that I'm calling SUMak,

00:33:19.631 --> 00:33:21.132 align:center line:-1 position:50% size:26%
or Slope Unit Maker,

00:33:21.132 --> 00:33:22.968 align:center line:-1 position:50% size:22%
and it overcomes
these challenges

00:33:22.968 --> 00:33:25.337 align:center line:-1 position:50% size:30%
of the past approaches.

00:33:25.337 --> 00:33:26.504 align:center line:-1 position:50% size:23%
First, it's very fast.

00:33:26.504 --> 00:33:28.673 align:center line:-1 position:50% size:26%
It automatically runs
in parallel.

00:33:28.673 --> 00:33:30.508 align:center line:-1 position:50% size:21%
You can throw it
on a cluster,

00:33:30.508 --> 00:33:31.843 align:center line:-1 position:50% size:30%
and it's parameter free.

00:33:31.843 --> 00:33:34.012 align:center line:-1 position:50% size:35%
There isn't anything
that a user needs to dictate

00:33:34.012 --> 00:33:35.747 align:center line:-1 position:50% size:26%
within this algorithm.

00:33:35.747 --> 00:33:38.817 align:center line:-1 position:50% size:34%
It automatically adjusts
the scale of the slope units

00:33:38.817 --> 00:33:41.486 align:center line:-1 position:50% size:27%
according to the local
terrain morphology.

00:33:41.486 --> 00:33:43.321 align:center line:-1 position:50% size:32%
I'll get into more explicitly
how that works

00:33:43.321 --> 00:33:46.791 align:center line:-1 position:50% size:26%
in the coming slides.

00:33:46.791 --> 00:33:48.326 align:center line:-1 position:50% size:40%
There's minimal preprocessing.

00:33:48.326 --> 00:33:51.096 align:center line:-1 position:50% size:25%
You give it a DEM
and area of interest

00:33:51.096 --> 00:33:53.265 align:center line:-1 position:50% size:20%
and it'll spit out
the slope units.

00:33:53.265 --> 00:33:54.799 align:center line:-1 position:50% size:24%
It's all open source
programming,

00:33:54.799 --> 00:33:58.036 align:center line:-1 position:50% size:37%
GRASS and R and TauDEM,

00:33:58.036 --> 00:34:02.173 align:center line:-1 position:50% size:30%
so a user could adapt it
to their needs.

00:34:02.173 --> 00:34:05.076 align:center line:-1 position:50% size:39%
How does this algorithm work?

00:34:05.076 --> 00:34:09.247 align:center line:-1 position:50% size:37%
It uses something referred to
as a constant drop law.

00:34:09.247 --> 00:34:12.017 align:center line:-1 position:50% size:32%
A constant drop law says

00:34:12.017 --> 00:34:14.452 align:center line:-1 position:50% size:30%
as you go up
Strahler stream orders,

00:34:14.452 --> 00:34:18.390 align:center line:-1 position:50% size:38%
shown here in this figure, left,

00:34:18.390 --> 00:34:20.825 align:center line:-1 position:50% size:35%
if you have
a fluvial dominated system,

00:34:20.825 --> 00:34:22.994 align:center line:-1 position:50% size:29%
where the morphology
is controlled

00:34:22.994 --> 00:34:24.963 align:center line:-1 position:50% size:26%
by fluvial processes,

00:34:24.963 --> 00:34:27.532 align:center line:-1 position:50% size:42%
you expect that as you go across
these Strahler stream orders,

00:34:27.532 --> 00:34:30.936 align:center line:-1 position:50% size:34%
you would get a consistent
drop in elevation

00:34:30.936 --> 00:34:33.204 align:center line:-1 position:50% size:38%
between these stream orders.

00:34:33.204 --> 00:34:35.106 align:center line:-1 position:50% size:25%
Once you get
to the stream order

00:34:35.106 --> 00:34:38.243 align:center line:-1 position:50% size:28%
that you get
a statistical difference

00:34:38.243 --> 00:34:40.111 align:center line:-1 position:50% size:30%
in the drop in elevation,

00:34:40.111 --> 00:34:42.380 align:center line:-1 position:50% size:30%
it's been shown
by Tarboton and others

00:34:42.380 --> 00:34:45.684 align:center line:-1 position:50% size:34%
that you're no longer in
a fluvial dominated system

00:34:45.684 --> 00:34:49.621 align:center line:-1 position:50% size:38%
but rather
a hillslope dominated system.

00:34:49.621 --> 00:34:52.557 align:center line:-1 position:50% size:36%
So what we do is that we go
through different thresholds

00:34:52.557 --> 00:34:55.026 align:center line:-1 position:50% size:28%
for flow accumulation,

00:34:55.026 --> 00:34:59.898 align:center line:-1 position:50% size:38%
then we find where
this constant drop law breaks.

00:34:59.898 --> 00:35:00.999 align:center line:-1 position:50% size:28%
And right at that point,

00:35:00.999 --> 00:35:02.400 align:center line:-1 position:50% size:40%
we delineate these watersheds.

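NOTE
Editor's sketch, not from the talk and not the SUMak source code: one
plausible reading of the threshold search, loosely modeled on a stream
drop analysis. It assumes that for each candidate flow-accumulation
threshold you already have the elevation drops of first-order and of
higher-order Strahler streams, and that |t| below 2 is taken as "no
significant difference". The cutoff, function names, and numbers are all
hypothetical.
```python
from statistics import mean, variance
def t_statistic(a, b):
    """Welch t statistic for the difference in mean drop between two stream groups."""
    se = (variance(a) / len(a) + variance(b) / len(b)) ** 0.5
    return (mean(a) - mean(b)) / se
def pick_threshold(drops_by_threshold, t_crit=2.0):
    """Smallest threshold at which first-order and higher-order stream drops
    are not significantly different, i.e. the constant drop law still holds."""
    for threshold in sorted(drops_by_threshold):
        first_order, higher_order = drops_by_threshold[threshold]
        if abs(t_statistic(first_order, higher_order)) < t_crit:
            return threshold
    return None  # the law never held for the thresholds tried
# Hypothetical drops in meters: threshold mapped to (first-order, higher-order).
drops_by_threshold = {
    100: ([20.0, 22.0, 18.0, 21.0], [60.0, 58.0, 62.0, 61.0]),
    500: ([40.0, 45.0, 38.0, 42.0], [60.0, 57.0, 63.0, 59.0]),
    1000: ([58.0, 61.0, 57.0, 62.0], [60.0, 59.0, 61.0, 58.0]),
}
print(pick_threshold(drops_by_threshold))
```
Below the selected threshold the first-order drops differ significantly
from the rest, which in this reading marks the shift from fluvial to
hillslope-dominated terrain.
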
00:35:02.400 --> 00:35:04.402 align:center line:-1 position:50% size:28%
That's basically
the largest watershed

00:35:04.402 --> 00:35:07.505 align:center line:-1 position:50% size:25%
that captures
hillslope processes,

00:35:07.505 --> 00:35:10.675 align:center line:-1 position:50% size:40%
then we split these watersheds
by their longest flow paths.

00:35:10.675 --> 00:35:13.678 align:center line:-1 position:50% size:40%
So this output is what would be
recognized in the field

00:35:13.678 --> 00:35:16.114 align:center line:-1 position:50% size:14%
as a slope.

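NOTE A minimal sketch of the threshold search described in the cues above, not the SUMak implementation. It assumes the elevation drops of first-order and higher-order streams have already been extracted for each candidate flow accumulation threshold, and it treats the constant drop law as holding when a Welch t-statistic satisfies |t| < 2. The function names, the cutoff, and the example numbers are all hypothetical.
```python
from math import sqrt
from statistics import mean, variance
def welch_t(a, b):
    """Welch t-statistic for the difference in mean drop between two groups."""
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se
def pick_threshold(drops_by_threshold, t_crit=2.0):
    """Return the smallest threshold whose drops pass the constant drop test."""
    for threshold in sorted(drops_by_threshold):
        first_order, higher_order = drops_by_threshold[threshold]
        if abs(welch_t(first_order, higher_order)) < t_crit:
            return threshold
    return None  # no candidate threshold satisfied the test
# Hypothetical drops in meters, keyed by flow accumulation threshold in cells.
example = {
    100: ([12.0, 15.0, 11.0, 14.0], [41.0, 39.0, 44.0, 40.0]),
    500: ([30.0, 42.0, 35.0, 45.0], [40.0, 36.0, 43.0, 38.0]),
}
chosen = pick_threshold(example)
```
Under these assumptions, the smallest passing threshold stands in for the point where the constant drop law stops holding, which is where the talk says the watersheds are delineated.
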
00:35:16.114 --> 00:35:19.451 align:center line:-1 position:50% size:36%
Just to compare our method
to the other method

00:35:19.451 --> 00:35:20.785 align:center line:-1 position:50% size:29%
that's commonly used,

00:35:20.785 --> 00:35:23.455 align:center line:-1 position:50% size:26%
it's a great algorithm
developed by Alvioli,

00:35:23.455 --> 00:35:25.457 align:center line:-1 position:50% size:34%
and it's called r.slopeunits.

00:35:25.457 --> 00:35:27.225 align:center line:-1 position:50% size:25%
It has really
laid the foundation,

00:35:27.225 --> 00:35:29.861 align:center line:-1 position:50% size:32%
really allowed slope units
to gain traction.

00:35:29.861 --> 00:35:32.430 align:center line:-1 position:50% size:22%
But it's very slow.

00:35:32.430 --> 00:35:34.632 align:center line:-1 position:50% size:24%
So we delineated
the island of Sicily,

00:35:34.632 --> 00:35:36.968 align:center line:-1 position:50% size:35%
which is an area
that was already delineated

00:35:36.968 --> 00:35:38.837 align:center line:-1 position:50% size:23%
using r.slopeunits,

00:35:38.837 --> 00:35:41.473 align:center line:-1 position:50% size:35%
and we wanted to see
how our algorithm

00:35:41.473 --> 00:35:42.574 align:center line:-1 position:50% size:25%
compared to theirs.

00:35:42.574 --> 00:35:44.542 align:center line:-1 position:50% size:25%
And you can see
the relative outputs.

00:35:44.542 --> 00:35:47.412 align:center line:-1 position:50% size:24%
TauDEM here
should be SUMak.

00:35:47.412 --> 00:35:49.514 align:center line:-1 position:50% size:33%
So ours are shown in red,
theirs is shown in blue.

00:35:49.514 --> 00:35:53.485 align:center line:-1 position:50% size:34%
There isn't a perfect match
between the two,

00:35:53.485 --> 00:35:56.321 align:center line:-1 position:50% size:24%
but if you look
at the whole island

00:35:56.321 --> 00:35:59.457 align:center line:-1 position:50% size:32%
and quantify a metric
that they use to optimize,

00:35:59.457 --> 00:36:02.794 align:center line:-1 position:50% size:30%
r.slopeunits uses
to optimize their model,

00:36:02.794 --> 00:36:04.129 align:center line:-1 position:50% size:19%
which is this V.

00:36:04.129 --> 00:36:06.464 align:center line:-1 position:50% size:33%
It's an area-normalized
local variance.

00:36:06.464 --> 00:36:07.732 align:center line:-1 position:50% size:28%
What that basically is,

00:36:07.732 --> 00:36:11.236 align:center line:-1 position:50% size:37%
it's a measure
of the homogeneity of aspect

00:36:11.236 --> 00:36:12.971 align:center line:-1 position:50% size:29%
within each slope unit,

00:36:12.971 --> 00:36:15.306 align:center line:-1 position:50% size:27%
since it's argued that
a good slope unit

00:36:15.306 --> 00:36:18.009 align:center line:-1 position:50% size:28%
should have
pretty uniform aspect.

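NOTE A minimal sketch of an area-weighted aspect variance in the spirit of the V metric described above. The talk does not give the exact definition used by r.slopeunits, so this assumes the circular variance of aspect within each slope unit, weighted by unit area. All names and numbers are hypothetical.
```python
from math import cos, hypot, radians, sin
def circular_variance(aspects_deg):
    """1 minus the mean resultant length; 0 means every cell faces the same way."""
    n = len(aspects_deg)
    s = sum(sin(radians(a)) for a in aspects_deg) / n
    c = sum(cos(radians(a)) for a in aspects_deg) / n
    return 1.0 - hypot(s, c)
def area_weighted_variance(units):
    """units: list of (area, [aspect in degrees for each cell]) pairs."""
    total_area = sum(area for area, _ in units)
    return sum(area * circular_variance(asp) for area, asp in units) / total_area
# Two hypothetical delineations: one with uniform aspect per unit, one mixed.
uniform = [(2.0, [90.0, 90.0, 90.0]), (1.0, [180.0, 180.0])]
mixed = [(2.0, [0.0, 180.0]), (1.0, [180.0, 180.0])]
v_uniform = area_weighted_variance(uniform)
v_mixed = area_weighted_variance(mixed)
```
Lower values indicate more homogeneous aspect within units, which is the property the metric is meant to reward.
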
00:36:18.009 --> 00:36:19.577 align:center line:-1 position:50% size:34%
These are the distributions
that we get.

00:36:19.577 --> 00:36:22.380 align:center line:-1 position:50% size:23%
So using SUMak,
using r.slopeunits,

00:36:22.380 --> 00:36:25.316 align:center line:-1 position:50% size:41%
there's quite a bit of overlap
between these two distributions,

00:36:25.316 --> 00:36:26.851 align:center line:-1 position:50% size:21%
suggesting that,

00:36:26.851 --> 00:36:29.454 align:center line:-1 position:50% size:20%
while theirs is
a little bit better

00:36:29.454 --> 00:36:32.424 align:center line:-1 position:50% size:25%
as far as this metric
is concerned,

00:36:32.424 --> 00:36:34.192 align:center line:-1 position:50% size:32%
ours is remarkably faster.

00:36:34.192 --> 00:36:36.928 align:center line:-1 position:50% size:34%
In fact,
across the country of Italy,

00:36:36.928 --> 00:36:39.264 align:center line:-1 position:50% size:34%
it would take our algorithm
about three days

00:36:39.264 --> 00:36:41.232 align:center line:-1 position:50% size:33%
to process the same area

00:36:41.232 --> 00:36:44.969 align:center line:-1 position:50% size:33%
that took r.slopeunits
three months to delineate,

00:36:44.969 --> 00:36:46.838 align:center line:-1 position:50% size:30%
and they had four times
the number of cores

00:36:46.838 --> 00:36:48.373 align:center line:-1 position:50% size:34%
and five times the memory
that we had.

00:36:48.373 --> 00:36:51.776 align:center line:-1 position:50% size:30%
I delineated this
just on my local laptop.

00:36:51.776 --> 00:36:56.948 align:center line:-1 position:50% size:26%
So, really, SUMak is
an effective means

00:36:56.948 --> 00:37:00.819 align:center line:-1 position:50% size:33%
of delineating large areas
for slope units,

00:37:00.819 --> 00:37:04.389 align:center line:-1 position:50% size:39%
and you get very good results.

00:37:04.389 --> 00:37:07.125 align:center line:-1 position:50% size:29%
Now I want to illustrate
the uses of slope units

00:37:07.125 --> 00:37:09.794 align:center line:-1 position:50% size:29%
and bring it back
to how it's able to deal

00:37:09.794 --> 00:37:12.363 align:center line:-1 position:50% size:34%
with imprecise data, again,

00:37:12.363 --> 00:37:14.766 align:center line:-1 position:50% size:32%
putting the go-kart motor
in a pickup truck.

00:37:14.766 --> 00:37:17.769 align:center line:-1 position:50% size:34%
We're going to try
to illustrate that point here.

00:37:17.769 --> 00:37:20.371 align:center line:-1 position:50% size:38%
We're going to do this
with three different locations--

00:37:20.371 --> 00:37:23.441 align:center line:-1 position:50% size:29%
two locations
from Western Oregon,

00:37:23.441 --> 00:37:26.878 align:center line:-1 position:50% size:42%
we have the Umpqua Watershed

00:37:26.878 --> 00:37:30.315 align:center line:-1 position:50% size:38%
and the Calapooia Watershed
in Western Oregon,

00:37:30.315 --> 00:37:34.719 align:center line:-1 position:50% size:36%
and also, we're going to use
the island of Puerto Rico.

00:37:34.719 --> 00:37:35.954 align:center line:-1 position:50% size:37%
For the Puerto Rico data set,

00:37:35.954 --> 00:37:38.156 align:center line:-1 position:50% size:32%
we used an event-based
landslide catalogue

00:37:38.156 --> 00:37:41.793 align:center line:-1 position:50% size:40%
from the 2017 Hurricane Maria.

00:37:41.793 --> 00:37:44.129 align:center line:-1 position:50% size:36%
We'll dig into
how slope units are effective

00:37:44.129 --> 00:37:49.334 align:center line:-1 position:50% size:31%
for event-based
susceptibility maps also.

00:37:49.334 --> 00:37:51.336 align:center line:-1 position:50% size:38%
As I said, we're going
to compare slope unit outputs

00:37:51.336 --> 00:37:52.670 align:center line:-1 position:50% size:28%
to grid-based outputs,

00:37:52.670 --> 00:37:56.007 align:center line:-1 position:50% size:34%
and because we're dealing
with variable formats

00:37:56.007 --> 00:37:58.376 align:center line:-1 position:50% size:33%
in these data sets,
both points and polygons,

00:37:58.376 --> 00:38:01.346 align:center line:-1 position:50% size:37%
we need to use some sort
of standardization procedure.

00:38:01.346 --> 00:38:05.717 align:center line:-1 position:50% size:26%
We use an array
of different methods.

00:38:05.717 --> 00:38:07.685 align:center line:-1 position:50% size:22%
First, using
a 10 meter DEM,

00:38:07.685 --> 00:38:10.288 align:center line:-1 position:50% size:25%
we take the highest
elevation point

00:38:10.288 --> 00:38:11.923 align:center line:-1 position:50% size:26%
within each polygon

00:38:11.923 --> 00:38:14.526 align:center line:-1 position:50% size:21%
and convert that
into a point.

00:38:14.526 --> 00:38:17.095 align:center line:-1 position:50% size:39%
We also take the middle value
of each polygon

00:38:17.095 --> 00:38:19.264 align:center line:-1 position:50% size:34%
and convert that
into a representative point.

00:38:19.264 --> 00:38:20.632 align:center line:-1 position:50% size:28%
Then we take
the highest point also,

00:38:20.632 --> 00:38:22.400 align:center line:-1 position:50% size:30%
using a 30 meter DEM,

00:38:22.400 --> 00:38:25.436 align:center line:-1 position:50% size:30%
to make a slightly
more conservative map

00:38:25.436 --> 00:38:29.340 align:center line:-1 position:50% size:35%
using larger mapping units.

00:38:29.340 --> 00:38:31.176 align:center line:-1 position:50% size:28%
Then also we're going
to take multiple points

00:38:31.176 --> 00:38:33.444 align:center line:-1 position:50% size:35%
from each polygon
using a 200 meter spacing,

00:38:33.444 --> 00:38:35.747 align:center line:-1 position:50% size:36%
again, with a 10 meter DEM

00:38:35.747 --> 00:38:38.116 align:center line:-1 position:50% size:32%
just to see if any of these
can approach

00:38:38.116 --> 00:38:41.519 align:center line:-1 position:50% size:43%
the model performance of
the slope unit-based approaches.

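NOTE A minimal sketch of the polygon-to-point standardization options listed above, assuming each landslide polygon has already been rasterized into (x, y, elevation) cells. Reading "middle value" as the median-elevation cell is an assumption, as is thinning to one cell per 200 meter block for the multi-point option. All names and numbers are hypothetical.
```python
def highest_point(cells):
    """Cell with the maximum elevation inside the polygon."""
    return max(cells, key=lambda c: c[2])
def middle_point(cells):
    """Cell whose elevation is the middle value among the polygon's cells."""
    ordered = sorted(cells, key=lambda c: c[2])
    return ordered[len(ordered) // 2]
def spaced_points(cells, spacing=200.0):
    """Keep the first cell found in each spacing-by-spacing block."""
    kept = {}
    for x, y, z in cells:
        key = (int(x // spacing), int(y // spacing))
        kept.setdefault(key, (x, y, z))
    return list(kept.values())
# Hypothetical 400 m by 200 m polygon sampled on a 10 m grid.
cells = [(x, y, 100 + x // 2 + y // 10)
         for x in range(0, 400, 10) for y in range(0, 200, 10)]
top = highest_point(cells)
mid = middle_point(cells)
thinned = spaced_points(cells)
```
Each function turns one polygon into one or more representative points, so point and polygon inventories end up in the same format.
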
00:38:41.519 --> 00:38:43.421 align:center line:-1 position:50% size:36%
For our models
that we're going to compare,

00:38:43.421 --> 00:38:44.489 align:center line:-1 position:50% size:24%
we're going to use
the frequentist

00:38:44.489 --> 00:38:46.157 align:center line:-1 position:50% size:25%
Logistic Regression
and XGBoost.

00:38:46.157 --> 00:38:48.259 align:center line:-1 position:50% size:36%
We use two different models
just so we can see

00:38:48.259 --> 00:38:49.527 align:center line:-1 position:50% size:23%
that we're getting
consistent results

00:38:49.527 --> 00:38:50.828 align:center line:-1 position:50% size:22%
between the two.

00:38:50.828 --> 00:38:53.031 align:center line:-1 position:50% size:31%
As far as predictor data,

00:38:53.031 --> 00:38:54.766 align:center line:-1 position:50% size:27%
we're going to keep it
very simple.

00:38:54.766 --> 00:38:57.569 align:center line:-1 position:50% size:31%
We're only using
elevation, slope, aspect,

00:38:57.569 --> 00:39:00.338 align:center line:-1 position:50% size:34%
roughness, and curvature,
as far as static variables.

00:39:00.338 --> 00:39:02.407 align:center line:-1 position:50% size:37%
For the Puerto Rico data set,

00:39:02.407 --> 00:39:04.709 align:center line:-1 position:50% size:34%
because it is
an event-based catalogue,

00:39:04.709 --> 00:39:08.313 align:center line:-1 position:50% size:34%
we're also going to include
a trigger parameter

00:39:08.313 --> 00:39:10.415 align:center line:-1 position:50% size:28%
that's soil moisture
using the SMAP data,

00:39:10.415 --> 00:39:11.916 align:center line:-1 position:50% size:33%
which is shown by people

00:39:11.916 --> 00:39:13.651 align:center line:-1 position:50% size:39%
who have looked more closely
at Puerto Rico

00:39:13.651 --> 00:39:17.989 align:center line:-1 position:50% size:39%
as being an effective measure
for susceptibility

00:39:17.989 --> 00:39:20.225 align:center line:-1 position:50% size:26%
for Hurricane Maria.

00:39:20.225 --> 00:39:22.560 align:center line:-1 position:50% size:36%
For the two slope unit maps,

00:39:22.560 --> 00:39:25.897 align:center line:-1 position:50% size:39%
we're going to develop models
using only the median values,

00:39:25.897 --> 00:39:28.566 align:center line:-1 position:50% size:29%
predictor values,
within each slope unit,

00:39:28.566 --> 00:39:29.867 align:center line:-1 position:50% size:37%
and then also another model

00:39:29.867 --> 00:39:33.071 align:center line:-1 position:50% size:34%
using the median and
standard deviation values.

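NOTE A minimal sketch of summarizing one predictor by slope unit, as described above. It assumes each raster cell already carries a slope unit ID, and it uses the population standard deviation, which the talk does not specify. All names and numbers are hypothetical.
```python
from collections import defaultdict
from statistics import median, pstdev
def zonal_stats(unit_ids, values):
    """Median and standard deviation of one predictor within each slope unit."""
    groups = defaultdict(list)
    for uid, value in zip(unit_ids, values):
        groups[uid].append(value)
    return {uid: (median(vals), pstdev(vals)) for uid, vals in groups.items()}
# Hypothetical slope values in degrees for seven cells in two slope units.
unit_ids = [1, 1, 1, 2, 2, 2, 2]
slope_deg = [10.0, 20.0, 30.0, 5.0, 5.0, 15.0, 15.0]
stats = zonal_stats(unit_ids, slope_deg)
```
The median-only model would use the first value of each pair, and the second model would use both.
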
00:39:33.071 --> 00:39:35.640 align:center line:-1 position:50% size:38%
Model evaluation, we're going
to use area under the curve.

00:39:35.640 --> 00:39:37.709 align:center line:-1 position:50% size:36%
We're also going to throw in
an additional metric,

00:39:37.709 --> 00:39:39.644 align:center line:-1 position:50% size:20%
which is called
the Brier score,

00:39:39.644 --> 00:39:41.212 align:center line:-1 position:50% size:30%
which is just
the mean squared error

00:39:41.212 --> 00:39:44.315 align:center line:-1 position:50% size:32%
between the observation
and our model outputs.

00:39:44.315 --> 00:39:46.517 align:center line:-1 position:50% size:36%
We want lower values there.

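NOTE A minimal sketch of the two evaluation metrics named above, written in plain Python rather than with any particular library. The Brier score is the mean squared error between the 0/1 observations and the predicted probabilities. The area under the curve is computed here by pairwise comparison of landslide and non-landslide cases. The example labels and probabilities are hypothetical.
```python
def brier_score(y_true, y_prob):
    """Mean squared error between observations and probabilities; lower is better."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)
def auc(y_true, y_prob):
    """Chance that a random positive outranks a random negative; ties count half."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
y_true = [1, 0, 1, 0]
y_prob = [0.9, 0.2, 0.6, 0.7]
brier = brier_score(y_true, y_prob)
area = auc(y_true, y_prob)
```
For these four hypothetical cases the Brier score is 0.175 and the area under the curve is 0.75.
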
00:39:46.517 --> 00:39:47.785 align:center line:-1 position:50% size:31%
Let's look at the outputs,

00:39:47.785 --> 00:39:50.788 align:center line:-1 position:50% size:36%
let's look at the results here.

00:39:50.788 --> 00:39:54.359 align:center line:-1 position:50% size:43%
Up here at the top we have
the two raster-based approaches.

00:39:54.359 --> 00:39:57.061 align:center line:-1 position:50% size:41%
I'm showing the best-performing
results here

00:39:57.061 --> 00:39:58.396 align:center line:-1 position:50% size:28%
from the raster-based
approaches.

00:39:58.396 --> 00:40:00.665 align:center line:-1 position:50% size:40%
On the bottom,
we have the slope unit outputs.

00:40:00.665 --> 00:40:04.702 align:center line:-1 position:50% size:40%
These are outputs using
median and standard deviation

00:40:04.702 --> 00:40:07.071 align:center line:-1 position:50% size:24%
of each slope unit.

00:40:07.071 --> 00:40:08.539 align:center line:-1 position:50% size:32%
A few observations here.

00:40:08.539 --> 00:40:12.277 align:center line:-1 position:50% size:37%
Generally, they both highlight
similar locations

00:40:12.277 --> 00:40:17.282 align:center line:-1 position:50% size:35%
of having both low and high
susceptibility values.

00:40:17.282 --> 00:40:21.019 align:center line:-1 position:50% size:37%
But the raster-based
approaches are able to show

00:40:21.019 --> 00:40:24.222 align:center line:-1 position:50% size:27%
a lot more variability
in these probabilities

00:40:24.222 --> 00:40:26.057 align:center line:-1 position:50% size:29%
across these domains.

00:40:26.057 --> 00:40:28.359 align:center line:-1 position:50% size:30%
While this level of detail
can be advantageous,

00:40:28.359 --> 00:40:32.664 align:center line:-1 position:50% size:42%
the high-resolution rasters
may require too much detail

00:40:32.664 --> 00:40:35.967 align:center line:-1 position:50% size:33%
from the models trained
with imprecise input data.

00:40:35.967 --> 00:40:38.569 align:center line:-1 position:50% size:42%
This can lead to poor confidence
in the model outputs.

00:40:38.569 --> 00:40:41.839 align:center line:-1 position:50% size:31%
Again, these maps were
created with data

00:40:41.839 --> 00:40:45.610 align:center line:-1 position:50% size:36%
with heterogeneous formats,
points and polygons,

00:40:45.610 --> 00:40:48.479 align:center line:-1 position:50% size:41%
somewhat inaccurate, probably,
locations.

00:40:48.479 --> 00:40:50.948 align:center line:-1 position:50% size:36%
There's no timing
with any of these landslides.

00:40:50.948 --> 00:40:54.352 align:center line:-1 position:50% size:37%
There's no real measure
of the level of completeness.

00:40:54.352 --> 00:41:00.591 align:center line:-1 position:50% size:37%
Once again, I don't even
really know what that means.

00:41:00.591 --> 00:41:02.093 align:center line:-1 position:50% size:25%
There's no way
of really measuring

00:41:02.093 --> 00:41:05.163 align:center line:-1 position:50% size:38%
how complete an inventory is.

00:41:05.163 --> 00:41:06.964 align:center line:-1 position:50% size:40%
In contrast,
the slope unit-based approach,

00:41:06.964 --> 00:41:09.434 align:center line:-1 position:50% size:28%
because it uses
a larger mapping unit,

00:41:09.434 --> 00:41:11.269 align:center line:-1 position:50% size:29%
it's more conservative,

00:41:11.269 --> 00:41:12.337 align:center line:-1 position:50% size:24%
and in my opinion,

00:41:12.337 --> 00:41:13.938 align:center line:-1 position:50% size:30%
it's able
to better accommodate

00:41:13.938 --> 00:41:16.240 align:center line:-1 position:50% size:26%
the imprecise nature
of this input data--

00:41:16.240 --> 00:41:23.114 align:center line:-1 position:50% size:33%
again, the go-kart engine
inside of the pickup truck.

00:41:23.114 --> 00:41:26.651 align:center line:-1 position:50% size:30%
That's what we're trying
to not have here.

00:41:26.651 --> 00:41:29.587 align:center line:-1 position:50% size:31%
Just to further compare
the Umpqua Watershed,

00:41:29.587 --> 00:41:32.156 align:center line:-1 position:50% size:28%
here's the distribution
of known landslides

00:41:32.156 --> 00:41:34.225 align:center line:-1 position:50% size:22%
over this domain.

00:41:34.225 --> 00:41:36.661 align:center line:-1 position:50% size:34%
You can see that
the landslide deposits here

00:41:36.661 --> 00:41:39.230 align:center line:-1 position:50% size:37%
cover almost the whole area.

00:41:39.230 --> 00:41:42.200 align:center line:-1 position:50% size:18%
In my opinion,

00:41:42.200 --> 00:41:45.703 align:center line:-1 position:50% size:40%
what you would expect a useful
susceptibility map to do,

00:41:45.703 --> 00:41:47.438 align:center line:-1 position:50% size:30%
at least over the scales
we're interested in here

00:41:47.438 --> 00:41:48.806 align:center line:-1 position:50% size:25%
with our input data,

00:41:48.806 --> 00:41:51.242 align:center line:-1 position:50% size:22%
is to model
most of this area

00:41:51.242 --> 00:41:53.244 align:center line:-1 position:50% size:35%
as being highly susceptible.

00:41:53.244 --> 00:41:54.879 align:center line:-1 position:50% size:31%
Looking at
the Calapooia data sets,

00:41:54.879 --> 00:41:56.280 align:center line:-1 position:50% size:29%
we see a similar thing.

00:41:56.280 --> 00:41:58.015 align:center line:-1 position:50% size:37%
Again, here's
the distribution of landslides,

00:41:58.015 --> 00:42:00.051 align:center line:-1 position:50% size:41%
here are the two model outputs.

00:42:00.051 --> 00:42:03.654 align:center line:-1 position:50% size:37%
If we want
a more quantitative measure

00:42:03.654 --> 00:42:05.757 align:center line:-1 position:50% size:24%
of this effect,
here we've plotted

00:42:05.757 --> 00:42:09.227 align:center line:-1 position:50% size:29%
the percent area
against the probability.

00:42:09.227 --> 00:42:10.895 align:center line:-1 position:50% size:26%
If we look
at the Umpqua data,

00:42:10.895 --> 00:42:13.464 align:center line:-1 position:50% size:36%
the two slope unit-based
approaches are shown here

00:42:13.464 --> 00:42:14.832 align:center line:-1 position:50% size:26%
in green and yellow.

00:42:14.832 --> 00:42:19.404 align:center line:-1 position:50% size:40%
You can see that it gets a much
higher proportion of the area

00:42:19.404 --> 00:42:23.474 align:center line:-1 position:50% size:28%
at higher probabilities,

00:42:23.474 --> 00:42:25.309 align:center line:-1 position:50% size:35%
in contrast to
the grid-based approaches,

00:42:25.309 --> 00:42:27.412 align:center line:-1 position:50% size:28%
where, really,
the highest proportion

00:42:27.412 --> 00:42:29.080 align:center line:-1 position:50% size:16%
is around 0.5.

00:42:29.080 --> 00:42:32.150 align:center line:-1 position:50% size:24%
It's more nebulous
in that sense.

00:42:32.150 --> 00:42:34.852 align:center line:-1 position:50% size:22%
At the Calapooia,
it's less stark,

00:42:34.852 --> 00:42:37.088 align:center line:-1 position:50% size:28%
but you can also see
kind of a similar thing,

00:42:37.088 --> 00:42:40.091 align:center line:-1 position:50% size:39%
where both the highs and lows
are more extreme

00:42:40.091 --> 00:42:41.692 align:center line:-1 position:50% size:32%
with the slope unit-based
approaches

00:42:41.692 --> 00:42:44.061 align:center line:-1 position:50% size:32%
as opposed to the raster.

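NOTE A minimal sketch of the percent-area-against-probability comparison described above. It assumes each mapping unit, whether a grid cell or a slope unit, has an area and a modeled probability, and it uses ten equal-width bins. The bin count, names, and numbers are hypothetical.
```python
def percent_area_by_probability(areas, probs, n_bins=10):
    """Percent of total mapped area falling in each equal-width probability bin."""
    totals = [0.0] * n_bins
    for area, p in zip(areas, probs):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        totals[idx] += area
    grand_total = sum(totals)
    return [100.0 * t / grand_total for t in totals]
# Three hypothetical mapping units with areas in square kilometers.
areas = [2.0, 1.0, 1.0]
probs = [0.95, 0.55, 0.05]
result = percent_area_by_probability(areas, probs)
```
A map that puts most of its area in the highest and lowest bins is making more decisive calls than one that piles up around 0.5.
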
00:42:44.061 --> 00:42:45.630 align:center line:-1 position:50% size:34%
Let's look at
the event-based catalogue

00:42:45.630 --> 00:42:47.899 align:center line:-1 position:50% size:23%
from Puerto Rico.

00:42:47.899 --> 00:42:50.334 align:center line:-1 position:50% size:28%
Here's the distribution
of landslides

00:42:50.334 --> 00:42:51.936 align:center line:-1 position:50% size:28%
from Hurricane Maria.

00:42:51.936 --> 00:42:53.971 align:center line:-1 position:50% size:34%
At the top, we have
the raster-based approach

00:42:53.971 --> 00:42:55.873 align:center line:-1 position:50% size:30%
using a 30 meter DEM,

00:42:55.873 --> 00:42:57.041 align:center line:-1 position:50% size:37%
and then down at the bottom,

00:42:57.041 --> 00:42:59.110 align:center line:-1 position:50% size:40%
we have
the slope unit-based approach.

00:42:59.110 --> 00:43:02.713 align:center line:-1 position:50% size:28%
I think we're seeing
the same things here.

00:43:02.713 --> 00:43:06.617 align:center line:-1 position:50% size:36%
Looking at the percent area,
once again,

00:43:06.617 --> 00:43:09.287 align:center line:-1 position:50% size:38%
it's less stark here than it was
for the Umpqua Watershed,

00:43:09.287 --> 00:43:11.622 align:center line:-1 position:50% size:23%
but you do still get
a higher proportion

00:43:11.622 --> 00:43:12.990 align:center line:-1 position:50% size:33%
with the slope unit-based,

00:43:12.990 --> 00:43:15.092 align:center line:-1 position:50% size:41%
as opposed to the raster-based.

00:43:15.092 --> 00:43:18.763 align:center line:-1 position:50% size:38%
Looking at the model metrics,
area under the curve,

00:43:18.763 --> 00:43:21.466 align:center line:-1 position:50% size:37%
here we have all the different
grid-based approaches

00:43:21.466 --> 00:43:22.967 align:center line:-1 position:50% size:17%
that we used.

00:43:22.967 --> 00:43:24.402 align:center line:-1 position:50% size:35%
That is the different
standardization procedures

00:43:24.402 --> 00:43:27.905 align:center line:-1 position:50% size:34%
we did to get the polygons
and points consistent.

00:43:27.905 --> 00:43:31.275 align:center line:-1 position:50% size:35%
In bold here, we have
the slope unit-based maps.

00:43:31.275 --> 00:43:33.444 align:center line:-1 position:50% size:33%
We're looking at
Logistic Regression in red

00:43:33.444 --> 00:43:36.981 align:center line:-1 position:50% size:27%
and XGBoost in blue.

00:43:36.981 --> 00:43:39.917 align:center line:-1 position:50% size:29%
Hopefully, you can see
that generally,

00:43:39.917 --> 00:43:42.453 align:center line:-1 position:50% size:35%
across each of the domains
that we studied,

00:43:42.453 --> 00:43:43.955 align:center line:-1 position:50% size:28%
you get a higher
area under the curve,

00:43:43.955 --> 00:43:46.491 align:center line:-1 position:50% size:20%
which suggests
better model fit

00:43:46.491 --> 00:43:50.094 align:center line:-1 position:50% size:33%
when you use slope units,

00:43:50.094 --> 00:43:54.499 align:center line:-1 position:50% size:26%
in contrast
to grid-based maps.

00:43:54.499 --> 00:43:57.101 align:center line:-1 position:50% size:38%
Brier score is another metric--
again, mean squared error,

00:43:57.101 --> 00:43:59.737 align:center line:-1 position:50% size:36%
so we want low error values.

00:43:59.737 --> 00:44:04.876 align:center line:-1 position:50% size:42%
You can see that in bold we have
lower Brier scores generally

00:44:04.876 --> 00:44:06.177 align:center line:-1 position:50% size:32%
with the slope unit-based
approaches,

00:44:06.177 --> 00:44:08.746 align:center line:-1 position:50% size:37%
in contrast to
the raster-based approaches

00:44:08.746 --> 00:44:12.517 align:center line:-1 position:50% size:32%
for both XGBoost
and Logistic Regression.

00:44:12.517 --> 00:44:15.052 align:center line:-1 position:50% size:32%
Wrapping up this section,

00:44:15.052 --> 00:44:17.088 align:center line:-1 position:50% size:37%
hopefully, I've convinced you
that slope units are

00:44:17.088 --> 00:44:18.422 align:center line:-1 position:50% size:28%
a powerful alternative

00:44:18.422 --> 00:44:19.557 align:center line:-1 position:50% size:26%
when you're dealing
with the scales

00:44:19.557 --> 00:44:22.260 align:center line:-1 position:50% size:29%
that we're dealing with,
very large regions.

00:44:22.260 --> 00:44:23.895 align:center line:-1 position:50% size:30%
Again, this might not be
as applicable

00:44:23.895 --> 00:44:27.465 align:center line:-1 position:50% size:35%
when you're trying
to analyze the susceptibility

00:44:27.465 --> 00:44:28.666 align:center line:-1 position:50% size:25%
over a smaller area

00:44:28.666 --> 00:44:30.034 align:center line:-1 position:50% size:25%
and you might have
more detailed data.

00:44:30.034 --> 00:44:31.502 align:center line:-1 position:50% size:28%
But the scales
we're looking at here,

00:44:31.502 --> 00:44:33.070 align:center line:-1 position:50% size:16%
large areas,

00:44:33.070 --> 00:44:35.339 align:center line:-1 position:50% size:25%
it is a very powerful
alternative.

00:44:35.339 --> 00:44:37.808 align:center line:-1 position:50% size:26%
It's able to provide
an effective solution

00:44:37.808 --> 00:44:42.513 align:center line:-1 position:50% size:38%
for inconsistent and imprecise
landslide data.

00:44:42.513 --> 00:44:44.382 align:center line:-1 position:50% size:41%
It increases the model accuracy,

00:44:44.382 --> 00:44:47.585 align:center line:-1 position:50% size:29%
as shown by
the two metrics before.

00:44:47.585 --> 00:44:51.355 align:center line:-1 position:50% size:38%
The conservative output
better matches the input data,

00:44:51.355 --> 00:44:53.858 align:center line:-1 position:50% size:37%
so you're keeping the go-kart
a go-kart,

00:44:53.858 --> 00:44:58.195 align:center line:-1 position:50% size:35%
you're not trying to put your
go-kart engine in the truck.

00:44:58.195 --> 00:45:01.499 align:center line:-1 position:50% size:27%
Let's wrap things up.

00:45:01.499 --> 00:45:03.234 align:center line:-1 position:50% size:18%
In conclusion,

00:45:03.234 --> 00:45:05.303 align:center line:-1 position:50% size:35%
developing a meaningful
landslide susceptibility map

00:45:05.303 --> 00:45:06.737 align:center line:-1 position:50% size:23%
over large regions

00:45:06.737 --> 00:45:08.306 align:center line:-1 position:50% size:33%
is a very difficult problem.

00:45:08.306 --> 00:45:10.174 align:center line:-1 position:50% size:31%
We've investigated here

00:45:10.174 --> 00:45:12.643 align:center line:-1 position:50% size:35%
where susceptibility models
can be applied.

00:45:12.643 --> 00:45:14.045 align:center line:-1 position:50% size:23%
We've found that,

00:45:14.045 --> 00:45:16.747 align:center line:-1 position:50% size:35%
at least over the study sites
that we looked at,

00:45:16.747 --> 00:45:19.684 align:center line:-1 position:50% size:35%
every region needs to have
some sort of representation

00:45:19.684 --> 00:45:22.453 align:center line:-1 position:50% size:28%
in your training model

00:45:22.453 --> 00:45:27.258 align:center line:-1 position:50% size:37%
for it to be applicable over
the entire domain of interest.

00:45:27.258 --> 00:45:29.827 align:center line:-1 position:50% size:31%
Using one concentrated
landslide data set

00:45:29.827 --> 00:45:31.429 align:center line:-1 position:50% size:25%
and then applying it
to other areas

00:45:31.429 --> 00:45:35.967 align:center line:-1 position:50% size:41%
with assumed similarities
to where you trained your model

00:45:35.967 --> 00:45:37.368 align:center line:-1 position:50% size:27%
probably isn't
a very safe approach

00:45:37.368 --> 00:45:40.471 align:center line:-1 position:50% size:26%
when you're dealing
with large scales.

00:45:40.471 --> 00:45:42.106 align:center line:-1 position:50% size:32%
Slope units are
an effective mapping unit

00:45:42.106 --> 00:45:44.442 align:center line:-1 position:50% size:23%
for imprecise data

00:45:44.442 --> 00:45:45.843 align:center line:-1 position:50% size:33%
that's commonly available
for creating

00:45:45.843 --> 00:45:47.178 align:center line:-1 position:50% size:33%
these susceptibility maps,

00:45:47.178 --> 00:45:49.246 align:center line:-1 position:50% size:38%
and I've shown how SUMak is

00:45:49.246 --> 00:45:53.284 align:center line:-1 position:50% size:35%
a new, fast method
of implementing slope units

00:45:53.284 --> 00:45:56.087 align:center line:-1 position:50% size:37%
to hopefully allow slope units
to gain more traction

00:45:56.087 --> 00:46:00.257 align:center line:-1 position:50% size:36%
and for more people to take
advantage of these benefits

00:46:00.257 --> 00:46:04.428 align:center line:-1 position:50% size:31%
that slope units provide.

00:46:04.428 --> 00:46:07.231 align:center line:-1 position:50% size:29%
With that, I'd be happy
to take any questions.

00:46:07.231 --> 00:46:09.567 align:center line:-1 position:50% size:19%
You can read--

00:46:09.567 --> 00:46:11.836 align:center line:-1 position:50% size:36%
the paper just got published
on the first half of this talk

00:46:11.836 --> 00:46:14.905 align:center line:-1 position:50% size:28%
in JGR Earth Surface,
and hopefully,

00:46:14.905 --> 00:46:16.607 align:center line:-1 position:50% size:42%
the SUMak algorithm manuscript

00:46:16.607 --> 00:46:20.411 align:center line:-1 position:50% size:32%
will be out before too long
as well.

00:46:20.411 --> 00:46:22.446 align:center line:-1 position:50% size:28%
I'd be happy
to take any questions.

