Saturday, February 1, 2014

Scatter for Terrain Database Generation

Part 1 - Overview

In the field of terrain database generation, feature scatter is a standard practice.  The typical use case consists of converting a line or polygon geometry into one or many different types of features.  For example, suppose we have a collection of land cover polygons:
The green polygons may represent a forest or park that is filled with a variety of vegetation.  From a single polygon, one could choose to scatter multiple points, as illustrated below.

In addition to scattering within a feature, you could choose to scatter points along or around the border of a target polygon, as illustrated below:

Whether it is performed within a terrain tool or manually using a GIS tool, the operation is fundamentally the same.  It all boils down to one thing: placing features in a reasonable place.  However, this raises the questions:
  • How many features should be scattered?
  • What type of features should be scattered?
  • Where should the scattered features be placed?
  • Is the scatter method deterministic or non-deterministic?
Regardless of what is scattered (or how), the points need to be stored somewhere.  Whichever container is used (shapefile, geodatabase, SQL database), scatter points can be stored in one of two ways:
  1. All scatter points could be stored in a single container and individually attributed with a type
  2. Scatter points can be stored within individual datasets according to their types
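The two storage layouts above can be sketched in a few lines of Python.  This is only an illustration: the `ScatterPoint` class, the `feature_type` attribute, and the type names are hypothetical and are not taken from any particular terrain tool or container format.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ScatterPoint:
    x: float
    y: float
    feature_type: str  # hypothetical attribute, e.g. "deciduous_tree"


# Option 1: a single container; every point carries its own type attribute.
single_container = [
    ScatterPoint(10.0, 20.0, "deciduous_tree"),
    ScatterPoint(12.5, 18.0, "shrub"),
    ScatterPoint(11.0, 25.0, "deciduous_tree"),
]


def split_by_type(points):
    """Option 2: regroup the points into one dataset per feature type."""
    datasets = defaultdict(list)
    for point in points:
        datasets[point.feature_type].append((point.x, point.y))
    return dict(datasets)


per_type_datasets = split_by_type(single_container)
```

The two layouts hold the same information, so converting between them is mechanical; the choice mostly affects how downstream tools query the data.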
Either way, a data model is required for the attribution that is associated with the individual scatter points.  With respect to vegetation, a good starting point for a data model is the FGDC's Vegetation Classification Standard.  Although vegetation is the focus of my example, scatter features are not limited to vegetation.  There are several other kinds of features that get scattered in a terrain database pipeline, such as buildings.

Before we consider the effects of different scatter types, let's add some geometry to the mix and see what happens.  Consider the case where multiple features overlap, as illustrated below:

In this situation we have land cover and hydrology polygons that overlap.  In my geographic viewer, I placed the hydrology on top of the ground cover.  There is a visually obvious, well-defined layering order that can be observed based on how I decided to order the layers.  Should the scatter system be aware of these layering 'rules'?  Are these layering rules everything that scatter needs to do its job?  Let's consider how these layering rules may operate:
  • IF a ground cover polygon and a different ground cover polygon overlap, THEN you may scatter different types of features within the overlapping region.
  • IF a hydrology polygon and a ground cover polygon overlap THEN you may not scatter within the hydrology feature.
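One way to picture rules like these is as a small lookup table keyed by the pair of overlapping layers.  The sketch below is a minimal illustration under that assumption; the layer names, the `RULES` table, and the `may_scatter` function are all hypothetical, and treating unknown pairs as "do not scatter" is just a conservative default chosen for the example.

```python
# Hypothetical rule table: each unordered pair of overlapping layers maps to
# whether scatter is allowed inside the overlapping region.
RULES = {
    frozenset(["ground_cover"]): True,                # ground cover over ground cover
    frozenset(["ground_cover", "hydrology"]): False,  # ground cover under hydrology
}


def may_scatter(layer_a, layer_b, rules=RULES):
    """Return True if scatter is allowed where layer_a and layer_b overlap.

    Pairs that are missing from the table are treated as not allowed.
    """
    return rules.get(frozenset([layer_a, layer_b]), False)
```

Because the rules are plain data, a different table can be passed in when a project needs different behavior.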
After all, you do not want to scatter trees in the water... do you?  Well, actually... you may!  We conclude that the answer is not always that simple.  So what should a good scatter system do?  Since we are still not certain, let's sum up what we have learned so far.  A good scatter system:
  • may or may not be deterministic
  • must take feature relationships into account
    • layering appears to be one aspect of this
    • feature relationships is another
  • must know how many features should be placed
    • this may be driven by density
    • this may be driven by location
    • or may not even matter, just as long as there are 'enough'
In the modeling and simulation industry, the most basic approach to scatter is based upon three variables (area, density, and avoidance):
  • area - where the features should be placed
  • density - how many features to place per unit of area
  • avoidance - features must be placed so that they avoid existing features
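The three variables above can be combined into a very small scatter routine.  The sketch below is one possible reading, not a description of any specific tool: it assumes the area is an axis-aligned rectangle, that existing features are plain points, that avoidance means keeping a minimum distance (`min_dist`), and that placement uses simple rejection sampling.  The function name and all parameters are hypothetical.  Passing a `seed` makes the result deterministic; leaving it as `None` does not.

```python
import math
import random


def scatter(bounds, density, obstacles, min_dist, seed=None, max_attempts=10_000):
    """Scatter points in a rectangle by rejection sampling.

    bounds:    (xmin, ymin, xmax, ymax) of the target area
    density:   desired features per unit of area
    obstacles: (x, y) locations of existing features to avoid
    min_dist:  minimum distance to obstacles and to other scattered points
    Returns (placed_points, target_count); fewer points than the target may
    be placed if the attempt budget runs out.
    """
    xmin, ymin, xmax, ymax = bounds
    target = round(density * (xmax - xmin) * (ymax - ymin))
    rng = random.Random(seed)  # seeded generator gives repeatable output
    placed = []
    attempts = 0
    while len(placed) < target and attempts < max_attempts:
        attempts += 1
        candidate = (rng.uniform(xmin, xmax), rng.uniform(ymin, ymax))
        # Avoidance: reject candidates that are too close to anything.
        if all(math.dist(candidate, other) >= min_dist
               for other in list(obstacles) + placed):
            placed.append(candidate)
    return placed, target


points, wanted = scatter((0, 0, 100, 100), 0.01, [(50, 50)], 2.0, seed=42)
```

Note that this sketch never fails outright: when the requested density cannot be reached, it returns what it managed to place and leaves the caller to compare `len(points)` with `wanted`.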
Though the variables are simple, some immediate problems come to mind.  Does scatter fail if it cannot fit a certain number of features into a given area?  Does scatter fail if it cannot avoid features in a given area?  In addition to these questions, there are other non-obvious complexities involved with avoidance.  Suppose you are interested in scattering two tree models whose footprints are not rectangular, like those seen below:
Remember, the illustration above shows the footprints of the tree models.  If you choose to scatter these features using only their bounding volumes, you will end up with unnatural-looking scatter results, because the features will be spaced too far apart.  Unless feature model overlap is allowed, the geometry of the trees must be considered during scatter.
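The difference between the two tests can be shown with a toy example.  The sketch below assumes footprints are rasterized into sets of grid cells; the two L-shaped footprints are invented for illustration and are not the tree models from the figure.

```python
def bounding_box(cells):
    """Return (xmin, ymin, xmax, ymax) in cell indices for a set of cells."""
    xs = [x for x, _ in cells]
    ys = [y for _, y in cells]
    return min(xs), min(ys), max(xs), max(ys)


def boxes_overlap(a, b):
    """Coarse test: do the bounding boxes share at least one cell?"""
    ax0, ay0, ax1, ay1 = bounding_box(a)
    bx0, by0, bx1, by1 = bounding_box(b)
    return ax0 <= bx1 and bx0 <= ax1 and ay0 <= by1 and by0 <= ay1


def footprints_overlap(a, b):
    """Exact test: do the footprints share at least one occupied cell?"""
    return not a.isdisjoint(b)


def translate(cells, dx, dy):
    """Move a footprint by (dx, dy) cells."""
    return {(x + dx, y + dy) for x, y in cells}


# Two hypothetical L-shaped footprints that nest into each other.
tree_a = {(0, 0), (0, 1), (0, 2), (1, 0), (2, 0)}
tree_b = translate({(2, 2), (2, 1), (2, 0), (1, 2), (0, 2)}, 1, 1)

rejected_by_boxes = boxes_overlap(tree_a, tree_b)          # True
rejected_by_shapes = footprints_overlap(tree_a, tree_b)    # False
```

Here a bounding-box test would reject the placement even though the actual footprints do not touch, which is the over-spacing effect described above.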

Despite these problems, it appears that the variables involved with the 'simple approach' are a subset of what we consider to be a good scatter system.  The critical pieces that are missing are realistic feature relationships.  Avoidance is one type of feature relationship but this is insufficient.  A lack of good feature relationships is what prevents most modern scatter systems from generating realistic results.

In a future post, we will explore what feature relationships mean and how they can be implemented.  In addition, we will examine how feature relationships can be used to enhance terrain database content in other fascinating ways.
