Thursday, January 19, 2012

Tiling spatial data into KML Placemarks

Overview
A project that I recently worked on was tasked to improve the speed of dynamic KML rendering in a client application that was using the Google Earth plugin.  The purpose of the application was to provide a 3D viewer of 2D data.  When we took over the application, which started as a prototype, it was using SQL Server Compact without any spatial extensions.  Because the app needed to be able to function in a disconnected state (in addition to some other DoD restrictions), we decided to use Sqlite with the Spatialite extensions.
One thing we had to figure out was how to cut down the amount of data being returned to the client application.  For example, one piece of data the application provided was 3D rendering of all airspace on the globe.  The amount of data is significant, but the only filter capability the application provided was to allow the user to pre-select types of airspace they wanted to view.  Once they said go, the application proceeded to build a KML document to be streamed back that contained all airspace on the planet of their selected type.  The user may have only been interested in a very specific area, but the prototype didn't take that into account.  While this method works for all different kinds of geometries we will just focus on polygons (airspace) since this was the most difficult for us to get working.

The "Tiling" process
 I don't know if "tiling" would be the correct term, but it's just what we came up with in order to relay what we were building to project manager-types.  For our purpose the definition of a tile is simply a polygon that defines a section of the planet. As you can see in the following image the red polygon would represent Tile 1, the yellow polygon would be Tile 2 and the blue one is Tile 3.  Now imagine the entire map is covered in these tiles. 

Having the luxury of being able to pre-ingest the geometries, we were able to come up with this method of tearing them apart along the tile boundaries.  Once we had the pieces, we could then create the KML placemarks for those pieces and then put the KML into the database to be retrieved and served back up.
Now lets look at the code that will do this.  First we had to find a good tile size.  As you can probably guess, the smaller the tile, the more data you have to store, but you get more definition on your geometries.  Conversely, the larger the tile, the smaller the data but then you lose some definition.
In the code below you can see where we chose a value of 4 as a seed for our tile size.  This was just what worked best for us.  From there we, in affect, iterated through the entire planet tile by tile, creating polygons in memory to use to query the database.  

 int level = 4;  
 double tileSize = 360 / Math.Pow(2, level + 1);  
 for (int x = 0; x < Math.Pow(2, (level + 1)); x++)  
      {  
         double minX = x * tileSize + -180;  
         double maxX = minX + tileSize;  
         for (int y = 0; y < Math.Pow(2, level); y++)  
              {  
                   double minY = y * tileSize + -90;  
                   double maxY = minY + tileSize;  
                   int tileId = x * 1000 + y;  
                   //query for polygons that match the min/max x/y values  
                   //use the result set to create your KML placemarks
                   //add placemarks to the database with tileId as the index 
              }  
    }  
 


All you need from here is the MinX, MaxX, MinY, MaxY values to use in your spatial query


Query for spatial objects
Most of the magic happens in the query.  
One major problem we experienced, and had to solve, was avoiding the same data being returned multiple times.  This will become clear later, but the method we used to solve this was to only return the piece of the geometry that is contained in a given tile.  To accomplish this we structured a query that would return the border of the polygon and the fill of the polygon separately.  
Here is what that query looks like for these values minX: -168.75, minY: 45, maxX: -157.5
maxY: 56.25:

SELECT AsText(Intersection(ExteriorRing(polygon.Geom), PolygonFromText('POLYGON((-168.75 45, -157.5 45, -157.5 56.25, -168.75 56.25, -168.75 45))'))) AS border, 
AsText(Intersection(polygon.Geom, PolygonFromText('POLYGON((-168.75 45, -157.5 45, -157.5 56.25, -168.75 56.25, -168.75 45))'))) AS fill, Name 
FROM AirSpace AS polygon 
WHERE  polygon.ROWID IN (SELECT pkid FROM idx_AirSpace_Geom WHERE xmin <= -157.5 AND xmax >= -168.75 AND ymin <= 56.25 AND ymax >= 45) AND 
Intersects(polygon.Geom, PolygonFromText('POLYGON((-168.75 45, -157.5 45, -157.5 56.25, -168.75 56.25, -168.75 45))'));

And here part of the result set that came back as an example.  You can see that the query found pieces of four different airspaces that intersect the tile
POLYGON((-168.75 45, -157.5 45, -157.5 56.25, -168.75 56.25, -168.75 45)).


borderfillName

LINESTRING(-168.75 51.26489, -167.816666 51.400002, -160 53.5, -157.5 54.392857)
POLYGON((-168.75 51.26489, -167.816666 51.400002, -160 53.5, -157.5 54.392857, -157.5 45, -168.75 45, -168.75 51.26489))OAKLAND OCEANIC FIR

LINESTRING(-168.75 51.26489, -167.816666 51.400002, -160 53.5, -157.5 54.392857)
POLYGON((-168.75 51.26489, -167.816666 51.400002, -160 53.5, -157.5 54.392857, -157.5 45, -168.75 45, -168.75 51.26489))OAKLAND OCEANIC CTA

LINESTRING(-157.5 54.392857, -160 53.5, -167.816666 51.400002, -168.75 51.26489)
POLYGON((-157.5 54.392857, -160 53.5, -167.816666 51.400002, -168.75 51.26489, -168.75 56.25, -157.5 56.25, -157.5 54.392857))ANCHORAGE HIGH ARTCC

MULTILINESTRING((-163 56.25, -163 54, -157.5 55.527778), (-157.5 53.153846, -158 53, -168.75 50.068182))
POLYGON((-163 56.25, -163 54, -157.5 55.527778, -157.5 53.153846, -158 53, -168.75 50.068182, -168.75 56.25, -163 56.25))ALASKA ADIZ

So, at this point you may be wondering why we are returning the border and the fill of a polygon in this manner.  This is strictly for rendering and stylizing.  The border represents just the piece of the external ring that resides inside the tile.  The fill is the polygon made up of the border and any tile boundaries that intersected it.
By separating them we can style only the outside border (the true border of the airspace) then we can set the border style of the fill to transparent.  Now when the different pieces of the polygon that fall into contiguous tiles are pieced back together, you won't have border lines in the middle of a polygon along the tile boundaries.  If you have no need for more pronounced borders around your polygons, you can simply modify your query to return just the fill and set your border style to transparent.

Creating the KML placemarks and using "TileId"
Sparing you the mundane details of how we created the placemarks using Google's API, I will just tell you what we did from here.  We created a separate placemark for each piece of each airspace and placed them in a separate table in the database with our TileId.  Why?  Well, two very important reasons:
  1. B*Tree vs R*Tree
    • It was simply faster to figure out what TileIds we needed to query for based off the information passed to us from the network link.  We then used those TileIds to query for placemarks (B*Tree).  The other option was to use the bounding box given to us by the network link then do a spatial query against the database returning all geometries that matched the query (R*Tree).
  2. Repeating Data
    • If we went the R*Tree route (which we did, then optimized to the B*Tree method) we would actually get repeating data.  When we used a bounding box to query for polygons that intersected it, those polygons would be returned and rendered.  When the user would move the map and we would receive a contiguous bounding box, we would get some of the same results because the new bounding box also intersected some of the polygons as the first bounding box.  This may not be a huge issue if you are only showing the borders of the polygons for example, however because we were displaying an opaque fill on our polygons, we had to fix this as the opacity of the polygons eventually turned solid.  Actually, it was this that clued us into the fact that we were getting the same polygons returned in subsequent queries.
As you can see the TileId is just a poor man's geohash.  There are certainly better, more standard ways of doing this, but it was easy, it worked, and we were under the gun as I'm sure every developer can understand.

Summary
 By using this technique to break apart large geometries, or even large sets of only point data, into "tiles" we've now provided a way for a viewer such as the Google Earth plugin to request only the data in your database that is visible in a bounding box that the client can provide through a network link.  This process can be somewhat likened to KML Regionation which can be done with static KML files.
In my next blog post, I will outline how we employed the Google Earth plugin and network links to: discern what "tiles" the GE plugin is viewing, query for the data in those tiles then build and return a KML document back to the plugin.