Geotrellis Versions Save

GeoTrellis is a geographic data processing engine for high performance applications.

v2.0.0

5 years ago

Additions

  • geotrellis.spark

    • All focal operations now except an optional partitioner parameter.
    • BufferTiles.apply methods and the bufferTiles methods now except an optional partitioner parameter.
    • CollectionLayerReader now has an SPI interface.
    • ZoomResample can now be used on MultibandTileLayerRDD.
    • Partitioner can be specified in the reproject methods of TileLayerRDD.
    • Compression level of GeoTiffs can be specified in the DeflateCompression constructor.
    • resampleMethod parameter has been added to COGLayerWriter.Options.
    • A new type called LayerType has been created to help identify the nature of a layer (either Avro or COG).
    • LayerHeader now have an additional parameter: layerType.
    • AttributeStore now has four new methods: layerType, isCOGLayer, readCOGLayerAttributes, and writeCOGLayerAttributes.
    • Kryo serialization of geometry now uses a binary format to reduce shuffle block size.
    • Alter geotrellis.spark.stitch.StitchRDDMethods to allow RDD[(K, V)] to be stitched when not all tiles are of the same dimension.
    • Introduce Pyramid class to provide a convenience wrapper for building raster pyramids.
    • COGValueReader and OverzoomingCOGValueReader have the readSubsetBands method which allows reading bands subsets.
    • COGLayerReader now has the readSubsetBands and querySubsetBands methods which allow users to read in layers with the desired bands in the order they choose.
    • KeyBounds now has the rekey method that will rekey the bounds from a source layout to a target layout.
  • geotrellis.raster

    • Kryo serialization of geometry now uses a binary format to reduce shuffle block size.
    • GeoTiffMultibandTile now has another crop method that takes a GridBounds and an Array[Int] that represents the band indices.
    • GeoTiff[MultibandTile] can be written with BandInterleave, only PixelInterleave previously supported. (#2767)
    • MultibandTile now has a new method, cropBands that takes an Array of band indices and returns a cropped MultibandTile with the chosen bands.
  • geotrellis.spark-etl

    • Input.partitionBytes and it is set to 134217728 by default to correspond HadoopGeoTiffRDD default behaviour.
    • Output.bufferSize option to set up a custom buffer size for the buffered reprojection.

Changes

  • geotrellis.spark

    • The length of the key (the space-filling curve index or address) used for layer reading and writing has been extended from a fixed length of 8 bytes to an arbitrary length. This change affects not only the geotrellis.spark package, but all backends (excluding geotrellis.geowave and geotrellis.geomesa).
    • Reprojection has improved performance due to one less shuffle stage and lower memory usage. TileRDDReproject loses dependency on TileReprojectMethods in favor of RasterRegionReproject
    • The Ascii draw methods are now method extensions of Tile.
    • Replace geotrellis.util.Functor with cats.Functor.
    • Specifying the maxTileSize for a COGLayer that's to be written is now done via COGLayerWriter.Options which can be passed directly to the write methods.
    • Specifying the compression for a COGLayer that's to be written is now done via COGLayerWriter.Options which can be passed directly to the write methods.
    • The attribute name for COGLayerStorageMetadata is now metadata instead of cog_metadata.
    • Scalaz streams were replaced by fs2 streams.
    • Refactored HBaseInstance, now accepts a plain Hadoop Configuration object.
    • Refactored CassandraInstance, now accepts a getCluster function.
    • Use pureconfig to handle all work with configuration files.
    • Removed LayerUpdater with its functionality covered by LayerWriter (#2663). of the same dimension.
    • Change TilerMethods.tileToLayout functions that accept TileLayerMetadata as an argument to return RDD[(K, V)] with Metadata[M] instead of RDD[(K, V)].
    • Expose attributeStore parameter to LayerReader interface.
    • Added exponential backoffs in S3RDDReader.
    • Changed SinglebandGeoTiff and MultibandGeoTiff crop function behaviour to work properly with cases when extent to crop by doesn't intersect tiff extent.
    • All classes and objects in the geowave package now use the spelling: GeoWave in their names.
    • TileLayerMetadata.fromRdd method has been renamed to TileLayerMetadata.fromRDD.
    • KeyBounds.fromRdd method has been renamed to KeyBounds.fromRDD.
  • geotrellis.raster

    • Removed implicit conversion from Raster[T] to T (#2771).
    • Removed decompress option from GeoTiffReader functions.
    • Scalaz streams were replaced by fs2 streams.
  • geotrellis.spark-etl

    • Package is deprecated since GeoTrellis 2.0.
    • Input.maxTileSize is 256 by default to correspond HadoopGeoTiffRDD default behaviour.
  • geotrellis.slick

    • geotrellis.slick.Projected has been moved to geotrellis.vector.Projected

Fixes

  • StreamingHistogram.binCount now returns non-zero counts (#2590)
  • HilbertSpatialKeyIndex index offset. Existing spatial layers using Hilbert index will need to be updated (#2586)
  • Fixed CastException that sometimes occured when reading cached attributes.
  • Uncompressed GeoTiffMultibandTiles will now convert to the correct CellType.
  • Calculating the Slope of a Tile when targetCell is Data will now produce the correct result.
  • Introduce new hooks into AttributeStore trait to allow for better performance in certain queries against catalogs with many layers.
  • GeoTiffReader can now read tiffs that are missing the NewSubfileType tag.
  • Pyramiding code will once again respect resampling method and will now actually reduce shuffle volume by resampling tiles on map side of pyramid operation.
  • COGLayer attributes can be accessed via the various read attribute methods in AttributeStore (ie readMetadata, readHeader, etc)
  • The regex used to match files for the HadoopLayerAttributeStore and FileLayerAttributeStore has been expanded to include more characters.
  • HadoopAttributeStore.availableAttributes has been fixed so that it'll now list all attribute files.
  • Allow for simple features to be generated with a specified or random id with geometry stored in the standard field, "the_geom"
  • Update version of Amazon SDK API to remove deprecation warnings.
  • Fixed a bug in incorrect metadata fetch by COGLayerReader that could lead to an incorrect data querying.
  • Cropping RDDs with clamp=false now produces correct result.
  • Fixed tiff reads in case RowsPerStrip tiff tag is not defined.
  • Change aspect result to azimuth, i.e. start from due north and be clockwise.
  • COG overviews generated in the COGLayer.fromLayerRDD method will now use the passed in ResampleMethod.
  • Reading a GeoTiff with streaming will now work with files that are larger than java.lang.Integer.MAX_VALUE.
  • GeoTiffMultibandTile.crop will now work with GeoTiffs that have tiled segments and band interleave.
  • GeoTiffMultibandTile.crop will now return ArrayMultibandTile(s) with the correct number of bands.
  • Improved performance of COGValueReader.readSubsetBands when reading from S3.

v1.2.0

6 years ago

API Changes

  • geotrellis.raster
    • Deprecation: GridBounds.size in favor of GridBounds.sizeLong.
    • Deprecation: GridBounds.coords in favor of GridBounds.coordsIter.
    • New: GridBounds.offset and GridBounds.buffer for creating a modified GridBounds from an existing one.
    • New: ColorRamps.greyscale: Int => ColorRamp, which will generate a ramp when given some number of stops.
    • New: ConstantTile.fromBytes to create any type of ConstantTile from an Array[Byte].
    • New: Tile.rotate90: Int => Tile, Tile.flipVertical: Tile and Tile.flipHorizontal: Tile.
  • geotrellis.vector
    • New: Geometry.isEmpty: Boolean. This incurs much less overhead than previous ways of determining emptiness.
    • New: Line.head and Line.last for efficiently grabbing the first or last Point in the Line.
  • geotrellis.spark
    • Deprecation: The LayerUpdater trait hierarchy. Use LayerWriter.update or LayerWriter.overwrite instead.
    • Deprecation: Every cache provided by geotrellis.spark.util.cache. These will be removed in favor of a pluggable cache in 2.0.
    • New: SpatialKey.extent: LayoutDefinition => Extent
    • New: ValueReader.attributeStore: AttributeStore
    • New: TileLayerRDD.toSpatialReduce: ((V, V) => V) => TileLayerRDD[SpatialKey] for smarter folding of 3D tile layers into 2D tile layers.
    • The often-used apply method overloads in MapKeyTransform have been given more descriptive aliases.
  • geotrellis.vectortile (experimental)
    • New: VectorTile.toGeoJson and VectorTile.toIterable.
    • Library simplified by assuming the codec backend will always be Protobuf.

New Features

Rasterizing Geometry Layers

Finally, the full marriage of the vector, raster, and spark packages! You can now transform an RDD[Geometry] into a writable GeoTrellis layer of (SpatialKey, Tile)!

val geoms: RDD[Geometry] = ...
val celltype: CellType = ...
val layout: LayoutDefinition = ...
val value: Double = ...  /* Value to fill the intersecting pixels with */

val layer: RDD[(SpatialKey, Tile)] with Metadata[LayoutDefinition] =
  geoms.rasterize(value, celltype, layout)
Clipping Geometry Layers to a Grid

In a similar vein to the above, you can now transform an arbitrarily large collection of Geometries into a proper GeoTrellis layer, where the sections of each Geometry are clipped to fit inside their enclosing Extents.

Here we can see a large Line being clipped into nine sublines. It's one method call:

import geotrellis.spark._

val layout: LayoutDefinition = ...  /* The definition of your grid */
val geoms: RDD[Geometry] = ...      /* Result of some previous work */

/* There are likely many clipped Geometries per SpatialKey... */
val layer: RDD[(SpatialKey, Geometry)] = geoms.clipToGrid(layout)

/* ... so we can group them! */
val grouped: RDD[(SpatialKey, Iterable[Geometry])] = layer.groupByKey

If clipping on the Extent boundaries is not what you want, there are ways to customize this. See the ClipToGrid entry in our Scaladocs.

Sparkified Viewshed

A Viewshed shows "visibility" from some set vantage point, given an Elevation raster. Prior to GeoTrellis 1.2 this was possible at the individual Tile level but not the Layer (RDD) level. Now it is.

First, we need to think about the Viewpoint type:

import geotrellis.spark.viewshed._

val point: Viewpoint(
  x = ...,                    // some coordinate.
  y = ...,                    // some coordinate.
  viewHeight = 4000,          // 4 kilometres above the surface.
  angle = Math.PI / 2,        // direction that the "camera" faces (in radians). 0 == east.
  fieldOfView = Math.PI / 2,  // angular width of the "view port".
  altitude = 0                // the height of points you're interested in seeing.
)

In other words:

  • x, y, viewHeight: where are we?
  • angle: what direction are we looking?
  • fieldOfView: how wide are we looking?
  • altitude: how high/low is the "target" of our viewing?

Given a Seq[Viewpoint] (the algorithm supports multiple simultaneous view points), we can do:

// Recall this common alias:
//   type TileLayerRDD[K] = RDD[(K, Tile)] with Metadata[TileLayerMetadata[K]]

val layer: TileLayerRDD[SpatialKey] = ...  /* Result of previous work */

val viewshed: TileLayerRDD[SpatialKey] = layer.viewshed(Seq(point))
Sparkified Euclidean Distance

We use Euclidean Distance to render a collection of points into a heatmap of proximities of some area. Say, of two roads crossing:

Prior to GeoTrellis 1.2, this was possible at the individual Tile level but not the Layer (RDD) level. Now it is.

/* Result of previous work. Potentially millions of points per SpatialKey. */
val points: RDD[(SpatialKey, Array[Coordinate])] = ...
val layout: LayoutDefinition = ...  /* The definition of your grid */

val layer: RDD[(SpatialKey, Tile)] = points.euclideanDistance(layout)
Polygonal Summaries over Time

The following was possible prior to GeoTrellis 1.2:

val layer: TileLayerRDD[SpatialKey] = ...
val polygon: Polgyon = ...

/* The maximum value within some Polygon overlaid on a Tile layer */
val summary: Double = layer.polygonalMaxDouble(polygon)

The above is also now possible for layers keyed by SpaceTimeKey to form a "time series":

val layer: TileLayerRDD[SpaceTimeKey] = ...
val polygon: MultiPolygon = ...

/* The maximum value within some Polygonal area at each time slice */
val summary: Map[ZonedDateTime, Double] = layer.maxSeries(polygon)
Overzooming ValueReader

A GeoTrellis ValueReader connects to some layer catalog and lets you read individual values (usually Tiles):

import geotrellis.spark.io.s3._

val store: AttributeStore = ...
val reader: Reader[SpatialKey, Tile] = S3ValueReader(store).reader(LayerId("my-catalog", 10))

val tile: Tile = reader.read(SpatialKey(10, 10))

However .reader is limited to zoom levels that actually exist for the given layer. Now you can use .overzoomingReader to go as deep as you like:

import geotrellis.raster.resample._

val reader: Reader[SpatialKey, Tile] =
  S3ValueReader(store).overzoomingReader(LayerId("my-catalog", 20), Average)

val tile: Tile = reader.read(SpatialKey(1000, 1000))
Regridding a Tile Layer

Have you ever wanted to "redraw" a grid over an established GeoTrellis layer? Say, this 16-tile Layer into a 4-tile one, both of 1024x1024 total pixels:

Prior to GeoTrellis 1.2, there was no official way to do this. Now you can use .regrid:

/* The result of some previous work. Say each Tile is 256x256. */
val layer: TileLayerRDD[SpatialKey] = ...

/* "Recut" the tiles so that each one is now 512x512.
 * No pixels are gained or lost, save some NODATA on the bottom
 * and right edges that may appear for padding purposes.
 */
val regridded: TileLayerRDD[SpatialKey] = layer.regrid(512)

You can also regrid to non-rectangular sizes:

val regridded: TileLayerRDD[SpatialKey] = layer.regrid(tileCols = 100, tileRows = 300)
Robust Layer Querying

It's common to find a subset of Tiles in a layer that are touched by some given Polygon:

val poly: Polygon = ???

val rdd: TileLayerRDD[SpatialKey] =
 layerReader
    .query[SpatialKey, Tile, TileLayerMetadata[SpatialKey]](Layer("name", zoom))
    .where(Intersects(poly))
    .result

Now you can perform this same operation with Line, MultiLine, and even (Polygon, CRS) to ensure that your Layer and Geometry always exist in the same projection.

Improved Tile ASCII Art

Sometimes you just want to visualize a Tile without going through the song-and-dance of rendering it to a .png. The existing Tile.asciiDraw method kind of does that, except its output is all in numbers.

The new Tile.renderAscii: Palette => String method fulfills your heart's desire:

import geotrellis.raster._
import geotrellis.raster.io.geotiff._
import geotrellis.raster.render.ascii._

val tile: Tile = SinglebandGeoTiff("path/to/tiff.tiff").tile

// println(tile.renderAscii())  // the default
println(tile.renderAscii(AsciiArtEncoder.Palette.STIPLED))
            ▚▖
            ▚▚▜▚▚
            ▚▖▚▜▚▖▚▚
           ▜▚▚▚▜▚█▚▜▚█▚
           █▚▜▖▜▖▚▚█▚▚▜▚█▖
           ▚▚█▚▜▚▚▚▚▚▚▚▜▚▚▚▚▚
          ▚▚▖▚▚▚▚▚█▜▚▚▜▚▚▖▚▖▚▖▚
          ▚▚▚▚█▚▚▚▚▚██▚▚▚▜▖▖██▚▚▜▚
          ▚▚█▚▚▚▚▚▚▚▜▚▚▚▚▚▚▜▚█▚▚▚▚▚▚▚
         █▚▚▖▚█▚▜▚▚▚▚▖▚▚▚▚▚▚▚▚▚▚▜▚▚▚▚▚▚▖
         █▚▚▚▜▚▖▚▚▚▚▚▚▚▚▚▚▚▚▚▚▚▚▚██▖▜▚█▚▚▚
         █▚▚██▚▚▚▚▚▚▚▚▖▚▚▚▚▚▚▚▚█▚▚▚▚▚▚▖▖▖▚▚▚▚
        █▜▚▚██▜▚▚▚▜▖▚▚▜▚█▜▚▚▚▜▚▖▚▜▚█▚▚▖▚▚▖▚▚▖▖▚▚
        ▚▚█▚▚▚█▚██▚▚▚▚▚▚▚▚▜▚▚█▜▚▖█▚▚▚▜▚▚▚▚▚▚▜▚█▚█
        █▚▜▚▜▚█▚▜▚▚▜▚█▚▚▚▚▚▚▚▚▚▚▚▖▚▖▚▚▖▚█▚█▚▚▚▖█▚
        ████▚███▚▚▚▚██▚▚▚█▜▚▚▖▚▚▚▖▖▚▚▚▚▚▚▚▚█▚▜▖█
       ▖█▜▚█▚██▜▖▜▜█▜▜█▜▚▚▚▚▚█▖▚▚▚▚█▚▚▚▚▚▚▜▚▚█▖▜
       ▚▖██▚▜▚█▚▚▜▜█▜▜▜██▚▚▚▚█▚▚▚▜▖▚▚█▚▖▚▜▚▚▚▖▚█
       █▚▚▚▚▜▚██▖██▜▚▚█▚▚▖▚▚▜▚▖▚▖▚▚▚▚▚▖▚▚▖▖▖▚▖▚
      ▚▚▚█▚▚▚▚▚█▜▚▚████▚█▚▚▚█▚▖▚▚▚▖▚▚█▚▚▖▚▚▚▖▖▖
      ▚▚▚█▚▚▚▖▖▚▜█▜██▜██▚▚▖██▜▚▜▚█▚▚▚▚▚▚▚▚▖▖▜██
      ▚▚▚▚▜█▚▚▚▚▚█████▚▜██▚██▚▚▚▚▜▚▖▚█▚▚▖▚▖▚▚█
     ▚▚▜▚▚▚▚▜▚▜▚▚▚▚▜▚█▚▜█▚██▚██▚▚▚▚▖▚▚▚▚▖▖▚▚▖█
     ▚▜▚▜▚▚▚▚▚▚█▚▚▚▚▚██▜▜▜███▖▚▚▜█▚▚▖▚█▚▚█▚▖▚
     ▚▜▚▚▚▚▚▚▚▚▚▚▜▜▜▚▚▖▚▖▚▚▜▜██▜▚██▚▚▚▚▚▚▖▜█▚
    ▚▚▖▚▚█▚█▚▚▚█▚▖▚▚▚█▚▚▚▚▚▜██▚█▜▚█▚▜▚▚███▜█▜
    ▚▚▚▜▚▚▚▚▚▚▚▚▚▚▚▖█▚█▚▚▜█▜█▜█▜▚▖▚▚▚██▜▜█▚▜
    ▚▚▚▚▜▚▚▚▚▚▚▜▚▚▚▚▚▚▖▚█▜▖▖█▚▖▜▖▚▖█▚▖█▚▚▜▚█
    ▚▚█▚▚█▚▚▜▚▚▚▚▜▚▚▚▚▚▜▚▖▚█▜█▚▜▜▚█▖▜███▜▚▚
   ▚▚▚▚▚▚▖▜▚█▚▚▚▖▚▚▚▚▚▚▚▚▚▚▚▜█▖▜▜▜█▚▚▚▖▚█▚█
   ▜▚▚▚█▚▖▚█▚█▚▚█▚▚▚▚▚▚▚▖▚▚▚▜▚▚▚▜▚▖▚▖▚▚▚▚▜▚
   ▚▚▚▚▖▚█▖█▜▚▚▚▚▚▚▚▚▖▚▚▖▖█▚▜▚▖▚▚▚▚▖▖▚█▚▚▚
  ▚▚▚▚▚▚▚▚▚█▚▚▚▖▚▚▚█▚▜▚█▚▚▖▜██▚▖▚▚▚▚▚▚▚▚▚▖
  ▚▚▚▚▚▚▚▖▚▚██▚▚▚▚▚▚▚▚▜▚▚█▚██▚▚▚▚▖▚▚▖▚▚█▜▖
  ▚▚▚▚▚▚▚▚▚▚▚▚▚█▚▜▚▚▚▜▚▚▖▚▚▚▚▚▜▚▚▚▚▖▚▚▚▚▚
 ▚██▖▚▚▚▚▚▚▚▚▜▚▚█▚▚▚▚▜▚▚▚▚█▜▖▚▚█▜▜█▜█▚▖▚▖
 ▚▚▚▖▚▚█▚▚▜███▚▚▚▜▚▚▚▚▚█▚▖▖█▖▚████▜███▚██
 ▚█▚▚▚▚██▜▚▜▚▜▜▜█▜▚█▚▜▖▜▚▚▚█▚▜█▚▜▚▚▚▚▚▖▖
    █▜█▚▚▜▚▜▚▜▜▜▚▚▚▚██▖▖▖▚██▖█▚▜▜▚▚▚▚▚▚▖
       ▚█▜▜▜▜▜██▚▜▚▚▚▚▚▚▖▜▚▜▚▚▚▜▚█▚▚▖▖▖
          ██▚▚▚▚▚▚▚▜▚▜▖▚██▜▜▚▖▚▚█▚▚▚▖▜▜
             ▜▚▚▖▚▚▚▖▚▜▜██▜▜▚█▚▚▜▚▚▜██▚
                ▚▚█▚▜▚▚█▖▜▚▚▚▖█▚▚█▚▚█▚
                   █▜▜▚▚▜▜▚▚▚▜█▚▚▚▜█▜█
                      ▚▚▖▚█▖▚▖▜▚▖▚▖▜▚
                         ███▖██▚▖▚▚▚▚
                            ▜▚▚█▚▚▖▖█
                              ▚▖▜█▜▚
                                 ▖█▚

Gorgious.

Storage on Azure via HDFS

By adding some additional configuration, you can now use our HDFS Layer Backend to read and write GeoTrellis layers to Microsoft Azure's blob storage.

S3 Configurability

It's now possible to customize how our S3 backend communicates with S3.

Configuring JTS Precision

GeoTrellis uses the Java Topology Suite for its vector processing. By default, JTS uses a "floating" PrecisionModel. When writing code that needs to be numerically robust, this default can lead to Topology Exceptions.

You can now use Typesafe Config to configure this to your application's needs. See here for the specifics.

Other New Features

Fixes