Spatial Operations and rio
#
Today we are going to use the spatial background we learned yesterday and apply it to dataset exploration of an AVIRIS image. We are going to use the rio
module of xarray
.
# Libraries for image visualization
from IPython.display import Image
from IPython.core.display import HTML
import xarray as xr
First let’s open the datafile. We are going to use the argument engine='rasterio'
because we are using an ENVI file type. This is also the recommended method for opening TIF files.
aviris_ds = xr.open_dataset('./data/subset_f131205t01p00r10rdn_e_sc01_ort_img', engine='rasterio')
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
Cell In[3], line 1
----> 1 aviris_ds = xr.open_dataset('./data/subset_f131205t01p00r10rdn_e_sc01_ort_img', engine='rasterio')
File ~/miniconda3/envs/sarp/lib/python3.10/site-packages/xarray/backends/api.py:556, in open_dataset(filename_or_obj, engine, chunks, cache, decode_cf, mask_and_scale, decode_times, decode_timedelta, use_cftime, concat_characters, decode_coords, drop_variables, inline_array, chunked_array_type, from_array_kwargs, backend_kwargs, **kwargs)
553 if from_array_kwargs is None:
554 from_array_kwargs = {}
--> 556 backend = plugins.get_backend(engine)
558 decoders = _resolve_decoders_kwargs(
559 decode_cf,
560 open_backend_dataset_parameters=backend.open_dataset_parameters,
(...)
566 decode_coords=decode_coords,
567 )
569 overwrite_encoded_chunks = kwargs.pop("overwrite_encoded_chunks", None)
File ~/miniconda3/envs/sarp/lib/python3.10/site-packages/xarray/backends/plugins.py:205, in get_backend(engine)
203 engines = list_engines()
204 if engine not in engines:
--> 205 raise ValueError(
206 f"unrecognized engine {engine} must be one of: {list(engines)}"
207 )
208 backend = engines[engine]
209 elif isinstance(engine, type) and issubclass(engine, BackendEntrypoint):
ValueError: unrecognized engine rasterio must be one of: ['netcdf4', 'scipy', 'store']
# Extract just the band DataArray from the dataset
aviris = aviris_ds['band_data']
aviris
<xarray.DataArray 'band_data' (band: 224, y: 300, x: 100)> [6720000 values with dtype=float32] Coordinates: * band (band) int64 1 2 3 4 5 6 7 8 ... 218 219 220 221 222 223 224 xc (y, x) float64 5.777e+05 5.777e+05 ... 5.814e+05 5.814e+05 yc (y, x) float64 4.151e+06 4.151e+06 ... 4.148e+06 4.148e+06 spatial_ref int64 0 Dimensions without coordinates: y, x Attributes: long_name: ('Band 1', 'Band 2', 'Band 3', 'Band 4', 'Band... bands: 224 band_names: Band 1,Band 2,Band 3,Band 4,Band 5,Band 6,Band... byte_order: 0 coordinate_system_string: PROJCS["unnamed",GEOGCS["GCS_WGS_1984",DATUM["... data_type: 2 description: ./output_data/subset_f131205t01p00r10rdn_e_sc0... file_type: ENVI Standard header_offset: 0 interleave: bsq lines: 300 samples: 100
When we are opening TIF or ENVI files using the rasterio
engine is that we get an additional set of attributes and methods that give us information about the image. We access this using .rio
print(aviris.rio.bounds()) # bounding box
print(aviris.rio.crs) # coordinate reference system
print(aviris.rio.height) # number of pixels tall
print(aviris.rio.width) # number of pixels wide
print(aviris.rio.nodata) # a nodata fill value, if one has been set
print(aviris.rio.transform()) # A CRS property called an affine transformation
(577666.150342955, 4155829.70969225, 579276.150342955, 4150999.70969225)
EPSG:32610
300
100
None
| 14.08, 7.81, 577666.15|
| 7.81,-14.08, 4150999.71|
| 0.00, 0.00, 1.00|
These metadata can be really helpful for orienting you to your data.
One field I want to make a note about is the nodata field. The nodata value is sometimes not set, but that doesn’t mean that there isn’t a nodata value in the dataset. There still may be a nodata value present in the dataset even if one is not set.
📝 Check your understanding
Look at the EPSG code of this dataset. Reference the figure below from the spatial data lecture to figure out approximately where in the world the dataset is located. Is the image in the northern or southern hemisphere?
Image(url= "http://www.dmap.co.uk/utmworld.gif")
Although the EPSG code is useful for general orientation and an overall sanity check, we usually want to know more specifically where our data is located. We can do this by extracting the bounds and converting the coordinates to EPSG 4326.
The general syntax for that is:
t = Transformer.from_crs(INPUT_PROJ, OUTPUT_PROJ, always_xy=True).transform
transform(t, SHAPELY_POINT)
Where everything in ALL_CAPS is a variable you will be inserting yourself.
Here’s an example
from pyproj import Transformer
from shapely.ops import transform
# Get the bounding box
bbox = aviris.rio.bounds()
from shapely.geometry import box
# Create a shapely object with the bounding box
bbox_geom = box(bbox[0], bbox[3], bbox[2], bbox[1])
# Consider this as a copy-and-paste able code snippet that you can use if you ever need to
# transform coordinates
t = Transformer.from_crs('epsg:32610', 'epsg:4326', always_xy=True).transform
bbox_4326 = transform(t, bbox_geom)
# View the output as a wkt
bbox_4326.wkt
'POLYGON ((-122.10308958299767 37.50252918542879, -122.10256826050976 37.5460596647619, -122.12079199682005 37.54619677057729, -122.12130273640575 37.50266607712935, -122.10308958299767 37.50252918542879))'