Geospatial data examples#
Python and Github#
UW Geospatial Data Analysis
CEE467/CEWA567
David Shean, Eric Gagliano, Quinn Brencher
This notebook is meant to show off some of the types of datasets we’ll learn about this quarter. At this point, there is no need to understand the code, we’ll get there one step at a time!
As you’re looking through the notebook, think about…
What are the potential use cases for different datasets?
What is the structure of these datasets? What do rows and columns represent in each?
Why might a dataset be more appropriate as a certain data type (think vector vs raster, etc)?
How might you use these or similar datasets in your own work or interests?
No need to run this notebook unless you want to interact with the code. If you run the notebook, make sure to shut down the kernel afterwards because this notebook takes up a lot of RAM!
Imports#
import numpy as np
import pandas as pd
import geopandas as gpd
import xarray
import rasterio as rio
import matplotlib.pyplot as plt
import matplotlib
import requests
import contextily as ctx
import pystac_client
import planetary_computer
import odc.stac
import easysnowdata
Vector example 1: Natural Earth Countries#
Read data#
world_gdf = gpd.read_file(f"zip+https://naciscdn.org/naturalearth/110m/cultural/ne_110m_admin_0_countries.zip")
Preview data structure#
world_gdf
| featurecla | scalerank | LABELRANK | SOVEREIGNT | SOV_A3 | ADM0_DIF | LEVEL | TYPE | TLC | ADMIN | ... | FCLASS_TR | FCLASS_ID | FCLASS_PL | FCLASS_GR | FCLASS_IT | FCLASS_NL | FCLASS_SE | FCLASS_BD | FCLASS_UA | geometry | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Admin-0 country | 1 | 6 | Fiji | FJI | 0 | 2 | Sovereign country | 1 | Fiji | ... | None | None | None | None | None | None | None | None | None | MULTIPOLYGON (((180 -16.06713, 180 -16.55522, ... |
| 1 | Admin-0 country | 1 | 3 | United Republic of Tanzania | TZA | 0 | 2 | Sovereign country | 1 | United Republic of Tanzania | ... | None | None | None | None | None | None | None | None | None | POLYGON ((33.90371 -0.95, 34.07262 -1.05982, 3... |
| 2 | Admin-0 country | 1 | 7 | Western Sahara | SAH | 0 | 2 | Indeterminate | 1 | Western Sahara | ... | Unrecognized | Unrecognized | Unrecognized | None | None | Unrecognized | None | None | None | POLYGON ((-8.66559 27.65643, -8.66512 27.58948... |
| 3 | Admin-0 country | 1 | 2 | Canada | CAN | 0 | 2 | Sovereign country | 1 | Canada | ... | None | None | None | None | None | None | None | None | None | MULTIPOLYGON (((-122.84 49, -122.97421 49.0025... |
| 4 | Admin-0 country | 1 | 2 | United States of America | US1 | 1 | 2 | Country | 1 | United States of America | ... | None | None | None | None | None | None | None | None | None | MULTIPOLYGON (((-122.84 49, -120 49, -117.0312... |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 172 | Admin-0 country | 1 | 5 | Republic of Serbia | SRB | 0 | 2 | Sovereign country | 1 | Republic of Serbia | ... | None | None | None | None | None | None | None | None | None | POLYGON ((18.82982 45.90887, 18.82984 45.90888... |
| 173 | Admin-0 country | 1 | 6 | Montenegro | MNE | 0 | 2 | Sovereign country | 1 | Montenegro | ... | None | None | None | None | None | None | None | None | None | POLYGON ((20.0707 42.58863, 19.80161 42.50009,... |
| 174 | Admin-0 country | 1 | 6 | Kosovo | KOS | 0 | 2 | Disputed | 1 | Kosovo | ... | Admin-0 country | Unrecognized | Admin-0 country | Unrecognized | Admin-0 country | Admin-0 country | Admin-0 country | Admin-0 country | Unrecognized | POLYGON ((20.59025 41.85541, 20.52295 42.21787... |
| 175 | Admin-0 country | 1 | 5 | Trinidad and Tobago | TTO | 0 | 2 | Sovereign country | 1 | Trinidad and Tobago | ... | None | None | None | None | None | None | None | None | None | POLYGON ((-61.68 10.76, -61.105 10.89, -60.895... |
| 176 | Admin-0 country | 1 | 3 | South Sudan | SDS | 0 | 2 | Sovereign country | 1 | South Sudan | ... | None | None | None | None | None | None | None | None | None | POLYGON ((30.83385 3.50917, 29.9535 4.1737, 29... |
177 rows × 169 columns
world_gdf.shape
(177, 169)
Visualize data#
f,ax=plt.subplots(layout='compressed')
world_gdf.plot(ax=ax,
column="POP_EST",
legend=True,
norm=matplotlib.colors.LogNorm(
vmin=world_gdf['POP_EST'].min(),
vmax=world_gdf['POP_EST'].max()),
legend_kwds={
'label': 'Population (log scale)',
'orientation': 'horizontal',
},
)
ax.set_title('Population by Country')
Text(0.5, 1.0, 'Population by Country')
Vector example 2: GMBA Mountain Inventory v2#
Read data#
url = (f"https://data.earthenv.org/mountains/standard/GMBA_Inventory_v2.0_standard_300.zip")
mountains_gdf = gpd.read_file("zip+" + url)
Preview data structure#
mountains_gdf
| GMBA_V2_ID | GMBA_V1_ID | MapName | WikiDataUR | MapUnit | Hier_Lvl | Feature | Area | Perimeter | Elev_Low | ... | Name_ES | Name_FR | Name_PT | Name_RU | Name_ZH | LocalNames | ColorAll | ColorBasic | Color300 | geometry | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 19102 | None | Arctic Ocean | https://www.wikidata.org/wiki/Q788 | Aggregated | 1 | Ocean | 57205.24576 | 22872.03994 | 0.0 | ... | Océano Ártico | Océan Arctique | Oceano Ártico | Северный Ледовитый океан | 北冰洋 | Kadagatang Artiko (Norwegian) | 1 | 0 | 1 | MULTIPOLYGON (((-8.63071 70.96548, -8.62822 70... |
| 1 | 11134 | 3.04.01 | Appalachian Mountains | https://www.wikidata.org/wiki/Q93332 | Aggregated | 2 | Mountain range with well-recognized name | 129875.02600 | 37431.11230 | -159.0 | ... | Apalaches | Appalaches | Apalaches | Аппалачи | 阿巴拉契亚山脉 | None | 4 | 0 | 1 | MULTIPOLYGON (((-85.78819 33.4368, -85.80703 3... |
| 2 | 12390 | None | Cordillera Centroamericana | https://www.wikidata.org/wiki/Q5788379 | Aggregated | 2 | Mountain range with well-recognized name | 284121.62390 | 31599.87888 | -1.0 | ... | Cordillera Centroamericana | None | None | None | None | None | 1 | 0 | 2 | MULTIPOLYGON (((-77.41295 5.56954, -77.41741 5... |
| 3 | 12497 | None | Caribbean Islands | https://www.wikidata.org/wiki/Q664609 | Aggregated | 2 | Archipelago | 61738.67557 | 10329.79926 | -40.0 | ... | Caribe | Caraïbes | Caraíbas | Карибы | 加勒比地区 | None | 4 | 0 | 4 | MULTIPOLYGON (((-61.05987 10.83557, -61.05311 ... |
| 4 | 14271 | None | South Atlantic Islands | https://www.wikidata.org/wiki/Q1482804 | Aggregated | 2 | Geographically-defined sub-range | 5579.35771 | 1487.00483 | -133.0 | ... | Océano Atlántico Sur | Océan Atlantique Sud | Atlântico Sul | Южная Атлантика | 南大西洋 | None | 3 | 0 | 2 | MULTIPOLYGON (((3.39049 -54.395, 3.41145 -54.4... |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 286 | 11208 | 5.08.19 | Central Range | None | Aggregated | 6 | Mountain range with well-recognized name | 263615.62150 | 12354.30388 | 0.0 | ... | Cordillera Central | Chaîne Centrale | None | Центральный хребет Новой Гвинеи | 巴布亞新幾內亞高地 | None | 6 | 0 | 4 | MULTIPOLYGON (((150.33845 -10.40741, 150.34297... |
| 287 | 12748 | None | Borneo | https://www.wikidata.org/wiki/Q36117 | Aggregated | 6 | Island | 248480.11360 | 28667.97285 | -48.0 | ... | Isla de Borneo | Bornéo | Bornéu | Калимантан | 婆罗洲 | Kalimantan (Acehnese); Nusa Kalimantan (Baline... | 6 | 0 | 5 | MULTIPOLYGON (((114.97609 -3.51973, 114.9728 -... |
| 288 | 12756 | None | Java | https://www.wikidata.org/wiki/Q3757 | Aggregated | 6 | Island | 42186.46494 | 6852.31987 | 0.0 | ... | Java | Java | Java | Ява | 爪哇岛 | Jawa (Acehnese); Jawa (Balinese); Jawa (Banjar... | 1 | 0 | 4 | MULTIPOLYGON (((114.06353 -8.62899, 114.05197 ... |
| 289 | 18917 | None | Sulawesi Archipelago | None | Aggregated | 6 | Archipelago | 128619.86330 | 11614.45005 | -16.0 | ... | None | None | None | None | None | None | 1 | 0 | 1 | MULTIPOLYGON (((120.73094 -7.09701, 120.71339 ... |
| 290 | 11154 | 5.07.09 | Barisan Mountains | https://www.wikidata.org/wiki/Q649051 | Aggregated | 7 | Mountain range with well-recognized name | 120726.28300 | 11781.38549 | 0.0 | ... | Cordillera de Barisan | None | Montanhas Barisan | Барисан | 巴里桑山脈 | Pegunungan Barisan (Indonesian); Pegunungan Bu... | 6 | 0 | 2 | MULTIPOLYGON (((104.86102 -5.84223, 104.85614 ... |
291 rows × 41 columns
mountains_gdf.shape
(291, 41)
Visualize data#
f,ax=plt.subplots(layout='compressed')
mountains_gdf.plot(ax=ax,
column="Area",
legend=True,
legend_kwds={
'label': 'Area [km2]',
'orientation': 'horizontal',
},
)
ax.set_title('Mountain Range Area')
Text(0.5, 1.0, 'Mountain Range Area')
Vector example 3: NOAA’s Significant Earthquake Database#
Read data#
def load_and_prepare_earthquake_data():
"""
Load and prepare global historical earthquake data from NOAA's API.
Returns a GeoDataFrame of significant earthquakes.
"""
# Use NOAA's official API endpoint for significant earthquakes
base_url = "https://www.ngdc.noaa.gov/hazel/hazard-service/api/v1/earthquakes"
# Get total number of pages from first request
first_page = requests.get(base_url).json()
total_pages = first_page['totalPages']
# Initialize list to store all items
all_items = []
# Fetch all pages
print(f"Fetching {total_pages} pages of earthquake data...")
for page in range(total_pages):
response = requests.get(f"{base_url}?page={page+1}").json()
all_items.extend(response['items'])
print(f"Total earthquakes fetched: {len(all_items)}")
# Create DataFrame from all items
df = pd.DataFrame(all_items)
# Create GeoDataFrame from latitude and longitude
gdf = gpd.GeoDataFrame(
df,
geometry=gpd.points_from_xy(df.longitude, df.latitude),
crs="EPSG:4326" # WGS 84
)
# Filter for earthquakes with magnitude data
gdf = gdf[gdf['eqMagnitude'].notna()]
# Convert to Web Mercator for basemap compatibility
#gdf = gdf.to_crs(epsg=3857)
return gdf
earthquake_gdf = load_and_prepare_earthquake_data()
Fetching 33 pages of earthquake data...
Total earthquakes fetched: 6447
Preview data structure#
earthquake_gdf
| id | year | locationName | latitude | longitude | eqMagnitude | damageAmountOrder | eqMagUnk | publish | country | ... | area | eqMagMfa | damageMillionsDollars | missing | missingAmountOrder | missingTotal | missingAmountOrderTotal | damageMillionsDollarsTotal | eqMagMb | geometry | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | -2150 | JORDAN: BAB-A-DARAA,AL-KARAK | 31.100 | 35.500 | 7.3 | 3.0 | 7.3 | True | JORDAN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | POINT (35.5 31.1) |
| 2 | 3 | -2000 | TURKMENISTAN: W | 38.000 | 58.200 | 7.1 | 1.0 | NaN | True | TURKMENISTAN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | POINT (58.2 38) |
| 5 | 12 | -1250 | ISRAEL: ARIHA (JERICHO) | 32.000 | 35.500 | 6.5 | 3.0 | 6.5 | True | ISRAEL | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | POINT (35.5 32) |
| 6 | 13 | -1050 | JORDAN: SW: TIMNA COPPER MINES | 29.600 | 35.000 | 6.2 | 3.0 | 6.2 | True | JORDAN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | POINT (35 29.6) |
| 9 | 17 | -479 | GREECE: MACEDONIA | 39.700 | 23.300 | 7.0 | NaN | NaN | True | GREECE | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | POINT (23.3 39.7) |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 6441 | 10759 | 1570 | OMAN: QALHAT | 22.600 | 59.400 | 7.0 | 4.0 | NaN | True | OMAN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | POINT (59.4 22.6) |
| 6442 | 10760 | 2024 | TURKEY: ELAZIG AND MALATYA PROVINCES | 38.309 | 38.826 | 6.0 | 2.0 | NaN | True | TURKEY | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | POINT (38.826 38.309) |
| 6443 | 10761 | 2024 | CUBA: GRANMA, SANTIAGO DE CUBA | 19.812 | -77.039 | 6.8 | 3.0 | NaN | True | CUBA | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | POINT (-77.039 19.812) |
| 6444 | 10762 | 2024 | CALIFORNIA: OFFSHORE CAPE MENDOCINO | 40.374 | -125.022 | 7.0 | NaN | NaN | True | USA | ... | CA | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | POINT (-125.022 40.374) |
| 6445 | 10763 | 2024 | VANUATU ISLANDS: EFATE, PORT VILA | -17.686 | 168.034 | 7.3 | 3.0 | NaN | True | VANUATU | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | POINT (168.034 -17.686) |
4672 rows × 50 columns
earthquake_gdf.shape
(4672, 50)
Visualize data#
f, ax = plt.subplots(layout='compressed')
world_gdf.boundary.plot(ax=ax,color='grey',linewidth=0.5)
# Plot earthquakes
scatter = earthquake_gdf.plot(
ax=ax,
column='eqMagnitude',
cmap='inferno',
legend=True,
legend_kwds={
'label': 'Magnitude',
'orientation': 'horizontal',
},
markersize=1,#earthquake_gdf['eqMagnitude'] * 1, # Reduced size multiplier due to more points
alpha=0.8
)
# # Add basemap
# ctx.add_basemap(
# ax,
# source=ctx.providers.Esri.WorldImagery
# )
ax.set_title('Global Historical Significant Earthquakes')
Text(0.5, 1.0, 'Global Historical Significant Earthquakes')
Raster example 1: Copernicus GLO-90 Digital Elevation Model#
Read data#
bbox = (-122.0, 46.7, -121.5, 47.0)
cop90_dem_da = easysnowdata.topography.get_copernicus_dem(bbox, resolution=90)
Preview data structure#
cop90_dem_da
<xarray.DataArray 'data' (latitude: 361, longitude: 601)> Size: 868kB
dask.array<getitem, shape=(361, 601), dtype=float32, chunksize=(361, 601), chunktype=numpy.ndarray>
Coordinates:
* latitude (latitude) float64 3kB 47.0 47.0 47.0 47.0 ... 46.7 46.7 46.7
* longitude (longitude) float64 5kB -122.0 -122.0 -122.0 ... -121.5 -121.5
spatial_ref int32 4B 4326
time datetime64[ns] 8B 2021-04-22cop90_dem_da.values
array([[ 499.2021 , 496.54 , 495.75162, ..., 1218.6013 , 1234.3867 ,
1233.8529 ],
[ 508.11026, 508.29654, 507.0695 , ..., 1155.2106 , 1165.3165 ,
1167.0106 ],
[ 519.83374, 517.6379 , 515.73 , ..., 1095.627 , 1106.0608 ,
1106.6742 ],
...,
[ 984.7717 , 955.91187, 929.96 , ..., 1208.51 , 1210.0714 ,
1209.9208 ],
[ 950.2026 , 925.1188 , 901.5398 , ..., 1207.794 , 1216.0377 ,
1216.38 ],
[ 915.3658 , 893.26587, 872.855 , ..., 1206.2902 , 1217.8804 ,
1224.1011 ]], dtype=float32)
np.shape(cop90_dem_da.values)
(361, 601)
Visualize data#
f,ax=plt.subplots(layout='compressed')
cop90_dem_da.plot(ax=ax,cmap='terrain')
ax.set_title('Copernicus GLO-90 DEM')
Text(0.5, 1.0, 'Copernicus GLO-90 DEM')
Raster example 2: ESA WorldCover 2021#
Read data#
worldcover_da = easysnowdata.remote_sensing.get_esa_worldcover(bbox)
Preview data structure#
worldcover_da
<xarray.DataArray 'map' (latitude: 3600, longitude: 6000)> Size: 22MB
dask.array<getitem, shape=(3600, 6000), dtype=uint8, chunksize=(3600, 6000), chunktype=numpy.ndarray>
Coordinates:
* latitude (latitude) float64 29kB 47.0 47.0 47.0 47.0 ... 46.7 46.7 46.7
* longitude (longitude) float64 48kB -122.0 -122.0 -122.0 ... -121.5 -121.5
spatial_ref int32 4B 4326
time datetime64[ns] 8B 2021-01-01
Attributes:
nodata: 0
class_info: {10: {'name': 'Tree cover', 'color': '#006400'}, 20: {'na...
cmap: <matplotlib.colors.ListedColormap object at 0x7ac749e91490>
data_citation: Zanaga, D., Van De Kerchove, R., De Keersmaecker, W., Sou...
example_plot: <function get_esa_worldcover.<locals>.plot_classes at 0x7...worldcover_da.values
array([[10, 10, 10, ..., 10, 10, 10],
[10, 10, 10, ..., 10, 10, 10],
[10, 10, 10, ..., 10, 10, 10],
...,
[10, 10, 10, ..., 10, 10, 10],
[10, 10, 10, ..., 10, 10, 10],
[10, 10, 10, ..., 10, 10, 10]], dtype=uint8)
np.shape(worldcover_da.values)
(3600, 6000)
Visualize data#
f, ax = worldcover_da.attrs['example_plot'](worldcover_da)
Raster example 3: Sentinel-2 RGB image#
Read data#
def norm(a, perc_lim=(2, 98), clip=True, verbose=False):
#Simple approach using actual min and max values
#amin, amax = (a.min(), a.max())
#Check if we're using masked array or np.nan
if np.ma.isMaskedArray(a):
amin, amax = np.percentile(a.compressed(), perc_lim)
else:
amin, amax = np.nanpercentile(a, perc_lim)
out = ((a - amin)/(amax - amin))
#This will "clip" normalized values <0 to 0 and >1 to 1
if clip:
out = out.clip(0,1)
if verbose:
print("Input range: ({}, {})".format(a.min(), a.max()))
print(f"Percentile range {perc_lim}: ({amin}, {amax})")
print("Output range: ({}, {})".format(out.min(), out.max()))
return out
catalog = pystac_client.Client.open(
"https://planetarycomputer.microsoft.com/api/stac/v1",
modifier=planetary_computer.sign_inplace,
)
search = catalog.search(
collections=["sentinel-2-l2a"],
bbox=bbox,
datetime=("2022-07-31", "2022-07-31"),
)
red_fn = search.item_collection()[3].assets['B04'].href
green_fn = search.item_collection()[3].assets['B03'].href
blue_fn = search.item_collection()[3].assets['B02'].href
red_src = rio.open(red_fn)
window = rio.windows.Window(8000, 0, 3*1024, 2*1024)
window_bounds = rio.windows.bounds(window, red_src.transform)
window_extent = [window_bounds[0], window_bounds[2], window_bounds[1], window_bounds[3]]
r = rio.open(red_fn).read(1,window=window)
g = rio.open(green_fn).read(1,window=window)
b = rio.open(blue_fn).read(1,window=window)
perc = (0, 90)
r_norm = norm(r, perc)
g_norm = norm(g, perc)
b_norm = norm(b, perc)
rgb = np.dstack([r_norm,g_norm,b_norm])
Preview data structure#
rgb
array([[[0.25661157, 0.26189128, 0.2309719 ],
[0.32203857, 0.28765572, 0.27078454],
[0.29201102, 0.28015289, 0.26478337],
...,
[0.16143251, 0.19252548, 0.17300937],
[0.16212121, 0.18530578, 0.17740047],
[0.16639118, 0.18587203, 0.17227752]],
[[0.23168044, 0.22989807, 0.21574941],
[0.29173554, 0.26642129, 0.25263466],
[0.28732782, 0.26882786, 0.26009953],
...,
[0.1584022 , 0.18162514, 0.17242389],
[0.15275482, 0.16208947, 0.16569087],
[0.1523416 , 0.16605323, 0.16583724]],
[[0.184573 , 0.20385051, 0.18881733],
[0.21184573, 0.21191959, 0.19672131],
[0.21983471, 0.22197055, 0.20769906],
...,
[0.15055096, 0.15812571, 0.16129977],
[0.15013774, 0.16718573, 0.16144614],
[0.15275482, 0.17468856, 0.16671546]],
...,
[[0.16363636, 0.1867214 , 0.17637588],
[0.1607438 , 0.1817667 , 0.17652225],
[0.15922865, 0.18035108, 0.17476581],
...,
[0.15247934, 0.17780294, 0.16378806],
[0.15922865, 0.17709513, 0.1684719 ],
[0.15716253, 0.17412231, 0.16598361]],
[[0.16143251, 0.1882786 , 0.1743267 ],
[0.15936639, 0.18105889, 0.16964286],
[0.15426997, 0.17539638, 0.17110656],
...,
[0.1577135 , 0.18884485, 0.16686183],
[0.16143251, 0.18360702, 0.16978923],
[0.15716253, 0.17313137, 0.16598361]],
[[0.15964187, 0.18148358, 0.17505855],
[0.15730028, 0.17667044, 0.17052108],
[0.15495868, 0.17921857, 0.17096019],
...,
[0.15523416, 0.18374858, 0.16759368],
[0.16033058, 0.18403171, 0.16276347],
[0.16859504, 0.19266704, 0.17271663]]])
np.shape(rgb)
(2048, 2980, 3)
Visualize data#
f, axs = plt.subplots(2,2, layout='compressed',sharex=True,sharey=True)
axs[0,0].imshow(rgb[:,:,0], cmap='Reds',extent=window_extent)
axs[0,1].imshow(rgb[:,:,1], cmap='Greens',extent=window_extent)
axs[1,0].imshow(rgb[:,:,2], cmap='Blues',extent=window_extent)
axs[1,1].imshow(rgb,extent=window_extent)
axs[0,0].set_title('Red band')
axs[0,1].set_title('Green band')
axs[1,0].set_title('Blue band')
axs[1,1].set_title('RGB image')
for ax in axs.flat:
ax.set_aspect('equal')
nDarray example 1: Sentinel-2 image stack#
Read data#
bbox = (-122.0, 46.7, -121.5, 47.0)
s2 = easysnowdata.remote_sensing.Sentinel2(
bbox_input=bbox,
start_date="2022-07-21",
end_date="2022-07-31",
resolution=80,
)
Data searched. Access the returned seach with the .search attribute.
Data retrieved. Access with the .data attribute. Data CRS: WGS 84 / UTM zone 10N.
Nodata values removed from the data. In doing so, all bands converted to float32. To turn this behavior off, set remove_nodata=False.
Data acquired after January 25th, 2022 harmonized to old baseline. To override this behavior, set harmonize_to_old=False.
Data scaled to float reflectance. To turn this behavior off, set scale_data=False.
Metadata retrieved. Access with the .metadata attribute.
Preview data structure#
s2.data # click through data variables to explore the structure!
<xarray.Dataset> Size: 66MB
Dimensions: (time: 5, y: 426, x: 484)
Coordinates:
* y (y) float64 3kB 5.206e+06 5.206e+06 ... 5.172e+06 5.172e+06
* x (x) float64 4kB 5.76e+05 5.761e+05 ... 6.146e+05 6.147e+05
spatial_ref int32 4B 32610
* time (time) datetime64[ns] 40B 2022-07-21T18:59:19.024000 ... 202...
Data variables: (12/16)
coastal (time, y, x) float32 4MB dask.array<chunksize=(1, 426, 484), meta=np.ndarray>
blue (time, y, x) float32 4MB dask.array<chunksize=(1, 426, 484), meta=np.ndarray>
green (time, y, x) float32 4MB dask.array<chunksize=(1, 426, 484), meta=np.ndarray>
red (time, y, x) float32 4MB dask.array<chunksize=(1, 426, 484), meta=np.ndarray>
rededge (time, y, x) float32 4MB dask.array<chunksize=(1, 426, 484), meta=np.ndarray>
rededge2 (time, y, x) float32 4MB dask.array<chunksize=(1, 426, 484), meta=np.ndarray>
... ...
swir16 (time, y, x) float32 4MB dask.array<chunksize=(1, 426, 484), meta=np.ndarray>
swir22 (time, y, x) float32 4MB dask.array<chunksize=(1, 426, 484), meta=np.ndarray>
aot (time, y, x) float32 4MB dask.array<chunksize=(1, 426, 484), meta=np.ndarray>
scl (time, y, x) float32 4MB dask.array<chunksize=(1, 426, 484), meta=np.ndarray>
wvp (time, y, x) float32 4MB dask.array<chunksize=(1, 426, 484), meta=np.ndarray>
visual (time, y, x) float32 4MB dask.array<chunksize=(1, 426, 484), meta=np.ndarray>
Attributes:
band_info: {'B01': {'name': 'coastal', 'description': 'Coastal aero...
scl_class_info: {0: {'name': 'No Data (Missing data)', 'color': '#000000...Visualize data#
s2.get_rgb()
/srv/conda/envs/notebook/lib/python3.12/site-packages/rasterio/warp.py:387: NotGeoreferencedWarning: Dataset has no geotransform, gcps, or rpcs. The identity matrix will be returned.
dest = _reproject(
RGB data retrieved.
Access with the following attributes:
.rgb for raw RGB,
.rgba for RGBA,
.rgb_percentile for percentile RGB,
.rgb_clahe for CLAHE RGB.
You can pass in percentile_kwargs and clahe_kwargs to adjust RGB calculations, check documentation for options.
f = s2.rgb_clahe.plot.imshow(col='time',col_wrap=5, robust=False)
for ax, time, in zip(f.axs.flat, s2.rgb.time.values):
local_time = pd.to_datetime(time).tz_localize('UTC').tz_convert('America/Los_Angeles')
ax.set_title(f'{local_time.strftime("%B %d, %Y")}\n{local_time.strftime("%I:%M%p")}')
ax.axis('off')
ax.set_aspect('equal')
f.fig.tight_layout()
f.fig.dpi = 300
f.fig.suptitle('Sentinel-2 RGB w/ clahe equalization', fontsize=16, y=1.02)
Text(0.5, 1.02, 'Sentinel-2 RGB w/ clahe equalization')
nDarray example 2: MODIS image stack#
Read data#
bbox = (-125.0, 47.2, -122, 48.5)
catalog = pystac_client.Client.open(
"https://planetarycomputer.microsoft.com/api/stac/v1",
modifier=planetary_computer.sign_inplace,
)
RGB = ["sur_refl_b01", "sur_refl_b04", "sur_refl_b03"]
search = catalog.search(
collections=["modis-09A1-061"],
bbox=bbox,
datetime=("2022-06-18", "2022-08-31"),
)
load_params = {
"chunks": {},
"crs":"EPSG:32610",
"items": search.items(),
"bbox": bbox,
}
modis_ds = odc.stac.load(**load_params)
Preview data structure#
modis_ds
<xarray.Dataset> Size: 50MB
Dimensions: (y: 316, x: 491, time: 10)
Coordinates:
* y (y) float64 3kB 5.374e+06 5.373e+06 ... 5.228e+06
* x (x) float64 4kB 3.486e+05 3.491e+05 ... 5.757e+05
spatial_ref int32 4B 32610
* time (time) datetime64[ns] 80B 2022-06-18 ... 2022-08-29
Data variables: (12/14)
hdf (time, y, x) float32 6MB dask.array<chunksize=(1, 316, 491), meta=np.ndarray>
sur_refl_b01 (time, y, x) int16 3MB dask.array<chunksize=(1, 316, 491), meta=np.ndarray>
sur_refl_b02 (time, y, x) int16 3MB dask.array<chunksize=(1, 316, 491), meta=np.ndarray>
sur_refl_b03 (time, y, x) int16 3MB dask.array<chunksize=(1, 316, 491), meta=np.ndarray>
sur_refl_b04 (time, y, x) int16 3MB dask.array<chunksize=(1, 316, 491), meta=np.ndarray>
sur_refl_b05 (time, y, x) int16 3MB dask.array<chunksize=(1, 316, 491), meta=np.ndarray>
... ...
sur_refl_raz (time, y, x) int16 3MB dask.array<chunksize=(1, 316, 491), meta=np.ndarray>
sur_refl_szen (time, y, x) int16 3MB dask.array<chunksize=(1, 316, 491), meta=np.ndarray>
sur_refl_vzen (time, y, x) int16 3MB dask.array<chunksize=(1, 316, 491), meta=np.ndarray>
sur_refl_qc_500m (time, y, x) uint32 6MB dask.array<chunksize=(1, 316, 491), meta=np.ndarray>
sur_refl_state_500m (time, y, x) uint16 3MB dask.array<chunksize=(1, 316, 491), meta=np.ndarray>
sur_refl_day_of_year (time, y, x) uint16 3MB dask.array<chunksize=(1, 316, 491), meta=np.ndarray>Visualize data#
vis = modis_ds.odc.to_rgba(RGB, vmin=1, vmax=3000).compute()
vis.plot.imshow(col="time", rgb="band", col_wrap=5, size=4)
<xarray.plot.facetgrid.FacetGrid at 0x7ac748ce7bc0>
nDarray example 3: ERA5 atmospheric reanalysis#
Read data#
era5_ds = xarray.open_zarr(
"gs://gcp-public-data-arco-era5/ar/1959-2022-full_37-1h-0p25deg-chunk-1.zarr-v2",
chunks={'time': 48},
consolidated=True,
storage_options=dict(token='anon'),
)
def ds_swaplon(ds):
return ds.assign_coords(longitude=(((ds.longitude + 180) % 360) - 180)).sortby('longitude')
era5_ds = ds_swaplon(era5_ds).rio.write_crs('EPSG:4326')
Preview data structure#
era5_ds
<xarray.Dataset> Size: 534TB
Dimensions: (time: 552264,
latitude: 721,
longitude: 1440,
level: 37)
Coordinates:
* latitude (latitude) float32 3kB ...
* level (level) int64 296B 1 .....
* time (time) datetime64[ns] 4MB ...
* longitude (longitude) float32 6kB ...
spatial_ref int64 8B 0
Data variables: (12/31)
10m_u_component_of_wind (time, latitude, longitude) float32 2TB dask.array<chunksize=(48, 721, 1440), meta=np.ndarray>
10m_v_component_of_wind (time, latitude, longitude) float32 2TB dask.array<chunksize=(48, 721, 1440), meta=np.ndarray>
2m_temperature (time, latitude, longitude) float32 2TB dask.array<chunksize=(48, 721, 1440), meta=np.ndarray>
angle_of_sub_gridscale_orography (latitude, longitude) float32 4MB dask.array<chunksize=(721, 1440), meta=np.ndarray>
anisotropy_of_sub_gridscale_orography (latitude, longitude) float32 4MB dask.array<chunksize=(721, 1440), meta=np.ndarray>
geopotential (time, level, latitude, longitude) float32 85TB dask.array<chunksize=(48, 37, 721, 1440), meta=np.ndarray>
... ...
total_precipitation (time, latitude, longitude) float32 2TB dask.array<chunksize=(48, 721, 1440), meta=np.ndarray>
type_of_high_vegetation (latitude, longitude) float32 4MB dask.array<chunksize=(721, 1440), meta=np.ndarray>
type_of_low_vegetation (latitude, longitude) float32 4MB dask.array<chunksize=(721, 1440), meta=np.ndarray>
u_component_of_wind (time, level, latitude, longitude) float32 85TB dask.array<chunksize=(48, 37, 721, 1440), meta=np.ndarray>
v_component_of_wind (time, level, latitude, longitude) float32 85TB dask.array<chunksize=(48, 37, 721, 1440), meta=np.ndarray>
vertical_velocity (time, level, latitude, longitude) float32 85TB dask.array<chunksize=(48, 37, 721, 1440), meta=np.ndarray>Visualize data#
f,axs=plt.subplots(1,2,figsize=(10,7),layout='compressed')
era5_ds['sea_ice_cover'].sel(time='2020-03-20T00').plot(ax=axs[0],cmap='Blues',add_colorbar=False)
era5_ds['sea_ice_cover'].sel(time='2020-09-22T00').plot(ax=axs[1],cmap='Blues')
for ax in axs:
world_gdf.boundary.plot(ax=ax,linewidth=1,color='black')
march20_seaice_extent_da = era5_ds['sea_ice_cover'].sel(time=((era5_ds.time.dt.month == 3) &
(era5_ds.time.dt.day == 20) &
(era5_ds.time.dt.hour == 1))
).rio.reproject('EPSG:6933')
sept22_seaice_extent_da = era5_ds['sea_ice_cover'].sel(time=((era5_ds.time.dt.month == 9) &
(era5_ds.time.dt.day == 22) &
(era5_ds.time.dt.hour == 1))
).rio.reproject('EPSG:6933')
march20_max_seaice_extent_NH_da = march20_seaice_extent_da.where(march20_seaice_extent_da.y>0).sum(dim=['x','y']).compute()
sept22_max_seaice_extent_SH_da = sept22_seaice_extent_da.where(sept22_seaice_extent_da.y<0).sum(dim=['x','y']).compute()
f,ax=plt.subplots()
march20_max_seaice_extent_NH_da.plot(ax=ax,label='March 20th Northern Hemisphere')
sept22_max_seaice_extent_SH_da.plot(ax=ax, label='September 22nd Southern Hemisphere')
ax.legend()
ax.set_title('Very rough estimate of max sea ice extent over time')
Text(0.5, 1.0, 'Very rough estimate of max sea ice extent over time')
That’s all, thanks for checking out this notebook!