Module 4: Raster Data & Analysis¶
Learning Goals¶
- Understand raster data structure and properties
- Load and inspect raster files with rasterio
- Perform raster calculations and statistics
- Clip rasters using vector boundaries
- Combine raster and vector data analysis
- Handle NoData values and data types
Introduction to Raster Data¶
Raster data represents continuous phenomena as a grid of cells (pixels), where each cell contains a value representing some measurement or classification. Think of it as a digital photograph where each pixel has a specific value instead of a color.
Key Raster Concepts¶
Grid Structure: Rasters are organized in rows and columns, creating a matrix of cells. Each cell represents a specific geographic area on Earth's surface.
Resolution: The size of each pixel, typically measured in meters. A 30m resolution means each pixel represents a 30m × 30m area on the ground.
Extent: The geographic boundary of the raster, defined by minimum and maximum X and Y coordinates.
Coordinate Reference System (CRS): Defines how the raster's coordinates relate to locations on Earth.
NoData Values: Special values (like -9999) that represent areas with no data or invalid measurements.
graph TD
A[Raster Data] --> B[Grid Structure]
A --> C[Cell Values]
A --> D[Spatial Reference]
B --> E[Rows & Columns]
B --> F[Resolution]
C --> G[Continuous Values]
C --> H[Categorical Values]
D --> I[Coordinate System]
D --> J[Extent/Bounds]
Common Raster Data Types¶
Digital Elevation Models (DEMs): Store terrain height values, essential for topographic analysis, watershed modeling, and visibility studies.
Satellite Imagery: Multispectral data capturing different wavelengths of light. Each band represents different portions of the electromagnetic spectrum:
- Visible bands (Red, Green, Blue): What human eyes can see
- Near-Infrared (NIR): Useful for vegetation analysis
- Thermal bands: Measure surface temperature
- Radar bands: Can penetrate clouds and work day/night
Climate Data: Temperature, precipitation, humidity measurements over time and space.
Population Density: Number of people per unit area, often derived from census data.
Land Cover/Land Use: Classification of Earth's surface (forest, urban, water, agriculture).
Raster vs Vector Data¶
| Aspect | Raster | Vector |
|---|---|---|
| Structure | Grid of cells | Points, lines, polygons |
| Best for | Continuous phenomena | Discrete objects |
| Storage | Fixed grid size | Variable based on complexity |
| Analysis | Mathematical operations | Geometric operations |
| Examples | Temperature, elevation | Roads, boundaries, buildings |
Understanding Raster Resolution¶
Spatial Resolution: The ground distance represented by each pixel
- High resolution (1-5m): Detailed analysis, small areas
- Medium resolution (10-30m): Regional studies, land cover mapping
- Low resolution (100m-1km): Global studies, climate modeling
Temporal Resolution: How often data is collected
- Daily: Weather data, some satellites
- Weekly/Monthly: Vegetation monitoring
- Annual: Land cover change detection
Spectral Resolution: Number and width of spectral bands
- Panchromatic: Single band (black and white)
- Multispectral: 3-10 bands (RGB + infrared)
- Hyperspectral: 100+ narrow bands
Raster File Formats¶
GeoTIFF (.tif/.tiff): Most common format, supports georeferencing and multiple bands
NetCDF (.nc): Excellent for scientific data with multiple dimensions (time, depth)
HDF (.hdf): Hierarchical format for complex scientific datasets
JPEG2000 (.jp2): Compressed format with good quality retention
ESRI Grid: Proprietary format used by ArcGIS
Raster Coordinate Systems¶
Geographic Coordinates: Latitude/longitude in degrees - Good for: Global datasets, web mapping - Units: Decimal degrees - Example: WGS84 (EPSG:4326)
Projected Coordinates: Flat map projections in linear units - Good for: Area calculations, distance measurements - Units: Meters, feet - Example: UTM zones, State Plane
Raster Data Quality Considerations¶
Accuracy: How close values are to true measurements
Precision: Consistency of repeated measurements
Completeness: Percentage of area covered (vs NoData)
Currency: How recent the data is
Lineage: Documentation of data sources and processing steps
Setting Up the Environment¶
import rasterio
import rasterio.plot
import rasterio.mask
import numpy as np
import matplotlib.pyplot as plt
import geopandas as gpd
from rasterio.warp import calculate_default_transform, reproject, Resampling
from rasterio.windows import from_bounds
import warnings
warnings.filterwarnings('ignore')
# Set up plotting
plt.style.use('default')
# Data directory path - change this if your TIFF files are in a different folder
DATA_DIR = 'data/'
# Load available raster data
dem_path = DATA_DIR + 'dem.tif'
b04_path = DATA_DIR + 'B04.tif' # Red band
b08_path = DATA_DIR + 'B08.tif' # NIR band
true_color_path = DATA_DIR + 'true_color.tiff'
print("=== LOADING RASTER DATA ===")
print(f"DEM: {dem_path}")
print(f"Red Band (B04): {b04_path}")
print(f"NIR Band (B08): {b08_path}")
print(f"True Color: {true_color_path}")
1. Understanding Raster Structure¶
Raster Components Deep Dive¶
graph LR
A[Raster Dataset] --> B[Metadata]
A --> C[Data Array]
B --> D[CRS]
B --> E[Transform]
B --> F[Dimensions]
B --> G[NoData Value]
C --> H[Pixel Values]
C --> I[Data Type]
Metadata contains crucial information about the raster:
- CRS (Coordinate Reference System): Defines the spatial reference
- Transform: Mathematical relationship between pixel coordinates and geographic coordinates
- Dimensions: Number of rows, columns, and bands
- NoData Value: Represents missing or invalid data
Data Array holds the actual pixel values:
- Data Type: Integer (int16, int32) or floating-point (float32, float64)
- Bit Depth: Determines value range (8-bit: 0-255, 16-bit: 0-65535)
- Signed vs Unsigned: Whether negative values are allowed
Understanding the Affine Transform¶
The affine transform is a 6-parameter mathematical transformation that converts pixel coordinates to geographic coordinates:
Where:
- a: Pixel width (x-direction)
- b: Row rotation (usually 0)
- c: X-coordinate of upper-left corner
- d: Column rotation (usually 0)
- e: Pixel height (y-direction, usually negative)
- f: Y-coordinate of upper-left corner
Pixel Indexing and Coordinates¶
Pixel Coordinates: Array indices (row, column) starting from (0,0) at top-left
Geographic Coordinates: Real-world coordinates (X, Y) in the raster's CRS
Center vs Corner: Pixels can be referenced by their center point or corner
Data Types and Memory Considerations¶
| Data Type | Range | Memory per Pixel | Best Use |
|---|---|---|---|
| uint8 | 0-255 | 1 byte | Classified data, RGB images |
| int16 | -32,768 to 32,767 | 2 bytes | Elevation, temperature |
| uint16 | 0-65,535 | 2 bytes | Satellite imagery |
| float32 | ±3.4 × 10³⁸ | 4 bytes | Calculated values, ratios |
| float64 | ±1.8 × 10³⁰⁸ | 8 bytes | High-precision calculations |
Loading Real Raster Data¶
# Load the DEM (Digital Elevation Model)
with rasterio.open(dem_path) as dem_src:
elevation_data = dem_src.read(1) # Read first band
dem_transform = dem_src.transform
dem_crs = dem_src.crs
dem_bounds = dem_src.bounds
dem_nodata = dem_src.nodata
print("=== DEM DATA PROPERTIES ===")
print(f"Shape: {elevation_data.shape}")
print(f"Data type: {elevation_data.dtype}")
print(f"CRS: {dem_crs}")
print(f"Bounds: {dem_bounds}")
print(f"NoData value: {dem_nodata}")
# Handle NoData values
if dem_nodata is not None:
elevation_masked = np.ma.masked_equal(elevation_data, dem_nodata)
print(f"Valid pixels: {elevation_masked.count():,}")
print(f"NoData pixels: {elevation_masked.mask.sum():,}")
# Use masked array for statistics
print(f"Min elevation: {elevation_masked.min():.1f}m")
print(f"Max elevation: {elevation_masked.max():.1f}m")
print(f"Mean elevation: {elevation_masked.mean():.1f}m")
else:
print(f"Min elevation: {elevation_data.min():.1f}m")
print(f"Max elevation: {elevation_data.max():.1f}m")
print(f"Mean elevation: {elevation_data.mean():.1f}m")
Raster Metadata and Transform¶
# Use the actual transform from the DEM
transform = dem_transform
bounds = dem_bounds
height, width = elevation_data.shape
print("=== RASTER SPATIAL PROPERTIES ===")
print(f"Bounds: {bounds}")
print(f"Transform: {transform}")
print(f"Pixel size X: {transform[0]:.4f}")
print(f"Pixel size Y: {abs(transform[4]):.4f}")
print(f"Width: {width} pixels")
print(f"Height: {height} pixels")
# Calculate pixel coordinates
def pixel_to_coord(row, col, transform):
"""Convert pixel coordinates to geographic coordinates"""
x = transform[0] * col + transform[1] * row + transform[2]
y = transform[3] * col + transform[4] * row + transform[5]
return x, y
# Example: center pixel coordinates
center_row, center_col = height // 2, width // 2
center_x, center_y = pixel_to_coord(center_row, center_col, transform)
print(f"Center pixel ({center_row}, {center_col}) = ({center_x:.2f}, {center_y:.2f})")
# Calculate extent for plotting
extent = [bounds.left, bounds.right, bounds.bottom, bounds.top]
print(f"Plot extent: {extent}")
2. Visualizing Raster Data¶
Basic Raster Visualization¶
# Create comprehensive visualization
fig, axes = plt.subplots(2, 2, figsize=(15, 12))
# 1. Basic elevation map
im1 = axes[0,0].imshow(elevation_data, cmap='terrain', extent=bounds)
axes[0,0].set_title('Elevation Map')
axes[0,0].set_xlabel('Longitude')
axes[0,0].set_ylabel('Latitude')
plt.colorbar(im1, ax=axes[0,0], label='Elevation (m)')
# 2. Hillshade effect
from matplotlib.colors import LightSource
ls = LightSource(azdeg=315, altdeg=45)
hillshade = ls.hillshade(elevation_data, vert_exag=2)
axes[0,1].imshow(hillshade, cmap='gray', extent=bounds)
axes[0,1].set_title('Hillshade')
axes[0,1].set_xlabel('Longitude')
axes[0,1].set_ylabel('Latitude')
# 3. Elevation histogram
axes[1,0].hist(elevation_data.flatten(), bins=50, alpha=0.7, color='brown')
axes[1,0].set_title('Elevation Distribution')
axes[1,0].set_xlabel('Elevation (m)')
axes[1,0].set_ylabel('Frequency')
axes[1,0].grid(True, alpha=0.3)
# 4. Contour lines
contour = axes[1,1].contour(elevation_data, levels=10, extent=bounds, colors='black', alpha=0.6)
axes[1,1].clabel(contour, inline=True, fontsize=8)
im4 = axes[1,1].imshow(elevation_data, cmap='terrain', extent=bounds, alpha=0.7)
axes[1,1].set_title('Elevation with Contours')
axes[1,1].set_xlabel('Longitude')
axes[1,1].set_ylabel('Latitude')
plt.tight_layout()
plt.show()
Advanced Visualization Techniques¶
# Create elevation classes
def classify_elevation(elevation, breaks):
"""Classify elevation into categories"""
classified = np.zeros_like(elevation, dtype=np.int32)
for i, break_val in enumerate(breaks[1:], 1):
classified[elevation <= break_val] = i
return classified
# Define elevation breaks
elevation_breaks = [0, 200, 500, 800, 1200, 2000]
elevation_classes = classify_elevation(elevation_data, elevation_breaks)
# Create custom colormap
colors = ['#2E8B57', '#9ACD32', '#DAA520', '#CD853F', '#A0522D', '#FFFFFF']
from matplotlib.colors import ListedColormap
custom_cmap = ListedColormap(colors[:len(elevation_breaks)-1])
# Plot classified elevation
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(15, 6))
# Original continuous elevation
im1 = ax1.imshow(elevation_data, cmap='terrain', extent=bounds)
ax1.set_title('Continuous Elevation')
plt.colorbar(im1, ax=ax1, label='Elevation (m)')
# Classified elevation
im2 = ax2.imshow(elevation_classes, cmap=custom_cmap, extent=bounds)
ax2.set_title('Classified Elevation')
cbar = plt.colorbar(im2, ax=ax2, ticks=range(len(elevation_breaks)-1))
cbar.set_ticklabels([f'{elevation_breaks[i]}-{elevation_breaks[i+1]}m'
for i in range(len(elevation_breaks)-1)])
plt.tight_layout()
plt.show()
3. Raster Calculations and Statistics¶
Basic Statistics¶
# Calculate comprehensive statistics
def raster_statistics(data, name="Raster"):
"""Calculate comprehensive raster statistics"""
# Remove any potential NoData values (assuming -9999)
valid_data = data[data != -9999]
stats = {
'count': valid_data.size,
'min': valid_data.min(),
'max': valid_data.max(),
'mean': valid_data.mean(),
'median': np.median(valid_data),
'std': valid_data.std(),
'range': valid_data.max() - valid_data.min()
}
print(f"=== {name.upper()} STATISTICS ===")
for key, value in stats.items():
if key == 'count':
print(f"{key.capitalize()}: {value:,}")
else:
print(f"{key.capitalize()}: {value:.2f}")
return stats
# Calculate statistics for our elevation data
elev_stats = raster_statistics(elevation_data, "Elevation")
# Percentiles
percentiles = [10, 25, 50, 75, 90, 95, 99]
elev_percentiles = np.percentile(elevation_data, percentiles)
print(f"\n=== ELEVATION PERCENTILES ===")
for p, val in zip(percentiles, elev_percentiles):
print(f"{p}th percentile: {val:.1f}m")
Raster Math Operations¶
# Demonstrate various raster calculations
print("=== RASTER CALCULATIONS ===")
# 1. Convert elevation to feet
elevation_feet = elevation_data * 3.28084
print(f"Elevation in feet - Min: {elevation_feet.min():.1f}, Max: {elevation_feet.max():.1f}")
# 2. Calculate slope (simplified)
def calculate_slope(elevation, pixel_size):
"""Calculate slope using gradient"""
grad_y, grad_x = np.gradient(elevation, pixel_size)
slope_radians = np.arctan(np.sqrt(grad_x**2 + grad_y**2))
slope_degrees = np.degrees(slope_radians)
return slope_degrees
pixel_size = abs(transform[0]) # Assuming square pixels
slope_data = calculate_slope(elevation_data, pixel_size)
print(f"Slope - Min: {slope_data.min():.1f}°, Max: {slope_data.max():.1f}°, Mean: {slope_data.mean():.1f}°")
# 3. Create binary mask (high elevation areas)
high_elevation_mask = elevation_data > np.percentile(elevation_data, 75)
high_elevation_area = np.sum(high_elevation_mask) * (pixel_size ** 2)
print(f"High elevation areas (>75th percentile): {high_elevation_area:.2f} square units")
# 4. Normalize elevation (0-1 scale)
elevation_normalized = (elevation_data - elevation_data.min()) / (elevation_data.max() - elevation_data.min())
print(f"Normalized elevation - Min: {elevation_normalized.min():.3f}, Max: {elevation_normalized.max():.3f}")
Visualizing Calculated Rasters¶
# Visualize calculated rasters
fig, axes = plt.subplots(2, 2, figsize=(15, 12))
# Original elevation
im1 = axes[0,0].imshow(elevation_data, cmap='terrain', extent=bounds)
axes[0,0].set_title('Original Elevation')
plt.colorbar(im1, ax=axes[0,0], label='Elevation (m)')
# Slope
im2 = axes[0,1].imshow(slope_data, cmap='Reds', extent=bounds)
axes[0,1].set_title('Slope')
plt.colorbar(im2, ax=axes[0,1], label='Slope (degrees)')
# High elevation mask
im3 = axes[1,0].imshow(high_elevation_mask, cmap='RdYlBu_r', extent=bounds)
axes[1,0].set_title('High Elevation Areas (>75th percentile)')
plt.colorbar(im3, ax=axes[1,0], label='High Elevation')
# Normalized elevation
im4 = axes[1,1].imshow(elevation_normalized, cmap='viridis', extent=bounds)
axes[1,1].set_title('Normalized Elevation (0-1)')
plt.colorbar(im4, ax=axes[1,1], label='Normalized Value')
plt.tight_layout()
plt.show()
4. Working with Vector Boundaries¶
Creating Sample Vector Data¶
# Create sample vector boundaries for clipping
from shapely.geometry import Polygon, Point
import geopandas as gpd
# Create sample study areas
study_areas = [
{'name': 'Area A', 'geometry': Polygon([(-5, -5), (0, -5), (0, 0), (-5, 0), (-5, -5)])},
{'name': 'Area B', 'geometry': Polygon([(2, 2), (8, 2), (8, 8), (2, 8), (2, 2)])},
{'name': 'Circular Area', 'geometry': Point(3, -3).buffer(3)}
]
study_gdf = gpd.GeoDataFrame(study_areas, crs='EPSG:4326')
print("=== STUDY AREAS ===")
print(study_gdf)
# Visualize study areas with elevation
fig, ax = plt.subplots(1, 1, figsize=(12, 10))
# Plot elevation as background
im = ax.imshow(elevation_data, cmap='terrain', extent=bounds, alpha=0.7)
plt.colorbar(im, ax=ax, label='Elevation (m)')
# Plot study areas
study_gdf.plot(ax=ax, facecolor='none', edgecolor='red', linewidth=3, alpha=0.8)
# Add labels
for idx, row in study_gdf.iterrows():
centroid = row.geometry.centroid
ax.annotate(row['name'], (centroid.x, centroid.y),
ha='center', va='center', fontsize=12, fontweight='bold',
bbox=dict(boxstyle='round,pad=0.3', facecolor='white', alpha=0.8))
ax.set_title('Study Areas on Elevation Map')
ax.set_xlabel('Longitude')
ax.set_ylabel('Latitude')
plt.tight_layout()
plt.show()
Clipping Rasters with Vector Boundaries¶
def clip_raster_with_polygon(raster_data, transform, polygon, bounds):
"""Clip raster data using a polygon boundary"""
from rasterio.features import geometry_mask
# Create mask from polygon
mask = geometry_mask([polygon], transform=transform,
invert=True, out_shape=raster_data.shape)
# Apply mask
clipped_data = raster_data.copy()
clipped_data[~mask] = np.nan # Set areas outside polygon to NaN
return clipped_data, mask
# Clip elevation data for each study area
clipped_results = {}
for idx, area in study_gdf.iterrows():
clipped_elev, mask = clip_raster_with_polygon(
elevation_data, transform, area.geometry, bounds
)
clipped_results[area['name']] = {
'data': clipped_elev,
'mask': mask,
'geometry': area.geometry
}
# Visualize clipped results
fig, axes = plt.subplots(2, 2, figsize=(15, 12))
axes = axes.flatten()
# Original elevation
im0 = axes[0].imshow(elevation_data, cmap='terrain', extent=bounds)
study_gdf.plot(ax=axes[0], facecolor='none', edgecolor='red', linewidth=2)
axes[0].set_title('Original Elevation with Study Areas')
plt.colorbar(im0, ax=axes[0], label='Elevation (m)')
# Clipped areas
for i, (name, result) in enumerate(clipped_results.items(), 1):
if i < 4: # Only plot first 3 clipped areas
im = axes[i].imshow(result['data'], cmap='terrain', extent=bounds)
axes[i].set_title(f'Clipped Elevation - {name}')
plt.colorbar(im, ax=axes[i], label='Elevation (m)')
plt.tight_layout()
plt.show()
Statistics for Clipped Areas¶
# Calculate statistics for each clipped area
print("=== CLIPPED AREA STATISTICS ===")
for name, result in clipped_results.items():
clipped_data = result['data']
valid_data = clipped_data[~np.isnan(clipped_data)]
if len(valid_data) > 0:
print(f"\n{name}:")
print(f" Valid pixels: {len(valid_data):,}")
print(f" Min elevation: {valid_data.min():.1f}m")
print(f" Max elevation: {valid_data.max():.1f}m")
print(f" Mean elevation: {valid_data.mean():.1f}m")
print(f" Std deviation: {valid_data.std():.1f}m")
# Calculate area (assuming each pixel represents area)
pixel_area = (pixel_size ** 2)
total_area = len(valid_data) * pixel_area
print(f" Total area: {total_area:.2f} square units")
else:
print(f"\n{name}: No valid data")
5. Raster-Vector Integration¶
Extracting Values at Points¶
# Create sample point locations
sample_points = [
{'name': 'Peak A', 'geometry': Point(-2, 3)},
{'name': 'Valley B', 'geometry': Point(4, -2)},
{'name': 'Ridge C', 'geometry': Point(-6, -1)},
{'name': 'Plain D', 'geometry': Point(7, 5)}
]
points_gdf = gpd.GeoDataFrame(sample_points, crs='EPSG:4326')
def extract_raster_values_at_points(raster_data, transform, points_gdf):
"""Extract raster values at point locations"""
from rasterio.transform import rowcol
extracted_values = []
for idx, point in points_gdf.iterrows():
# Convert geographic coordinates to pixel coordinates
row, col = rowcol(transform, point.geometry.x, point.geometry.y)
# Check if point is within raster bounds
if 0 <= row < raster_data.shape[0] and 0 <= col < raster_data.shape[1]:
value = raster_data[row, col]
extracted_values.append(value)
else:
extracted_values.append(np.nan)
return extracted_values
# Extract elevation values at points
elevation_at_points = extract_raster_values_at_points(elevation_data, transform, points_gdf)
slope_at_points = extract_raster_values_at_points(slope_data, transform, points_gdf)
# Add values to GeoDataFrame
points_gdf['elevation'] = elevation_at_points
points_gdf['slope'] = slope_at_points
print("=== ELEVATION AND SLOPE AT SAMPLE POINTS ===")
print(points_gdf[['name', 'elevation', 'slope']])
# Visualize points on raster
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(15, 6))
# Elevation with points
im1 = ax1.imshow(elevation_data, cmap='terrain', extent=bounds)
points_gdf.plot(ax=ax1, color='red', markersize=100, marker='*', edgecolor='black', linewidth=1)
for idx, point in points_gdf.iterrows():
ax1.annotate(f"{point['name']}\n{point['elevation']:.0f}m",
(point.geometry.x, point.geometry.y),
xytext=(10, 10), textcoords='offset points',
bbox=dict(boxstyle='round,pad=0.3', facecolor='white', alpha=0.8),
fontsize=8)
ax1.set_title('Elevation Values at Sample Points')
plt.colorbar(im1, ax=ax1, label='Elevation (m)')
# Slope with points
im2 = ax2.imshow(slope_data, cmap='Reds', extent=bounds)
points_gdf.plot(ax=ax2, color='blue', markersize=100, marker='*', edgecolor='black', linewidth=1)
for idx, point in points_gdf.iterrows():
ax2.annotate(f"{point['name']}\n{point['slope']:.1f}°",
(point.geometry.x, point.geometry.y),
xytext=(10, 10), textcoords='offset points',
bbox=dict(boxstyle='round,pad=0.3', facecolor='white', alpha=0.8),
fontsize=8)
ax2.set_title('Slope Values at Sample Points')
plt.colorbar(im2, ax=ax2, label='Slope (degrees)')
plt.tight_layout()
plt.show()
Zonal Statistics¶
def calculate_zonal_statistics(raster_data, zones_gdf, transform):
"""Calculate statistics for each zone (polygon)"""
from rasterio.features import geometry_mask
results = []
for idx, zone in zones_gdf.iterrows():
# Create mask for this zone
mask = geometry_mask([zone.geometry], transform=transform,
invert=True, out_shape=raster_data.shape)
# Extract values within zone
zone_values = raster_data[mask]
zone_values = zone_values[~np.isnan(zone_values)] # Remove NaN values
if len(zone_values) > 0:
stats = {
'zone_name': zone.get('name', f'Zone_{idx}'),
'count': len(zone_values),
'min': zone_values.min(),
'max': zone_values.max(),
'mean': zone_values.mean(),
'median': np.median(zone_values),
'std': zone_values.std(),
'sum': zone_values.sum()
}
else:
stats = {
'zone_name': zone.get('name', f'Zone_{idx}'),
'count': 0,
'min': np.nan,
'max': np.nan,
'mean': np.nan,
'median': np.nan,
'std': np.nan,
'sum': np.nan
}
results.append(stats)
return pd.DataFrame(results)
# Calculate zonal statistics for elevation
zonal_stats = calculate_zonal_statistics(elevation_data, study_gdf, transform)
print("=== ZONAL STATISTICS FOR ELEVATION ===")
print(zonal_stats.round(2))
# Calculate zonal statistics for slope
slope_zonal_stats = calculate_zonal_statistics(slope_data, study_gdf, transform)
print("\n=== ZONAL STATISTICS FOR SLOPE ===")
print(slope_zonal_stats.round(2))
7. Multi-band Raster Analysis¶
NDVI Calculation from Two Bands¶
NDVI (Normalized Difference Vegetation Index) is calculated using Near-Infrared (NIR) and Red bands to assess vegetation health.
# Load the satellite bands
with rasterio.open(b08_path) as nir_src:
nir_data = nir_src.read(1).astype(float) # NIR band (B08)
nir_transform = nir_src.transform
nir_bounds = nir_src.bounds
nir_nodata = nir_src.nodata
with rasterio.open(b04_path) as red_src:
red_data = red_src.read(1).astype(float) # Red band (B04)
red_transform = red_src.transform
red_bounds = red_src.bounds
red_nodata = red_src.nodata
print("=== SATELLITE BAND DATA ===")
print(f"NIR Band shape: {nir_data.shape}")
print(f"Red Band shape: {red_data.shape}")
print(f"NIR range: {nir_data.min():.0f} - {nir_data.max():.0f}")
print(f"Red range: {red_data.min():.0f} - {red_data.max():.0f}")
print(f"NIR NoData: {nir_nodata}")
print(f"Red NoData: {red_nodata}")
# Handle NoData values and convert to reflectance (0-1 scale)
if nir_nodata is not None:
nir_data = np.where(nir_data == nir_nodata, np.nan, nir_data)
if red_nodata is not None:
red_data = np.where(red_data == red_nodata, np.nan, red_data)
# Convert to reflectance if needed (assuming values are in 0-10000 range)
if nir_data.max() > 1:
nir_data = nir_data / 10000.0
if red_data.max() > 1:
red_data = red_data / 10000.0
print(f"NIR reflectance range: {np.nanmin(nir_data):.3f} - {np.nanmax(nir_data):.3f}")
print(f"Red reflectance range: {np.nanmin(red_data):.3f} - {np.nanmax(red_data):.3f}")
# Calculate NDVI
def calculate_ndvi(nir, red):
"""Calculate NDVI from NIR and Red bands"""
# Avoid division by zero
denominator = nir + red
ndvi = np.where(denominator != 0, (nir - red) / denominator, np.nan)
return ndvi
ndvi = calculate_ndvi(nir_data, red_data)
print(f"\nNDVI range: {np.nanmin(ndvi):.3f} - {np.nanmax(ndvi):.3f}")
print(f"Mean NDVI: {np.nanmean(ndvi):.3f}")
# Calculate extent for plotting
band_extent = [nir_bounds.left, nir_bounds.right, nir_bounds.bottom, nir_bounds.top]
# Visualize bands and NDVI
fig, axes = plt.subplots(2, 2, figsize=(15, 12))
# NIR Band
im1 = axes[0,0].imshow(nir_data, cmap='Reds', extent=band_extent)
axes[0,0].set_title('NIR Band (B08)')
plt.colorbar(im1, ax=axes[0,0], label='Reflectance')
# Red Band
im2 = axes[0,1].imshow(red_data, cmap='Reds', extent=band_extent)
axes[0,1].set_title('Red Band (B04)')
plt.colorbar(im2, ax=axes[0,1], label='Reflectance')
# NDVI
im3 = axes[1,0].imshow(ndvi, cmap='RdYlGn', vmin=-1, vmax=1, extent=band_extent)
axes[1,0].set_title('NDVI (Vegetation Index)')
plt.colorbar(im3, ax=axes[1,0], label='NDVI Value')
# NDVI Classification
ndvi_classes = np.where(ndvi < 0, 0, # Water/bare soil
np.where(ndvi < 0.2, 1, # Low vegetation
np.where(ndvi < 0.5, 2, # Moderate vegetation
3))) # High vegetation
class_colors = ['blue', 'brown', 'yellow', 'green']
class_labels = ['Water/Bare', 'Low Veg', 'Moderate Veg', 'High Veg']
from matplotlib.colors import ListedColormap
class_cmap = ListedColormap(class_colors)
im4 = axes[1,1].imshow(ndvi_classes, cmap=class_cmap, extent=band_extent)
axes[1,1].set_title('NDVI Classification')
cbar = plt.colorbar(im4, ax=axes[1,1], ticks=[0, 1, 2, 3])
cbar.set_ticklabels(class_labels)
plt.tight_layout()
plt.show()
# NDVI Statistics
print("\n=== NDVI ANALYSIS ===")
for i, label in enumerate(class_labels):
pixel_count = np.sum(ndvi_classes == i)
percentage = (pixel_count / ndvi_classes.size) * 100
print(f"{label}: {pixel_count:,} pixels ({percentage:.1f}%)")
RGB Visualization from Three Bands¶
True Color Composite uses Red, Green, and Blue bands to create natural-looking images.
# Load the true color composite
with rasterio.open(true_color_path) as rgb_src:
rgb_data = rgb_src.read() # Read all bands
rgb_transform = rgb_src.transform
rgb_bounds = rgb_src.bounds
rgb_nodata = rgb_src.nodata
print("=== TRUE COLOR COMPOSITE DATA ===")
print(f"RGB data shape: {rgb_data.shape}")
print(f"Number of bands: {rgb_data.shape[0]}")
print(f"Image dimensions: {rgb_data.shape[1]} x {rgb_data.shape[2]}")
print(f"Data type: {rgb_data.dtype}")
print(f"NoData value: {rgb_nodata}")
# Handle the data for visualization
if rgb_data.shape[0] >= 3:
# Extract RGB bands (assuming order is R, G, B)
red_band = rgb_data[0].astype(float)
green_band = rgb_data[1].astype(float)
blue_band = rgb_data[2].astype(float)
# Handle NoData values
if rgb_nodata is not None:
red_band = np.where(red_band == rgb_nodata, np.nan, red_band)
green_band = np.where(green_band == rgb_nodata, np.nan, green_band)
blue_band = np.where(blue_band == rgb_nodata, np.nan, blue_band)
print(f"Red band range: {np.nanmin(red_band):.0f} - {np.nanmax(red_band):.0f}")
print(f"Green band range: {np.nanmin(green_band):.0f} - {np.nanmax(green_band):.0f}")
print(f"Blue band range: {np.nanmin(blue_band):.0f} - {np.nanmax(blue_band):.0f}")
# Create RGB composite
def create_rgb_composite(red, green, blue, enhance=True):
"""Create RGB composite from three bands"""
# Normalize to 0-1 range if needed
if red.max() > 1:
red = red / np.nanmax(red)
if green.max() > 1:
green = green / np.nanmax(green)
if blue.max() > 1:
blue = blue / np.nanmax(blue)
# Stack bands into 3D array
rgb = np.dstack((red, green, blue))
if enhance:
# Enhance contrast (stretch to full 0-1 range)
for i in range(3):
band = rgb[:, :, i]
band_min, band_max = np.nanmin(band), np.nanmax(band)
if band_max > band_min:
rgb[:, :, i] = (band - band_min) / (band_max - band_min)
return rgb
# Create composites
rgb_natural = create_rgb_composite(red_band, green_band, blue_band, enhance=False)
rgb_enhanced = create_rgb_composite(red_band, green_band, blue_band, enhance=True)
# Calculate extent for plotting
rgb_extent = [rgb_bounds.left, rgb_bounds.right, rgb_bounds.bottom, rgb_bounds.top]
# Visualize individual bands and composites
fig, axes = plt.subplots(2, 3, figsize=(18, 12))
# Individual bands
im1 = axes[0,0].imshow(red_band, cmap='Reds', extent=rgb_extent)
axes[0,0].set_title('Red Band')
plt.colorbar(im1, ax=axes[0,0], label='Digital Number')
im2 = axes[0,1].imshow(green_band, cmap='Greens', extent=rgb_extent)
axes[0,1].set_title('Green Band')
plt.colorbar(im2, ax=axes[0,1], label='Digital Number')
im3 = axes[0,2].imshow(blue_band, cmap='Blues', extent=rgb_extent)
axes[0,2].set_title('Blue Band')
plt.colorbar(im3, ax=axes[0,2], label='Digital Number')
# RGB composites
axes[1,0].imshow(rgb_natural, extent=rgb_extent)
axes[1,0].set_title('Natural Color Composite')
axes[1,1].imshow(rgb_enhanced, extent=rgb_extent)
axes[1,1].set_title('Enhanced Color Composite')
# False color composite (NIR-Red-Green) if NIR data is available
if 'nir_data' in locals() and nir_data.shape == red_band.shape:
false_color = create_rgb_composite(nir_data, red_band, green_band)
axes[1,2].imshow(false_color, extent=rgb_extent)
axes[1,2].set_title('False Color Composite\n(NIR-Red-Green)')
else:
axes[1,2].text(0.5, 0.5, 'NIR data not available\nfor false color composite',
ha='center', va='center', transform=axes[1,2].transAxes)
axes[1,2].set_title('False Color Composite\n(Not Available)')
plt.tight_layout()
plt.show()
# Band statistics
print("\n=== BAND STATISTICS ===")
bands = {'Red': red_band, 'Green': green_band, 'Blue': blue_band}
if 'nir_data' in locals():
bands['NIR'] = nir_data
for band_name, band_data in bands.items():
print(f"{band_name} Band:")
print(f" Mean: {np.nanmean(band_data):.1f}")
print(f" Std: {np.nanstd(band_data):.1f}")
print(f" Min: {np.nanmin(band_data):.1f}")
print(f" Max: {np.nanmax(band_data):.1f}")
else:
print("Error: True color composite does not have enough bands for RGB visualization")
8. Handling NoData Values¶
# Create sample data with NoData values
elevation_with_nodata = elevation_data.copy()
# Introduce some NoData areas (simulate missing data)
# Set random patches to NoData value (-9999)
nodata_value = -9999
np.random.seed(42) # For reproducible results
# Create random NoData patches
for _ in range(5):
center_row = np.random.randint(20, elevation_data.shape[0] - 20)
center_col = np.random.randint(20, elevation_data.shape[1] - 20)
size = np.random.randint(5, 15)
elevation_with_nodata[
center_row-size:center_row+size,
center_col-size:center_col+size
] = nodata_value
print("=== NODATA HANDLING ===")
print(f"Original data shape: {elevation_data.shape}")
print(f"NoData value: {nodata_value}")
print(f"Number of NoData pixels: {np.sum(elevation_with_nodata == nodata_value)}")
print(f"Percentage NoData: {np.sum(elevation_with_nodata == nodata_value) / elevation_with_nodata.size * 100:.1f}%")
# Create masked array for proper handling
elevation_masked = np.ma.masked_equal(elevation_with_nodata, nodata_value)
print(f"Valid data points: {elevation_masked.count()}")
print(f"Masked data points: {elevation_masked.mask.sum()}")
NoData Visualization and Statistics¶
# Visualize data with NoData
fig, axes = plt.subplots(1, 3, figsize=(18, 6))
# Original data
im1 = axes[0].imshow(elevation_data, cmap='terrain', extent=bounds)
axes[0].set_title('Original Elevation Data')
plt.colorbar(im1, ax=axes[0], label='Elevation (m)')
# Data with NoData (showing NoData as white)
elevation_display = elevation_with_nodata.copy()
elevation_display[elevation_display == nodata_value] = np.nan
im2 = axes[1].imshow(elevation_display, cmap='terrain', extent=bounds)
axes[1].set_title('Elevation Data with NoData (white areas)')
plt.colorbar(im2, ax=axes[1], label='Elevation (m)')
# NoData mask only
nodata_mask = elevation_with_nodata == nodata_value
im3 = axes[2].imshow(nodata_mask, cmap='RdYlBu_r', extent=bounds)
axes[2].set_title('NoData Mask (red = NoData)')
plt.colorbar(im3, ax=axes[2], label='NoData')
plt.tight_layout()
plt.show()
# Statistics comparison
print("\n=== STATISTICS COMPARISON ===")
print("Original data:")
raster_statistics(elevation_data, "Original")
print("\nData with NoData (excluding NoData values):")
valid_data = elevation_with_nodata[elevation_with_nodata != nodata_value]
raster_statistics(valid_data, "Valid Data Only")
Filling NoData Values¶
# Demonstrate different NoData filling methods
from scipy import ndimage
def fill_nodata_methods(data, nodata_value):
"""Demonstrate different methods to fill NoData values"""
# Create mask
mask = data == nodata_value
filled_data = data.copy().astype(float)
filled_data[mask] = np.nan
methods = {}
# Method 1: Fill with mean
mean_filled = filled_data.copy()
mean_value = np.nanmean(filled_data)
mean_filled[np.isnan(mean_filled)] = mean_value
methods['Mean Fill'] = mean_filled
# Method 2: Fill with median
median_filled = filled_data.copy()
median_value = np.nanmedian(filled_data)
median_filled[np.isnan(median_filled)] = median_value
methods['Median Fill'] = median_filled
# Method 3: Interpolation (simple nearest neighbor)
from scipy.interpolate import griddata
# Get coordinates of valid and invalid points
rows, cols = np.mgrid[0:data.shape[0], 0:data.shape[1]]
valid_mask = ~np.isnan(filled_data)
if np.sum(valid_mask) > 0:
# Interpolate
interpolated = griddata(
(rows[valid_mask], cols[valid_mask]),
filled_data[valid_mask],
(rows, cols),
method='nearest'
)
methods['Nearest Neighbor'] = interpolated
return methods
# Apply different filling methods
filled_methods = fill_nodata_methods(elevation_with_nodata, nodata_value)
# Visualize different filling methods
fig, axes = plt.subplots(2, 2, figsize=(15, 12))
axes = axes.flatten()
# Original with NoData
elevation_display = elevation_with_nodata.copy()
elevation_display[elevation_display == nodata_value] = np.nan
im0 = axes[0].imshow(elevation_display, cmap='terrain', extent=bounds)
axes[0].set_title('Original with NoData')
plt.colorbar(im0, ax=axes[0], label='Elevation (m)')
# Different filling methods
for i, (method_name, filled_data) in enumerate(filled_methods.items(), 1):
if i < 4:
im = axes[i].imshow(filled_data, cmap='terrain', extent=bounds)
axes[i].set_title(f'{method_name}')
plt.colorbar(im, ax=axes[i], label='Elevation (m)')
plt.tight_layout()
plt.show()
Practice Problems¶
Problem 1: Elevation Analysis¶
Perform comprehensive elevation analysis:
# TODO:
# 1. Create elevation zones (low, medium, high)
# 2. Calculate the area of each elevation zone
# 3. Find the steepest areas (top 10% slope)
# 4. Create a suitability map combining elevation and slope
# 5. Generate summary statistics
# Your code here
Solution
# 1. Create elevation zones
elevation_percentiles = np.percentile(elevation_data, [33, 67])
elevation_zones = np.zeros_like(elevation_data, dtype=int)
elevation_zones[elevation_data <= elevation_percentiles[0]] = 1 # Low
elevation_zones[(elevation_data > elevation_percentiles[0]) &
(elevation_data <= elevation_percentiles[1])] = 2 # Medium
elevation_zones[elevation_data > elevation_percentiles[1]] = 3 # High
# 2. Calculate area of each zone
pixel_area = (pixel_size ** 2)
zone_areas = {}
zone_names = {1: 'Low', 2: 'Medium', 3: 'High'}
for zone_id, zone_name in zone_names.items():
zone_pixels = np.sum(elevation_zones == zone_id)
zone_area = zone_pixels * pixel_area
zone_areas[zone_name] = zone_area
print(f"{zone_name} elevation zone: {zone_pixels:,} pixels, {zone_area:.2f} sq units")
# 3. Find steepest areas (top 10% slope)
slope_threshold = np.percentile(slope_data, 90)
steep_areas = slope_data > slope_threshold
steep_area_total = np.sum(steep_areas) * pixel_area
print(f"\nSteepest areas (>{slope_threshold:.1f}°): {steep_area_total:.2f} sq units")
# 4. Create suitability map (example: suitable = medium elevation + gentle slope)
gentle_slope = slope_data < np.percentile(slope_data, 50) # Bottom 50% slope
medium_elevation = elevation_zones == 2
suitability = (medium_elevation & gentle_slope).astype(int)
suitable_area = np.sum(suitability) * pixel_area
print(f"Suitable areas (medium elevation + gentle slope): {suitable_area:.2f} sq units")
# 5. Visualize results
fig, axes = plt.subplots(2, 2, figsize=(15, 12))
# Elevation zones
zone_colors = ['white', 'green', 'yellow', 'red']
zone_cmap = ListedColormap(zone_colors)
im1 = axes[0,0].imshow(elevation_zones, cmap=zone_cmap, extent=bounds)
axes[0,0].set_title('Elevation Zones')
# Steep areas
im2 = axes[0,1].imshow(steep_areas, cmap='Reds', extent=bounds)
axes[0,1].set_title('Steepest Areas (Top 10%)')
# Suitability map
im3 = axes[1,0].imshow(suitability, cmap='RdYlGn', extent=bounds)
axes[1,0].set_title('Suitability Map')
# Combined visualization
axes[1,1].imshow(elevation_data, cmap='terrain', extent=bounds, alpha=0.7)
axes[1,1].contour(slope_data, levels=[slope_threshold], colors='red', linewidths=2)
axes[1,1].set_title('Elevation with Steep Area Contours')
plt.tight_layout()
plt.show()
Problem 2: Multi-Criteria Analysis¶
Combine multiple raster layers for analysis:
# TODO:
# 1. Create a temperature raster (simulate climate data)
# 2. Create a distance-to-water raster (simulate accessibility)
# 3. Combine elevation, slope, temperature, and distance for habitat suitability
# 4. Apply different weights to each factor
# 5. Identify the top 20% most suitable areas
# Your code here
Solution
# 1. Create temperature raster (simulate)
x = np.linspace(-10, 10, elevation_data.shape[1])
y = np.linspace(-10, 10, elevation_data.shape[0])
X, Y = np.meshgrid(x, y)
# Temperature decreases with elevation and varies with latitude
temperature_data = (
25 - (elevation_data / 100) + # Temperature lapse rate
5 * np.sin(Y / 5) + # Latitudinal variation
3 * np.random.random(elevation_data.shape) # Random variation
)
# 2. Create distance-to-water raster (simulate rivers/lakes)
# Create some "water bodies" as low elevation areas
water_bodies = elevation_data < np.percentile(elevation_data, 20)
# Calculate distance to nearest water
from scipy.ndimage import distance_transform_edt
distance_to_water = distance_transform_edt(~water_bodies) * pixel_size
print("=== SIMULATED DATA CREATED ===")
print(f"Temperature range: {temperature_data.min():.1f}°C to {temperature_data.max():.1f}°C")
print(f"Distance to water range: {distance_to_water.min():.1f} to {distance_to_water.max():.1f} units")
# 3. Normalize all factors to 0-1 scale for combination
def normalize_raster(data):
return (data - data.min()) / (data.max() - data.min())
# Normalize factors (some need to be inverted for suitability)
elev_norm = 1 - normalize_raster(np.abs(elevation_data - np.median(elevation_data))) # Prefer medium elevation
slope_norm = 1 - normalize_raster(slope_data) # Prefer gentle slopes
temp_norm = 1 - normalize_raster(np.abs(temperature_data - 20)) # Prefer ~20°C
water_norm = 1 - normalize_raster(distance_to_water) # Prefer close to water
# 4. Apply weights and combine
weights = {
'elevation': 0.3,
'slope': 0.3,
'temperature': 0.2,
'water_distance': 0.2
}
habitat_suitability = (
weights['elevation'] * elev_norm +
weights['slope'] * slope_norm +
weights['temperature'] * temp_norm +
weights['water_distance'] * water_norm
)
# 5. Identify top 20% most suitable areas
suitability_threshold = np.percentile(habitat_suitability, 80)
top_suitable = habitat_suitability > suitability_threshold
suitable_area = np.sum(top_suitable) * pixel_area
total_area = habitat_suitability.size * pixel_area
print(f"\n=== HABITAT SUITABILITY RESULTS ===")
print(f"Suitability threshold (80th percentile): {suitability_threshold:.3f}")
print(f"Top suitable area: {suitable_area:.2f} sq units ({suitable_area/total_area*100:.1f}% of total)")
# Visualize all factors and result
fig, axes = plt.subplots(3, 2, figsize=(15, 18))
# Individual factors
im1 = axes[0,0].imshow(elevation_data, cmap='terrain', extent=bounds)
axes[0,0].set_title('Elevation')
plt.colorbar(im1, ax=axes[0,0])
im2 = axes[0,1].imshow(slope_data, cmap='Reds', extent=bounds)
axes[0,1].set_title('Slope')
plt.colorbar(im2, ax=axes[0,1])
im3 = axes[1,0].imshow(temperature_data, cmap='coolwarm', extent=bounds)
axes[1,0].set_title('Temperature')
plt.colorbar(im3, ax=axes[1,0])
im4 = axes[1,1].imshow(distance_to_water, cmap='Blues_r', extent=bounds)
axes[1,1].set_title('Distance to Water')
plt.colorbar(im4, ax=axes[1,1])
# Combined suitability
im5 = axes[2,0].imshow(habitat_suitability, cmap='RdYlGn', extent=bounds)
axes[2,0].set_title('Habitat Suitability (Combined)')
plt.colorbar(im5, ax=axes[2,0])
# Top suitable areas
im6 = axes[2,1].imshow(top_suitable, cmap='RdYlGn', extent=bounds)
axes[2,1].set_title('Top 20% Most Suitable Areas')
plt.colorbar(im6, ax=axes[2,1])
plt.tight_layout()
plt.show()
Problem 3: Raster Time Series Analysis¶
Simulate and analyze change over time:
# TODO:
# 1. Create 3 time periods of elevation data (simulate erosion/deposition)
# 2. Calculate elevation change between periods
# 3. Identify areas of significant change
# 4. Calculate change statistics
# 5. Visualize the temporal changes
# Your code here
Solution
# 1. Create time series data (simulate erosion/deposition)
np.random.seed(42)
# Base elevation (time 1)
elevation_t1 = elevation_data.copy()
# Time 2: Add some erosion and deposition
erosion_areas = slope_data > np.percentile(slope_data, 70) # Steep areas erode
deposition_areas = slope_data < np.percentile(slope_data, 30) # Flat areas accumulate
elevation_t2 = elevation_t1.copy()
elevation_t2[erosion_areas] -= np.random.uniform(5, 20, np.sum(erosion_areas))
elevation_t2[deposition_areas] += np.random.uniform(2, 10, np.sum(deposition_areas))
# Time 3: Continue the process
elevation_t3 = elevation_t2.copy()
elevation_t3[erosion_areas] -= np.random.uniform(3, 15, np.sum(erosion_areas))
elevation_t3[deposition_areas] += np.random.uniform(1, 8, np.sum(deposition_areas))
# 2. Calculate elevation changes
change_t1_t2 = elevation_t2 - elevation_t1
change_t2_t3 = elevation_t3 - elevation_t2
change_total = elevation_t3 - elevation_t1
# 3. Identify significant changes (>10m change)
significant_change = np.abs(change_total) > 10
erosion_areas_final = change_total < -10
deposition_areas_final = change_total > 10
# 4. Calculate statistics
print("=== ELEVATION CHANGE ANALYSIS ===")
print(f"Period 1-2 change: {change_t1_t2.min():.1f} to {change_t1_t2.max():.1f}m")
print(f"Period 2-3 change: {change_t2_t3.min():.1f} to {change_t2_t3.max():.1f}m")
print(f"Total change: {change_total.min():.1f} to {change_total.max():.1f}m")
print(f"\nAreas with significant change (>10m): {np.sum(significant_change):,} pixels")
print(f"Erosion areas (>10m loss): {np.sum(erosion_areas_final):,} pixels")
print(f"Deposition areas (>10m gain): {np.sum(deposition_areas_final):,} pixels")
# Calculate volumes
pixel_area = pixel_size ** 2
erosion_volume = np.sum(change_total[erosion_areas_final]) * pixel_area
deposition_volume = np.sum(change_total[deposition_areas_final]) * pixel_area
print(f"\nErosion volume: {abs(erosion_volume):,.0f} cubic units")
print(f"Deposition volume: {deposition_volume:,.0f} cubic units")
print(f"Net change: {erosion_volume + deposition_volume:,.0f} cubic units")
# 5. Visualize temporal changes
fig, axes = plt.subplots(3, 3, figsize=(18, 15))
# Time series elevations
for i, (elev_data, title) in enumerate([(elevation_t1, 'Time 1'),
(elevation_t2, 'Time 2'),
(elevation_t3, 'Time 3')]):
im = axes[0,i].imshow(elev_data, cmap='terrain', extent=bounds)
axes[0,i].set_title(title)
plt.colorbar(im, ax=axes[0,i], label='Elevation (m)')
# Change maps
change_data = [change_t1_t2, change_t2_t3, change_total]
change_titles = ['Change T1-T2', 'Change T2-T3', 'Total Change']
for i, (change, title) in enumerate(zip(change_data, change_titles)):
im = axes[1,i].imshow(change, cmap='RdBu_r', extent=bounds,
vmin=-30, vmax=30)
axes[1,i].set_title(title)
plt.colorbar(im, ax=axes[1,i], label='Elevation Change (m)')
# Analysis results
im7 = axes[2,0].imshow(significant_change, cmap='Reds', extent=bounds)
axes[2,0].set_title('Significant Change Areas (>10m)')
plt.colorbar(im7, ax=axes[2,0])
im8 = axes[2,1].imshow(erosion_areas_final, cmap='Reds', extent=bounds)
axes[2,1].set_title('Major Erosion Areas')
plt.colorbar(im8, ax=axes[2,1])
im9 = axes[2,2].imshow(deposition_areas_final, cmap='Blues', extent=bounds)
axes[2,2].set_title('Major Deposition Areas')
plt.colorbar(im9, ax=axes[2,2])
plt.tight_layout()
plt.show()
# Create change summary
change_summary = pd.DataFrame({
'Metric': ['Mean Change', 'Max Erosion', 'Max Deposition',
'Std Deviation', 'Erosion Pixels', 'Deposition Pixels'],
'Value': [change_total.mean(), change_total.min(), change_total.max(),
change_total.std(), np.sum(erosion_areas_final), np.sum(deposition_areas_final)],
'Unit': ['m', 'm', 'm', 'm', 'pixels', 'pixels']
})
print("\n=== CHANGE SUMMARY TABLE ===")
print(change_summary.round(2))
Key Takeaways¶
mindmap
root((Raster Analysis))
Data Structure
Grid Cells
Resolution
Extent
NoData Values
Operations
Math Operations
Statistics
Classification
Filtering
Vector Integration
Clipping
Zonal Statistics
Point Extraction
Overlay Analysis
Quality Control
NoData Handling
Data Validation
Error Checking
What You've Learned
- Raster Structure: Understanding grids, resolution, and spatial reference
- Data Analysis: Statistics, calculations, and transformations
- Vector Integration: Combining raster and vector data effectively
- Quality Control: Handling missing data and validation
- Visualization: Creating meaningful maps and charts
- Practical Applications: Real-world analysis workflows
Best Practices
- Always check raster metadata before analysis
- Handle NoData values appropriately
- Use appropriate data types to save memory
- Validate results with visualizations
- Document your analysis parameters
- Consider computational efficiency for large datasets
Next Steps¶
In the next module, we'll create interactive visualizations and web applications: - Interactive maps with Folium - Dashboard creation with Streamlit - Combining raster and vector visualizations - User interface design for geospatial applications
graph LR
A[Raster Analysis] --> B[Visualization]
B --> C[Interactive Maps]
B --> D[Web Applications]
B --> E[Dashboards]