{
"nbformat": 4,
"nbformat_minor": 0,
"metadata": {
"colab": {
"provenance": [],
"authorship_tag": "ABX9TyM7TO7KIXBPa4GEvDZbZ1+R",
"include_colab_link": true
},
"kernelspec": {
"name": "python3",
"display_name": "Python 3"
},
"language_info": {
"name": "python"
}
},
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "view-in-github",
"colab_type": "text"
},
"source": [
"<a href=\"https://colab.research.google.com/gist/akhuff157/1e88e7573ad2e46cba621fc5d7e969e7/tutorial_snap_viirs_adp_geotiff_20250401.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
]
},
{
"cell_type": "markdown",
"source": [
"#**Tutorial: Creating a GeoTiff File from a Map Image File of VIIRS Aerosol Detection (Smoke/Dust Mask) using Python**\n",
"\n",
"This Python tutorial was written in April 2025 by Dr. Amy Huff, STC at NOAA/NESDIS/STAR (amy.huff@noaa.gov). <font color='red'>**If you use any of the Python code in your research, please credit the NOAA/NESDIS/STAR Aerosols & Atmospheric Composition Science Team.**</font>"
],
"metadata": {
"id": "Lo-fWLDTTN-U"
}
},
{
"cell_type": "markdown",
"metadata": {
"id": "d3404f3a"
},
"source": [
"## **Import Python modules and packages**\n",
"\n",
"- [pathlib](https://docs.python.org/3/library/pathlib.html): module to set file system paths\n",
"- [datetime](https://docs.python.org/3/library/datetime.html): module to manipulate dates and times\n",
"- [S3Fs](https://s3fs.readthedocs.io/en/latest/): library to set up a file system interface with AWS Simple Storage Service (S3) buckets\n",
"- [requests](https://requests.readthedocs.io/en/latest/): library to send HTTP requests\n",
"- [pandas](https://pandas.pydata.org/docs/user_guide/index.html): library for data analysis\n",
"- [xarray](https://docs.xarray.dev/en/stable/index.html): library to work with labeled multi-dimensional arrays\n",
"- [NumPy](https://numpy.org/doc/stable/user/index.html): library to perform array operations\n",
"- [SciPy](https://scipy.org/): library to manipulate and visualize data\n",
"- [matplotlib](https://matplotlib.org/stable/index.html): library to make plots\n",
"- [cartopy](https://scitools.org.uk/cartopy/docs/latest/index.html): library to make maps\n",
"- [gdal](https://pypi.org/project/GDAL/): library interface to the [GDAL Geospatial Data Abstraction Library](https://gdal.org/en/stable/)\n",
"\n",
"Used by `xarray` but not imported:\n",
"- [h5netcdf](https://h5netcdf.org/): interface for the netCDF4 file-format based on h5py"
]
},
{
"cell_type": "markdown",
"source": [
"---\n",
"---\n",
"\n",
"The Colab configuration does not include the `s3fs` or `cartopy` packages, so they need to be installed.\n",
"\n",
"**Ignore any error messages about package dependency conflicts.**"
],
"metadata": {
"id": "uKyeZsUEBdGs"
}
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "6Fsya_6ip_1L"
},
"outputs": [],
"source": [
"# Install missing packages in Colab quietly (no progress notifications)\n",
"\n",
"!pip install --quiet s3fs cartopy"
]
},
{
"cell_type": "code",
"source": [
"# Import modules and packages\n",
"\n",
"from pathlib import Path\n",
"\n",
"import datetime\n",
"from datetime import date\n",
"\n",
"import s3fs\n",
"\n",
"import requests\n",
"\n",
"import pandas as pd\n",
"\n",
"import xarray as xr\n",
"\n",
"import numpy as np\n",
"\n",
"from scipy.interpolate import griddata\n",
"\n",
"import matplotlib as mpl\n",
"from matplotlib import pyplot as plt\n",
"\n",
"from cartopy import crs as ccrs\n",
"\n",
"from osgeo import gdal"
],
"metadata": {
"id": "cfhllG-zyU0v"
},
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"source": [
"## **Create an Image File of VIIRS ADP *Without* Downloading Data Files from the NODD on AWS & Convert it to a GeoTIFF File**"
],
"metadata": {
"id": "nf6-3lhb6jTY"
}
},
{
"cell_type": "markdown",
"source": [
"**Define functions needed to:**\n",
"\n",
"1. Select list of VIIRS ADP files from JPSS NODD on AWS\n",
"2. Open VIIRS files remotely & process smoke SAAI (presence & intensity of smoke)\n",
"3. Plot smoke SAAI on a simple, transparent map & save locally as a .png file\n",
"4. Convert .png file to .tif file & save locally\n",
"\n"
],
"metadata": {
"id": "9BrJ-Rz4DT5c"
}
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "cee73dce"
},
"outputs": [],
"source": [
"# Find start/end times for JPSS satellite overpass(es) of geographic domain\n",
"# Uses UWisc OrbNav API\n",
"def find_jpss_observation_times(observation_date, lower_left_lat_lon,\n",
" upper_right_lat_lon, sat_name):\n",
"\n",
" # Convert entered observation_date to format needed by UWisc OrbNav API\n",
" api_date = date.isoformat(datetime.datetime.strptime(observation_date,\n",
" '%Y%m%d'))\n",
"\n",
" # Set JPSS satellite URL for UWisc OrbNav API\n",
" sat_number_dict = {'SNPP':'37849', 'NOAA20':'43013', 'NOAA21':'54234'}\n",
" sat_number = sat_number_dict.get(sat_name)\n",
" # Break long url string using f-string formatting\n",
" url = (f'http://sips.ssec.wisc.edu/orbnav/api/v1/boxtimes.json?start='\n",
" f'{api_date}T00:00:00Z&sat={sat_number}&end={api_date}T23:59:59Z&ur='\n",
" f'{upper_right_lat_lon}&ll={lower_left_lat_lon}')\n",
"\n",
" # Use requests library to get json response from UWisc OrbNav API\n",
" response = requests.get(url)\n",
" data = response.json()\n",
"\n",
" # Convert json response values from \"data\" key into a dataframe\n",
" # \"enter\" & \"leave\": times when satellite enters/leaves domain bounding box\n",
" df = pd.DataFrame(data['data'], columns=['enter', 'leave'])\n",
"\n",
" # Make two new dataframes, for \"enter\" and \"leave\" column lists\n",
" # Read in all the values in the lists as separate columns in new dataframes\n",
" df_enter = pd.DataFrame(df['enter'].to_list(), columns = ['enter_datetime',\n",
" 'enter_lat',\n",
" 'enter_lon',\n",
" 'enter_sea',\n",
" 'enter_orbit'])\n",
" df_leave = pd.DataFrame(df['leave'].to_list(), columns = ['leave_datetime',\n",
" 'leave_lat',\n",
" 'leave_lon',\n",
" 'leave_sea',\n",
" 'leave_orbit'])\n",
"\n",
" # Combine \"enter\" & \"leave\" dataframes into new dataframe; drop extra columns\n",
" combined = (pd.concat([df_enter, df_leave], axis=1, join='outer')\n",
" .drop(columns=['enter_lat', 'enter_lon', 'enter_sea', 'leave_lat',\n",
" 'leave_lon', 'leave_sea'], axis=1))\n",
"\n",
" # Drop rows with descending orbits\n",
" combined.drop(combined[(combined['leave_orbit'] == 'D') |\n",
" (combined['enter_orbit'] == 'D')].index, inplace=True)\n",
"\n",
" # Export the \"enter_datetime\" & \"leave_datetime\" columns to lists\n",
" enter_list = combined['enter_datetime'].tolist()\n",
" leave_list = combined['leave_datetime'].tolist()\n",
"\n",
" # Remove the colon from the list of enter/leave times (strings)\n",
" # Need 'HHMM' format for use with satellite data file names\n",
" start_times = [time[11:16].replace(':','') for time in enter_list]\n",
" end_times = [time[11:16].replace(':','') for time in leave_list]\n",
"\n",
" return start_times, end_times"
]
},
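{
"cell_type": "markdown",
"source": [
"To make the time-string slicing above concrete, the next cell is a minimal sketch that converts one hypothetical OrbNav ISO 8601 timestamp (the value is illustrative, not from a real API response) into the 'HHMM' format."
],
"metadata": {}
},
{
"cell_type": "code",
"source": [
"# Worked example of the 'HHMM' slicing used in find_jpss_observation_times()\n",
"# The timestamp below is illustrative of OrbNav's ISO 8601 format\n",
"example_enter = '2024-06-27T21:42:35Z'\n",
"print(example_enter[11:16].replace(':', ''))  # characters 11-15 -> '2142'"
],
"metadata": {},
"execution_count": null,
"outputs": []
},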
{
"cell_type": "code",
"source": [
"# Query AWS NODD for available VIIRS operational ADP EDR data files\n",
"def query_nodd_viirs_adp(fs, observation_date, sat_name, start_times, end_times):\n",
"\n",
" # Define terms in directory paths on AWS NODD\n",
" year = observation_date[:4]\n",
" month = observation_date[4:6]\n",
" day = observation_date[6:]\n",
" # Make dictionary for JPSS satellite/ADP directory paths on AWS NODD\n",
" # Keys for JPSS satellites\n",
" keys = ['SNPP', 'NOAA20', 'NOAA21']\n",
" # Values for operational VIIRS ADP EDR directory paths\n",
" values = ['noaa-nesdis-snpp-pds/VIIRS-JRR-ADP/',\n",
" 'noaa-nesdis-n20-pds/VIIRS-JRR-ADP/',\n",
" 'noaa-nesdis-n21-pds/VIIRS-JRR-ADP/']\n",
" # Combine \"values\" and \"keys\" lists\n",
" abbreviation_dictionary = {keys[i]: values[i] for i in range(len(keys))}\n",
" product_path = abbreviation_dictionary.get(sat_name)\n",
"\n",
" # Query AWS NODD for available files for entire day\n",
" try:\n",
" day_files = fs.ls(product_path + year + '/' + month + '/' + day + '/',\n",
" refresh=True)\n",
" except:\n",
" day_files = []\n",
"\n",
" if day_files:\n",
" # Generate list of available files for observation time period(s)\n",
" nodd_file_list = [file for start_time, end_time in zip(start_times, end_times)\n",
" for file in day_files if (file.split('/')[-1].split('_')[3][9:13] >= start_time\n",
" and file.split('/')[-1].split('_')[3][9:13] <= end_time)]\n",
" else:\n",
" nodd_file_list = []\n",
"\n",
" return nodd_file_list"
],
"metadata": {
"id": "R1278v6ffd09"
},
"execution_count": null,
"outputs": []
},
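{
"cell_type": "markdown",
"source": [
"To make the file-name filter above concrete, the next cell is a minimal sketch that pulls the 'HHMM' observation start time out of a representative granule path (the granule name is illustrative of the JRR ADP naming pattern, not an actual file on the NODD)."
],
"metadata": {}
},
{
"cell_type": "code",
"source": [
"# Worked example of the file-name time filter used in query_nodd_viirs_adp()\n",
"# The granule path below is illustrative, not an actual file on the NODD\n",
"example_file = ('noaa-nesdis-n21-pds/VIIRS-JRR-ADP/2024/06/27/'\n",
"                'JRR-ADP_v3r2_n21_s202406272142351_'\n",
"                'e202406272144013_c202406272229380.nc')\n",
"# Underscore-separated field 3 is the start time: 's' + YYYYMMDD + HHMMSSS\n",
"print(example_file.split('/')[-1].split('_')[3][9:13])  # -> '2142'"
],
"metadata": {},
"execution_count": null,
"outputs": []
},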
{
"cell_type": "code",
"source": [
"# Process VIIRS ADP SAAI (Smoke only)\n",
"def process_viirs_adp_saai_smoke(ds):\n",
"\n",
" # Convert xarray DataArrays to NumPy masked arrays with correct dtype\n",
" # Select \"smoke present\" (Smoke = 1) pixels\n",
" # Casting xarray float32 to int8 gives NumPy error for fill value (NaN)\n",
" # Silence \"invalid value encountered in cast\" warning with np.errstate\n",
" with np.errstate(invalid=\"ignore\"):\n",
" pqi4 = ds.PQI4.to_masked_array().astype('int8')\n",
" saai_smoke = ds.SAAI.where(ds.Smoke == 1).to_masked_array().astype('float32')\n",
"\n",
" # Select deep-blue based algorithm smoke pixels using PQI4 bits 4-5\n",
" # Mask missing and IR-visible path pixels (see Table 8 in User's Guide)\n",
" # missing (10): 16 + 0 = 16, IR-visible (01): 0 + 32 = 32\n",
" smoke_algorithm_mask = (((pqi4 & 16 == 16) & (pqi4 & 32 != 32))\n",
" | ((pqi4 & 16 != 16) & (pqi4 & 32 == 32)))\n",
" saai_smoke = np.ma.masked_where(smoke_algorithm_mask, saai_smoke)\n",
"\n",
" return saai_smoke"
],
"metadata": {
"id": "ynX2b7iSZTbY"
},
"execution_count": null,
"outputs": []
},
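{
"cell_type": "markdown",
"source": [
"To make the bit logic above concrete, the next cell is a minimal sketch that applies the same test to a few hand-picked PQI4 values (the values are illustrative): a pixel is masked only when exactly one of bit 4 (value 16) or bit 5 (value 32) is set."
],
"metadata": {}
},
{
"cell_type": "code",
"source": [
"# Worked example of the PQI4 bits 4-5 test used above (illustrative values)\n",
"# 0: neither bit set, 16: bit 4 only, 32: bit 5 only, 48: both bits set\n",
"pqi4_example = np.array([0, 16, 32, 48], dtype='int8')\n",
"example_mask = (((pqi4_example & 16 == 16) & (pqi4_example & 32 != 32))\n",
"                | ((pqi4_example & 16 != 16) & (pqi4_example & 32 == 32)))\n",
"print(example_mask)  # -> [False  True  True False]"
],
"metadata": {},
"execution_count": null,
"outputs": []
},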
{
"cell_type": "code",
"source": [
"# Interpolate missing pixels in DataArray using SciPy\n",
"def interpolate_missing_pixels(da):\n",
"\n",
" # Height (rows) and width (columns) of DataArray\n",
" height, width = da.values.shape\n",
"\n",
" # Create 2D arrays with dimensions of x (width) and y (height)\n",
" xx, yy = np.meshgrid(np.arange(width), np.arange(height))\n",
"\n",
" # Boolean array of DataArray with masked (NaN) pixels set to \"True\"\n",
" mask = np.isnan(da)\n",
"\n",
" # 1D arrays of known (non-masked) x, y indices (locations)\n",
" # np.logical_not() reverses the boolean array\n",
" known_x = xx[np.logical_not(mask)]\n",
" known_y = yy[np.logical_not(mask)]\n",
"\n",
" # 1D arrays of missing (masked) x, y indices (locations)\n",
" missing_x = xx[mask]\n",
" missing_y = yy[mask]\n",
"\n",
" # 1D array of known (non-masked) DataArray values\n",
" known_v = da.to_masked_array().compressed()\n",
"\n",
" # Interpolate missing DataArray values using SciPy (returns 1D array)\n",
" interpolated_values = griddata((known_x, known_y), known_v,\n",
" (missing_x, missing_y), method='nearest')\n",
"\n",
" # Assign interpolated values to indexed DataArray (replace NaNs)\n",
" da_interpolated = da.values.copy() # Copy of DataArray as np array\n",
" da_interpolated[missing_y, missing_x] = interpolated_values\n",
"\n",
" return da_interpolated"
],
"metadata": {
"id": "Yge-aVMeJ38c"
},
"execution_count": null,
"outputs": []
},
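{
"cell_type": "markdown",
"source": [
"As a quick check of the helper above, the next cell is a minimal sketch that fills one NaN in a toy 2x2 DataArray (the values are illustrative; the real inputs are the VIIRS Latitude and Longitude arrays)."
],
"metadata": {}
},
{
"cell_type": "code",
"source": [
"# Toy example of the nearest-neighbor fill (illustrative values)\n",
"toy = xr.DataArray([[1.0, np.nan], [3.0, 4.0]])\n",
"# The NaN pixel is replaced by the value of a nearest known pixel\n",
"print(interpolate_missing_pixels(toy))"
],
"metadata": {},
"execution_count": null,
"outputs": []
},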
{
"cell_type": "code",
"source": [
"# Plot processed SAAI Smoke from multiple VIIRS granule files on simple map\n",
"# No borderlines/coastlines, transparent background (for GeoTIFF)\n",
"# Files are opened remotely on the AWS NODD (not downloaded)\n",
"\n",
"def plot_simple_viirs_adp_smoke(fs, file_list, png_domain, save_path):\n",
"\n",
" # Set up figure in matplotlib\n",
" fig = plt.figure(figsize=(8, 10))\n",
"\n",
" # Set map projection using cartopy\n",
" # Set central_longitude=180 for Alaska; avoids errors crossing antimeridian\n",
" ax = plt.axes(projection=ccrs.Mercator(central_longitude=180))\n",
"\n",
" # Remove border around figure (for GeoTIFF)\n",
" plt.axis('off')\n",
"\n",
" # Set geographic domain of map for image file (.png)\n",
" # [W_lon, E_lon, S_lat, N_lat]\n",
" ax.set_extent(png_domain, crs=ccrs.PlateCarree())\n",
"\n",
" # Set colormaps & normalization for plotting data\n",
" norm = mpl.colors.Normalize(vmin=0, vmax=2)\n",
" cmap = plt.get_cmap('PuRd')\n",
"\n",
" # Loop through VIIRS ADP files\n",
" for file in file_list:\n",
" print('Now processing', file.split('/')[-1]) # Print the file name\n",
"\n",
" # Open remote file using S3fs & xarray (automatically closes file when done)\n",
" with fs.open(file, mode='rb') as remote_file:\n",
" with xr.open_dataset(remote_file, engine='h5netcdf') as ds:\n",
"\n",
" # Process VIIRS ADP SAAI (smoke only)\n",
" saai_smoke = process_viirs_adp_saai_smoke(ds)\n",
"\n",
" # Interpolate missing latitude & longitude values\n",
" latitude_interpolated = interpolate_missing_pixels(ds.Latitude)\n",
" longitude_interpolated = interpolate_missing_pixels(ds.Longitude)\n",
"\n",
" # Plot data\n",
" ax.pcolormesh(longitude_interpolated, latitude_interpolated,\n",
" saai_smoke, cmap=cmap, norm=norm,\n",
" transform=ccrs.PlateCarree())\n",
"\n",
" # Pull observation info from first data file name (string)\n",
" fname = file_list[0].split('/')[-1]\n",
" save_sat = fname.split('_')[2]\n",
" save_date = fname.split('_')[3][1:9]\n",
" save_name = f'{save_sat}_viirs_adp_saai_smoke_{save_date}'\n",
"\n",
" # Save image file to designated directory\n",
" # Set background as transparent (for GeoTIFF)\n",
" fig.savefig(save_path / save_name, transparent=True, dpi=300,\n",
" bbox_inches='tight')\n",
"\n",
" # Close plot\n",
" plt.close()\n",
"\n",
" return save_name"
],
"metadata": {
"id": "rDQYAgmtIc87"
},
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"source": [
"# Create GeoTIFF from map image file\n",
"def create_geotiff(output_file, input_file, tif_domain):\n",
"\n",
" translate_file = gdal.Translate(output_file, input_file,\n",
" outputBounds=tif_domain,\n",
" outputSRS='EPSG:4326', format='GTiff')\n",
" translate_file = None # Release memory used to generate .tif file"
],
"metadata": {
"id": "xvF8mWvCV6gR"
},
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"source": [
"### **User Variable Settings and Content for \"Main Function\":**"
],
"metadata": {
"id": "7mxgU0wz28W5"
}
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"id": "daeef554"
},
"outputs": [],
"source": [
"# Select list of VIIRS ADP data files from the AWS NODD for Alaska domain\n",
"\n",
"# Enter search variables for VIIRS ADP EDR data files on AWS NODD\n",
"sat_name = 'NOAA21' # Satellite name: 'SNPP', 'NOAA20', 'NOAA21'\n",
"observation_date = '20240627' # 'YYYYMMDD' (string)\n",
"\n",
"###############################################################################\n",
"\n",
"# Connect to AWS S3 anonymously\n",
"fs = s3fs.S3FileSystem(anon=True)\n",
"\n",
"# Find the overpass start/end times for Alaska domain - east of Antimeridian\n",
"east_start_times, east_end_times = find_jpss_observation_times(observation_date,\n",
" '45,-180',\n",
" '75,-130',\n",
" sat_name)\n",
"\n",
"# Query AWS NODD for available files for Alaska domain - east of Antimeridian\n",
"east_file_list = query_nodd_viirs_adp(fs, observation_date, sat_name,\n",
" east_start_times, east_end_times)\n",
"\n",
"# Find \"tomorrow's\" 8-digit date as a string\n",
"# Date for VIIRS granules falling west of Antimeridian is \"tomorrow\" in UTC\n",
"obs_date = datetime.datetime.strptime(observation_date, '%Y%m%d').date()\n",
"tomorrow = obs_date + datetime.timedelta(days=1)\n",
"tomorrow = tomorrow.strftime('%Y%m%d')\n",
"\n",
"# Find the overpass start/end times for Alaska domain - west of Antimeridian\n",
"west_start_times, west_end_times = find_jpss_observation_times(tomorrow,\n",
" '45,170',\n",
" '75,180',\n",
" sat_name)\n",
"\n",
"# Query AWS NODD for available files for Alaska domain - west of Antimeridian\n",
"west_file_list = query_nodd_viirs_adp(fs, tomorrow, sat_name,\n",
" west_start_times, west_end_times)\n",
"\n",
"# Combine files for Alaska domains east & west of Antimeridian\n",
"file_list = east_file_list + west_file_list\n",
"\n",
"# Print the available file names (optional sanity check)\n",
"for file in file_list:\n",
" print(file.split('/')[-1])"
]
},
{
"cell_type": "code",
"source": [
"# Plot smoke SAAI on a simple, transparent map & save locally as a .png file\n",
"\n",
"# ENTER USER SETTINGS\n",
"\n",
"# Enter domain for map image file\n",
"# For Alaska, enter domain longitude in 360 degrees (i.e., 100°W = 260)\n",
"png_domain = [180, 225, 50, 71] # [W_lon, E_lon, S_lat, N_lat]\n",
"\n",
"# Enter directory name for saved .png file (using pathlib module)\n",
"image_path = Path.cwd()\n",
"\n",
"################################################################################\n",
"\n",
"# Create VIIRS Smoke ADP map image file\n",
"if file_list:\n",
" save_name = plot_simple_viirs_adp_smoke(fs, file_list, png_domain, image_path)"
],
"metadata": {
"id": "kcvZ4EJIP3to"
},
"execution_count": null,
"outputs": []
},
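{
"cell_type": "markdown",
"source": [
"The next cell is an optional sketch that previews the saved .png file inline (it assumes the cell above ran and created the image)."
],
"metadata": {}
},
{
"cell_type": "code",
"source": [
"# Optional preview of the saved map image (requires the .png created above)\n",
"preview = plt.imread(image_path / f'{save_name}.png')\n",
"plt.figure(figsize=(6, 7))\n",
"plt.imshow(preview)\n",
"plt.axis('off')\n",
"plt.show()"
],
"metadata": {},
"execution_count": null,
"outputs": []
},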
{
"cell_type": "code",
"source": [
"# Convert .png file to .tif file & save locally\n",
"\n",
"# ENTER USER SETTINGS\n",
"\n",
"# Enter domain for .tif file\n",
"# Must use the same boundaries as domain used to make .png file!!!!\n",
"# Note the order of entered lat/lon boundaries is different!\n",
"# Enter longitude in 100 degrees (i.e., 100°W = -100)\n",
"tif_domain = [-180, 72, -135, 50] # [W_lon, N_lat, E_lon, S_lat]\n",
"\n",
"# Enter directory name for saved .tif file (using pathlib module)\n",
"tif_path = Path.cwd()\n",
"\n",
"################################################################################\n",
"\n",
"# Set full paths for .png and .tif files (as strings)\n",
"# gdal takes file paths as strings\n",
"image_file_path = (image_path / (f'{save_name}.png')).as_posix()\n",
"tif_file_path = (tif_path / (f'{save_name}.tif')).as_posix()\n",
"\n",
"# Create geotiff\n",
"create_geotiff(tif_file_path, image_file_path, tif_domain)"
],
"metadata": {
"id": "F6vPM-YFYPZT"
},
"execution_count": null,
"outputs": []
}
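,
{
"cell_type": "markdown",
"source": [
"As an optional sanity check, the final cell is a minimal sketch that prints the corner coordinates recorded in the new GeoTIFF; `gdal.Info` with `format='json'` returns the standard gdalinfo report as a dictionary, which includes a `cornerCoordinates` key."
],
"metadata": {}
},
{
"cell_type": "code",
"source": [
"# Optional sanity check: inspect the georeferencing of the new GeoTIFF\n",
"info = gdal.Info(tif_file_path, format='json')\n",
"print(info['cornerCoordinates'])"
],
"metadata": {},
"execution_count": null,
"outputs": []
}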
]
}