top of page
Writer's pictureKatie McQuillan

Properly converting USGS gage heights to match SWOT water surface elevation data

Updated: Jul 29

TLDR: (1) Don't use the generic WGS84 - instead use WGS84 (G1762); (2) Set the epoch; (3) Sample gdal command to convert USGS coordinates to the datum used by SWOT:

gdaltransform -s_coord_epoch "2010.0" -t_coord_epoch "2010.0" 
-s_srs "EPSG:6349" -t_srs "EPSG:9057+EPSG:3855" 
< "gdal_in.txt" > "gdal_out_6349_9057_3855_epoch_2010.txt"

Tutorial

The Surface Water Ocean Topography (SWOT) Ka-band radar interferometer observes water surface area and elevation of inland water bodies. We validated SWOT LakeSP water surface elevation (WSE) using in situ observations from the USGS Lake/Reservoir Water Surface Elevation dataset (parameter code = 62615).


To directly compare SWOT and USGS datasets, we needed to transform USGS coordinates to the SWOT horizontal and vertical datums. Finding the best tool and the correct parameters to perform this transformation was tricky! After trial and error, GDAL stood out as one of the simplest and most powerful tools for horizontal and vertical coordinate transformations. Next, we tested horizontal and vertical datum parameters to accurately transform the data. Datums for each dataset are noted in Table 1.


Table 1. SWOT and USGS datum information

Data source

Horizontal Datum

Reference Ellipsoid

Vertical Datum

SWOT

ITRF2014

WGS84

EGM2008

USGS

NAD83 (2011)

GRS80

NAVD88

USGS Data

USGS data was represented using EPSG:6349. EPSG:6349 is a compound CRS that represents NAD83 (2011) + NAVD88 height.

In [10]:

# Details of the the compound NAD83(2011) + NAVD88 (EPSG:6349)
from osgeo import osr
src = osr.SpatialReference()
src.ImportFromEPSG(6349)
print(src)
COMPD_CS["NAD83(2011) + NAVD88 height",
    GEOGCS["NAD83(2011)",
        DATUM["NAD83_National_Spatial_Reference_System_2011",
            SPHEROID["GRS 1980",6378137,298.257222101,
                AUTHORITY["EPSG","7019"]],
            AUTHORITY["EPSG","1116"]],
        PRIMEM["Greenwich",0,
            AUTHORITY["EPSG","8901"]],
        UNIT["degree",0.0174532925199433,
            AUTHORITY["EPSG","9122"]],
        AXIS["Latitude",NORTH],
        AXIS["Longitude",EAST],
        AUTHORITY["EPSG","6318"]],
    VERT_CS["NAVD88 height",
        VERT_DATUM["North American Vertical Datum 1988",2005,
            AUTHORITY["EPSG","5103"]],
        UNIT["metre",1,
            AUTHORITY["EPSG","9001"]],
        AXIS["Gravity-related height",UP],
        AUTHORITY["EPSG","5703"]],
    AUTHORITY["EPSG","6349"]]

SWOT Data

We were less certain about how to represent the SWOT datums and tested three different representations.

First, we tried transforming USGS data to match SWOT using the generic WGS84 represented by EPSG:4326 for the horizontal datum and EGM2008 represented by EPSG:3855 for the vertical datum. We calculated validation statistics including mean bias error (MBE, meters), mean absolute error (MAE, meters), root mean square error (RMSE, meters), and one-sigma (meters).

In [9]:

# Details of EPSG:4326 generic WGS84
dst = osr.SpatialReference()
dst.ImportFromEPSG(4326)
print(dst)
GEOGCS["WGS 84",
    DATUM["WGS_1984",
        SPHEROID["WGS 84",6378137,298.257223563,
            AUTHORITY["EPSG","7030"]],
        AUTHORITY["EPSG","6326"]],
    PRIMEM["Greenwich",0,
        AUTHORITY["EPSG","8901"]],
    UNIT["degree",0.0174532925199433,
        AUTHORITY["EPSG","9122"]],
    AXIS["Latitude",NORTH],
    AXIS["Longitude",EAST],
    AUTHORITY["EPSG","4326"]]

In [4]:

# Details of EPSG:3855 EGM2008
v_dst = osr.SpatialReference()
v_dst.ImportFromEPSG(3855)
print(v_dst)
VERT_CS["EGM2008 height",
    VERT_DATUM["EGM2008 geoid",2005,
        AUTHORITY["EPSG","1027"]],
    UNIT["metre",1,
        AUTHORITY["EPSG","9001"]],
    AXIS["Gravity-related height",UP],
    AUTHORITY["EPSG","3855"]]

In [11]:

# Load modules
import os
import pandas as pd
import subprocess
import numpy as np
import math

home = "C:/Users/kmcquil/Documents/SWOT_Validation"

# Open the dataset of cotemporal SWOT and USGS observations
swot_usgs_df = pd.read_csv(os.path.join(home, "Data/usgs_swot_merged_exaple.csv"), index_col=0)

# How many cotemporal observations? 
print('The number of cotemporal USGS and SWOT observations = ' + str(swot_usgs_df.shape[0]))

# Get data into correct format to pass to gdal including longitude, latitude, and gage height in feet
in_gdal = swot_usgs_df[["usgs_long", "usgs_lat", "usgs_X_62615_00000"]].copy()

# Since the USGS heights are in feet but the projection we have assigned are in meters, convert heights to meters
in_gdal.loc[:,'usgs_X_62615_00000'] = in_gdal.loc[:,'usgs_X_62615_00000'] * 0.3048

# Create a column with long, lat, height combined 
in_gdal.loc[:,"out"] = [
    str(i) + " " + str(j) + " " + str(k)
    for i, j, k in zip(
        in_gdal["usgs_long"],
        in_gdal["usgs_lat"],
        in_gdal["usgs_X_62615_00000"],
    )
]

# Save the combined column to a text file
in_gdal["out"].to_csv(
    os.path.join(home, "Data/USGS/gdal_in.txt"), header=False, index=False
)

# Function to transform the USGS heights and calculate validation stats 
out_df = swot_usgs_df.copy()
def transform_height(gdal_command, fp, label):
    subprocess.run(gdal_command, shell=True)
    out_gdal = pd.read_csv(os.path.join(home, fp), header=None)
    out_gdal = out_gdal.rename(columns={0: "result"})
    out_gdal[["usgs_long", "usgs_lat", "usgs_X_62615_00000_egm0_meters"]] = out_gdal["result"].str.split(" ", expand=True)
    out_df["usgs_X_62615_00000_egm0_meters"] = out_gdal["usgs_X_62615_00000_egm0_meters"].astype(float)

    mae = np.mean(np.abs(np.subtract(out_df["usgs_X_62615_00000_egm0_meters"], out_df["swot_wse"])))
    bias = np.mean(np.subtract(out_df["usgs_X_62615_00000_egm0_meters"], out_df["swot_wse"]))
    rmse = math.sqrt(np.square(np.subtract(out_df["usgs_X_62615_00000_egm0_meters"], out_df["swot_wse"])).mean())
    one_sigma = np.quantile(np.abs(np.subtract(out_df["usgs_X_62615_00000_egm0_meters"], out_df["swot_wse"])),0.68)
    return pd.DataFrame([[label, bias, mae, rmse, one_sigma]], columns=["Command", "MBE (m)", "MAE (m)", "RMSE (m)", "One-Sigma (m)"])
The number of cotemporal USGS and SWOT observations = 425

In [12]:

# Convert from NAD83(2011) + NAVD88 to match SWOT using generic WGS84 EPSG:4326 and EGM2008
gdal_command = 'gdaltransform -s_srs "EPSG:6349" -t_srs "EPSG:4326+EPSG:3855" < "C:/Users/kmcquil/Documents/SWOT_Validation/Data/USGS/gdal_in.txt" > "C:/Users/kmcquil/Documents/SWOT_Validation/Data/USGS/gdal_out_6349_4326_3855.txt"'
label = "EPSG:4326, No epoch"
fp = "Data/USGS/gdal_out_6349_4326_3855.txt"
epsg4326_noepoch = transform_height(gdal_command, fp, label)
print(epsg4326_noepoch)
               Command   MBE (m)   MAE (m)  RMSE (m)  One-Sigma (m)
0  EPSG:4326, No epoch  0.938846  1.174174  1.371652       1.235044

The generic WGS84 (EPSG:4326) did not result in close comparison between USGS and SWOT elevations, with a MBE and MAE of approximately 1 meter.

Further digging revealed that using the generic WGS84 (EPSG:4326) is not recommended because it is based on a datum ensemble whose positional accuracy is approximately two meters. Instead, a realization such as WGS84 (G1762) is recommended. WGS84 (G1762) and ITRF 2014 are equivalent for all practical purposes when their epochs are the same.

Epochs are used to define a reference date for positions esablished using the datum ellipsoid and reference frame. Due to tectonic plate movement, landmasses are constantly moving in relationship to each other and in relation to the reference frame. Therefore, accounting for the epoch is necessary for accurate coordinate transformations.

Next, we tested two configurations using WGS84 (G1762) which is represented by EPSG:9057. One configuration set the epoch for each dataset and the other did not.

In [13]:

# Details for EPSG:9057 WGS84 (G1762)
dst = osr.SpatialReference()
dst.ImportFromEPSG(9057)
print(dst)
GEOGCS["WGS 84 (G1762)",
    DATUM["World_Geodetic_System_1984_G1762",
        SPHEROID["WGS 84",6378137,298.257223563,
            AUTHORITY["EPSG","7030"]],
        AUTHORITY["EPSG","1156"]],
    PRIMEM["Greenwich",0,
        AUTHORITY["EPSG","8901"]],
    UNIT["degree",0.0174532925199433,
        AUTHORITY["EPSG","9122"]],
    AXIS["Latitude",NORTH],
    AXIS["Longitude",EAST],
    AUTHORITY["EPSG","9057"]]

In [14]:

# Convert from NAD83(2011) + NAVD88 to match SWOT using WGS84 (G1762) EPSG:9057 and EGM2008
gdal_command = 'gdaltransform -s_srs "EPSG:6349" -t_srs "EPSG:9057+EPSG:3855" < "C:/Users/kmcquil/Documents/SWOT_Validation/Data/USGS/gdal_in.txt" > "C:/Users/kmcquil/Documents/SWOT_Validation/Data/USGS/gdal_out_6349_9057_3855.txt"'
label = "EPSG:9057, No epoch"
fp = "Data/USGS/gdal_out_6349_9057_3855.txt"
epsg9057_noepoch = transform_height(gdal_command, fp, label)
print(epsg9057_noepoch)
               Command   MBE (m)   MAE (m)  RMSE (m)  One-Sigma (m)
0  EPSG:9057, No epoch -0.158443  0.299551   0.97709       0.134066

The NAD83 (2011) epoch is 2010.0. The standard epoch of WGS84 (G1762) is 2005.0 and the standard epoch of ITRF2014 is 2010.0. Since SWOT is based on ITRF2014, we set the target epoch to 2010.0.

In [15]:

# Convert from NAD83(2011) + NAVD88 to match SWOT using WGS84 (G1762) EPSG:9057 and EGM2008 and set the epoch = 2010.0
gdal_command = 'gdaltransform -s_coord_epoch "2010.0" -t_coord_epoch "2010.0" -s_srs "EPSG:6349" -t_srs "EPSG:9057+EPSG:3855" < "C:/Users/kmcquil/Documents/SWOT_Validation/Data/USGS/gdal_in.txt" > "C:/Users/kmcquil/Documents/SWOT_Validation/Data/USGS/gdal_out_6349_9057_3855_epoch_2010.txt"'
label = "EPSG:9057, Epoch = 2010"
fp = "Data/USGS/gdal_out_6349_9057_3855_epoch_2010.txt"
epsg9057_epoch2010 = transform_height(gdal_command, fp, label)
print(epsg9057_epoch2010)
                   Command   MBE (m)   MAE (m)  RMSE (m)  One-Sigma (m)
0  EPSG:9057, Epoch = 2010 -0.144761  0.299489  0.974874       0.139806

In [16]:

# Compare the validation statistics using different SWOT datum parameters.
perf = pd.concat([epsg4326_noepoch, epsg9057_noepoch, epsg9057_epoch2010])
print(perf)
                   Command   MBE (m)   MAE (m)  RMSE (m)  One-Sigma (m)
0      EPSG:4326, No epoch  0.938846  1.174174  1.371652       1.235044
0      EPSG:9057, No epoch -0.158443  0.299551  0.977090       0.134066
0  EPSG:9057, Epoch = 2010 -0.144761  0.299489  0.974874       0.139806

Using WGS84 (G1762) resulted in closer agreement between USGS and SWOT elevations. Accounting for the epoch made a small difference but is important to accurately represent horizontal and vertical coordinate transformations.

Key takeaways

  1. Don't use the generic WGS84 - instead use WGS84 (G1762).

  2. Set the epoch.

  3. Sample gdal command to convert USGS coordinates to the datum used by SWOT:

gdaltransform -s_coord_epoch "2010.0" -t_coord_epoch "2010.0" 
-s_srs "EPSG:6349" -t_srs "EPSG:9057+EPSG:3855" 
< "gdal_in.txt" > "gdal_out_6349_9057_3855_epoch_2010.txt"

We are also interested in hearing about other approaches folks have taken to reproject data into SWOT format!





Dr. Katie McQuillan is a Virginia Tech Presidential Postdoctoral Fellow in the Global Rivers Group.

50 views0 comments

Recent Posts

See All

Comments


Commenting has been turned off.
bottom of page