Cloud Cover

Estimate clear-sky probability from MODIS imagery via Google Earth Engine (requires a GEE account and earthengine-api) or from ERA5 reanalysis via Open-Meteo (no authentication required).

Cloud fraction analysis for airborne remote sensing campaign planning.

Provides historical cloud climatology from two sources:

  • Google Earth Engine (source="gee"): MODIS Terra/Aqua surface reflectance at 1 km resolution over user-supplied polygons. Requires GEE authentication and the optional dependencies, installed with pip install hyplan[clouds].

  • Open-Meteo (source="openmeteo"): ERA5 reanalysis at 0.25 deg resolution using each polygon’s centroid. No authentication required.

Also provides short-range cloud cover forecasts (up to 16 days) via the Open-Meteo Forecast API, visit simulation for campaign scheduling, and visualization helpers.

get_binary_cloud(image)[source]

Generate a binary cloud mask for a MODIS image.

The MOD09GA/MYD09GA state_1km band encodes cloud state in bits 0-1 (00 = clear, 01 = cloudy, 10 = mixed, 11 = not set). Any non-zero value is treated as cloudy.

Parameters:

image (Image) – An Earth Engine image with a state_1km QA band.

Return type:

Image

Returns:

Binary cloud mask (1 for any non-zero cloud state, i.e. cloudy, mixed, or not set; 0 for clear).
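The bit test described above can be sketched in plain Python on integer QA values. This is only an illustration of the masking logic, not the library's Earth Engine implementation; the names CLOUD_STATE_MASK and is_cloudy are hypothetical.

```python
# Bits 0-1 of state_1km hold the cloud state; all higher bits are ignored here.
CLOUD_STATE_MASK = 0b11


def is_cloudy(state_1km: int) -> int:
    """Return 1 if bits 0-1 are non-zero (cloudy, mixed, or not set), else 0."""
    return 1 if (state_1km & CLOUD_STATE_MASK) != 0 else 0


if __name__ == "__main__":
    for value in (0b00, 0b01, 0b10, 0b11, 0b100):
        print(f"{value:03b} -> {is_cloudy(value)}")
```

Note that a value such as 0b100 maps to 0 because only the two lowest bits are inspected.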

calculate_cloud_fraction(image, polygon_geometry)[source]

Calculate the cloud fraction over a polygon for a MODIS image.

Parameters:
  • image (Image) – An Earth Engine MODIS image.

  • polygon_geometry (Geometry) – A polygon geometry for the region of interest.

Return type:

Feature

Returns:

An ee.Feature with date and cloud fraction properties.
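Conceptually, a cloud fraction over a polygon is the mean of the binary mask across the pixels inside it. The sketch below shows that reduction on a plain list; the helper name cloud_fraction and the use of None for masked-out pixels are assumptions made for illustration, and the real function performs the reduction server-side in Earth Engine.

```python
from typing import Optional, Sequence


def cloud_fraction(mask_values: Sequence[Optional[int]]) -> Optional[float]:
    """Mean of a 0/1 cloud mask over valid pixels; None if no pixel is valid."""
    valid = [v for v in mask_values if v is not None]
    if not valid:
        return None
    return sum(valid) / len(valid)


if __name__ == "__main__":
    # Two of four valid pixels are cloudy.
    print(cloud_fraction([1, 0, 0, 1]))
```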

create_date_ranges(day_start, day_stop, year_start, year_stop)[source]

Create date ranges for filtering Earth Engine image collections.

Supports year-boundary crossings (e.g., day_start=335, day_stop=60 for a December-to-February campaign). When day_start > day_stop, each year-pair produces two date ranges.

Parameters:
  • day_start (int) – Start day of the year (1-365).

  • day_stop (int) – End day of the year (1-365).

  • year_start (int) – Start year for the ranges.

  • year_stop (int) – End year for the ranges.

Return type:

list

Returns:

List of (start_date, end_date) tuples suitable for filterDate.
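A minimal sketch of how such ranges could be built, including the year-boundary case. Several details are assumptions rather than documented behavior: dates are returned as ISO strings, the end date is exclusive (matching how Earth Engine's filterDate treats its end argument), and one range (or pair of ranges) is produced per starting year from year_start through year_stop. The names doy_to_date and sketch_date_ranges are hypothetical.

```python
from datetime import date, timedelta


def doy_to_date(year: int, doy: int) -> date:
    """Convert a 1-based day-of-year to a calendar date."""
    return date(year, 1, 1) + timedelta(days=doy - 1)


def sketch_date_ranges(day_start, day_stop, year_start, year_stop):
    """Build (start, exclusive_end) ISO-date tuples for each year."""
    one_day = timedelta(days=1)
    ranges = []
    for year in range(year_start, year_stop + 1):
        if day_start <= day_stop:
            # Window lies inside a single calendar year.
            start = doy_to_date(year, day_start)
            end = doy_to_date(year, day_stop) + one_day
            ranges.append((start.isoformat(), end.isoformat()))
        else:
            # Window crosses New Year: split into two ranges.
            new_year = date(year + 1, 1, 1)
            ranges.append((doy_to_date(year, day_start).isoformat(),
                           new_year.isoformat()))
            end = doy_to_date(year + 1, day_stop) + one_day
            ranges.append((new_year.isoformat(), end.isoformat()))
    return ranges


if __name__ == "__main__":
    for r in sketch_date_ranges(335, 60, 2021, 2021):
        print(r)
```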

create_cloud_data_array_with_limit(polygon_file, year_start, year_stop, day_start, day_stop, limit=5000, satellite='both', split_satellite=False)[source]

Fetch MODIS cloud data for polygons via Google Earth Engine.

Queries MODIS Terra and/or Aqua surface reflectance imagery and computes daily cloud fractions over each polygon at 1 km resolution.

Parameters:
  • polygon_file (str) – Path to a GeoJSON or shapefile with a Name column.

  • year_start (int) – Start year.

  • year_stop (int) – End year.

  • day_start (int) – Start day-of-year.

  • day_stop (int) – End day-of-year.

  • limit (int) – Max images per date range (default 5000).

  • satellite (str) – "both", "terra", or "aqua".

  • split_satellite (bool) – If True, include a satellite column.

Return type:

DataFrame

Returns:

DataFrame with columns polygon_id, year, day_of_year, cloud_fraction (plus satellite if split_satellite).

simulate_visits(df, day_start, day_stop, year_start, year_stop, cloud_fraction_threshold=0.1, rest_day_threshold=6, exclude_weekends=False, debug=False)[source]

Simulate daily flight scheduling based on cloud fraction thresholds.

On each visitable day, the alphabetically first unvisited polygon that meets the cloud threshold is chosen. Rest days count toward total_days but no polygon is visited.

Parameters:
  • df (DataFrame) – Cloud fraction data with columns polygon_id, year, day_of_year, cloud_fraction.

  • day_start (int) – Start day-of-year for simulation.

  • day_stop (int) – End day-of-year for simulation.

  • year_start (int) – Start year.

  • year_stop (int) – End year.

  • cloud_fraction_threshold (float) – Maximum allowable cloud fraction.

  • rest_day_threshold (int) – Max consecutive visits before a rest day.

  • exclude_weekends (bool) – Skip weekends and reset the consecutive-visit counter.

  • debug (bool) – Enable detailed logging.

Return type:

Tuple[DataFrame, Dict[int, Dict[str, list]], Dict[int, list]]

Returns:

Tuple of (summary_df, visit_tracker, rest_days) where summary_df has year and days columns, visit_tracker maps year -> polygon_id -> [day_of_year], and rest_days maps year -> [day_of_year].
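The greedy rule described above can be sketched for a single year in plain Python. This is a simplified illustration, not the library's code: it ignores weekends, treats "meets the threshold" as cloud_fraction <= threshold, treats missing data as fully cloudy, and resets the consecutive-visit counter on a day with no visit. All of these are assumptions, and the function name sketch_simulate_year is hypothetical.

```python
def sketch_simulate_year(cloud, polygons, day_start, day_stop,
                         threshold=0.1, rest_day_threshold=6):
    """Greedy single-year schedule.

    cloud maps (polygon_id, day_of_year) -> cloud fraction.
    Returns (visits, rest_days, days_used).
    """
    unvisited = sorted(polygons)      # alphabetical priority
    visits, rest_days = {}, []
    consecutive = 0                   # visits since the last non-visit day
    days_used = 0
    for day in range(day_start, day_stop + 1):
        if not unvisited:
            break                     # campaign complete
        days_used += 1
        if consecutive >= rest_day_threshold:
            rest_days.append(day)     # rest day counts, but nobody flies
            consecutive = 0
            continue
        # First unvisited polygon clear enough today (missing data = cloudy).
        choice = next((p for p in unvisited
                       if cloud.get((p, day), 1.0) <= threshold), None)
        if choice is None:
            consecutive = 0           # assumption: a no-fly day breaks the streak
            continue
        visits[choice] = day
        unvisited.remove(choice)
        consecutive += 1
    return visits, rest_days, days_used


if __name__ == "__main__":
    cloud = {("A", 1): 0.5, ("B", 1): 0.05, ("C", 1): 0.0,
             ("A", 2): 0.05, ("C", 4): 0.0}
    print(sketch_simulate_year(cloud, ["B", "A", "C"], 1, 10,
                               rest_day_threshold=2))
```

In the demo, polygon A is too cloudy on day 1, so B is visited first even though A sorts ahead of it; a rest day is forced on day 3 after two consecutive visits.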

plot_yearly_cloud_fraction_heatmaps_with_visits(cloud_data_df, visit_tracker, rest_days, cloud_fraction_threshold=0.1, exclude_weekends=False, day_start=1, day_stop=365)[source]

Generate heatmaps of cloud fraction for each year with visit markers.

Parameters:
  • cloud_data_df (DataFrame) – DataFrame with polygon_id, year, day_of_year, cloud_fraction.

  • visit_tracker (Dict[int, Dict[str, list]]) – Visit days per polygon per year.

  • rest_days (Dict[int, list]) – Rest days per year.

  • cloud_fraction_threshold (float) – Threshold for clear/cloudy classification.

  • exclude_weekends (bool) – Highlight and skip weekends.

  • day_start (int) – Start day-of-year.

  • day_stop (int) – End day-of-year.

Return type:

None

class OpenMeteoCloudFraction[source]

Bases: object

Fetch daily cloud fraction from the Open-Meteo ERA5 archive.

Uses the Open-Meteo Historical Weather API to retrieve daily mean total cloud cover from ERA5 reanalysis at 0.25 deg (~25 km) resolution. No authentication required.

For each polygon the centroid is used as the query point. One HTTP request per polygon covers the entire date range (all years at once), so even 20-year x 5-polygon queries complete in seconds.

Parameters:

url (str | None) – Override the Open-Meteo archive endpoint (for testing).

__init__(url=None)[source]
Parameters:

url (str | None)

fetch(polygons, year_start, year_stop, day_start, day_stop)[source]

Fetch daily cloud fraction for each polygon.

Parameters:
  • polygons (GeoDataFrame) – GeoDataFrame with a Name column and polygon geometries (WGS 84).

  • year_start (int) – First year to include.

  • year_stop (int) – Last year to include (inclusive).

  • day_start (int) – Start day-of-year (1-365).

  • day_stop (int) – End day-of-year (1-365). If day_start > day_stop the range crosses a year boundary (e.g. Dec to Feb).

Return type:

DataFrame

Returns:

DataFrame with columns polygon_id, year, day_of_year, cloud_fraction (0.0-1.0).
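Because one request spans the full date range, the response has to be trimmed to the day-of-year window afterwards, including windows that wrap across New Year. The sketch below illustrates that filtering and the percent-to-fraction conversion (Open-Meteo reports cloud cover in percent). The input layout (parallel lists of ISO dates and values) and the names in_doy_window and to_rows are assumptions for illustration only.

```python
from datetime import date


def in_doy_window(doy, day_start, day_stop):
    """True if doy falls in the window, allowing a wrap across New Year."""
    if day_start <= day_stop:
        return day_start <= doy <= day_stop
    return doy >= day_start or doy <= day_stop


def to_rows(polygon_id, iso_dates, cloud_percent, day_start, day_stop):
    """Convert daily percent values to rows with cloud_fraction in 0.0-1.0."""
    rows = []
    for iso, pct in zip(iso_dates, cloud_percent):
        if pct is None:
            continue  # skip missing days
        d = date.fromisoformat(iso)
        doy = d.timetuple().tm_yday
        if in_doy_window(doy, day_start, day_stop):
            rows.append({"polygon_id": polygon_id, "year": d.year,
                         "day_of_year": doy, "cloud_fraction": pct / 100.0})
    return rows


if __name__ == "__main__":
    dates = ["2021-12-31", "2022-01-01", "2022-06-01"]
    for row in to_rows("A", dates, [80, 20, 10], 335, 60):
        print(row)
```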

fetch_cloud_fraction(polygon_file, year_start, year_stop, day_start, day_stop, source='openmeteo', **kwargs)[source]

Fetch historical cloud fraction data for flight planning.

Factory function that selects the appropriate cloud data source and returns a DataFrame compatible with simulate_visits() and plot_yearly_cloud_fraction_heatmaps_with_visits().

Parameters:
  • polygon_file (str) – Path to a GeoJSON or shapefile with a Name column.

  • year_start (int) – First year to include.

  • year_stop (int) – Last year to include (inclusive).

  • day_start (int) – Start day-of-year (1-365).

  • day_stop (int) – End day-of-year (1-365).

  • source (str) – "openmeteo" (ERA5, 0.25 deg, no auth) or "gee" (MODIS, 1 km, requires GEE auth).

  • **kwargs – Passed to the source constructor.

Return type:

DataFrame

Returns:

DataFrame with columns polygon_id, year, day_of_year, cloud_fraction.

summarize_cloud_fraction_by_doy(df, window=None)[source]

Compute a “typical year” cloud fraction summary per polygon.

Averages cloud fraction across all years for each (polygon_id, day_of_year) pair, producing a single seasonal profile per polygon.

Parameters:
  • df (DataFrame) – Cloud fraction DataFrame with columns polygon_id, year, day_of_year, cloud_fraction (as returned by fetch_cloud_fraction()).

  • window (int | None) – Optional rolling-mean window size (centered) applied to the per-polygon DOY mean. None disables smoothing.

Return type:

DataFrame

Returns:

DataFrame with columns polygon_id, day_of_year, cloud_fraction_mean, cloud_fraction_std, cloud_fraction_count.
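The aggregation can be sketched without pandas as a group-by over (polygon_id, day_of_year). This illustration omits the optional rolling-mean smoothing and assumes a sample standard deviation (undefined for a single year, reported here as NaN); the library may differ on both points. The name sketch_doy_summary is hypothetical.

```python
from collections import defaultdict
from statistics import mean, stdev


def sketch_doy_summary(rows):
    """Average cloud_fraction across years per (polygon_id, day_of_year)."""
    groups = defaultdict(list)
    for r in rows:
        groups[(r["polygon_id"], r["day_of_year"])].append(r["cloud_fraction"])
    summary = []
    for (pid, doy), vals in sorted(groups.items()):
        summary.append({
            "polygon_id": pid,
            "day_of_year": doy,
            "cloud_fraction_mean": mean(vals),
            # Sample std needs at least two years of data.
            "cloud_fraction_std": stdev(vals) if len(vals) > 1 else float("nan"),
            "cloud_fraction_count": len(vals),
        })
    return summary


if __name__ == "__main__":
    data = [
        {"polygon_id": "A", "year": 2020, "day_of_year": 10, "cloud_fraction": 0.25},
        {"polygon_id": "A", "year": 2021, "day_of_year": 10, "cloud_fraction": 0.75},
        {"polygon_id": "A", "year": 2020, "day_of_year": 11, "cloud_fraction": 0.0},
    ]
    for row in sketch_doy_summary(data):
        print(row)
```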

plot_doy_cloud_fraction(summary_df, ax=None, show_std=True, **kwargs)[source]

Line plot of DOY cloud fraction for each polygon.

Parameters:
  • summary_df (DataFrame) – Output of summarize_cloud_fraction_by_doy().

  • ax (Axes | None) – Matplotlib Axes to plot on. Created if None.

  • show_std (bool) – If True, draw a shaded +/-1 std-dev band.

  • **kwargs – Passed to ax.plot().

Return type:

Axes

Returns:

The matplotlib Axes.

class OpenMeteoCloudForecast[source]

Bases: object

Fetch cloud cover forecasts from the Open-Meteo Forecast API.

Provides up to 16 days of forecast cloud cover at any global location. No authentication required.

For each polygon the centroid is used as the query point.

Parameters:

url (str | None) – Override the Open-Meteo forecast endpoint (for testing).

__init__(url=None)[source]
Parameters:

url (str | None)

fetch(polygons, forecast_days=7, hourly=False, models=None)[source]

Fetch cloud cover forecast for each polygon.

Parameters:
  • polygons (GeoDataFrame) – GeoDataFrame with a Name column and polygon geometries (WGS 84).

  • forecast_days (int) – Number of days to forecast (1-16).

  • hourly (bool) – If True, return hourly data with layer breakdown (low / mid / high cloud cover). Otherwise return daily means.

  • models (list[str] | None) – Optional list of NWP model identifiers to use (e.g. ["ecmwf_ifs025"]). None uses Open-Meteo’s automatic best-match selection.

Return type:

DataFrame

Returns:

DataFrame with columns:

  • daily (default): polygon_id, date, cloud_fraction (0.0-1.0).

  • hourly: polygon_id, date, hour, cloud_fraction, cloud_fraction_low, cloud_fraction_mid, cloud_fraction_high.

fetch_cloud_forecast(polygon_file, source='openmeteo', forecast_days=7, hourly=False, **kwargs)[source]

Fetch cloud cover forecasts for flight planning.

Parameters:
  • polygon_file (str) – Path to a GeoJSON or shapefile with a Name column.

  • source (str) – Forecast source. Currently only "openmeteo" is supported.

  • forecast_days (int) – Number of days to forecast (1-16).

  • hourly (bool) – If True, return hourly resolution with cloud-layer breakdown.

  • **kwargs – Passed to the source constructor (e.g. models).

Return type:

DataFrame

Returns:

DataFrame; see OpenMeteoCloudForecast.fetch() for column details.

fetch_cloud_fraction_spatial(polygon_file, year_start, year_stop, day_start, day_stop, scale=1000, satellite='both')[source]

Compute a per-pixel mean cloud fraction map for each polygon.

Uses Google Earth Engine to produce a time-averaged cloud fraction raster at the native MODIS resolution within each polygon’s bounding box. Requires GEE authentication.

Parameters:
  • polygon_file (str) – Path to a GeoJSON or shapefile with a Name column.

  • year_start (int) – First year to include.

  • year_stop (int) – Last year to include (inclusive).

  • day_start (int) – Start day-of-year (1-365).

  • day_stop (int) – End day-of-year (1-365).

  • scale (int) – Output resolution in metres (default 1000 = MODIS native).

  • satellite (str) – "both", "terra", or "aqua".

Return type:

dict[str, object]

Returns:

Dictionary mapping polygon name to xarray.DataArray with dimensions (latitude, longitude) and values 0.0-1.0.
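A per-pixel mean cloud fraction is the time average of the daily binary masks at each pixel. The sketch below shows the idea on small nested lists; the function name per_pixel_mean and the use of None for missing observations are assumptions, and the real function computes this in Earth Engine before returning an xarray.DataArray.

```python
def per_pixel_mean(masks):
    """Average a stack of 2-D 0/1 masks pixel by pixel, skipping None."""
    n_rows, n_cols = len(masks[0]), len(masks[0][0])
    result = []
    for i in range(n_rows):
        row = []
        for j in range(n_cols):
            vals = [m[i][j] for m in masks if m[i][j] is not None]
            row.append(sum(vals) / len(vals) if vals else None)
        result.append(row)
    return result


if __name__ == "__main__":
    day1 = [[1, 0], [0, 0]]
    day2 = [[1, 1], [0, None]]   # one pixel has no observation on day 2
    print(per_pixel_mean([day1, day2]))
```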

plot_cloud_forecast(forecast_df, threshold=0.25, ax=None, cmap='RdYlGn_r', annotate=True, figsize=(12, 4), title='Cloud Cover Forecast')[source]

Heatmap of cloud cover forecast with go/no-go threshold.

Parameters:
  • forecast_df (DataFrame) – DataFrame with polygon_id, date, cloud_fraction columns (e.g. from fetch_cloud_forecast()).

  • threshold (float) – Cloud fraction threshold for go/no-go classification.

  • ax (Axes | None) – Matplotlib Axes to plot on. Created if None.

  • cmap (str) – Matplotlib colormap name.

  • annotate (bool) – If True, print percentage values in each cell.

  • figsize (tuple[float, float]) – Figure size when ax is None.

  • title (str) – Plot title.

Return type:

Axes

Returns:

The matplotlib Axes.
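The go/no-go classification behind the heatmap can be sketched as a simple threshold test on each forecast row. Whether the library treats a value exactly at the threshold as "go" is not documented here, so the inclusive comparison below is an assumption, as is the helper name go_no_go.

```python
def go_no_go(rows, threshold=0.25):
    """Map polygon_id -> date -> 'go' / 'no-go' from forecast rows."""
    table = {}
    for r in rows:
        label = "go" if r["cloud_fraction"] <= threshold else "no-go"
        table.setdefault(r["polygon_id"], {})[r["date"]] = label
    return table


if __name__ == "__main__":
    forecast = [
        {"polygon_id": "A", "date": "2024-06-01", "cloud_fraction": 0.10},
        {"polygon_id": "A", "date": "2024-06-02", "cloud_fraction": 0.60},
        {"polygon_id": "B", "date": "2024-06-01", "cloud_fraction": 0.25},
    ]
    print(go_no_go(forecast))
```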

plot_cloud_fraction_spatial(spatial_data, polygon_file=None, ncols=2)[source]

Plot per-pixel cloud fraction maps.

Return type:

Figure

Returns:

The matplotlib Figure.