Data Generation

Calibration Model Data Generation

This document provides a comprehensive guide to the Python code designed for running a greenhouse climate simulation and predicting climate parameters using a pre-trained machine learning model.


  1. Initializes a greenhouse climate model with specific parameters.
  2. Runs the model for a defined growth season, incrementally updating the greenhouse state.
  3. Uses a trained neural network model to predict climate parameters within the greenhouse.
  4. Calculates the relative root mean squared error (RRMSE) between the predicted and actual values.
  5. Plots the actual vs. predicted values for temperature and vapor pressure.
  6. Saves the simulation data and predictions to a CSV file.

Custom Loss Function

A custom loss function, quantiles_loss, is defined to perform quantile regression, which predicts the median of the target distribution.

Model Preparation

The prepare_inputs function is crucial for preparing the input data for the prediction model. It normalizes the data using pre-trained scalers and reshapes it to match the model's expected input format.

Simulation and Prediction

The script sets up a GreenLightModel instance with predefined settings and runs a simulation over a specified season length. At each interval, it performs the following:

  1. Updates the state of the greenhouse based on the simulation.
  2. Prepares input data from the current and previous states.
  3. Generates predictions using the neural network model.
  4. Stores the actual and predicted values for further analysis.

Plotting Results

The script visualizes the actual and predicted values of the temperature at the top of the greenhouse (tTop) and the vapor pressure at the top (vpTop) using line plots.

Calculating RRMSE

The Relative Root Mean Squared Error is calculated for both tTop and vpTop to quantify the prediction accuracy of the model.

Saving Data

Finally, the actual and predicted values of tTop and vpTop are saved to a CSV file named calibration_data.csv for further analysis or calibration purposes.

The script exemplifies a practical application of machine learning models in precision agriculture, particularly in optimizing greenhouse climate conditions for enhanced plant growth and productivity.

Complete Code

from greenlight import (
import random
from tensorflow.keras import backend as K
from joblib import load
from tensorflow.keras.models import load_model
import pandas as pd
import matplotlib.pyplot as plt
import copy
import numpy as np
import time as tm
# Define a custom loss function for quantile regression
def quantiles_loss(y_true, y_pred):
    quantiles = 0.5
    error = y_true - y_pred
    loss = tf.where(error >= 0, quantiles * error, (quantiles - 1) * error)
    return tf.reduce_mean(loss)
# Define a function to calculate the relative root mean squared error (RRMSE)
def calculate_rrmse_loss(y_true, y_pred):
    return K.sqrt(K.mean(K.square(y_true - y_pred)) / K.mean(K.square(y_true)))
# Define a function to save the results to a CSV file
def save_to_csv(tTop_array, pred_tTop_array, vpTop_array, pred_vpTop_array):
    data = {
        "tTop": tTop_array,
        "pred_tTop": pred_tTop_array,
        "vpTop": vpTop_array,
        "pred_vpTop": pred_vpTop_array,
    df = pd.DataFrame(data)
    df.to_csv("calibration_data.csv", index=False)
# Define a function to calculate the relative root mean squared error (RRMSE)
def calculate_rrmse(true_values, predicted_values):
    mse = np.mean((true_values - predicted_values) ** 2)
    rmse = np.sqrt(mse)
    mean_true_values = np.mean(true_values)
    rrmse = rmse / mean_true_values
    return rrmse
# Define a function to prepare the inputs for the model
def prepare_inputs(prev_gl, gl, scaler_features, past_time_steps):
    last_step_prev_gl = [
        prev_gl["d"]["iGlob"][-1, 1],
        prev_gl["d"]["tOut"][-1, 1],
        prev_gl["d"]["vpOut"][-1, 1],
        prev_gl["d"]["wind"][-1, 1],
        prev_gl["d"]["tSky"][-1, 1],
        prev_gl["x"]["tTop"][-1, 1],
        prev_gl["x"]["vpTop"][-1, 1],
        prev_gl["u"]["roof"][-1, 1],
    current_data = np.array(
            [gl["d"]["iGlob"][-i, 1] for i in reversed(range(1, past_time_steps))],
            [gl["d"]["tOut"][-i, 1] for i in reversed(range(1, past_time_steps))],
            [gl["d"]["vpOut"][-i, 1] for i in reversed(range(1, past_time_steps))],
            [gl["d"]["wind"][-i, 1] for i in reversed(range(1, past_time_steps))],
            [gl["d"]["tSky"][-i, 1] for i in reversed(range(1, past_time_steps))],
            [gl["x"]["tTop"][-i, 1] for i in reversed(range(1, past_time_steps))],
            [gl["x"]["vpTop"][-i, 1] for i in reversed(range(1, past_time_steps))],
            [gl["u"]["roof"][-i, 1] for i in reversed(range(1, past_time_steps))],
    combined_data = np.vstack([last_step_prev_gl, current_data])
    normalized_combined_data = scaler_features.transform(combined_data)
    input_data = [
        normalized_combined_data[:, i].reshape(1, -1, 1)
        for i in range(normalized_combined_data.shape[1])
    return input_data
# Main function
if __name__ == "__main__":
    # Define the parameters for the simulation
    season_length = 5
    season_interval = 1 / 24
    first_day = 155
    weatherInput = "ams"
    roof_type = "triangle"
    past_time_steps = 12
    # Load the scalers and the model
    scaler_features = load("./data/pth/scaler_features.joblib")
    scaler_labels = load("./data/pth/scaler_labels.joblib")
    model_pth = f"./data/pth/triangle_0.5_12_1.h5"
    predict_model = load_model(
        f"{model_pth}", custom_objects={"quantiles_loss": quantiles_loss}
    # Initialize the GreenLight model
    model = GreenLightModel(
        weatherInput=weatherInput, first_day=first_day, isMature=True
    # Define the parameters for the GreenLight model
    gl = {
        "p": {
            # Greenhouse structure settings
            "tSpDay": 19.5,  # Setpoint temperature during the day
            "tSpNight": 18.5,  # Setpoint temperature during the night
            "co2SpDay": 1000,  # Setpoint CO2 concentration during the day
            "psi": 22,  # Mean greenhouse cover slope [degrees]
            "aFlr": 4e4,  # Floor area [m^{2}]
            # Surface area of the cover including side walls [m^{2}]
            "aCov": 4.84e4,
            # Height of the main compartment [m] (the ridge height is 6.5, screen is 20 cm below it)
            "hAir": 6.5,
            "hGh": 6.905,  # Mean height of the greenhouse [m]
            "aRoof": 0.1169 * 4e4,  # Maximum roof ventilation area
            # Vertical dimension of a single ventilation opening [m]
            "hVent": 1.3,
            "cDgh": 0.75,  # Ventilation discharge coefficient [-]
            "lPipe": 1.25,  # Length of the pipe rail system [m m^{-2}]
            "phiExtCo2": 7.2e4 * 4e4
            # Capacity of CO2 injection for the entire greenhouse [mg s^{-1}]
            / 1.4e4,
            # Capacity of the boiler for the entire greenhouse [W]
            "pBoil": 300 * 4e4,
            # Control settings
            "tSpDay": 19.5,  # temperature set point light period [°C]
            "tSpNight": 18.5,  # temperature set point dark period [°C]
            "co2SpDay": 1000,  # CO2 setpoint during the light period [ppm]
            "rhMax": 87,  # maximum relative humidity [%]
            # P-band for ventilation due to high temperature [°C]
            "ventHeatPband": 4,
            # P-band for ventilation due to high relative humidity [% humidity]
            "ventRhPband": 50,
            # P-band for screen opening due to high relative humidity [% humidity]
            "thScrRhPband": 10,
            "lampsOn": 0,  # time of day (in morning) to switch on lamps [h]
            "lampsOff": 18,  # time of day (in evening) to switch off lamps [h]
            # lamps are switched off if global radiation is above this value [W m^{-2}]
            "lampsOffSun": 400,
            # Predicted daily radiation sum from the sun where lamps are not used that day [MJ m^{-2} day^{-1}]
            "lampRadSumLimit": 10,
    prev_gl = None
    # Initialize lists to store the results
    tTop_list = []
    vpTop_list = []
    pred_tTop_list = []
    pred_vpTop_list = []
    # Run the simulation
    for current_step in range(int(season_length // season_interval)):
        gl = model.run_model(gl, season_length, season_interval, current_step)
        # If this is not the first iteration, make a prediction
        if prev_gl is not None:
            model_inputs = prepare_inputs(prev_gl, gl, scaler_features, past_time_steps)
            predictions = predict_model.predict(model_inputs)
            predictions_reshaped = predictions.reshape(-1, 2)
            predictions_inverse = scaler_labels.inverse_transform(predictions_reshaped)
            tTop = predictions_inverse[0][0]
            vpTop = predictions_inverse[0][1]
        prev_gl = copy.deepcopy(gl)
        gl = extract_last_value_from_nested_dict(gl)
    # Convert the lists to numpy arrays
    tTop_array = np.array(tTop_list)
    vpTop_array = np.array(vpTop_list)
    pred_tTop_array = np.array(pred_tTop_list)
    pred_vpTop_array = np.array(pred_vpTop_list)
    # Plot the results
    plt.figure(figsize=(10, 5))
    plt.plot(tTop_array, label="Actual tTop")
    plt.plot(pred_tTop_array, label="Predicted tTop")
    plt.ylabel("tTop (°C)")
    plt.figure(figsize=(10, 5))
    plt.plot(vpTop_array, label="Actual vpTop")
    plt.plot(pred_vpTop_array, label="Predicted vpTop")
    plt.ylabel("vpTop (Pa)")
    # Calculate the RRMSE for tTop and vpTop
    rrmse_tTop = calculate_rrmse(tTop_array, pred_tTop_array)
    print(f"Predicted RRMSE for tTop: {rrmse_tTop}")
    rrmse_vpTop = calculate_rrmse(vpTop_array, pred_vpTop_array)
    print(f"Predicted RRMSE for vpTop: {rrmse_vpTop}")
    # Save the results to a CSV file