Advanced Topics in Stochastic Programming#
Prepared by: Jiabao Nie (jnie2@nd.edu) and Jacob Mantooth (jmantoot@nd.edu)
In this project, we extend the farmer’s example, a classic optimization problem that highlights the challenges of resource allocation in agricultural decision-making. Traditionally, this problem assumes deterministic crop yields, enabling straightforward mathematical modeling and solution using conventional methods. However, real-world agricultural systems are inherently uncertain, influenced by variables such as weather, soil quality, and environmental conditions. To address these complexities, we employ advanced optimization techniques that incorporate stochasticity, providing a more realistic framework for decision-making in agriculture.
Learning Objectives
By the end of this project, students will be able to:
Formulate and solve optimization problems using real-world examples that incorporate uncertainty and stochasticity.
Develop proficiency in Python and Pyomo for modeling and solving complex optimization problems.
Understand and implement full-grid and sparse-grid discretization in stochastic optimization frameworks, including their respective advantages and applications.
Gain a deeper understanding of two-stage stochastic optimization, focusing on its application to agricultural systems under uncertainty.
Interpret and analyze optimization results, drawing meaningful conclusions about resource allocation and decision-making in uncertain environments.
Imports#
# Imports
import sys
if "google.colab" in sys.modules:
!wget "https://raw.githubusercontent.com/ndcbe/optimization/main/notebooks/helper.py"
import helper
helper.easy_install()
else:
sys.path.insert(0, '../')
import helper
helper.set_plotting_style()
import pyomo.environ as pyo
from pyomo.environ import (
ConcreteModel, Var, NonNegativeReals, Objective, minimize, ConstraintList,
    summation, SolverFactory, value, Param, Reals, Constraint
)
from pyomo.core.base.numvalue import value
from pyomo.common.collections import ComponentMap
import numpy as np
from itertools import product, combinations, combinations_with_replacement
import pandas as pd
import matplotlib.pyplot as plt
import math
Expanding the Farming Example#
The farmer problem is a classic optimization challenge that highlights the complexities of resource allocation in agricultural decision-making. Traditionally, this problem assumes deterministic crop yields, allowing for straightforward mathematical modeling and solution using conventional methods. However, real-world agricultural systems are inherently subject to uncertainty, influenced by factors such as weather, soil quality, and market conditions. These uncertainties necessitate a more sophisticated approach to accurately capture and address the variability in crop yields.
In this project, we extend the traditional farmer problem by incorporating a stochastic framework where crop yields are modeled as random variables following a normal distribution. This extension introduces the need for efficient evaluation of high-dimensional integrals, which we address using advanced numerical techniques such as sparse grids and Gaussian quadrature. These methods allow for greater computational efficiency and provide alternative approaches for solving the problem.
Additionally, we explore alternative modeling assumptions, such as treating crop yields as independent random variables. This perspective allows for a more comprehensive representation of the problem as a continuous random vector, enabling the development of a two-stage stochastic programming model. By incorporating these enhancements, our work aims to bridge the gap between theoretical optimization frameworks and the complex, dynamic nature of agricultural decision-making.
Our webpage is organized as follows:
Background: An overview of the farmer problem and its relevance
Classic Example: A detailed explanation of the classic farmer problem, including its mathematical formulation and solution using traditional methods.
Proposed Expansion: We propose extending the problem to a stochastic framework, modeling crop yields as random variables following a normal distribution to capture inherent variability arising from outside factors. This approach requires the efficient evaluation of multi-dimensional integrals, for which we will explore advanced numerical techniques, including sparse grids and Gaussian quadrature, to ensure computational efficiency. Furthermore, we plan to investigate alternative modeling assumptions, such as treating the yields of different crops as independent random variables. Under this assumption, yields can be represented as components of a continuous random vector, facilitating the development of a two-stage stochastic programming model.
Background#
The classic farmer problem is a good example in applied mathematics and numerical analysis, showcasing the interplay between optimization, resource management, and decision-making under constraints. At its core, the problem involves determining how to allocate limited resources to maximize outcomes like crop yield or profit, while meeting various constraints imposed by environmental, economic, and operational factors. These constraints often reflect real-world complexities, such as limited water supply, variable soil conditions, or fluctuating market demands.
Even in its traditional deterministic form, the farmer problem has far-reaching implications, serving as a foundational model for more sophisticated studies in agricultural planning, resource optimization, and risk management. Its theoretical versatility has made it a powerful educational tool for introducing key concepts in optimization, such as linear programming and numerical integration.
However, the deterministic assumptions underlying the classic problem fall short of capturing the unpredictable realities faced by farmers. In practice, agricultural systems are subject to inherent uncertainties driven by dynamic and often uncontrollable factors. Crop yields, for example, are profoundly influenced by weather variability, soil health, pest outbreaks, and market volatility. These unpredictable elements create significant challenges, as farmers must navigate decision-making in environments characterized by incomplete or stochastic information.
The growing need to address these uncertainties has propelled the development of stochastic extensions to the farmer problem. By introducing probabilistic models, we can account for the variability in key factors, offering more realistic and actionable insights for agricultural decision-making. Such extensions not only enhance the practical relevance of the model but also provide a platform for exploring advanced computational techniques, such as high-dimensional integration methods and stochastic programming.
In this context, the farmer problem becomes a dynamic framework for bridging theoretical optimization with real-world agricultural challenges, paving the way for robust decision-making tools in the face of uncertainty.
Classic Example#
This section recalls key concepts from class (refer to the lecture handout) and revisits previous work from the notebook originally created by Jialu Wang and revised by Maddie Watson at the University of Notre Dame.
The problem involves a European farmer managing a 500-acre farm, specializing in cultivating wheat, corn, and sugar beets. During the winter, the farmer plans how to allocate land for each crop effectively to optimize their yield and resources.
The farmer knows that at least 200 tons (T) of wheat and 240 T of corn are required for cattle feed. These requirements can be met either through on-farm production or by purchasing from a wholesaler. Any surplus production beyond the feeding requirements can be sold. Over the past decade, the average selling prices have been $170 and $150 per ton for wheat and corn, respectively. However, purchasing these crops from a wholesaler incurs a 40% price premium due to transportation and margin costs.
In addition to wheat and corn, sugar beet is another profitable crop. The farmer expects to sell sugar beets at $36 per ton. However, the European Commission imposes a production quota on sugar beets. Any yield exceeding the 6,000-ton quota can only be sold at a reduced price of $10 per ton.
Based on historical data, the farmer estimates average yields of 2.5 T per acre for wheat, 3 T per acre for corn, and 20 T per acre for sugar beets.
The following code builds the deterministic farmer model; it is adapted from the mpi-sppy documentation examples (see the docstring).
def build_model(yields):
'''
Code adapted from https://mpi-sppy.readthedocs.io/en/latest/examples.html#examples
Arguments:
yields: Yield information as a list, following the rank [wheat, corn, beets]
Return:
model: farmer problem model
'''
model = ConcreteModel()
# Define sets
all_crops = ["WHEAT", "CORN", "BEETS"]
purchase_crops = ["WHEAT", "CORN"]
sell_crops = ["WHEAT", "CORN", "BEETS_FAVORABLE", "BEETS_UNFAVORABLE"]
# Crops field allocation
model.X = Var(all_crops, within=NonNegativeReals)
# How many tons of crops to purchase
model.Y = Var(purchase_crops, within=NonNegativeReals)
# How many tons of crops to sell
model.W = Var(sell_crops,within=NonNegativeReals)
# Objective function
model.PLANTING_COST = 150 * model.X["WHEAT"] + 230 * model.X["CORN"] + 260 * model.X["BEETS"]
model.PURCHASE_COST = 238 * model.Y["WHEAT"] + 210 * model.Y["CORN"]
model.SALES_REVENUE = (
170 * model.W["WHEAT"] + 150 * model.W["CORN"]
+ 36 * model.W["BEETS_FAVORABLE"] + 10 * model.W["BEETS_UNFAVORABLE"]
)
    # Maximizing profit is equivalent to minimizing the negative of profit
model.OBJ = Objective(
expr=model.PLANTING_COST + model.PURCHASE_COST - model.SALES_REVENUE,
sense=minimize
)
# Constraints
model.CONSTR= ConstraintList()
model.CONSTR.add(summation(model.X) <= 500)
model.CONSTR.add(
yields[0] * model.X["WHEAT"] + model.Y["WHEAT"] - model.W["WHEAT"] >= 200
)
model.CONSTR.add(
yields[1] * model.X["CORN"] + model.Y["CORN"] - model.W["CORN"] >= 240
)
model.CONSTR.add(
yields[2] * model.X["BEETS"] - model.W["BEETS_FAVORABLE"] - model.W["BEETS_UNFAVORABLE"] >= 0
)
model.W["BEETS_FAVORABLE"].setub(6000)
return model
# Solve the optimization problem with perfect yields
yields_perfect = [2.5, 3, 20]
model = build_model(yields_perfect)
solver = SolverFactory("ipopt")
solver.solve(model)
# Define a function for printing the optimal solution
def print_opt_sol(model):
'''
Arguments:
model: solved farmer problem model
Return:
Prints the optimal solution
'''
print("===Optimal solutions based on perfect information===")
print('Culture. | ', 'Wheat |', 'Corn |', 'Sugar Beets |')
print('Surface (acres) | ', f'{value(model.X["WHEAT"]):.1f}', '|',
f'{value(model.X["CORN"]):.1f}', ' |',
f'{value(model.X["BEETS"]):.1f}',' |')
print('Yield (T) | ', f'{value(model.X["WHEAT"])*yields_perfect[0]:.1f}', '|',
f'{value(model.X["CORN"])*yields_perfect[1]:.1f}', '|',
f'{value(model.X["BEETS"])*yields_perfect[2]:.1f}','|')
print('Sales (T) | ', f'{value(model.W["WHEAT"]):.1f}', '|',
f'{value(model.W["CORN"]):.1f}', ' |',
f'{value(model.W["BEETS_FAVORABLE"]) + value(model.W["BEETS_UNFAVORABLE"]):.1f}','|')
print('Purchases (T) | ', f'{value(model.Y["WHEAT"]):.1f}', ' |',
f'{value(model.Y["CORN"]):.1f}', ' |',
'-',' |')
profit = -value(model.OBJ)
print('Overall profit: $',f"{profit:.1f}")
return profit
profit_perfect = print_opt_sol(model)
===Optimal solutions based on perfect information===
Culture. | Wheat | Corn | Sugar Beets |
Surface (acres) | 120.0 | 80.0 | 300.0 |
Yield (T) | 300.0 | 240.0 | 6000.0 |
Sales (T) | 100.0 | 0.0 | 6000.0 |
Purchases (T) | 0.0 | 0.0 | - |
Overall profit: $ 118600.0
The solution is straightforward and can be summarized as follows:
Prioritize Sugar Beets: Allocate enough land to sugar beet cultivation to produce exactly 6,000 tons, meeting the quota imposed by the European Commission. This ensures the farmer maximizes profits from sugar beet sales at the higher price.
Meet Feeding Requirements: Dedicate sufficient land to wheat and corn to fulfill the minimum feeding requirements of 200 tons and 240 tons, respectively. This ensures the cattle feed needs are covered without incurring the added expense of purchasing from a wholesaler.
Optimize Remaining Land: Use any remaining acreage for wheat production. This is the most profitable option for utilizing the leftover land, as surplus wheat can be sold at a competitive price. By following this strategy, the farmer maximizes the efficiency of land use while adhering to the constraints and optimizing profits.
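This strategy can be verified by hand from the problem data. The short check below (illustrative, not part of the original notebook) recomputes the reported profit from the optimal allocation:

```python
# Hand check of the perfect-information profit using the problem data:
planting_cost = 150 * 120 + 230 * 80 + 260 * 300   # wheat, corn, beet acreage costs
sales_revenue = 170 * 100 + 36 * 6000              # surplus wheat + quota beet sales
print(sales_revenue - planting_cost)               # 118600, matching the solver output
```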
Proposed Expansion#
We aim to enhance this project by addressing the inherent uncertainty in crop yields more comprehensively, empowering farmers to make more robust and realistic decisions. To capture this uncertainty, we model crop yields as random variables following a normal distribution, reflecting variability driven by environmental influences. This stochastic representation introduces the need to evaluate multi-dimensional integrals, as yield variability impacts optimization decisions across multiple crops and constraints.
To efficiently compute these integrals, we utilize two advanced numerical techniques: full grids and sparse grids, both of which leverage Gaussian quadrature to achieve high-accuracy integration. Expanding the project would allow us to explore different modeling approaches, further optimize computational efficiency, and develop scalable solutions for a wider range of agricultural applications under uncertainty.
An alternate approach involves assuming that crop yields across different crops are independent, allowing us to represent them as components of a continuous random vector. Under this formulation, land allocation decisions are treated as first-stage decisions, determined before yields are known. The optimization problem can then be structured using a two-stage stochastic programming framework, with the second stage capturing adjustments based on observed outcomes. Mathematically, the objective is expressed as:

$$\min_x \quad c^T x + E_{\xi}Q(x,\xi),$$

where \(c\) is the cost of initial planting and \(\xi = (\xi_1,\xi_2,\xi_3)\) is the scenario.
To solve this stochastic optimization problem efficiently, we employ Sample Average Approximation (SAA), a widely used statistical method in stochastic programming. SAA replaces the expected value in the stochastic objective function, \(E_{\xi}Q(x,\xi)\), with a sample average computed from a finite set of scenarios. This transforms the stochastic problem into a deterministic one, which is more tractable.
Looking ahead, we plan to extend the project by treating it as a constrained convex problem, which relates to the second part of this class.
Numerical Background#
Full grid
Full grid methods are a classical approach to numerical integration, where the computational domain is discretized into a Cartesian product of points. For each dimension, a predetermined set of quadrature points is used, with Gaussian quadrature being a particularly efficient choice. Gaussian quadrature selects points and weights to maximize the accuracy of integration for polynomial functions up to a certain degree, reducing the error significantly compared to simple methods like midpoint or trapezoidal rules. For one-dimensional integration, Gaussian quadrature efficiently computes:

$$\int_a^b f(x)\,dx \approx \sum_{i=1}^n w_i f(x_i),$$
where \(x_i\) are the quadrature points and \(w_i\) are the corresponding weights. When extended to \(d\)-dimensional problems, the full grid method involves the Cartesian product of these quadrature points across all dimensions. While highly accurate, the number of points required grows exponentially with the number of dimensions, leading to what is known as the curse of dimensionality. Specifically, the total number of points for \(d\) dimensions and \(n\) quadrature points per dimension is \(n^d\), making this approach computationally infeasible for high-dimensional problems.
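To make the one-dimensional rule concrete, the snippet below (illustrative, not part of the farmer model) integrates \(x^4\) on \([0,2]\) with a 3-point Gauss-Legendre rule, which is exact for polynomials up to degree \(2n-1 = 5\):

```python
import numpy as np

# A 3-point Gauss-Legendre rule is exact up to degree 5, so integrating
# f(x) = x**4 on [0, 2] should reproduce the analytic value 2**5 / 5 = 6.4.
a, b = 0.0, 2.0
pts, wts = np.polynomial.legendre.leggauss(3)   # nodes and weights on [-1, 1]
x = 0.5 * (b - a) * pts + 0.5 * (b + a)         # affine map of nodes to [a, b]
print(0.5 * (b - a) * np.sum(wts * x**4))       # ~6.4
```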
Sparse Grid
Sparse grids address the limitations of full grids by significantly reducing the number of points while maintaining comparable accuracy. The key idea is to focus on the most “informative” points in the integration domain, based on the hierarchical structure of the quadrature. Sparse grids achieve this by combining lower-dimensional full grids in a systematic way, effectively excluding redundant points.
The foundation of sparse grids lies in the Smolyak algorithm, which constructs a sparse grid approximation for a d-dimensional integral as:

$$\int_{[a,b]^d} f(x)\,dx \approx \sum_{|\mathbf{i}|<m} c_{\mathbf{i}} \prod_{k=1}^{d} \int_{a_k}^{b_k} f(x_k)\,dx_k,$$

where \(c_{\mathbf{i}}\) are coefficients determined by the hierarchical structure. By selectively including quadrature points based on their contribution to the approximation accuracy, sparse grids avoid the exponential growth in points observed with full grids. One of the primary advantages of sparse grids is their efficiency in handling high-dimensional integrals. However, sparse grids may exhibit slightly higher errors for functions with highly localized or non-smooth behavior due to the reduced number of points.
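For intuition on the growth that sparse grids are designed to avoid, the snippet below (illustrative) prints the full-grid point count \(n^d\) for a 7-point rule as the dimension increases:

```python
# Curse of dimensionality for full tensor grids: with n quadrature points per
# dimension, a full grid needs n**d points.
n = 7
for d in (1, 2, 3, 5, 10):
    print(f"d = {d:2d}: {n**d:,} full-grid points")
```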
Full Grid#
Unlike the classical deterministic farmer problem, where crop yields are assumed to be known with certainty and we have perfect information, real-world scenarios often involve significant uncertainty. In this model, we account for this uncertainty by assuming that the yields of crops (WHEAT, CORN, and BEETS) follow a normal distribution characterized by a mean and standard deviation. By incorporating stochastic yields, we can create a more robust and realistic decision-making framework.
To solve this problem, we employ a Full Grid Method using Gaussian quadrature to approximate the random variables. Gaussian quadrature allows us to efficiently represent the uncertainty in yields by selecting representative points (quadrature points) and associated weights, which capture the variability and likelihood of different yield outcomes. These points are combined to create scenarios that represent all possible combinations of crop yields, and scenario weights are calculated to reflect their probabilities.
This approach ensures that the model captures the complexity and uncertainty of agricultural planning while leveraging stochastic programming techniques to optimize planting decisions, purchasing strategies, and sales, given the variability in crop yields.
The choice to model crop yields with a normal distribution is grounded in practical and theoretical considerations. Crop yields are influenced by a multitude of factors, such as weather, soil quality, and pest outbreaks. Using a normal distribution to represent these uncertainties provides several advantages:
Central Limit Theorem (CLT): While the Central Limit Theorem (CLT) suggests that the sum of many independent random variables tends to approximate a normal distribution, real-world factors influencing crop yield—such as weather patterns and soil conditions—are often interdependent. Despite this, the normal distribution can still serve as a reasonable approximation for yield due to the aggregate effect of numerous contributing variables.
Simplicity and Analytical Convenience: The normal distribution is well understood and widely used in stochastic modeling
Tools like Gaussian quadrature can efficiently approximate expected values under normality, as was applied here using Hermite polynomials to generate points and weights.
Practicality: Many agricultural studies and datasets report crop yields in terms of means and standard deviations, aligning naturally with normal distribution assumptions.
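As a sanity check on this choice (a short illustration, not part of the original model), the same Gauss-Hermite change of variables used in the code below should reproduce the first two moments of a normal distribution:

```python
import numpy as np

# For X ~ N(mu, sigma**2), transformed Gauss-Hermite points and normalized
# weights should give sum(w) = 1, E[X] = mu, and E[X**2] = mu**2 + sigma**2.
mu, sigma = 3.0, 0.5                            # wheat mean and std from the model
pts, wts = np.polynomial.hermite.hermgauss(7)
x = pts * np.sqrt(2) * sigma + mu               # change of variables
w = wts / np.sqrt(np.pi)                        # normalize the weights
print(np.sum(w), np.sum(w * x), np.sum(w * x**2))   # ~1.0, ~3.0, ~9.25
```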
def build_model():
'''
Parameters:
-none
Return:
-farmer problem model
'''
model = ConcreteModel()
# Define sets
all_crops = ["WHEAT", "CORN", "BEETS"]
purchase_crops = ["WHEAT", "CORN"]
sell_crops = ["WHEAT", "CORN", "BEETS_FAVORABLE", "BEETS_UNFAVORABLE"]
# Define random variables (mean and standard deviation for yields)
yield_means = {"WHEAT": 3.0, "CORN": 3.6, "BEETS": 20.0} # Mean yields
yield_stds = {"WHEAT": 0.5, "CORN": 0.4, "BEETS": 1.0} # Standard deviations
# Use Gaussian quadrature points and weights
gauss_points, gauss_weights = {}, {}
num_points = 7 # Number of points for quadrature
for crop in all_crops:
points, weights = np.polynomial.hermite.hermgauss(num_points)
gauss_points[crop] = points * np.sqrt(2) * yield_stds[crop] + yield_means[crop]
gauss_weights[crop] = weights / np.sqrt(np.pi)
# Generate scenarios (Full Grid using product of points)
scenarios = list(product(*[gauss_points[crop] for crop in all_crops]))
scenario_weights = list(product(*[gauss_weights[crop] for crop in all_crops]))
scenario_weights = [np.prod(weights) for weights in scenario_weights]
# Define variables
model.X = Var(all_crops, within=NonNegativeReals) # Crops field allocation
model.Y = Var(purchase_crops, within=NonNegativeReals) # Tons of crops to purchase
model.W = Var(sell_crops, within=NonNegativeReals) # Tons of crops to sell
# Placeholder for scenario-based costs and revenues
model.ScenarioCosts = ComponentMap()
model.ScenarioRevenues = ComponentMap()
# Constraints for each scenario
model.CONSTR = ConstraintList()
for idx, scenario in enumerate(scenarios):
scenario_yields = {crop: scenario[i] for i, crop in enumerate(all_crops)}
scenario_weight = scenario_weights[idx]
# Calculate costs and revenues for this scenario
planting_cost = (
150 * model.X["WHEAT"] + 230 * model.X["CORN"] + 260 * model.X["BEETS"]
)
purchase_cost = 238 * model.Y["WHEAT"] + 210 * model.Y["CORN"]
sales_revenue = (
170 * model.W["WHEAT"] + 150 * model.W["CORN"]
+ 36 * model.W["BEETS_FAVORABLE"] + 10 * model.W["BEETS_UNFAVORABLE"]
)
# Store scenario contributions
model.ScenarioCosts[idx] = scenario_weight * (planting_cost + purchase_cost)
model.ScenarioRevenues[idx] = scenario_weight * sales_revenue
# Add constraints for this scenario
model.CONSTR.add(summation(model.X) <= 500)
model.CONSTR.add(
scenario_yields["WHEAT"] * model.X["WHEAT"] + model.Y["WHEAT"] - model.W["WHEAT"] >= 200
)
model.CONSTR.add(
scenario_yields["CORN"] * model.X["CORN"] + model.Y["CORN"] - model.W["CORN"] >= 240
)
model.CONSTR.add(
scenario_yields["BEETS"] * model.X["BEETS"] - model.W["BEETS_FAVORABLE"] - model.W["BEETS_UNFAVORABLE"] >= 0
)
# Set upper bound for BEETS_FAVORABLE
model.W["BEETS_FAVORABLE"].setub(6000)
# Objective function: Weighted expected cost minus revenue
model.OBJ = Objective(
expr=sum(model.ScenarioCosts[idx] - model.ScenarioRevenues[idx] for idx in range(len(scenarios))),
sense=minimize
)
return model
model = build_model()
solver = SolverFactory("ipopt")
solver.solve(model)
def print_opt_sol(model):
'''
Arguments:
model: solved farmer problem model
Return:
Prints the optimal solution
'''
print("=== Optimal solutions ===")
print()
# Table header
print("{:<15} | {:>8} | {:>8} | {:>12}".format("Culture", "Wheat", "Corn", "Sugar Beets"))
print("-" * 50)
# Surface (acres)
print("{:<15} | {:>8.1f} | {:>8.1f} | {:>12.1f}".format(
"Surface (acres)",
value(model.X["WHEAT"]),
value(model.X["CORN"]),
value(model.X["BEETS"])
))
# Yield (T)
print("{:<15} | {:>8.1f} | {:>8.1f} | {:>12.1f}".format(
"Yield (T)",
value(model.X["WHEAT"]) * yields_perfect[0],
value(model.X["CORN"]) * yields_perfect[1],
value(model.X["BEETS"]) * yields_perfect[2]
))
# Sales (T)
print("{:<15} | {:>8.1f} | {:>8.1f} | {:>12.1f}".format(
"Sales (T)",
value(model.W["WHEAT"]),
value(model.W["CORN"]),
value(model.W["BEETS_FAVORABLE"]) + value(model.W["BEETS_UNFAVORABLE"])
))
# Purchases (T)
print("{:<15} | {:>8.1f} | {:>8.1f} | {:>12}".format(
"Purchases (T)",
value(model.Y["WHEAT"]),
value(model.Y["CORN"]),
"-"
))
print("-" * 50)
# Profit
profit = -value(model.OBJ)
print("{:<15} {}".format("Overall profit:", f"${profit:.1f}"))
return profit
profit_perfect = print_opt_sol(model)
=== Optimal solutions ===
Culture | Wheat | Corn | Sugar Beets
--------------------------------------------------
Surface (acres) | 16.5 | 114.3 | 369.2
Yield (T) | 41.2 | 342.9 | 7384.8
Sales (T) | 0.0 | 0.0 | 6000.0
Purchases (T) | 181.5 | 0.0 | -
--------------------------------------------------
Overall profit: $48047.3
Out of the available land, 16.5 acres were allocated to wheat, 114.3 acres to corn, and 369.2 acres to sugar beets. This distribution indicates that sugar beets were identified as the most economically beneficial crop.
The yield results support this prioritization. Wheat and corn produced 41.2 tons and 342.9 tons, respectively (evaluated here at the deterministic yields used by the printing routine), while sugar beets yielded an impressive 7,384.8 tons. No sales were reported for wheat or corn; instead, 181.5 tons of wheat were purchased to cover the 200-ton feed requirement. In contrast, 6,000 tons of sugar beets were sold, making it evident that sugar beets were cultivated primarily for market sales and served as the primary revenue driver.
The overall profit achieved through this plan was $48,047, noticeably lower than the $118,600 obtained with perfect information. Because the purchase and sale decisions here must remain feasible across every yield scenario, the model hedges by relying on wheat purchases rather than on-farm wheat production.
In conclusion, the optimal plan demonstrates the importance of aligning crop selection with market dynamics and operational priorities. By prioritizing sugar beet production and focusing sales on the most profitable crop, the farm maximized its expected profit while maintaining resource efficiency.
Sparse Grid#
To solve this problem, we now employ a Sparse Grid Method using Gaussian quadrature to approximate the random variables. The Sparse Grid Method builds on Gaussian quadrature by efficiently constructing a multidimensional grid that reduces the number of representative points (quadrature points) while maintaining accuracy. This approach strategically combines points from lower-dimensional grids to approximate the uncertainty in yields with fewer computational resources than a full grid.
These quadrature points and their associated weights capture the variability and likelihood of different yield outcomes. The sparse grid approach generates a reduced set of scenarios spanning the combinations of crop yields in a computationally efficient manner, while scenario weights reflect the probabilities of these outcomes.
By leveraging the Sparse Grid Method, the model captures the complexity and uncertainty of agricultural planning while optimizing planting decisions, purchasing strategies, and sales under stochastic programming principles, ensuring a practical yet rigorous framework for handling variability in crop yields.
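For a sense of the savings, the implementation below defaults to a 3-point one-dimensional rule per crop, compared with the 7-point rule used by the full-grid model above; the illustrative count below compares the resulting scenario totals:

```python
# Scenario counts for the two discretizations in this notebook:
print("full grid   :", 7**3, "scenarios")   # 343
print("reduced grid:", 3**3, "scenarios")   # 27
```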
def sparse_grid_points_and_weights(d, m, all_crops, yield_means, yield_stds, num_points=3):
"""Construct sparse grid points and weights using Smolyak's algorithm.
Parameters:
- d
- m
- all_crops (list): List of all crops.
- yield_means (dict): Mean yields for each crop.
- yield_stds (dict): Standard deviations for each crop.
Returns:
- A tuple containing sparse grid points and weights.
"""
# Generate 1D Gaussian quadrature points and weights
points_1d, weights_1d = {}, {}
for crop in all_crops:
points, weights = np.polynomial.hermite.hermgauss(num_points)
points_1d[crop] = points * np.sqrt(2) * yield_stds[crop] + yield_means[crop]
weights_1d[crop] = weights / np.sqrt(np.pi)
# Combine 1D rules to form sparse grid
sparse_points = []
sparse_weights = []
index_set = list(range(num_points)) # Index sets for 1D quadrature
    for idx_comb in product(index_set, repeat=d):
        weight = np.prod([weights_1d[all_crops[k]][idx] for k, idx in enumerate(idx_comb)])
        point = tuple(points_1d[all_crops[k]][idx] for k, idx in enumerate(idx_comb))
        sparse_weights.append(weight)
        sparse_points.append(point)
return sparse_points, sparse_weights
def build_model():
'''
input: none
output: farmer problem model
'''
model = ConcreteModel()
# Define sets
all_crops = ["WHEAT", "CORN", "BEETS"]
purchase_crops = ["WHEAT", "CORN"]
sell_crops = ["WHEAT", "CORN", "BEETS_FAVORABLE", "BEETS_UNFAVORABLE"]
# Define random variables (mean and standard deviation for yields)
yield_means = {"WHEAT": 3.0, "CORN": 3.6, "BEETS": 20.0} # Mean yields
yield_stds = {"WHEAT": 0.5, "CORN": 0.4, "BEETS": 1.0} # Standard deviations
# Generate sparse grid scenarios and weights
sparse_points, sparse_weights = sparse_grid_points_and_weights(
d=len(all_crops), m=4, all_crops=all_crops, yield_means=yield_means, yield_stds=yield_stds
)
# Define variables
model.X = Var(all_crops, within=NonNegativeReals) # Crops field allocation
model.Y = Var(purchase_crops, within=NonNegativeReals) # Tons of crops to purchase
model.W = Var(sell_crops, within=NonNegativeReals) # Tons of crops to sell
# Placeholder for scenario-based costs and revenues
model.ScenarioCosts = {}
model.ScenarioRevenues = {}
# Constraints for each scenario
model.CONSTR = ConstraintList()
for idx, scenario in enumerate(sparse_points):
scenario_yields = {crop: scenario[i] for i, crop in enumerate(all_crops)}
scenario_weight = sparse_weights[idx]
# Calculate costs and revenues for this scenario
planting_cost = (
150 * model.X["WHEAT"] + 230 * model.X["CORN"] + 260 * model.X["BEETS"]
)
purchase_cost = 238 * model.Y["WHEAT"] + 210 * model.Y["CORN"]
sales_revenue = (
170 * model.W["WHEAT"] + 150 * model.W["CORN"]
+ 36 * model.W["BEETS_FAVORABLE"] + 10 * model.W["BEETS_UNFAVORABLE"]
)
# Store scenario contributions
model.ScenarioCosts[idx] = scenario_weight * (planting_cost + purchase_cost)
model.ScenarioRevenues[idx] = scenario_weight * sales_revenue
# Add constraints for this scenario
model.CONSTR.add(sum(model.X[crop] for crop in all_crops) <= 500)
model.CONSTR.add(
scenario_yields["WHEAT"] * model.X["WHEAT"] + model.Y["WHEAT"] - model.W["WHEAT"] >= 200
)
model.CONSTR.add(
scenario_yields["CORN"] * model.X["CORN"] + model.Y["CORN"] - model.W["CORN"] >= 240
)
model.CONSTR.add(
scenario_yields["BEETS"] * model.X["BEETS"] - model.W["BEETS_FAVORABLE"] - model.W["BEETS_UNFAVORABLE"] >= 0
)
# Set upper bound for BEETS_FAVORABLE
model.W["BEETS_FAVORABLE"].setub(6000)
# Objective function: Weighted expected cost minus revenue
model.OBJ = Objective(
expr=sum(model.ScenarioCosts[idx] - model.ScenarioRevenues[idx] for idx in range(len(sparse_points))),
sense=minimize
)
return model
model = build_model()
solver = SolverFactory("ipopt")
solver.solve(model)
profit_perfect = print_opt_sol(model)
=== Optimal solutions ===
Culture | Wheat | Corn | Sugar Beets
--------------------------------------------------
Surface (acres) | 89.0 | 82.6 | 328.4
Yield (T) | 222.5 | 247.7 | 6568.9
Sales (T) | 0.0 | 0.0 | 6000.0
Purchases (T) | 10.1 | 0.0 | -
--------------------------------------------------
Overall profit: $95869.5
Two-Stage Stochastic Programming#
Contrary to the assumption made in 4.1. Stochastic Programming, we may also assume that yields for the different crops are independent. In that case, we may as well consider a continuous random vector for the yields. Again, the decisions on land allocation are first-stage decisions because they are taken before knowledge of the yields. The second stage can again be described through \(E_{\xi}Q(x,\xi)\), where \(Q(x,\xi)\) is the value of the second stage for a given realization of the random vector. The optimization problem becomes:
$$\min_x \quad c^T x + E_{\xi}Q(x,\xi),$$

where \(c\) is the cost of initial planting and \(\xi = (\xi_1,\xi_2,\xi_3)\) is the scenario.
Now, in this particular example, the computation of \(Q(x,\xi)\) can be separated among the three crops due to independence of the random vector. We can then write:
$$E_{\xi}Q(x,\xi) = \sum_{i=1}^3 E_{\xi_i}Q_i(x_i,\xi_i),$$

where \(Q_i(x_i,\xi_i)\) is the optimal second-stage value of purchases and sales for crop \(i\).
The remaining question is how to compute \(E_{\xi_i}Q_i(x_i,\xi_i)\). This section introduces and implements the following methods to address this:
Sample Average Approximation
Exact Integral
To illustrate this, let us assume that the yield for each crop \(i\), denoted by \(t_i\), can be appropriately described by a uniform random variable on some range \([l_i,u_i]\). For the sake of comparison, we may take \(l_i\) to be 80% of the mean yield and \(u_i\) to be 120% of the mean yield, so that the expected yields match those in 4.1. Stochastic Programming (this will be needed later to calculate the value of the stochastic solution, VSS).
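For the exact-integral method, the expected recourse for a single crop can be integrated directly against the uniform yield density. The sketch below is an illustration under our own simplifying assumptions (the helper names are hypothetical, and the closed-form wheat recourse assumes it is optimal to buy at $238/T when short of the 200 T requirement and to sell surplus at $170/T):

```python
from scipy.integrate import quad

def Q_wheat(x, t):
    # Optimal second-stage value for wheat given acreage x and realized yield t:
    # buy the shortfall at 238 $/T, sell any surplus at 170 $/T.
    production = t * x
    return 238 * max(200 - production, 0) - 170 * max(production - 200, 0)

def expected_Q_wheat(x, l=2.0, u=3.0):
    # Yield t ~ U[l, u] with l = 0.8 * 2.5 and u = 1.2 * 2.5; density 1/(u - l).
    val, _ = quad(lambda t: Q_wheat(x, t), l, u)
    return val / (u - l)

print(expected_Q_wheat(120.0))   # expected second-stage value for 120 acres of wheat
```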
Sample Average Approximation (SAA)#
Sample Average Approximation (SAA) is a statistical method commonly used in stochastic programming and optimization to solve problems with uncertainty. The idea is to replace the expected values in a stochastic objective function (\(E_{\xi} Q(x,\xi)\), in our case) with sample averages, using a finite set of random samples (scenarios). This converts the original stochastic problem into a deterministic one, which is often easier to solve. Here is a basic breakdown of the steps involved:
Sample Generation: A finite number of random samples (or scenarios) are drawn from the probability distribution representing the uncertainty in the problem. These samples capture possible realizations of the uncertain parameters.
Objective Approximation: The expected objective function in the original stochastic problem is approximated by the sample average of the objective over these scenarios.
Deterministic Problem Formation: With the sample average objective function in place, the problem becomes deterministic and can be solved using standard optimization techniques.
Typically, a larger sample size improves the quality of the approximation but increases computational costs.
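To see the effect of sample size directly, the small illustrative experiment below estimates the mean of a uniform wheat yield on \([2.0, 3.0]\) for growing \(N\); the sample average settles toward the true mean of 2.5:

```python
import numpy as np

# SAA in miniature: sample averages of a uniform yield converge to its mean.
rng = np.random.default_rng(0)
for N in (10, 100, 1000, 10000):
    print(f"N = {N:5d}: sample mean = {rng.uniform(2.0, 3.0, N).mean():.4f}")
```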
In the farmers’ example, the SAA approach estimates \(E_{\xi_i}Q_i(x_i,\xi_i)\) as

$$E_{\xi_i}Q_i(x_i,\xi_i) \approx \frac{1}{N}\sum_{j=1}^{N} Q_i(x_i,\xi_{ij}),$$

where \(\xi_{i1}, \xi_{i2},\ldots,\xi_{iN}\) are independent samples of \(\xi_i\), and \(N\) is the number of scenarios chosen. Specifically, for a uniform random variable, each \(\xi_{ij}\) is drawn randomly from \(U[l_i,u_i]\).
Consequently, the optimization problem can be reformulated as the deterministic program

$$\min_x \quad c^T x + \frac{1}{N}\sum_{j=1}^{N} Q(x,\xi_j),$$

which the following code builds in Pyomo:
def SAA_uniform_model(N):
'''
Parameters:
    - N: Number of sampled yield scenarios.
Returns:
- A Pyomo ConcreteModel representing the SAA uniform model.
'''
# Define the data
all_crops = ["Wheat", "Corn", "SugarBeets"]
purchase_crops = ["Wheat", "Corn"]
sale_crops = ["Wheat", "Corn", "BEETS_FAVORABLE", "BEETS_UNFAVORABLE"]
land_capacity = 500
crop_costs = {"Wheat": 150, "Corn": 230, "SugarBeets": 260}
sale_price = {"Wheat": 170, "Corn": 150, "BEETS_FAVORABLE": 36, "BEETS_UNFAVORABLE": 10}
purchase_price = {"Wheat": 238, "Corn": 210}
min_require = {"Wheat": 200, "Corn": 240}
avg_yield = {"Wheat": 2.5, "Corn": 3.0, "SugarBeets": 20.0}
lower_yield = {crop: 0.8 * avg_yield[crop] for crop in all_crops}
upper_yield = {crop: 1.2 * avg_yield[crop] for crop in all_crops}
# Generate random yields for N scenarios
yields = {
crop: {n: np.random.uniform(lower_yield[crop], upper_yield[crop]) for n in range(N)}
for crop in all_crops
}
model = ConcreteModel()
model.x = Var(all_crops, domain=NonNegativeReals)
model.y = Var(purchase_crops, range(N), domain=NonNegativeReals)
model.w = Var(sale_crops, range(N), domain=NonNegativeReals)
# Parameter
model.yields = Param(
all_crops, range(N), initialize=lambda m, crop, n: yields[crop][n], within=Reals
)
# Objective function
def objective_rule(m):
first_stage_cost = sum(crop_costs[crop] * m.x[crop] for crop in all_crops)
second_stage_cost = (
(1 / N)
* sum(
-sale_price[sale] * m.w[sale, n] for sale in sale_crops for n in range(N)
)
+ (1 / N)
* sum(
purchase_price[purchase] * m.y[purchase, n]
for purchase in purchase_crops
for n in range(N)
)
)
return first_stage_cost + second_stage_cost
model.obj = Objective(rule=objective_rule, sense=minimize)
def land_constraint_rule(m):
return sum(m.x[crop] for crop in all_crops) <= land_capacity
model.land_constraint = Constraint(rule=land_constraint_rule)
def wheat_produce_rule(m,n):
return m.x["Wheat"]*m.yields["Wheat", n] + m.y["Wheat", n] - m.w["Wheat", n] >= min_require["Wheat"]
model.wheat_produce = Constraint(range(N), rule=wheat_produce_rule)
def corn_produce_rule(m,n):
return m.x["Corn"]*m.yields["Corn", n] + m.y["Corn", n] - m.w["Corn", n] >= min_require["Corn"]
model.corn_produce = Constraint(range(N), rule=corn_produce_rule)
def beets_produce_rule(m,n):
return m.w["BEETS_FAVORABLE", n] + m.w["BEETS_UNFAVORABLE", n] <= m.yields["SugarBeets", n]*m.x["SugarBeets"]
model.beets_produce = Constraint(range(N), rule=beets_produce_rule)
def beets_favorable_rule(m,n):
return m.w["BEETS_FAVORABLE", n] <= 6000
model.beets_favorable = Constraint(range(N), rule=beets_favorable_rule)
return model
# Example for N=10
model = SAA_uniform_model(10)
solver = SolverFactory("ipopt")
solver.solve(model,tee=True)
Ipopt 3.13.2:
******************************************************************************
This program contains Ipopt, a library for large-scale nonlinear optimization.
Ipopt is released as open source code under the Eclipse Public License (EPL).
For more information visit http://projects.coin-or.org/Ipopt
This version of Ipopt was compiled from source code available at
https://github.com/IDAES/Ipopt as part of the Institute for the Design of
Advanced Energy Systems Process Systems Engineering Framework (IDAES PSE
Framework) Copyright (c) 2018-2019. See https://github.com/IDAES/idaes-pse.
This version of Ipopt was compiled using HSL, a collection of Fortran codes
for large-scale scientific computation. All technical papers, sales and
publicity material resulting from use of the HSL codes within IPOPT must
contain the following acknowledgement:
HSL, a collection of Fortran codes for large-scale scientific
computation. See http://www.hsl.rl.ac.uk.
******************************************************************************
This is Ipopt version 3.13.2, running with linear solver ma27.
Number of nonzeros in equality constraint Jacobian...: 0
Number of nonzeros in inequality constraint Jacobian.: 103
Number of nonzeros in Lagrangian Hessian.............: 0
Total number of variables............................: 63
variables with only lower bounds: 63
variables with lower and upper bounds: 0
variables with only upper bounds: 0
Total number of equality constraints.................: 0
Total number of inequality constraints...............: 41
inequality constraints with only lower bounds: 20
inequality constraints with lower and upper bounds: 0
inequality constraints with only upper bounds: 21
iter objective inf_pr inf_du lg(mu) ||d|| lg(rg) alpha_du alpha_pr ls
0 7.2199928e+00 2.40e+02 6.01e+00 -1.0 0.00e+00 - 0.00e+00 0.00e+00 0
1 1.0961153e+02 2.39e+02 2.74e+01 -1.0 1.87e+02 - 4.58e-04 1.23e-02h 1
2 1.7737678e+02 2.39e+02 2.73e+01 -1.0 1.63e+02 - 2.06e-03 2.19e-03h 1
3 3.8485686e+02 2.37e+02 2.72e+01 -1.0 1.63e+02 - 1.03e-03 6.32e-03h 1
4 6.9645766e+02 2.35e+02 2.69e+01 -1.0 1.61e+02 - 3.41e-03 9.33e-03h 1
5 1.1190314e+03 2.32e+02 2.66e+01 -1.0 1.56e+02 - 8.44e-03 1.28e-02h 1
6 2.9315838e+03 2.19e+02 2.51e+01 -1.0 1.53e+02 - 3.53e-03 5.53e-02h 1
7 8.4633078e+03 1.79e+02 2.05e+01 -1.0 1.42e+02 - 1.13e-02 1.84e-01h 1
8 8.8850960e+03 1.74e+02 2.00e+01 -1.0 2.50e+02 - 2.03e-02 2.59e-02h 1
9 9.8115298e+03 1.66e+02 1.90e+01 -1.0 1.17e+02 - 9.64e-03 4.95e-02h 1
iter objective inf_pr inf_du lg(mu) ||d|| lg(rg) alpha_du alpha_pr ls
10 9.7998826e+03 1.65e+02 1.89e+01 -1.0 2.18e+02 - 2.04e-03 5.27e-03f 1
11 9.8824285e+03 1.64e+02 1.88e+01 -1.0 1.17e+02 - 1.59e-02 5.04e-03h 1
12 1.0069094e+04 1.56e+02 1.79e+01 -1.0 2.30e+02 - 3.73e-03 4.85e-02h 1
13 1.0063950e+04 1.56e+02 1.79e+01 -1.0 7.52e+02 - 1.21e-02 1.17e-04f 1
14 1.0873846e+04 1.47e+02 1.68e+01 -1.0 1.17e+02 - 3.86e-03 6.03e-02h 1
15 1.0642981e+04 1.46e+02 1.67e+01 -1.0 1.09e+03 - 1.51e-03 2.40e-03f 1
16 1.0675561e+04 1.46e+02 1.67e+01 -1.0 1.08e+02 - 3.66e-02 2.28e-03h 1
17 1.0959826e+04 1.36e+02 1.56e+01 -1.0 1.43e+02 - 3.87e-04 6.72e-02h 1
18 1.0803055e+04 1.36e+02 1.56e+01 -1.0 3.00e+03 - 1.09e-03 5.19e-04f 1
19 1.0395432e+04 1.36e+02 1.56e+01 -1.0 9.86e+03 - 5.02e-04 5.21e-04f 1
iter objective inf_pr inf_du lg(mu) ||d|| lg(rg) alpha_du alpha_pr ls
20 1.0394980e+04 1.36e+02 1.56e+01 -1.0 2.25e+02 - 3.73e-02 2.83e-04h 1
21 1.0467561e+04 1.34e+02 1.54e+01 -1.0 1.30e+02 - 4.00e-04 1.32e-02h 1
22 1.0471944e+04 1.34e+02 1.54e+01 -1.0 1.11e+02 - 1.74e-02 4.24e-04h 1
23 1.0502607e+04 1.33e+02 1.53e+01 -1.0 1.48e+02 - 5.79e-04 6.39e-03h 1
24 1.0623796e+04 1.32e+02 1.51e+01 -1.0 1.02e+02 - 2.32e-02 9.62e-03h 1
25 8.9957165e+03 1.05e+02 1.20e+01 -1.0 2.69e+02 - 2.48e-04 2.06e-01f 1
26 -9.2428407e+04 1.01e+02 1.16e+01 -1.0 3.06e+04 - 1.55e-03 3.65e-02f 1
27 -9.3192171e+04 1.00e+02 4.96e+01 -1.0 1.12e+03 - 5.15e-01 8.55e-03f 1
28 -7.7188992e+04 4.74e+01 6.51e+00 -1.0 1.92e+02 - 1.89e-01 5.27e-01h 1
29 -6.2991915e+04 0.00e+00 3.84e+01 -1.0 9.41e+01 - 5.72e-02 1.00e+00h 1
iter objective inf_pr inf_du lg(mu) ||d|| lg(rg) alpha_du alpha_pr ls
30 -7.9535160e+04 0.00e+00 3.79e+01 -1.0 2.08e+03 - 1.24e-02 1.00e+00f 1
31 -1.1030479e+05 0.00e+00 2.64e+01 -1.0 6.13e+03 - 3.05e-01 6.30e-01f 1
32 -1.1061324e+05 0.00e+00 1.82e+01 -1.0 2.75e+02 - 3.08e-01 1.40e-01f 1
33 -1.1076799e+05 0.00e+00 1.82e+01 -1.0 7.54e+01 - 3.36e-03 3.69e-01f 1
34 -1.1406306e+05 0.00e+00 9.01e-01 -1.0 1.08e+03 - 1.00e+00 5.70e-01f 1
35 -1.1409653e+05 0.00e+00 5.87e-01 -1.0 1.81e+01 - 6.16e-01 3.49e-01f 1
36 -1.1410890e+05 0.00e+00 3.15e-01 -1.0 9.49e+00 - 4.08e-01 4.63e-01f 1
37 -1.1412090e+05 0.00e+00 3.35e-01 -1.0 1.79e+01 - 2.17e-01 1.00e+00f 1
38 -1.1415302e+05 0.00e+00 8.53e-02 -1.0 6.38e+01 - 2.35e-01 7.43e-01f 1
39 -1.1415313e+05 0.00e+00 6.19e-04 -1.0 2.61e-01 - 9.96e-01 1.00e+00f 1
iter objective inf_pr inf_du lg(mu) ||d|| lg(rg) alpha_du alpha_pr ls
40 -1.1416825e+05 0.00e+00 3.71e-03 -2.5 2.52e-01 - 9.38e-01 9.66e-01f 1
41 -1.1416926e+05 0.00e+00 1.50e-09 -3.8 4.56e-01 - 1.00e+00 1.00e+00f 1
42 -1.1416928e+05 0.00e+00 1.85e-11 -5.7 8.08e-04 - 1.00e+00 1.00e+00f 1
43 -1.1416928e+05 0.00e+00 5.69e-14 -8.6 3.05e-05 - 1.00e+00 1.00e+00f 1
Number of Iterations....: 43
(scaled) (unscaled)
Objective...............: -4.3911262994527024e+04 -1.1416928378577025e+05
Dual infeasibility......: 5.6906757453431017e-14 1.4795756937892062e-13
Constraint violation....: 0.0000000000000000e+00 0.0000000000000000e+00
Complementarity.........: 2.5383208668605393e-09 6.5996342538374021e-09
Overall NLP error.......: 2.5383208668605393e-09 6.5996342538374021e-09
Number of objective function evaluations = 44
Number of objective gradient evaluations = 44
Number of equality constraint evaluations = 0
Number of inequality constraint evaluations = 44
Number of equality constraint Jacobian evaluations = 0
Number of inequality constraint Jacobian evaluations = 44
Number of Lagrangian Hessian evaluations = 43
Total CPU secs in IPOPT (w/o function evaluations) = 0.021
Total CPU secs in NLP function evaluations = 0.001
EXIT: Optimal Solution Found.
{'Problem': [{'Lower bound': -inf, 'Upper bound': inf, 'Number of objectives': 1, 'Number of constraints': 41, 'Number of variables': 63, 'Sense': 'unknown'}], 'Solver': [{'Status': 'ok', 'Message': 'Ipopt 3.13.2\\x3a Optimal Solution Found', 'Termination condition': 'optimal', 'Id': 0, 'Error rc': 0, 'Time': 0.09827899932861328}], 'Solution': [OrderedDict([('number of solutions', 0), ('number of solutions displayed', 0)])]}
# Define a function to process and display crop results
def display_land_allocation_results(model, crops, units="acres", decimals=2):
"""
Display land allocation results with rounding and units.
Parameters:
- model: The Pyomo model containing decision variables and the objective function.
- crops: List of crop names to extract results for.
    - units: Units for land allocation (default is 'acres').
- decimals: Number of decimal places to round the results to (default is 2).
"""
# Extract and round the results
crop_results = {
crop: round(model.x[crop].value, decimals) for crop in crops
}
# Convert results to a DataFrame
crop_results_df = pd.DataFrame.from_dict(crop_results, orient="index", columns=["Land Allocation"])
crop_results_df["Units"] = units
# Display the results
print("Land Allocation Results:\n", crop_results_df)
# Calculate and display the profit
profit = -round(model.obj(), decimals)
print(f"Total profit: {profit} (dollars)")
# Example usage
crops = ["Wheat", "Corn", "SugarBeets"]
display_land_allocation_results(model, crops, units="acres", decimals=2)
Land Allocation Results:
Land Allocation Units
Wheat 131.62 acres
Corn 88.04 acres
SugarBeets 280.34 acres
Total profit: 114169.28 (dollars)
Statistical Inference#
When solving the optimization problem using the Sample Average Approximation (SAA) method, it is common to observe variability in the results across different samples, even if the sample size remains fixed. This variability arises due to the randomness inherent in the sampling process.
def solve_and_collect_results(N, N_samples, crops, solver_name="ipopt", units="acres", decimals=2):
"""
Solve the SAA_uniform_model multiple times and collect results.
Parameters:
    - N: Number of sampled yield scenarios per model instance.
- N_samples: Number of samples to solve.
- crops: List of crops to extract results for.
- solver_name: Name of the solver to use
    - units: Units for land allocation (default is 'acres').
- decimals: Number of decimal places to round results
Returns:
- A pandas DataFrame containing results for all samples.
"""
results_data = []
solver = SolverFactory(solver_name)
results_data2 = []
for i in range(N_samples):
model = SAA_uniform_model(N) # Create a new model instance
solver.solve(model) # Solve the model
# Collect results for each crop
crop_results = {
crop: round(model.x[crop].value, decimals) for crop in crops
}
# Calculate profit
profit = -round(model.obj(), decimals)
# Append the results for this sample
results_data.append({
"Sample": i + 1,
**{crop: f"{crop_results[crop]} {units}" for crop in crops}, # Include units
"Expected Profit": f"{profit} Dollars"
})
results_data2.append({
"Sample": i + 1,
"Wheat": crop_results["Wheat"],
"Corn": crop_results["Corn"],
"SugarBeets": crop_results["SugarBeets"],
"Expected Profit": profit
})
# Convert to a DataFrame
return pd.DataFrame(results_data),results_data2
# Example usage
N = 400
N_samples = 10
crops = ["Wheat", "Corn", "SugarBeets"]
results_df, results_data = solve_and_collect_results(N, N_samples, crops, solver_name="ipopt", units="acres", decimals=2)
# Display the results
print(results_df)
Sample Wheat Corn SugarBeets Expected Profit
0 1 134.58 acres 85.14 acres 280.28 acres 111191.8 Dollars
1 2 135.2 acres 84.62 acres 280.18 acres 110561.38 Dollars
2 3 133.42 acres 85.4 acres 281.18 acres 110005.75 Dollars
3 4 135.52 acres 85.86 acres 278.62 acres 110623.41 Dollars
4 5 137.89 acres 84.59 acres 277.52 acres 113029.91 Dollars
5 6 135.35 acres 84.91 acres 279.74 acres 111859.47 Dollars
6 7 131.91 acres 84.57 acres 283.52 acres 110146.44 Dollars
7 8 136.85 acres 85.52 acres 277.63 acres 111394.62 Dollars
8 9 139.89 acres 84.92 acres 275.19 acres 113085.93 Dollars
9 10 134.66 acres 85.18 acres 280.16 acres 110039.1 Dollars
To better understand and analyze this variability, we rely on statistical measures. These measures allow us to summarize the outcomes of multiple samples and quantify key characteristics such as:
The sample mean measures the central tendency of the dataset. Given a set of results for expected profits \(X = \left\{ x_1, x_2, ... , x_n \right\}\), the sample mean is defined as: \(\overline{x}= \frac{1}{n} \sum_{i=1}^n x_i\).
The sample variance quantifies the spread of the data around the mean: \(S^2 = \frac{1}{n-1} \sum_{i=1}^n (x_i - \overline{x})^2\). (The term n-1 is used in the denominator to account for Bessel’s correction, which corrects bias in the estimation of the population variance.)
The sample standard deviation is the square root of the variance: \(S=\sqrt{S^2}\).
To estimate the true mean of the population from the sample mean, we construct a confidence interval. The formula for a two-tailed confidence interval is:

$$\overline{x} \pm t_{\alpha/2,\,n-1}\frac{S}{\sqrt{n}},$$

where \(t_{\alpha/2,n-1}\) is the t-critical value for the given confidence level and degrees of freedom \(n-1\).
The margin of error is \(ME = t_{\alpha/2,\,n-1}\,S/\sqrt{n}\), and the confidence interval is then \(\left[\overline{x} - ME,\ \overline{x} + ME\right]\).
These statistical tools provide insight into the stability and reliability of the SAA method and allow us to draw meaningful conclusions based on the results.
from statistics import NormalDist

def t_critical_approx(alpha, df):
    '''
    Approximate the two-tailed t-critical value for a given alpha and degrees of freedom.
    Parameters:
    - alpha: Significance level (1 - confidence level).
    - df: Degrees of freedom.
    Returns:
    - The approximate t-critical value.
    '''
    # Start from the standard normal quantile and apply a first-order expansion
    # of the t quantile in 1/df; adequate for moderate degrees of freedom.
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return z + (z**3 + z) / (4 * df)
# Sample Mean
expected_profits = [d["Expected Profit"] for d in results_data]
sample_mean = sum(expected_profits) / len(expected_profits)
print(f"Sample Mean: {sample_mean}")
# Sample Variance & Standard Deviation
sample_variance = sum((x - sample_mean) ** 2 for x in expected_profits) / (len(expected_profits) - 1)
sample_standard_deviation = sample_variance ** 0.5
print(f"Sample Variance: {sample_variance}")
print(f"Sample Standard Deviation: {sample_standard_deviation}")
# Confidence Interval
confidence_level = 0.95
alpha = 1 - confidence_level
n = len(expected_profits)
t_critical = t_critical_approx(alpha, n-1) # t-critical value for 2-tailed test
margin_of_error = t_critical * (sample_standard_deviation / n**0.5)
lower_bound = sample_mean - margin_of_error
upper_bound = sample_mean + margin_of_error
print(f"Confidence Interval ({confidence_level * 100}%): [{lower_bound}, {upper_bound}]")
Sample Mean: 111193.781
Sample Variance: 1333551.417476662
Sample Standard Deviation: 1154.7949677222628
Confidence Interval (95.0%): [110804.75248349282, 111582.80951650719]
Conclusions#
In the end, we were able to:
Formulate and solve optimization problems using real-world examples that incorporate uncertainty and stochasticity.
Develop proficiency in Python and Pyomo for modeling and solving complex optimization problems.
Understand and implement full-grid and sparse-grid discretization in stochastic optimization frameworks, including their respective advantages and applications.
Gain a deeper understanding of two-stage stochastic optimization, focusing on its application to agricultural systems under uncertainty.
Interpret and analyze optimization results, drawing meaningful conclusions about resource allocation and decision-making in uncertain environments.
A major takeaway from this work is the recognition that alternative numerical methods, such as the sparse-grid approach, can effectively address the “curse of dimensionality” associated with full-grid methods. By accounting for the reality that perfect information is rarely available, we demonstrated how modeling uncertainty using Gaussian distributions leads to more realistic scenarios. Finally, this project highlighted the significance of two-stage stochastic optimization in tackling real-world problems, emphasizing its role in enabling robust and informed decision-making under uncertainty.
References#
Birge, J. R., & Louveaux, F. (2011). Introduction to Stochastic Programming (2nd ed.). Springer.
Shapiro, A., Dentcheva, D., & Ruszczyński, A. (2014). Lectures on Stochastic Programming: Modeling and Theory (2nd ed.). SIAM.
Van der Laan, M. J., & Rubin, D. B. (2006). Statistical Models and Causal Inference: A Dialogue with the Social Sciences. Springer.
Chow, Y., & Pavone, M. (2013). A Uniform-Grid Discretization Algorithm for Stochastic Optimal Control with Risk Constraints. Proceedings of the IEEE Conference on Decision and Control.
Chen, M., Mehrotra, S., & Papp, D. (2015). Scenario Generation for Stochastic Optimization Problems via the Sparse Grid Method. Computational Optimization and Applications, 62, 669–692.
Kleywegt, A. J., Shapiro, A., & Homem-de-Mello, T. (2002). The Sample Average Approximation Method for Stochastic Discrete Optimization. SIAM Journal on Optimization, 12(2), 479–502.