Simplified energy system model

This tutorial illustrates, step by step, how to build a simple energy system optimization model with CVXlab. The tutorial mirrors the workflow described in model generation from scratch, so that the transition from conceptual design to numerical solution stays visible throughout the documentation.

This tutorial is the right place to start if you are new to CVXlab.

Problem statement

Let us consider the following energy system planning problem, applied to a generic region. The goal is to define the least-cost energy production plan over a defined time horizon, considering the following assumptions:

  • The energy demand is assumed to be known in advance over the whole time horizon (i.e., perfect foresight).

  • The energy can be supplied by a number of available technologies, each characterized by known values for:

    • installed capacities (MW, variable over time);

    • specific production costs (€/MWh, constant);

    • availabilities (i.e., factors that convert installed capacity in MW into energy supplied in MWh, assumed constant).

Step 1. Conceptual model definition

Related user guide step: Conceptual model definition

Defining Sets

Sets defined for the model are summarized in the table below.

Sets defining the model’s domain

| Set name | Symbol | Coordinates | Cardinality | Set type |
| --- | --- | --- | --- | --- |
| Technologies | \(t\) | Solar, Gas, Nuclear | 3 | Dimension |
| Time periods | \(y\) | 2025, 2026, 2027, 2028, 2029, 2030 | 6 | Dimension |
| Demand scenarios | \(d\) | Low_demand, High_demand | 2 | Inter-problem |

Notice that:

  • Inter-problem sets (\(d\)) define multiple problem instances: one optimization problem is generated and solved for each demand scenario (in this case, \(2\) problem instances).

  • Dimension sets (\(t\), \(y\)) define the scope of data tables and the shapes of the related variables.

  • Coordinates of each set can be associated with filters to define sub-domains. For example, the technologies set may classify technologies as renewable and non-renewable, so that variables can be defined over sub-domains that include only specific categories. In this simplified example, all variables are defined over full domains (no filtering is applied).
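The role of inter-problem sets can be sketched in plain Python. This is only an illustration of the counting logic, not CVXlab code: the dictionary layout and the variable names are assumptions, while the coordinates come from the table above.

```python
from itertools import product

# Set coordinates taken from the sets table of this tutorial.
sets = {
    "Technologies": ["Solar", "Gas", "Nuclear"],
    "Time_periods": [2025, 2026, 2027, 2028, 2029, 2030],
    "Demand_scenarios": ["Low_demand", "High_demand"],
}

# Only the demand scenarios set is inter-problem in this tutorial.
inter_problem_sets = ["Demand_scenarios"]

# One problem instance per combination of inter-problem coordinates.
problem_instances = list(product(*(sets[name] for name in inter_problem_sets)))
print(problem_instances)  # [('Low_demand',), ('High_demand',)]
```

With a second inter-problem set, the same Cartesian product would return one tuple per coordinate combination.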

Defining Data Tables and related Variables

The following table summarizes the Data Tables of the energy system model; the associated variables are listed further below.

Data tables properties

| Type | Name | Domain [Cardinality] | Description |
| --- | --- | --- | --- |
| Exogenous | \(cost(t)\) | \(t - [3]\) | Specific generation costs by technology (in €/MWh). |
| Exogenous | \(capacity(t,y)\) | \(t \times y - [3 \times 6 = 18]\) | Installed capacity by technology and time period (in MW). |
| Exogenous | \(availability(t)\) | \(t - [3]\) | Availability factors by technology (in MWh/MW). |
| Exogenous | \(demand(d,y)\) | \(d \times y - [2 \times 6 = 12]\) | Energy demand by demand scenario and time period (in MWh). |
| Endogenous | \(supply(d,y,t)\) | \(d \times y \times t - [2 \times 6 \times 3 = 36]\) | Energy supply by demand scenario, time period and technology (in MWh). |
| Constant | \(constant(t)\) | \(t - [3]\) | Model constants defined based on the shape of the \(t\) set. |

Regarding data tables above:

  • The endogenous data table has a domain defined over all model sets, while the exogenous data tables are defined over specific sets.

  • For each data table, the cardinality (i.e., the total number of data entries) is reported, calculated as the product of the cardinalities of all sets in its domain. For example, the availability(t) data table includes only 3 entries, one for each technology, because its domain is defined over the technologies set only.
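The cardinality rule can be checked with a few lines of plain Python. The sketch below is not part of CVXlab: it simply multiplies the set cardinalities of each domain, using the symbols \(t\), \(y\) and \(d\) as dictionary keys.

```python
from math import prod

# Cardinality of each set of the tutorial model.
cardinality = {"t": 3, "y": 6, "d": 2}

# Domain of each data table, as listed in this section.
domains = {
    "cost": ["t"],
    "capacity": ["t", "y"],
    "availability": ["t"],
    "demand": ["d", "y"],
    "supply": ["d", "y", "t"],
    "constant": ["t"],
}

# Number of entries = product of the cardinalities of the sets in the domain.
entries = {name: prod(cardinality[s] for s in dom) for name, dom in domains.items()}
print(entries)
# {'cost': 3, 'capacity': 18, 'availability': 3, 'demand': 12, 'supply': 36, 'constant': 3}
```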

Variables properties

| Related data table | Variable name | Shape (rows, columns) | Intra-problem sets | Inter-problem sets |
| --- | --- | --- | --- | --- |
| \(cost(t)\) | \(c\) | \(1, t - [1, 3]\) | \(-\) | \(-\) |
| \(capacity(t,y)\) | \(cap\) | \(1, t - [1, 3]\) | \(y - [6]\) | \(-\) |
| \(availability(t)\) | \(av\) | \(1, t - [1, 3]\) | \(-\) | \(-\) |
| \(demand(d,y)\) | \(E_d\) | \(1, 1 - [1, 1]\) | \(y - [6]\) | \(d - [2]\) |
| \(supply(d,y,t)\) | \(E_s\) | \(1, t - [1, 3]\) | \(y - [6]\) | \(d - [2]\) |
| \(constant(t)\) | \(i_t\) | \(t, 1 - [3, 1]\) | \(-\) | \(-\) |

Regarding variables above:

  • Each variable stems from a related data table and inherits its properties: the domain (defined by sets) and the data type (exogenous, endogenous, constant).

  • Each variable is characterized by a specific allocation of dimension sets into shapes and intra-problem sets. For example, the energy supply variable E_s has 1 row and 3 columns (defined by the technologies set); it is indexed over 6 intra-problem coordinates (defined by the time periods set) and over 2 inter-problem coordinates (defined by the demand scenarios set).

  • Constants can be defined with different built-in or user-defined types (see Constant data types). In the example above, the i_t variable is defined as a summation vector, i.e. a column vector of 1s, useful to perform summations through matrix multiplication.
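The effect of a summation vector can be reproduced with NumPy. This is a minimal sketch outside of CVXlab: the supply values are made up for illustration, and only the shapes follow the variables described above.

```python
import numpy as np

# Hypothetical supply values for one time period (MWh), illustration only.
E_s = np.array([[10.0, 20.0, 30.0]])  # shape (1, 3): one column per technology
i_t = np.ones((3, 1))                 # summation vector: column of 1s, shape (3, 1)

# (1, 3) @ (3, 1) -> (1, 1): the product sums supply over technologies.
total = E_s @ i_t
print(total.shape, total[0, 0])  # (1, 1) 60.0
```

The result has the same \((1,1)\) shape as \(E_d\), which is what makes the demand constraint dimensionally consistent.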

Defining Problem and related Expressions

For the current energy system model, a symbolic problem can be defined as a linear optimization problem as follows.

\[\begin{split}\begin{aligned} \min_{E_s} \quad & c \cdot E_s' & \forall \, y\\ \text{s.t.} \quad & E_s \cdot i_t \geq E_d & \forall \, y \\ & E_s \leq cap \cdot \widehat{av} & \forall \, y \\ & E_s \geq 0 & \forall \, y \end{aligned}\end{split}\]

Notice that:

  • The problem is defined a number of times equal to the cardinality of the inter-problem set. Specifically, one problem instance is defined and solved for each energy demand scenario \(d\). In case of multiple inter-problem sets, the problem is defined for each coordinate combination in the Cartesian product of all inter-problem sets.

  • For each symbolic expression, a number of numerical expressions is generated, equal to the cardinality of the Cartesian product of all intra-problem sets of the related variables. In this case, all expressions are defined over the intra-problem set time periods \(y\), so one numerical expression per time period is generated for each symbolic expression.

  • In case of variables defined over different intra-problem sets, automatic broadcasting is applied: variables not defined over specific intra-problem sets are automatically reused across all generated numerical expressions.

  • In this problem, the dot operator \(\cdot\) represents matrix multiplication, the \(\widehat{(*)}\) is the diagonalization operator, and the \((*)'\) represents the transposition operator (see Symbolic operators for a comprehensive description of built-in symbolic operators).
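As a worked expansion, the symbolic expressions for a single time period \(y\) unfold as follows. The technology names used as subscripts are a notation introduced here only for illustration; the shapes are those listed in the variables table (\(c\), \(cap\), \(av\) and \(E_s\) are \(1 \times 3\), \(i_t\) is \(3 \times 1\)).

\[\begin{split}\begin{aligned} c \cdot E_s' &= c_{Solar} \, E_{s,Solar} + c_{Gas} \, E_{s,Gas} + c_{Nuclear} \, E_{s,Nuclear} \\ E_s \cdot i_t &= E_{s,Solar} + E_{s,Gas} + E_{s,Nuclear} \\ cap \cdot \widehat{av} &= \begin{bmatrix} cap_{Solar} \, av_{Solar} & cap_{Gas} \, av_{Gas} & cap_{Nuclear} \, av_{Nuclear} \end{bmatrix} \end{aligned}\end{split}\]

The first two products are scalars (\(1 \times 1\)), while the third keeps the \(1 \times 3\) shape of \(E_s\), so the capacity constraint applies to each technology separately.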

A note on dimensional formulations:

The allocation of dimension sets to shapes and intra-problem sets offers significant modeling flexibility. The same problem can be formulated in multiple equivalent ways.

Matrix-based formulation (as in the example above):

  • Expressions must be dimensionally consistent, and variable shapes must be compatible for matrix operations. Multiple variables can stem from the same data table, each characterized by a different allocation of dimension sets: this allows for flexible model definitions.

  • Expressions work with matrix operations (multiplication, transposition, …).

  • Compact symbolic representation with fewer expression instances.

  • Potentially more efficient numerical problem generation and solution.

Scalar-based formulation (extreme case):

All dimension sets can be allocated as intra-problem sets, reducing all variables to scalars (shape \((1,1)\)). In this case, for each energy demand scenario \(d\), the problem can be reformulated as:

\[\begin{split}\begin{aligned} \min_{E_s} \quad & \sum_{t} c \cdot E_s \\ \text{s.t.} \quad & \sum_{t} E_s \geq E_d & \forall \, y \\ & E_s \leq cap \cdot av & \forall \, t, y \\ & E_s \geq 0 & \forall \, t, y \end{aligned}\end{split}\]

where all variables become scalars indexed over \(t\), over \(y\), or over both, according to their domains.
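The equivalence of the two formulations can be verified numerically for one time period. The NumPy sketch below is independent of CVXlab, and all numbers in it are hypothetical placeholders chosen only to exercise the expressions.

```python
import numpy as np

# Hypothetical values for a single time period, illustration only.
c = np.array([[40.0, 70.0, 55.0]])                 # EUR/MWh, shape (1, 3)
cap = np.array([[100.0, 200.0, 150.0]])            # MW, shape (1, 3)
av = np.array([[1500.0, 7000.0, 8000.0]])          # MWh/MW, shape (1, 3)
E_s = np.array([[150000.0, 400000.0, 900000.0]])   # MWh, shape (1, 3)

# Matrix-based expressions: transposition and diagonalization.
cost_matrix = (c @ E_s.T)[0, 0]
limit_matrix = cap @ np.diag(av[0])                # shape (1, 3)

# Scalar-based expressions: one term per technology.
cost_scalar = sum(c[0, t] * E_s[0, t] for t in range(3))
limit_scalar = [cap[0, t] * av[0, t] for t in range(3)]

print(np.isclose(cost_matrix, cost_scalar))        # True
print(np.allclose(limit_matrix[0], limit_scalar))  # True
```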

Step 2. Generation of model directory

Related user guide step: Generation of model directory

At this stage the conceptual model is already defined. The next step is to create a model directory that will contain the setup files, the sets workbook, the input-data files, and the SQLite database of the tutorial model.

For this tutorial, a compact Excel-based workflow is convenient because all setup information can be stored in a single workbook.

import cvxlab

cvxlab.create_model_dir(
    model_dir_name="simple_energy_model",
    main_dir_path="path/to/tutorial_workspace",
    settings_file_type="xlsx",
    include_user_defined_templates=False,
)

For the simple energy system model, this step typically creates:

  • A model directory named simple_energy_model.

  • A model_settings.xlsx workbook with sheets for sets, variables, and problems.

  • The directory structure expected by the following tutorial steps.

If you prefer YAML files instead of Excel, the same tutorial structure still applies. Only the format of the setup files changes.

Step 3. Fill the model setup files

Related user guide step: Fill model setup file(s)

In this step, the conceptual structure of the energy system model is translated into CVXlab setup files. The same information can be written either in model_settings.xlsx or in the YAML files generated by cvxlab.create_model_dir().

Sets structure

For the tutorial model, the three sets can be represented as follows in YAML:

Demand_scenarios:
    description: demand levels corresponding to different scenarios
    split_problem: true

Technologies:
    description: technologies available in the system

Time_periods:
    description: time periods considered in the model

The Demand_scenarios set is marked with split_problem: true because the model must be solved independently for each scenario. The other two sets define the internal dimensions of variables.
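The distinction between the two kinds of sets can be sketched by transcribing the YAML above into a plain Python dictionary. This is not how CVXlab reads the file; in particular, treating a missing split_problem key as false is an assumption made for this example.

```python
# Set structure transcribed from the YAML above as a plain dictionary.
structure_sets = {
    "Demand_scenarios": {
        "description": "demand levels corresponding to different scenarios",
        "split_problem": True,
    },
    "Technologies": {"description": "technologies available in the system"},
    "Time_periods": {"description": "time periods considered in the model"},
}

# Assumption: a set without the split_problem key behaves as if it were false.
inter_problem = [
    name for name, props in structure_sets.items() if props.get("split_problem", False)
]
dimension = [name for name in structure_sets if name not in inter_problem]

print(inter_problem)  # ['Demand_scenarios']
print(dimension)      # ['Technologies', 'Time_periods']
```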

Data tables and variables

The structural definition of data tables and variables can be organized as follows:

cost:
    description: specific generation costs by technology (EUR/MWh)
    type: exogenous
    coordinates: [Technologies]
    variables_info:
        c:
            Technologies:
                dim: cols

capacity:
    description: installed capacity by technology and time period (MW)
    type: exogenous
    coordinates: [Technologies, Time_periods]
    variables_info:
        cap:
            Technologies:
                dim: cols
            Time_periods:
                dim: intra

availability:
    description: availability factors by technology (MWh/MW)
    type: exogenous
    coordinates: [Technologies]
    variables_info:
        av:
            Technologies:
                dim: cols

demand:
    description: energy demand by scenario and time period (MWh)
    type: exogenous
    coordinates: [Demand_scenarios, Time_periods]
    variables_info:
        E_d:
            Time_periods:
                dim: intra

supply:
    description: energy supply by scenario, technology, and time period (MWh)
    type: endogenous
    coordinates: [Demand_scenarios, Technologies, Time_periods]
    variables_info:
        E_s:
            Technologies:
                dim: cols
            Time_periods:
                dim: intra

constant:
    description: model constants
    type: constant
    coordinates: [Technologies]
    variables_info:
        i_t:
            value: sum_vector
            Technologies:
                dim: rows

The resulting symbolic variables are summarized below.

Variables used in the tutorial model

| Related data table | Variable | Shape | Intra-problem sets | Inter-problem sets |
| --- | --- | --- | --- | --- |
| \(cost(t)\) | \(c\) | \(1 \times t\) | \(-\) | \(-\) |
| \(capacity(t,y)\) | \(cap\) | \(1 \times t\) | \(y\) | \(-\) |
| \(availability(t)\) | \(av\) | \(1 \times t\) | \(-\) | \(-\) |
| \(demand(d,y)\) | \(E_d\) | \(1 \times 1\) | \(y\) | \(d\) |
| \(supply(d,y,t)\) | \(E_s\) | \(1 \times t\) | \(y\) | \(d\) |
| \(constant(t)\) | \(i_t\) | \(t \times 1\) | \(-\) | \(-\) |
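How the dim entries of the setup files translate into shapes and intra-problem sets can be sketched in plain Python. The describe helper and the flattened dictionary layout are hypothetical and only mimic the rule stated in this tutorial; they are not CVXlab internals.

```python
# Cardinalities and inter-problem sets of the tutorial model.
cardinality = {"Demand_scenarios": 2, "Technologies": 3, "Time_periods": 6}
inter_problem_sets = {"Demand_scenarios"}

# (table coordinates, {set: dim}) per variable, transcribed from the setup files.
variables = {
    "c": (["Technologies"], {"Technologies": "cols"}),
    "cap": (["Technologies", "Time_periods"],
            {"Technologies": "cols", "Time_periods": "intra"}),
    "av": (["Technologies"], {"Technologies": "cols"}),
    "E_d": (["Demand_scenarios", "Time_periods"], {"Time_periods": "intra"}),
    "E_s": (["Demand_scenarios", "Technologies", "Time_periods"],
            {"Technologies": "cols", "Time_periods": "intra"}),
    "i_t": (["Technologies"], {"Technologies": "rows"}),
}


def describe(coordinates, dims):
    """Return (shape, intra-problem sets, inter-problem sets) for one variable."""
    rows = cols = 1
    intra, inter = [], []
    for set_name in coordinates:
        role = dims.get(set_name)
        if role == "rows":
            rows *= cardinality[set_name]
        elif role == "cols":
            cols *= cardinality[set_name]
        elif role == "intra":
            intra.append(set_name)
        elif set_name in inter_problem_sets:
            inter.append(set_name)
    return (rows, cols), intra, inter


summary = {name: describe(*spec) for name, spec in variables.items()}
for name, (shape, intra, inter) in summary.items():
    print(name, shape, intra, inter)
```

Sets that appear in the table coordinates but not in variables_info, such as Demand_scenarios for E_d, end up as inter-problem sets because they are flagged with split_problem.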

Problem definition

The optimization problem can be represented in problem.yml as:

energy_system:
    objective:
        - Minimize(c @ tran(E_s))
    expressions:
        - E_s @ i_t >= E_d
        - E_s <= cap @ diag(av)
        - E_s >= 0

This structure keeps the tutorial aligned with the conceptual formulation introduced in the Conceptual model definition section.

Step 4. Generate the Model instance

Related user guide step: Model class instance generation

Once the setup files are filled, the tutorial model can be loaded into a CVXlab Model instance. This object will then be used for all remaining steps.

Typical initialization

import cvxlab

model = cvxlab.Model(
    model_dir_name="simple_energy_model",
    main_dir_path="path/to/tutorial_workspace",
    model_settings_from="xlsx",
    use_existing_data=False,
)

What to expect

At this point CVXlab validates the structure of:

  • The three sets of the tutorial model.

  • The exogenous, endogenous, and constant data tables.

  • The symbolic variables associated with those tables.

  • The problem definition stored in the setup files.

If validation succeeds, the model directory is ready for the next operational step. In a workflow from scratch, the most important generated artifact is the sets.xlsx file, which is filled in the next step of this tutorial.

Step 5. Fill the sets workbook

Related user guide step: Fill sets data (model coordinates)

After the Model instance is created, CVXlab generates a workbook for the set coordinates. For the simple energy system model, the coordinates are the actual items over which scenarios, technologies, and time periods are defined.

Coordinates of the tutorial model

Sets of the simple energy system model

| Set name | Symbol | Coordinates | Cardinality | Set type |
| --- | --- | --- | --- | --- |
| Technologies | \(t\) | Solar, Gas, Nuclear | 3 | Dimension |
| Time periods | \(y\) | 2025, 2026, 2027, 2028, 2029, 2030 | 6 | Dimension |
| Demand scenarios | \(d\) | Low_demand, High_demand | 2 | Inter-problem |

How these coordinates are used

  • The demand-scenario coordinates create two independent problem instances.

  • The technology coordinates define the columns of the main decision variable.

  • The time-period coordinates define the intra-problem expansion of the expressions.

No filters are required for this first tutorial model, so all variables are defined on full domains.

Step 6. Initialize the data structures

Related user guide step: Initialization of data structures

Once the set coordinates are available, CVXlab can generate the underlying data structures required by the tutorial model.

Typical command

model.initialize_model_environment()

What this prepares for the tutorial

For the simple energy system model, this step:

  • Loads the set coordinates into the model index.

  • Assigns coordinates and dimensions to the tables cost, capacity, availability, demand, supply, and constant.

  • Creates a blank SQLite database with normalized tables.

  • Generates blank input-data file(s) for the exogenous tables that will be filled in the next step.

After this step, the model structure is complete and ready to receive numerical input data.
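To give an idea of what a blank normalized table looks like, the sketch below builds one for capacity with the standard sqlite3 module. The table layout and the column names (including value) are assumptions made for illustration; the schema actually generated by CVXlab may differ.

```python
import sqlite3
from itertools import product

technologies = ["Solar", "Gas", "Nuclear"]
time_periods = [2025, 2026, 2027, 2028, 2029, 2030]

# In-memory database, so that the example leaves no file behind.
conn = sqlite3.connect(":memory:")

# Hypothetical normalized layout: one column per set, one column for the value.
conn.execute(
    "CREATE TABLE capacity (Technologies TEXT, Time_periods INTEGER, value REAL)"
)

# One row per coordinate combination; the value is left empty (NULL).
conn.executemany(
    "INSERT INTO capacity (Technologies, Time_periods) VALUES (?, ?)",
    product(technologies, time_periods),
)

# COUNT(*) counts rows, COUNT(value) counts only the filled entries.
n_rows, n_filled = conn.execute(
    "SELECT COUNT(*), COUNT(value) FROM capacity"
).fetchone()
print(n_rows, n_filled)  # 18 0
```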

Step 7. Fill the exogenous data

Related user guide step: Fill exogenous model data

The blank input-data files generated by CVXlab must now be populated with the exogenous data of the tutorial model. Only exogenous tables are filled by the user at this stage.

Input tables to populate

Exogenous data tables of the tutorial model

| Data table | Domain | Description |
| --- | --- | --- |
| \(cost(t)\) | \(t\) | Specific generation costs by technology in EUR/MWh |
| \(capacity(t,y)\) | \(t \times y\) | Installed capacity by technology and time period in MW |
| \(availability(t)\) | \(t\) | Availability factor by technology in MWh/MW |
| \(demand(d,y)\) | \(d \times y\) | Energy demand by scenario and time period in MWh |

Important distinction

  • cost, capacity, availability, and demand are filled by the user because they are exogenous inputs.

  • supply is not filled manually because it is endogenous and will be solved by the optimizer.

  • constant is not an external input table in the usual sense: it is defined structurally through the setup files and used to build symbolic expressions.

Step 8. Initialize the numerical problem

Related user guide step: Initialization of numerical problem(s)

At this point the symbolic model and the exogenous data are both available, so CVXlab can generate the numerical optimization problem.

Typical command

model.refresh_database_and_initialize_problem()

Expressions generated for the tutorial

The symbolic problem of the simple energy system model is:

\[\begin{split}\begin{aligned} \min_{E_s} \quad & c \cdot E_s' & \forall \, y\\ \text{s.t.} \quad & E_s \cdot i_t \geq E_d & \forall \, y \\ & E_s \leq cap \cdot \widehat{av} & \forall \, y \\ & E_s \geq 0 & \forall \, y \end{aligned}\end{split}\]

During initialization:

  • One numerical problem is built for each demand scenario.

  • One numerical expression instance is generated for each time period.

  • Variables not indexed on Time_periods are broadcast across the generated expression instances.

The result of this step is a CVXPY-ready representation of the energy planning problem for all scenarios of the tutorial model.
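The expansion into problems and expression instances, together with broadcasting, can be mimicked in plain Python. The data layout below is hypothetical and the input values are placeholders; the sketch only shows which variables are sliced by time period and which are reused unchanged.

```python
from itertools import product

scenarios = ["Low_demand", "High_demand"]
years = [2025, 2026, 2027, 2028, 2029, 2030]
technologies = ["Solar", "Gas", "Nuclear"]

# Hypothetical input values, for illustration only.
cost = {"Solar": 40.0, "Gas": 70.0, "Nuclear": 55.0}                 # no Time_periods index
capacity = {(t, y): 100.0 for t, y in product(technologies, years)}  # indexed on Time_periods

problems = {}
for d in scenarios:  # one numerical problem per demand scenario
    problems[d] = [
        {
            "year": y,
            "c": cost,  # broadcast: the same object is reused for every year
            "cap": {t: capacity[t, y] for t in technologies},  # sliced for year y
        }
        for y in years  # one expression instance per time period
    ]

print(len(problems), [len(instances) for instances in problems.values()])  # 2 [6, 6]
```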

Step 9. Solve the numerical problem

Related user guide step: Solution of numerical problem(s)

The energy system tutorial defines a standard convex optimization problem, so the numerical problem can now be solved directly once initialization is complete.

Typical command

model.run_model(
    integrated_problems=False,
    solver="ECOS",
)

What happens in this tutorial

  • The model is solved independently for each demand scenario.

  • For each scenario, the optimizer computes the least-cost feasible value of the endogenous supply variable \(E_s\).

  • The solution covers all technologies and all time periods defined in the sets workbook.

Since each scenario is a single linear optimization problem, no iterative decomposition is required in the basic tutorial workflow.
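For intuition about the solution, this particular problem can also be solved by hand. With a single demand constraint and an upper bound per technology, and assuming non-negative costs, filling demand from the cheapest technology upwards (merit-order dispatch) gives a least-cost plan. The sketch below is a plain-Python stand-in for the solver call, not what CVXlab or the solver does internally; the function name and all input values are hypothetical.

```python
def least_cost_dispatch(cost, limit, demand):
    """Merit-order dispatch for one time period.

    Solves: min sum(cost[t] * x[t])  s.t.  sum(x) >= demand, 0 <= x[t] <= limit[t],
    assuming all costs are non-negative.
    """
    if sum(limit.values()) < demand:
        raise ValueError("infeasible: demand exceeds the total available energy")
    supply = {t: 0.0 for t in cost}
    remaining = demand
    for t in sorted(cost, key=cost.get):  # cheapest technology first
        supply[t] = min(limit[t], remaining)
        remaining -= supply[t]
    return supply


# Hypothetical inputs for a single year, illustration only.
cost = {"Solar": 40.0, "Gas": 70.0, "Nuclear": 55.0}            # EUR/MWh
cap = {"Solar": 100.0, "Gas": 200.0, "Nuclear": 150.0}          # MW
av = {"Solar": 1500.0, "Gas": 7000.0, "Nuclear": 8000.0}        # MWh/MW
limit = {t: cap[t] * av[t] for t in cost}                       # MWh, i.e. cap times av
demand = {"Low_demand": 1_000_000.0, "High_demand": 2_000_000.0}  # MWh

# One independent solve per demand scenario.
results = {d: least_cost_dispatch(cost, limit, demand[d]) for d in demand}
print(results["Low_demand"])   # {'Solar': 150000.0, 'Gas': 0.0, 'Nuclear': 850000.0}
print(results["High_demand"])  # {'Solar': 150000.0, 'Gas': 650000.0, 'Nuclear': 1200000.0}
```

Repeating the call for each time period, with the capacity of that period, would cover the whole horizon, because the constraints of different periods do not interact in this model.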

Step 10. Export the results

Related user guide step: Export endogenous model data

Once the optimization problem has been solved, the endogenous values can be written back to the SQLite database for inspection and reporting.

Typical command

model.load_results_to_database()

Main result of the tutorial

The key exported table is the endogenous supply table:

\[supply(d,y,t)\]

This table stores the optimal energy supplied by each technology, for each time period, and for each demand scenario. After export, the results can be explored through the CVXlab utilities, direct SQLite inspection, or downstream reporting tools.
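Direct SQLite inspection could look like the sketch below, which uses the standard sqlite3 module on an in-memory database. The table layout, the column names and the inserted rows are placeholders invented for this example; the real database is the one stored in the model directory, and its schema may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical layout of the exported supply table.
conn.execute(
    "CREATE TABLE supply ("
    "Demand_scenarios TEXT, Time_periods INTEGER, Technologies TEXT, value REAL)"
)

# Placeholder rows standing in for solver output (MWh), illustration only.
rows = [
    ("Low_demand", 2025, "Solar", 150000.0),
    ("Low_demand", 2025, "Nuclear", 850000.0),
    ("Low_demand", 2025, "Gas", 0.0),
    ("High_demand", 2025, "Solar", 150000.0),
    ("High_demand", 2025, "Nuclear", 1200000.0),
    ("High_demand", 2025, "Gas", 650000.0),
]
conn.executemany("INSERT INTO supply VALUES (?, ?, ?, ?)", rows)

# Total energy supplied per scenario and time period.
totals = conn.execute(
    "SELECT Demand_scenarios, Time_periods, SUM(value) FROM supply "
    "GROUP BY Demand_scenarios, Time_periods ORDER BY Demand_scenarios"
).fetchall()
print(totals)  # [('High_demand', 2025, 2000000.0), ('Low_demand', 2025, 1000000.0)]
```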

Practical example

Let us consider a model with the Set structure defined below. The Set structure can be defined either in the structure_sets.yml file or in the structure_sets tab of the settings.xlsx Excel file; the YAML form is shown here.

Scenarios:
    description: Scenarios analyzed in the model
    split_problem: True

Technologies:
    description: Technologies included in the model
    filters:
        Type: [Supply, Demand, Storage]
        Category: [Renewable, Non-renewable]
    aggregations: [Sectors]

The Scenarios tab of the sets.xlsx file is reported below. The header is generated automatically from the set definition, while the entries are defined by the user.

| Scenarios_Name |
| --- |
| Business As Usual |
| Net Zero emissions |
| Stated Policies |

In the example above, the unused fields in the structure file(s) have been omitted (e.g., copy_from for all Sets).
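The purpose of the filters declared for Technologies can be sketched in plain Python. The rows and the name key below are hypothetical; only the filter keys (Type, Category) and their allowed values come from the YAML above, and the actual column headers generated in sets.xlsx may differ.

```python
# Hypothetical entries of a Technologies tab, with one column per filter.
technologies = [
    {"name": "Solar", "Type": "Supply", "Category": "Renewable"},
    {"name": "Gas", "Type": "Supply", "Category": "Non-renewable"},
    {"name": "Nuclear", "Type": "Supply", "Category": "Non-renewable"},
]


def select(rows, **filters):
    """Return the names of the rows matching every given filter value."""
    return [
        row["name"]
        for row in rows
        if all(row.get(key) == value for key, value in filters.items())
    ]


renewables = select(technologies, Category="Renewable")
print(renewables)  # ['Solar']
```

A variable restricted to such a sub-domain would then span only the selected coordinates instead of the whole set.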