Methods |
Corresponding author: Xiaodong Duan ( xd@agro.au.dk ) Academic editor: Francesco Nazzi
© 2024 Xiaodong Duan, Christopher J. Topping.
This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Citation:
Duan X, Topping CJ (2024) A general subpopulation model for the Animal Landscape and Man Simulation System (ALMaSS). Food and Ecological Systems Modelling Journal 5: e122467. https://doi.org/10.3897/fmj.5.122467
|
ALMaSS (the Animal Landscape and Man Simulation System) is a well-established modelling system known for its detailed agent-based animal models. However, for certain species, such as aphids, extremely high population sizes and limited detailed individual behaviour information make agent-based approaches either computationally expensive or unfeasible. Additionally, when these species are modelled as a prey base for other species, such as aphids for ladybirds, an efficient and accurate modelling approach is required. To address these challenges, we introduce a subpopulation-based model framework that effectively handles high population densities, while reducing computational complexity. This framework divides the landscape into a non-overlapping grid and population dynamics are managed within each grid using a flexible stage-structured population model. Grids are interconnected through short- and long-distance dispersal mechanisms, allowing for realistic movement and interactions. Parallel programming is employed to enhance performance, enabling faster simulation runs. We demonstrate the application of this model framework with a hypothetical species (not a species in the real world) and showcase its use in modelling aphid populations. This foundational subpopulation model could be used to model species with large population sizes within the ALMaSS framework and can be integrated into other simulation systems.
ALMaSS, dispersal, landscape simulation, mortality, reproduction, subpopulation modelling
The Animal Landscape and Man Simulation System (ALMaSS) (
Until now, all ALMaSS species models have been developed as agent-based models (
The key design objectives for the new approach were as follows:
The proposed subpopulation model integrates three types of model: stage-structured population models, spatially-structured population models and landscape simulation. Previous models typically use one of these approaches. The Leslie Matrix is one of the best-known stage-structured population models (
Our approach adapts a standard stage-structured population model to create a spatially dynamic model interacting with ALMaSS landscape models. We believe the combination created here to be unique because of the integration with the detailed landscape modelling. The stage-structured population model by
This paper describes and documents the foundational ALMaSS subpopulation model framework, specifically designed to model species with large population sizes. It is important to note that the collection of biological information and parameterisation for specific species are beyond the scope of this paper. Instead, we demonstrate the model’s design objectives and general applicability through a hypothetical generic species and provide examples of aphid models developed using this subpopulation model framework.
The implementation of the general subpopulation model within ALMaSS is achieved through two C++ classes, “Subpopulation” and “Subpopulation_Population_Manager”. The “Subpopulation” class represents a single subpopulation within a grid cell and an object of this class is used to simulate the stage-structured subpopulation in each grid cell (with a default size of 10 m × 10 m). The “Subpopulation_Population_Manager” class is an administrative class that manages the entire population across the simulated landscape, which comprises all “Subpopulation” objects and provides input/output functions for the subpopulation model.
These two classes contain all the necessary code to implement the subpopulation model framework, which can be based on modelling specific species in the real world, such as aphids (
This paper’s description of the general subpopulation model assumes a hypothetical generic species (not based on any real-world species, purely for illustration purposes) with five life stages: egg, larva, pupa, winged adult and wingless adult. Adults are only females and they are all mated and are capable of laying eggs. The winged adults are the only life stage capable of long-distance dispersal through flight, which they can undertake once in their lifetime. Since their flight is influenced by wind direction and speed, it is considered partly passive transportation. After flying, they become wingless adults. Local movement occurs by shifting a certain proportion of the subpopulation to one of the adjacent cells surrounding the current location. Thus, the distance they can cover in a single short-distance dispersal event depends on the size or dimensions of these cells. Eggs, larvae and pupae are immobile within their grid cells. When applying the general subpopulation model to a specific species, biological details such as life stages (as shown in Table
Model parameters with values used in the hypothetic generic species. TOC is an enumeration of the types of crops present in ALMaSS. TOLE is an enumeration listing the types of landscape elements (habitats) in ALMaSS. Please note that these values do not contain any real biological information since the hypothetic generic species is not a species in the real world. This is only for illustration purposes.
Name | Unit | Value | Symbol | Note |
---|---|---|---|---|
The number of life stages | – | 5 | Nl | egg (0), larva (1), pupa (2), winged adult (3), wingless adult (4) |
The maximum lifetime | Calendar day | 200 | Nc | The maximum alive days amongst all the life stages |
Maximum degree days | Degree days | 80, 90, 100, 213, 213 | DiM | The maximum degree days for each life stage are denoted by i. |
Base development temperature | °C | 10, 5.5, 5, 5, 5 | Tim | The lower temperature that the model species can accumulate day degrees for each life stage is denoted by v i. |
The size of the grid | metre | 10 × 10 | – | Width and height |
The flying array | 3 | – | Life stages that can fly | |
The landing array | 4 | – | Life stages after the flying ones landed | |
Winter host list (habitat) | – | – | – | TOLE, is not used in the generic species |
Winter host list (crop) | – | – | – | TOC, is not used in the generic species |
Summer host list (habitat) | – | tole_RoadsideVerge tole_BeetleBank tole_RoadsideSlope tole_Hedges tole_FieldBoundary tole_NaturalGrassDry tole_WoodlandMargin tole_Vildtager tole_NaturalFarmGrass tole_OPermPasture tole_OPermPastureLowYield tole_NaturalGrassDry tole_PermPastureLowYield | – | TOLE |
Summer host list (crop) | – | Spring Barley | – | TOC |
This class represents the entire population across the simulated landscape as a collection of subpopulations. For simplicity, the “Subpopulation_Population_Manager” class will be referred to as the manager class throughout the rest of this paper. Subpopulations are defined by applying a regular grid (default size of 10 m × 10 m) over the simulated landscape. For each grid cell, an object of the “Subpopulation” class is created and managed daily by the manager class. For example, a 10 km × 10 km landscape with a 10 m × 10 m grid size would result in 1 million “Subpopulation” objects. Fig.
When an instance of the manager class is created, initialisation is triggered. During this process, the manager loads all parameters and host information (see Table
The first part of the main simulation loop is to call the DoFirst function of the manager class. Here, the development season is updated first. Some species, for example, black bean aphids, switch plant hosts on which they have different life cycles. Based on this host switch, the development season is defined. If the model species do not have this behaviour, the development season will not change. Afterwards, the common part of the mortality rate depending on the environmental factors (e.g. temperature), which is identical for the whole simulated landscape at each time step, is also calculated. The flying animals will be distributed using the landing mask, based on the wind speed and wind direction for that day in this DoFirst function as well.
After the preparatory steps in the DoFirst function, a sequence of three loops for all the “Subpopulation” objects is initiated to call functions from the “Subpopulation” object. The first loop calls the BeginStep function for all “Subpopulation” objects. Although this function is typically empty in the base class, it can be overridden in a derived model class if needed. The purpose of BeginStep in a derived class is to prepare the “Subpopulation” objects for the next loop, where their Step function is called.
The second loop runs immediately after the BeginStep loop, iterating over all “Subpopulation” objects to execute their Step function. The Step function encompasses behaviour related to population density, habitat suitability, individual development, reproduction, mortality and dispersal. Once the Step function has been executed for all “Subpopulation” objects, the third loop calls the EndStep function for each one to conclude the day’s activities. Like BeginStep, the EndStep function is empty in the general subpopulation class, but can be overridden in a derived class if specific actions are required at the end of the time step for the real model species.
These three loops are parallelised using OpenMP (
After cycling through all “Subpopulation” objects, common calculations shared by all the “Subpopulation” objects are executed within the DoLast function. These calculations yield landscape-wide results used by all subpopulations. The doDevelopment function calculates the accumulated day degrees, which will be detailed in “Development calculation (doDevelopment function)”. The doSpeciesLastThing function, while empty in this general subpopulation model, offers a placeholder for potential species-specific overrides. To advance simulation time, addNewDay increments to the day counter. Finally, updateWholePopulation and subpopuBaseOutputProbe provide population recording, the former aggregating totals and the latter recording daily numbers.
Alongside core calculations, the manager provides four additional functions for “Subpopulation” objects. These are the next life stage calculation, reproduction calculation and functions linked to the development calculation (supplyOldEnoughIndex and supplyAgeInDay) (see Fig.
The first operation in the initialisation loads model parameters related to species development and host plants, as illustrated with the hypothetical generic species in Table
During initialisation, life stages capable of flight are added to a vector, with corresponding stages for post-landing stored in another vector. This accounts for species that may shed their wings after landing. If the flying and landing stages are identical, the flying ability is retained until death or until a single flight has occurred.
Additionally, a “Subpopulation” object is created for each grid cell. A suitability variable within each “Subpopulation” object is initialised, based on the habitat or crop type at the grid’s centre. If this type matches any in the host list, the suitability is set to 1; otherwise, it is set to 0. Upon creating the “Subpopulation” object, a predefined initial population of the model species at a specific life stage is added, provided the suitability value is greater than 0. In this hypothetical generic species, 1,000 eggs are introduced. This process is carried out in the initialisePopulation function.
This function controls the development season, winter or summer development. When the model species does not switch hosts, this function returns summer development as default, which means only summer (yearly) hosts are used. In this general subpopulation model, we assume there is only one development season. If there is more than one in the derived class, for example, with winter hosts, this function requires an override in the derived class for the model species.
Since the environmental factors, for example, average temperature, wind speed and direction, are identical for the whole simulated landscape, the base mortality rate, only depending on environmental factors, is calculated and stored in the manager class. When the “Subpopulation” object needs this, it does not need to be recalculated.
In this general subpopulation model, we assume a base mortality rate of 0.01 for all life stages of the generic species. This function needs to be overridden in the derived class and will typically be linked to weather or other factors controlling a general ‘background’ mortality for the model species.
This function is used to pre-calculate the landing mask for the model species doing the long-distance dispersal for the sake of computing efficiency.
We assume the long-distance flying dispersal is influenced by the wind, but this can only happen when the wind speed is lower than a certain value. Long-distance flying is done using 2D landing masks, which are generated by the landing proportion 1D curve along the wind direction. Below are the key parameters with the default values in the parenthesis for the long-distance flying dispersal.
Two polynomial functions are used to generate a long tail landing curve along the wind direction, as shown in equation (1):
(1)
where l is the distance from the landing position to the original position. In this way, the position at dp will have the largest landing proportion compared to other positions. The dispersers can land at a maximum distance of dl, beyond which no one can land (p (l) = 0). The landing curves calculated using the default parameter values can be found in Fig.
After obtaining the landing curve along the wind direction, we need to convert it to a 2D mask in order to land the dispersers with wings on a 2D landscape. Firstly, we rotate the landing curve 360 degrees through the original location. If we suppose that the original location is at (0,0), then the distance from each location denoted by (x,y) in a 2D landscape to the original location can be calculated using equation (2):
(2)
After equation (2), we can obtain the initial symmetric 2D landing mask to the source location using equation (3):
(3)
Next, we skew the landing mask, based on wind direction. We want more individuals to land downwind and none to land directly upwind. To achieve this, we use the cosine of the angle between the dispersal path (origin to destination) and the wind direction as a weight.
We suppose the wind direction to the source location is given by (b, −a) and the direction from the destination location to the source location is given by (y, −x). Then, the weight used to skew the initial landing mask can be calculated using equation (4):
(4)
where m is an integer to control the landing range; when m is larger, more individuals will land along the wind direction. Afterwards, the 2D mask with the wind direction of (b, −a) can be calculated by equation (5) as exmapled in Fig.
2D landing mask examples with the colour bar for the landing proportions. The winged adult flies from the original (0, 0). The wind direction is along the y-axis (from bottom to top). The wind adults will not land at the locations where the flying direction is opposite to the wind direction. Further, when the wind speed is increasing, they can reach further locations along the wind direction. m in equation (4) is set to be 3 in these examples.
Four wind directions (N, S, W, E) are used in the proposed model. Each wind speed range has one landing mask for every wind direction. The landing masks are normalised so their sum equals 1, ensuring appropriate distribution during landing calculations. One example of the 2D mask with a wind direction for five wind speed ranges can be found in Fig.
(5)
All the “Subpopulation” objects are looped three times to run their BeginStep, Step and EndStep functions, respectively, which are done in parallel within each containing loop using OpenMP. The details of the Step function can be found in “The Subpopulation Class”, while the BeginStep and EndStep are empty and are placeholders for modelling real species.
Since the whole simulated landscape window has the same environmental temperature, the degree-days used for the physiological age of the model species are calculated in the manager class at every time step, i.e. not individually in the “Subpopulation” class. Each life stage has its base development temperature (Tim, where i is the life stage number). For each simulated day, the degree days (DDi) are calculated using equation (6), based on the daily average temperature (Ta). When Ta is smaller than Tim, no degree days will be accumulated (i.e. DDi = 0 when Ta ≤ Tim):
DDi = Ta − Tim (6)
The model species accumulate day degrees by the creation of a 2D array, D, of which the first dimension is the index of the life stage (i = 0,…,Nl − 1), while the second is the calendar day counter (j = 0,…,Nc − 1). Each element Dij is the accumulated day degrees by adding the value calculated using equation (6) for life stage i which are born or turned to that life stage at calendar day represented by j. All the elements of D are initialised with 0 from the beginning of the simulations.
In addition to this 2D array, each life stage has two variables for the oldest (oi) and the newest index (bi) in the second dimension of D. The oldest position oi is where the oldest extant population is placed in D for life stage i. The newest position bi is where the newborn/newly-developed population should be added for life stage i. From the beginning, both the oldest and newest indices start as 0. At the end of each day:
(7)
There are two functions linked to development calculation: 1) supplyOldEnoughIndex function returns the ready-to-next-life-stage array, which will be used by the “Subpopulation” class; 2) supplyAgeInDay function returns the age in calendar days for the given life stage and index in the second dimension of D. To calculate the age of the population, the accumulated degree days at Dij in calendar day (), the equation (8) is used:
(8)
This function calculates whether the model species at one life stage has accumulated enough degree days to transition to the next life stage. This function needs to be overridden in the derived class to accurately describe the species’ biological information to be modelled. The general “Subpopulation” class example used here has five life stages, namely, egg (0), larva (1), pupa (2), winged adult (3) and unwinged adult (4). The number in the brackets represents the life stage, while −1 means death. If the current life stage is lc, the next life stage (ln) can be calculated using equation (9). The winged and wingless proportion is controlled by the population density, as described in “Calculation of population density (calPopuDensity function)”:
(9)
This function calculates the number of offspring and the life stage number of the offspring for the given life stage. It is called by the reproduction function in the “Subpopulation” class. In this base class, only the adults can give birth to one egg each day. An override is always required for a derived class — for example, species like aphids that can lay eggs or give birth to nymphs.
An instance of the “Subpopulation” class is created within each grid cell to track the population dynamics specific to that cell. The primary function in the “Subpopulation” class is the Step function, which is called by the manager class during its loop for all the “Subpopulation” objects. The flowchart for the Step function is illustrated in Fig.
The calPopuDensity function calculates the population density for the grid cell. This function must be overridden in the derived class. In the base “Subpopulation” class, population density is calculated using the suitable host green biomass available in the cell. If this density exceeds a user-defined threshold, winged adults are developed, prompting long-distance dispersal.
The calSuitability function recalculates host suitability daily to account for crop rotations in arable areas. If a crop in the cell is not on the host list, suitability is set to 0 and the doMortality function subsequently removes the subpopulation from the grid. Suitability is reset to 1 once a crop from the host list is present again.
Mortality rates for each life stage are managed through user-defined parameters or calculation functions. These require an override of the function of calCellRelatedMortalityWeight in the derived classes. The calCellRelatedMortalityWeight function is used to calculate the marginal mortality related to the grid, which is empty in the base class. In the doMortality function, a certain number of individuals in each grid at each life stage will die according to their mortality rate, using the grid cell’s suitability value as weighting. When the suitability value is 0, all the populations in the grid will be killed by this function. The doMortality function does NOT need an override in derived classes.
The local movement (short-distance movement compared to the long-distance movement in the manager) is implemented here. Short-distance movement occurs independently of long-distance flight. Further, we assume that only a certain proportion of the individuals can move and they can only move to one of the eight adjacent cells, which is chosen randomly. After the target cell is chosen, they only move to it if the destination’s suitability (see “Calculation of population density (calPopuDensity function)”) is greater than zero and the destination’s population density (see “Calculation of population density (calPopuDensity function)”) is lower than the departure cell. This movement is typically used for adults, but can applied to any life stage if they are capable of moving from one place to another.
The development of the subpopulation is implemented, based on the development array in the manager class (see “Development calculation (doDevelopment function)” in the manager). In the “Subpopulation” class, a 2-D array P is used to track the subpopulation of the model species in each life stage at different ages in a grid cell. The first dimension of P is to index the life stage. The second dimension of P is to index the calendar days. Importantly, the second index does not represent absolute age. Instead, it groups animals together, based on the day they were born or transitioned to their current life stage, i.e. pij is the number of modelled species at life stage i and having the age indexed by j. To obtain the age in calendar days for these pij modelled species, supplyAgeInDay function from the manager class is used. Further, all these pij modelled species have accumulated the same Dij degree days which is stored in the manager class.
At each time step, the “Subpopulation” class calls the supplyNextLifeStage function in the manager for every life stage to obtain the index (j) pointing to the animals that are ready to move to the next life stage. The animal number represented by pij, indexed by j for life stage i, will be added to the position indexed by the newest position variable from the manager class for the next life stage, effectively moving individuals from the last index j at life stage i to the newest index in the next life stage. If the next life stage index is −1, the population will be removed from the array of P since −1 indicates death.
The doReproduction function handles reproduction for eligible life stages, with the calOffspringStage function from the manager class returning the number and life stage of offspring. In the base subpopulation class, it is assumed for simplicity that only adults lay one egg per day. The doReproduction function must be overridden in derived classes to customise reproductive behaviour for real model species, based on its our own biological information.
The proposed model framework was implemented to seamlessly interact with the existing ALMaSS landscape model. To illustrate its application, we conducted a five-year simulation on a 10 km × 10 km Danish landscape using the hypothetical generic species parameterisation, as shown in Table
Fig.
Number of Threads | 1 | 5 | 10 | 20 |
---|---|---|---|---|
Running time in second | 1472 | 748 | 638 | 615 |
Relative to running time with one thread | 1 | 0.508 | 0.433 | 0.417 |
Based on the proposed subpopulation model, four aphid species (Aphis fabae, Acyrthosiphon pisum, Myzus persicae and Sitbion avenae) were implemented within the ALMaSS simulation environment. Please refer to
Subpopulation-based aphid models with a ladybird agent-based model as predators, simulated in a 3 km by 3 km Danish landscape over various time intervals. Shading represents aphid density: black indicates no aphids, while brighter shades signify higher densities. Coloured dots represent ladybirds at different life stages.
In this paper, we introduced a general subpopulation model framework designed to handle species with large populations. This framework operates by tracking the population development of species across a simulated landscape, recording the number of individuals at various life stages and ages within each grid cell. By aggregating these local subpopulations, the model can represent the entire population across the landscape, providing a comprehensive view of population dynamics.
Our model differs from traditional agent-based models by operating on subpopulations belonging to different non-overlapping grid cells in the landscape rather than individuals, using the number of individuals at specific life stages and at different ages as the minimal unit. While this approach offers less granularity than modelling each animal as an individual agent, it still achieves a fine resolution by allowing stage-specific age distinctions (e.g. all 12-day-old larvae). This balance enables us to capture detailed population dynamics without the computational overhead of full agent-based modelling, making it particularly suited for large-scale simulations.
By integrating with existing ALMaSS landscape models, our subpopulation framework can incorporate dynamic environmental factors, such as vegetation growth and agricultural practices. This connection allows the model species to interact realistically with their landscape, responding to factors like field management decisions (e.g. insecticide treatments) and seasonal vegetation changes. As demonstrated in Fig.
Our model links subpopulations in different cells to capture spatial population structure through dispersal mechanisms rather than relying on traditional metapopulation or patch dynamic models. Dispersal mechanisms include long-distance flights and local movement between grids, facilitating accurate spatial dynamics across the landscape. The landscape model provides a dynamic context, affecting subpopulation behaviour through factors like wind speed and direction. This approach allows for the simulation of complex dispersal scenarios, where subpopulations can respond adaptively to changing environmental conditions.
Our multi-threaded approach enables efficient simulations, even for complex scenarios. For instance, we successfully ran a 30-year simulation involving four aphid species and one ladybird species running simultaneously across a 10 km-by-10 km landscape in just 20 hours using 30 threads. This performance demonstrates the model’s capability to handle extensive simulations, while maintaining computational efficiency.
While the model’s conceptual structure is relatively straightforward, its implementation requires a solid understanding of the ALMaSS system and its programming environment, reflecting a trade-off between complexity and flexibility. Despite this, the framework is sufficiently generic to model a wide range of species, making it a versatile tool for future ecological modelling. Provided as a set of C++ classes, this model can be directly integrated into the ALMaSS software environment or adapted to other modelling systems capable of supplying the necessary data. This flexibility makes it a valuable resource for modellers seeking to explore large-scale population dynamics in complex landscapes.
This paper presents a novel base model designed to serve as a foundational framework for modelling species with large populations. This subpopulation model provides a flexible tool for developing species-specific simulations by focusing on population dynamics within detailed simulated landscapes. Our approach enables a more accurate understanding of types of population behaviour without requiring individual-level modelling by integrating elements from stage-structured and spatially structured population models with landscape simulation. Unlike previous models that often rely on a single approach, this framework leverages the strengths of multiple methodologies. It combines detailed landscape modelling with nuanced subpopulation dynamics by segmenting landscapes into non-overlapping grids, allowing for the simulation of large-scale systems involving billions of individuals. Already successfully applied to simulate aphid populations, this model serves as a robust foundation for building new, species-specific models within ALMaSS or other simulation projects. These models can be tailored to meet the unique requirements of different species.
We would like to thank Sara Nafisi for her assistance with the figures in this paper.
The authors have declared that no competing interests exist.
No ethical statement was reported.
This work was supported by DeiC National HPC (UCloud, GenomeDK, LUMI) (g.a. DeiC-AU-N1-000025, DeiC-AU-N2-2023015 and DeiC-AU-L5-0011) and by the EcoStack project, funded by the European Union’s Horizon 2020 Research and Innovation Programme under Grant Agreement no. 773554
Both authors have contributed equally.
Xiaodong Duan https://orcid.org/0000-0003-2345-4155
Christopher J. Topping https://orcid.org/0000-0003-0874-7603
All of the data that support the findings of this study are available in the main text.
The source code and the running directory are publicly available at:
https://gitlab.com/ALMaSS/almass_methodology.git
The code documentation for the subpopulation model is publicly available at:
https://almass.gitlab.io/almass_methodology/Subpopulation.pdf