Food and Ecological Systems Modelling Journal : FSKX (Food Safety Knowledge)
FSKX (Food Safety Knowledge)
Quantitative risk assessment of Haemolytic and Uremic Syndrome (HUS) from consumption of raw milk soft cheese
expand article infoSubhasish Basak‡,§, Janushan Christy|, Laurent Guillier, Frederique Audiat-Perrin, Moez Sanaa, Fanny Tenenhaus-Aziza, Julien Bect§, Emmanuel Vazquez§
‡ ANSES, Maison-Alfort, France
§ L2S, CentraleSupelec, Gif-sur-Yvette, France
| ACTALIA, La Roche-sur-Foron, France
¶ CNIEL, Paris, France
Open Access


The aim of this quantitative risk assessment model is to estimate the risk of Haemolytic Uremic Syndrome (HUS) caused by Shiga-toxin producing Escherichia coli (STEC) in raw milk soft cheese and explore intervention strategies to minimise this risk. Building upon previous work from literature, the model considers microbial contamination of raw milk at the farm level, as well as STEC growth and survival during cheese production, ripening and storage, along with intervention strategies in both pre- and post-harvest scenarios. It allows for the assessment of intervention steps at the farm level or during cheese production. Besides estimating the risk of HUS, it also assesses the production losses associated with interventions.


QMRA model, raw milk soft cheese, E. coli, R programming language

Model metadata

This section summarises the metadata of the concerning QMRA model.

General metadata

Source: RISK ASSESSMENTS: Risk assessments models

Identifier: d56222fe-149a-4f86-bf0f-1972bb22ce59

Rights: GNU General Public License 3.0

Availability: Open access

Language: English

Software: R

Language Written In: R 4


Name: Dairy products Cattle

Description: Soft mould ripened cheese

Unit: g

Method: origin Country - France


  • type: Microorganisms; Name: Escherichia coli, pathogenic, VTEC; Description: -; unit: -; Adverse Effect: Haemolytic Uremic Syndrome (HUS);

Data background

Identifier: QRA-STEC

Title: Quantitative risk assessment of Haemolytic and Uremic Syndrome (HUS) from consumption of raw milk soft cheese.


Microbiological food safety is a major challenge for the food sector (Plaza-Rodríguez et al. 2018). In this regard, the microbiological food safety community (i.e. food authorities, food industries and food research institutes) has invested great research efforts in the field of Quantitative Microbial Risk Assessment (QMRA). The framework for carrying out QMRA for food-borne pathogens is well established and relies on four components: hazard identification, hazard characterisation, exposure assessment and risk characterisation (F.A.O. and W.H.O. 2021). QMRAs have been employed to provide information for decision-making for the management of microbial risks (see, for example, Plaza-Rodríguez et al. (2018)).

The French partners of the ArtiSaneFood project, a European initiative aimed at improving the microbial safety of artisanal fermented foods in the Mediterranean Region, seek to optimise control measures to reduce the risk of food-borne illness from consuming soft cheese made from raw milk. While cheese is generally considered safe and nutritious, there are instances of food-borne illnesses related to its consumption, as noted by Sherkat et al. (1998). To address this issue, Perrin et al. (2014) investigated potential control measures to minimise the risk of Haemolytic and Uremic Syndrome (HUS) caused by Shiga-toxin producing Escherichia coli (STEC) in raw milk soft cheese. They developed a stochastic QMRA model to assess the risk of HUS associated with the five Main Pathogenic Serotypes (see Endnote *1) of STEC (MPS-STEC) in raw milk soft cheeses and a version of this model was implemented in SAS (SAS version 9.3 TS). In this work, we present an alternative implementation of the model using R language, along with several modifications to the original model.

The QMRA model is implemented as a stochastic simulator that is composed of several hierarchical levels that will be called modules. The simulator is divided into two parts—namely, the batch-level simulator and the output module. In the batch level, the simulator is composed of a farm module followed by a pre-harvest intervention step, a cheese production module, a consumer module and a post-harvest sampling module. This corresponds to the fabrication of a single batch of cheeses, that is produced from a single batch of milk coming from a fixed number of farms. Fig. 1 provides a schematic diagram of the batch level simulator that generates three outputs corresponding to a particular batch to which pre- and post-harvest interventions are applied: milk loss per batch, probability of rejecting a particular batch and risk of illness if that particular batch of cheese is consumed. It also simulates internal variables that characterise a particular batch, namely the STEC concentration \(Y_0\) in the aggregated milk tank, the median concentration of bacteria at the time of consumption \(Y_s^{\rm consum}\) and the average number of colonies \(\lambda_s^{\rm colony}\) in a single cheese.

Figure 1.  

Schematic diagram of the batch level simulator of the risk assessment model. Modules are denoted by pink coloured boxes with the blue boxes denoting the set of corresponding input parameters \(\theta = \{\theta^{\rm farm}, \theta^{\rm cheese}, \theta^{\rm con}, \theta^{\rm post}\}\) and the orange boxes denoting the outputs, namely, milk loss per batch \(M^{\rm batch}\), probability of rejecting a particular batch \(P^{\rm batch}\) and batch risk \(R^{\rm batch}\).

The model is made available in Food Safety Knowledge Markup Language (FSK-ML) format to facilitate its reuse. This open format is based on predefined terms, metadata and controlled vocabulary to harmonise annotations of risk assessment models (Haberbeck et al. 2018). This article details the construction of the modules, including the underlying assumptions and the methods employed to estimate various quantities of interest. The subsequent section presents specifics of the modules as they were implemented in R and are available in the FSK-ML format. We also provide a comparative summary of the model in relation to the initial model proposed by Perrin et al. (2014). The article concludes with a section that addresses the scope and applicability of the model.

QRA simulator

In this section, we describe the different modules of the simulator that represent the farm-to-fork continuum in the cheese fabrication process. It starts with a farm module which computes the STEC concentration (CFU/ml) in the aggregated milk tank that is used for cheese-making. This module includes a pre-harvest intervention step which performs "farm milk sorting" or, more precisely, rejecting the contaminated tankers of milk with a concentration in E. coli above a given threshold. Next, we have the cheese production module which describes the evolution of STEC over different stages of cheese fabrication—namely, milk storage, moulding, draining, salting, ripening and cheese storage. Then the consumer module computes the risk of HUS for a particular batch of cheese, based on the consumption behaviour for different age groups. The post-harvest intervention module describes the sampling of cheese after production and computes the probability of rejecting a particular batch of cheese. These four modules, all together, constitute the batch level simulator, which simulates the output corresponding to the fabrication of a particular batch of cheeses. In this model, with the default paramaters, a batch is usually composed of 22,000 to 23,000 cheeses of 250 g. For the computation of ultimate quantities of interests, the output module simulates several batches and computes the overall risk of HUS \(R^{\rm HUS}\), the average batch rejection rate \(P^{\rm avg}\) and the average milk loss due to sorting \(M^{\rm avg}\).

Farm module

The outcome of the farm module is the STEC concentration (CFU/ml) in the aggregated milk tank that collects milk from all the farms after pre-harvest milk sorting. The inputs of this module are denoted by \(\theta^{\rm farm}\). The model considers a fixed number of farms (N_farms) with given hygiene conditions (controlled by parameters \(\alpha\) and \(\sigma\) *4, characterising the E. coli concentration in milk) and each of the farms has a certain number of cows*5. Milk from each of the farms is collected in a bulk tank and called Bulk Tank Milk (BTM), which is subjected to pre-harvest intervention (milk sorting). The milk from all the sorted farms is then collected in an aggregated milk tank. The farm module simulates the concentration \(Y_0\) of STEC (CFU/ml) in this aggregated milk tank.

Module inputs

The set of inputs parameters of the farm module denoted by \(\theta^{\rm farm}\), are described in Table 1.

Table 1.

Inputs of farm module \(\theta^{\rm farm}\). Unless specified, the parameter values are taken from Perrin et al. (2014).

Symbol Description default values/references
N_farms Number of farms 31 (see *3)
N_cow_i Number of cows in i-th farm see *5
alpha_i Parameter for distribution of E. coli in milk (LogNormal distribution) -1.3 (see *4)
sigma_i Parameter for distribution of E. coli in milk (LogNormal distribution) 2.9 (see *4)
a_weibull Parameter for distribution of STEC in faecal matter 0.264
b_weibull Parameter for distribution of STEC in faecal matter 16.288
mu_ecoli Parameter for distribution of E. coli in faecal matter 6
tau_ecoli Parameter for distribution of E. coli in faecal matter 0.3
mu_u Parameter for distribution of probability infected cows -0.927
tau_u Parameter for distribution of probability infected cows 1.47411
q_milk Average quantity of milk from a cow 25 litres
sorting_frequency Milk testing frequency 10
sorting_lim Max. limit of E. coli concentration 50
p_MPS_STEC Proportion of pathogenic STEC in cows 0.025

Module description

The concentration of STEC in farm milk is usually too low to be assessed quantitatively through microbiological methods (because of their limit of detection). For this reason, we rely on the "relative" approach proposed by Perrin et al. (2014), which uses collected data*4 on Escherichia coli (E. coli) to estimate the STEC concentration. This approach is based on the assumption that E. coli and STEC strains follow the same faecal routes in the cows.

For each farm \(i = 1, 2, ..., N^{\rm farms}\), the milk is collected into a bulk tank from \(N^{\rm cow}_i\)cows, where \(N^{\rm cow}_i\) is simulated from a cow distribution estimated from data (see Endnote *5). The STEC concentration in the bulk tank milk (BTM) corresponding to farm \(i\) is denoted by \(Y_{0, i}\) and is computed using a proportion rule as:

\(Y_{0, i} = Y^{\rm EC}_i \frac{\overline{F^{\rm STEC}_i}}{\overline{F^{\rm EC}_i}},\)

where \(Y^{\rm EC}_i\) is the concentration of E. coli (CFU/ml) in bulk milk tank, \(\overline{F^{\rm STEC}_i}\) is the average STEC concentration (CFU/gram) and \(\overline{F^{\rm EC}_{i}}\) is the average E. coli concentration in faecal matter (CFU/gram) from all the cows.

  • Pre-harvest intervention*6

Bulk tank milk coming from each farm is tested for E. coli concentration and the farms with concentration \(Y_i^{\rm EC}\) higher than a certain threshold \(l^{\rm sorting}\) are rejected. Milk sorting is not performed for every batch of milk: instead, it is controlled by a parameter \(f^{\rm sorting}\), which denotes the frequency (in days) of testing the farms milk. Let \(S\) denote the set of farms that are qualified after milk sorting and let \(N^{\rm farms, sorted} = |S|\). The milk loss for a particular batch is given by \(M^{\rm batch} = \sum_i^{N^{\rm farms}}V_{i}\, {1}_{\{i \in S\}}\), where \(V_i = q_{\rm milk} N_i^{\rm cow}\) is the amount of milk produced by the \(i\)-th farm\(N_i^{\rm cow}\)*5. After milk sorting, milk from all the qualified bulk tanks is collected into a single aggregated milk tank. Hereafter, only the farms with accepted level of E. coli are considered.

For the \(i\)-th accepted farm, \(\overline{F^{\rm STEC}_i}\) is the average of the individual STEC concentrations \(F^{\rm STEC}_{i,\,j}\) in infected cows \(j = 1, 2, ..., k_{i}\), where \(k_i\) is the number of infected cows in the \(i\)-th farm simulated according to a binomial distribution:

\(k_{i} \sim {\rm Bin}(N^{\rm cow}_i, p_i) \), where \({\rm logit}(p_i) = u_i\) and \(u_i \sim \mathcal{N}(\mu_{u}, \tau_{u})\).

The number of cows infected with MPS-STEC in the \(i\)-th farm is also simulated according to a binomial distribution:

\(k^{\rm MPS}_{i} \sim {\rm Bin}(k_i, p^{\rm MPS}_{\rm STEC}).\)

Remark: If we are interested to compute the concentration of MPS-STEC in the aggregated milk, we use \(k_i^{\rm MPS}\) instead of \(k_i\)in the following computations:

\(F^{\rm STEC}_{i,\,j}\) is simulated according to a Weibull distribution:

\(\begin{equation}\begin{split}F^{\rm STEC}_{i,\,j} &\sim {\rm Weibull}(a^{\rm weibull}, b^{\rm weibull}),\\\overline{F^{\rm STEC}_i} &= \frac{1}{N_{i}^{\rm cow}}\sum_{j=1}^{k_{i}}F^{\rm STEC}_{i,\,j}.\end{split}\end{equation}\)

The concentration of E. coli (CFU/ml) in BTM \(Y^{\rm EC}_i\), is modelled by a lognormal distribution:

\(\begin{equation}Y^{\rm EC}_i \sim {\rm Lognormal}(\alpha_{i}, \sigma_i).\label{eq:farm_alpha_sigma}\end{equation}\)

\(\overline{F^{\rm EC}_{i}}\) is the average of individual E. coli concentration in faecal matter for each cow, denoted by \(F^{\rm EC}_{i,\,j}\), \(j = 1, 2, ..., N^{\rm cow}_{i}\) cow. It is simulated according to a Lognormal distribution:

\(\begin{equation}\begin{split}&\log_{10}(F^{\rm EC}_{i,\,j}) \sim \mathcal{N}(\mu^{\rm ecoli}, \tau^{\rm ecoli}), \\&\overline{F^{\rm EC}_{i}} = \frac{1}{N^{\rm cow}_i}\sum_{j=1}^{N^{\rm cow}_i}F^{\rm EC}_{i,\,j}.\end{split}\end{equation}\)

The STEC concentration \(Y_0\) (in CFU/ml) in this aggregated milk tank is then given by:

\(\begin{equation}Y_0 = \sum_{i=1}^{N^{\rm farms}} \bigg (Y_{0, i} \cdot \frac{V_{i}{1}_{\{i \in S\}}}{\sum_{i=1}^{N^{\rm farms}}V_{i}{1}_{\{i \in S\}}} \bigg).\label{eq:farm_atm_STEC}\end{equation}\)


The quantities of interest from the farm module are \(Y_0\), the milk loss due to sorting, the \(M^{\rm batch}\) total milk utilised \M^{\rm batch} and the number of farms discarded (due to sorting), i.e. \(N^{\rm farms} - N^{\rm farms, sorted}\). Fig. 2 plots the histogram for the final STEC concentration in the aggregated milk tank without milk sorting testing.

Figure 2.  

Histogram of STEC (main pathogenic serotypes MPS-STEC) concentration (log10 (CFU/ml)) in milk put into production.

Cheese module

The cheese module begins with the input of the initial concentration \(Y_0\) of STEC in the aggregated milk tank and simulates the evolution of the STEC bacteria throughout the cheese-making process, which involves a transition from the liquid state of milk to the solid state of cheese.

Module inputs

The inputs of the cheese module are the initial concentration \(Y_0\) of STEC and the parameters described in Table 2, which are denoted as \(\theta^{\rm cheese}\).

Table 2.

Parameters of cheese module \(\theta^{\rm cheese}\). Unless specified, the parameter values are taken from Perrin et al. (2014).

Symbol Description Values/reference
Parameters for mu_max Parameters to compute mu_max Table I in Perrin et al. (2014)
mu_opt Optimal growth rate 1.85 (average from Table II in Perrin et al. (2014))
y_max_milk Hypothetical maximum population in milk 109 CFU/ml
y_max_cheese Hypothetical maximum population in cheese 105 CFU/g
p_O157H7 Class probability of O157:H7 0.76 (taken from *3)
p_MPS_STEC Proportion of main pathogenic STEC 0.025
rho_O157H7 Parameter for decline phase of O157:H7 0.14 log10 CFU/day
rho_otherMPS Parameter for decline phase of other MPS 0.033 log10 CFU/day
a_w Water activity parameter 0.99
w_loss Proportion of water loss 0.9
v_cheese Milk used in a single cheese 2200
t_consum Consumption time

Triangular (22,30,60)

d Duration of a step Table III in Perrin et al. (2014)
pH pH of a step Table III in Perrin et al. (2014)
T Temperature of a step Table III in Perrin et al. (2014)
NA Absolute tolerance of ode solver 10-6
NA Maximal step size of ode solver 0.01

Module description

The evolution of STEC involves six steps, with milk storage and moulding taking place during the liquid growth phase and draining and salting occurring during the solid growth phase. Ripening and cheese storage represent the (solid) decline phase of STEC.

In all the growth steps, the concentration \(y(t)\) for STEC at time \(t\), is modelled using an ordinary differential equation:

\(\begin{equation}\frac{dy}{dt} = \mu^{\rm max}(t) y(t) \bigg(1-\frac{y(t)}{y^{\rm max}}\bigg),\end{equation}\)

where \(\mu^{\rm max}(t)\) stands for the maximum growth rate (in \(\rm h^{-1}\)) and \(y^{\rm max}\) is a parameter that represents the hypothetical maximum population of STEC strain in milk or cheese. For each step, the physico-chemical parameters \(\{d, pH, T, a_w\}\) are computed from Table III in Perrin et al. (2014). The maximum growth rate \(\mu^{\rm max}(t)\) for a particular step, at time \(t\) is according to model 5 in Augustin et al. (2005). \(\mu^{\rm max}(t)\) is dependent on the values of physico-chemical parameters \(\{d, pH, T, a_{\rm w}\}\) at time \(t\) and the optimal growth rate parameter \(\mu^{\rm opt}\). The optimal growth rate is taken to be the average of the growth rates corresponding to different strains as provided in Table II in Perrin et al. (2014).

The milk storage step starts with initial concentration \(Y_0\) and outputs the final concentration \(Y^{\rm storage}\) after growth at the end of storage. The moulding step takes \(Y^{\rm storage}\) as input to model the corresponding growth of the bacteria during moulding and outputs the final concentration \(Y^{\rm molding}\). After moulding, the milk is converted into solid cheese and the STEC bacteria from colonies inside a single cheese made from a volume \(v^{\rm cheese}\) (ml) of milk. The number of colonies in a cheese is modelled as a Poisson variable \(N^{\rm colony}_s \sim \rm Poisson(\lambda^{\rm colony}_{\rm s})\), for each strain \(s\) of MPS-STEC \(s \in \{ {\rm O157}, \overline{\rm O157}\}\), with the following mean parameters:

\(\lambda^{\rm colony}_{\rm O157} = Y^{\rm molding} \cdot v^{\rm cheese}\cdot w^{\rm loss} \cdot p_{\rm O157} \cdot p^{\rm STEC}_{\rm MPS}\),

\(\lambda^{\rm colony}_{\overline{\rm O157}} =Y^{\rm molding} \cdot v^{\rm cheese} \cdot w^{\rm loss} \cdot(1-p_{\rm O157}) \cdot p^{\rm STEC}_{\rm MPS}\).

Remark: The factor \(p^{\rm STEC}_{\rm MPS}\) takes into account only the concentration of pathogenic STEC bacteria (MPS-STEC). Alternatively, this can be taken into account by directly computing the concentration of MPS-STEC in the aggregated milk tank as an output of the farm module using \(k_i^{\rm MPS}\) instead of \(k_i\). The FSKX implementation provides a flag flag_MPS = FALSE that allows the user to directly compute the concentration of MPS-STEC in the farm module using the random proportion of MPS-STEC infected cows in a farm.

Starting from the draining phase, the evolution of the size of colony, stemming for one immobilised STEC cell, is studied. The draining phase commences with an initial colony size of 1 CFU and growth continues until the salting phase. It is assumed that the evolution of each colony inside each cheese (weighing 250 g) is identical during these phases, since they have the same environmental conditions. The output of the draining and salting steps are called \(Y^{\rm draining}\) and \(Y^{\rm salting}\), respectively.

The growth in colony size stops after the salting phase. Then starts the decline phase, which is composed of two steps—namely, ripening and cheese storage. The ripening phase lasts until the 14th day of cheese production and the cheese storage phase duration depends on the consumption time \(t^{\rm consum}\) of the cheese. The rate of decline is different for different strains—namely, MPS (\({\rm O157}, \overline{\rm O157}\)) and non-MPS STEC. The median number of bacteria per colony, for a particular strain \(s\) after decline, at the end of the ripening phase and at the time of consumption, is computed as:

\(Y^{\rm ripening}_s =Y^{\rm salting} \cdot 10^{- \rho_s \times (14 \times 24 - t)/24},\)

\(Y^{\rm consum}_s =Y^{\rm salting} \cdot 10^{- \rho_s \times (t^{\rm consum} \times 24 - t)/24}.\)

where \(t\) the time (in hours) taken until the salting step.

The expected number of colonies \(\lambda^{\rm colony}_s\) is adjusted taking the inactivation during the ripening and cheese storage phases. For instance if the median number of bacteria for a particular strain \(Y^{\rm consum}_s\) falls below 1, it signifies that the colony might have disappeared with probability \(Y^{\rm consum}_s\) and the corresponding expected number of colonies is obtained by multiplying \(\lambda^{\rm colony}_s\) by \(Y^{\rm consum}_s\). In case of \(Y^{\rm consum}_s > 1\), the expected number of colonies remains unchanged.

\(\lambda^{\rm colony}_s \leftarrow \lambda^{\rm colony}_s \cdot \min(1, Y_s^{\rm consum})\).


The outputs of interest for the cheese module are the average number of colonies \(\lambda^{\rm colony}_{s}\) and the median colony size at the time of consumption \(Y^{\rm consum}_s\), where the subscript s denotes the strain \(\{ {\rm O157},\overline{\rm O157}\}\). Fig. 3 shows the growth of STEC concentration (CFU/ml) starting from \(Y_0 = 10^{-3.8}\) CFU/ml, during the storage and moulding steps, for the baseline scenario (with no interventions and default values of the parameters). After the transition from liquid to solid state, the evolution of a single colony is shown in Fig. 4. The colonies grow during the draining and salting step and then decline during the ripening and cheese storage steps.

Figure 3.  

Evolution of STEC (main pathogenic serotypes MPS-STEC) in log10 CFU/ml during the storage and moulding step. The blue vertical line shows the end of the storage phase.

Figure 4.  

Evolution of STEC colony size during draining, salting and ripening of cheese fabrication. The decline rate for the MPS O157:H7 strain and non-MPS strains are equal (orange line) and significantly higher than the decline rate of MPS non-O157:H7 strain (red line). The three phases, namely, draining, salting and ripening are separated by vertical blue dotted lines.

Figure 5.  

Batch rejection probability as a function of the initial STEC (main pathogenic serotypes MPS-STEC) concentration (CFU/ml).

Consumer module

In this section, we describe the module of the batch level simulator that computes the risk of HUS for a particular batch. Given the outputs of the cheese module, i.e. the average number of colonies \(\lambda_s^{\rm colony}\) and the median number of bacteria at the end of the decline phase \(Y_s^{\rm consum}\), the consumer module estimates the risk of HUS, based on the raw milk cheese consumption behaviour of people in different age groups \(\theta^{\rm con}\).

Module description

The dose \(\Gamma\), corresponding to the concentration of MPS-STEC per \(25\) g of cheese serving, is computed as:

\(\Gamma = \sum_s N^{\rm colony}_{\rm s, sample} Y^{\rm colony}_{s}\)

where \(N^{\rm colony}_{\rm s, sample}\) denotes the number of colonies in a cheese serving of weight \(wt^{ \rm serving}\) g, which is a Poisson random variable with mean \(\lambda^{\rm colony}_s \times \frac{wt^{ \rm serving}}{wt^{\rm cheese}}\). \(Y^{\rm colony}_s\) denotes the size of all colonies in a cheese, which follows a Log-Normal distribution:

\(Y_s^{\rm colony} = Y^{\rm consum}_s \cdot 10^{\epsilon_s},\)

where \(\epsilon_s \sim \mathcal{N}(0, \tau_{\epsilon_s})\) represents inter-batch variablity (i.e. variability between different cheeses in a single batch). Note that here we assume all the colonies inside a single cheese, for a particular batch and a given strain \(s\), have identical size \(Y^{\rm colony}_{s}\).

The average number of colonies \(\lambda_s^{\rm colony}\) depends on several random quantities, such as the initial STEC concentration \(Y_0\),\(\) \(t^{\rm storage}\), \(d^{\rm storage}\)and \(t^{\rm consum}\) (see Table 2). The median colony size \(Y_s^{\rm consum}\) depends on the random variable \(t^{\rm consum}\). In summary, the dose \(\Gamma\) is a combination of several Poisson and Log-Normal random variables corresponding to different strains and their parameters are dependent on the random quantities \(\xi^{\rm dose} = \{Y_0, t^{\rm storage}, d^{\rm storage}, t^{\rm consum}\}\). The probability of HUS (risk) associated with the ingestion of a dose \ (\gamma\), for age \(a\) is:

\(\begin{equation}P({\rm HUS}| a, \gamma) = 1-(1-r_{\rm a})^\gamma,\end{equation}\)

where \(r_{\rm a} = r_0 \times e^{-k a}\) for age \(a\) varying from \(1\) to \(15\). The risk of HUS for a particular batch of milk, conditional on \(\xi^{\rm dose}\), can be derived by integrating w.r.t \(\Gamma \vert \xi^{\rm dose}\):

\(R^{\rm batch}= \sum_{a=1}^{15}g(a)\int_0^{\infty}[1-(1-r_{\rm a})^\gamma]\,p(\gamma| \xi^{\rm dose})\,\rm{d}\gamma\) ,

where \ (p(\gamma|\xi^{\rm dose})\) is the conditional probability density function of \(\Gamma\) given \(\xi^{\rm dose}\) and \ (g(a)\) is the proportion of cheese consumed by the age group \(a\).

Module inputs

The inputs of the consumer module are the average number of colonies \(\lambda_s^{\rm colony}\), the median colony size \(Y_s^{\rm consum}\) and other parameters denoted by \(\theta^{\rm con}\), which are listed in Table 3.

Table 3.

Parameters of the consumer module \(\theta^{\rm con}\). Unless specified, the parameter values are taken from Perrin et al. (2014).

Symbol Description Values/references
k Parameter for risk computation 0.38
r_0 Parameter for risk computation 10-2.33
wt_cheese weight of a single cheese 250 gm
wt_serving weight of a single serving 25 gm
g(a) Proportion of cheese consumed by age group a taken from *3
tau_eps_O157H7 Parameter for inter cheese variability 0.000279659 (taken from *3)
tau_eps_otherMPS Parameter for inter cheese variability 0.000065399 (taken from *3)
a_max Maximum age group 15
N_dose Monte Carlo sample size 0
Table 4.

Inputs of output module \(\theta^{\rm output}\).

Symbol Description Values/reference
N_batch Monte Carlo sample size 1
p_test Proportion of cheese batch tested 0.5
Table 5.

Default simulation settings.

fm_N_farms 31
fm_q_milk 25
fm_sorting_freq 10
fm_sorting_lim 50
fm_mu_u -0.927
fm_tau_u 1.47411
fm_a_weibull 0.264
fm_b_weibull 16.288
fm_mu_ecoli 6
fm_tau_ecoli 0.3
cm_mu_max_T_min 5.5
cm_mu_max_T_opt 40.6
cm_mu_max_T_max 48.1
cm_mu_max_pH_min 3.9
cm_mu_max_pH_opt 6.25
cm_mu_max_pH_max 14
cm_mu_max_aw_min 0.9533
cm_mu_max_aw_opt 0.999
cm_mu_max_mu_opt 2.03
cm_w_activity 0.99
cm_rho_O157H7 0.14
cm_rho_otherMPS 0.033
cm_y_max_milk 1e+09
cm_y_max_cheese 1e+05
cm_storage_duration 12
cm_storage_duration_min 1
cm_storage_duration_max 40
cm_storage_duration_mode 12
cm_storage_temperature 5
cm_storage_temperature_min 1
cm_storage_temperature_max 6
cm_p_O157H7 0.76
cm_p_MPS_STEC or fm_p_MPS_STEC 0.025
cm_mu_eps_O157H7 0
cm_tau_eps_O157H7 0.000279659
cm_mu_eps_otherMPS 0
cm_tau_eps_otherMPS 6.5399e-05
cm_molding_duration 3
cm_draining_duration 17








cm_v_cheese 2200
cm_w_loss 0.9





cm_m_sample 25
cm_n_sample 5
cm_k 0.38
cm_r0 1e-2.33
cm_age_max 14













Numerical methods for the estimation of the batch risk

For a given set of input parameters \(\xi^{\rm dose}\), the conditional risk \(R^{\rm batch} =P({\rm HUS}|\xi^{\rm dose})\) can be computed using simple Monte Carlo using i.i.d. samples from \(p(\Gamma|\xi^{\rm dose})\). The number of Monte Carlo samples is given by the numerical parameter \(N^{\rm dose}\). Note that the randomness in \(\Gamma|\xi^{\rm dose}\) is only due to the fact that the dose is a combination of several pairs of a Poisson and Lognormal random variables.

Alternatively, the batch risk can be computed by approximation of the integral \(\mathbb{E}[(1-r_a)^{\Gamma}]\). Since the two classes of MPS-STEC strains are independent, we can write the dose as a sum \(\Gamma = \Gamma_1 + \Gamma_2\) with:

\(\Gamma_s = Y^{\rm consum}_s \cdot N_s^{\rm colony} \cdot 10^{\tau_{\epsilon_s}\nu_s}= d_s \cdot N_s^{\rm colony}\cdot b_s^{\nu_s}\),

where \(\nu_s\) is a standard normal variable. Now the expectation can be written as \(\mathbb{E}[(1-r_a)^{\Gamma}] = \prod_s \mathbb{E}[\mathbb{E}[(1-r_a)^{\Gamma_s}|\nu_s]]\). Using the result in Endnote*2, the expectation can be further reduced to:

\(\mathbb{E}[(1-r_a)^{\Gamma}] = \exp(-\sum_s \lambda^{\rm colony}_s)\prod_s\mathbb{E}[c_{s,1}^{c_{s, 2}^{c_{s, 3}^{\nu}}}] ,\)

where \(c_{s, 1} = \exp(\lambda_s^{\rm colony}), c_{s, 2} = (1-r_a)^{d_s}\)and \(c_{s, 3} = b_s\). Since \(c_{s, 1} > 1, c_{s, 2} < 1\) and \(c_{s, 3} > 1\), the function

\((c_{s,1})^{(c_{s, 2})^{c_{s, 3}^{\nu_s}}}\)

is monotone (non-increasing) in \(\nu_s\). For such functions, deterministic quadrature methods (such as the trapezoidal rule) have convergence rate \(O(n^{-1})\), which is better than simple Monte Carlo (see, for example, Novak (1992)).

The choice of the method for computating the batch risk is determined through the parameter value \(N^{\rm dose}\). For a non-zero integral value, the simulator uses the simple Monte Carlo method and, if \(N^{\rm dose}\) is set to zero, it used the integral approximation method.

The current implementation also allows the user to compute the conditional batch risk \(\xi^{\rm dose}\setminus t^{\rm consum}\)averaged out with respect to the consumption time. This is done by using the flag flag_consum = TRUE that computes the integration of \(R^{\rm batch}\) with respect to \(t^{\rm consum}\). The integral \(I = \mathbb{E}_{t^{\rm consum}}[R^{\rm batch}(t^{\rm consum})]\) is approximated using a piecewise constant function on a regular grid \(\{t_1^{\rm consum}, t_2^{\rm consum}, \cdots, t_n^{\rm consum}\}\) of size \(n\), that spans the support of the triangular distribution of the variable \(t^{\rm consum}\). For each interval, the value of the piecewise constant function is taken equal to the value of \(R^{\rm batch}\) at the left endpoint of that interval and the approximation is computed as:

\(\hat{I} = \sum_{i=1}^{n-1}P[t_i^{\rm consum} \leq t^{\rm consum} \leq t_{i+1}^{\rm consum}] \cdot R^{\rm batch}(t_i^{\rm consum})\).

For STEC, due to the inactivation during the post-ripening storage phase, \(R^{\rm batch}\) turns out to be a decreasing function of \(t^{\rm consum}\), which slightly overestimates the risk with this approach. It is worth noting that the computation of \(R^{\rm batch}\) for a grid of values is not very expensive compared to the computation of the same for a single value of \(t^{\rm consum}\), since both of them require a single run to the cheese module that can compute all the necessary outputs, \(\lambda_s^{\rm colony}\)and \(Y^{\rm consum}_s\) corresponding to the consumption time points.


The output of the consumer module is the estimated batch risk of HUS. For a given set of input parameters, Fig. 6 plots the relative risk \(R^{\rm batch}/R^{\rm baseline}\) against \(log10(y_0)\). The baseline risk is computed with no intervention steps.

Figure 6.  

The relative batch risk (with respect to a baseline risk value) is plotted as a function of the initial STEC (main pathogenic serotypes MPS-STEC) concentration (CFU/ml).

Post-harvest: Cheese sampling

The post-harvest step, also known as the sampling step, can be carried out at various stages of cheese production, depending on the type of bacteria. For STEC, this step is conducted at the end of the ripening phase, more precisely at the 14th day of production by default. However the current implementation allows us to change this parameter with a minimum value of 3 days and a maximum value of 14 days. During this step, the batch of cheese produced is examined for MPS-STEC contamination by taking small portions of cheese samples from the batch. Once a single sample unit tests positive, the entire batch of cheese is rejected, meaning that the specific batch does not enter into the calculation of the overall risk of HUS.

Module inputs

The inputs of the post-harvest module are initial STEC concentration in milk \(Y_0\), average number of colonies \(\lambda_s^{\rm colony}\) and the parameters denoted by \(\theta^{\rm post}\), listed in Table 6.

Table 6.

Parameters of the post-harvest module \(\theta^{\rm post}\). Unless specified, the parameter values are taken from *3.

Symbol Descrtption Values/reference
n_sample Number of test portions 5
m_sample Mass of each test portion 25 gm
d_test Post-harvest sampling day 14 days

Module description

We assume that the colonies are homogeneously distributed inside a cheese, a colony contains at least one MPS-STEC cell and the test is accurate enough to detect a colony with a single MPS-STEC cell. We observe that probability of a sample unit testing positive is \(P^{\rm sample} = P[N^{\rm colony}_{\rm sample}>0]\), where \(N^{\rm colony}_{\rm sample}\) is the total number of colonies from all strains. Since the number of cells corresponding to two different strains grow independently, the total number of colonies is also a Poisson random variable, \(N^{\rm colony}_{\rm sample} \sim \rm Poisson((\lambda^{\rm colony}_{1} + \lambda^{\rm colony}_{2})\cdot \frac{m^{\rm sample}}{wt^{\rm cheese}})\). For given values of \(\xi^{\rm dose}\) using this Poisson distribution, we compute the probability of rejecting a particular batch \(\) \(P^{\rm reject} = 1-(1-P^{\rm sample})^{n^{\rm sample}}\)\(\).


The output of the post-harvest module is the probability of rejection. Fig. 5 plots the probability of rejecting \(P^{\rm reject}\) with fixed \(m^{\rm sample} = 25\) and \(n^{\rm sample} = 5\), for \(10^3\) batches corresponding values of \ (y_0\).

Output module

The output module computes the overall risk of HUS from MPS-STEC and other quantities required to assess the analytical cost corresponding to the intervention steps. This module is outside the batch-level simulator that computes the quantities of interest corresponding to the fabrication of a particular batch of cheese (see Fig. 7).

Figure 7.  

Output module.

Module input

The inputs of the output module are the outputs of the farm, consumer and post-harvest module along with the parameters denoted by \(\theta^{\rm output}\), listed in Table 4.

Model description

The output module simulates several batches and produces estimates of the overall risk \(R^{\rm HUS}\), the probability of rejection \(P^{\rm avg}\) and the milk loss \(M^{\rm avg}\). We define the average milk loss \(M^{\rm avg} = \mathbb{E}[M^{\rm batch}]\), average batch rejection rate: \(\)

\(P^{\rm avg} = \mathbb{E}[P^{\rm batch} \cdot p^{\rm test}] = \int_{\xi^{\rm dose}}P^{\rm batch} \cdot p^{\rm test} \cdot p(\xi^{\rm dose})\, {\rm d}\xi^{\rm dose},\)

with \(p^{\rm test}\) being the proportion of batch tested and the average over batch risk: \(R^{\rm avg} = \mathbb{E}[R^{\rm batch} \cdot (1-P^{\rm batch} \cdot p^{\rm test})]\),

where the batch risk is set to zero for rejected batches.

The overall risk of HUS is conditional on the event that the batch actually goes into the market, i.e. not rejected. It is computed by dividing \(R^{\rm avg}\) by the probability that a produced batch is not rejected:

\(R^{\rm HUS} = \frac{\int_{\xi^{\rm dose}}R^{\rm batch} \cdot (1-P^{\rm batch} \cdot p^{\rm test}) \cdot p(\xi^{\rm dose}){\rm d}\xi^{\rm dose}}{1-P^{\rm avg}}\).

The quantities are estimated using simple Monte Carlo with sample size \(N^{\rm batch}\). We use i.i.d. samples \(\{\xi^{\rm dose}_{1}, \xi^{\rm dose}_{2}, \cdots , \xi^{\rm dose}_{N^{\rm batch}}\}\), drawn from the conditional distribution \(p(\xi^{\rm dose})\) to construct the unbiased simple Monte Carlo estimator:

\(\hat{P}^{\rm avg} = \frac{1}{N^{\rm batch}}\sum_{l=1}^{N^{\rm batch}}\hat{P}^{\rm batch}(\xi^{\rm dose}_{l}) \cdot p^{\rm test}\),

\(\hat{R}^{\rm avg} = \frac{1}{N^{\rm batch}}\sum_{l=1}^{N^{\rm batch}}\hat{R}^{\rm batch}(\xi^{\rm dose}_{l}) \cdot (1-\hat{P}^{\rm batch}(\xi^{\rm dose}_{l}) \cdot p^{\rm test})\).

The current implementation of the output module computes the relative risk of HUS with respect to a baseline scenario with no intervention steps. This quantity is obtained by dividing \(R^{\rm HUS}\) by the risk of HUS in the baseline scenario (the baseline risk is also estimated by simple Monte Carlo with sample size \(N^{\rm batch}\)).


The output module returns the relative risk of HUS \(R^{\rm HUS}\), average proportion of rejected batches \(P^{\rm avg}\) and average milk loss \(M^{\rm avg}\).

Comparison with the previous implementation

The R implementation proposed in this article differs in several respects from the QMRA model originally proposed by Perrin et al. (2014). The modifications have been validated by experts from ANSES, CNIEL and L2S. In this section, we summarise the differences.

Modifications in the farm module

  1. In the farm module, the hyper-parameters of the distribution of the concentration of E. coli (CFU/ml) in bulk milk tank have been estimated using a Bayesian approach, based on a hierarchical Poisson mixed model as described by Equation 4 in Perrin et al. (2014). The approach is based on the E. coli data provided by *3 and uses a Gibbs sampler to derive the posterior distribution of \(\alpha\) and \(\sigma\).
  2. The proportion of MPS infected cows is considered to be \(p^{\rm STEC}_{\rm MPS}\). Previously, this proportion was multiplied with the average colony size of STEC after the moulding step in order to obtain the average colony size of MPS-STEC. In the current implementation, this proportion is used directly inside the farm module to simulate the number of MPS infected cows in each farm, that allows us to simulate directly the concentration of MPS-STEC as an output of the farm module.
  3. As an additional metric of cost of intervention, this model computes the average quantity of milk lost due to pre-harvest milk sorting.

Modifications in the cheese and consumer module

  1. The variable \(u_i\) in Equation (7) of Perrin et al. (2014), denoting intra-cheese variability is set to zero. In other words, all the colonies inside a particular cheese have an identical colony size.
  2. The volume of milk used to produce a single cheese is taken (default value) as 2.2 litres instead of 2.5 litres. However, this value can be changed depending on the production scenario.
  3. In the current implementation, the batch risk is computed at the time of consumption which includes the inactivation (decline in concentration) during the cheese storage phase. This is different from the batch risk computed at the end of production which does not take into account the decline in bacteria population during the cheese storage. The duration of this phase is modelled as a Triangular distribution with more recent and updated values of the parameters as shown in Table 2.

Usage and applicability

The FSKX implementation (Suppl. material 1) allows the user to execute the simulator on KNIME, using a set of input parameters listed in Table 5. By suitably adjusting the input parameter cm_n_batch, the user can run the FSKX implementation to either simulate a single batch (by setting cm_n_batch = 1) or multiple independent batches (by setting cm_n_batch > 1) to estimate the ultimate quantities of interest. Simulation of a single batch produces three numerical outputs, namely, the STEC concentration (CFU/ml) in milk put in production, the amount of milk loss (in litres) due to testing and the probability of rejecting the cheese batch. This also produces graphical representation of the evolution of STEC and colonies during cheese fabrication (both solid and liquid phase) and the evolution of the bacteria growth rate over the different phases of production. On the other hand, when multiple batches are simulated, it produces the estimates of the ultimate quantities of interest, averaged over these batches, namely, the relative risk of HUS computed with respect to a baseline scenario (with no intervention steps), the average milk loss (in litres) and the average probability of rejecting a cheese batch after production. The corresponding graphical outputs show the distribution of STEC concentration (CFU/ml) in aggregated milk tank, the relative batch risk of HUS (computed with respect to the baseline risk) and the relative batch risk and batch rejection probability as a function of initial STEC concentration, as shown in Figs 5, 6. The baseline scenario—i.e. the cheese production without any intervention step—can be simulated by appropriately choosing the parameters \(p_{\rm test} = 0\) and \(f^{\rm sorting} = \infty\).

This article presents a QMRA model that offers a scientific approach to simulate the real-life scenarios encountered during the production of raw milk soft cheese. The model builds upon the work of Perrin et al. (2014), as well as inputs from ANSES, CNIEL, ACTALIA and L2S. The primary goal of this model is to conduct optimisation studies for the process intervention parameters and to make recommendations to cheese producers. It is essential to note that the outputs obtained using the simulator, such as the batch risk, loss of milk and proportion of rejected cheese batches, are just the estimates of a hypothetical scenario simulated with a state-of-the-art*6 QMRA model available for raw milk cheese. These quantities can be used to improve the intervention processes and also study the effects of different input parameters in the production process, but should not be interpreted as the actual prevalence of HUS observed in reality.

Model assumptions

According to F.A.O. and W.H.O. 2021 "Models are always incomplete representations of the system they are intended to model, but they can still be useful". The QMRA model for STEC is based on several assumptions which are quite common in microbiological risk modeling, however some of the important assumptions are listed below in Table 7

Table 7.

QMRA model assumptions

Assumptions Significance Comments
Homogeneous distribution of colonies inside a cheese. The distribution of colonies inside a cheese impacts the post-harvest sampling step. This assumption is used to simplify the cheese testing step, which assures that, if the cheese is contaminated, it is always tested positive. This overestimates the detection probability when colonies are clustered.
Identifying MPS-STEC as the unique HUS-causing hazard. It was not taken into account that certain non-MPS-STEC strains can also cause HUS and that, within MPS-STEC, some of the strains maybe less virulent. Some recent publications, see, for example, Auvray et al. (2023) suggest that MPS could be based on stx subtypes rather than serotypes. The current level of knowledge does not allow us to determine the prevalence, based on this new definition.
STEC and E. coli follow the same intestinal route in the farm animal. The pre-harvest intervention step is based on this assumption. The milk sorting is carried out using the E. coli concentration in the farm milk. This assumption is based on Perrin et al. (2014).
No intra-cheese variability. All the colonies inside a single cheese are of same colony size.

Grant title

This work is part of the ArtiSaneFood project (grant number: ANR-18-PRIM-0015) which is part of the PRIMA programme supported by the European Union.

Author contributions

Subhasish Basak: Writing - Original Draft, Writing - Review & Editing

Janushan Christy, Laurent Guillier, Fanny Tenenhaus-Aziza, Julien Bect, Emmanuel Vazquez: Methodology, Writing- Reviewing and Editing

Fanny Tenenhaus-Aziza: Project coordinator

Moez Sanaa, Frédérique Audiat-Perrin: Writing Reviewing and Editing

Conflicts of interest

The authors have declared that no competing interests exist.
Disclaimer: This article is (co-)authored by any of the Editors-in-Chief, Managing Editors or their deputies in this journal.


Supplementary material

Suppl. material 1: QRA simulator 
Authors:  Subhasish Basak et al.
Data type:  fsk file

The Shiga-toxin producing Escherichia coli a.k.a STEC is a special kind of E. coli bacteria. There are five main pathogenic stereotypes of STEC a.k.a MPS-STEC, that are identified so far in Europe. These are O157:H7, O26:H11, O103:H2, O111:H8 and O145:H28. In our study, we assume only MPS-STEC is responsible for the disease Haemolytic Uremic Syndrome (HUS).


The integral inside \(R^{\rm batch}(y_0)\) can be written as \(\mathop{\mathbb{E}}[1-(1-r_a)^{N^{\rm colony}_s \cdot C}]\).

\(\begin{equation}\begin{aligned}&= \sum_{N^{\rm colony}_s = 0}^{\infty}[1-(1-r_a)^{N^{\rm colony}_s \cdot C}]e^{-\lambda^{\rm colony}_s}\frac{(\lambda^{\rm colony}_s)^{N^{\rm colony}_s \cdot}}{N^{\rm colony}_s!}\\&= \sum_{N^{\rm colony}_s = 0}^{\infty}e^{-\lambda^{\rm colony}_s}\frac{(\lambda^{\rm colony}_s)^{N^{\rm colony}_s}}{N^{\rm colony}_s!} - \sum_{N^{\rm colony}_s = 0}^{\infty}(1-r_a)^{N^{\rm colony}_s \cdot C}e^{-\lambda^{\rm colony}_s}\frac{(\lambda^{\rm colony}_s)^{N^{\rm colony}_s}}{N^{\rm colony}_s!}\\&= 1 - \sum_{N^{\rm colony}_s = 0}^{\infty}\frac{\{(1-r_a)^C\lambda^{\rm colony}_s\}^{N^{\rm colony}_s}}{N^{\rm colony}_s!}\frac{e^{-\{(1-r_a)^C \cdot \lambda^{\rm colony}_s\}} \cdot e^{-\lambda^{\rm colony}_s}}{e^{-\{(1-r_a)^C \cdot \lambda^{\rm colony}_s\}}}\\ &= 1 - e^{-\lambda^{\rm colony}_s \cdot [1-(1-r_a)^C]}\sum_{N^{\rm colony}_s = 0}^{\infty}e^{-\{(1-r_a)^C \lambda^{\rm colony}_s\}}\frac{\{(1-r_a)^C \lambda^{\rm colony}_s\}^{N^{\rm colony}_s}}{N^{\rm colony}_s!}\\&= 1 - e^{-\lambda^{\rm colony}_s \cdot [1-(1-r_a)^C]}\\\end{aligned}\end{equation}\).


ACTALIA SAS Script: The SAS script used by CNIEL and developed by ACTALIA uses a set of parameter values for the implementation. In our work, we have considered it as a reference for several parameter values.


We have used a Bayesian approach to estimate the values of the hygiene parameters in the farm module: this uses a Gibbs sampler to estimate posterior distribution of alpha and sigma, based on the E. coli data provided by ACTALIA/CNIEL and used by Perrin et al. (2014). In this implementation, all the 31 farms considered are assumed to have same hygiene conditions with respect to these hygiene parameters, which are taken as, respectively, the means of the estimated posterior distributions.


Number of cows: In the current implementation, the number of cows per each farm is sampled from the cow distribution data provided by ACTALIA/CNIEL and used by Perrin et al. (2014).


The pre-harvest intervention step does not implement the re-integration procedure of farms in the production process, once a particular farm is rejected. The typical process of re-integration involves conducting repeated tests on the milk from the farm over several days until it consistently shows no signs of contamination, ensuring the production of uncontaminated milk from the farm.

login to comment