Introduction

In an insurance company, many problems may arise when analysing mortality data. First, it may be necessary to use a reference mortality table, especially when data are lacking at some ages, or when the construction of a whole mortality table is excluded. In this case, the reference mortality table is based on a population with a specific risk, distinct from that of the insurance company. These differences between the risk-exposed populations require an adaptation of one table given the other, which can be expressed as a parametric deformation. Second, a precise representation of mortality across ages shows some local phenomena, leading to a non-monotone hazard rate, which may require a relatively complex parametric shape. Third, the analysis of the evolution of mortality rates over time requires a model that remains reliable over the years.

A large amount of literature deals with these problems. To adapt an insurance mortality table given a reference one, one may use the Proportional Hazard transform or the Wang transform.¹ Heligman and Pollard² studied the precise structure of mortality as a function of age. Lee and Carter³ described the evolution of mortality over time, and many other authors have suggested different parametric representations of mortality and its evolution.⁴

Nevertheless, these classical parametric solutions have several drawbacks:

  • These solutions do not improve goodness of fit, and adding parameters is relatively tricky. For instance, considering Wang transforms,¹ the use of several successive transforms does not extend the class of transformed survival functions; the adaptation of one table given another with a single parameter may remain insufficiently accurate, and adding parameters could denature such a transform. For other models, such as those of Heligman and Pollard² or Lee and Carter,³ potential extensions may lead to very different expressions depending on the number of parameters that we wish to add, and the convergence properties of such transformations when increasing the number of parameters are unknown.

  • The use of several parameters in order to fit data may cause serious estimation problems, this estimation being numerically feasible only in the presence of initial values sufficiently close to the solution. Adding parameters or introducing a prospective framework requires initial values that may be hard to obtain.

  • In practice, random death dates are sometimes generated from easily invertible survival functions in order to speed up simulations. This choice leads away from the classical models presented above, in favour of simple, easily invertible laws. The quality of representation of mortality tables is then reduced by the use of laws having few parameters, such as the Gompertz law. Thus, parametric inverse distribution functions are sometimes used to obtain stochastic simulations, but the fit to a given set of mortality tables then cannot exceed a given precision.

Many parametric expressions have been suggested to deal with each of these problems, but they assume different forms, and it is worthwhile to look for a common parametric form that may be used for probability distortions, for static and prospective mortality tables, and for inverse distribution functions intended for stochastic simulations. Moreover, depending on the desired accuracy, choosing the number of parameters without modifying the nature of the adjustment is a question of great importance that is difficult to solve with classical tools.

To provide a helpful tool for all the issues introduced above, it is natural to suggest the use of probability distortions, and to consider the composition of these distortions. Composed distortions allow us to obtain accurate and easily invertible adjustments of survival functions, with the possibility of increasing the number of parameters in order to converge to a target law. This approach can serve many purposes, such as pricing or risk measurement.

In this paper, we show how our distortions modify random variables (Proposition 1, linked with Accelerated Failure Time models), hazard rates (Proposition 2) and stop-loss premiums in the regular variation case (Proposition 3). The main findings of this paper are that some particular distortions reduce the number of parameters (Theorem 1), that these distortions allow an initial survival function to converge to any target survival function (Theorem 2) and that accurate initialisation values can be given for parameter estimation (Proposition 4).

The paper is structured as follows. In the section “Probability distortions and constraints”, we introduce some general uses of probability distortions in the actuarial field, and the more specific constraints that we have chosen for our distortions. In the section “Transformations”, we deal with the general form of these distortions, and give some initial results on distorted random variables. In particular, the sub-section “Conversion functions” gives specific examples of distortions, mainly smoothed and composed versions of a basic class of angle functions. The estimation problem and the proof of convergence of the chosen distortions to any target survival function are presented in the section “Estimation and convergence of iterative adjustment”. Lastly, applications to the adjustment of multiple mortality tables are given in the section “Numerical applications”.

Probability distortions and constraints

There are many different aims when using probability distortions, including the following:

  • Obtaining a parametric form for a quantity of interest, improving the fit of a reference to real data (adjusting an official mortality table to business data, adjusting the claims distribution on a segment given a global distribution).

  • Explaining a phenomenon by the considered distortion, with the parametric distortion being the centre of interest (e.g. explaining the evolution of a phenomenon over time).

  • Applying a prudential rule, adding weight to the distribution's tail, or more generally taking into account phenomena that are not observed in the data (carrying out a loading that preserves bracket pricing, setting a solvency margin).

The first use of probability distortions can be attributed to d’Alembert.⁵ Amount distortions by way of utility functions appeared in Bernoulli's treatise.⁶ A few years later, d’Alembert suggested distorting the probabilities themselves.⁷ Ironically, his intention was not to take into account a prudential constraint, but on the contrary to downweight rare events, in order to answer the well-known Saint Petersburg paradox.

More recently, probability distortions have gained renewed interest. In economics, whereas utility functions modify the perception of amounts while keeping probabilities unchanged, the dual theory of Yaari⁸ keeps amounts unchanged while distorting probabilities.⁹ These different points of view can be seen as heirs of the antagonistic views of d’Alembert and Bernoulli. In the actuarial field, probability distortions have been popularised by Wang's work. He used different distortions for pricing and for risk measurement.¹⁰ Risk-measure evaluation for financial assets also involves probability distortions, as illustrated in Wang¹¹ or Hamada and Sherris.¹² Constraints can nevertheless appear in such an evaluation, as detailed in Pelsser.¹³

Generally speaking, risk measurement is framed by numerous axioms or principles on probability distortions.¹⁴ Thus, distortions are usually suggested as a viewpoint on prudential and risk analysis, following an axiomatic set of constraints characteristic of this field.

When one needs distortions likely to fit data as closely as desired, while maintaining some key properties such as the analytic invertibility of survival functions, one faces quite different constraints. Some authors use distortions to model the temporal evolution of a risk, like mortality. As an example, the article of De Jong and Marshall¹⁵ is based on the evolution of the parameters of Wang's transform, and gives projections of mortality tables. Nevertheless, some properties that seem helpful to us are not satisfied by the transforms they use, such as the ability of a transformation to be iterated in order to get as close as desired to business data.

We now detail these constraints more precisely, aiming in particular at an invertible parametric form for a quantity of interest. The demand for analytic invertibility stems from the pragmatic desire for easy simulation of continuous random variables, conditional on their belonging to a given set. Here, distortions are simple real functions applied to survival functions, and the problem of the composition of such functions is also addressed. Ideally, the result is a representation of a survival function as a composition of several parametric functions. This aim is similar to the idea of a wavelet decomposition of a function: getting a class of functions large enough to generate (here by composition) target functions in several kinds of problems, yet relevant enough to require only a restricted number of parameters. These functions should also preserve some properties that are likely to be helpful in actuarial problems. Here, we present some distortions whose properties seem interesting to us, and which prove efficient in our numerical applications.

Specific constraints

In this paper, we try to restrict the huge set of possible choices for probability distortions by suggesting a set of constraints that are relevant for many actuarial issues. We consider a class of distortions 𝒯, which will be applied to survival functions from a class 𝒮, so that each distorted function is also a survival function:

T(S) ∈ 𝒮 for all T ∈ 𝒯 and S ∈ 𝒮.

The class of distortions 𝒯 consists of a set of real functions T_θ, for some vector of parameters θ∈Θ, Θ⊂ℝ^p, p∈ℕ*:

𝒯 = {T_θ, θ∈Θ}.

We will try to find a distortion with a reduced number of parameters and with an analytic expression that can easily be computed in common programming languages. We set five constraints for these distortions, detailed below.

C1: Invertibility

Since simulation techniques are very commonly used in actuarial work, we require knowledge of the analytic expression of the inverse distortion T_θ⁻¹: this knowledge allows easy simulation of random variables from the distorted law, given that the random variable belongs to a given set. Such a simulation is straightforward when applying the inverse survival function to a uniform random variable on a sub-interval of ]0, 1[, but requires easy computation of the inverse function. The choice of working on survival functions may be explained by the presence, in life or non-life insurance, of conditioning on the considered random variables overshooting a given threshold.
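As an illustration, here is a minimal sketch of such a conditional simulation, assuming an Exp(1) law (an arbitrary choice made only for the example): conditionally on X exceeding a threshold, S(X) is uniform on ]0, S(threshold)[, so applying S⁻¹ to such a uniform draw yields the conditioned variable.

```python
import numpy as np

rng = np.random.default_rng(0)

# Example law, assumed only for illustration: Exp(1), S(x) = exp(-x).
S = lambda x: np.exp(-x)
S_inv = lambda u: -np.log(u)

def sample_above(S, S_inv, threshold, size):
    """Sample X conditional on X > threshold by inversion:
    given the event, S(X) is uniform on ]0, S(threshold)[."""
    u = rng.uniform(0.0, S(threshold), size)
    return S_inv(u)

print(sample_above(S, S_inv, 2.0, 5))  # every draw exceeds 2.0
```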

C2: Stability

Ideally, we try to preserve the intuitive interest of being able to distort a function in one direction or its opposite, by demanding that inverse distortions belong to the same class as the original distortions. This aids symmetry properties, as well as the computer coding of distortions and their inverse functions. Under this constraint, exchanging the target function and the initial one will modify the distortion parameters, but not the expression of the distortion itself. This seems logical in the absence of a priori information on the shape of the target or approximated functions.

C3: Regularity

Being able to interpret the distortion is a pragmatic constraint, as is being able to estimate its parameters. For example, we try to determine the influence of each parameter on some commonly used quantities (expectations, stop-loss premiums and so on) to identify the consequences of fixing minimal or maximal possible values for each parameter. This leads us to establish some constraints on parameters, that is, on the components of the vectors θ∈Θ. To get some quantitative arguments when a parameter varies, and for the sake of clarity, we prefer that the set of parameter values be an open hyperrectangle of ℝ^p. In addition, interpreting the impact of a parameter on the distortion should not require separating the analysis into several cases, and should be logically tractable; this leads us to formulate continuity and differentiability conditions on (θ, u) ↦ T_θ(u).

C4: Convergence

In order to better fit a reference survival function or business data, we set a convergence constraint. Applying distortions iteratively should reduce a specified distance (in the following, the L¹ distance) between any target survival function and any initial survival function: the iterated transformed functions must converge to the target survival function. We suppose that when the initial survival function is identical to the target function, the distortion does not change this function, so that the identity function belongs to the class of considered distortions.

C5: Parameterisation

It is possible to change the parameterisation of a distortion with a bijection φ from the set Θ of all parameters to a new set Θ̃. This way, one can replace a distortion T_θ by T̃_θ̃ with θ̃ = φ(θ). The set of all distortions is then obviously the same, but the meaning of the parameters, the constraints on parameters and the ease of estimation may be modified. We prefer parameterisations in which the parameters of an inverse distortion are a simple direct function of the parameters of the initial distortion. Among these preferred parameterisations, we present a particular class that can be expressed more formally: from axiom C2, there exists a bijection I_T which, for all θ∈Θ, associates a θ′∈Θ such that T_θ⁻¹ = T_θ′, and we present parameterisations leading to

I_T(θ) = D_T θ,

where D_T is a diagonal matrix with diagonal (d₁,…, d_p), d₁,…, d_p ∈ {−1, 1}. We call such a parameterisation a symmetrical parameterisation. When switching to an inverse distortion, the ith parameter is unchanged if d_i = 1; we then call it a position parameter. Its sign will change if d_i = −1; we then call it a distance parameter. The parameterisation is said to be entirely symmetrical when Θ = ℝ^p and T_θ⁻¹ = T_{−θ} for all θ∈Θ. This implies in particular that T₀ = Id. This can facilitate the interpretation of the change of parameters while performing the estimation. Entirely symmetrical parameterisations offer the possibility of suppressing a parameter while keeping inverse distortions in the same class, by simply choosing 0 as the value of the suppressed parameter.

Transformations

Definitions

Our transformations act on the logit scale, which has been shown to be relevant in various contexts. We focus here on distortions of survival functions of real random variables. Let 𝒮 be the set of survival functions of integrable nonnegative real random variables, so that functions S∈𝒮 are cadlag from ℝ to [0, 1], with S(x) = 1 for all x⩽0 and ∫₀^{+∞} S(t)dt < ∞. For S∈𝒮 and f any bijective increasing function from ℝ to ℝ, we denote by T_f the function from [0, 1] to [0, 1] such that

T_f(u) = logit⁻¹(f(logit(u))) for u∈]0, 1[, with T_f(0) = 0 and T_f(1) = 1.

We call f the conversion function of the distortion T_f. The logit function and its inverse, logit(x) = ln(x/(1−x)) and logit⁻¹(x) = 1/(1+e^{−x}), respectively, are used here in a very classical way, so that for any f the distortion takes values in [0, 1]. This choice is not crucial, since the survival function distortion mainly relies on f. The main advantage of the logit function is the simple analytic expression of its inverse. It can be evaluated rapidly, as the exponential and logarithm functions are directly computable by the arithmetic coprocessor of modern computers. Note that any distribution function Φ could have been chosen instead. One can easily switch from one setting to the other by modifying the conversion function: logit⁻¹(f(logit(u))) = Φ(g(Φ⁻¹(u))), with g(x) = Φ⁻¹(logit⁻¹(f(logit(Φ(x))))). In particular, the Wang transform¹ is recovered by letting Φ be the Gaussian distribution function and g(x) = x+λ, λ∈ℝ.
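The following minimal sketch (with an arbitrary λ and arbitrary test points) implements T_f on the logit scale and checks numerically that the conversion function equivalent to the Wang transform reproduces Φ(Φ⁻¹(u)+λ):

```python
import numpy as np
from scipy.stats import norm

def logit(x):
    return np.log(x / (1.0 - x))

def logit_inv(y):
    return 1.0 / (1.0 + np.exp(-y))

def T(f, u):
    """Distortion T_f(u) = logit^{-1}(f(logit(u))) on ]0, 1[."""
    return logit_inv(f(logit(u)))

def wang_conversion(lam):
    """Conversion function f equivalent to the Wang transform:
    f = logit o Phi o (x + lam) o Phi^{-1} o logit^{-1}."""
    return lambda x: logit(norm.cdf(norm.ppf(logit_inv(x)) + lam))

u = np.array([0.1, 0.5, 0.9])
print(T(wang_conversion(0.5), u))
print(norm.cdf(norm.ppf(u) + 0.5))  # same values
```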

Setting 𝕋_f(S)(x) = T_f(S(x)) for all x∈ℝ, one gets 𝕋_g ∘ 𝕋_f = 𝕋_{g∘f} and 𝕋_f⁻¹ = 𝕋_{f⁻¹}.

Impact on random variables

Let X and X̂ be real random variables with respective survival functions S and Ŝ = 𝕋_f(S). In this section, we observe how some characteristics of X are modified by the distortion.

Proposition 1 (From X to X̂)

  • Let S be an invertible survival function. Then X̂ has the same law as (S⁻¹ ∘ T_{f⁻¹} ∘ S)(X).

The proof is straightforward. This depiction of the distortion gives a direct link with Accelerated Failure Time models.¹⁶

Proposition 2 (Hazard rate)

  • Let μ(t) and μ̂(t) denote the respective hazard rates of a random variable and of its transform: μ(t) = −S′(t)/S(t) and μ̂(t) = −Ŝ′(t)/Ŝ(t). Then, when t → ∞,

μ̂(t) ∼ f′(logit(S(t))) μ(t).

When f has an asymptotic direction, f′(t) → a as t → −∞, the hazard rate is asymptotically multiplied by a.

Proof One computes μ̂(t) = ((1−Ŝ(t))/(1−S(t))) f′(logit(S(t))) μ(t); when t → ∞, S(t) and Ŝ(t) tend to 0, leading to the result. □

Proposition 3 (Regular variations)

  • Let Z₀*(x) = E[(X−x)₊] be the average charge of a stop-loss reinsurance treaty with priority x, and Ẑ₀*(x) the same quantity for X̂. Suppose S is regularly varying with exponent ρ⩽0, that is, S(tx)/S(t) → x^ρ when t → +∞ for all x>0, and f has an asymptote with slope a at −∞, that is, f(x)−ax → b∈ℝ when x → −∞. Then Ŝ is regularly varying with exponent ρ̂ = aρ and, when ρ+1<0 and aρ+1<0, as x → +∞,

Ẑ₀*(x)/Z₀*(x) ∼ ((ρ+1)/(aρ+1)) (Ŝ(x)/S(x)).

Proof

  • Write Z_p*(x) = ∫ₓ^{+∞} t^p S(t)dt = E[(X^{p+1} − x^{p+1})₊]/(p+1) (for p such that the integral converges). When S is regularly varying, theorem 1 (p. 281) of Feller's book¹⁷ provides the following equivalences when x → +∞, for ρ+p+1<0 and ρ̂+p+1<0: −(ρ+p+1) Z_p*(x) ∼ x^{p+1} S(x) and −(ρ̂+p+1) Ẑ_p*(x) ∼ x^{p+1} Ŝ(x). □

Conversion functions

Affine functions

These functions are defined by two parameters, p>0 and m:

D_{p,m}(x) = p(x − m) + m.

See Figure 1 for an illustration of the function and its parameters. These functions are obviously invertible, with (D_{p,m})⁻¹ = D_{1/p,m}. Parameter p is the slope, and m the threshold for which D_{p,m}(m) = m, separating the areas where the distorted survival function is increased or not. One can remark that for these functions the induced distortion corresponds to the Brass model.¹⁸

Figure 1: Affine and angle functions.

Choosing parameters ρ = ln p and m leads to one distance parameter and one position parameter (see axiom C5). Choosing h = m(1−p)/(1+p) instead of m leads to an entirely symmetrical parameterisation: for h∈ℝ and ρ∈ℝ, D̄_{ρ,h}(x) = e^ρ(x+h)+h and D̄_{ρ,h}⁻¹ = D̄_{−ρ,−h}. Here ρ is the logarithmic slope and h the height of the intersection with the diagonal y = −x.

Angle functions

See Figure 1 for an illustration of the function and its parameters. Angle functions have four parameters: the apex position (x₀, y₀), and two slopes p₁>0 and p₂>0. They can be written as:

A_{x₀,y₀,p₁,p₂}(x) = y₀ + p₁ min(x−x₀, 0) + p₂ max(x−x₀, 0).

These functions are bijective, with A_{x₀,y₀,p₁,p₂}⁻¹ = A_{y₀,x₀,1/p₁,1/p₂}.

Replacing (x₀, y₀) by (m, h₁) = ((x₀+y₀)/2, (y₀−x₀)/2), m becomes a position parameter and h₁ a distance parameter. Next, replace m by h₂ = sm, where s is a distance parameter, say s = sign(p₁−p₂), so that the angle symmetry is preserved. This leads to an entirely symmetrical parameterisation Ā_{ρ₁,ρ₂,h₁,h₂}, where p₁ = e^{ρ₁}, p₂ = e^{ρ₂}, x₀ = sh₂−h₁, y₀ = sh₂+h₁, and s = sign(ρ₁−ρ₂), for which Ā_{ρ₁,ρ₂,h₁,h₂}⁻¹ = Ā_{−ρ₁,−ρ₂,−h₁,−h₂} and Ā_{0,0,0,0} = Id. In the coordinate system (O, u⃗, v⃗), where u⃗ = (1, 1) and v⃗ = (−1, 1), h₁ is a measure of the vertical position of the apex, and h₂ of its horizontal position.
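A minimal sketch of the angle function and its entirely symmetrical parameterisation (the numeric parameter values are arbitrary test inputs); it checks the inversion rule Ā_{ρ₁,ρ₂,h₁,h₂}⁻¹ = Ā_{−ρ₁,−ρ₂,−h₁,−h₂} numerically:

```python
import numpy as np

def angle(x, x0, y0, p1, p2):
    """A_{x0,y0,p1,p2}: slope p1 to the left of the apex, p2 to the right."""
    d = np.asarray(x, dtype=float) - x0
    return y0 + np.where(d < 0.0, p1 * d, p2 * d)

def angle_bar(x, r1, r2, h1, h2):
    """Entirely symmetrical parameterisation: p_i = exp(r_i) and apex
    (s*h2 - h1, s*h2 + h1), with s = sign(r1 - r2)."""
    s = np.sign(r1 - r2)
    return angle(x, s * h2 - h1, s * h2 + h1, np.exp(r1), np.exp(r2))

x = np.linspace(-5.0, 5.0, 11)
y = angle_bar(x, 0.7, -0.3, 1.0, 2.0)
print(np.allclose(angle_bar(y, -0.7, 0.3, -1.0, -2.0), x))  # True
```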

Hyperbolic functions

Hyperbolic functions are smooth versions of the angle functions; see Figure 2 for an illustration of the function and its parameters. They can be defined using five parameters, the apex position (x₀, y₀), the asymptote slopes p₁, p₂, and a smoothing parameter ɛ, for instance as:

H_{x₀,y₀,p₁,p₂,ɛ}(x) = y₀ + ((p₁+p₂)/2)(x−x₀) + sign(p₂−p₁) √( ((p₂−p₁)/2)²(x−x₀)² + ɛ² ),

with the convention sign(0) = 0. As expected, H_{x₀,y₀,p₁,p₂,0} = A_{x₀,y₀,p₁,p₂}. One can also use an entirely symmetrical parameterisation H̄_{ρ₁,ρ₂,h₁,h₂,e}, with p₁ = e^{ρ₁}, p₂ = e^{ρ₂}, x₀ = sh₂−h₁, y₀ = sh₂+h₁, ɛ = se, where s = sign(ρ₁−ρ₂). We get H̄_{ρ₁,ρ₂,h₁,h₂,e}⁻¹ = H̄_{−ρ₁,−ρ₂,−h₁,−h₂,−e} and H̄_{0,0,0,0,0} = Id.

Figure 2: Hyperbolic function.
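A small sketch of this smoothing, using the closed form given above (one plausible expression; the exact normalisation of ɛ may differ from the original formulation). With ɛ = 0 it recovers the angle function exactly:

```python
import numpy as np

def hyperbola(x, x0, y0, p1, p2, eps):
    """Smoothed angle: same asymptotes as A_{x0,y0,p1,p2}, apex rounded
    over a scale eps (sign(0) = 0 handles the degenerate case p1 = p2)."""
    d = np.asarray(x, dtype=float) - x0
    beta = 0.5 * (p2 - p1)
    return y0 + 0.5 * (p1 + p2) * d + np.sign(beta) * np.sqrt(beta**2 * d**2 + eps**2)

x = np.linspace(-10.0, 10.0, 5)
print(hyperbola(x, 0.0, 0.0, 1.0, 2.0, 0.0))  # eps = 0: the angle itself
print(hyperbola(x, 0.0, 0.0, 1.0, 2.0, 1.0))  # smooth version
```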

Angle composition

It may be useful to employ composite functions as one conversion function:

  • The composition of several conversion functions may make some parameters useless. As an example, the composition of n angle functions is entirely characterised by 2n+2 parameters, which is less than n times the four parameters of an angle.

  • Particular knowledge (e.g. a known asymptotic direction y=x if the transformation is to be local) may simplify the expression of the composite function and reduce the number of parameters.

  • Parameter meaning may be clearer with the composite function.

In order to better manage the successive composition of functions, it may be interesting to write a composition of n angles as a composition of one angle with four parameters and n−1 angles of two parameters, which gives the 2n+2 degrees of freedom of the global composition.

Let us simply denote by A₄ an angle with four parameters, and by A₂ an angle with two parameters (of kind A_{x₀,x₀,1,p}, x₀∈ℝ, p>0). We are interested in the form of a composition A₄∘A₄′∘⋯∘A₄‴.

Theorem 1

  • Any composition of n angles can be reduced to a composition of one angle with four parameters and n−1 angles with two parameters of kind A_{x₀,x₀,1,p}, whatever the position of the angle with four parameters. In particular, any composition of angles can be written in the form

A₂^{(n−1)} ∘ A₄,

    where A_p^{(k)} is the composition of k angles with p parameters, p∈{2, 4}, and A_p^{(0)} = Id. All A₂ denote angles with two parameters, of the kind A_{x₀,x₀,1,p}, x₀∈ℝ, p>0, with their apex on the diagonal y=x and their first slope equal to 1.

Proof This derives from the fact that every composition of two angles can be written as a composition of two angles with two and four parameters, A₄∘A₄′ = A₄″∘A₂ = A₂′∘A₄‴, where all A₂ are angles with two parameters, of kind A_{x₀,x₀,1,p}, x₀∈ℝ, p>0.  □

The 2n+2 parameters that are necessary to characterise the composite function can be decomposed as 4+2(n−1), and no parameter is useless. The choice of the parameterisation, which was a simple preference, is important: if the inverse function of a two-parameter angle did not belong to the same class of functions, one could not establish the previous result without imposing that A₄ be in the last position.

Shift functions

See Figure 3 for an illustration. Shift functions have a first increasing then decreasing derivative, in order to adjust hazard rates locally, through Proposition 2. They are defined as a smoothed version of a composite of two angles, A_{x₁,y₁,1,p′} ∘ A_{x₀,y₀,1,p}. Moreover, the asymptotic directions are chosen to be equal to one at +∞ and −∞, so that p′ = 1/p. Smoothing each angle as in the hyperbolic case finally yields the Shift functions Z_{m,h,ρ,ɛ}.

Figure 3: Shift function Z_{m,h,ρ,ɛ}, and bump function (here 0⩽γ<ρ and h>0).

For h=0 we get the identity function, and h can be seen as the distance between the two asymptotes. Parameter m localises the centre, and ρ represents the speed of the shift from one asymptote to the other. Shift functions act in the same way as Wang's transform,¹ with a smooth transition between two levels. They may be useful in a non-life insurance context, when only the distribution tail has to be changed.

Bump functions

These functions are smooth versions of a composite of three A₂ angles, with fixed asymptotes of equation y=x at −∞ and +∞, so that the adjustment is local; see Figure 3 for an illustration. Without smoothing, these Bumps correspond to A_{x₀′,x₀′,1,p₂′} ∘ A_{x₀″,x₀″,1,p₂″} ∘ A_{x₀,x₀,1,p₂}, with p₂′ = 1/(p₂p₂″) and x₀′ = (x₀″(p₂″−1)+x₀p₂″(p₂−1))/(p₂p₂″−1). Changing the parameterisation, we define a smooth version B_{m,h,ρ,γ,ɛ} (for ρ≠0) by smoothing each angle as in the hyperbolic case.

When ρ≠0, B_{m,h,ρ,γ,ɛ}⁻¹ = B_{m,−h,−ρ,γ,ɛ}; the degenerate case ρ=0 corresponds to the identity function. Parameter m represents the horizontal position of the Bump, and h its height. With slopes p₂ and p₂″ acting on the left-hand and right-hand sides of the Bump, ρ can be seen as the speed of return to the asymptote, and γ as an asymmetry coefficient.

Hyperbolic composite functions

In some situations where increasing the number of parameters is needed for better accuracy, we compose several hyperbolic functions. By Theorem 1, we compose smooth versions of one four-parameter angle and n−1 two-parameter angles. Choosing the same smoothing parameter ɛ for all these functions leads to conversion functions of the form

G^{(n)} = H₂ ∘ ⋯ ∘ H₂ ∘ H₅,

where H₅ is the smoothed version of a four-parameter angle and each H₂ the smoothed version of a two-parameter angle.

From Theorem 1, all increasing continuous piecewise linear functions with n+1 vertices can be written this way (letting ɛ=0), with any possible position within the composition for the five-parameter hyperbola. We will see in the sub-section “Initialisation values” that initialisation parameters are easy to obtain. These functions are very well adapted to the determination of a monotone, analytically invertible parametric function corresponding to a particular data set. This kind of situation usually occurs when one needs to sample from a smoothed empirical distribution.
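A minimal sketch of such a composed conversion function (all numeric parameters are arbitrary): it builds f = H₂∘H₂∘H₅ from the hyperbola of the previous sketch, with a shared smoothing parameter.

```python
import numpy as np
from functools import reduce

def hyperbola(x, x0, y0, p1, p2, eps):
    d = np.asarray(x, dtype=float) - x0
    beta = 0.5 * (p2 - p1)
    return y0 + 0.5 * (p1 + p2) * d + np.sign(beta) * np.sqrt(beta**2 * d**2 + eps**2)

def h2(x0, rho, eps):
    """Smoothed two-parameter angle A_{x0,x0,1,p}, p = exp(rho)."""
    return lambda x: hyperbola(x, x0, x0, 1.0, np.exp(rho), eps)

def h5(x0, y0, r1, r2, eps):
    """Smoothed four-parameter angle, slopes exp(r1), exp(r2)."""
    return lambda x: hyperbola(x, x0, y0, np.exp(r1), np.exp(r2), eps)

def compose(*fs):
    """compose(f, g, h)(x) = f(g(h(x)))."""
    return reduce(lambda f, g: (lambda x: f(g(x))), fs)

f = compose(h2(2.0, -0.4, 0.5), h2(-1.0, 0.3, 0.5), h5(0.0, 0.5, 0.2, -0.1, 0.5))
print(f(np.linspace(-3.0, 3.0, 7)))
```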

Estimation and convergence of iterative adjustment

Estimation methodology

Here, we aim at transforming a survival function S₀ in order to get it close to another survival function S∈𝒮. We consider for this purpose the distance D on 𝒮, defined by

D(S, S′) = ∫₀^{+∞} ∣S(t) − S′(t)∣ dt.

This distance is finite for every couple of elements of 𝒮, due to the integrability hypothesis on the elements of 𝒮.

Remark 1

  • Let X and X′ be two nonnegative random variables with respective survival functions S and S′. Then ∣EX′ − EX∣ ⩽ D(S, S′). Thus, the distance D makes it possible to control the difference between the expectations of the two random variables.

Restricting ourselves to the family of transformations with conversion functions (f_θ)_θ parameterised by the vector θ, one gets the estimation problem:

θ* = argmin_{θ∈Θ} D(𝕋_{f_θ}(S₀), S).

One can iterate this process, defining a sequence of survival functions (S_n)_n:

S_{n+1} = 𝕋_{f_{θ*_{n+1}}}(S_n), with θ*_{n+1} = argmin_{θ∈Θ} D(𝕋_{f_θ}(S_n), S).
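A minimal sketch of this iterative estimation, assuming a grid approximation of D and a generic parametric family make_f (for instance the composed hyperbolas above); the Nelder-Mead optimiser is an arbitrary choice made for the example:

```python
import numpy as np
from scipy.optimize import minimize

logit = lambda x: np.log(x / (1.0 - x))
logit_inv = lambda y: 1.0 / (1.0 + np.exp(-y))

t = np.linspace(0.0, 110.0, 1101)  # age grid

def D(S1, S2):
    """Grid approximation of D(S, S') = integral of |S - S'|."""
    return np.trapz(np.abs(S1 - S2), t)

def distort(f, S_vals):
    s = np.clip(S_vals, 1e-12, 1.0 - 1e-12)
    return logit_inv(f(logit(s)))

def iterate(S0_vals, S_target_vals, make_f, theta0, n_steps=3):
    """S_{n+1} = T_{f_theta*}(S_n), with theta* minimising D(T_f(S_n), S)."""
    S_n = np.array(S0_vals, dtype=float)
    for _ in range(n_steps):
        res = minimize(lambda th: D(distort(make_f(th), S_n), S_target_vals),
                       theta0, method="Nelder-Mead")
        S_n = distort(make_f(res.x), S_n)
    return S_n
```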

Convergence

We try here to check that constraint C4 holds for most of the conversion functions we have suggested. We write, for S∈𝒮, M(S) = {x∈ℝ, S(x)∈]0, 1[}. The set M(S) can be seen as a support interval for the survival function S, or for its underlying random variable. The following theorem shows that any suitable initial survival function S₀ can be iteratively distorted so that the resulting survival function converges to S. The suitability conditions on S₀ depend only on its natural support and on its strict monotonicity.

Theorem 2 (Convergence to any target)

  • Let S∈𝒮 be a given target survival function. Suppose that S₀ is such that M(S)⊂M(S₀) and S₀ is strictly decreasing on M(S₀). Then, for the families (A_θ)_θ, (H_θ)_θ, and families built by composition from these ones, like (G_θ)_θ,

lim_{n→∞} D(S_n, S) = 0.

In particular, any strictly decreasing S₀ is suitable for any target S∈𝒮.

Proof Let a, b∈ℝ₊ be such that a<b. Consider ɛ>0. Let us first prove that there exist n∈ℕ* and a finite sequence (t_i)_{0⩽i⩽n} on [a, b] such that for any survival function S′∈𝒮, if S′ coincides with S at all the t_i, then

∫_a^b ∣S′(t) − S(t)∣ dt ⩽ ɛ.

Let N be an integer such that (2+b−a)/N < ɛ. We build the sequence (t_i)_i by induction, together with a subset J of ℕ: one first sets t₀ = b and J = ∅, then:

  • if S(t_i⁻) ⩾ S(t_i)+1/N, then we add the index i to J, and set t_{i+1} = max(t_i − 1/N², a),

  • else, we set t_{i+1} = max(inf{t, S(t) ⩽ S(t_i)+1/N}, a), so that (if t_{i+1} > a) S(t_{i+1}) ⩽ S(t_i)+1/N and S(t_{i+1}⁻) ⩾ S(t_i)+1/N.

We stop this induction as soon as some t_i reaches a, and we denote by n this final subscript. This way, the sequence (t_i)_i is strictly decreasing, and for all i<n−2, S(t_{i+2}) ⩾ S(t_i)+1/N. The sequence is thus finite, with length at most 2N.

Let S′∈𝒮 be such that, for all 0⩽i⩽n, S′(t_i) = S(t_i). Since S′ and S are decreasing, on each interval (t_{i+1}, t_i) with i∉J both functions lie between S(t_i) and S(t_i)+1/N, while the intervals indexed by J have total length at most card(J)/N² ⩽ 2/N, so that

∫_a^b ∣S′(t) − S(t)∣ dt ⩽ (b−a)/N + 2/N = (2+b−a)/N < ɛ.

To end the proof, the key point is that any piecewise affine function from ℝ to ℝ with a finite number of apices can be seen as a composition of angle functions. To approach S, S₀ is then distorted so as to coincide with S at the finitely many points t_i constructed above.  □

Initialisation values

We have suggested using some particular conversion functions: angle compositions, or smoothed versions of them like compositions of hyperbolas. When estimating the parameters of these compositions, it is necessary to start from a good initial value. It is possible to proceed in several ways:

  • The initial parameter vector may correspond to the identity conversion function if the initial survival function is not too far from the target one. Nevertheless, this choice may lead many optimisation algorithms to a solution far from the optimal one.

  • When composing several functions, it might be easier to estimate optimal parameters separately for each distortion, which may provide initial values for the aggregate parameters of the composite function. This choice, however, has to cope with the case where two antagonistic effects compensate each other, for example when a first conversion function creates a distance from the target in order to ease the adjustment of a second conversion function.

  • The simultaneous adjustment of all parameters of the composite function is the solution likely to give the best results, provided the initial parameter vector is not too far from the optimal one. We try here to suggest initial values that lead to a correct approximation of the target function.

In this sub-section, we consider conversion functions that are compositions of hyperbolic functions. The smoothing parameter does not seem to be the hardest to estimate, and we therefore focus on the estimation of angle composite functions, with angles defined by two or four parameters.

We suppose that we start from a finite set of abscissas (x_i)_{i∈I}, with I={1,…, p}, for which are given the logit of the target survival function, l_i = logit S(x_i), as well as the logit of the current survival function to be distorted, α_i = logit Ŝ(x_i). This scatter plot is a finite part of a curve that we write l(α). We are looking for f_θ such that the points {(l_i, f_θ(α_i))}_{i∈I} are as close as possible to the first diagonal Δ, defined by the equation y=x. One possibility is to look for a function f_θ that associates l_i to α_i for i∈I_k, I_k⊂I: for those points, the (l_i, f_θ(α_i)) are guaranteed to lie on Δ. Is it possible to find an angle composite function that reaches all the points (α_i, l_i), i∈I_k? It is relatively simple, thanks to the following proposition.

Proposition 4

  • Consider a set of successive points {(u_i, v_i)}_{i∈{1,…,3+k}} of an increasing curve, with u₁⩽⋯⩽u_{3+k} and v₁⩽⋯⩽v_{3+k}, k⩾0. The angle composite functions

G_θ^{(0)} = A_{x₀,y₀,p₁,p₂}

and

G_θ^{(k)} = A_{v_{2+k},v_{2+k},1,q_k} ∘ G_θ^{(k−1)}, k⩾1,

are such that G_θ^{(k)}(u_i) = v_i for all i∈{1,…, 3+k}, setting x₀ = u₂, y₀ = v₂, p₁ = (v₂−v₁)/(u₂−u₁), p₂ = (v₃−v₂)/(u₃−u₂),

and

q_k = (v_{3+k} − v_{2+k}) / (G_θ^{(k−1)}(u_{3+k}) − v_{2+k}).

Proof One easily checks that G_θ^{(0)}(u_i) = v_i for i∈{1, 2, 3}, and that A_{v_{2+k},v_{2+k},1,q_k}(x) = x for all x ⩽ v_{2+k}. One then checks by induction that for all i⩽3+k, G_θ^{(k)}(u_i) = v_i.  □

Consequently, we use the following process for the initialisation of the parameter vector: a subset (u_i, v_i) is extracted from the set of points (α_i, l_i), taking care to choose points as far as possible from each other. We take, for example, the minimum abscissa point, the maximum abscissa point, and k+1 intermediate points regularly spaced with respect to an increasing function L (i.e. defining k+2 intermediate intervals).

The function L can be chosen as L(i) = ∣l_i − l₁∣, or any other increasing function. By Proposition 4, we deduce initialisation values for all the previously presented compositions A₂∘…∘A₂∘A₄. By choosing a small positive value for the smoothing parameter ɛ, for example starting from ɛ=1 in our applications, we obtain initialisation values for conversion functions of kind H₂∘…∘H₂∘H₅. The choice of an initialisation value ɛ≠0 can be explained by continuous differentiability conditions, which ease the convergence of the main optimisation algorithms.
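A minimal sketch of this initialisation, following the recursion of Proposition 4 (the interpolation points are made-up values): it builds the angle composite through the chosen points and checks that it reaches them all.

```python
import numpy as np

def angle(x, x0, y0, p1, p2):
    d = np.asarray(x, dtype=float) - x0
    return y0 + np.where(d < 0.0, p1 * d, p2 * d)

def init_composition(u, v):
    """Angle composite G with G(u_i) = v_i (u, v strictly increasing)."""
    p1 = (v[1] - v[0]) / (u[1] - u[0])
    p2 = (v[2] - v[1]) / (u[2] - u[1])
    G = lambda x: angle(x, u[1], v[1], p1, p2)        # the four-parameter angle
    for i in range(3, len(u)):
        c = v[i - 1]                                   # apex on the diagonal
        q = (v[i] - c) / (G(u[i]) - c)                 # slope fixing point i
        G = (lambda Gp, c, q: lambda x: angle(Gp(x), c, c, 1.0, q))(G, c, q)
    return G

u = np.array([0.0, 1.0, 2.0, 3.5, 5.0])
v = np.array([0.0, 0.8, 2.5, 4.0, 7.0])
G = init_composition(u, v)
print(np.allclose(G(u), v))  # True
```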

Numerical applications

We have chosen here to present applications to survival data analysis. We use in this section some mortality tables from the Human Mortality Database website.¹⁹ These tables are given by death year, for the United States and France (tables with an age bracket of one year and a death year bracket of one year, denoted 1 × 1, for the whole population, men and women). We call these two tables, respectively, U.S. and France.

In the discrete case, it is necessary to provide a distance measure between the target function and its adjustment, adapted to the discrete character of the problem. We have here a set of points that correspond to the values of the survival function at step n, s_i^n = S_n(x_i), and the set of values of the target survival function, s_i = S(x_i), for different points x_i, i∈{1,…, p}. We measure the quality of the adjustment at step n with the following quality index:

I_Q^n = −log₁₀(p⁻¹ Σ_{i=1}^p ∣s_i^n − s_i∣).
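In code, the quality index is simply (a sketch with made-up values):

```python
import numpy as np

def quality_index(s_fit, s_target):
    """I_Q = -log10 of the mean absolute gap between the two sets of values."""
    return -np.log10(np.mean(np.abs(s_fit - s_target)))

print(quality_index(np.array([0.99, 0.50, 0.01]),
                    np.array([0.98, 0.51, 0.012])))  # about 2.1
```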

Catastrophic event modelling

For this application, we distort the French table for the death year 1913, in order to get closer to the French table for the death year 1915, in the very first year of the First World War. The distortion we get aims at showing the ability of conversion functions to model a catastrophic change in a given table, even when this change affects the survival probabilities differently at different ages, and concerns young adults in particular. Survival functions correspond to a product of annual survival probabilities, as if mortality always remained in accordance with that of the considered year, 1913 or 1915. The applied model is:

Ŝ₁₉₁₅ = 𝕋_{f_θ}(S₁₉₁₃).

With four parameters, we found a quality index close to two with a symmetric bump, and a little better with more parameters. We suggest here adjustments that minimise the distance between the target and the adjusted distribution function. This choice could lead to differences, for example, in annual death probabilities. Depending on the further use of the distortion, it may be preferable to choose other optimisation criteria; we do not develop such criteria here.

We have used an age bracket from 0 to 104 years (taking a wider bracket would artificially increase the quality index), and the results are given in Table 1. For adjustments with fewer than five parameters, it seems better here to use bump functions, which benefit from the fact that the table is not deeply modified at higher ages. This last bump function is illustrated in Figure 4. The quality index with this last five-parameter function is better than the one obtained with seven-parameter composed hyperbolic distortions. Thanks to the knowledge of particular properties of the conversion function, we were able to save two parameters. To improve precision even more, we need to use composed hyperbolic distortions.

Table 1: Adjustment of the survival function for death year 1915 with distortions of the survival function for death year 1913
Figure 4: Adjustment of the 1915 survival function by distortion of the 1913 survival function with an asymmetrical bump conversion function (𝕋_f S₁₉₁₃ dashed, I_Q ≃ 2.25), left, and the corresponding distortions T_f using a symmetric bump (thin dotted line), an asymmetric bump (bold dashed line) and a composed hyperbolic distortion, right.

Figure 4 shows the starting and ending curves (left) and the distortion functions (right). As appears in Table 1, the composed hyperbolic distortion H₂∘H₂∘H₅ is the best distortion to apply. One can see in Figure 4 (right) the symmetry constraints, which give the symmetric bump its special shape, and the need for an asymmetry coefficient, which makes the asymmetric bump close to the H₂∘H₂∘H₅ distortion.

Prospective model, dynamic distortion

A second model consists in representing each table with a hyperbolic or composed hyperbolic distortion, where all parameters evolve with time. The model is the following one:

Ŝ_t = 𝕋_{f_{θ(t)}}(S₀),

where θ(t) depends on t, and S₀ is the survival function of an exponential law with parameter 1. In order to get a quality index close to three, we use conversion functions of kind H₂∘H₅. Indeed, this function gave good results over a single year.

For the evolution of the parameter θ(t), we make the simple choice of a linear evolution. One may notice that, due to this choice, the results could depend on the chosen parameterisation. In particular, a linear evolution of a slope p would not have the same effect as a linear evolution of the logarithm ρ of this slope.

Parameters are given with respect to a reference year. For tables available for death years 1975–2005, we take the middle of the bracket as the reference year, that is, t₀ = 1990. The chosen age bracket includes all available ages, from 0 to 110 years. The parameters of the function are supposed to evolve linearly with the considered death year:

θ(t) = θ₀ + tθ₁,

where t represents the difference between the considered death year and the reference year t₀.

The results of these dynamic distortions are given in Table 2. One might fear a greater instability of the quality index, but here no death year leads to an index less than 2.4: even for the worst adjustment, the curves of the distorted and target survival functions are almost identical. Most adjustments have a quality index greater than 2.6, which represents an error of order 2 × 10⁻³ on survival functions.

Table 2: Simultaneous adjustments of survival functions, for death years 1975–2005, by dynamic distortions of an exponential law of parameter 1, with linear evolution of the distortion parameters

Lastly, considering narrower age brackets (here from 0 to 110 years) or a narrower death year bracket (here from 1975 to 2005) would lead, as previously, to an appreciable improvement in the quality index. The number of parameters is here equal to 14 (two linear-evolution coefficients for each of the seven parameters of H₂∘H₅), and some parameters are of reduced utility. This number is not so big when compared with the quantity of data, and some parameters remain very stable across death years. The study of which parameters should be kept is not developed here, since the aim of this paragraph is just to show the ability of some conversion functions to adapt themselves to a prospective framework. We could have suggested another model, representing all the tables of the 1975–2005 period by a distortion of the table of a reference year, for example 1990:

Ŝ_t = 𝕋_{f_{θ(t)}}(S₁₉₉₀).

It seems far easier to adapt a mortality table from year 1990 than an exponential law of parameter 1, which is deeply unsuited to human mortality. As an illustration, the quality index we get by adjusting the 2005 death year table from the 1990 one, for a hyperbolic conversion function of kind H₂∘H₅, is 3.13 for the French tables, and 3.32 for the American tables. The improvement in the quality index is only close to 0.1 or 0.2, compared with the distortion of an exponential law. We decided here to keep a continuous parametric expression, with an easy analytic inverse function.

Stochastic simulations

Let us first suppose that one can easily compute the inverse function of the initial survival function S of X. The invertibility constraint on distortions allows easy simulation (by the inversion method) of the law of the distorted variable X̂: if U is a random variable with a uniform law on ]0, 1[,

X̂ = Ŝ⁻¹(U) = S⁻¹(T_{f⁻¹}(U)) has survival function Ŝ = 𝕋_f(S).

When the conversion function depends on an explanatory parameter vector or evolves with the passage of time (as in the sub-section “Prospective model, dynamic distortion”), this method allows the simulation of residual lifetimes in accordance with a mortality table depending on one or several parameters, like a prospective mortality table. As an example, let us denote by f_t a conversion function to be applied to a reference survival function S to model the law of the residual lifetime of someone born at time t. For someone with this birth date aged u, a random sample of the residual lifetime X_u^t can be obtained from a random variable V generated from a uniform distribution between 0 and 1, by:

X_u^t = Ŝ_t⁻¹(V Ŝ_t(u)) − u, with Ŝ_t = 𝕋_{f_t}(S).
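A minimal sketch of this conditional simulation, assuming an Exp(1) reference law S and a purely illustrative affine conversion function f_t whose slope drifts with the birth date t (both are hypothetical choices, not a fitted model):

```python
import numpy as np

logit = lambda x: np.log(x / (1.0 - x))
logit_inv = lambda y: 1.0 / (1.0 + np.exp(-y))

S = lambda x: np.exp(-x)          # illustrative reference law: Exp(1)
S_inv = lambda u: -np.log(u)

def slope(t):                     # hypothetical drift of the conversion slope
    return 1.0 + 0.01 * (t - 1990.0)

def S_hat(t, x):                  # distorted survival function, f_t(x) = p(t) x
    return logit_inv(slope(t) * logit(S(x)))

def S_hat_inv(t, u):              # its analytic inverse
    return S_inv(logit_inv(logit(u) / slope(t)))

def residual_lifetime(t, u, size, rng=np.random.default_rng(1)):
    """X_u^t = S_hat_t^{-1}(V * S_hat_t(u)) - u, with V uniform on ]0, 1[."""
    v = rng.uniform(0.0, 1.0, size)
    return S_hat_inv(t, v * S_hat(t, u)) - u

print(residual_lifetime(2000.0, u=0.5, size=5))  # nonnegative samples
```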

Conclusion

Starting from a given initial survival function, iterative distortions make it possible to converge to any target survival function. This is of great importance when looking for a parametric representation of the distribution of a random variable, especially when the distortion parameters change over time. We have proposed readily invertible distortions that ease the simulation of the distorted distributions. Finding the best number of parameters, and investigating the risk-measure properties of the distorted random variable, are natural perspectives of this work.