The CashValue_ME Model¶
Overview¶
The CashValue_ME
model is a faster reimplementation of
the CashValue_SE
model.
The CashValue_ME
model reproduces the same results as
CashValue_SE
, but is more suitable for
processing a large number of model points.
Each formula to be applied to all the model points
operates on the entire set of model points at once
with the help of Numpy and Pandas.
The default product specs, assumptions and input data
are the same as CashValue_SE
.
Changes from CashValue_SE¶
Below is the list of
Cells and References that are newly added or
updated from CashValue_SE
.
max_proj_len
<new>mort_table_reindexed()
<new>
Running with 10000 model points¶
The main advantage of the CashValue_ME
model over the
CashValue_SE
model is its speed.
By default, CashValue_ME
is configured
to run the same 4 model points as the ones in CashValue_SE
,
but a larger table of 10000 model points is also included in the model.
The larger model point table is saved in the model folder
as an Excel file named model_point_10000.xlsx,
and this table is read into the model as
a DataFrame named model_point_10000
.
The 10000 model points are all new business at time 0, and
created by modifying the sample model points in basiclife
.
To run the model with the larger model point table,
assign the table to model_point_table
:
>>> Projection.model_point_table = Projection.model_point_10000
In the code above, Projection
must be defined beforehand to
refer to the Projection
space.
Below is the speed result of running the entire 10000 model points on a consumer PC equipped with Intel Core i5-6500T CPU and 16GB RAM.
>>> timeit.timeit("Projection.result_pv()", globals=globals(), number=1)
34.3045132
>>> Projection.result_pv()
Premiums Death ... Change in AV Net Cashflow
policy_id ...
1 5.349200e+07 4.852612e+05 ... 4.465141e+06 1.881696e+06
2 4.446054e+07 1.356468e+07 ... 1.604824e+07 1.177058e+07
3 1.210841e+08 7.179706e+07 ... 2.796228e+07 2.192224e+07
4 3.038400e+07 2.938897e+05 ... 4.840159e+06 4.346383e+06
5 5.989500e+07 3.342985e+05 ... 7.238519e+06 7.671440e+06
... ... ... ... ...
9996 2.067500e+07 5.335816e+05 ... 3.627718e+06 1.951471e+06
9997 6.690600e+07 4.291283e+05 ... 8.984202e+06 4.590207e+06
9998 7.629662e+06 4.007463e+06 ... 1.982635e+06 1.441142e+06
9999 2.835552e+06 1.258307e+06 ... 8.583928e+05 4.917449e+05
10000 1.513202e+07 3.834462e+06 ... 6.555234e+06 3.667102e+06
[10000 rows x 9 columns]
The above run projects all model points for the max length of the entire model points:
>>> Projection.max_proj_len()
1141
Since product A and B are limited term up to 20 years
and C and D are whole life,
it may be more efficient to run the limited term and whole life model
points separately,
because the limited term model points don’t need as long the length
of projection period as C and D model points.
You can do so by defining, for example, seg_id
to
filter model points in the formula of model_point()
.
The code below is an example of the modified formula of model_point()
to filter the model points by seg_id
:
>>> Projection.model_point.formula
def model_point():
""""Target model points
...
"""
cond = model_point_table_ext()['is_wl'] == (True if seg_id == 'WL' else False)
return model_point_table_ext().loc[cond]
Assigning "WL"
to seg_id
results in running only whole life model points,
while assigning anything other than "WL"
results in running
limited term model points:
>>> Projection.seg_id = 'WL'
>>> timeit.timeit("Projection.result_pv()", globals=globals(), number=1)
24.311953799999998
>>> Projection.result_pv()
Premiums Death ... Change in AV Net Cashflow
policy_id ...
2 4.446054e+07 1.356468e+07 ... 1.604824e+07 1.177058e+07
3 1.210841e+08 7.179706e+07 ... 2.796228e+07 2.192224e+07
6 5.051520e+06 3.011482e+06 ... 1.152065e+06 4.933033e+04
9 5.537287e+07 3.686858e+07 ... 1.126645e+07 7.784990e+06
10 2.957650e+07 1.156441e+07 ... 9.877157e+06 6.451814e+06
... ... ... ... ...
9988 5.051377e+05 1.692807e+05 ... 1.715821e+05 1.065215e+05
9989 2.889266e+07 1.769398e+07 ... 5.681889e+06 4.869589e+06
9998 7.629662e+06 4.007463e+06 ... 1.982635e+06 1.441142e+06
9999 2.835552e+06 1.258307e+06 ... 8.583928e+05 4.917449e+05
10000 1.513202e+07 3.834462e+06 ... 6.555234e+06 3.667102e+06
[5015 rows x 9 columns]
>>> Projection.max_proj_len()
1141
>>> Projection.seg_id = 'NWL'
>>> timeit.timeit("Projection.result_pv()", globals=globals(), number=1)
5.201247100000003
>>> Projection.result_pv()
Premiums Death ... Change in AV Net Cashflow
policy_id ...
1 53492000.0 485261.238999 ... 4.465141e+06 1.881696e+06
4 30384000.0 293889.696238 ... 4.840159e+06 4.346383e+06
5 59895000.0 334298.511514 ... 7.238519e+06 7.671440e+06
7 42066000.0 337495.895163 ... 3.513768e+06 1.355059e+06
8 5270000.0 85955.434866 ... 7.032080e+05 -2.137742e+05
... ... ... ... ...
9993 1116000.0 6862.126164 ... 1.498832e+05 6.988384e+04
9994 22050000.0 765048.795780 ... 3.453998e+06 3.159848e+06
9995 3420000.0 24997.118829 ... 6.067274e+05 -4.430946e+04
9996 20675000.0 533581.639585 ... 3.627718e+06 1.951471e+06
9997 66906000.0 429128.288942 ... 8.984202e+06 4.590207e+06
[4985 rows x 9 columns]
>>> Projection.max_proj_len()
1141
To keep the results for both "WL"
and "NWL"
,
you can parameterize Projection
with seg_id
and have Projection['WL']
and Projection['NWL']
as dynamic child spaces of Projection
:
>>> Projection.parameters = ("seg_id",)
>>> Projection['WL'].model_point()
spec_id age_at_entry sex ... surr_charge_id load_prem_rate is_wl
policy_id ...
2 C 29 M ... NaN 0.10 True
3 D 51 F ... type_3 0.05 True
6 D 51 F ... type_3 0.05 True
9 D 59 F ... type_3 0.05 True
10 D 35 F ... type_3 0.05 True
... ... .. ... ... ... ...
9988 C 32 M ... NaN 0.10 True
9989 C 56 M ... NaN 0.10 True
9998 D 45 F ... type_3 0.05 True
9999 D 39 M ... type_3 0.05 True
10000 D 22 F ... type_3 0.05 True
[5015 rows x 14 columns]
>>> Projection['WL'].result_pv()
Premiums Death ... Change in AV Net Cashflow
policy_id ...
2 4.446054e+07 1.356468e+07 ... 1.604824e+07 1.177058e+07
3 1.210841e+08 7.179706e+07 ... 2.796228e+07 2.192224e+07
6 5.051520e+06 3.011482e+06 ... 1.152065e+06 4.933033e+04
9 5.537287e+07 3.686858e+07 ... 1.126645e+07 7.784990e+06
10 2.957650e+07 1.156441e+07 ... 9.877157e+06 6.451814e+06
... ... ... ... ...
9988 5.051377e+05 1.692807e+05 ... 1.715821e+05 1.065215e+05
9989 2.889266e+07 1.769398e+07 ... 5.681889e+06 4.869589e+06
9998 7.629662e+06 4.007463e+06 ... 1.982635e+06 1.441142e+06
9999 2.835552e+06 1.258307e+06 ... 8.583928e+05 4.917449e+05
10000 1.513202e+07 3.834462e+06 ... 6.555234e+06 3.667102e+06
[5015 rows x 9 columns]
>>> Projection['NWL'].model_point()
spec_id age_at_entry sex ... surr_charge_id load_prem_rate is_wl
policy_id ...
1 B 47 M ... type_1 0.0 False
4 A 32 F ... NaN 0.1 False
5 A 28 M ... NaN 0.1 False
7 B 45 F ... type_1 0.0 False
8 B 47 F ... type_1 0.0 False
... ... .. ... ... ... ...
9993 B 29 M ... type_1 0.0 False
9994 A 52 F ... NaN 0.1 False
9995 B 24 M ... type_1 0.0 False
9996 B 47 M ... type_1 0.0 False
9997 B 30 M ... type_1 0.0 False
[4985 rows x 14 columns]
>>> Projection['NWL'].result_pv()
Premiums Death ... Change in AV Net Cashflow
policy_id ...
1 53492000.0 485261.238999 ... 4.465141e+06 1.881696e+06
4 30384000.0 293889.696238 ... 4.840159e+06 4.346383e+06
5 59895000.0 334298.511514 ... 7.238519e+06 7.671440e+06
7 42066000.0 337495.895163 ... 3.513768e+06 1.355059e+06
8 5270000.0 85955.434866 ... 7.032080e+05 -2.137742e+05
... ... ... ... ...
9993 1116000.0 6862.126164 ... 1.498832e+05 6.988384e+04
9994 22050000.0 765048.795780 ... 3.453998e+06 3.159848e+06
9995 3420000.0 24997.118829 ... 6.067274e+05 -4.430946e+04
9996 20675000.0 533581.639585 ... 3.627718e+06 1.951471e+06
9997 66906000.0 429128.288942 ... 8.984202e+06 4.590207e+06
[4985 rows x 9 columns]
While running the entire model points at once took 34 seconds, running the whole life and limited term model points separately took about 30 seconds in total. The whole life segment is about half the size of the entire model points, and takes 24 seconds, while the entire segment takes 34 seconds, which implies the processing speed per model point improves as the number of model points gets larger.
Formula examples¶
Most formulas in the CashValue_ME
model
are the same as those in CashValue_SE
.
However, some formulas are updated since they cannot
be applied to vector operations without change.
For example, below shows how
pols_maturity
, the number of maturing policies
at time t, is defined differently in
CashValue_SE
and in
CashValue_ME
.
def pols_maturity(t):
if duration_mth(t) == policy_term() * 12:
return pols_if_at(t, "BEF_MAT")
else:
return 0
def pols_maturity(t):
return (duration_mth(t) == policy_term() * 12) * pols_if_at(t, "BEF_MAT")
In CashValue_SE
,
policy_term()
returns an integer,
such as 10 indicating a policy term of the selected model point in years,
so the if
clause checks if the value of duration_mth()
is equal to the policy term in month:
>>> policy_term()
10
>>> pols_maturity(120)
65.9357318577613
In contrast, policy_term()
in CashValue_ME
returns
a Series of policy terms of all the model points.
If the if clause were
defined in the same way as in the CashValue_SE
,
it would result in an error,
because the condition duration_mth(t) == policy_term() * 12
for a certain t
returns a Series of boolean values and it is ambiguous
for the Series to be in the if condition.
Further more, whether the if
branch or the else
branch should
be evaluated needs to be determined element-wise,
but the if
statement would not allow such element-wise branching.
Instead of using the if
statement, the formula in CashValue_ME
achieves the element-wise conditional operation by multiplication
by a Series of boolean values.
In the formula in CashValue_ME
,
pols_if_at(t, "BEF_MAT")
returns the numbers of policies at time t for all the model points
as a Series.
Multiplying it
by (duration_mth(t) == policy_term() * 12)
replaces
the numbers of policies with 0 for model points whose policy terms in month
are not equal to t
. This operation is effectively an element-wise if
operation:
>>> policy_term()
poind_id
1 10
2 20
3 95
4 65
dtype: int64
>>> (duration_mth(120) == policy_term() * 12)
poind_id
1 True
2 False
3 False
4 False
dtype: bool
>>> pols_maturity(120)
1 65.935732
2 0.000000
3 0.000000
4 0.000000
dtype: float64
Basic Usage¶
Reading the model¶
Create your copy of the savings library by following
the steps on the Quick Start page.
The model is saved as the folder named CashValue_ME
in the copied folder.
To read the model from Spyder, right-click on the empty space in MxExplorer,
and select Read Model.
Click the folder icon on the dialog box and select the
CashValue_ME
folder.
Getting the results¶
By default, the model has Cells
for outputting projection results as listed in the
Results section.
result_cf()
outputs total cashflows of all the model points,
and result_pv()
outputs the present values of the cashflows
by model points.
Both Cells outputs the results as pandas DataFrame.
By following the same steps explained in the Quick Start page using this model, You can get the results in an MxConsole and show the results as tables in MxDataViewer.
Changing the model point¶
By default, model_point()
returns the entire model_point_table_ext()
:
>>> Projection.model_point.formula
def model_point():
return model_point_table_ext()
The calculations in Projection
apply to all the model points
in model_point_table
.
To limit the calculation target, change the model_point()
formula
so that model_point()
returns a DataFrame that contains
only the target rows.
For example, to select only the model point 1:
>>> Projection.model_point.formula
def model_point():
return model_point_table_ext().loc[1:1]
There are many methods of DataFrame for selecting its rows. See the pandas documentation for details.
When selecting only one model point, make sure that model_point()
returns the model point as a DataFrame not as a Series.
In the code example above, model_point_table_ext().loc[1:1]
is specified instead of model_point_table_ext().loc[1]
,
because model_point_table_ext().loc[1]
would return the model point as a Series.
Also, you should be careful not to accidentally update the original DataFrame
held as model_point_table
.
Model Specifications¶
The CashValue_ME
model has only one UserSpace,
named Projection
,
and all the Cells and References are defined in the space.
The Projection Space¶
The main Space in the CashValue_ME
model.
Projection
is the only Space defined
in the CashValue_ME
model, and it contains
all the logic and data used in the model.
Parameters and References
(In all the sample code below,
the global variable Projection
refers to the
Projection
Space.)
- model_point_table¶
All model points as a DataFrame. By default, 4 model points are defined. The DataFrame has an index named
point_id
.spec_id
age_at_entry
sex
policy_term
policy_count
sum_assured
duration_mth
premium_pp
av_pp_init
Cells defined in
Projection
with the same names as these columns return the corresponding column’s values for the selected model point.>>> Projection.model_poit_table spec_id age_at_entry sex ... premium_pp av_pp_init poind_id ... 1 A 20 M ... 500000 0 2 B 50 M ... 500000 0 3 C 20 M ... 1000 0 4 D 50 M ... 1000 0 [4 rows x 10 columns]
The DataFrame is saved in the Excel file model_point_samples.xlsx placed in the model folder.
model_point_table
is created by Projection’s new_pandas method, so that the DataFrame is saved in the separate file. The DataFrame has the injected attribute of_mx_dataclident
:>>> Projection.model_point_table._mx_dataclient <PandasData path='model_point_table_samples.xlsx' filetype='excel'>
- model_point_10000¶
Alternative model point table
This model point table contains 10000 model points and is saved as the Excel file model_point_10000.xlsx placed in the folder. To use this table, assign it to
model_point_table
:>>> Projection.model_point_table = Projection.model_point_10000
- disc_rate_ann¶
Annual discount rates by duration as a pandas Series.
>>> Projection.disc_rate_ann year 0 0.00000 1 0.00555 2 0.00684 3 0.00788 4 0.00866 146 0.03025 147 0.03033 148 0.03041 149 0.03049 150 0.03056 Name: disc_rate_ann, Length: 151, dtype: float64
The Series is saved in the Excel file disc_rate_ann.xlsx placed in the model folder.
disc_rate_ann
is created by Projection’s new_pandas method, so that the Series is saved in the separate file. The Series has the injected attribute of_mx_dataclident
:>>> Projection.disc_rate_ann._mx_dataclient <PandasData path='disc_rate_ann.xlsx' filetype='excel'>
See also
- mort_table¶
Mortality table by age and duration as a DataFrame. See basic_term_sample.xlsx included in this library for how the sample mortality rates are created.
>>> Projection.mort_table 0 1 2 3 4 5 Age 18 0.000231 0.000254 0.000280 0.000308 0.000338 0.000372 19 0.000235 0.000259 0.000285 0.000313 0.000345 0.000379 20 0.000240 0.000264 0.000290 0.000319 0.000351 0.000386 21 0.000245 0.000269 0.000296 0.000326 0.000359 0.000394 22 0.000250 0.000275 0.000303 0.000333 0.000367 0.000403 .. ... ... ... ... ... ... 116 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 117 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 118 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 119 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 120 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 [103 rows x 6 columns]
The DataFrame is saved in the Excel file mort_table.xlsx placed in the model folder.
mort_table
is created by Projection’s new_pandas method, so that the DataFrame is saved in the separate file. The DataFrame has the injected attribute of_mx_dataclident
:>>> Projection.mort_table._mx_dataclient <PandasData path='mort_table.xlsx' filetype='excel'>
See also
- std_norm_rand¶
Random numbers drawn from the standard normal distribution
A Series of random numbers drawn from the standard normal distribution indexed with
scen_id
andt
. Used for generating investment returns. Seeinv_return_table()
.
- scen_id¶
Selected scenario ID
An integer indicating the selected scenario ID.
scen_id
is referenced in byinv_return_mth()
as one of the keys to select a scenario fromstd_norm_rand
.
- surr_charge_table¶
Surrender charge rates by duration
A DataFrame of multiple patterns of surrender charge rates by duration. The column labels indicate
surr_charge_id()
. By default,"type_1"
,"type_2"
and"type_3"
are defined.
- product_spec_table¶
Table of product specs
A DataFrame of product spec parameters by
spec_id
.model_point_table
andproduct_spec_table
columns are joined inmodel_point_table_ext()
, and theproduct_spec_table
columns become part of the model point attributes. Theproduct_spec_table
columns are read by the Cells with the same names as the columns:>>> Projection.product_spec_table premium_type has_surr_charge surr_charge_id load_prem_rate is_wl spec_id A SINGLE False NaN 0.10 False B SINGLE True type_1 0.00 False C LEVEL False NaN 0.10 True D LEVEL True type_3 0.05 True
Projection parameters¶
This is a new business model and all model points are issued at time 0.
The time step of the model is monthly. Cashflows and other time-dependent
variables are indexed with t
.
Cashflows and other flows that accumulate throughout a period
indexed with t
denotes the sums of the flows from t
til t+1
.
Balance items indexed with t
denotes the amount at t
.
|
Projection length in months |
The max of all projection lengths |
Model point data¶
The model point data is stored in an Excel file named model_point_table.xlsx under the model folder.
Target model points |
|
Extended model point table |
|
|
The sex of the model points |
The sum assured of the model points |
|
The policy term of the model points. |
|
|
The attained age at time t. |
The age at entry of the model points |
|
|
Duration of model points at |
|
Duration of model points at |
Whether surrender charge applies |
|
|
Whether the model point is whole life |
Rate of premium loading |
|
ID of surrender charge pattern |
|
Type of premium payment |
|
Initial account value per policy |
Assumptions¶
The mortality table is stored in an Excel file named mort_table.xlsx
under the model folder, and is read into mort_table
as a DataFrame.
mort_table_reindexed()
returns a mortality table
reshaped from mort_table
, which is a Series
indexed with Age
and Duration
.
mort_rate()
looks up mort_table_reindexed()
and picks up
the annual mortality rates to be applied for all the
model points at time t
and returns them in a Series.
mort_rate_mth()
converts mort_rate()
to the monthly mortality
rate to be applied during the month starting at time t
.
The discount rate data is stored in an Excel file named disc_rate_ann.xlsx
under the model folder, and is read into disc_rate_ann
as a Series.
The lapse by duration is defined by a formula in lapse_rate()
.
expense_acq()
holds the acquisition expense per policy at t=0.
expense_maint()
holds the maintenance expense per policy per annum.
The maintenance expense inflates at a constant rate
of inflation given as inflation_rate()
.
The last age of mortality tables |
|
|
Mortality rate to be applied at time t |
Monthly mortality rate to be applied at time t |
|
MultiIndexed mortality table |
|
Discount factors. |
|
Monthly discount rate |
|
|
Lapse rate |
Acquisition expense per policy |
|
Annual maintenance expense per policy |
|
The inflation factor at time t |
|
Inflation rate |
Policy values¶
By default, the amount of death benefit for each policy (claim_pp()
)
is set equal to sum_assured
.
premium_pp()
is the single premium amount if the model point
represents single premium policies (i.e. premium_type()
is "SINGLE"
),
or the monthly premium amount if the model point represents
level premium policies (i.e. premium_type()
is "LEVEL"
).
|
Claim per policy |
|
Premium amount per policy |
Maintenance fee per account value |
|
|
Cost of insurance rate per account value |
Surrender charge rate |
|
Stacked surrender charge table |
|
maximum index of surrender charge table |
Policy decrement¶
The policy decrement logic of CashValue_ME
is based on that of CashValue_SE
.
except that each relevant formula operates on the entire model points.
|
Number of death |
|
Number of policies in-force |
|
Number of policies in-force |
Initial number of policies in-force |
|
|
Number of lapse |
Number of maturing policies |
|
|
Number of new business policies |
Account Value¶
The account value logic of CashValue_ME
is based on CashValue_SE
.
except that each relevant formula operates on the entire model points.
|
Investment income on account value |
Investment income on account value per policy |
|
Rate of investment return |
|
Table of investment return rates |
|
|
Account value per policy |
Net amount at risk per policy |
|
|
Cost of insurance charges per policy |
Per-policy premium portion put in the account value |
|
|
Maintenance fee per policy |
|
Account value in-force |
|
Premium portion put in account value |
|
Account value taken out to pay claim |
|
Claim in excess of account value |
|
Cost of insurance charges |
|
Maintenance fee deducted from account value |
|
Change in account value |
Check account value roll-forward |
Cashflows¶
Cashflows are calculated as its per-policy amount times the number of policies.
The expense cashflow consists of acquisition expenses at issue and monthly maintenance expenses spent each month.
By default, commissions are defined as 5% premiums.
|
Surrender charge |
|
Claims |
|
Commissions |
|
Premium income |
|
Expenses |
|
Net cashflow |
Margin Analysis¶
net_cf()
can be expressed as the sum of expense and mortality
margins. The expense margin is defined as the sum of
premium loading, surrender charge and maintenance fees
net of commissions and expenses.
The mortality margin is defined coi()
net of claims_over_av()
.
Expense margin |
|
Mortality margin |
|
Check consistency between net cashflow and margins |
Present values¶
The Cells whose names start with pv_
are for calculating
the present values of the cashflows indicated by the rest of their names.
pv_pols_if()
is not used
in CashValue_SE
and BasicTerm_ME
.
|
Present value of claims |
Present value of commissions |
|
Present value of expenses |
|
Present value of net cashflows. |
|
Present value of policies in-force |
|
Present value of premiums |
|
Present value of change in account value |
|
Present value of investment income |
|
Check present value summation |
Results¶
result_cf()
outputs the total cashflows of all the model points
as a DataFrame:
>>> result_cf()
Premiums Claims Expenses Commissions Net Cashflow
0 1.002000e+08 795274.520511 2.016667e+06 5.010000e+06 -1.888257e+06
1 1.982432e+05 782659.768793 1.653397e+04 9.912161e+03 1.102781e+05
2 1.965019e+05 776554.840116 1.640233e+04 9.825093e+03 1.083123e+05
3 1.947758e+05 777512.879071 1.627174e+04 9.738790e+03 1.088861e+05
4 1.930649e+05 770406.958695 1.614219e+04 9.653245e+03 1.082080e+05
... ... ... ... ...
1136 0.000000e+00 0.000000 0.000000e+00 0.000000e+00 0.000000e+00
1137 0.000000e+00 0.000000 0.000000e+00 0.000000e+00 0.000000e+00
1138 0.000000e+00 0.000000 0.000000e+00 0.000000e+00 0.000000e+00
1139 0.000000e+00 0.000000 0.000000e+00 0.000000e+00 0.000000e+00
1140 0.000000e+00 0.000000 0.000000e+00 0.000000e+00 0.000000e+00
[1141 rows x 5 columns]
result_pv()
outputs the present values of the cashflows by model points:
>>> result_pv()
Premiums Death ... Change in AV Net Cashflow
poind_id ...
1 5.000000e+07 1.350327e+05 ... 3.771029e+06 4.957050e+06
2 5.000000e+07 1.608984e+06 ... 8.740610e+06 4.241619e+06
3 2.642236e+07 6.107771e+06 ... 1.104837e+07 6.058365e+06
4 2.201418e+07 1.329419e+07 ... 4.763115e+06 2.514042e+06
[4 rows x 9 columns]
Result table of cashflows |
|
Result table of present value of cashflows |
|
Result table of policy decrement |