4. Profiling and Optimizing the Model
This example shows how to profile the CashValue_ME_EX1 model and how to optimize it. The optimized model is named CashValue_ME_EX4 and is included in the savings library. CashValue_ME_EX4 is about 4 times faster than CashValue_ME_EX1, although the speed-up may differ depending on the hardware environment. The profiling and optimization approach shown here is applicable to any model in lifelib that uses modelx and depends heavily on pandas.
The profiling takes the following steps:

* Measure the run time of CashValue_ME_EX1 using the timeit standard library.
* Profile the model using the start_stacktrace, get_stacktrace and stop_stacktrace functions of modelx.
* Output the profiling result and see which cells are taking time.

In this example, the time-consuming cells are those that use pandas heavily, so the optimization focuses on replacing pandas DataFrames and Series with numpy arrays and on avoiding time-consuming pandas operations. Keep in mind that replacing pandas objects with numpy arrays reduces the readability of the data that the model holds. Pandas objects can have more descriptive indexes than integer-indexed numpy arrays, and they also allow more sophisticated and complex operations on tabular data, so whether to use numpy for speed or pandas for ease of use is a trade-off.
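As a small illustration of this trade-off (the names and values below are made up for this example, not taken from the model), a labeled Series documents what each value means, whereas the equivalent numpy array only supports positional access:

import pandas as pd

# A labeled Series: each value is identified by its 'duration' label
rates = pd.Series([0.10, 0.09, 0.08],
                  index=pd.Index([1, 2, 3], name='duration'))
rates.loc[2]                  # 0.09, looked up by the duration label

# The equivalent numpy array: faster to index, but the label information is gone
rates_np = rates.to_numpy()
rates_np[1]                   # 0.09, looked up purely by position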
Measuring the runtime of the model
The code below loads CashValue_ME_EX1 and assigns it to ex1.
[1]:
import timeit
import pandas as pd
import modelx as mx
[2]:
ex1 = mx.read_model('CashValue_ME_EX1')
By default, CashValue_ME_EX1 is set up with 1 model point and 10,000 scenarios, which requires the same calculation load as running 10,000 model points on 1 scenario.

The product spec ID (spec_id) of the default single model point is A, which does not have a surrender charge. Later in this example we want to observe how surrender charge rates are looked up based on product specs, so here we change the model point table so that it refers to a table of multiple model points with multiple product specs.
[3]:
ex1.Projection.model_point_table = ex1.Projection.model_point_moneyness # Set multiple model points
ex1.Projection.model_point_table['spec_id'] = ['A', 'B', 'C', 'D', 'A', 'B', 'C', 'D', 'A'] # Set various spec IDs
ex1.Projection.model_point_moneyness
[3]:
| point_id | spec_id | age_at_entry | sex | policy_term | policy_count | sum_assured | duration_mth | premium_pp | av_pp_init | accum_prem_init_pp |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | A | 20 | M | 10 | 100 | 500000 | 0 | 500000 | 0 | 0 |
| 2 | B | 20 | M | 10 | 100 | 500000 | 0 | 475000 | 0 | 0 |
| 3 | C | 20 | M | 10 | 100 | 500000 | 0 | 450000 | 0 | 0 |
| 4 | D | 20 | M | 10 | 100 | 500000 | 0 | 425000 | 0 | 0 |
| 5 | A | 20 | M | 10 | 100 | 500000 | 0 | 400000 | 0 | 0 |
| 6 | B | 20 | M | 10 | 100 | 500000 | 0 | 375000 | 0 | 0 |
| 7 | C | 20 | M | 10 | 100 | 500000 | 0 | 350000 | 0 | 0 |
| 8 | D | 20 | M | 10 | 100 | 500000 | 0 | 325000 | 0 | 0 |
| 9 | A | 20 | M | 10 | 100 | 500000 | 0 | 300000 | 0 | 0 |
The product specs by spec_id are defined in product_spec_table. The is_wl column indicates whether each spec is whole life or not. To save run time and memory, let’s set is_wl to False for all the specs.
[4]:
ex1.Projection.product_spec_table['is_wl'] = False
ex1.Projection.product_spec_table
[4]:
| spec_id | premium_type | has_surr_charge | surr_charge_id | load_prem_rate | is_wl |
|---|---|---|---|---|---|
| A | SINGLE | False | NaN | 0.00 | False |
| B | SINGLE | True | type_1 | 0.00 | False |
| C | LEVEL | False | NaN | 0.10 | False |
| D | LEVEL | True | type_3 | 0.05 | False |
For the same reason, we reduce the number of scenarios from 10,000 to 1,000.
[5]:
ex1.Projection.scen_size = 1000
Now let’s see how much time the model takes for a run. The code below calculates result_pv() while measuring the run time with timeit. number=1 indicates that the run is performed only once.
[6]:
timeit.timeit('ex1.Projection.result_pv()', globals=globals(), number=1)
[6]:
0.915054
Let’s output the mean of the present value of net cashflows of ex1, as we want to check it against the result of the optimized model.
[7]:
ex1.Projection.result_pv()['Net Cashflow'].mean()
[7]:
44386401.300826035
Profiling the run
To profile ex1, we use modelx’s feature for tracing a run. modelx offers 3 functions, start_stacktrace, get_stacktrace and stop_stacktrace, to start, output and stop tracing the call stack during a run. The code block below is an idiomatic expression for using these functions:
try:
    mx.start_stacktrace(maxlen=None)
    ex1.Projection.result_pv()
    df = pd.DataFrame.from_dict(
        mx.get_stacktrace(summarize=True), orient="index")
finally:
    mx.stop_stacktrace()
    ex1.clear_all()
In this example, we want more concise output on which cells are taking time and how much, so we define a custom function that profiles a run and reports on it, using the code block above.
[8]:
def get_time_info(model):

    try:
        mx.start_stacktrace(maxlen=None)
        model.Projection.result_pv()
        df = pd.DataFrame.from_dict(
            mx.get_stacktrace(summarize=True), orient="index")
    finally:
        mx.stop_stacktrace()
        model.clear_all()

    # Remove model and space names from index
    prefixlen = len(model.name + '.Projection.')
    df.index = [s[prefixlen:] for s in df.index]

    # Add duration %
    total = df['duration'].sum()
    df['dur_perc'] = df['duration'] * 100 / total
    df = df[['calls', 'duration', 'dur_perc']]

    return df.sort_values(['dur_perc'], ascending=False)
The code below performs a profile run and outputs the 10 most time-consuming cells.
[9]:
ex1.clear_all() # Clear the result from the previous run
get_time_info(ex1).iloc[:10]
UserWarning: call stack trace activated
UserWarning: call stack trace deactivated
[9]:
|  | calls | duration | dur_perc |
|---|---|---|---|
| surr_charge_rate(t) | 121 | 0.529899 | 42.448094 |
| premium_pp(t) | 121 | 0.128997 | 10.333420 |
| claim_pp(t, kind) | 242 | 0.096036 | 7.693036 |
| inv_return_mth(t) | 121 | 0.040032 | 3.206834 |
| av_pp_at(t, timing) | 485 | 0.036636 | 2.934734 |
| claims(t, kind) | 484 | 0.035000 | 2.803736 |
| pols_new_biz(t) | 121 | 0.030402 | 2.435416 |
| pols_if_at(t, timing) | 364 | 0.030123 | 2.413071 |
| inv_income(t) | 121 | 0.027999 | 2.242901 |
| pols_lapse(t) | 121 | 0.027977 | 2.241144 |
The output shows that surr_charge_rate(t) consumes the most time, more than 40% of the total run time. We will look at its formula below.

surr_charge_rate(t) represents the surrender charge rates to be applied at time t. The surrender charge rates are defined by rate ID (such as type_1) and duration, and are stored in surr_charge_table as a DataFrame.
[10]:
ex1.Projection.surr_charge_table
[10]:
| duration | type_1 | type_2 | type_3 |
|---|---|---|---|
| 0 | 0.10 | 0.08 | 0.05 |
| 1 | 0.09 | 0.07 | 0.04 |
| 2 | 0.08 | 0.06 | 0.03 |
| 3 | 0.07 | 0.05 | 0.02 |
| 4 | 0.06 | 0.04 | 0.01 |
| 5 | 0.05 | 0.03 | 0.00 |
| 6 | 0.04 | 0.02 | 0.00 |
| 7 | 0.03 | 0.01 | 0.00 |
| 8 | 0.02 | 0.00 | 0.00 |
| 9 | 0.01 | 0.00 | 0.00 |
| 10 | 0.00 | 0.00 | 0.00 |
surr_charge_table_stacked() transforms the DataFrame into a Series by combining the row and column indexes of the DataFrame into a MultiIndex.
[11]:
ex1.Projection.surr_charge_table_stacked().head()
[11]:
duration
type_1 0 0.10
1 0.09
2 0.08
3 0.07
4 0.06
dtype: float64
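For reference, the same kind of stacking can be reproduced with plain pandas. The snippet below is only an illustration of the transformation, using a shortened copy of the table with the same layout; it is not necessarily the model’s actual formula:

import pandas as pd

table = pd.DataFrame(
    {'type_1': [0.10, 0.09], 'type_2': [0.08, 0.07], 'type_3': [0.05, 0.04]},
    index=pd.Index([0, 1], name='duration'))

# Move the columns into the index, then make the rate ID the outer level
table.stack().swaplevel().sort_index()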
surr_charge_rate(t) looks up surr_charge_table_stacked() using a MultiIndex key, which is created from surr_charge_id(), duration(t) and other cells, as in the formula definition below.
def surr_charge_rate(t):
    idx = pd.MultiIndex.from_arrays(
        [has_surr_charge() * surr_charge_id(),
         np.minimum(duration(t), surr_charge_max_idx())])
    return surr_charge_table_stacked().reindex(idx, fill_value=0).set_axis(
        model_point().index, inplace=False)
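To see what the reindex call is doing, here is a tiny worked example with made-up values. Keys that are not present in the stacked table, as is the case for model points whose product has no surrender charge, are filled with 0:

import pandas as pd

stacked = pd.Series(
    [0.10, 0.09, 0.05, 0.04],
    index=pd.MultiIndex.from_arrays(
        [['type_1', 'type_1', 'type_3', 'type_3'], [0, 1, 0, 1]]))

# One key per model point: (rate ID, capped duration)
idx = pd.MultiIndex.from_arrays([['type_1', '', 'type_3'], [1, 0, 0]])

stacked.reindex(idx, fill_value=0).to_numpy()   # array([0.09, 0.  , 0.05])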
Many pandas operations are involved in the formula. We can perform the same operations at a much smaller cost by rewriting this formula, and other relevant formulas, to use numpy arrays instead of DataFrames or Series. Based on this approach, surr_charge_rate(t) can be written as:
def surr_charge_rate(t):
    ind_row = np.minimum(duration(t), surr_charge_max_idx())
    return surr_charge_table.values.flat[
        surr_charge_table_column() + ind_row * len(surr_charge_table.columns)]
where surr_charge_table_column() is a newly introduced cells defined as follows:
def surr_charge_table_column():
    return surr_charge_table.columns.searchsorted(
        surr_charge_id(), side='right') - 1
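The rewritten formulas replace the label-based reindex with positional indexing on the flattened values array: in a row-major array with ncols columns, element (row, column) sits at flat index row * ncols + column. The snippet below, again using a shortened table with made-up values, sketches the equivalence:

import numpy as np
import pandas as pd

table = pd.DataFrame(
    {'type_1': [0.10, 0.09], 'type_2': [0.08, 0.07], 'type_3': [0.05, 0.04]},
    index=pd.Index([0, 1], name='duration'))

# Column position of each rate ID, as in surr_charge_table_column()
col = table.columns.searchsorted(['type_1', 'type_3'], side='right') - 1   # array([0, 2])
row = np.array([1, 0])   # capped duration per model point

table.values.flat[col + row * len(table.columns)]   # array([0.09, 0.05])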
The new surr_charge_rate(t) returns a numpy array for each t, instead of a Series. In the same way, we can make other cells return numpy arrays instead of Series or DataFrames. CashValue_ME_EX4, included in the library, is the model to which all of these changes have been applied.

The page linked below shows the entire before-and-after comparison of the changes.
https://www.diffchecker.com/iseYbXUD/
You can see that the expression for the returned object ends with .values in many of the changed formulas. The .values property on a DataFrame or a Series returns the contained values as a numpy array instead of as a DataFrame or Series.
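For example, on a hypothetical Series:

import pandas as pd

s = pd.Series([0.10, 0.09], index=['A', 'B'])
s.values          # array([0.1 , 0.09]) -- a plain numpy array, without the index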
Now let’s check the speed of the optimized model.
[12]:
ex4 = mx.read_model('CashValue_ME_EX4')
timeit.timeit('ex4.Projection.result_pv()', globals=globals(), number=1)
[12]:
0.23962179999999922
[13]:
ex4.Projection.result_pv()['Net Cashflow'].mean()
[13]:
44386401.300826035
[14]:
ex4.clear_all() # Clear the result from the previous run
get_time_info(ex4).iloc[:10]
UserWarning: call stack trace activated
UserWarning: call stack trace deactivated
[14]:
|  | calls | duration | dur_perc |
|---|---|---|---|
| pols_lapse(t) | 121 | 0.037503 | 13.435736 |
| premium_pp(t) | 121 | 0.023004 | 8.241311 |
| surr_charge_rate(t) | 121 | 0.023000 | 8.240116 |
| av_pp_at(t, timing) | 485 | 0.018002 | 6.449362 |
| claims(t, kind) | 484 | 0.014502 | 5.195450 |
| pols_if_at(t, timing) | 364 | 0.011003 | 3.941879 |
| pv_claims(kind) | 4 | 0.010003 | 3.583729 |
| lapse_rate(t) | 121 | 0.009001 | 3.224894 |
| expenses(t) | 121 | 0.008003 | 2.867000 |
| inv_income(t) | 121 | 0.007999 | 2.865804 |