4. Profiling and Optimizing the Model#

This example shows how to profile the CashValue_ME_EX1 model and how to optimize it. The optimized model is named CashValue_ME_EX4 and is included in the savings library. CashValue_ME_EX4 is about 4 times faster than CashValue_ME_EX1, although the exact speed-up varies across hardware environments. The profiling and optimization approach shown here is applicable to any model in lifelib that uses modelx and depends heavily on pandas.

The profiling takes the following steps:

* Measure the run time of CashValue_ME_EX1 using the timeit standard library.
* Profile the model using the start_stacktrace, get_stacktrace and stop_stacktrace functions of modelx.
* Output the profiling result and see what cells are taking time.

In this example, the time-consuming cells are those that use pandas heavily, so the optimization focuses on replacing pandas DataFrames and Series with numpy arrays and avoiding expensive pandas operations. Keep in mind that replacing pandas objects with numpy arrays reduces the readability of the data that the model holds: pandas objects can carry more descriptive indexes than integer-indexed numpy arrays, and they allow more sophisticated and complex operations on tabular data. Whether to use numpy for speed or pandas for ease of use is a trade-off.
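The trade-off can be seen in a tiny example (the names and values here are illustrative, not taken from the model):

```python
import pandas as pd

# A pandas Series keeps a descriptive label index...
rates = pd.Series([0.10, 0.05], index=['type_1', 'type_3'])
print(rates['type_3'])   # label-based lookup: 0.05

# ...while the equivalent numpy array is positional only.
arr = rates.to_numpy()
print(arr[1])            # same value, but the meaning of "1" is implicit
```

Positional access is faster, but the information that position 1 means `type_3` now has to live somewhere else in the model.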

Measuring the runtime of the model#

The code below loads CashValue_ME_EX1 and assigns it to ex1.

[1]:
import timeit
import pandas as pd
import modelx as mx
[2]:
ex1 = mx.read_model('CashValue_ME_EX1')

By default, CashValue_ME_EX1 is set up with 1 model point and 10,000 scenarios, which requires the same calculation load as running 10,000 model points on 1 scenario.

The product spec ID (spec_id) of the default single model point is A, which does not have a surrender charge. Later in this example, we want to observe how surrender charge rates are looked up based on product specs, so here we change the model point table so that it refers to a table of multiple model points with various product specs.

[3]:
ex1.Projection.model_point_table = ex1.Projection.model_point_moneyness  # Set multiple model points
ex1.Projection.model_point_table['spec_id'] = ['A', 'B', 'C', 'D', 'A', 'B', 'C', 'D', 'A'] # Set various spec IDs
ex1.Projection.model_point_moneyness
[3]:
spec_id age_at_entry sex policy_term policy_count sum_assured duration_mth premium_pp av_pp_init accum_prem_init_pp
point_id
1 A 20 M 10 100 500000 0 500000 0 0
2 B 20 M 10 100 500000 0 475000 0 0
3 C 20 M 10 100 500000 0 450000 0 0
4 D 20 M 10 100 500000 0 425000 0 0
5 A 20 M 10 100 500000 0 400000 0 0
6 B 20 M 10 100 500000 0 375000 0 0
7 C 20 M 10 100 500000 0 350000 0 0
8 D 20 M 10 100 500000 0 325000 0 0
9 A 20 M 10 100 500000 0 300000 0 0

The product specs by spec_id are defined in product_spec_table. The is_wl column indicates whether each type is whole life or not. To save run time and memory, let’s set is_wl to False for all the specs.

[4]:
ex1.Projection.product_spec_table['is_wl'] = False
ex1.Projection.product_spec_table
[4]:
premium_type has_surr_charge surr_charge_id load_prem_rate is_wl
spec_id
A SINGLE False NaN 0.00 False
B SINGLE True type_1 0.00 False
C LEVEL False NaN 0.10 False
D LEVEL True type_3 0.05 False

For the same reason, we reduce the number of scenarios from 10,000 to 1,000.

[5]:
ex1.Projection.scen_size = 1000

Now let’s see how much time the model takes for a run. The code below calculates result_pv(), measuring its run time with timeit. number=1 indicates that the run is performed only once.

[6]:
timeit.timeit('ex1.Projection.result_pv()', globals=globals(), number=1)
[6]:
0.915054

Let’s output the mean of the present value of net cashflows of ex1, as we want to check it against the result of the optimized model.

[7]:
ex1.Projection.result_pv()['Net Cashflow'].mean()
[7]:
44386401.300826035

Profiling the run#

To profile ex1, we use modelx’s feature for tracing a run. modelx offers three functions, start_stacktrace, get_stacktrace and stop_stacktrace, to start, output and stop tracing the call stack during a run. The code block below is an idiomatic expression for using these functions:

try:
    mx.start_stacktrace(maxlen=None)
    ex1.Projection.result_pv()
    df = pd.DataFrame.from_dict(
        mx.get_stacktrace(summarize=True), orient="index")
finally:
    mx.stop_stacktrace()
    ex1.clear_all()

In this example, we want more concise output on which cells are taking time and how much, so we define a custom function that profiles a run and reports the result, based on the code block above.

[8]:
def get_time_info(model):

    try:
        mx.start_stacktrace(maxlen=None)
        model.Projection.result_pv()
        df = pd.DataFrame.from_dict(
            mx.get_stacktrace(summarize=True), orient="index")
    finally:
        mx.stop_stacktrace()
        model.clear_all()

    # Remove model and space names from index
    prefixlen = len(model.name + '.Projection.')
    df.index = [s[prefixlen:] for s in df.index]

    # Add duration %
    total = df['duration'].sum()
    df['dur_perc'] = df['duration'] * 100 / total
    df = df[['calls', 'duration', 'dur_perc']]

    return df.sort_values(['dur_perc'], ascending=False)

The code below performs a profile run and outputs the 10 most time-consuming cells.

[9]:
ex1.clear_all() # Clear the result from the previous run
get_time_info(ex1).iloc[:10]
UserWarning: call stack trace activated
UserWarning: call stack trace deactivated
[9]:
calls duration dur_perc
surr_charge_rate(t) 121 0.529899 42.448094
premium_pp(t) 121 0.128997 10.333420
claim_pp(t, kind) 242 0.096036 7.693036
inv_return_mth(t) 121 0.040032 3.206834
av_pp_at(t, timing) 485 0.036636 2.934734
claims(t, kind) 484 0.035000 2.803736
pols_new_biz(t) 121 0.030402 2.435416
pols_if_at(t, timing) 364 0.030123 2.413071
inv_income(t) 121 0.027999 2.242901
pols_lapse(t) 121 0.027977 2.241144

The output shows that surr_charge_rate(t) consumes the most time, more than 40% of the total run time. Let’s look at its formula.

surr_charge_rate(t) represents the surrender charge rates to be applied at time t. The surrender charge rates are defined by rate ID (such as type_1) and duration, and stored in surr_charge_table as a DataFrame.

[10]:
ex1.Projection.surr_charge_table
[10]:
type_1 type_2 type_3
duration
0 0.10 0.08 0.05
1 0.09 0.07 0.04
2 0.08 0.06 0.03
3 0.07 0.05 0.02
4 0.06 0.04 0.01
5 0.05 0.03 0.00
6 0.04 0.02 0.00
7 0.03 0.01 0.00
8 0.02 0.00 0.00
9 0.01 0.00 0.00
10 0.00 0.00 0.00

surr_charge_table_stacked() transforms the DataFrame into a Series by combining the row and column indexes of the DataFrame into a MultiIndex.

[11]:
ex1.Projection.surr_charge_table_stacked().head()
[11]:
        duration
type_1  0           0.10
        1           0.09
        2           0.08
        3           0.07
        4           0.06
dtype: float64
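As a minimal illustration of this transformation (on a smaller, made-up table; the model’s own definition may differ in detail), transposing before stacking yields the (rate ID, duration) MultiIndex shown above:

```python
import pandas as pd

# A toy version of surr_charge_table: rows are durations, columns are rate IDs.
table = pd.DataFrame(
    {'type_1': [0.10, 0.09], 'type_2': [0.08, 0.07]},
    index=pd.Index([0, 1], name='duration'))

# Transposing first puts the rate ID on the outer level of the
# resulting MultiIndex, matching the (rate ID, duration) shape above.
stacked = table.T.stack()
print(stacked[('type_1', 1)])   # 0.09
```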

surr_charge_rate(t) looks up surr_charge_table_stacked() using a MultiIndex key, which is created from surr_charge_id(), duration(t) and other cells, as in the formula definition below.

def surr_charge_rate(t):
    idx = pd.MultiIndex.from_arrays(
        [has_surr_charge() * surr_charge_id(),
         np.minimum(duration(t), surr_charge_max_idx())])

    return surr_charge_table_stacked().reindex(idx, fill_value=0).set_axis(
        model_point().index, inplace=False)
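To see what the reindex step is doing, here is a minimal sketch with made-up values (in the model, the stacked table and the keys come from the cells above):

```python
import pandas as pd

# Toy stand-in for surr_charge_table_stacked(): (rate ID, duration) -> rate.
stacked = pd.Series(
    [0.10, 0.09, 0.05, 0.04],
    index=pd.MultiIndex.from_tuples(
        [('type_1', 0), ('type_1', 1), ('type_3', 0), ('type_3', 1)]))

# One key per model point. Multiplying a bool by a string in Python gives
# '' for False, so has_surr_charge() * surr_charge_id() yields '' for
# points without a surrender charge; '' is absent from the table, and
# reindex fills those rows with 0.
keys = pd.MultiIndex.from_arrays([['type_1', '', 'type_3'], [1, 0, 0]])
rates = stacked.reindex(keys, fill_value=0)
print(rates.to_list())   # [0.09, 0.0, 0.05]
```

Building the MultiIndex, reindexing and re-labeling the result are all performed for every t, which is why this cells dominates the profile.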

Many pandas operations are involved in the formula. We can perform the same operations at much lower cost by rewriting this formula and other relevant formulas to use numpy arrays instead of DataFrames or Series. With this approach, surr_charge_rate(t) can be rewritten as:

def surr_charge_rate(t):
    ind_row = np.minimum(duration(t), surr_charge_max_idx())
    return surr_charge_table.values.flat[
        surr_charge_table_column() + ind_row * len(surr_charge_table.columns)]

where surr_charge_table_column() is a newly introduced cells, defined as follows:

def surr_charge_table_column():
    return surr_charge_table.columns.searchsorted(
        surr_charge_id(), side='right') - 1
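The two formulas above can be checked on a small, made-up table (the values and IDs here are illustrative, not the model’s):

```python
import numpy as np
import pandas as pd

# Toy surrender charge table: rows are durations, columns are rate IDs.
table = pd.DataFrame(
    {'type_1': [0.10, 0.09], 'type_2': [0.08, 0.07], 'type_3': [0.05, 0.04]})

# Column position per model point, as in surr_charge_table_column():
# searchsorted with side='right', then -1, maps each label to its
# column position (labels sorting before the first column map to -1).
ids = ['type_1', 'type_3', 'type_2']
col = table.columns.searchsorted(ids, side='right') - 1
print(col)      # [0 2 1]

# Flat indexing, as in the new surr_charge_rate(t): element (row, col)
# sits at position col + row * n_columns when walking the array row by row.
row = np.array([1, 0, 1])   # capped durations per model point
rates = table.values.flat[col + row * len(table.columns)]
print(rates)    # [0.09 0.05 0.07]
```

Both steps are plain integer arithmetic on a contiguous array, avoiding the MultiIndex construction and reindexing of the original formula.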

The new surr_charge_rate(t) returns a numpy array for each t, instead of a Series. In the same way, we can make other cells return numpy arrays instead of Series or DataFrames. CashValue_ME_EX4, included in the library, is the model to which all the changes are applied.

The linked page below shows the entire comparison of before and after the changes.

https://www.diffchecker.com/iseYbXUD/

In many of the changed formulas, the expression for the returned object ends with .values. The .values property of a DataFrame or Series returns the contained values as a numpy array instead of the DataFrame or Series.
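A minimal illustration (values here are arbitrary):

```python
import numpy as np
import pandas as pd

s = pd.Series([1.0, 2.0], index=['a', 'b'])

# .values strips the index and returns the underlying numpy array.
arr = s.values
print(type(arr))   # <class 'numpy.ndarray'>
```

In recent pandas versions, to_numpy() is the recommended accessor for the underlying array, but .values is what the changed formulas use.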

Now let’s check the speed of the optimized model.

[12]:
ex4 = mx.read_model('CashValue_ME_EX4')
timeit.timeit('ex4.Projection.result_pv()', globals=globals(), number=1)
[12]:
0.23962179999999922
[13]:
ex4.Projection.result_pv()['Net Cashflow'].mean()
[13]:
44386401.300826035
[14]:
ex4.clear_all() # Clear the result from the previous run
get_time_info(ex4).iloc[:10]
UserWarning: call stack trace activated
UserWarning: call stack trace deactivated
[14]:
calls duration dur_perc
pols_lapse(t) 121 0.037503 13.435736
premium_pp(t) 121 0.023004 8.241311
surr_charge_rate(t) 121 0.023000 8.240116
av_pp_at(t, timing) 485 0.018002 6.449362
claims(t, kind) 484 0.014502 5.195450
pols_if_at(t, timing) 364 0.011003 3.941879
pv_claims(kind) 4 0.010003 3.583729
lapse_rate(t) 121 0.009001 3.224894
expenses(t) 121 0.008003 2.867000
inv_income(t) 121 0.007999 2.865804