Generating sample bond data

This notebook generates the sample bond data used by the BasicBond model. NumPy, pandas, and QuantLib are required to run this notebook.

Columns:

  • bond_id: Bond identifier

  • settlement_days: Settlement days. Always 0.

  • face_value: Face value. Uniformly distributed from 10,000 to 1,000,000, rounded to the nearest 1,000.

  • issue_date: Issue date. Uniformly distributed between 1 Jan 2017 and 31 Dec 2021.

  • bond_term: Bond term in years. Evenly distributed among 5, 10, 15, 20, 25, 30 years.

  • maturity_date: Maturity date. issue_date + bond_term.

  • tenor: “6M” or “1Y”, the length of time between coupon payments. The samples are evenly distributed between the two.

  • coupon_rate: Coupon rate. Uniformly distributed from 0% to 8% in steps of 1%.

  • z_spread: Z-spread. Uniformly distributed from 0% to 4%.
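Two of the columns above are derived quantities: maturity_date is issue_date shifted by bond_term, and tenor fixes how many coupons a bond pays over its life. The following is a minimal standard-library sketch of both relationships, not the notebook's own method (the notebook delegates date arithmetic to QuantLib). The helper names add_years and coupon_count are hypothetical, and moving 29 Feb to 28 Feb in a non-leap target year is an assumption of this sketch.

```python
import datetime


def add_years(d: datetime.date, years: int) -> datetime.date:
    """Shift a date forward by whole years (hypothetical helper).

    Assumption: 29 Feb is moved to 28 Feb when the target year is not a leap year.
    """
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        # Only reachable for 29 Feb landing in a non-leap year.
        return d.replace(year=d.year + years, day=28)


def coupon_count(bond_term_years: int, tenor: str) -> int:
    """Number of coupon payments over the bond's life for tenor '6M' or '1Y'."""
    months_per_coupon = {"6M": 6, "1Y": 12}[tenor]
    return bond_term_years * 12 // months_per_coupon


# A bond issued on 12 Dec 2017 with a 10-year term and annual coupons.
issue = datetime.date(2017, 12, 12)
maturity = add_years(issue, 10)
n_coupons = coupon_count(10, "1Y")

# Leap-day edge case.
leap_maturity = add_years(datetime.date(2020, 2, 29), 5)
```

With a "6M" tenor the same 10-year bond would pay 20 coupons instead of 10.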

Number of records:

  • 1000

[7]:
import numpy as np
from numpy.random import default_rng  # Requires NumPy 1.17 or newer

rng = default_rng(12345)

# Number of Data
DataSize = 1000

# Settlement Days: 0
settlement_days = np.array([0] * DataSize)

# Face Value (float): 10000 - 1000000, rounded to the nearest 1000

face_value = np.round((1000000 - 10000) * rng.random(size=DataSize) + 10000, -3)

# Issue Date (datetime.date): 1 Jan 2017 to 31 Dec 2021

import datetime
date_begin = datetime.date(2017,1,1)
date_end = datetime.date(2022,1,1)

issue_date = (date_end - date_begin) * rng.random(size=DataSize) + date_begin

# Bond Term: 5, 10, ..., 30 years
import QuantLib as ql

terms = [ql.Period(y, ql.Years) for y in [5, 10, 15, 20, 25, 30]]
bond_term = [terms[i] for i in rng.integers(low=0, high=len(terms), size=DataSize)]

# Maturity Date (ql.Date): Issue date + Bond Term
maturity_date = [ql.Date(issue.day, issue.month, issue.year) + term for issue, term in zip(issue_date, bond_term)]


# Coupon Tenor: "6M" or "1Y"
_tenor = ["6M", "1Y"]
tenor = np.fromiter(map(lambda i: _tenor[i], rng.integers(low=0, high=len(_tenor), size=DataSize)), np.dtype('<U2'))

# Coupon Rate: 0% - 8%
coupon_rate = rng.integers(low=0, high=9, size=DataSize) / 100

# Z-spread: 0% - 4%
z_spread = np.round(rng.random(size=DataSize) * 0.04, 4)
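The issue dates in the cell above are built with Python's datetime module. If a NumPy datetime64 array is preferred, one possible alternative is to draw whole-day offsets and add them to the start date. This is only a sketch: the names n_days, offsets and issue_date64 are illustrative, and because it consumes the random generator differently it will not reproduce the values shown in this notebook.

```python
import numpy as np
from numpy.random import default_rng

rng = default_rng(12345)
n = 1000  # number of sample dates

date_begin = np.datetime64("2017-01-01")
date_end = np.datetime64("2022-01-01")  # exclusive upper bound

# Whole days in the half-open interval [date_begin, date_end)
n_days = int((date_end - date_begin) / np.timedelta64(1, "D"))

# Integer day offsets in [0, n_days), converted to day-resolution timedeltas
offsets = rng.integers(low=0, high=n_days, size=n)
issue_date64 = date_begin + offsets.astype("timedelta64[D]")
```

Because the upper bound of rng.integers is exclusive, the latest possible date is 31 Dec 2021, matching the range stated for issue_date.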
[8]:
import pandas as pd

attrs = [
    "settlement_days",
    "face_value",
    "issue_date",
    "bond_term",
    "maturity_date",
    "tenor",
    "coupon_rate",
    "z_spread"
]

data = [
    settlement_days,
    face_value,
    issue_date,
    [y.length() for y in bond_term],
    [d.to_date() for d in maturity_date],
    tenor,
    coupon_rate,
    z_spread
]

bond_data = pd.DataFrame(dict(zip(attrs, data)), index=range(1, DataSize+1))
bond_data.index.name = "bond_id"
bond_data
[8]:
         settlement_days  face_value  issue_date  bond_term  maturity_date  tenor  coupon_rate  z_spread
bond_id
1                      0    235000.0  2017-12-12         10     2027-12-12     1Y         0.07    0.0304
2                      0    324000.0  2021-11-29         25     2046-11-29     1Y         0.08    0.0304
3                      0    799000.0  2017-02-03         10     2027-02-03     6M         0.03    0.0155
4                      0    679000.0  2017-11-19         10     2027-11-19     1Y         0.08    0.0229
5                      0    397000.0  2018-07-01          5     2023-07-01     6M         0.06    0.0142
...                  ...         ...         ...        ...            ...    ...          ...       ...
996                    0    560000.0  2019-02-16         10     2029-02-16     1Y         0.06    0.0261
997                    0    161000.0  2020-03-12         30     2050-03-12     6M         0.05    0.0199
998                    0    375000.0  2019-05-05          5     2024-05-05     1Y         0.03    0.0138
999                    0    498000.0  2019-02-21         15     2034-02-21     1Y         0.03    0.0230
1000                   0    438000.0  2019-03-14         30     2049-03-14     1Y         0.06    0.0256

1000 rows × 8 columns
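Before saving, it can be useful to confirm that the frame matches the column descriptions at the top of this notebook. The sketch below assumes pandas is installed. The function check_bond_data is a hypothetical helper, not part of the BasicBond model, and the three sample rows are copied from the output above so the block can run on its own.

```python
import datetime

import pandas as pd


def check_bond_data(df: pd.DataFrame) -> list:
    """Return a list of problems found; an empty list means every check passed."""
    problems = []
    if not (df["settlement_days"] == 0).all():
        problems.append("settlement_days must be 0")
    if not df["face_value"].between(10_000, 1_000_000).all():
        problems.append("face_value out of range")
    lo, hi = datetime.date(2017, 1, 1), datetime.date(2021, 12, 31)
    if not all(lo <= d <= hi for d in df["issue_date"]):
        problems.append("issue_date out of range")
    if not df["bond_term"].isin([5, 10, 15, 20, 25, 30]).all():
        problems.append("bond_term not in 5, 10, ..., 30")
    # Maturity should fall bond_term calendar years after issue.
    year_gap = [m.year - i.year for i, m in zip(df["issue_date"], df["maturity_date"])]
    if year_gap != list(df["bond_term"]):
        problems.append("maturity_date inconsistent with bond_term")
    if not df["tenor"].isin(["6M", "1Y"]).all():
        problems.append("tenor must be '6M' or '1Y'")
    if not df["coupon_rate"].between(0.0, 0.08).all():
        problems.append("coupon_rate out of range")
    if not df["z_spread"].between(0.0, 0.04).all():
        problems.append("z_spread out of range")
    return problems


# First three rows of the output above, rebuilt by hand.
sample = pd.DataFrame(
    {
        "settlement_days": [0, 0, 0],
        "face_value": [235000.0, 324000.0, 799000.0],
        "issue_date": [
            datetime.date(2017, 12, 12),
            datetime.date(2021, 11, 29),
            datetime.date(2017, 2, 3),
        ],
        "bond_term": [10, 25, 10],
        "maturity_date": [
            datetime.date(2027, 12, 12),
            datetime.date(2046, 11, 29),
            datetime.date(2027, 2, 3),
        ],
        "tenor": ["1Y", "1Y", "6M"],
        "coupon_rate": [0.07, 0.08, 0.03],
        "z_spread": [0.0304, 0.0304, 0.0155],
    },
    index=pd.Index([1, 2, 3], name="bond_id"),
)

problems = check_bond_data(sample)
```

In the notebook itself, check_bond_data(bond_data) could be called in place of the hand-built sample.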

Uncomment the command below to save the data to an Excel file.

[1]:
# bond_data.to_excel("bond_data_sample.xlsx")