Generating sample bond data
This notebook generates the sample bond data used by the BasicBond model. NumPy, pandas and QuantLib are required to run this notebook.
Columns:
bond_id
: Bond identifier.
settlement_days
: Settlement days. Always 0.
face_value
: Face value. Uniformly distributed from 10,000 to 1,000,000, rounded to the nearest 1,000.
issue_date
: Issue date. Uniformly distributed between 1 Jan 2017 and 31 Dec 2021.
bond_term
: Bond term in years. Evenly distributed among 5, 10, 15, 20, 25 and 30 years.
maturity_date
: Maturity date, calculated as issue_date + bond_term.
tenor
: “6M” or “1Y”, indicating the length of time between coupon payments. The samples are evenly distributed between the two.
coupon_rate
: Coupon rate. Uniformly distributed between 0% and 8% in steps of 1%.
z_spread
: Z-spread. Uniformly distributed from 0% to 4%, rounded to 4 decimal places.
Number of records:
1000
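The maturity date rule (issue_date + bond_term) can be illustrated without QuantLib. The following is a minimal sketch using only the standard library; add_years is a hypothetical helper, not part of the notebook, and the choice to clamp 29 February to 28 February in non-leap years is an assumption made here to mirror what QuantLib is understood to do when a period in years is added to a date. The notebook itself relies on QuantLib's own date arithmetic.

```python
import datetime


def add_years(d, years):
    """Shift a date by whole years (hypothetical helper for illustration).

    29 Feb is clamped to 28 Feb when the target year is not a leap year.
    """
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        # Only 29 Feb can fail here: the target year has no such day.
        return d.replace(year=d.year + years, day=28)


# Bond 1 in the sample output: issued 2017-12-12 with a 10-year term.
print(add_years(datetime.date(2017, 12, 12), 10))

# A leap-day issue date with a 5-year term falls back to 28 Feb.
print(add_years(datetime.date(2020, 2, 29), 5))
```

For every other calendar day the result is simply the same day and month in the target year.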
[7]:
import numpy as np
from numpy.random import default_rng # Requires NumPy 1.17 or newer
rng = default_rng(12345)
# Number of Data
DataSize = 1000
# Settlement Days: 0
settlement_days = np.array([0] * DataSize)
# Face Value (Float): 10000 - 1000000
face_value = np.round((1000000 - 10000) * rng.random(size=DataSize) + 10000, -3)
# Issue Date (datetime64): 1 Jan 2017 to 31 Dec 2021
import datetime
date_begin = datetime.date(2017,1,1)
date_end = datetime.date(2022,1,1)
issue_date = (date_end - date_begin) * rng.random(size=DataSize) + date_begin
# Bond Term: 5, 10, ..., 30 years
import QuantLib as ql
terms = [ql.Period(y, ql.Years) for y in [5, 10, 15, 20, 25, 30]]
bond_term = [terms[i] for i in rng.integers(low=0, high=len(terms), size=DataSize)]
# Maturity Date (datetime64): Issue date + Bond Term
maturity_date = [ql.Date(issue.day, issue.month, issue.year) + term for issue, term in zip(issue_date, bond_term)]
# Coupon Tenor: "6M" or "1Y"
_tenor = ["6M", "1Y"]
tenor = np.fromiter(map(lambda i: _tenor[i], rng.integers(low=0, high=len(_tenor), size=DataSize)), np.dtype('<U2'))
# Coupon Rate: 0% - 8%
coupon_rate = rng.integers(low=0, high=9, size=DataSize) / 100
# Z-spread: 0% - 4%
z_spread = np.round(rng.random(size=DataSize) * 0.04, 4)
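As a quick sanity check, the numeric samples can be compared against the documented ranges. The sketch below regenerates three of the arrays with the same expressions as the cell above so that it runs on its own; check_samples is a hypothetical helper introduced here for illustration, and the tolerance used for the coupon grid is an assumption.

```python
import numpy as np


def check_samples(face_value, coupon_rate, z_spread):
    """Return True if the arrays respect the documented ranges (hypothetical helper)."""
    # Face values: between 10,000 and 1,000,000, in multiples of 1,000.
    face_ok = (
        np.all((face_value >= 10_000) & (face_value <= 1_000_000))
        and np.all(face_value % 1000 == 0)
    )
    # Coupon rates: whole percentages from 0% to 8%.
    coupon_pct = np.rint(coupon_rate * 100)
    coupon_ok = (
        np.allclose(coupon_rate * 100, coupon_pct)
        and np.all((coupon_pct >= 0) & (coupon_pct <= 8))
    )
    # Z-spreads: between 0% and 4%.
    spread_ok = np.all((z_spread >= 0) & (z_spread <= 0.04))
    return bool(face_ok and coupon_ok and spread_ok)


rng = np.random.default_rng(12345)
n = 1000
face_value = np.round((1_000_000 - 10_000) * rng.random(size=n) + 10_000, -3)
# `high` is exclusive in Generator.integers, so high=9 yields 0..8.
coupon_rate = rng.integers(low=0, high=9, size=n) / 100
z_spread = np.round(rng.random(size=n) * 0.04, 4)

print(check_samples(face_value, coupon_rate, z_spread))
```

Because this sketch draws only three arrays from a fresh generator, its values are not expected to match the rows shown in the notebook output, where other draws are interleaved.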
[8]:
import pandas as pd
attrs = [
"settlement_days",
"face_value",
"issue_date",
"bond_term",
"maturity_date",
"tenor",
"coupon_rate",
"z_spread"
]
data = [
settlement_days,
face_value,
issue_date,
[y.length() for y in bond_term],
[d.to_date() for d in maturity_date],
tenor,
coupon_rate,
z_spread
]
bond_data = pd.DataFrame(dict(zip(attrs, data)), index=range(1, DataSize+1))
bond_data.index.name = "bond_id"
bond_data
[8]:
bond_id | settlement_days | face_value | issue_date | bond_term | maturity_date | tenor | coupon_rate | z_spread |
---|---|---|---|---|---|---|---|---|
1 | 0 | 235000.0 | 2017-12-12 | 10 | 2027-12-12 | 1Y | 0.07 | 0.0304 |
2 | 0 | 324000.0 | 2021-11-29 | 25 | 2046-11-29 | 1Y | 0.08 | 0.0304 |
3 | 0 | 799000.0 | 2017-02-03 | 10 | 2027-02-03 | 6M | 0.03 | 0.0155 |
4 | 0 | 679000.0 | 2017-11-19 | 10 | 2027-11-19 | 1Y | 0.08 | 0.0229 |
5 | 0 | 397000.0 | 2018-07-01 | 5 | 2023-07-01 | 6M | 0.06 | 0.0142 |
... | ... | ... | ... | ... | ... | ... | ... | ... |
996 | 0 | 560000.0 | 2019-02-16 | 10 | 2029-02-16 | 1Y | 0.06 | 0.0261 |
997 | 0 | 161000.0 | 2020-03-12 | 30 | 2050-03-12 | 6M | 0.05 | 0.0199 |
998 | 0 | 375000.0 | 2019-05-05 | 5 | 2024-05-05 | 1Y | 0.03 | 0.0138 |
999 | 0 | 498000.0 | 2019-02-21 | 15 | 2034-02-21 | 1Y | 0.03 | 0.0230 |
1000 | 0 | 438000.0 | 2019-03-14 | 30 | 2049-03-14 | 1Y | 0.06 | 0.0256 |
1000 rows × 8 columns
Uncomment the command below to save the data to an Excel file.
[1]:
# bond_data.to_excel("bond_data_sample.xlsx")
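When the saved file is read back, the bond_id index and the two date columns need to be restored explicitly. The sketch below shows the idea with an in-memory CSV round trip instead of Excel, which is a substitution made here so that the example runs without an Excel engine installed. The two-row frame named sample is a hypothetical stand-in built from the first two rows of the output table, not the full data set.

```python
import io

import pandas as pd

# Two illustrative rows copied from the output table (not the full data set).
sample = pd.DataFrame(
    {
        "settlement_days": [0, 0],
        "face_value": [235000.0, 324000.0],
        "issue_date": pd.to_datetime(["2017-12-12", "2021-11-29"]),
        "bond_term": [10, 25],
        "maturity_date": pd.to_datetime(["2027-12-12", "2046-11-29"]),
        "tenor": ["1Y", "1Y"],
        "coupon_rate": [0.07, 0.08],
        "z_spread": [0.0304, 0.0304],
    },
    index=pd.Index([1, 2], name="bond_id"),
)

# Write to an in-memory buffer, then read it back.
buffer = io.StringIO()
sample.to_csv(buffer)
buffer.seek(0)

restored = pd.read_csv(
    buffer,
    index_col="bond_id",                          # restore the index
    parse_dates=["issue_date", "maturity_date"],  # restore the date columns
)
print(restored.dtypes)
```

The same two arguments, index_col and parse_dates, are also accepted by pd.read_excel.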