Cleaned version-Chapter 7 and Chapter 8
tatsath committed May 23, 2020
1 parent 9b083fd commit da79212
Showing 11 changed files with 314 additions and 585 deletions.
Binary file not shown.
Binary file not shown.
@@ -10,15 +10,7 @@
"\n",
"# Yield Curve Construction\n",
"\n",
"Interest rates provide a fairly good standard for applying PCA and getting a good feel for the characteristics of the interest rate curves. In this case study we use Principal Component Analysis (PCA) to generate the \"typical\" movements of a yield curve and show that the first 3 principal components correspond to yields, slope, and curvature respectively. We implement PCA for the treasury rate \n",
"\n",
"\n",
"Things to focus on in this case study: \n",
"\n",
"* Understand the intuition behind the eigenvectors or the pricinpal components.\n",
"* Using lower number of dimensions after performing dimensionality reduction to reproduce the actual data. \n",
"* Visualise the results and reconstruct the original data using the principal components.\n",
"* Loading data from the extenal sources such as quandl.\n"
"In this case study we use principal component analysis (PCA) to generate the typical movements of a yield curve "
]
},
{
@@ -32,7 +24,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"* [1. Introduction](#0)\n",
"* [1. Problem Definition](#0)\n",
"* [2. Getting Started - Load Libraries and Dataset](#1)\n",
" * [2.1. Load Libraries](#1.1) \n",
" * [2.2. Load Dataset](#1.2)\n",
@@ -55,29 +47,22 @@
"metadata": {},
"source": [
"<a id='0'></a>\n",
"# 1. Introduction"
"# 1. Problem Definition"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Our goal in this jupyter notebook is to understand how to work through a dimensionality reduction models problem end-to-end. This notebook is applicable for all kinds of dimensionity reduction problems.\n",
"\n",
"\n",
"We import the pandas dataframe containing all the adjusted closing prices for all the companies in the DJIA as well as the DJIA's index. Because our starting date was the year 2000, the adjusted closing prices for Dow Chemicals and Visa appear as Not a Number values. For this we will take away both respective columns. We will end up with 28 columns of companies information and an additional one for the DJIA index.\n",
"\n",
"We use the metrices of the portfolio performance defined below:\n",
"\n",
"Sharpe Ratio: The sharpe ratio explains the annualized returns against the annualized volatility of each company in a portfolio. A high sharpe ratio explains higher returns and lower volatility for the specified portfolio.\n",
"\n",
"Annualized Returns: We have to apply the geometric average of all the returns in respect to the periods per year (days of operations in the exchange in a year).\n",
"Our goal in this case study is to use dimensionality reduction techniques to generate\n",
"the “typical” movements of a yield curve.\n",
"The data used for this case study is obtained from Quandl. \n",
"\n",
"Annualized Volatility: We have to take the standard deviation of the returns and multiply it by the square root of the periods per year.\n",
"Annualized Sharpe: we compute the ratio by dividing the annualized returns against the annualized volatility.\n",
"Optimized Portfolio\n",
"\n",
"We compute an iterable loop to compute the principle component's weights for each Eigen Portfolio, which then uses the sharpe ratio function to look for the portfolio with the highest sharpe ratio. Once we know which portfolio has the highest sharpe ratio, we can visualize its performance against the DJIA Index to understand how it outperforms it.\n"
"Quandl is a premier\n",
"source for financial, economic and alternative datasets. We use the data of 11 tenors\n",
"(from 1 month to 30 years) of the treasury curves. The frequency of the data is daily\n",
"and the data is available from 1960 onwards\n",
"\n"
]
},
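{
"cell_type": "markdown",
"metadata": {},
"source": [
"As an illustration, the sketch below pulls the treasury tenors from Quandl. The USTREASURY/YIELD dataset code and the resulting column names are assumptions for this example; the exact series used in this notebook may differ.\n"
]
},
{
"cell_type": "code",
"metadata": {},
"outputs": [],
"source": [
"import quandl\n",
"\n",
"#Assumption: USTREASURY/YIELD carries the daily treasury yield curve across tenors (1 Mo to 30 Yr)\n",
"quandl.ApiConfig.api_key = 'QUANDL_API_KEY'\n",
"dataset = quandl.get('USTREASURY/YIELD')\n",
"print(dataset.shape)\n",
"print(dataset.columns.tolist())"
]
},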
{
@@ -132,8 +117,8 @@
"metadata": {},
"outputs": [],
"source": [
"quandl.ApiConfig.api_key = 'tH8x1csKSWgxUcqdRifB'\n",
"#quandl.ApiConfig.api_key = 'QUANDL_API_KEY'"
"#The API Key can be optained from Quandl website by registering. \n",
"quandl.ApiConfig.api_key = 'QUANDL_API_KEY'"
]
},
{
@@ -636,7 +621,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Looking at the movement of the rates"
"Let us look at the movement of the yield curve. "
]
},
{
@@ -668,7 +653,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Taking a look at the correlation. More detailed look at the data will be performed after implementing the Dimensionality Reduction Models."
"In the next step we look at the correlation."
]
},
{
@@ -728,7 +713,9 @@
"source": [
"<a id='3.1'></a>\n",
"## 4.1. Data Cleaning\n",
"Check for the NAs in the rows, either drop them or fill them with the mean of the column"
"We check for the NAs in the data, either drop them or\n",
"fill them with the mean of the column and the steps are same as mentioned in previ‐\n",
"ous case studies."
]
},
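{
"cell_type": "markdown",
"metadata": {},
"source": [
"A minimal sketch of this cleaning step, assuming the tenors sit in a pandas DataFrame named dataset:\n"
]
},
{
"cell_type": "code",
"metadata": {},
"outputs": [],
"source": [
"#Check whether any values are missing, then fill NAs with the mean of each column\n",
"print('Null Values =', dataset.isnull().values.any())\n",
"dataset = dataset.fillna(dataset.mean())"
]
},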
{
@@ -883,11 +870,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"All the variables should be on the same scale before applying PCA, otherwise a feature with large values will dominate the result. Below I use StandardScaler in scikit-learn to standardize the dataset’s features onto unit scale (mean = 0 and variance = 1).\n",
"\n",
"Standardization is a useful technique to transform attributes with a Gaussian distribution and\n",
"differing means and standard deviations to a standard Gaussian distribution with a mean of\n",
"0 and a standard deviation of 1."
"All the variables should be on the same scale before applying PCA, otherwise a feature with large values will dominate the result. We use StandardScaler in sklearn to standardize the dataset’s features onto unit scale (mean = 0 and variance = 1)."
]
},
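{
"cell_type": "markdown",
"metadata": {},
"source": [
"A sketch of that standardization step; the dataset and rescaledDataset names are assumptions carried over from the cleaning sketch above:\n"
]
},
{
"cell_type": "code",
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"from sklearn.preprocessing import StandardScaler\n",
"\n",
"#Standardize each tenor to mean = 0 and variance = 1 before applying PCA\n",
"scaler = StandardScaler().fit(dataset)\n",
"rescaledDataset = pd.DataFrame(scaler.transform(dataset), columns=dataset.columns, index=dataset.index)"
]
},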
{
@@ -1158,7 +1141,9 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"We compute several functions to determine the weights of each principle component. We then visualize a scatterplot that visualizes an organized descending plot with the respective weight of every company at the current chosen principle component."
"We\n",
"first have a function to determine the weights of each principal component. We then\n",
"perform the visualization of the principal components."
]
},
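{
"cell_type": "markdown",
"metadata": {},
"source": [
"One plausible version of such a weight function is sketched here: PCA is fit on the standardized data and each eigenvector is normalized so that its loadings sum to one. The normalization choice is an assumption for illustration, not necessarily the one used in this notebook.\n"
]
},
{
"cell_type": "code",
"metadata": {},
"outputs": [],
"source": [
"from sklearn.decomposition import PCA\n",
"\n",
"pca = PCA()\n",
"pca.fit(rescaledDataset)\n",
"\n",
"#Share of the variance captured by the first three components\n",
"print('Variance explained by first 3 PCs:', pca.explained_variance_ratio_[:3].sum())\n",
"\n",
"def PCWeights(pca, n_components=3):\n",
"    #Scale each eigenvector so that its loadings sum to one\n",
"    comps = pca.components_[:n_components]\n",
"    return comps / comps.sum(axis=1, keepdims=True)\n",
"\n",
"weights = PCWeights(pca)"
]
},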
{
@@ -1240,23 +1225,13 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Looking at the the interpretation of the first three principal components, they should correspond to:\n",
"Looking at the the interpretation of the first three principal components, they correspond to:\n",
"\n",
"__Principal Component 1__: Directional movements in the yield curve. These are movements that shift the entire yield curve up or down.\n",
"\n",
"__Principal Component 2__: Slope movements in the yield curve. These are movements that steepen or flatten (change the first derivative wrt maturity) the entire yield curve.\n",
"\n",
"__Principal Component 3__: Curvature movements in the yield curve. These are movements that change the curvature (or the second derivative wrt maturity) of the entire yield curve.\n",
"\n",
"The detailed interpretation is as follows:\n",
"\n",
"__Principal Component 1 (PC1)__: All of the tenors of Treasury Rates are weighted in the same direction. This means that PC1 reflects movements that causes IRS of all maturities to move in the same direction. This corresponds to directional movements in the yield curve,if the yield curve goes up, all yields go up be it the short end or the long end and vice versa.\n",
"\n",
"__Principal Component 2 (PC2)__: Treasury Rates on the short end of the curve are weighted negatively and the ones reaching the long end (y10, y20 and y30) are weighted positively. This means that PC2 reflects movements that cause the short end to go in one direction and the long end in the other. This is exactly what slope movements do -- if the yield curve steepens, the short end goes down and the long end goes up and vice versa if the yield curve flattens.\n",
"\n",
"__Principal Component 3 (PC3)__: Treasury Rates on the short and long ends of the curve are weighted positively while the ones in the middle are weighted negatively. This means that PC3 reflects movements that cause the short and long end to go in one direction, and the middle to go in the other. This is exactly what curvature movements do -- if the yield curve increases in curvature, the short and long end goes down while the middle goes up.\n",
"\n",
"Hence PC1 can be interpreted as directional movements, PC2 as slope movements, and PC3 as curvature movements."
"__Principal Component 3__: Curvature movements in the yield curve. These are movements that change the curvature (or the second derivative wrt maturity) of the entire yield curve."
]
},
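{
"cell_type": "markdown",
"metadata": {},
"source": [
"One quick way to check these sign patterns is to tabulate the loadings of the first three eigenvectors across the tenors. A sketch, reusing the assumed pca and rescaledDataset objects from the earlier snippets:\n"
]
},
{
"cell_type": "code",
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"\n",
"#PC1 loadings should share one sign, PC2 should flip sign between the\n",
"#short and the long end, and PC3 between the middle tenors and the two ends\n",
"loadings = pd.DataFrame(pca.components_[:3], index=['PC1', 'PC2', 'PC3'], columns=rescaledDataset.columns)\n",
"print(loadings.round(3))"
]
},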
{
@@ -1297,7 +1272,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"One of the key features of PCA is the ability to reconstruct the initial dataset using the outputs of PCA. Using the simple matrix reconstruction, we can generate an approximation/almost exact replica of the initial data.\n",
"Using the simple matrix reconstruction, we can generate an approximation/almost exact replica of the initial data.\n",
"\n",
"Mechanically PCA is just a matrix multiplication:\n",
"\n",
@@ -1343,7 +1318,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"As shown in the picture above the replicated Treasury Rate Chart with three principal components is able to replicate the original data quite well"
"Figure above shows the replicated treasury rate chart."
]
},
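{
"cell_type": "markdown",
"metadata": {},
"source": [
"A sketch of that matrix multiplication, reconstructing the data from the first three principal components; the object names are assumptions carried over from the earlier snippets:\n"
]
},
{
"cell_type": "code",
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"\n",
"nComp = 3\n",
"scores = pca.transform(rescaledDataset)[:, :nComp]   #projections onto the first nComp PCs\n",
"reconstScaled = scores @ pca.components_[:nComp]     #back to the standardized space\n",
"reconst = pd.DataFrame(scaler.inverse_transform(reconstScaled), columns=dataset.columns, index=dataset.index)"
]
},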
{
@@ -1352,13 +1327,15 @@
"source": [
"__Conclusion__\n",
"\n",
"* The pricipal components are quite intuitive for this case study. The first three principal components represent directional movements, slope movements, and curvature movements respectively. \n",
"\n",
"* Given all the tenors of the treasury rate are represented by just three principal components leading to significant dimensionality reduction and these three principal components can be used to reconstruct the original time series. \n",
"We demonstrated the efficiency of dimensionality reduction and principal components analysis in reducing the number of dimension and coming up with new intuitive feature.\n",
"\n",
"* The eigenvectors can be visualised to understand the intuitive drivers of the time series changes.\n",
"\n",
"* On a different note, we have loading data from the extenal sources (such as quandl) for this case study. \n"
"The first\n",
"three principal components explain more than 99.5% of the variation and represent\n",
"directional movements, slope movements, and curvature movements respectively.\n",
"Overall, by using principal component analysis, analyzing the eigen vectors and\n",
"understanding the intuition behind them, we demonstrated how the implementation\n",
"of a dimensionality reduction lead to fewer intuitive dimensions in the yield curve.\n"
]
}
],