Dec 21, 2013 at 8:56AM

Caleb Doxsey
After doing some project cleanup recently I ran across this repository and it occurred to me that this project would make for an interesting blog post. It involves a difficult financial analysis problem and cross-language integration. A demo of the application is available here. This post is broken into 3 parts: a description of a problem, an explanation of the solution to that problem and a code walkthrough.

Style analysis is often an important part of making financial decisions. From Investopedia:

## Definition of 'Style Analysis'

The process of determining what type of investment behavior an investor or money manager employs when making investment decisions. Virtually all investors subscribe to a form of investment philosophy, and a prudent analysis of a money manager's style needs to be performed before an investor can determine whether the manager will be good fit for his or her personal investment goals and preferences. investopedia | Style Analysis

Determining the style of a mutual fund is traditionally done using Holdings-Based Style Analysis (HBSA). HBSA works like this: Take the underlying securities of a mutual fund, bucket them according to their asset class (large cap, small cap, fixed income, etc...), and finally aggregate their relative weights into a fund level style breakdown. In other words a 'large value' mutual fund is made up of largely 'large value' holdings.

Though straightforward, HBSA has some problems:

- Holdings information is not always easy to obtain. Data providers might give you the largest components of a mutual fund, but they are rarely willing to give you all of the components. Without all the data it's impossible to determine style accurately.
- Mutual funds are not static, and so merely getting the holdings as of this moment will not provide you with an accurate reflection of the manager's approach. If getting the current holdings information is hard, then you can imagine what getting the holdings information over time is going to be like.

In 1988 William F. Sharpe proposed an alternative solution to the problem called Returns-Based Style Analysis (RBSA). As the name would indicate, RBSA uses the returns of a mutual fund to determine its style. Conceptually the basic idea is represented like this:

- given a series of manager returns:
`M`

- and a series of passive index returns:
`R`

,_{1}`R`

, ...,_{2}`R`

_{N} - determine the weights:
`w`

,_{1}`w`

, ...,_{2}`w`

_{N} - such that constructed portfolio:
`w`

_{1}R_{1}+ w_{2}R_{2}+ ... + w_{N}R_{N} - is a best fit for
`M`

, where 'best fit' is defined as the minimization of the variance of excess return:`variance(M - w`

_{1}R_{1}+ w_{2}R_{2}+ ... + w_{3}R_{3})

Here's an alternative visual representation of what we're trying to do: (from the Handbook of Equity Style Management)

According to several sources this is a quadratic optimization problem. Quadratic optimization is anything but trivial, but there are freely available algorithms which can solve problems like this. In this case I used quadprog from the R statistical programming environment. Released under the GPL license this algorithm solves problems of this form:

this routine uses the Goldfarb/Idnani algorithm to solve the following minimization problem: minimize -d^{T}x + ½ x^{T}Dx where A_{1}^{T}x = b_{1}A_{2}^{T}x ≥ b_{2}the matrix D is assumed to be positive definite. Especially, w.l.o.g. D is assumed to be symmetric.

Transforming our equation into this form we end up with the following.

`D`

is built from a covariance matrix. The matrix is doubled and an additional row and column is added to store the manager's variance at`0, 0`

.

| var(M) 0 0 ... 0 | | 0 2⋅cov(R_{1}, R_{1}) 2⋅cov(R_{1}, R_{2}) ... 2⋅cov(R_{1}, R_{N}) | | 0 2⋅cov(R_{2}, R_{1}) 2⋅cov(R_{2}, R_{2}) ... 2⋅cov(R_{2}, R_{N}) | | 0 ... | | 0 2⋅cov(R_{N}, R_{1}) 2⋅cov(R_{N}, R_{2}) ... 2⋅cov(R_{N}, R_{N}) |

`d`

is built from the covariance between the manager's returns (`M`

) and the index's returns (`R`

), with the first entry being the variance of the manager's returns_{1}to R_{N}

| 2⋅var(M) | | 2⋅cov(R_{1}, M) | | 2⋅cov(R_{2}, M) | | ... | | 2⋅cov(R_{N}, M) |

`A`

&_{1}`b`

contain two constraints: the first weight must always equal 1, and the sum of the remaining weights must also be equal to 1. (All of the index's weights in our portfolio must add up to 1)_{1}

| 1 0 ... 0 | = | 1 | | 0 1 ... 1 | | 1 |

`A`

&_{2}`b`

contain_{2}`N`

constraints: each weight must be greater than or equal to 0 (we don't allow negative holdings)

| 1 0 0 ... 0 | = | 0 | | 0 1 0 ... 0 | | 0 | | ... | | 0 | | 0 0 0 ... 1 | | 0 |

Or at least it's something like that.

So how does one build a solution to this problem?

My goal was to build a Go package that when given a series of index and manager returns it would hand me back the corresponding weights. Like this:

```
package rbsa
...
type (
ReturnsBasedStyleAnalysis struct {
indices []string
returns map[string]Vector
}
)
func (this *ReturnsBasedStyleAnalysis) AddIndex(id string, returns Vector)
func (this *ReturnsBasedStyleAnalysis) Run(returns Vector) (map[string]float64, error)
```

I made 3 additional packages: `lalg`

(for constructing vectors & matrices), `quadprog`

(for calling the underlying library) and `statistics`

(for computing variance, covariance, etc...). `lalg`

and `statistics`

are straightforward, but `quadprog`

is not since it's not native Go code.

As previously stated the optimizer (which actually does most of the work) comes from R and is written in fortran. I don't know fortran, and in my admittedly non-exhaustive search I've not found any documentation for how to call fortran from Go. That said it's not altogether different from calling C.

- First I grabbed the source code, and any dependent libraries (from LAPACK) and I put them in a
`src`

folder in my project. - I compiled these files using gfortran: (this was on windows, linux or osx would be similar)
gfortran -c src\aind.f -o lib\windows\amd64\aind.o gfortran -c src\daxpy.f -o lib\windows\amd64\daxpy.o ...

- From here there were 2 approaches I could take: (1) Load the .o files using a cgo LDFLAGS argument or (2) generate a .syso file. (1) worked for windows and (2) worked for linux and osx. (I prefer (2)) To generate a .syso file do this:
Notice that I also had to include libm. (Incidentally this is the bit that didn't work on windows)
ld -lm -r lib/linux/amd64/*.o -o quadprog_linux_amd64.syso

- The Go glue code can be seen here. I used cgo to invoke the fortran functions and a small bit of c to expose them. Those functions sure aren't pretty:
But they do work.
extern void qpgen2_( double *dmat, double *dvec, int *fddmat, int *n, double *sol, double *lagr, double *crval, double *amat, double *bvec, int *fdamat, int *q, int *meq, int *iact, int *nact, int *iter, double *work, int *ierr );

- With everything ready to go, I put together a simple test and ran
`go test`

to make sure it worked. Because of the way this was put together this library is go gettable (on amd64 for windows, linux & osx at any rate) and the rather messy details of its implementation are hidden to the end user.

With quadprog ready there were 3 remaining tasks:

- First I implemented the math described above in rbsa
solution, err := quadprog.Solve(fixedExtendedCovarianceMatrix, covarianceVector, constraintMatrix1, constraintVector1, constraintMatrix2, constraintVector2, )

- Next I needed to get index & mutual fund data. Google (in typical fashion) killed off their finance product, but amazingly Yahoo still provides data in CSV format. I defined a set of asset classes (using ETFs):
And put together some code to get the Yahoo data:
DEFAULT_INDICES = map[string]string{ "IWB": "Large Cap", "IWD": "Large Cap Value", "IWF": "Large Cap Growth", "IWM": "Small Cap", "IWN": "Small Cap Value", "IWO": "Small Cap Growth", "IWR": "Mid Cap", "EEM": "Emerging Markets", "ICF": "Real Estate", "EFA": "International", "AGG": "Fixed Income", }

func getYahoo(symbol string, year, month int) (*http.Response, error) { u := "http://ichart.finance.yahoo.com/table.csv?s=" + url.QueryEscape(symbol) + fmt.Sprint("&a=", (month-1), "&b=5&c=", (year-4), "&d=", (month-1), "&b=5&c=", year, "&ignore=.csv") log.Println("GET", u) res, err := http.Get(u) if err != nil { log.Println("- ERROR", err) return nil, err } if res.StatusCode/100 != 2 { return nil, fmt.Errorf("Error retrieving data for %v", symbol) } log.Println("-", res.Status) return res, err }

- Finally I threw together a very simple UI and used Google Charts to draw the pie chart.

And there you have it a complete, working, hybrid go-fortran, web-based, financial analysis tool.

A version of this project was originally done for client work using C#, and then later Node.js here. (source for the latter is available here)