I'd been running my script that generates stock index performance results for three years before I shared the output with colleagues following a particularly provocative period in the markets. To me, they were just a couple of statistical graphs detailing a lot of stock market performance information in a format I could easily digest. My partners, though, were more enamored with the displays than I imagined, one even offering that he loved my performance measurement “dashboard”. A Harvard professor friend said “not bad” -- about the strongest endorsement I can remember from him.
To me, dashboards were about sex and sizzle, about glossy 3-dimensional stacked bar charts and rotated 3-D pie charts, about dials and maps, about meters and ring charts, about gauges and cockpits, and about alerts and drill-throughs – about everything that made my open source graphs seem a sickly stepchild by comparison. But my partner offered a soothing -- and important -- reminder: dashboards are less about slick graphics than they are about “effective and comprehensive performance measurement displayed in an easy-to-understand visual format.” Other things equal, high-end graphics are preferred, but effective performance communication is first and foremost.
My new stock portfolio performance “dashboard” is now driven by a script that runs automatically late in the evening during the work week. It first grabs index values from the Russell website (russell.com) for 25 portfolios classified by market cap and investment style, then organizes the data and calculates performance statistics for submission to the graphics functions of the analytics platform R (r-project.org). While R can do the entire job itself, I use the agile language Ruby (ruby-lang.org) to download and reshape the data before passing it on. Open source R is a personal favorite, now the lingua franca of the academic statistical community, offering an extensible object-oriented language, a wealth of functions/packages routinely embellished by a volunteer community, and strong, integrated graphics. Indeed, with a debt to the work on statistical graphics by William S. Cleveland (http://www.stat.purdue.edu/~wsc/), which showed me that simple and powerful needn't be at odds, the dashboard was a piece of programming cake in R.
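The reshaping step can be pictured with a minimal Ruby sketch. It is not the actual script: the comma-separated layout, the column names, and the index values below are all made up for illustration, since the format of the Russell download is not described here. The idea is simply to turn one-row-per-date data into one-record-per-date-and-portfolio data, the "long" shape that grouped and paneled plots generally want.

```ruby
# Hypothetical wide-format input: one row per date, one column per portfolio.
# Column names and values are invented for illustration only.
wide = <<~DATA
  date,top50_neutral,2000_growth,2000_value
  2007-10-04,100.0,200.0,150.0
  2007-10-05,101.0,203.0,151.5
DATA

# Reshape wide rows into long (date, portfolio, value) records.
def wide_to_long(text)
  lines = text.lines.map(&:strip).reject(&:empty?)
  header, *rows = lines.map { |line| line.split(",") }
  portfolios = header.drop(1) # everything after the date column
  rows.flat_map do |date, *values|
    portfolios.zip(values).map do |name, value|
      { date: date, portfolio: name, value: Float(value) }
    end
  end
end

long = wide_to_long(wide)
long.each { |record| puts record.values.join("\t") }
```

Each long record carries its own portfolio label, so a downstream plotting step can split on that label without knowing how many portfolios there are.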
Figure 1 details 2007 returns over time for the various Russell portfolios that together represent the U.S. stock market. The graph, a trellis xyplot, consists of a series of panels, each of which displays selected portfolio performance over time for the year. Within each panel, the horizontal line at zero represents the investment break-even point. By design, the panels of trellis graphs have identical axes and scales, allowing the contents of each to be readily compared visually. In addition, for this illustration, the panels are ordered by the market cap size of companies in the portfolios from upper left to bottom right – so top50 represents the very largest firms, while micro cap denotes much smaller companies. Finally, all but the first panel detail multiple plots representing the different investment styles: growth, neutral, and value. One can imagine the difficulty of attempting to display all 25 over-time plots in a single visual without the trellis concept. With the order of the panels detailing size, and the grouping of plots per panel indicating value/growth, this graph provides two-dimensional insight into portfolio performance for the year.
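The panel-and-group organization behind a graph like Figure 1 can be sketched in a few lines of Ruby. This is an illustration under stated assumptions, not the code that produced the figure: the record fields, the return values, and the partial size ordering passed in are hypothetical, and the style given to the top50 series is a guess. Records are grouped by size to form panels, panels are sorted from largest to smallest cap, and within each panel one series is kept per investment style.

```ruby
# Build a trellis-style layout: one panel per size, one series per style.
# `size_order` lists panel labels from largest to smallest market cap.
def trellis_layout(records, size_order)
  records
    .group_by { |r| r[:size] }
    .sort_by { |size, _rows| size_order.index(size) || size_order.length }
    .map do |size, rows|
      series = rows.group_by { |r| r[:style] }.transform_values do |style_rows|
        # Each series is a time-ordered list of [day, return] points.
        style_rows.sort_by { |r| r[:day] }.map { |r| [r[:day], r[:ytd_return]] }
      end
      [size, series]
    end
end

# Illustrative subset of an ordering (an assumption, not the full list of 25).
size_order = %w[top50 1000 midcap 2000 microcap]

# Made-up records; real ones would come from the reshaped index data.
records = [
  { size: "2000",  style: "growth",  day: 2, ytd_return: 0.031 },
  { size: "top50", style: "neutral", day: 1, ytd_return: 0.010 },
  { size: "2000",  style: "value",   day: 1, ytd_return: -0.004 },
  { size: "2000",  style: "growth",  day: 1, ytd_return: 0.012 },
  { size: "top50", style: "neutral", day: 2, ytd_return: 0.015 }
]

layout = trellis_layout(records, size_order)
layout.each { |size, series| puts "#{size}: #{series.keys.join(', ')}" }
```

Because every panel is built from the same record shape, sharing axes across panels is only a matter of computing one range over all of the points.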
Holders of mid cap and 1000 growth portfolios should be especially pleased with 2007. In a change from 2002-2006, growth portfolios have now taken the lead over value. Also a departure from the recent past, large cap portfolios are now more than holding their own against small. Overall, performance for 2007 is quite good through October 5. Note the recovery from the sharp, end-of-summer, mortgage crisis-fueled declines. Now if this performance can only persist through the end of the year!
As much as I like the simplicity of Figure 1, Figure 2 is the graphic I study most each morning. The building block of Figure 2 is the pedestrian dot plot, also of the trellis variety. Each column of the visual is a separate trellis plot, with growth of an initial $1 investment on the x-axis, size-ordered portfolio labels on the y-axis, value/growth as a grouping variable, and time as a panel variable. The scale of the x-axis is different for each of the three column graphs, reflecting differences in returns among the time periods. Portfolio performance can thus be visually compared across size, value, and time within each column. The first column shows a 1-5-20 day snapshot of increasing performance from one to five to twenty days. Within a very gratifying performance period overall, it seems that small has done somewhat better than large over the four weeks, with growth inconclusively besting value. The 2000 neutral portfolio, for example, has increased a whopping 9% in the last 20 trading days. The middle column details the legacy problems of the summer months, with many of the value portfolios in the negative over 90 days, but with an overall clear advantage for growth in each of the three time periods. Finally, the third column highlights the benefits of persistence in the marketplace. Forcing the same x-axis scale on 1 year, 3 year, and 5 year results demonstrates well the multiplicative nature of portfolio returns. $10,000 invested in micro cap five years ago would be worth more than $25,000 today.
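The "growth of an initial $1" statistic and the multiplicative behavior of returns can be made concrete with a short Ruby sketch. The function names and closing values are invented for illustration, and the trailing-window calculation is one reasonable reading of the statistic rather than the script's confirmed method. Growth over a window is the latest index value divided by the value that many trading days earlier, and growth over consecutive periods multiplies.

```ruby
# Growth of an initial $1 over the trailing `days` trading days, from a
# chronological (oldest-first) array of index closing values.
def growth_of_dollar(closes, days)
  raise ArgumentError, "need more than #{days} closes" unless closes.length > days
  closes[-1].to_f / closes[-1 - days]
end

# Growth factors over consecutive periods compound by multiplication.
def compound(factors)
  factors.reduce(1.0) { |total, factor| total * factor }
end

# Constant yearly rate that compounds to `growth` over `years` years.
def annualized_rate(growth, years)
  growth**(1.0 / years) - 1.0
end

# Made-up closing values for a single portfolio, oldest first.
closes = [100.0, 101.0, 99.0, 102.0, 104.0, 103.0, 106.0]

[1, 5].each do |days|
  puts "#{days}-day growth of a dollar: #{growth_of_dollar(closes, days).round(4)}"
end

# The micro cap example: $10,000 growing to $25,000 is a factor of 2.5 in five years.
puts "implied annual rate: #{(annualized_rate(2.5, 5) * 100).round(1)}%"
```

A factor of 2.5 over five years works out to roughly 20% per year compounded, which is why the longer horizons stretch so far along a shared x-axis.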
Figures 1 and 2 are both quite information rich, with high data-ink ratios and little wasted space. Figure 2 is unusually dense, with a full 225 data points: 25 portfolios across nine time periods. This magnitude of information display might be impossible with big 3-D graphics. Though there's little sizzle in these visuals, they provide me, a demanding consumer, with much of the information I need to manage my meager wealth. The concepts presented in this article – panels, grouping, dot plots, xyplots, trellis – can be applied to other performance measurement domains as well, especially those demanding high data-density presentation. And the added benefits of an integrated programming, statistical, and graphical environment like R for business analytics should not be overlooked.
R dashboards of statistical graphics like the one presented in this article, while rich in information and “poor” in visual clutter, are static -- lacking the important abilities to interact with and drill into the underlying data. Given these limitations, it is critical to supplement R's strengths with the dashboarding capabilities of other open source BI platforms such as Pentaho and JasperReports. We will continue to review open source visual deployments for Dashboard Insight in coming months. Fortunately, the solutions will generally not be as unsexy as the ones presented here. We will, however, remain obsessed with a dictate to design our dashboards for “effective and comprehensive performance measurement displayed in an easy-to-understand visual format.”
About the Author:
Steve Miller is President of OpenBI, LLC, a Chicago-based services firm focused on delivering business intelligence solutions with open source software. A statistician/quantitative analyst by education, Steve has 30 years of BI experience. His charter – and OpenBI's – is to help customers manage performance through optimal deployment of analytics. Steve is a columnist for DMReview and also writes for BIReview and the B-Eye-Network. In addition to R, OpenBI specializes in the Pentaho and JasperSoft open source BI platforms and Weka data mining. Steve can be reached at email@example.com.