Sparklines galore

Remember how, as young children, we liked to ask the other kids, “What is heavier: a kilogram of cotton or a kilogram of nails?” We already knew the correct answer and just wanted to see their reaction. This post is dedicated to the same kind of question, but in the field of web development :) Okay, I will not try your patience any longer. We’ll talk about the best way to display a large number of small graphs on the same page.

I will begin with a little background. Deep Shift Labs, the company I have the great honor to work at, has been developing and supporting a very large system for one of its clients since 2002. When developing a new application for this system, we decided to use small charts called sparklines. What they are, how we found them, and how we decided to implement them are described in a separate post. For those who have not read it, I would like to explain why we decided to generate the graphs with PHP. All reports this system generates are also available in PDF and Excel. If we implemented sparklines in JS, we would still need to generate an image to insert them into PDF or Excel. So why do the same job twice?

Let’s go back to the application we developed. Here is how a page displays the history of changes in two parameters for each patient over the last 12 months.

We have about 100 lines on a web page

Everything would be fine, but the customer decided not to use pagination, so that users can conveniently scroll through all patients in the same facility and quickly update data or spot problems. As a result, we got a page that contains a hundred or more lines, each of which has two sparklines. I should add that our charts are 115×30 pixels, and the average file size ranges from 250 bytes for a line chart to 500 bytes for a bar chart.
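To put those sizes in perspective, here is a rough back-of-the-envelope estimate in Python. It assumes exactly 100 rows (the page has about a hundred or more) and treats the 250 and 500 byte averages quoted above as lower and upper bounds. The function name is made up for illustration.

```python
def chart_payload_bytes(rows, charts_per_row, bytes_per_chart):
    """Total size of all chart images on the page, in bytes."""
    return rows * charts_per_row * bytes_per_chart


ROWS = 100           # assumption: "about a hundred" rows
CHARTS_PER_ROW = 2   # two sparklines per row, as described above

low = chart_payload_bytes(ROWS, CHARTS_PER_ROW, 250)   # if every chart were a line chart
high = chart_payload_bytes(ROWS, CHARTS_PER_ROW, 500)  # if every chart were a bar chart

print(f"{ROWS * CHARTS_PER_ROW} images, {low / 1000:.0f} to {high / 1000:.0f} kB in total")
```

Under these assumptions the images add up to only about 50 to 100 kB, so the number of image requests, rather than the volume of data, is likely what matters on such a page.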

So, our task was to find the best way to generate and display graphs on such a long page. There are two common approaches to this problem: generate the graphs “on the fly” or use pre-generated graphs. To choose between them, we ran an experiment and measured the page load time. Note that the client’s standard browsers are IE8 and IE9, and that we used AOL Pagetest (pagetest.wiki.sourceforge.net) for the measurements.

First, we checked the option of generating graphs “on the fly”. The test page contained a table with 96 rows, each with two text columns and two charts. For each experiment we took three consecutive measurements, and this is what we got for the first one:

Page load time: 58.991 seconds, 60.278 seconds, 61.219 seconds

Before testing the second approach, we had to resolve some security concerns. Our application uses a strict system to control users’ access rights: a page can be seen only by a user with the proper permissions. So we had to work out how to make sure that a user (meaning an authorized user, of course) could see only the charts of patients they have permission for. If the graphs are shown with the HTML IMG tag, they have to be in a location the web server can serve, right? That is, nothing stops a user from opening an image URL directly in a browser, bypassing the security system. And if we name the image files simply, for example by patient ID, anyone can access any graph. We could not do it this way.

In other applications of our system, files that must be displayed on a web page but require strict access control are stored above the web server root, where they are not available directly from the browser, and are displayed via a special script that checks access rights. Even though the permission check only reads data from the user’s session and does not involve a database, it still takes time and affects page load times.

Could there be a better and faster way to restrict access to files than an intermediate script that checks permissions and then reads the file to serve it? We thought about it and decided that a reasonable compromise is to store the files in a web server folder, but hash their names. The files are then still available to all authorized users of the system, but there is no way to tell which patient they belong to. Without this information they are of much less interest, and it is difficult to view the files just by going through URLs.
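The post does not say how the names were hashed, so the following is only a minimal Python sketch of one way the idea could work, not the scheme we actually used (our system is written in PHP). It assumes a keyed hash (HMAC-SHA256) with a server-side secret, because a plain unkeyed hash of a small numeric ID could be reproduced by anyone who guesses the scheme. The secret, the `/charts` URL, the PNG extension, the chart kind label, and both function names are hypothetical.

```python
import hashlib
import hmac

# Hypothetical secret; in practice it would live in server-side configuration.
SECRET_KEY = b"replace-with-a-long-random-secret"


def chart_file_name(patient_id: int, chart_kind: str) -> str:
    """Derive a hard-to-guess file name for one patient's chart."""
    message = f"{patient_id}:{chart_kind}".encode("utf-8")
    digest = hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()
    return f"{digest}.png"  # assumption: charts are stored as PNG files


def chart_img_tag(patient_id: int, chart_kind: str, base_url: str = "/charts") -> str:
    """Build the IMG tag a page would emit after its own permission check."""
    name = chart_file_name(patient_id, chart_kind)
    return f'<img src="{base_url}/{name}" width="115" height="30" alt="">'


print(chart_img_tag(1042, "line"))
```

The same function can be used both when the chart is generated and when the page is rendered, so the mapping from patient to file name never needs to be stored or exposed.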

Thus, we needed to test two approaches for pre-generated graphs: serving the files through an intermediate script, and referencing files with hashed names directly from the page. Here are the results where a script was used to check access rights:

Page load time: 61.258 seconds, 60.643 seconds, 58.5 seconds

And here are the results with direct use of files:

Page load time: 3.345 seconds, 3.173 seconds, 3.182 seconds
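For a quick side-by-side comparison, the short Python snippet below averages the three measurements of each experiment and relates them to the fastest one. The numbers are copied from the results above; the labels are just descriptive names chosen for this sketch.

```python
from statistics import mean

# Page load times in seconds, copied from the three experiments above.
load_times = {
    "generated on the fly": [58.991, 60.278, 61.219],
    "pre-generated, served by a script": [61.258, 60.643, 58.5],
    "pre-generated, hashed file names": [3.345, 3.173, 3.182],
}

averages = {name: mean(times) for name, times in load_times.items()}
fastest = min(averages.values())

for name, avg in averages.items():
    print(f"{name}: {avg:.2f} s on average, {avg / fastest:.1f}x the fastest")
```

In these measurements the direct-file approach averages about 3.2 seconds, against roughly 60 seconds for each of the other two, a difference of about 18.6 times.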

Here is the result we wanted to share. It turns out that even for small images, at 200 or more per page, reading and serving them through a script “eats” all the benefit of pre-generating them. We chose to use hashed file names; however, you may choose a different approach, especially if the displayed data is strictly confidential. In any case, we hope that you will find this post useful.

Alex
