A minor boast: it’s gratifying to see the hard work Dr. Roederer and I have done on SPICE is being put to good use. The paper has been cited* a number of times. This week I went back to work for Dr. Roederer full-time to continue improving SPICE and to plan and build yet another bioinformatics tool.

* I also just discovered Google Scholar. :-)

 

SPICE was cited in a paper published in the Journal of Virology this month.

Human Immunodeficiency Virus Type 1 Infection is Associated with Increased NK Cell Polyfunctionality and Higher Levels of KIR3DL1+ NK Cells in Ugandans Carrying the HLA-B Bw4 Motif

Michael A. Eller,1,2,3,* Rebecca N. Koehler,3 Gustavo H. Kijak,3 Leigh Anne Eller,2,3 David Guwatudde,2,4 Mary A. Marovich,3 Nelson L. Michael,3 Mark S. de Souza,3,5 Fred Wabwire-Mangen,2,4 Merlin L. Robb,3 Jeffrey R. Currier,3 and Johan K. Sandberg,1

 

SPICE was published in the Wiley journal Cytometry on January 7, 2011 and is listed on the Vaccine Research Center’s list of 2011 publications.

SPICE: Exploration and analysis of post-cytometric complex multivariate datasets

Mario Roederer, Joshua L. Nozzi, Martha X. Nason

Cytometry

DOI: 10.1002/cyto.a.21015

Abstract

Polychromatic flow cytometry results in complex, multivariate datasets. To date, tools for the aggregate analysis of these datasets across multiple specimens grouped by different categorical variables, such as demographic information, have not been optimized. Often, the exploration of such datasets is accomplished by visualization of patterns with pie charts or bar charts, without easy access to statistical comparisons of measurements that comprise multiple components. Here we report on algorithms and a graphical interface we developed for these purposes. In particular, we discuss thresholding necessary for accurate representation of data in pie charts, the implications for display and comparison of normalized versus unnormalized data, and the effects of averaging when samples with significant background noise are present. Finally, we define a statistic for the nonparametric comparison of complex distributions to test for difference between groups of samples based on multi-component measurements. While originally developed to support the analysis of T cell functional profiles, these techniques are amenable to a broad range of datatypes. Published 2011 Wiley-Liss, Inc.


SPICE Web Site

SPICE is a product of NIH/NIAID and is hosted here: http://exon.niaid.nih.gov/spice

 

A quick vignette of a passing interaction at work.

I snagged my finger on a sharp edge and yelled, “Ouch! Sonofabitch!”

A biologist coworker – sounding perhaps too eager – said, “What’s wrong? Are you bleeding?”

What I said: “No, I’m alright.”

What I thought: Stay back! There’s no blood! Damned vulture biologists! Always trying to make multiple-assed monkeys with the blood of the unsuspecting!

Yeah … I should really keep my thoughts on the inside.

 

I had missed this (or, rather, Google Alerts had). AlphaVax had apparently given a presentation titled Flow-Cytometric Evaluation of T-Cell Responses Elicited by an Alphavirus Replicon Particle Vaccine for Cytomegalovirus (CMV) in Healthy Adults sometime in June.

Quite a mouthful. The presentation describes how AlphaVax used multiparameter flow cytometry to explore one of their vaccine candidates. They then used – among other applications – SPICE 5 to do their data mining, visualization, and production of presentation-quality graphs.

Samples were fixed and stored at 4°C until acquisition on an LSRII flow cytometer. Flow data was analyzed with FlowJo version 8.7 (Tree Star, Ashland, OR). Further data analysis was performed using software (PESTLE and SPICE, version 5.0) provided by M. Roederer and J. Nozzi, NIAID, NIH.

It makes me proud to know my work is directly helping researchers find vaccines for “what ails ya.” Very gratifying.

 

My recent announcement on Twitter about the release of SPICE 5.1 public beta prompted a few questions. I also get questions from the occasional interested developer, friend, or family member, so I thought I’d describe it a bit.

My Day Job

Your taxes (and mine) pay my salary. My employer is Lockheed Martin, MSD, Inc. and I work as a contract software developer for the Bioinformatics and Computational Biosciences Branch of the National Institute of Allergy and Infectious Diseases (part of the US Dept. of Health & Human Services).

BCBB is providing their bioinformatics and technology expertise to the VRC (Vaccine Research Center), among other NIAID initiatives. I’ve been working on a software application called SPICE for the VRC for the last three years.

The Problem

The Vaccine Research Center does as their name implies: they research possible vaccines for things like HIV, H1N1, and others. To do this, they administer a test substance and take lots of blood samples over time, looking for a useful immune response.

The blood samples (for many subjects, over different time points, with different vaccination methods) are all washed with a biomarker (a kind of stain that binds to certain cell types or cells that carry out a particular function) and then run through a machine called a flow cytometer. The flow cytometer bounces lasers of different colors (wavelengths) off the individual cells and detects the presence of these biomarkers, which fluoresce when hit with a particular wavelength.

All this lets the researchers know what percentage of cells fall under a specific category. For example, one category might be “what percentage of cells belonging to Subject A, one week after the vaccination challenge, are CD4, Interleukin 2 positive, and Interferon-gamma negative?”

Now, what do we do with 20,000 categorical measurements? They have to be evaluated in a number of different ways to see if an immune response occurs, what type, when, and how strong (among other things).
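For the developer types, here is a rough sketch of what one of those categorical measurements might look like as a record. It is purely illustrative and is not how SPICE stores its data: the `Measurement` class, its field names, the `make_measurement` and `describe` helpers, and the 0.026 value are all made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Measurement:
    """One categorical measurement (hypothetical layout, not SPICE's)."""
    subject: str
    weeks_after_vaccination: int
    markers: tuple  # sorted (marker name, is_positive) pairs
    percent_of_cells: float


def make_measurement(subject, weeks, markers, percent):
    # Sort the marker pairs so the same category always has the same key order.
    return Measurement(subject, weeks, tuple(sorted(markers.items())), percent)


def describe(m):
    # Render the markers as "CD4+", "IL2+", "IFNg-", and so on.
    signs = ", ".join(f"{name}{'+' if positive else '-'}" for name, positive in m.markers)
    return (f"{m.percent_of_cells}% of cells from {m.subject}, "
            f"week {m.weeks_after_vaccination}: {signs}")


# Made-up value for the example category described above.
m = make_measurement("Subject A", 1, {"CD4": True, "IL2": True, "IFNg": False}, 0.026)
print(describe(m))
```

Multiply that one record by every subject, time point, and marker combination and you arrive at the tens of thousands of measurements in question.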

You might think, “Okay, load all that up in Excel and start making charts.” There are a number of problems with that approach, but I’ll simplify them: Excel is slow and cumbersome when used for setting up and dynamically modifying complicated data views and its graphing capabilities are vast but don’t quite cover what the researchers need to see.

The researchers may want to average all the subjects together (because average response is more useful than that of an individual), overlay by vaccination type (because an injection might be a better vector than an inhalant), and ignore CD4 cells, looking only at CD8s. Then they might want to eliminate a few subjects who didn’t complete the study. Then they might want to turn the whole thing on its head and compare individual responses, grouping by their study group. All these custom data views would take a lot of effort to configure and reconfigure in Excel.
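To make that concrete, here is a minimal sketch of a "data view" as a filter plus a grouping plus an average. It assumes a flat list of rows with invented field names (`group`, `cell`, `completed`, `percent`) and made-up numbers; `data_view` is a hypothetical helper, not part of SPICE.

```python
from collections import defaultdict
from statistics import mean

# Made-up rows: one percentage per subject and cell type.
rows = [
    {"subject": "S1", "group": "injection", "cell": "CD8", "completed": True, "percent": 0.25},
    {"subject": "S2", "group": "injection", "cell": "CD8", "completed": True, "percent": 0.75},
    {"subject": "S3", "group": "inhalant", "cell": "CD8", "completed": True, "percent": 0.125},
    {"subject": "S4", "group": "inhalant", "cell": "CD8", "completed": False, "percent": 1.0},
    {"subject": "S1", "group": "injection", "cell": "CD4", "completed": True, "percent": 0.5},
]


def data_view(rows, keep, group_by):
    """Average `percent` per value of `group_by`, using only rows where keep(row) is true."""
    groups = defaultdict(list)
    for row in rows:
        if keep(row):
            groups[row[group_by]].append(row["percent"])
    return {key: mean(values) for key, values in groups.items()}


# CD8 cells only, drop subjects who did not complete the study, average by vaccination method.
by_method = data_view(rows, lambda r: r["cell"] == "CD8" and r["completed"], "group")

# Turn it on its head: CD8 cells only, one value per individual subject.
by_subject = data_view(rows, lambda r: r["cell"] == "CD8", "subject")

print(by_method)
print(by_subject)
```

Each new question is just a different `keep` and `group_by`, which is the kind of reshuffling that is painful to redo by hand in a spreadsheet.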

That’s where SPICE comes in: Simplified Presentation of Incredibly Complex Evaluations. This application allows you to cut, shuffle, recombine, compare, isolate, and reorder this vast categorical data set in near-real-time by drag-and-drop. What’s more, you can choose graph types, format the print-quality graphs, then drag them straight from the graph view into PowerPoint for a nice presentation of your findings, or into Word to write a paper. All this you can do in minutes (even seconds) versus hours.

My Part

I was brought on board to build SPICE 5, a complete rewrite of an in-house tool written and personally maintained by Dr. Mario Roederer. Bright guy.

Through version 4, Mario used an older Mac development library called Carbon. With Carbon’s future looking bleak and Cocoa designated as Apple’s New Hotness for Mac OS X apps, it looked like SPICE needed a good update. With ever-increasing duties keeping him from delving into an entirely new API, Mario needed an experienced Cocoa developer, preferably with a background in biology.

Instead, he got me.

The goal was to bring SPICE over to the Cocoa world – to modernize it and maybe add a few new features. As any developer would expect, there was plenty of scope creep. It took over two years to meet the extra demands placed on the new version (bigger data sets, more options, etc.).

SPICE 5.0 (the version I wrote) gained the ability to handle much larger data sets (while being more efficient with memory) by virtue of using a hash table instead of a sparse matrix. It also gained more formatting features and an easier-to-use data view control panel.
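The general idea behind keying values in a hash table can be sketched as follows. This is only an illustration of the technique, with invented sizes and values; it says nothing about how either SPICE version actually laid out its data, and `lookup` is a hypothetical helper.

```python
from itertools import product

subjects = [f"S{i}" for i in range(50)]
time_points = range(10)
# Every +/- combination of 5 markers: 2**5 = 32 combinations.
marker_combos = list(product([True, False], repeat=5))

# A fully allocated grid needs one slot per possible combination, measured or not.
grid_slots = len(subjects) * len(time_points) * len(marker_combos)

# A hash table (a Python dict here) holds an entry only for what was actually measured.
table = {}
table[("S7", 3, (True, False, False, True, False))] = 0.026
table[("S7", 4, (True, False, False, True, False))] = 0.031


def lookup(table, subject, time_point, combo):
    # Returns None for combinations that were never measured (an assumption of this sketch).
    return table.get((subject, time_point, combo))


print(grid_slots, len(table))
```

The memory cost tracks the number of real measurements rather than the number of possible categories, and a lookup by category stays a single dictionary access.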

But it was pretty slow for a variety of reasons. Even slower than what we affectionately call “Old Spice” (cue the whistled Old Spice tune). It also mirrored the “saved settings” mechanism of SPICE 4, where a particular configuration (a “data view” in proper terminology) was saved from the UI or loaded back into the UI.

I’m not a biologist. I’m not a scientist. I don’t even have a Computer Science degree. It took a lot of effort to build version 5.0 and I had no problem building it to “mirror” version 4 as much as possible. Once it was done and I saw what I had made, I realized I understood the problem very well by that point and I began to have some ideas of my own. SPICE 5.1 was very much on my mind.

I think I’ll end this post here as it answers the question of what I’m working on for Uncle Sam. In short, it’s an application that helps researchers identify good vaccine candidates for things like HIV and H1N1.

Up Next

In my next post, I’ll focus on the development aspect of version 5.1 for you developer types out there.

Disclaimer

Opinions and viewpoints expressed in this article are my own. I do not speak for the US Government or Lockheed Martin, MSD, Inc.

 

Today I was proud to stand with my colleagues and receive the 2009 National Institute of Allergy & Infectious Diseases Chief Information Officer Award for our hard work on the Papillomavirus Episteme (PaVE).

PaVE is a public-facing research tool for biologists researching the Papillomaviridae family of viruses. I was (and continue to be) the usability analyst and UI designer for the project. The team includes Yasmin Mohomoud, Vivek Gopalan, Sandya Bandaru, Qina Tan, Jason Barnett, Yongjian (Jason) Guo, and Leo (Li) Lu, as well as myself. Congratulations, all!

My 2009 NIAID CIO Award for My Work on PaVE

C. Montgomery Burns Voice: Ehhhxcellence.

 

This past Thursday, my group had its first annual “Bioinformatics Festival”.

We had some informational talks, software demonstrations, and a “Genius Booth” (yes, we’re mostly Apple fans in our group and we all use Macs, as do many of the researchers within NIAID). It was intended as a “who-we-are-and-what-we-can-do-for-you” pitch and it went rather well.

Into the Fray

My role was (supposed to be) minimal, merely being available the last two hours of the day to talk about and demonstrate SPICE 5 to those who are interested. Whether fortunate or not for my employers, I felt a bit more extroverted than usual that day, and I felt bad for having been held up by traffic on I-270 (and subsequently missing a speech by one of the senior management within NIAID). To make up for it, I spent much of the day trying to help ensnare potential collaborators (or, at the very least, free word-of-mouth advertising for our group). Within the first five minutes, I “enlightened” a random passer-by about our general existence and willingness to serve. I felt accomplished.

For being such a niche tool, however, SPICE did quite well, garnering more attention than I expected. A number of people had some basic questions about it and wanted to learn more. It was clear, however, that some were flow cytometry technicians who operated the instrument but had not yet taken a look at the big picture (i.e., what the data they were collecting would be used for). Some assumed it was a replacement for FlowJo and others thought it was a means to get the data into Excel. I found that last bit both funny and frustrating.

Let me explain that. The “big picture” (from a non-scientist’s perspective and vocabulary) is to test the efficacy of a proposed vaccine, for example. This involves groups being given different vaccination methods (or placebo) and having blood samples collected at set intervals to measure their response. Flow cytometry is how this response is measured. The blood (or tissue) samples are then run through the instrument in order to determine the type of cell and/or the level of cell function. Put simply – probably too simply to be strictly correct – in this case the researchers are measuring the body’s immune response based on how strongly a certain cell is prepared to carry out a certain immune-related function. After this data is processed, you’re left with a large number of individual categorical measurements. For example, “0.026% of the cells were CD8 positive, IL2 negative, from patient 23, three months after the vaccination, via Dryvax method”.

The next step is to analyze this information. You may wish to average all of the patients’ responses to the Dryvax method, overlaid by time point. You may then wish to turn certain functions on or off (such as viewing only CD8-positive measurements, or IL2-negative measurements), or turn the whole analysis on its head and look at the data in a different way. It is this live querying, exploration, and visualization of the data set that SPICE is meant for.
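As a rough sketch of that kind of query, the snippet below averages patients' responses to one method per time point, with marker "toggles" deciding which measurements count. The field names, the `matches` and `average_by_month` helpers, and all of the numbers are invented for the illustration; this is not SPICE's code.

```python
from collections import defaultdict

# Made-up measurements in the spirit of the example above.
measurements = [
    {"patient": 23, "month": 3, "method": "Dryvax", "markers": {"CD8": True, "IL2": False}, "percent": 0.5},
    {"patient": 24, "month": 3, "method": "Dryvax", "markers": {"CD8": True, "IL2": False}, "percent": 1.5},
    {"patient": 23, "month": 6, "method": "Dryvax", "markers": {"CD8": True, "IL2": True}, "percent": 2.0},
    {"patient": 23, "month": 6, "method": "Dryvax", "markers": {"CD8": False, "IL2": False}, "percent": 4.0},
]


def matches(markers, toggles):
    # A measurement passes if every toggled marker has the requested state.
    return all(markers.get(name) == state for name, state in toggles.items())


def average_by_month(measurements, method, toggles):
    """Average percent per time point for one method, honoring the marker toggles."""
    buckets = defaultdict(list)
    for m in measurements:
        if m["method"] == method and matches(m["markers"], toggles):
            buckets[m["month"]].append(m["percent"])
    return {month: sum(values) / len(values) for month, values in sorted(buckets.items())}


cd8_positive = average_by_month(measurements, "Dryvax", {"CD8": True})
il2_negative = average_by_month(measurements, "Dryvax", {"IL2": False})

print(cd8_positive)
print(il2_negative)
```

Flipping a toggle and re-running the query is cheap, which is what makes exploring the data live feel so different from rebuilding a spreadsheet.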

It is difficult (and in many ways impossible) to produce a sufficiently-flexible Excel spreadsheet for any kind of query. It is tedious and time-consuming to tailor a new, information-rich spreadsheet (complete with graphs) for each scenario on a case-by-case basis. In addition, there are some sorts of visual annotation that are specifically useful to the analysis of vaccination trial data but aren’t part of standard charts and graphs. This is where SPICE … well … excels.

All that aside, the “festival” was exhausting (I’m sure much more so for the guy who arranged it all – poor Jason) but it looks to have been a huge success. I enjoyed showing off my hard work and I’m sure my colleagues enjoyed showing theirs off as well. Go team.

Vivek and I, Getting Some Work Done

Disclaimer: I do not speak for NIAID or the US government. This is an unofficial, personal post about an event in my professional life. Any views expressed herein are my own opinion and do not necessarily reflect those of my employer or NIAID.

Photos by Leo Lu

© 2011 Joshua Nozzi. Joshua Nozzi is a Cocoa developer for hire.