Tag Archives: visualization

electronics

Feather HUZZAH Temperature Monitor

I recently visited the cabin, and it was cold. Like, excessively cold. Like, 37 degrees, which is perilously close to pipes-freezing cold. The thermostat shouldn’t allow that to happen. I hadn’t been there for more than a month, so I don’t know how long there had been a problem, but clearly, the thermostat or furnace wasn’t working. I went into the crawlspace under the house, pulled some panels off the furnace, and did a bit of troubleshooting on my own. Then I did a bit of troubleshooting with an HVAC guy on the phone. Eventually, we determined that something was, indeed, broken. The HVAC guy came out, replaced the controller board on the furnace, and I had heat again.

At the office, I build and maintain complicated software systems. Any sufficiently complicated system is going to have unpredictable failure modes. I accept that I can’t avoid all possible failure modes, but once I recognize a critical failure class, I build monitors to alert me to any failure in that class. It’s what I do in software, so it makes sense to do it in hardware as well. I don’t know all the failure modes of the heating system in the cabin, but failure of the heating system is certainly a failure class that could have very bad (as in, expensive) consequences.

I recently became aware of Adafruit’s new Arduino-compatible line of development boards, Feather. The Feather HUZZAH is particularly interesting, as it has built-in WiFi (based on the ESP8266 chipset), and costs only $16. With a Feather HUZZAH and a temperature sensor, like the MCP9808 I2C breakout, I could put together an inexpensive monitor. I happened to have a spare, small I2C OLED display that I could add to the mix for a bit of feedback.

Components

The code to initialize and control the temperature sensor and OLED is short and easy. The loop() portion of the sketch reads the temperature and puts it on the display. If 15 minutes have passed since the last time data was sent to the server, it sends the temperature to the server and resets the timer variable. Finally, it shuts down the temp sensor and sleeps for two seconds. It looks like this:

void loop() {
  float f = tempsensor.readTempF();

  display.clearDisplay();
  display.setCursor(0,0);
  display.print(f, 1);
  display.print('F');
  display.display();

  if (millis() - send_timer >= 15UL * 60UL * 1000UL) {  // 15 minutes, computed as unsigned long
    WiFiClient client;
    if (!client.connect(host, httpPort)) {
      Serial.println("connection failed");
    }
    else {
      client.print(String("GET ") + url +
        "?code=" + mac + "&tval=" + f + " HTTP/1.1\r\n" +
        "Host: " + host + "\r\n" + 
        "Connection: close\r\n\r\n");
      send_timer = millis();
    }
  }

  tempsensor.shutdown_wake(1);
  delay(2000);
  tempsensor.shutdown_wake(0);
}

The only problem I had was that when I tried uploading the sketch to the HUZZAH, I got this error:

warning: espcomm_sync failed
error: espcomm_open failed

A bit of research indicated that to upload a sketch, I’d need to connect Pin 0 to ground and reset the unit (either by power cycling it, or by hitting the reset button).

Pin 0 to Ground

With Pin 0 held to ground, the sketch uploaded. After connecting the temp sensor and OLED, the device seemed to measure the temperature accurately. I took some dimension measurements, and designed an enclosure in TinkerCAD. By the time I had soldered the connections, the two pieces of the enclosure had finished printing.

Enclosure

The last component for this project is a server-side piece that records the temperature. In the simplest case, I could set up a page that listens for incoming data, and sends me an email or text message when a temperature is posted below some threshold. But I also wanted to be able to see trends over time, so I needed to store readings in a database. Since I might want to have multiple temperature monitors running in several locations, I needed to record a source with each temperature reading. To normalize the database, I split the source and measurement into two tables, like this:

mysql> describe sources;
+-------+-------------+------+-----+---------+----------------+
| Field | Type        | Null | Key | Default | Extra          |
+-------+-------------+------+-----+---------+----------------+
| id    | int(11)     | NO   | PRI | NULL    | auto_increment |
| code  | varchar(64) | YES  | MUL | NULL    |                |
| name  | varchar(64) | YES  |     | NULL    |                |
+-------+-------------+------+-----+---------+----------------+
3 rows in set (0.00 sec)

mysql> describe temps;
+-------------+--------------+------+-----+-------------------+-----------------------------+
| Field       | Type         | Null | Key | Default           | Extra                       |
+-------------+--------------+------+-----+-------------------+-----------------------------+
| id          | int(11)      | NO   | PRI | NULL              | auto_increment              |
| source_id   | int(11)      | NO   | MUL | NULL              |                             |
| measured_at | timestamp    | NO   |     | CURRENT_TIMESTAMP | on update CURRENT_TIMESTAMP |
| temperature | decimal(4,1) | NO   |     | NULL              |                             |
+-------------+--------------+------+-----+-------------------+-----------------------------+
4 rows in set (0.00 sec)
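
The simplest case mentioned above, alerting when a posted temperature falls below some threshold, could be sketched roughly like this. This is only an illustration in JavaScript, not the server code I actually run: the name shouldAlert, the 45-degree threshold, and the choice to alert only when the temperature first crosses the threshold are all assumptions made for the example.

```javascript
// Hypothetical threshold in degrees F; the post does not state the real value.
var ALERT_BELOW_F = 45;

// Decide whether a newly posted reading should trigger an alert.
// previousF is the last reading from the same source, or null if there is none.
// Alerting only on the crossing (assumed design) avoids sending a message
// every 15 minutes while the cabin stays cold.
function shouldAlert(previousF, currentF, thresholdF) {
  var wasOk = previousF === null || previousF >= thresholdF;
  return wasOk && currentF < thresholdF;
}

console.log(shouldAlert(50, 37, ALERT_BELOW_F));  // crossing below the threshold
console.log(shouldAlert(40, 37, ALERT_BELOW_F));  // already below, no new alert
```

The crossing check needs the previous reading for the same source, which is one more reason to store readings rather than just react to them.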

After I had recorded temperature measurements for several days, I had enough data to start putting something on a graph. Rather than building a graphing mechanism from scratch, I repurposed some D3 code that I had written for my UltraSignup Visualizer (which was, at least in part, repurposed from my MMT Graph project). The D3 code pulls data (as JSON) from a PHP script that retrieves temperature measurements and timestamps from some specified source. It then draws the graph, and adds the (slightly smoothed) measurements.

// Smooth temperature readings over avglen measurements
var temperatures = [];
var cur_temp = 0;
var avglen = 8;
var i = 0;
// Seed the running average
while (i < (avglen - 1) && i < results.length) {
  cur_temp += (1.0 * results[i].t);
  i++;
}
// Populate the running average (cur_temp) as a FIFO list of avglen length
while (i < results.length) {
  cur_temp += (1.0 * results[i].t);
  temperatures[temperatures.length] = {x: results[i].d, y: (cur_temp / avglen)};
  i++;
  cur_temp -= (1.0 * results[i - avglen].t);
}

// Create the SVG line function
var line = d3.svg.line()
   .interpolate("basis")
   .x(function(d, i) { return xScale(new Date(d.x)); })
   .y(function(d) { return yScale(d.y); });

// Add the data to the graph using the line function defined above
svg.append("path")
   .attr("d", line(temperatures))
   .attr('class', 'rank_line')
   .style("fill", "transparent")
   .style("stroke", "rgba(71, 153, 31,.8)")
   .style("stroke-width", 1.25);
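
For anyone who wants to play with the smoothing outside of the page, the same running average can be packaged as a standalone function. This is a sketch under the assumption that results is an array of objects with a d (timestamp) and t (temperature) field, as in the snippet above; the function name smooth is made up for the example.

```javascript
// Running average over the last avglen readings.
// results: array of {d: timestamp, t: temperature}
// Returns an array of {x: timestamp, y: averaged temperature}, with one
// point for each reading that has a full window of avglen readings behind it.
function smooth(results, avglen) {
  var out = [];
  var sum = 0;
  for (var i = 0; i < results.length; i++) {
    sum += Number(results[i].t);                            // newest reading enters the window
    if (i >= avglen) sum -= Number(results[i - avglen].t);  // oldest reading leaves it
    if (i >= avglen - 1) out.push({x: results[i].d, y: sum / avglen});
  }
  return out;
}

var sample = [{d: 1, t: "1"}, {d: 2, t: "2"}, {d: 3, t: "3"}, {d: 4, t: "4"}, {d: 5, t: "5"}];
console.log(smooth(sample, 2));
```

Keeping a running sum means each new point costs one addition and one subtraction, no matter how large the window is.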

Temperature Graph

Now that everything works, I’d like to make a few more of these devices. The main costs are $15.95 for the Feather HUZZAH, $4.95 for the temperature sensor, $17.50 for the OLED display, a few dollars for a micro-USB cable and power supply, and some cents for a few inches of wire and a few grams of PLA (for the 3D printed enclosure). For the cost of the device, the display is disproportionately expensive. Once the device is running, the purpose is to record temperature remotely. If I replace the OLED with a single NeoPixel (which’ll run about $1) that flashes some color code to indicate status, I don’t get the onboard temperature readout, but I DO get the entire device for around $23 (plus the micro-USB cable and power supply). So the next iteration will replace the OLED with a NeoPixel. Stay tuned.

programming

UltraSignup Visualizer


Instructions

  • In the text box at the top of the graph, enter the full name of a runner whose results can be found on UltraSignup, then hit enter.
  • The points on the graph represent individual race results for the given runner. Move your mouse over a point to see details of that race.
  • The line represents the evolution of the runner’s UltraSignup rank.
  • Timed events (eg, 12-hour races, 24-hour races) appear as empty circles. It seems that as of mid-October, 2014, timed events are included in the ranking. However, it is not clear to me if that change is retroactive, and in some circumstances, I cannot get my calculation of the ranking to line up with their calculation of the ranking. So if you have a large number of timed events in your history, the line I’ve calculated might be e’er so slightly off. The ranking reported below the graph is the official number, provided by UltraSignup.

Background

[Update: The friendly folks at UltraSignup came across this, and they liked it. I worked with them to get it integrated into the official runner results page. So now you can click the “History” link just below a runner’s overall score on UltraSignup and see the plot on the results page. Though if you like the spanky transitions between runners, you still need to come here.]

In the world of ultrarunning, it seems that the ranking calculated by UltraSignup has become the de facto standard for ranking runners. I think that part of the reason for its acceptance is its simplicity. A runner’s rank in a single race is just the ratio of the winner’s finish time to the runner’s finish time. So if you win a race, you get a 100%; if you take twice as long as the winner, you get a 50%. The overall ranking is a single number that represents an average of all of a given runner’s race rankings. If you were to look up my results on UltraSignup, you would see that as of this writing, my 10+ years of racing ultras have been boiled down to a ranking of 88.43% over 48 races.
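
As a rough illustration of that arithmetic, here is a small sketch. It assumes the overall number is a plain, unweighted mean of the per-race ranks, which is my reading of “an average” and not something UltraSignup has confirmed; the function names are invented for the example.

```javascript
// Rank for a single race: winner's time over runner's time, as a percentage.
function raceRank(winnerSeconds, runnerSeconds) {
  return 100 * winnerSeconds / runnerSeconds;
}

// Overall rank, ASSUMING a simple unweighted mean of the race ranks.
// Returns null when there are no races to average.
function overallRank(raceRanks) {
  if (raceRanks.length === 0) return null;
  var total = raceRanks.reduce(function (a, b) { return a + b; }, 0);
  return total / raceRanks.length;
}

// A win, plus a race that took twice as long as the winner
var ranks = [raceRank(3600, 3600), raceRank(3600, 7200)];
console.log(ranks, overallRank(ranks));
```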

Of course, with simplicity comes inflexibility. What that number doesn’t capture is change over time. By summing up my results as a single number, it’s hard to see how my last few years of Lyme-impaired running have affected my rank, or how my (hoped-for) return to form will affect it. I was curious to see how runners progress over time, and how it affects the UltraSignup rank. In looking at the details of how UltraSignup delivers their rank pages, I noticed that the results come as JSON strings. Therefore, I realized, I wouldn’t even have to do any parsing of irregular data. I could just pull the JSON, and use my handy D3 skillz to put the results in a scatter plot.

I won’t go into great depth about implementation details. If you happen to be interested, you can go to the source. A passing familiarity with D3 would be helpful, but familiarity with only vanilla Javascript should allow you to get the gist.

Oh, and be aware that since this pulls data from UltraSignup, it’s entirely possible that it will stop working someday, either because they change the way they deliver data, or because they don’t like third parties creating mashups with their data. Also, this doesn’t work on Internet Explorer 8, or earlier. Sorry ’bout that!

programming

The Obsessing Over The Splits

“There’s one more piece,” I explained to Martha, “that you have to master.” The previous fall, she had developed a fibroma in her foot that curtailed her running. Hoping to keep her active (ie, non-grumpy), I dragged her to the pool. She never claimed to enjoy swimming, but on Monday and Wednesday nights, she would make sure I was planning on swimming the following morning. Even if she felt like it was a constant struggle, in a few months, she had improved significantly (ie, not nearly as much gasping and clinging to the side of the pool as when she started).

In the spring, she surprised me with her keenness to spend time on a bike. At first, it was mountain biking in West Virginia. Then she got a BikeShare membership so we could ride in Rock Creek Park on the weekends, when they close Beach Drive to traffic. Then she started talking about getting her own bike. After years of referring to bikes as, “The Vehicle Of Death,” I wasn’t sure what to make of it. But I was happy to go along with it. Eventually, I casually mentioned that, what, with all the swimming and biking, she might as well sign up for a triathlon. And much to my surprise, she was game!

I hadn’t raced a tri since 2008, so I was looking forward to a return to the sport. I picked Luray Triathlon (international distance: 1500 meter lake swim, 40km bike, 10km run) in August as a target race, and we set about training. Well, there really wasn’t so much “training” in a specific sense. I mean, we’d go to the pool once or twice a week, we’d do 40-50 mile bike rides (far longer and hillier than the bike portion of the race) pretty regularly, and running is our bread and butter.

Long story short, she had a great race, despite coming out of the water pretty close to the tail end of the field. She tells the full story on her blog, so I won’t restate it all. But after the race, there was one last lesson of triathlon that she needed to learn — one more piece to master.

“Part of the triathlon experience is obsessing over the results.” In a running race, you might have intermediate splits, but after looking at the results, all you can really say is, “I gotta run faster.” Or maybe, “Look at that positive split! I gotta not race like a friggin’ moron!” But in triathlon, you get your finish time, but also times for the swim, bike, run, and two transitions. So you can say things like, “My swim, bike, and run were awful, and my first transition was slow as dirt… But I ROCKED my second transition!” Yes, obsessing over results, and imagining how much more awesome you would be if you could only swim faster is a grand part of the triathlon tradition.

Looking at Martha’s splits, it’s clear that she’s a weak swimmer (4th percentile of the race), a fair cyclist, and a standout runner (10th overall, including elite men). This seems like a time for some visualizations! The first step was to put the results into a CSV file, and load it into R. I wrote a little function to convert the times to total seconds, so everything could be compared numerically.

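getTime <- function(time) {
  sec <- 0
  if ('' != time) {
    # Split "H:MM:SS" (or "MM:SS") into its integer fields
    t <- as.integer(strsplit(as.character(time), ':')[[1]])
    # Fold the fields into seconds; this also handles a single field
    for (field in t) {
      sec <- sec * 60 + field
    }
  }
  sec
}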

And I used that in a function that compiles the splits into a vector.

getSplits <- function(results) {
  splits <- c()
  for (i in seq_along(results$TotalTime)) {
    swim <- getTime(results$Swim[i])
    t1 <- getTime(results$T1[i])
    bike <- getTime(results$Bike[i])
    t2 <- getTime(results$T2[i])
    run <- getTime(results$Run[i])
    penalty <- getTime(results$Penalty[i])
    total <- getTime(results$TotalTime[i])

    if (0 == t1) t1 <- 180 # Default of 3m if missing T1
    if (0 == t2) t2 <- 120 # Default of 2m if missing T2

    # If missing a split, figure it out from total time
    known <- swim + t1 + bike + t2 + run
    if (0 == swim) swim <- total - known
    else if (0 == bike) bike <- total - known
    else if (0 == run) run <- total - known
    
    if (swim > 0 && bike > 0 && run > 0) { # Exclude results missing two splits
      splits <- c(splits, swim, t1, bike, t2, run, penalty)
    }
  }
  splits
}

From there, I could produce a graph showing color-coded splits in the order of finish for the race.

splits <- getSplits(results)

barplot(matrix(splits, nrow=6), border=NA, space=0, axes=FALSE,
        col=c('red', 'black', 'green', 'black', 'blue', 'black'))

# Draw the Y-axis
axis.at <- seq(0, 14400, 1800)
axis.labels <- c('0:00', '0:30', '1:00', '1:30', '2:00',
                 '2:30', '3:00', '3:30', '4:00')
axis(2, at=axis.at, labels=axis.labels)

Luray Intl. Distance Tri, Overall

Each vertical, multi-colored bar represents a racer. The red is the swim split, green is the bike, and blue is the run (with black in between for transitions, and at the end for penalties). It becomes clear from this graph that Martha was one of the last people out of the water (notice her tall red bar), then had a fair bike ride, but didn’t make up much time there. It wasn’t until the run that she started to make up time. That’s what moved her from the tail end of the field to the top half.

But part of the beauty of obsessing over triathlon results is that there are so many ways to slice and dice the data. It seems only fair that we should look at the sex-segregated results, and of course, triathletes are very into age group results. So we can limit the sets of data to our individual sexes and age groups.

Luray Results

So that’s one way to look at the data. However, that only provided a fuzzy notion of how each of us did in the three sports. For example, my swim time is similar to the swim times of many people who finished with similar overall times. It’s difficult to tell where I stand relative to the entire field.

Perhaps a histogram is more appropriate. For example, I could use my getTime function to create a list of the finish times for everyone.

times <- sapply(results$TotalTime, getTime)

Then it’s trivial to draw a histogram of finish times.

hist(times, axes=FALSE, ylab='Frequency of Finishers', xlab='Finish Time',
     breaks=20, col='black', border='white', main='Histogram of Finishers')

To draw the X-axis, I created a function that translates a number of seconds to a time string with the H:MM format.

# Make a function to print the time as H:MM
formatTime <- function(sec) {
  paste(as.integer(sec / 3600),  # Hours
        sprintf('%02d', as.integer((sec %% 3600) / 60)), # Minutes
        sep=':')
}

# Specify where the tick marks should be drawn, and how
# they should be labeled
axis.at <- seq(min(times), max(times),
               as.integer((max(times) - min(times)) / 10))
axis.labels <- sapply(axis.at, formatTime)

# Draw the X-axis
axis(1, at=axis.at, labels=axis.labels)

That gives me this:

Luray 2014 International Distance Results, Histogram

I’ve also inserted an ‘A’ below the results to notate where I finished, and an ‘M’ to notate where Martha finished. However, as I’ve indicated, part of the obsessing over the splits involves slicing the data as many ways as possible. I wanted to see this sort of histogram for each of the sports overall, by sex, and by age group. That’s a nine-way breakdown, for both me and Martha. Fortunately, since the data is all in R, and since I have the code all ready, it’s fairly trivial to make the histograms. They need to be viewed a bit larger than the width of this column, so you can click on the images below to see more detail. Here’s mine:

Luray Histogram, Aaron

Looking at my results, it is clear that I’m a stronger swimmer than cyclist, but it’s really the run that saves my race. Here’s Martha’s:

Luray Histogram, Martha

Notice that in her age group, she had the slowest swim, and the fastest run. She clearly gets stronger as the race goes on.

But there is still (at least) one more way to look at the results. Not only do we want to know how we perform in each of the disciplines; we also want to know how we progress through the race. That is, how do our positions change from the swim to the bike to the run to the finish? I started off with a function similar to “getSplits” above. I called this totalSplits. For a given racer, this produced a vector of the cumulative time after six points in the race: swim, t1, bike, t2, run, penalties. I could use those vectors to build a matrix, which I could then use to build a graph of how race positions changed from the swim to the bike to the finish.

all.totals <- t(matrix(apply(results, 1, totalSplits), nrow=6))
# Exclude results that are incomplete
all.totals <- all.totals[which(all.totals[,6] != 0),]
cnt <- length(all.totals[,1])

# Map the swim, bike, and finish times onto a range of 0 to 1, with
# 1 being the fastest, and 0 being the slowest.
doScale <- function(points) {
  1 - ((points - min(points)) / (max(points) - min(points)))
}
scaled.swim <- doScale(all.totals[,1])
scaled.bike <- doScale(all.totals[,3])
scaled.finish <- doScale(all.totals[,6])

# Plot points for swim, bike and finish places
plot(c(rep(1, cnt), rep(2, cnt), rep(3, cnt)),
     c(scaled.swim, scaled.bike, scaled.finish),
     pch='.', axes=FALSE, xlab='', ylab='',
     col=c(rep('red', cnt), rep('green', cnt), rep('blue', cnt)))

# Add the lines that correspond to individual racers
for (i in 1:cnt) {
  lines(c(1,2,3),
        c(scaled.swim[i], scaled.bike[i], scaled.finish[i]),
        col='#00000022')
}

# Add some axes
axis(1, at=c(1, 2, 3), labels=c('Swim', 'Bike', 'Finish'))
axis(2, at=c(0, 1), labels=c('Last', 'First'))

From that, I get something that looks like this:

Luray Results, Places

It looks like a crumpled piece of paper, so perhaps it needs some explanation. At the left is the placing for racers after the swim, from the fastest swimmer at the top to the slowest at the bottom. In the middle is the placing after the bike, and on the right is the placing at the finish. The first thing I notice is that there seems to be little correlation between placing after the swim and after the bike. The left side of the graph looks like a jumbled mess. The other thing I notice is that the top racers (note that prize money brought some pros to this race) are fantastic all-around. To pick out my results and Martha’s results, I highlighted them in aqua and yellow, respectively.

And for the sake of completeness, we need to break that down by sex and age group.

Luray Placing by Sex and AG

So yes, I suppose the moral of the story is that no one can obsess over results like a triathlete can obsess over results.

And in case anyone wants to play with the results, click the link to get the CSV of the results for the 2014 Luray International Distance Triathlon.

programming

Race Progress Visualization Using D3

[The project referred to in this post can be found at http://vestigial.org/MMT/ ]

I’ve been looking for some better tools to produce interactive, data-driven, visually appealing web content. In the past couple of years, I’ve become enamored with R for analysis and visualization, but the graphic results are static. (Sure, there are tricks to create animations, but I’m not looking for workarounds.) I occasionally use Google Charts when I need to put together a quick visualization, but they don’t provide quite the level of flexibility I’d like. I started looking at either working directly with SVG or Canvas DOM elements, or using a Javascript SVG library that would allow me to avoid the low-level details.

The most interesting possibility was the D3 framework. D3 — for Data-Driven Documents — is an entire framework for DOM manipulation in data-driven sites. Browsing through the examples on the D3 site, I recognized several memorable visualizations that have appeared on one of my favorite blogs through the years, Flowing Data. It is possible to use D3 for SVG construction and manipulation while non-data-driven portions of the site are handled by, eg, jQuery or standard Javascript. But as long as you’re already using the bandwidth to load the framework, you might as well drop other frameworks, and use the tools that D3 provides.

I was keen to get some experience with D3. When learning a new technology, I prefer to dive straight in — come up with a short, but non-trivial project that I can build. In this case, I came up with a project that melds technology, data visualization, and ultrarunning. The Massanutten Mountain 100 Mile Trail Run (or MMT) is in a few weeks. In such a long race, runners and crews like to have some idea when they’ll arrive at intermediate points along the course if they’re aiming for some given finish time. Conversely, knowing when they’ve arrived at points along the course can help to predict what sort of finish time to expect. While I’m not the first person to provide a visualization, or some tool to correlate aid station splits with finish times, it’s fun to put together something that’s visually appealing and useful.

Showing data from 2011 and 2013 for finishers who finished between 20:59 and 25:55, race time. The horizontal axis is time and the vertical axis is distance, labeled on the left with mileage at each aid station, and on the right with the aid station name. Each diagonal line, or “track”, represents a single racer. Intermediate times on the graph show first and last racer times of arrival at each aid station (for racers in the result set). Tufte would be proud.

There are several interactive components that I think are noteworthy. First, I provide on-demand data loading. When the page loads, none of the race results is loaded. When a year is selected, the page checks whether the data have been downloaded. If not, it fires an AJAX request, and saves the data so the results can be turned on and off.
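
The load-once-then-reuse idea can be sketched in a few lines. This is not the code from the project; makeLoader and fetchYear are hypothetical names, and fetchYear stands in for whatever actually performs the AJAX request.

```javascript
// Wrap a fetching function so that each year's results are requested at
// most once. fetchYear(year, callback) is a hypothetical function that
// performs the request and passes the parsed data to callback.
function makeLoader(fetchYear) {
  var cache = {};  // year -> data already downloaded
  return function load(year, callback) {
    if (Object.prototype.hasOwnProperty.call(cache, year)) {
      callback(cache[year]);  // already downloaded, no request needed
      return;
    }
    fetchYear(year, function (data) {
      cache[year] = data;     // save it so the year can be toggled freely
      callback(data);
    });
  };
}

// Usage with a stand-in fetcher that just counts how often it is called
var requests = 0;
var load = makeLoader(function (year, cb) { requests++; cb({year: year, results: []}); });
load(2011, function () {});
load(2011, function () {});
console.log(requests);
```

Toggling a year off then only needs to hide its tracks; the data stays in the cache for the next time it is switched on.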

The page also provides sliders to limit the result set based on finish time. Each limiter consists of three components: a triangular slider widget (represented by an SVG path element), a time display (represented by an SVG text element), and a vertical guide line (represented by an SVG line element). When the widget is slid, all three elements should move in unison, and the time display should update with the time value at the current point. As a bonus, the vertical guide gets brighter. So I needed to be able to address each element individually, but move them in unison. To build that, first I needed to define the shape for my widget (note that in SVG coordinates, the top left is [0,0]):

var limpolygon = [{x: 0, y: 0}, {x: 10, y: 0}, {x: 5, y: 10}, {x: 0, y: 0}];

I also need to define a function to tell D3 how to interpret the data above. I can use d3.svg.line() to return a function for this purpose. Since I’ve built the object with straightforward X and Y coordinates, I just need to build a simple function based on those values:

var limline = d3.svg.line()
  .interpolate("linear")
  .x(function(d) { return d.x; })
  .y(function(d) { return d.y; });

Finally, I put the group together. I define a group element (“g”), and append the widget, which I construct in place. I then use the D3 selector to reselect the group, and add the line, then the text:

svg.append("g")  // Create the group, append it to the svg object
  .attr("id", "lim1")
  .attr("transform", "translate("+lim1x+","+limy+")")  // Put it into position
  .append("path")  // Create "path" element for widget, and append it to group
    .attr("id", "lim1_point")
    .attr("d", limline(limpolygon))  // A path has a "d" attribute which gives
                                     // instructions for drawing. Our limline()
                                     // translates raw data into path data
    .attr("fill", "white")
    .on("mousedown", function() {
      capt = "lim1";
      d3.select("#lim1_line").style("stroke-opacity", "1");
    });

d3.select("#lim1").append("svg:line")   // Create line element, append to group
  .attr("x1", limhalfw)
  .attr("y1", ex_pad.top)
  .attr("x2", limhalfw)
  .attr("y2", height - ex_pad.bottom)
  .attr("id", "lim1_line");

d3.select("#lim1").append("svg:text")   // Create text element, append to group
  .attr("id", "lim1_time")
  .text("00:00")
  .style("text-anchor", "end")
  .attr("transform", "translate(-2)");  // Push it 2px to left, for a nice gap

In my view, the coolest trick is making the data respond to the sliders. Whereas showing or hiding the individual years relies on a small number (3) of discrete values, I need to show or hide individual race results based on what is essentially a continuous scale. This involves several steps. First, when adding each track to the graph, I need to attach the finish time to it. Fortunately, HTML5 provides the ability to specify arbitrary data attributes with the data-* construct.

lineset.enter()
  .append("path")
  .attr("data-finish", function(d) {  // Add the data-finish attribute
    return d.finish;
   })
  .style("stroke-opacity", function(d) {
    if (d.finish > finScale(lim2x) || d.finish < finScale(lim1x)) return "0";
    else return ".3";
   })
  .datum(function(d) { return d.splits; })
  .attr("class", "rtrack line " + iden)  // Classes to use later in selectors
  .attr("d", line);

Above is the code to add the tracks. While it might not make much sense if you are not familiar with D3, the key point is the third line. The object has a data object, d, applied to it, and on that line, we set the data-finish attribute to the value of d.finish. (Directly below that, we set the opacity of the line to 0 (making it invisible) if it falls outside of our specified range, or .3 if it is inside the range. But we’re getting ahead of ourselves.)

The next thing we need is a way to translate the location of a slider into a finish time. D3 provides “scales” for just such a purpose. Usually, D3 scales are used to translate some real world value to a pixel position. In this case, we want to do the reverse. I want to build a function that will translate an input domain of a pixel position into the output range of a race time, which in this case is between 0 and 36 hours.

var finScale = d3.scale.linear()
  .domain([lim1x, lim2x])
  .range([0, 36]);

(An astute reader who is familiar with D3 might note that somewhere else, I must have defined a scale to translate from times to pixel values. In that case, someone might wonder why I don’t just use linear.invert() to translate a range value into its corresponding domain value. The answer is that the scale that translates from time to position uses a domain defined by the time of day as a date object, whereas in this case, I want to translate between position and a floating point number representing the finish time in hours (with minutes represented in the fractional portion of the number). Hence the need to define a new scale.)

In this case, lim1x is the initial pixel position of the lower limit slider, and lim2x is the pixel position of the upper limit slider. That produces a function that can be called as finScale(px_pos) to return a corresponding race time. I can then use that in the function that is called when a slider is released.

function updateRange() {
  var fin1 = finScale(lim1x);  // Translate pixel positions to finish times
  var fin2 = finScale(lim2x);
  d3.selectAll(".rtrack").transition().duration(500).style("stroke-opacity", function(d) {
    if (this.getAttribute("data-finish") > fin2 ||
        this.getAttribute("data-finish") < fin1) return "0";
    else return ".3";
  });

  updateAidStationTimes();
}

That function translates the current pixel positions of the sliders into race times (fin1 and fin2). Then it uses d3.selectAll to get every item with the class “rtrack” (which is every race line displayed on the graph), applies a 500ms transition time to the following step, then sets the stroke-opacity style based on a function that checks whether the custom attribute data-finish is in the range defined by the limiters. Finally, it calls updateAidStationTimes(), which I won’t explain in detail here, but it uses d3.extent() with a custom accessor function to find the first and last arrival time of racers in the result set at each aid station. (If you’re particularly interested, you can always dig it out of the source.) It then updates the times displayed on the graph, and moves them into the proper positions.
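
If the scale feels like magic, the arithmetic behind a linear scale is easy to write out by hand. The sketch below is plain JavaScript with no D3, and the pixel positions (100 and 820) are made-up values for illustration; linearScale and trackOpacity are hypothetical names, not functions from the project.

```javascript
// A hand-rolled stand-in for a linear scale: map x from [d0, d1] onto [r0, r1].
function linearScale(domain, range) {
  var d0 = domain[0], d1 = domain[1], r0 = range[0], r1 = range[1];
  return function (x) {
    return r0 + (x - d0) * (r1 - r0) / (d1 - d0);
  };
}

// Hypothetical slider positions in pixels, mapped onto 0 to 36 hours
var toHours = linearScale([100, 820], [0, 36]);

// The same in-range test that the opacity callback performs
function trackOpacity(finishHours, lowHours, highHours) {
  return (finishHours > highHours || finishHours < lowHours) ? "0" : ".3";
}

console.log(toHours(460));                                    // halfway along the axis
console.log(trackOpacity(22.5, toHours(400), toHours(600)));  // a finish inside the window
```

A slider halfway between the two starting positions maps to the 18-hour mark, and any track whose finish time falls outside the two slider values fades to zero opacity.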

I started the project on Saturday morning with no experience in D3 (or with SVG graphics), and I finished Sunday evening. I even had time to get out for a bike ride, a run, and a trip to the library to get a movie (which I also watched over the weekend). In the course of this project, I came to appreciate just how massive D3 is. I’m starting to get a feel for it, but this project just scratched the tip of the D3 iceberg (though I’m not sure one would really scratch an iceberg, the tip or otherwise).
