My article on solarracing.org has lately been available only via archive.org, so I decided to re-post the original English version of the solar racing strategy article here to keep it world-readable (chmod +r, sort of). The German counterpart can be found here: http://dc.georgruss.ch/2014/06/10/ein-data-scientist-an-der-world-solar-challenge-ein-datenblogbeitrag/
A Data Scientist’s Race Summary for the 2013 WSC
What’s a data scientist to do during a solar race, embedded in a team of engineers and drivers, you may think? Well, if normal telemetry (as used in today’s cars) is providing you with streams of data, a data scientist turns this into useful information. You’re essentially going from answering the question How fast are we driving? via How far can we go at that speed? to Where’s that going to rank our team in the end under different weather conditions for the next five days? This article will explain a few details of that job and may hopefully give you an idea of what all those data scientists do.
Things to Know — Race Preparation
Having been to Australia before, there was not much for me to prepare for in terms of wilderness fear, sun and weather conditions and language. I just love it over there, but it’s a long flight nevertheless each time. I’ll try to get an upgrade next time. I ended up checking the race track online, which is not too thrilling anyway, mostly just the Stuart Highway from north to south.
What I did check, though, was the elevation profile of the track. It just told me that we might be using some more energy until the middle of the race, while we’d be using less after that. And we’d end up at just about the same altitude. In absolute terms, the elevation would matter if we had controlled and fixed external conditions, but with fluctuating sun and wind around it just doesn’t really matter. There’s a rather steep incline in the first half that the organizers warn about in their roadbook, but during the race I didn’t even take notice of that in the heat of the moment. Some other teams had trouble there, having optimized their cars for straight runs and trying to take that slope on sheer momentum, which didn’t really work.
When it comes to photovoltaics, there are some more things to prepare for in theory. As we’re going a rather long way from north to south, the sun’s irradiation changes (going away from the equator during that time of year). And we’re also going a little east, so the daily sun schedule shifts slightly. In addition, there’s the daylight saving issue, which is handled by making the distinction between event time and civil time. So at least I needn’t worry about that. So, what would the daily irradiation look like?
Fortunately, there are formulae for that. You need a few parameters like the converter’s efficiency, the array’s maximum output power and the day of year. This, depending on the location along the track (which is just another way of giving a longitude/latitude position), yields a few curves, depicted in the following graphics. The full solar power under real conditions during the race varied slightly, but the curves were essentially like that. What’s not shown are the control stops: during those you can align your solar panel optimally with the sun, harvesting the maximum power possible at the particular time of day. We’ll see a real measurement curve later in this post.
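A minimal sketch of such a formula, in Python rather than the R used during the race. It is a crude clear-sky approximation and not the original calculation: it assumes a flat, horizontal array whose output scales with the sine of the sun’s elevation, uses Cooper’s approximation for the solar declination, works in local solar time and ignores atmospheric losses. The function names, the 1,100 W array rating, the 0.97 converter efficiency and the day of year (280, early October) are illustrative assumptions.

```python
import math

def solar_elevation_sine(latitude_deg, day_of_year, solar_hour):
    """Sine of the sun's elevation angle (negative below the horizon)."""
    # Cooper's approximation of the solar declination, in degrees.
    declination_deg = 23.45 * math.sin(
        math.radians(360.0 / 365.0 * (284 + day_of_year)))
    # Hour angle: 0 degrees at solar noon, 15 degrees per hour.
    hour_angle_deg = 15.0 * (solar_hour - 12.0)
    lat = math.radians(latitude_deg)
    dec = math.radians(declination_deg)
    ha = math.radians(hour_angle_deg)
    return (math.sin(lat) * math.sin(dec)
            + math.cos(lat) * math.cos(dec) * math.cos(ha))

def array_power_w(latitude_deg, day_of_year, solar_hour,
                  p_max_w=1100.0, converter_eff=0.97):
    """Rough output of a flat array in watts (assumed parameters)."""
    sine = solar_elevation_sine(latitude_deg, day_of_year, solar_hour)
    return p_max_w * converter_eff * max(0.0, sine)

# Approximate latitudes of the start and the finish (south is negative).
for name, lat in (("Darwin", -12.46), ("Adelaide", -34.93)):
    noon = array_power_w(lat, day_of_year=280, solar_hour=12.0)
    print(f"{name}: about {noon:.0f} W at solar noon")
```

Evaluating `array_power_w` over a grid of hours for several latitudes along the track gives one curve per location, which is the shape of the graphics described above.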
However, pure power-versus-time-of-day graphics are not really useful. What you want at the end of the day (pun intended) is the collected energy. So we sum up (integrate) the power curves above and arrive at the energy graph. You can see that the amount of collected energy goes south as we go likewise (Adelaide yields just 7,500 Wh, while it’s much higher in Darwin).
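The summing-up step can be sketched with the trapezoidal rule. The code is Python and purely illustrative: the half-sine power profile peaking at 1,000 W is a hypothetical stand-in for a real power curve, and `cumulative_energy_wh` is a name made up for this example.

```python
import math

def cumulative_energy_wh(hours, power_w):
    """Trapezoidal integration of power (W) over time (h), giving Wh."""
    energy = [0.0]
    for i in range(1, len(hours)):
        dt_h = hours[i] - hours[i - 1]
        mean_power = 0.5 * (power_w[i] + power_w[i - 1])
        energy.append(energy[-1] + mean_power * dt_h)
    return energy

# Hypothetical day: sun up from 0600h to 1800h, sampled every 15 minutes.
hours = [6.0 + 0.25 * i for i in range(49)]
power = [1000.0 * math.sin(math.pi * (h - 6.0) / 12.0) for h in hours]
energy = cumulative_energy_wh(hours, power)
print(f"Collected by sunset: {energy[-1]:.0f} Wh")
```

The last element of the returned list is the day’s total; the list as a whole is the energy graph.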
Things to Try — Theory Put into Practice
So far we’ve only talked about the energy intake, not about the car. We know that we’d collect about 8 kWh plus some extra control stop energy (due to perfectly aligned panels) per day. But how much is the car going to consume?
During testing, the whole setup of the car and some of its performance values can be obtained, in this case mostly offline, i.e. without the live-analysis pressure. With a new car, without extensive testing, without wind tunnel testing, you don’t really have an idea of how the car is going to perform. What you’d ideally want to know is the specific power consumption, i.e. how much mileage you get from your energy, except we do it metrically in watt-hours per kilometer (Wh/km). Scraping a few measurements from the data under non-optimal conditions gave me the following curve, showing a nice broad range of speeds at which the specific power consumption is rather constant at 15-16 Wh/km. Reminder: this is not the power-vs-speed graph; in power-vs-speed, the motor output increases monotonically with speed, though not fully linearly (a lot of people have misread that graph, which is why I’m explicitly mentioning it). That said, during the race the specific consumption varied strongly due to head- and tailwinds. In numbers: one day we’d sail along at 10 Wh/km, not even being able to spend all the energy we’d collect, and another day we’d battle heavy headwinds at 20 Wh/km, depleting the battery.
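To make the Wh/km figure concrete: at steady speed it is simply motor power divided by speed, so 1,200 W at 80 km/h corresponds to 15 Wh/km. From telemetry it can be estimated as energy spent divided by distance covered. The Python sketch below assumes samples of (motor power in W, odometer in km) at a fixed interval; the tuple layout and function name are assumptions for illustration, not the actual telemetry format.

```python
def specific_consumption_wh_per_km(samples, interval_s=4.0):
    """Energy per distance over a list of (motor_power_w, odometer_km)."""
    if len(samples) < 2:
        return None
    energy_wh = 0.0
    for (p_prev, _), (p_next, _) in zip(samples, samples[1:]):
        # Trapezoid between two consecutive samples, converted W*s to Wh.
        energy_wh += 0.5 * (p_prev + p_next) * interval_s / 3600.0
    distance_km = samples[-1][1] - samples[0][1]
    if distance_km <= 0.0:
        return None  # car standing still: Wh/km is undefined
    return energy_wh / distance_km

# Synthetic check: constant 1200 W at a steady 80 km/h.
km_per_sample = 80.0 * 4.0 / 3600.0
samples = [(1200.0, i * km_per_sample) for i in range(100)]
print(specific_consumption_wh_per_km(samples))
```

Computing this per speed bin rather than over the whole log would give the consumption-versus-speed curve described above.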
From Dawn till Dusk — During the Challenge
The technical setup for the data flow can be guessed: the car regularly sends data blocks transmitting most of the information its controller has (temperatures, voltages, currents, distance, etc.). You don’t do that too often because it consumes power; we did it every four seconds. Each block of data is appended to the data received so far and can then be analysed on the fly. In my case, I wrote all the analysis myself in R, repeating the calculation periodically (once or twice a minute) to obtain updated graphics. That’s also what I did on the first race day: six hours of coding R scripts to have all the analysis I needed for strategic reasoning and decision-making. There was not much to do on the other days, though, at least not the basic stuff; it was more about strategic thoughts, with some sleep deprivation to account for. With the weather conditions being as they were, it wasn’t really complicated; only the fifth race day was partly thrilling, more on that later.
As stated in the theoretical part, I knew about the solar energy curves during the day, i.e. how much energy can be collected in normal conditions. However, nothing beats hands-on practical experience. During the solar challenge, the daily race period starts at 0800h and ends at 1700h; before and after that, you’re free to align your solar array perfectly towards the sun and spray it with water for cooling. As mentioned, there are also control stops during the race where you can do the same as in the mornings and evenings. An exemplary day during the 2013 race looks as in the following graph in terms of solar irradiation: there’s a period from sunrise to 0800h where the optimal alignment of the panel with the sun gives you up to around 900 watts (80-85% of what you get at noon). Then the race starts and runs until the first control stop from about 1000-1030h, where optimal alignment yields higher solar power. After that, normal driving under the blistering Australian sun follows until the next control stop from 1310-1340h, followed by another driving period until 1700h. Then the car is stopped and the array is optimally aligned again, raising the power from around 200 W to 900 W, which then decreases until sunset. The effect of cooling the panels by spraying them with water was determined to be about 50-100 watts (around 5%).
You can’t yet make strategic decisions based on solar irradiation alone, but we’re getting there. If you integrate the energy flows into and out of the battery over time, you obtain a rather precise picture of the battery’s state of charge. That’s what you’d ideally like to have… Based on this, calculations during the race are typically just simple mental arithmetic with a few constraints. One constraint: you need to spend enough energy during the day such that the battery can absorb the extra energy for that evening _and_ for the next morning until the race starts. To put it into perspective: the battery holds around 5,000 Wh and a typical evening provides 1,000 Wh, with the next morning giving another 1,000-1,200 Wh. So the battery must be almost half empty at 1700h if you don’t want to waste solar energy. You’re not really wasting it, of course, but here comes the other important thought: a specific power consumption of 15 Wh/km over 3,000 km amounts to 45 kWh of energy in total, and as long as you haven’t collected that amount of energy, you won’t make it to the finish line. So, by wasting energy (because the battery can’t absorb it) you’re indirectly losing time or falling behind. This energy waste happened twice (day 3/4) and it was in a way just horrible to see the battery reduce its energy intake because it was simply full, just because of some miscalculation and really conservative driving and strategy inputs from the data analyst. Luckily it probably wouldn’t have changed much in the end.
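The bookkeeping behind these numbers can be sketched in Python. This is a simplified illustration, not the original R analysis: it assumes telemetry samples of (solar power, motor power) in watts every four seconds, ignores charge and discharge losses, and uses made-up function names. The capacity, evening and morning figures are the ones quoted above.

```python
def track_state_of_charge(start_wh, samples, capacity_wh=5000.0,
                          interval_s=4.0):
    """Integrate (solar_w, motor_w) samples into a state of charge.

    Returns (state_of_charge_wh, wasted_wh), where wasted_wh is the solar
    energy a full battery could not absorb.
    """
    soc_wh = start_wh
    wasted_wh = 0.0
    for solar_w, motor_w in samples:
        soc_wh += (solar_w - motor_w) * interval_s / 3600.0
        if soc_wh > capacity_wh:
            wasted_wh += soc_wh - capacity_wh
            soc_wh = capacity_wh
        soc_wh = max(soc_wh, 0.0)
    return soc_wh, wasted_wh

def max_charge_at_day_end_wh(capacity_wh=5000.0, evening_wh=1000.0,
                             morning_wh=1200.0):
    """Highest charge at 1700h that still leaves room for stationary charging."""
    return capacity_wh - evening_wh - morning_wh

def race_energy_kwh(wh_per_km=15.0, distance_km=3000.0):
    """Total energy needed to cover the race distance."""
    return wh_per_km * distance_km / 1000.0

print(max_charge_at_day_end_wh())  # room needed for evening plus morning
print(race_energy_kwh())           # 15 Wh/km over 3,000 km

# One hour parked at 900 W with a nearly full battery: most of it is lost.
soc, wasted = track_state_of_charge(4990.0, [(900.0, 0.0)] * 900)
print(soc, wasted)
```

With 1,200 Wh expected the next morning, the battery should hold no more than 2,800 Wh at 1700h, which is the "almost half empty" rule of thumb.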
As the above graph shows, the battery is just a rather small energy buffer and runs dry really rapidly if there’s no sun (day 5 afternoon). Small spikes upwards in between are typically the half-hour control stops.
As Thrilling as it Gets — The Fifth Race Day
The first four days were rather dull, just flowing along the Stuart Highway, harvesting energy and using it on the road. The first half of the fifth day also went rather normally, but the weather changed in the afternoon, with rain and heavy winds. No sun meant no incoming solar energy, and this dramatically showed that the battery is just a small energy buffer that depletes rapidly under those conditions. So with the graphs and the data flowing in I did manual calculations on distance and energies, all the time watching battery capacity and the rest of the statistics. It’s not my normal handwriting though, with all the jitter caused by the car’s movements.
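The core of such a distance-and-energy calculation fits in a few lines. The Python sketch below is an assumption about what it might look like, not a transcript of the handwritten notes: remaining range is the usable battery energy divided by the net consumption per kilometre, where any solar input is converted to Wh/km by dividing by the speed. All names and example numbers are hypothetical.

```python
import math

def remaining_range_km(usable_wh, wh_per_km, solar_w=0.0, speed_kmh=80.0):
    """Distance the car can still cover on the battery at a steady speed."""
    # Solar input offsets part of the consumption for every km driven.
    solar_wh_per_km = solar_w / speed_kmh
    net_wh_per_km = wh_per_km - solar_wh_per_km
    if net_wh_per_km <= 0.0:
        return math.inf  # the sun covers the whole consumption
    return usable_wh / net_wh_per_km

# Headwind and no sun: 1,500 Wh left at 20 Wh/km.
print(remaining_range_km(1500.0, 20.0))
# The same battery with 400 W of sun at 80 km/h.
print(remaining_range_km(1500.0, 20.0, solar_w=400.0, speed_kmh=80.0))
```

The first call shows how quickly the buffer runs out once the sun is gone; the second shows how much even a modest solar input stretches it.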
During the race, the most important information blocks are voltages, power inputs (solar) and outputs (motor), specific power consumption (Wh/km) and some temperatures. So the main graph display just gives me those values over the preceding 30-45 minutes, which makes trends and changes easy to see. The following is just an example: a cut-out of one race day afternoon to give a feeling of what it is like to look at incoming data all day.
Coming back to the fifth race day, the most important thing I watched at first was the remaining battery capacity. We had never done a full discharge before and it’s usually not recommended because it may permanently damage your battery. So after the control stop, bad weather accompanied us, along with a lot of traffic as we were getting closer to Adelaide now. See the battery discharge?
Because I could not rely on the theoretical battery capacity, we decided to keep watching the battery voltage instead. From my preliminary calculations long before the race I knew the discharge curves of the battery cells, and when the discharge slope turns steeper, you’re probably better off stopping the car, no matter where you are. So I kept watching the battery voltage while doing periodic pen-and-paper calculations. And indeed, around 1642h the battery voltage started to show downward spikes and the downward trend became slightly steeper. Around that time we decided to play it safe and stop the car. We had already overtaken the team in front of us (their battery had run low before ours did), were far enough ahead of them and could safely finish in fifth position the next morning after recharging enough that evening and the following morning.
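Watching for a steepening voltage slope can also be automated. The Python sketch below shows one possible way and is not what was done in the car, where the decision was made by eye and on paper: it fits a least-squares line to the most recent window of voltage samples and to the window before it, and raises a flag when the recent slope is clearly steeper. The window of 75 samples (five minutes at one sample every four seconds), the factor of 2 and the 120 V pack voltage in the example are hypothetical.

```python
def slope_v_per_min(times_s, volts):
    """Least-squares slope of voltage over time, in volts per minute."""
    n = len(times_s)
    mean_t = sum(times_s) / n
    mean_v = sum(volts) / n
    num = sum((t - mean_t) * (v - mean_v) for t, v in zip(times_s, volts))
    den = sum((t - mean_t) ** 2 for t in times_s)
    return 60.0 * num / den

def discharge_knee_detected(times_s, volts, window=75, factor=2.0):
    """True if the latest window falls at least `factor` times faster."""
    if len(volts) < 2 * window:
        return False  # not enough data for two full windows
    recent = slope_v_per_min(times_s[-window:], volts[-window:])
    previous = slope_v_per_min(times_s[-2 * window:-window],
                               volts[-2 * window:-window])
    return previous < 0.0 and recent < factor * previous

# Synthetic example: the voltage drop quadruples after five minutes.
times = [4.0 * i for i in range(150)]
knee = [120.0 - 0.0005 * t if i < 75
        else 120.0 - 0.0005 * times[75] - 0.002 * (t - times[75])
        for i, t in enumerate(times)]
print(discharge_knee_detected(times, knee))
```

Real voltage data is noisy and sags under load, so thresholds like these would need tuning against the cells’ actual discharge curves before anyone trusted them.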
Summary and Remarks
That’s a lot of data analytics fun, done under real conditions with live data where you can’t postpone the number crunching just because you’re working a nine-to-five shift. The results should speak for themselves. I can clearly recommend it. A word of caution, though: I removed a lot of red sand from my Toshiba Ultrabook after coming home. And this case also demonstrates the power of free software, here (Gentoo) Linux (which I started using in Australia in 2004), R (in use since 2009) and especially ggplot2 (since 2012) for graphics. Coding was done in vim (since 2004, emacs never had a chance) within a tmux terminal. There are other strategic calculations I did, but I won’t give it all away, and most of the things are just straightforward ideas, such as answering the question of when (during the day) a control stop should ideally be situated, or whether it may be worthwhile to stop earlier than necessary to optimize energy harvest, and so on. I know the answers because I did the maths.