Yes, after all the hard work and hard thinking, we were crowned winners of the Big Data VR Challenge by the panel of esteemed judges. It was a lovely result for us as we really believe that our solution has massive potential, so it was great to have that confirmed by being chosen as winners, even more so after hearing that the decision was unanimous!
I thought it would be good to do a final piece, rounding up the prototype, the challenge and also where we are headed now with this. I shall pick up where we left off in the last post, starting with a new project code-name: V-Arc.
The final week of the challenge was the most intense. All the features that we wanted to cram in had to be hand-coded, tested, tweaked and then implemented. Our solution was always meant to be a complete end-to-end prototype: we begin with our raw data, filter it according to our focus, experiment with various ways of displaying it in order to maximise pattern/trend recognition, and finally save the remaining data out to a file that we can take away for more comprehensive analysis. This meant that we needed a lot of functionality that worked together as a whole, which means testing… lots of testing!
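The flow above can be sketched in plain C++ (a minimal illustration; `Record`, `applyFilter` and `exportIds` are hypothetical names for this sketch, not our actual Unreal classes):

```cpp
#include <vector>

// Hypothetical record: one participant's value for the selected variable.
struct Record { int id; double value; };

// Middle step: filter the raw data according to the current focus
// (here, keep only values inside a chosen range).
std::vector<Record> applyFilter(const std::vector<Record>& raw, double lo, double hi) {
    std::vector<Record> kept;
    for (const auto& r : raw)
        if (r.value >= lo && r.value <= hi)
            kept.push_back(r);
    return kept;
}

// Final step: collect the IDs of whatever is still displayed,
// ready to save out for takeaway analysis.
std::vector<int> exportIds(const std::vector<Record>& shown) {
    std::vector<int> ids;
    for (const auto& r : shown)
        ids.push_back(r.id);
    return ids;
}
```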
End-to-End Data Flow Diagram.
It also meant that some of the polish we wanted to do had to take a back seat or even get cut entirely. As we had never used the Unreal Engine for a full-blown project before (we have always used Unity), a lot of our time was spent just learning how to use the engine to create what we had in mind. Things such as user-interface design were shunted back a bit and didn't end up exactly how we had planned; functional? Yes, but pretty? Not really!
Final Prototype User Interface
A particularly tricky aspect to get working correctly was the concept of slider-based data filters. At the request of our scientists, we added the ability to crop the top and/or bottom ends of our data sets; this lets researchers omit data they are not particularly interested in and allows for a more focused analysis. As our data environment was generated in real time, filtering the data inputs and having the environment update immediately was tough. In VR, frame rate is a big issue, so everything needs to be highly optimised and not overly heavy on computation, or else lag creeps in. When you are turning your head in VR, lag is massively noticeable and often leads to motion sickness. We put a lot of effort into the back-end dev work to get smooth slider interaction that would not cause any lag at all. The result was a functional (if not particularly pretty yet) slider interface that allows real-time cropping of the top and bottom ends of any of the available data variables. It also acts as a kind of zoom function. Our bio data often has clumps in the middle, where most people lie, with a few extremes at either end; if a clump of data sits in the centre of the arcs and thins out at the edges, you can simply crop out the top and bottom extremes, and the filter stretches the central clump across the whole environment, effectively zooming in.
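Conceptually, the crop-and-zoom behaviour boils down to a single remap. This is a sketch under assumed names (`arcPosition` is illustrative, not our actual Unreal code):

```cpp
#include <optional>

// Map a data value to a normalised position along the arcs, after cropping.
// Values outside [lo, hi] have been sliced off by the sliders; the survivors
// are stretched across the full [0, 1] range, which is what gives the
// zoom-in effect on the central clump.
std::optional<double> arcPosition(double value, double lo, double hi) {
    if (value < lo || value > hi)
        return std::nullopt;            // cropped out by the slider
    return (value - lo) / (hi - lo);    // stretch survivors over the arcs
}
```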
Another massive headache was getting the networking to… well, work. Due to many technical issues that Pascal would be far better placed to explain than me, we were working on the networking right up until midnight the night before the judging! In our defence though, we weren't alone in that race to the finish. After many hours wrestling with Unreal, however, we eventually got it playing nicely and we had our networking function. This was a crucial aspect for us, as collaborative analysis is a far more powerful tool than a closed system for a single user. Furthermore, our scientists at ALSPAC have a 'data-buddy' system that we wanted to include. In this system, a main 'data buddy' user has complete access to all the raw and highly confidential data and drives the application for another, probably remote, researcher. The remote user is also immersed in the data environment with their own VR headset, but their application shows a filtered environment, displaying only the key parts relevant to their query. The data buddy can talk over live chat and point at parts of the data using the hand-held laser tool. The laser hits the target section of data and brings up a display label with more precise values about that particular piece of data. These labels are visible to the remote user so that, between them, the two can collaborate and discuss the data efficiently, while the raw data set remains protected and restricted to the authorised data buddy.
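The data-buddy idea can be illustrated with a stripped-down sketch (the field names here are hypothetical; the real confidential schema is of course ALSPAC's):

```cpp
#include <string>

// Full record, visible only to the authorised data buddy.
struct FullRecord {
    int id;
    std::string participantName;  // confidential
    double height;
    double weight;
};

// Filtered view replicated to the remote researcher: only the fields
// relevant to the current query, with identifying data stripped out.
struct RemoteView {
    double height;
    double weight;
};

// The confidential fields never leave the data buddy's machine.
RemoteView filterForRemote(const FullRecord& r) {
    return RemoteView{ r.height, r.weight };
}
```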
Laser pointer highlighting key information
To populate our arcs with data we settled on five main 'modifiers' that allow many different aspects of the data to be shown simultaneously in the environment: Node Size, Heat Map (colour), Ring Position (left to right), Ring Height (bottom to top in 25% increments) and Pulsing (a flashing animation limited to the top 30% of the chosen data variable). These modifiers are shown on screen as radial menus that the user can load with any one of the eight data variables (or all five with the same variable, if desired) to populate the arcs. The fixed modifiers were for sex (sphere nodes for females, pyramids for males) and animated colour transitions between the data from ages 7 and 11: as you move the right thumbstick right, the colours animate from the values at age 7 to those at age 11. This animation feature is perhaps not as clear as it could be, due to our colour scheme, and we are looking to test it with some of the other modifiers; animating their vertical positions over time instead may be a clearer option going forward.
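As a sketch of how a normalised value (0 to 1) might drive some of these modifiers, assuming hypothetical function names rather than our actual Unreal implementation:

```cpp
#include <algorithm>

// Ring height snaps to 25% increments: four bands, bottom to top.
int ringBand(double t) {
    return std::min(3, static_cast<int>(t * 4.0));  // bands 0..3
}

// Pulsing is reserved for the top 30% of the chosen data variable.
bool shouldPulse(double t) {
    return t >= 0.7;
}

// Heat-map colour: a simple lerp from blue (low) to red (high).
struct Colour { double r, g, b; };
Colour heat(double t) {
    return Colour{ t, 0.0, 1.0 - t };
}
```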
We have more modifiers in mind for future versions (3D sound, for instance, which we were not able to fully implement in time); however, using only these modifiers it is already possible to spot trends within the data set. We found it useful to start simply and then layer in the complexity one modifier at a time, to get used to interpreting the data in a meaningful way. In no time at all, we (as untrained data analysts) were able to start drawing conclusions from the data; taller children tend to weigh more but do not necessarily have the highest BMI, for example. One could spend 30 minutes or so immersed in this environment, just swapping variables in and out, and the patterns begin to leap out at you. Exactly what we wanted: a multi-layered but simple-to-use VR tool.
It all comes together.
The morning of the judging arrived and we had a working end-to-end prototype that we were eager to show off to our scientists. They had been patiently waiting to get their hands on it for seven weeks and now the moment of truth had arrived. We were pretty confident that it worked as intended, i.e. that it was a useful tool for super-fast and highly flexible exploration of a large data set. You could load in whichever data variables you wished, display them however you liked, change the display in real time if it was not presenting an obvious trend right away, filter the data to zoom in on areas of interest, collaborate with a remote user and finally save and print out the individual IDs of the remaining displayed data for takeaway analysis. Result!
One of our project scientists finally dives into the prototype.
To summarize the project thus far, we would say that it validates our early assumption that Virtual Reality can offer substantial benefits over more traditional methods of data analysis. The wrap-around environment certainly allows a lot more data to be packed in, while tapping into that reservoir of human instinct for recognising patterns in shape, colour and movement, and for analysing the 3D space all around us. As living beings we simply 'get' 3D space; it is where we exist, and it feels natural. Interacting directly with our hands is another step closer to a natural user interface: grabbing, sliding, pointing and so on. Not moving a mouse in one plane while looking at a little cursor moving in another, but actually interacting in real space with (seemingly) real objects in front of you. More than this, though, VR focuses your attention: on some level you start to believe that you are no longer sitting at a desk working with spreadsheet data, but standing in another place, a place that is the data. This connection is the key benefit of VR. Yes, the tech is not yet perfect, but it will be soon. That is when people will stop wondering what VR is actually good for, and start wondering how they ever got by without it.
We would like to thank Epic and the Wellcome Trust for having the vision to organise a challenge such as this. We know games will take care of themselves, but it will take lots of visibility to prove to those outside games that VR can be of huge benefit, and events like this will go a long way towards achieving that. We also thank the other teams who took part; the solutions all had a unique aspect to them, and everyone had something worthwhile to say in their work. And thanks to the judges for awarding us the top prize!
We deliberately built the prototype to be versatile, that is to say it can be adapted to take lots of different data types and can be used with data from different disciplines. Our discussions with ALSPAC look promising at this point, and we all have ambitions to take this project further towards a robust and practical solution. It should also be said that we are happy to take enquiries about how this solution, or something like it, could work with different data, no matter how challenging it may seem; after all, this is what we do!