Combining, visualising and augmenting open data sources can reveal trends, close gaps in your business's knowledge, highlight critical issues or point to new areas for exploration.
For this map of road traffic accidents involving cyclists in Glasgow, we combined open data from the Department for Transport, the local council, a cycling app provider (Strava) and OpenStreetMap. Simply overlaying these data sources and applying descriptive techniques like heat maps raises questions that can be probed further by drilling down into the data and applying more complex data science techniques.
The cost associated with these road traffic accidents is both personal and financial:
- Personal trauma and recovery
- Police and ambulance services
- Hospital and post-accident treatment
- Insurance claims
- Absence from work
- Traffic delays (over a 12-month period)
In our map we have provided a high-level, ten-year overview of the data in three layers:
- Bike trips in the city, provided by Glasgow City Council and Strava – this allows us to show where the popular cycle routes are.
- Density of the accidents across the local authority, provided by the Department for Transport – this highlights potential black spots where more accidents occur.
- Accidents over time, extracted from the individual accident records – combining data from ten years’ worth of accidents, this animated heat map shows the pattern of accidents by time of day.
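To make the density and time-of-day layers more concrete, here is a minimal sketch of the kind of binning that sits behind a heat map. It is not the code used to build our map: the records, the grid cell size and the function names are all hypothetical, chosen only to illustrate the idea of counting accidents per grid cell and per hour.

```python
from collections import Counter

# Hypothetical accident records: (latitude, longitude, hour of day 0-23).
# These values are invented for illustration, not taken from the real data.
accidents = [
    (55.8610, -4.2510, 8),
    (55.8612, -4.2512, 8),
    (55.8660, -4.2620, 17),
    (55.8611, -4.2511, 17),
    (55.8710, -4.2910, 13),
]

CELL_SIZE = 0.005  # grid cell size in degrees (an assumed value)


def grid_cell(lat, lon, cell_size=CELL_SIZE):
    """Map a coordinate to the integer index of the grid cell containing it."""
    return (int(lat // cell_size), int(lon // cell_size))


def density_by_cell(records):
    """Count accidents per grid cell; high counts suggest potential black spots."""
    return Counter(grid_cell(lat, lon) for lat, lon, _ in records)


def counts_by_hour(records):
    """Count accidents per hour of day, pooling all years together."""
    return Counter(hour for _, _, hour in records)


if __name__ == "__main__":
    print(density_by_cell(accidents).most_common(1))
    print(sorted(counts_by_hour(accidents).items()))
```

A real heat map would typically smooth these counts rather than draw raw grid cells, but the underlying step is the same: group the individual records by location and by time.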
You can zoom into areas of the map using the controls on the left, turn layers on and off by selecting them in the top right, and control the animation with the time scroll bar in the bottom left. If you were the data science team looking at this map, what questions would it raise? What trends would you be interested in?
- Does the pattern of accidents change month to month?
- What are the characteristics of the cyclists involved?
- What was the weather like?
- What changes have been made over that time to the road infrastructure?
- Have the changes made an impact?
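Two of these questions, the month-to-month pattern and the impact of a change, can be given a rough first pass with a simple count. The sketch below assumes a list of accident dates and a hypothetical date on which an infrastructure change took effect; both are invented for illustration. A raw before-and-after comparison like this only prompts further investigation and does not by itself show that the change caused any difference.

```python
from collections import Counter
from datetime import date
from statistics import mean

# Hypothetical accident dates, invented for illustration only.
accident_dates = [
    date(2015, 1, 12), date(2015, 1, 30), date(2015, 2, 3),
    date(2015, 3, 9), date(2015, 3, 21), date(2015, 3, 28),
    date(2015, 4, 2), date(2015, 5, 17), date(2015, 6, 6),
]

# Hypothetical date on which a road infrastructure change took effect.
# Assumed to fall on the first day of a month to keep the split simple.
CHANGE_DATE = date(2015, 4, 1)


def monthly_counts(dates):
    """Count accidents per (year, month)."""
    return Counter((d.year, d.month) for d in dates)


def mean_monthly_before_after(dates, change_date):
    """Mean accidents per month before and from the change month onwards.

    Months with no recorded accidents are absent from the counts, so they
    are not included in either mean in this simplified sketch.
    """
    counts = monthly_counts(dates)
    change_key = (change_date.year, change_date.month)
    before = [n for key, n in counts.items() if key < change_key]
    after = [n for key, n in counts.items() if key >= change_key]
    return mean(before), mean(after)


if __name__ == "__main__":
    print(sorted(monthly_counts(accident_dates).items()))
    print(mean_monthly_before_after(accident_dates, CHANGE_DATE))
```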
This high-level look at the data is often the entry point for data scientists – combining, visualising and describing the data that businesses already have from multiple angles to get a data-driven perspective on business problems.
Crucially, this can spark the ‘why’ sort of questions: why do we see this trend in our data, and what impact does it have? We can then use our knowledge of the sector to develop hypotheses and test explanations for the patterns we observe. Beyond that, we can model how doing things differently might affect the outcome: in this case, reducing the number of cycling accidents and thus both the personal trauma and the financial costs.
At Agile, we provide data science expertise on a daily basis across a wide range of business sectors and business challenges.