Howzat? The data behind sport

Cricket's multiple competition formats make its gameplay highly varied, and detailed knowledge of the data can make the difference in a test match lasting five days. Hannah Jowitt, Disability and Pathway Analyst at the England & Wales Cricket Board (ECB), spoke with DataIQ about how cricket data has changed in recent years, where it looks set to go and how data is helping select the future stars of the sport.

A natural fit

Professional athletes are often involved with data analysis subconsciously. “Cricket naturally lends itself to being data driven without people realising it,” said Jowitt. “Every club player wants to know their individual stats – balls faced, runs scored – so they are constantly analysing data without really knowing they are doing so.” This has made data engagement easier than in other environments, meaning the focus is on how best to present the data. “It is all about relationships,” said Jowitt. “As an analyst, my job is to manage a massive pool of data that keeps growing and to narrow that down into relevant nuggets that can be presented in an interesting way to individuals.” Jowitt further explained that analysts need to know the individuals on a personal level to understand whether they would engage better with visual tools, video or another form of presentation.

When asked if there was a case for a certain level of data literacy being needed by staff and players, or whether it was more of a case-by-case basis, Jowitt explained, “A certain level of literacy helps, but I do not think it is necessarily a growing need because that is what the data team is there for. Coaches and players naturally have questions and engage with the information presented that is relevant to them to better understand it and dig deeper into what needs to be done.”

In recent years, Jowitt and the ECB have explored how data can best be examined and used so that less time is spent searching for data and more time is spent analysing it. Data access has been a big part of this, and there has been a concerted effort to involve multidisciplinary teams (coaches, selectors, science and medicine) to ensure data plays its part in the wider picture. There is minimal resistance to data in cricket: “Most people in the sporting world, I have found, realise that data is not going away and that it does help fuel progression, so they usually embrace it,” said Jowitt.

The growth of data

The ECB used to rely on large spreadsheet printouts for those outside the data office, but apps have now been developed for different users that deliver instant, relevant data in a more engaging way. Furthermore, technical advancements such as sensors in cricket bats mean that exponential growth in new types of data collection is likely. Tests are under way to bring ball-tracking technology to mobile phones, which would give a wider population – particularly smaller-budget teams – access to intricate data-collecting technology, biomechanics and video analysis thanks to improvements in machine learning and artificial intelligence.

At the ECB, all data flows into a data lake fed by more than 20 sources, which holds more than 270 terabytes of data and video and is continually growing. The lake is fed in real time from domestic, international and first-class events and everything in between, covering data that ranges from how the ball is hit to technical Hawk-Eye ball-tracking data. From this, a specific data warehouse is built for each purpose, such as performance analysis or talent identification. For example, a warehouse focused on bowling can contain around eight million deliveries.
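As a rough illustration of how a purpose-specific view might be derived from a pool of ball-by-ball records, here is a minimal Python sketch. The `Delivery` fields, the `bowling_view` function and the sample data are all hypothetical and chosen for illustration; they do not describe the ECB's actual schema or pipeline.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Delivery:
    """One ball bowled. Field names are illustrative assumptions."""
    match_id: str
    bowler: str
    batter: str
    runs: int
    wicket: bool


def bowling_view(deliveries):
    """Aggregate ball-by-ball records into per-bowler totals."""
    summary = {}
    for d in deliveries:
        row = summary.setdefault(d.bowler, {"balls": 0, "runs": 0, "wickets": 0})
        row["balls"] += 1
        row["runs"] += d.runs
        row["wickets"] += int(d.wicket)
    return summary


# Made-up sample records standing in for the raw pool of data.
sample = [
    Delivery("m1", "Bowler A", "Batter X", 0, False),
    Delivery("m1", "Bowler A", "Batter X", 4, False),
    Delivery("m1", "Bowler A", "Batter Y", 0, True),
    Delivery("m1", "Bowler B", "Batter Y", 1, False),
]
view = bowling_view(sample)
```

The same raw records could feed a different view, for example one keyed by batter for talent identification, which is the idea behind building separate warehouses for separate purposes.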

The recorded matches go back to 1881 thanks to data captured at the time (the earliest is an Australia versus England test match in Melbourne, which was coincidentally test cricket’s first ever draw). The majority of the data is from 1997 onwards and, since mid-2015, there has been an explosion in data collection because of technical developments.

“We have a network of scouts, so we get scouting data alongside performance data and it is this combination, alongside coaching insight, that we then use to aid our decisions and selections,” said Jowitt. “This influences everything from where someone might bat in the lineup to overall squad selection.”

“Scouting involves an array of questions including observations for batting and bowling, as well as more general information about the type of individual the player is and how much potential they are deemed to have. There are also questions around physical attributes and the smaller details that are difficult to capture in video or from a scorecard.”

A level playing field

When exploring data in cricket, the differences between variables such as counties, squad type and available slots for participation must be kept in mind. For example, there is only one wicket-keeper position on every team, so this must be accounted for if there are multiple up-and-coming players that can fulfil this role. This leads to weighted metrics developed by the ECB that include factors such as the time of the match, weather, quality of opposing teams and more. “We cannot compare apples and oranges and must keep a holistic view,” stated Jowitt.
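To make the idea of a weighted metric concrete, here is a minimal Python sketch that adjusts a batter's runs for match context. The factor names (`opposition`, `conditions`), the ratings centred on 1.0, the weights and the formula itself are assumptions made for illustration; the article does not describe how the ECB's own metrics are calculated.

```python
def adjusted_runs(innings, weights):
    """Scale each innings' runs by context factors.

    Each factor is a rating where 1.0 means "average"; the weight sets
    how strongly that factor moves the score. Purely illustrative.
    """
    results = []
    for inn in innings:
        factor = 1.0
        for name, weight in weights.items():
            factor *= 1.0 + weight * (inn[name] - 1.0)
        results.append(inn["runs"] * factor)
    return results


# Two innings of 50 runs: one against a strong side, one against a weak side.
innings = [
    {"runs": 50, "opposition": 1.2, "conditions": 1.0},
    {"runs": 50, "opposition": 0.8, "conditions": 1.0},
]
weights = {"opposition": 0.5, "conditions": 0.5}  # hypothetical weights
scores = adjusted_runs(innings, weights)
```

Under these assumed ratings, the 50 scored against the stronger opposition counts for more than the 50 scored against the weaker one, which is the kind of like-for-like comparison the weighting is meant to allow.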

Another challenge particular to sport is the constant turnover of players at under-19s level, which makes it difficult to gather data in the same way as for a team with more permanent players. As a result, most of the data gathered and used to assess under-19s players tends to be coaching data. “It varies from squad to squad. For example, at the under-19s level in the recent World Cup, we focused on our own game and our own data because we had no information on the opposition. As the tournament progressed, we gathered some data, but still only had basic levels based on five matches. So data here is very simple, such as information on how different players bat, where they are likely to be in the lineup, who is going to face different players, etc., and post-match analysis is usually focused on targets we set ourselves and comparing it to our own data.”

Cricket has come a long way from the handwritten notes of the late nineteenth century, but the journey is far from over: the next decade looks poised to introduce multiple new data fields, and the pool of talent being monitored will grow as the tools become more affordable.