https://mojdigital.blog.gov.uk/2018/04/05/pushing-the-boundaries-of-data-science-with-the-moj-analytical-platform/

Pushing the boundaries of data science with the MOJ Analytical Platform

It’s a great time to be an analyst at the Ministry of Justice. Our new Analytical Platform is giving our analysts access to cutting edge open source tools, improving our ability to use data and evidence to drive decision making.

The challenges of government analysis

When I started working for government as an analyst in 2006 we worked mainly with spreadsheets, and sometimes with more specialist proprietary tools. Working with bigger, more complex datasets was difficult, and software licensing and training were very expensive. Government IT often got in the way, and people joked that we were years behind the private sector.

The past decade has seen an explosion in free and open source analytical tools that broaden the range of problems analysts can tackle. They enable us to build machine learning models, process bigger and more complex datasets, and analyse unstructured text.

However, historically these tools have not been available to government analysts. The key difficulty is giving analysts greater freedom while safeguarding sensitive government data.

Our new Analytical Platform

For the last 18 months the MoJ Digital and Technology team have been working closely with MoJ Analytical Services on a new Analytical Platform for our community of around 300 analysts. The platform is built on AWS and Kubernetes, which allows us to create virtual infrastructure that supports secure environments running analytical software like RStudio and JupyterLab.

The ambition is to empower our analysts to be the best they can be by giving them access to the tools they regard as the best available. We want to provide powerful tools that make it easier to tackle our thorniest analytical problems. We also want to reduce the complexity of some of the important but time-consuming aspects of their jobs: working efficiently in large, collaborative teams, maintaining extremely high standards of reproducibility and quality assurance, and keeping our sensitive data secure.

The power of open source tools

New tools are exciting to analysts because they expand the realms of the possible. They can help us tell compelling data-driven stories, for example the Guardian’s recent piece on homelessness, or the New York Times’ analysis of ‘Stop and Frisk’. This kind of data-driven storytelling is increasingly being seen in government too, for instance in the Crown Court Information Tool, which contains a wealth of interactive charts and data about the Crown Court.

Amazingly, most of these tools are free and open source, which has benefits way beyond simply avoiding software licensing fees. It has meant that I’ve been able to learn how to use them without having to buy anything. High-quality training materials are available online for free from ‘the community’: enthusiastic users across the world who contribute their time and code to making this software better and helping others. It also means we can collaborate more easily on analytical projects across government, and new government analysts have often used these tools at university, so they can be more productive from day one.

As a result, these new tools are by far the most popular among data scientists. The top 5 most popular tools are all free and open source. Increasingly, these tools are being used by analysts and scientists more generally: for instance, these same tools were used by winners of the 2017 Nobel Prize in Physics.

In building the Analytical Platform we’ve conducted user research with MoJ analysts for the first time. To meet their needs we’re making these new open source tools available on the platform. We’re also taking our first steps in contributing to the open source community with the MoJ Analytical Services GitHub organisation.

How we’re using our new analytical tools

With the platform now in private beta and in use by over 50 analysts, the most rewarding part of my job has been seeing the creative ways that analysts are finding to use the new tools:

  • We’ve built a Parliamentary Question Tool, which uses machine learning to analyse the free text of parliamentary questions and their responses. We have open-sourced the code for this tool, which is available here.
  • Our analysts recently published their first National Statistics using the new tools on the platform. You can read more about this in a recent blog post by the UK Statistics Authority here. As part of this, we released the MoJ’s first piece of open source statistical software, which is available here.
  • Our data scientists have produced a variety of interactive data visualisations for internal audiences, including complex operational models running in real time that have enabled us to better understand how new policies may affect people’s access to justice.
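
To give a flavour of what analysing free text can look like in practice, here is a minimal sketch of one common approach: weighting words with TF-IDF and comparing texts by cosine similarity, so that a new question can be matched against earlier ones. This is an illustration only and makes no claim to be how the Parliamentary Question Tool works. The function names and the example questions are made up for this sketch, and it uses only the Python standard library.

```python
import math
import re
from collections import Counter

# Illustrative, made-up questions. These are NOT real parliamentary questions.
EXAMPLE_QUESTIONS = [
    "How many prisoners were released on temporary licence last year?",
    "What is the average waiting time for Crown Court trials?",
    "How much was spent on legal aid for family cases?",
]


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(documents):
    """Return one {term: weight} dict per document, using smoothed TF-IDF."""
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: in how many documents does each term appear?
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: count * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


def cosine_similarity(a, b):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(query, documents):
    """Return (index, score) of the document most similar to the query."""
    vectors = tfidf_vectors(list(documents) + [query])
    query_vector = vectors[-1]
    scores = [cosine_similarity(query_vector, v) for v in vectors[:-1]]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]


if __name__ == "__main__":
    new_question = "What was the average waiting time for a trial at the Crown Court?"
    index, score = most_similar(new_question, EXAMPLE_QUESTIONS)
    print(f"Closest earlier question: {EXAMPLE_QUESTIONS[index]!r} (score {score:.2f})")
```

A real tool would need far more than this, such as handling word variants and working at scale, but the sketch shows the basic idea of turning text into numbers that can be compared.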

What’s next?

We are currently planning to go into public beta in April, when we will offer access to the platform to MoJ’s whole analytical community. The platform will help make our ‘business as usual’ activities easier, but I’m most excited about how new tools can expand the scope of analysis in government, increasing the range of situations in which we can make high-quality evidence available to decision makers at the right time and in the right place.

Please get in touch if you have any questions or comments. Subscribe to the blog for updates on our work, or follow us on Twitter.

Want to work on things that matter? Find out more about working at MoJ Digital & Technology.
