About Culture Hack Data

We’ve created a simple web service to make it easier to discover and find out about the growing number of open data sources and providers within the arts and culture communities.

As well as linking to each of the data sources, we’ve also tried to describe what each dataset is about, the size of each dataset, how easy it is to use and the license it’s released under.

Most importantly, if we know about interesting hacks or prototypes made with the data, we've linked to those, too.

This tool is an experiment in finding the simplest way to catalogue data that's both 'native to the web' and suited to the way creative technologists work. In the future we want to make this a way of finding out about the creative things people have done with data, as much as the data itself. We're working on ways that you can associate prototypes with data sources, or write introductions to prototypes that help non-technical users understand just how awesome the prototypes are.

If you have ideas about how to do this, we'd love to chat to you.

You can get in touch via e-mail at

You're Missing My Data!

If there’s a dataset or a prototype we’re missing, you can add it. All of the entries here are just plain, simple old-fashioned text files held over on GitHub.

You can read instructions on GitHub about the format we use, and how to submit a “pull request” for us to add your contribution. It doesn’t take long, and you can do it all online.


Data sets carry many different licenses. Some APIs have their own terms and conditions to access them: most entries will have notes about specific licensing conditions. We have mainly categorised data sets under the following licenses:

We've included some 'closed' data sources because we think that the information inside could be interesting when combined with open data, or because we think someone should create an open alternative. After all, the amazing GDS has its roots back with a bunch of people who hacked together a fax gateway...

Culture Hack 'Hackability' Scale

We've started rating some of the data sources for 'hackability' from 1-5, where 5 is best. To make your life easier, we've written a guide to how we think those ratings work from the point of view of a creative technologist. It's a rating scale that's about doing things with the data, rather than espousing a certain philosophy about how you should release data or create the semantic web. Oh, apart from the bit where we snark about MS SQL, but that's because we actually had that problem on a hack day once.


5 out of 5

This data source is really good quality and well documented: you could do something interesting in an hour using existing tools. It 'passes the P-Test'.

  • Lots of records, all pretty much complete
  • Clean data (eg dates)
  • Pretty pictures and rich content available, not just metadata
  • Well supported, with good documentation
  • Frequent updates, active community or responsive owner


4 out of 5

It's going to take me 3 hours to make this work, but I'll do that because it's going to make something awesome.

  • 60% good data: could use some cleaning up, but not terrible. I could work around it in code.
  • A bit awkward to work with - perhaps flat files rather than a nice dynamic API - or an overly complex implementation.
  • Documentation makes me mildly punchy
  • Might not have a lot of pretties, images, rich media associated with it.
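As a rough illustration of what "I could work around it in code" might look like, here is a minimal Python sketch that tidies up inconsistently formatted dates. The list of formats and the function name `normalise_date` are assumptions made for the example, not something taken from any data source listed here.

```python
from datetime import datetime

# Hypothetical mix of formats that a messy dataset might contain.
CANDIDATE_FORMATS = ["%Y-%m-%d", "%d/%m/%Y", "%d %B %Y", "%Y"]


def normalise_date(raw):
    """Return an ISO 8601 date string, or None if no known format matches.

    A year on its own is mapped to 1 January of that year.
    """
    text = raw.strip()
    for fmt in CANDIDATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue  # Try the next candidate format.
    return None
```

Values that match none of the formats come back as `None`, so they can be counted or reviewed by hand rather than silently guessed at.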


3 out of 5

Kind of nearly there, but might not be enough to do something awesome: I'd need to be really motivated to work with this, or for it to be exactly the right information for my idea.

  • Feels a bit flat - data, but not very dynamic or deep
  • Data infrequently or never updated, published as flat files: prototypes will be one-offs with no ongoing life.
  • A CSV with 400 things in it, and quite a few gremlin characters
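For the "gremlin characters" case, a small Python sketch along these lines might be a starting point. It treats the Unicode replacement character and invisible control or format characters as gremlins, which is our assumption about what usually goes wrong; the function names are made up for the example.

```python
import csv
import io
import unicodedata


def is_gremlin(ch):
    """True for U+FFFD and for control/format characters, except tab and newlines."""
    if ch in "\t\n\r":
        return False
    return ch == "\ufffd" or unicodedata.category(ch) in ("Cc", "Cf")


def clean_rows(csv_text):
    """Yield each CSV row with gremlin characters stripped from every cell."""
    for row in csv.reader(io.StringIO(csv_text)):
        yield [
            "".join(ch for ch in cell if not is_gremlin(ch)).strip()
            for cell in row
        ]
```

This only removes stray invisible characters. It won't repair text that was decoded with the wrong encoding in the first place, which needs fixing at the point the file is read.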


2 out of 5

I would complain quite loudly if you asked me to work with this, even in the context of paying me a good salary.

  • An MS SQL Database
  • Dirty or very fragmentary data
  • A spreadsheet made by an idiot which uses text formatting to convey information or that has no headings or logic
  • Poor or absent documentation


1 out of 5

I will assume you are making a bad joke, look at you funny and walk away if you ask me to hack with this data set (and if I eventually do, it will be to prove a point about what an idiot you are, e.g. Hansard PDFs).

  • A PDF
  • An actual piece of paper
  • Rights restricted, closed data
  • May involve locked filing cabinets in basements with 'beware of the leopard' written on them.


The Culture Hack Data Tool is supported by a grant from the Technology Strategy Board.

Website produced by Frankie Roberto and Kim Plowright for Culture Hack.