We’ve created a simple web service to make it easier to find and learn about the growing number of open data sources and providers within the arts and culture communities.
As well as linking to each of the data sources, we’ve also tried to describe what each dataset is about, the size of each dataset, how easy it is to use and the license it’s released under.
Most importantly, if we know about interesting hacks or prototypes made with the data, we've linked to those, too.
This tool is an experiment in the simplest way to catalogue data that's both 'native to the web' and suited to the way creative technologists work. In the future we want to make this a way of finding out about the creative things people have done with data, as much as the data itself. We're working on ways that you can associate prototypes with data sources, or write introductions to prototypes that help non-technical users understand just how awesome the prototypes are.
If you have ideas about how to do this, we'd love to chat to you.
You can get in touch via e-mail at firstname.lastname@example.org
You're Missing My Data!
If there’s a dataset or a prototype we’re missing, you can add it. All of the entries here are just plain, simple old-fashioned text files held over on GitHub.
You can read instructions on GitHub about the format we use, and how to submit a “pull request” for us to add your contribution. It doesn’t take long, and you can do it all online.
Licenses
Data sets carry many different licenses. Some APIs have their own terms and conditions to access them: most entries will have notes about specific licensing conditions. We have mainly categorised data sets under the following licenses:
- PD - Public Domain
- CC0 - Creative Commons Zero
- ODbL - Open Data Commons Open Database License
- OGL - Open Government License
- CC-BY - Creative Commons Attribution
- CC-BY-NC - Creative Commons Attribution Non-Commercial
- CC-BY-SA - Creative Commons Attribution Share-Alike
- CC-BY-NC-SA - Creative Commons Attribution Non-Commercial Share-Alike
- CC-BY-NC-ND - Creative Commons Attribution Non-Commercial No Derivs (CC BY-NC-ND 3.0)
- T+C - the data/entry has specific terms and conditions. Look at the main body of the article for more information.
- © - the data/entry is rights restricted: someone should have a word...
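If you're working with entries in code, the short codes above lend themselves to a simple lookup table. This is only a sketch: the codes and names are copied from the list above, but `LICENSES`, `license_name` and `is_non_commercial` are made-up names for illustration, not part of the tool or its file format.

```python
# Short codes and names copied from the list above.
# The dict and function names here are illustrative assumptions.
LICENSES = {
    "PD": "Public Domain",
    "CC0": "Creative Commons Zero",
    "ODbL": "Open Data Commons Open Database License",
    "OGL": "Open Government License",
    "CC-BY": "Creative Commons Attribution",
    "CC-BY-NC": "Creative Commons Attribution Non-Commercial",
    "CC-BY-SA": "Creative Commons Attribution Share-Alike",
    "CC-BY-NC-SA": "Creative Commons Attribution Non-Commercial Share-Alike",
    "CC-BY-NC-ND": "Creative Commons Attribution Non-Commercial No Derivs",
    "T+C": "Specific terms and conditions",
    "©": "Rights restricted",
}

# Case-insensitive index, so "cc-by" and "odbl" resolve as well.
_BY_UPPER = {code.upper(): code for code in LICENSES}


def _canonical(code):
    """Return the code as spelled in LICENSES, or None if unknown."""
    return _BY_UPPER.get(code.strip().upper())


def license_name(code):
    """Return the full license name for a short code, or None if unknown."""
    canonical = _canonical(code)
    return LICENSES[canonical] if canonical is not None else None


def is_non_commercial(code):
    """True only for known codes that carry an explicit NC component."""
    canonical = _canonical(code)
    return canonical is not None and "NC" in canonical.split("-")


if __name__ == "__main__":
    print(license_name("cc-by-sa"))
    print(is_non_commercial("CC-BY-NC-ND"))
```

Note that `is_non_commercial` returns False for T+C and © entries, which only means they have no NC tag: you still need to read the entry's notes for the actual conditions.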
Culture Hack 'Hackability' Scale
We've started rating some of the data sources for 'hackability' from 1 to 5, where 5 is best. To make your life easier, we've written a guide to how we think those ratings work from the point of view of a creative technologist. It's a rating scale that's about doing things with the data, rather than espousing a certain philosophy about how you should release data or create the semantic web. Oh, apart from the bit where we snark about MS SQL, but that's because we actually had that problem on a hack day once.
5: This data source is really good quality and well documented: you could do something interesting in an hour using existing tools. It 'passes the P-Test'.
- Lots of records, all pretty much complete
- Clean data (e.g. dates)
- Pretty pictures and rich content available, not just metadata
- Well supported, with good documentation
- Frequent updates, active community or responsive owner
4: It's going to take me 3 hours to make this work, but I'll do that because it's going to make something awesome.
- 60% good data: could use some cleaning up, but not terrible. I could work around it in code.
- A bit awkward to work with - perhaps flat files rather than a nice dynamic API - or an overly complex implementation.
- Documentation makes me mildly punchy
- Might not have a lot of pretties, images, rich media associated with it.
3: Kind of nearly there, but might not be enough to do something awesome: I'd need to be really motivated to work with this, or for it to be exactly the right information for my idea.
- Feels a bit flat - data, but not very dynamic or deep
- Data infrequently or never updated, published as flat files: prototypes will be one-offs with no ongoing life.
- A CSV with 400 things in it, and quite a few gremlin characters
2: I would complain quite loudly if you asked me to work with this, even in the context of paying me a good salary.
- An MS SQL Database
- Dirty or very fragmentary data
- A spreadsheet made by an idiot which uses text formatting to convey information or that has no headings or logic
- Poor or absent documentation
1: I will assume you are making a bad joke, look at you funny and walk away if you ask me to hack with this data set (and if I eventually do, it will be to prove a point about what an idiot you are: e.g. Hansard PDFs)
- A PDF
- An actual piece of paper
- Rights restricted, closed data
- May involve locked filing cabinets in basements with 'beware of the leopard' written on them.
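If you want to use the scale in your own code, a minimal sketch might look like the one below. It assumes the five descriptions above run from 5 (best) down to 1, which follows from the ordering but isn't stated line by line. The one-line summaries are paraphrases, and `HACKABILITY`, `describe_hackability`, `most_hackable` and the example entries are all hypothetical names and data for illustration.

```python
# One-line paraphrases of the five ratings above, 5 being best.
# These summaries and all names in this block are illustrative assumptions.
HACKABILITY = {
    5: "Good quality and well documented: something interesting in an hour.",
    4: "About three hours of work, but worth it.",
    3: "Nearly there: needs real motivation or exactly the right data.",
    2: "Would complain quite loudly, even on a good salary.",
    1: "Assume it's a bad joke and walk away.",
}


def describe_hackability(rating):
    """Return a short label for a rating, rejecting anything outside 1 to 5."""
    if rating not in HACKABILITY:
        raise ValueError(f"rating must be 1 to 5, got {rating!r}")
    return f"{rating}/5: {HACKABILITY[rating]}"


def most_hackable(entries, minimum=4):
    """Keep (name, rating) pairs at or above `minimum`, best first."""
    keep = [entry for entry in entries if entry[1] >= minimum]
    return sorted(keep, key=lambda entry: entry[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical entries, not real catalogue data.
    entries = [
        ("Example archive CSV", 3),
        ("Example museum API", 5),
        ("Example gallery feed", 4),
    ]
    for name, rating in most_hackable(entries):
        print(name, "-", describe_hackability(rating))
```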
The Culture Hack Data Tool is supported by a grant from the Technology Strategy Board.
Website produced by Frankie Roberto and Kim Plowright for Culture Hack.