02-16-2016 09:52 PM
The next few posts will be focusing on Data Management, something I feel is grossly under-represented in the data analysis literature. I had the opportunity to do a review for Kristen Briney’s amazing book “Data Management for Researchers: Organise, maintain and share your data for research success” and I highly recommend that anyone who works with data read this book, and have it close at hand during day-to-day work. I frequently go to this book for ideas, direction and suggestions on how to approach various situations that arise in my work. This next group of articles will be heavily reliant on her book, but with my own personal thoughts and recommendations.
The first step I constantly emphasise for any project is planning. Whether it’s developing a database, putting together a website, or embarking on a massive analysis, putting pencil to paper and mapping out the steps, resources, and timelines is key. Planning out your steps allows you to see potential areas of difficulty, listing your resources enables you to establish not only team members but also the data and software you’ll need, and setting timelines ensures you meet your deadlines and identify tasks that may take longer than anticipated. Going into a project blind puts you at significantly more risk of problems than under- or overestimating any of the three areas above. If you’ve documented everything, and the team agrees to the plan, a failure will still be a lesson learnt: you can go back to the documentation and see where things fell apart. Without that documentation, there is nothing to go back to, and so the same mistakes will be repeated in future projects.
Briney’s book provides detailed steps on creating a Data Management Plan, and I wanted to highlight the questions she suggests you ask yourself:
Although these questions are seemingly intuitive, not having detailed answers to each will inevitably cause confusion and potentially significant delays. Take the first question as an example. Suppose you work at a hospital, the person requesting the data assumes it will arrive in raw form with Personal Health Information, and your department policies prohibit the distribution of anything other than summarized data. This will pose a problem when you send out your report. If this is clarified, agreed upon and documented at the beginning, the project may take a very different route (going to Health Records for the data), additional resources may be brought in (a Privacy Officer), or the person requesting the data may be OK with summaries.
Although data projects range from extremely simple to highly complex, having a data management plan documented and agreed upon by all stakeholders will make for a much smoother project and a more efficient one. This is just a very brief summary of Briney’s book; she also goes through the various policies, backups, etc. that need to be considered.
The next article talks about Documentation. In the meantime, I look forward to your comments!