Good file and folder organisation will help you to locate, identify and retrieve your data quickly and accurately, making your data easier to manage. To do this you need to:
You should establish a file organisation scheme at the start of each project to avoid having to sort out your files retrospectively:
Document your file organisation scheme in a 'readme' file, preferably in plain text, and store it at the top level folder for your project where you (or anyone in your group) will be able to access it easily.
Although these principles are aimed at digital files and folders, it is just as important to organise physical files, folders and other materials in a meaningful, consistent and documented manner.
These practices for organising and documenting your data will also help you when it is time to deposit your final dataset in a research data archive at the end of your project.
There are many ways of organising your files, so think about what makes sense for your research. If you are doing qualitative work you might want to organise your folders by topic, participant group or data collection method; if you are doing experimental work you might want to organise the results into folders by the date on which you did the experiment, or by key experimental condition.
You can use the following suggestions to help you organise your data:
Example of a folder structure. The arrows indicate the contents of that folder organised into subfolders.
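If you want to set up the same folder structure at the start of every project, a short script can create it for you. The following is a minimal sketch in Python; the folder names and the `create_project_folders` function are hypothetical illustrations rather than a prescribed layout, so adapt them to your own research. It also writes a plain-text readme at the top level that lists the scheme.

```python
from pathlib import Path

# Hypothetical layout: these folder names are illustrative only.
SUBFOLDERS = [
    "data/raw",
    "data/processed",
    "documentation",
    "analysis",
    "outputs",
]


def create_project_folders(root: Path) -> list[Path]:
    """Create the subfolders under root and a plain-text readme at the top level."""
    created = []
    for name in SUBFOLDERS:
        folder = root / name
        # parents=True also creates intermediate folders such as "data"
        folder.mkdir(parents=True, exist_ok=True)
        created.append(folder)

    # Document the scheme in a readme, without overwriting an existing one
    readme = root / "readme.txt"
    if not readme.exists():
        lines = ["File organisation scheme", ""]
        lines += [f"- {name}/" for name in SUBFOLDERS]
        readme.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return created
```

Calling `create_project_folders(Path("my_project"))` creates the folders under `my_project` if they do not already exist, so it is safe to run more than once.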
Naming conventions are rules that allow electronic and physical records to be named in a consistent and logical way. Use of consistent and meaningful names will enable you to identify and distinguish between similar records, making it easier to find your data.
You can use the following suggestions to decide how to name your files:
Here is an example of a file naming convention using the date of file creation, information about the contents of the file and data type:
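A convention of this kind can also be applied programmatically so that every file name is built the same way. The sketch below is only an illustration: it assumes a hypothetical pattern of `YYYY-MM-DD_description_datatype.extension`, and the `build_file_name` function and its parameters are not part of this guidance.

```python
import re
from datetime import date


def _clean(text: str) -> str:
    """Replace spaces and special characters with hyphens."""
    return re.sub(r"[^A-Za-z0-9]+", "-", text).strip("-")


def build_file_name(created: date, description: str, data_type: str, extension: str) -> str:
    """Build a name from creation date, content description and data type."""
    # An ISO 8601 date (YYYY-MM-DD) sorts chronologically when files are sorted by name
    date_part = created.isoformat()
    return f"{date_part}_{_clean(description)}_{_clean(data_type)}.{extension.lstrip('.')}"


name = build_file_name(date(2018, 8, 9), "interview participant 12", "transcript", "txt")
print(name)  # 2018-08-09_interview-participant-12_transcript.txt
```

Underscores separate the three elements and hyphens separate words within an element, which keeps the parts of the name easy to pick out by eye or by script.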
It is important to be able to distinguish between different versions or drafts of your files. Version control can help you to easily identify the current version of your data or document so that you avoid working on older or outdated copies. If you are working with others it can also help to link each version of the data or document to the time and author of the change.
There are a number of ways that version control can be managed:
A simple method of version control is to create a duplicate copy and then update the version information to create a unique file or folder name. This method is appropriate for data or documents where you are not expecting to have numerous versions or to need to keep track of exactly what has changed between one version and another.
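The duplicate-and-rename approach can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: it assumes a hypothetical `_v01`, `_v02` suffix at the end of the file name, and the `save_new_version` function is not part of any institutional tool.

```python
import re
import shutil
from pathlib import Path

# Assumed convention: the file name (without extension) ends in _v followed by digits
VERSION_PATTERN = re.compile(r"_v(\d+)$")


def save_new_version(path: Path) -> Path:
    """Copy path to a sibling file whose version number is one higher."""
    match = VERSION_PATTERN.search(path.stem)
    if match:
        base = path.stem[: match.start()]
        next_version = int(match.group(1)) + 1
        width = len(match.group(1))  # keep the same zero padding
    else:
        # Treat a file with no version suffix as version 1
        base = path.stem
        next_version = 2
        width = 2

    new_path = path.with_name(f"{base}_v{next_version:0{width}d}{path.suffix}")
    if new_path.exists():
        raise FileExistsError(f"{new_path} already exists")
    # copy2 preserves timestamps; the original file is left untouched
    shutil.copy2(path, new_path)
    return new_path
```

For example, `save_new_version(Path("survey_v01.csv"))` creates `survey_v02.csv` alongside the original, which you then edit while `survey_v01.csv` stays as the record of the earlier version.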
Version control tables are included within documents and can capture more information than file naming conventions. They typically include the new version number, the date of the change, the person who made the change, and the nature and purpose of the change.
| Version | Date | Name | Summary of change |
| --- | --- | --- | --- |
| Version 1.1 | 2018-08-09 | AN | Amended body mass index (BMI) categories to include a separate category for a BMI >40 |
| Version 2.0 | 2018-09-05 | AB | Added geographical location data for each data subject and updated participant identifiers with agreed new identifier structure. |
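If the document is generated or maintained by script, the same table can be kept as structured records and rendered on demand. The following is a hedged sketch only: the `VersionEntry` class, its field names and the pipe-delimited output format are assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class VersionEntry:
    """One row of a version control table (field names are illustrative)."""
    version: str
    changed_on: date
    name: str      # initials of the person who made the change
    summary: str   # nature and purpose of the change


def render_version_table(entries: list[VersionEntry]) -> str:
    """Render the entries as a pipe-delimited text table."""
    header = "| Version | Date | Name | Summary of change |"
    separator = "| --- | --- | --- | --- |"
    rows = [
        f"| {e.version} | {e.changed_on.isoformat()} | {e.name} | {e.summary} |"
        for e in entries
    ]
    return "\n".join([header, separator, *rows])


entries = [
    VersionEntry("Version 1.1", date(2018, 8, 9), "AN",
                 "Amended body mass index (BMI) categories"),
]
table = render_version_table(entries)
print(table)
```

Appending a new `VersionEntry` each time the document changes keeps the table complete and consistently formatted.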
There are automated version control systems available that can store a repository of files and monitor access to them, logging who has made what change and when. These are essential for the development of software or complex code where updates may be released to users. They are also particularly useful for collaborative work on data or on code. Computing Services provide an institutional GitHub service, and there is online guidance on using Git for version control and on using GitHub. Please contact Computing Services for more information on this service.