Archiving and sharing data: Open Science Framework

Guide on archiving research data and making your data available to other researchers

On this page you will find guidance on: managing personal data and the Open Science Framework (OSF), hints and tips on publishing your data via the OSF, and a summary of the copyright licences supported by the OSF that can be applied to projects or individual components. 

Please note that the OSF does not replace the University of Bath Research Data Archive for the purposes of publishing and sharing datasets. All researchers are still encouraged to use the University of Bath Research Data Archive for this purpose. We also have guidance on identifying other suitable discipline-specific data archives for data publication and sharing.  

Before using the OSF for collaboration and the storage of files, please read the University's guidance on using cloud storage. Please also be aware that the OSF is an external provider and, as such, the University is not responsible for resilience, service time or support should any problems arise.

Overview


The Open Science Framework has been developed by the Center for Open Science as a platform to support collaboration between researchers and open science practices throughout the entire project lifecycle. It is free to use and you can add collaborators from all over the world to work together on projects.

It facilitates collaboration through: 

  • controlled access to projects that can be set as all private, all public or a mixture of the two.
  • providing a structure through which files can be managed, worked upon, and shared within teams.
  • connection to other online platforms that can facilitate data generation, analysis, sharing, and publication.

It facilitates open science and reproducibility through: 

  • the ability to pre-register, and make public, protocols, analysis plans, research outputs, and datasets.
  • the ability to generate persistent digital identifiers, known as Digital Object Identifiers (DOIs), for datasets and pre-prints.
  • the ability to license outputs and datasets to govern use by other researchers.
  • the ability to archive datasets in a way that is consistent with University, funder, and journal data policies. 

Working with personal data on the OSF

Personal data (data which relates to a living individual who can be identified from the data either directly, or indirectly by combining the data with other available sources of information) is subject to the UK Data Protection Act 2018. It is also governed by the University's Data Protection Policy and Electronic Information Systems Security Policy. Personal data must be stored securely in encrypted storage, preferably on the X:Drive, and is subject to the required ethical consents for data sharing between collaborators and with other researchers after the end of the study. If you are planning to use the Open Science Framework for sharing data from human participants, please ensure that you take the following steps: 

  • If you are storing personal data you must set the storage location for your OSF project to Germany; this can only be done at time of project creation.
  • If you are only storing fully anonymised data then the storage location can be set to any of the options but we would encourage you to choose Germany in all cases. 
  • If you have to share personal data, make sure that you have ethical consent to do so and a data sharing agreement in place with collaborating organisations. Data sharing agreements can be set up by the Data Protection Team (dataprotection-queries@lists.bath.ac.uk).
  • Do not transfer personal data outside of the EEA without contacting the Data Protection Team (dataprotection-queries@lists.bath.ac.uk).
  • Avoid storing identifiable personal data on the OSF; pseudonymise or anonymise all datasets before uploading them to the OSF.
  • Keep the identifiable data and any re-identification keys (if the data are pseudonymised) in a restricted-access encrypted folder on the X:Drive separate from the study data.
  • Keep any paper-based personal data in a locked filing cabinet in a locked office on University premises.
  • Ensure that your data management practices are fully documented in a project Data Management Plan.
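The pseudonymisation step above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a vetted procedure: the field name `participant_id`, the function name `pseudonymise` and the sample records are hypothetical. It replaces each direct identifier with a random pseudonym and returns the re-identification key separately, so that the key can be kept in restricted-access encrypted storage apart from the study data. Other columns may still identify participants indirectly and need to be reviewed before upload.

```python
import secrets


def pseudonymise(records, id_field="participant_id"):
    """Replace each direct identifier with a random pseudonym.

    Returns (pseudonymised_records, key), where key maps each pseudonym
    back to the original identifier. The input records are not modified.
    """
    pseudonym_for = {}  # original identifier -> pseudonym
    pseudonymised = []
    for record in records:
        original = record[id_field]
        if original not in pseudonym_for:
            # Random tokens, unlike plain hashes of the identifier, cannot
            # be recomputed by someone who guesses the original value.
            pseudonym_for[original] = "P-" + secrets.token_hex(8)
        cleaned = dict(record)  # copy so the source record stays intact
        cleaned[id_field] = pseudonym_for[original]
        pseudonymised.append(cleaned)
    key = {pseudonym: original for original, pseudonym in pseudonym_for.items()}
    return pseudonymised, key


# Hypothetical example records; real data would be read from secure storage.
records = [
    {"participant_id": "alice@example.org", "score": 12},
    {"participant_id": "bob@example.org", "score": 7},
    {"participant_id": "alice@example.org", "score": 15},
]
study_data, reidentification_key = pseudonymise(records)
```

Only `study_data` would be a candidate for upload; `reidentification_key` stays with the identifiable data.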

Publishing and sharing your datasets through the OSF

One of the features of the OSF is that you can preserve, publish and share datasets alongside project documentation and generate a Digital Object Identifier (DOI) for the dataset. If you plan to publish your dataset via the OSF please make sure that you do the following: 

  • If relevant, ensure that you have ethical consent from your study participants for the preservation, sharing and re-use for new purposes of the anonymised dataset(s).
  • If your datasets require safeguarding (e.g. restricted access for commercial or ethical purposes) we recommend that you archive and share your dataset(s) via a data archive that has this functionality e.g. the University of Bath Research Data Archive. You can then add the DOI for the dataset to the OSF to facilitate data discovery. 
  • Fully anonymise datasets before publication and sharing.
  • Save your data files in interoperable formats - these are file formats that do not rely on proprietary software. The UK Data Service provides guidance on interoperable data formats.
  • Give your dataset a descriptive name so that it can be discovered online. Our institutional data archive house style is to call the dataset 'Dataset for <name of paper>'.
  • Generate a DOI for the datasets that you make public and ensure that citation information is provided with the dataset. 
  • Make sure that you provide contextual and discovery metadata for the dataset: 
    • contextual metadata helps others to understand the dataset. This type of metadata can be provided through detailed documentation that describes how the dataset was generated and how to interpret and re-use the data - for example, provide a 'readme' file for the dataset.
    • discovery metadata helps others to find the dataset online. This type of metadata needs to be provided by making sure that you create 'tags' for your project - these are the keywords that might be used to search for the information online.
  • Add a licence to your project / component / dataset before sharing. We have provided a summary of the licences supported by the OSF in the next box. 
  • Let us know that you have published a dataset on the OSF. Send the dataset DOI to us at research-data@bath.ac.uk and we can create a record in Pure for you. 
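As an illustration of the naming and metadata steps above, the following Python sketch builds a dataset title in the house style and a plain-text 'readme' that carries both contextual information and the keywords to reuse as OSF tags. The function names, the README layout and the example values are assumptions made for this sketch; neither the OSF nor the University prescribes this exact structure.

```python
def dataset_title(paper_title):
    """Title in the house style: Dataset for <name of paper>."""
    return f"Dataset for {paper_title}"


def build_readme(paper_title, creators, description, files, licence, tags):
    """Return plain-text README content for a dataset.

    files maps each file name to a one-line explanation of its contents.
    tags are the discovery keywords, which should match the OSF tags.
    """
    lines = [
        dataset_title(paper_title),
        "",
        "Creators: " + "; ".join(creators),
        "Licence: " + licence,
        "Keywords: " + ", ".join(tags),
        "",
        "Description",
        description,
        "",
        "Files",
    ]
    for name, purpose in files.items():
        lines.append(f"  {name}: {purpose}")
    return "\n".join(lines) + "\n"


# Hypothetical example values.
readme = build_readme(
    paper_title="Example paper title",
    creators=["A. Researcher", "B. Collaborator"],
    description="How the data were generated and how to interpret them.",
    files={"data.csv": "anonymised survey responses, one row per participant"},
    licence="CC BY Attribution 4.0 International",
    tags=["open science", "reproducibility"],
)
```

The resulting text can be saved as a plain-text file next to the data files, so that it stays readable without proprietary software.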

Copyright guidance: licensing your content, datasets and software

The OSF provides its own guidance on adding licences for content. Licences tell others what they are permitted to do with the copyright content that you have shared; they can be applied at a project or component level. We have expert knowledge of data and software licensing within the Library Research Data Service, so please contact us if you are unsure about which licence you should choose (research-data@bath.ac.uk).

We recommend that you use the Creative Commons Attribution 4.0 International licence (CC BY) for content and datasets, and the Apache 2.0 licence for software source code. We also recommend using the most recent version of each licence and therefore have only provided guidance below on those licences. 

 

Licence and summary of the conditions of the licence

CC0 1.0 Universal 

Content licence (including datasets).

This licence waives copyright for the work you have created and dedicates it to the public domain. It does not require attribution. If you choose this licence you are opting out of copyright. The use of this licence is in breach of principle 12 of the University's Research Data Policy.  

CC BY Attribution 4.0 International

Content licence (including datasets).

This licence lets others derive new datasets and other resources from your data, and redistribute your data and their derivations, both openly and commercially, as long as they credit you for the original creation. This is the licence that we generally recommend for datasets but it is not suitable for software. 

MIT licence

Licence for software code.

This is a permissive licence that allows others to use, reproduce, modify, distribute, publish, sublicense or sell the software, including for commercial use, but users must include the original copyright and licence notice in any copy of the software. It is compatible with the GNU GPL 3.0 licence. 

Apache 2.0 licence

Licence for software code. 

This is a permissive licence that allows others to use, reproduce, modify, distribute, sublicense or sell the software, including for commercial use, but users must include the original copyright and licence notice in any copy of the software, and identify changes they have made. This licence explicitly sets out the grant of patent rights when using, modifying or distributing Apache licensed software. It is compatible with the GNU GPL 3.0 licence. This is the licence that we recommend for software. 

BSD 2-Clause "Simplified" Licence

Licence for software code. 

This is a software licence that is very similar to the MIT licence. It is a permissive licence that allows others to use, reproduce, modify, distribute or sell the software, including for commercial use, provided that the original copyright and licence notice is included in any copy of the software. It is compatible with the GNU GPL 3.0 licence. 

BSD 3-Clause "New / Revised" Licence

Licence for software code. 

This is a software licence that is very similar to the MIT licence. It is a permissive licence that allows others to use, reproduce, modify, distribute or sell the software, including for commercial use, provided that the original copyright and licence notice is included in any copy of the software. The difference from the 2-clause version is that it requires users to obtain permission before using the names of the original project or its contributors to endorse or promote derived products. It is compatible with the GNU GPL 3.0 licence. 

GNU GPL 3.0

Licence for software code. 

This is one of the most frequently used software licences and is a ShareAlike (copyleft) licence. The licence allows users to use, modify, copy and sell the software, including for commercial use, but users must include prominent legal notices preserving copyright information and all new software based on yours must carry the same licence (GPL). Earlier versions of this licence (e.g. GNU GPL 2.0) are in common use but are not as compatible with other licences and do not include an explicit patent licence.  

Artistic Licence 2.0

Licence for software code 

This licence is widely used in the Perl community. The licence allows others to use, modify, copy and sell the software, including for commercial use, but users must include the original copyright and licence notice in any copy of the software, and state the changes that have been made from the original code. It also requires that modified versions of the software do not prevent users from running the standard version. It is compatible with the GNU GPL 3.0 Licence. It is not suitable for data. 

Eclipse Public Licence 1.0

Licence for software code - particularly useful for software libraries

This licence is a weak copyleft licence that allows users to use, modify, copy and sell the software, including for commercial use, but users must include the copyright and licence notice in any copy of the source code and license modified software under the same licence as the original (though separate, additional code can have a different licence). It includes an explicit patent licence. The source code must be made available upon request when the software is distributed. The later v2.0 of this licence, unlike v1.0, adds the ability for the licensor to permit redistribution under the GNU GPL 2.0 or later licence. 

GNU LGPL 3.0

Licence for software code - particularly useful for software libraries

The LGPL is the weak copyleft version of the GPL (see above). This means that if the code is included in a wider software project, the code from the wider software project does not have to be licensed under the LGPL. Modifications of the licensed code itself must be licensed under the LGPL or the equivalent version of the GPL. 

Mozilla Public Licence 2.0

Licence for software code - particularly useful for software libraries

This is a weak copyleft licence that allows others to use, reproduce, modify, distribute or sell the software, including for commercial use, but users must include the original copyright and licence notice in any copy of the source code. It includes an explicit patent licence. Unless otherwise stated, if the code is included in a wider software project, the wider project can use a different licence, in which case the MPL code is dual licensed under that wider licence. The source code must be made available upon request when the software is distributed. 

Some wording in the table above has been reproduced from https://creativecommons.org/licenses/ and from https://choosealicense.com/licenses/.
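
The guidance in the table can also be captured as a small lookup, for example to check a project's licence choice in a script. The sketch below is only an illustration of this page's recommendations: the names `LICENCES`, `RECOMMENDED`, `DISCOURAGED`, `recommended_licence` and `licences_for` are hypothetical, and the code is unrelated to any OSF API.

```python
# Licence name -> what it is intended for, as summarised in the table above.
LICENCES = {
    "CC0 1.0 Universal": "content",
    "CC BY Attribution 4.0 International": "content",
    "MIT": "software",
    "Apache 2.0": "software",
    'BSD 2-Clause "Simplified"': "software",
    'BSD 3-Clause "New / Revised"': "software",
    "GNU GPL 3.0": "software",
    "Artistic Licence 2.0": "software",
    "Eclipse Public Licence 1.0": "software",
    "GNU LGPL 3.0": "software",
    "Mozilla Public Licence 2.0": "software",
}

# The licences this guide recommends for each kind of output.
RECOMMENDED = {
    "content": "CC BY Attribution 4.0 International",
    "software": "Apache 2.0",
}

# The guide states that CC0 breaches principle 12 of the University's
# Research Data Policy, so it is left out of the suggested options.
DISCOURAGED = {"CC0 1.0 Universal"}


def recommended_licence(kind):
    """Return the recommended licence for 'content' or 'software'."""
    if kind not in RECOMMENDED:
        raise ValueError(f"unknown kind of output: {kind!r}")
    return RECOMMENDED[kind]


def licences_for(kind):
    """List the OSF-supported licences suited to this kind of output."""
    return [
        name
        for name, intended_for in LICENCES.items()
        if intended_for == kind and name not in DISCOURAGED
    ]
```

For example, `recommended_licence("software")` returns "Apache 2.0", while `licences_for("software")` lists every software licence from the table for cases where the recommended one does not fit.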