There are numerous discipline-specific and institutional data repositories and archives holding research datasets across a wide range of disciplines, and their number is growing rapidly. If data sharing has been usual practice within your discipline, you are likely to be aware of the relevant repositories and how to search them. If data sharing is still an emerging practice within your discipline, you may be more likely to find data through general data searches or by identifying relevant repositories.
You can search for existing research data within discipline-specific data archives and repositories if you are aware of those that tend to archive data within your discipline. You can also find relevant archives using the Registry of Research Data Repositories (re3data), which maintains a list of archives, along with properties such as their subject and content type specialisms. The easiest way to search re3data is to use the 'browse' option, which allows you to browse for data repositories and archives by subject or discipline.
You can also contact your Subject Librarian for advice on searching for data.
The Management subject pages have information on searching for data and statistics relating to management.
When using the ideas and words of other authors in your research publications, you are required to observe limitations placed on you by copyright law, and provide the proper attribution to avoid charges of plagiarism. The same is true when you reuse data shared by other researchers and other organisations.
When using third party research data, you have a responsibility to respect the rights that may be held by other people or organisations, including copyright, sui generis database rights, and moral rights.
Using data in the public domain
If you use third party data that have been dedicated to the public domain, you do not have to fulfil any particular legal responsibilities in respect of those data. Nevertheless, you are still expected to act honestly regarding them, meaning you should acknowledge the source of the data in your documentation. You should also acknowledge your use of the data in any research outputs arising from them, preferably in the form of a data citation.
These are the most commonly used licences for research data. If you have a question about the terms of a licence associated with data that you are using, please contact us (email@example.com).
Under this licence you can use and share the data as they stand but you are not allowed to alter or transform them in any way. As in the case of 'all rights reserved', you are unlikely to be able to do much more than verify results that have already been derived from the data.
Examples: Creative Commons Attribution-NoDerivs (CC BY-ND), Creative Commons Attribution-NonCommercial-NoDerivs (CC BY-NC-ND)
Some licences forbid data being used for commercial purposes. This is not usually an issue in academic research, but you might come across circumstances where you would not be able to use the data. Examples include undertaking consultancy for an external organisation, applying for a patent, or commercialising your research in some other way.
Examples: Creative Commons Attribution-NonCommercial (CC BY-NC), Creative Commons Attribution-NonCommercial-NoDerivs (CC BY-NC-ND)
A licence with a Share-Alike or Copyleft requirement allows you to make adaptations to the data, and combine them with data from different sources, but if you share the resulting dataset you must apply the same licence to it.
Some licences are stricter than others about their Share-Alike or Copyleft conditions. Most allow you to use a later version of the same licence. Some allow you to use a functionally equivalent licence, or have explicit compatibility clauses. If you have any questions or concerns about licensing data that have been provided with a Share-Alike or Copyleft licence, please contact us.
Examples: Creative Commons Attribution-ShareAlike (CC BY-SA), Creative Commons Attribution-NonCommercial-ShareAlike (CC BY-NC-SA).
Most licences require you to acknowledge that you have used the resource in question, and many require an explicit acknowledgement of the originator or rights holder. In addition to the licence requirements, you are also expected to acknowledge in your research outputs any third party data underlying your results, preferably in the form of a data citation.
Examples: Creative Commons Attribution (CC BY) and all other examples given above that include 'BY'.
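The short codes in the examples above spell out which conditions a Creative Commons licence carries. As an illustration only, the following Python sketch splits a code such as 'CC BY-NC-ND' into its elements and reports what each means. The function names and the wording of the descriptions are assumptions made for this example, and the sketch only handles codes that begin with 'CC' followed by 'BY'-style elements; it is no substitute for reading the licence itself.

```python
# Licence elements that appear in the Creative Commons examples above.
# The descriptions paraphrase this guidance; they are not legal definitions.
CONDITIONS = {
    "BY": "acknowledge the originator or rights holder",
    "NC": "no use for commercial purposes",
    "ND": "no alteration or transformation of the data",
    "SA": "shared adaptations must carry the same licence",
}


def licence_conditions(code):
    """Return the conditions implied by a short code such as 'CC BY-NC-SA'.

    Hypothetical helper: raises ValueError for anything that is not
    'CC' followed by hyphen-separated elements from CONDITIONS.
    """
    tokens = code.upper().split()
    if len(tokens) != 2 or tokens[0] != "CC":
        raise ValueError(f"Not a recognised licence code: {code!r}")
    elements = tokens[1].split("-")
    unknown = [e for e in elements if e not in CONDITIONS]
    if unknown:
        raise ValueError(f"Unrecognised licence elements: {unknown}")
    return {e: CONDITIONS[e] for e in elements}


def allows_adaptation(code):
    """True unless the licence carries the NoDerivs (ND) element."""
    return "ND" not in licence_conditions(code)


def allows_commercial_use(code):
    """True unless the licence carries the NonCommercial (NC) element."""
    return "NC" not in licence_conditions(code)


if __name__ == "__main__":
    for element, meaning in licence_conditions("CC BY-NC-ND").items():
        print(f"{element}: {meaning}")
```

A check such as allows_commercial_use("CC BY-NC-SA") could be run before starting consultancy work with a dataset, but the result should be treated as a prompt to read the licence, not as a decision.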
Some licences have terms that explicitly prevent you from locking down the copies or derivations of the data that you share with others.
Examples: Public Domain Mark (PD), Public Domain Dedication (CC0)
The approach you take regarding archiving third party data depends on how you have used the data and what permissions you have been granted.
If you have used third party data without altering them in a significant way, and they are already available from a third party archive, you do not need to archive them again. Simply cite the original dataset when you publish your results.
If they are not available from an archive, check with the data originator to see if they plan to archive the data themselves. If not, and you have permission to do so, you should archive your copy. Be sure that you credit the correct creators and rights holders in the archive record, and apply the licence under which you received them.
If you used a subset of a third party dataset or database, and it would take some effort to extract the same subset again, you should consider archiving your subset. Ensure you have permission to retain and share your copy, credit the original creators and rights holders, and apply the licence under which you received the data.
If you have integrated a third party dataset with other data, check the licence you received it under. If you have permission to share the resulting dataset, archive it, remembering to fulfil all relevant licence terms such as those relating to onward licensing, acknowledgement and preservation of notices. If you do not have permission to share the resulting dataset, archive those components of the dataset you do have the rights or permissions to share, and in the documentation provide full instructions for how to obtain the remaining components and derive the final dataset.
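The archiving cases described above can be summarised as a small decision function. This is a sketch only: the function name, parameter names and returned wording are assumptions made for illustration, and cases the guidance does not spell out are reported as such instead of being guessed.

```python
def archiving_action(use, *, already_archived=False,
                     originator_will_archive=False,
                     costly_to_reextract=False, may_share=False):
    """Suggest an archiving action for third party data.

    use: 'unaltered', 'subset' or 'integrated' (hypothetical labels
    for the three situations described in the guidance).
    """
    not_covered = "not covered by the guidance; seek advice"

    if use == "unaltered":
        if already_archived:
            return "cite the original dataset"
        if originator_will_archive:
            # Implied by the guidance: you only archive if the originator will not.
            return "cite the copy archived by the originator"
        if may_share:
            return "archive your copy under the licence you received"
        return not_covered

    if use == "subset":
        if not costly_to_reextract:
            # Assumption: an easily re-extracted subset only needs a citation.
            return "cite the original dataset"
        if may_share:
            return "consider archiving your subset under the licence you received"
        return not_covered

    if use == "integrated":
        if may_share:
            return "archive the combined dataset, fulfilling all licence terms"
        return ("archive the components you may share and document "
                "how to obtain the rest")

    raise ValueError(f"Unknown kind of use: {use!r}")


if __name__ == "__main__":
    print(archiving_action("integrated", may_share=False))
```

Whichever branch applies, the record should still credit the correct creators and rights holders, which a function like this cannot check for you.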
When you use third party data in your research, you must acknowledge this in the resulting research outputs. Ideally, you would cite the data directly, just as you would an academic paper. If this is not possible there are indirect methods you can use instead.
University of Bath Harvard style
Smith, M., and Jones, G.R., 2015. Title of dataset. Version 1. University of Bath. Available from: http://doi.org/10.15125/12345 [Accessed 1 March 2018].
You can omit the version number if this is not provided.
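If you cite many datasets, a small helper can assemble references following the pattern of the Harvard example above. The sketch below is illustrative: the function name and parameters are assumptions, the DOI used is the placeholder from the example, and the way three or more authors are joined is a guess that should be checked against the style guide.

```python
def harvard_data_citation(authors, year, title, publisher, doi,
                          accessed, version=None):
    """Build a dataset reference following the Harvard example above.

    authors: list of names already in 'Surname, Initials' form.
    version: omitted from the reference when None.
    """
    if len(authors) == 1:
        names = authors[0]
    else:
        # Two authors follow the example; longer lists are an assumption.
        names = ", ".join(authors[:-1]) + ", and " + authors[-1]

    parts = [f"{names}, {year}.", f"{title}."]
    if version is not None:
        parts.append(f"Version {version}.")
    parts.append(f"{publisher}.")
    parts.append(f"Available from: http://doi.org/{doi} [Accessed {accessed}].")
    return " ".join(parts)


if __name__ == "__main__":
    print(harvard_data_citation(
        authors=["Smith, M.", "Jones, G.R."],
        year=2015,
        title="Title of dataset",
        publisher="University of Bath",
        doi="10.15125/12345",  # placeholder DOI from the example
        accessed="1 March 2018",
        version=1,
    ))
```

Leaving out the version argument drops the 'Version' element, matching the note that the version number can be omitted when it is not provided.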
Smith, M., & Jones, G.R. (2015). Title of dataset. [Data set]. [insert DOI here].
Footnote: Melville Smith and G.R. Jones. Title of dataset (accessed March 1, 2018). [insert DOI here].
Reference list: Smith, Melville, and G.R. Jones. Title of dataset (accessed March 1, 2018). [insert DOI here].
Smith, Melville, and G.R. Jones. "Title of dataset". University of Bath, 2015. Web. 1 March 2018, [insert DOI here].
No publisher or style manual example for referencing datasets
Some journals are hesitant to include direct citations of datasets and may ask you to remove them. If this happens:
There are two ways that you can approach the citation of a subset of a larger dataset or database: