We welcome proposals for additional resources for inclusion in Connected Histories, irrespective of whether your resource is a dataset consisting of millions of pages or a small, focused body of material. This page explains our criteria for inclusion, the costs involved, and how to propose new content.
Criteria for inclusion
We will consider resources in a variety of formats, including free text transcriptions, databases and image catalogues, and both those which are available free of charge and those which require subscription access. Our main criteria for inclusion are:
- Date range and geographical scope: for the present, Connected Histories is composed of sources whose primary content concerns British history between 1500 and 1900. Resources which extend beyond these boundaries are welcome as long as the chief focus falls within them. In the longer term we hope to extend both the chronological and geographical scope of Connected Histories. We therefore welcome suggestions for content which falls outside our current focus, but such proposals will take longer to implement.
- Content: resources should be principally composed of primary sources, and should comprise complete editions wherever possible. The quality of the transcription and/or data creation must meet normal academic standards.
- Accessibility: resources must be available on the web, irrespective of whether they are available to users free of charge or through subscription. CD-ROM publications and other media do not qualify. Connected Histories does not host entire resources. Rather, it indexes resources for inclusion in our search engine and then directs users to live websites to view the full texts of search results.
Costs
Because it costs money to incorporate new resources into Connected Histories, and we do not have a revenue stream for this purpose, adding new resources incurs a modest fee based on the costs of administration, indexing and data storage. Fees are determined by the type and size of the data and therefore vary. Our charging formula is intended to be attractive to both large organisations and independent researchers.
Briefly, the following factors determine the fee for inclusion:
1. Type of data. If the data is unstructured, we need to configure and run our natural language processing algorithms to add the structure that Connected Histories requires for searching by name, place and date, as well as by keyword. In this context, unstructured data is data in which names, places and dates have not been tagged (for free text data) or split into separate fields or columns (for databases).
2. The amount of technical information you provide. We need essential technical information which will allow us to understand your data and determine how to process it. If you are unable to provide this information, we will need to conduct our own audit, which will incur an additional cost. Essential technical information concerns: i) character encoding, ii) URL formation, and iii) the identification of relevant data for indexing (mainly relating to databases where not all table data should be indexed).
3. Background information. We need to create background information which describes the subject matter, scope and context of your resource for the benefit of our users.
4. Storage. This will have the most significant impact on the fee because it is a calculation based upon the size of the dataset in its pre-processed state; that is, the textual data (whether it is in a free text or database format) which you hand over to us. Please note that we do not factor the size of images and other media into our calculations of storage fees as we do not need to keep them.
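The distinction drawn in item 1 between structured and unstructured data can be illustrated with a minimal Python sketch. The sample sentence, the field names (`name`, `place`, `date`) and the `is_structured` helper are all hypothetical, invented for illustration only; they do not describe how Connected Histories assesses or processes data.

```python
# Hypothetical records, invented purely for illustration.
# Free text in which names, places and dates have not been tagged:
unstructured_text = "John Smith of Norwich was tried on 12 March 1742."

# The same information with name, place and date held in separate fields:
structured_record = {
    "name": "John Smith",
    "place": "Norwich",
    "date": "1742-03-12",
    "text": "John Smith of Norwich was tried on 12 March 1742.",
}

# Assumed field names; a real resource may label its fields differently.
REQUIRED_FIELDS = ("name", "place", "date")


def is_structured(record):
    """Return True if the record holds a name, place and date in separate fields."""
    if not isinstance(record, dict):
        return False
    return all(record.get(field) for field in REQUIRED_FIELDS)
```

In this sketch the plain sentence counts as unstructured because the name, place and date can only be recovered by reading the text, whereas the dictionary exposes each of them as its own field.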
The fee for inclusion within Connected Histories covers a period of 5 years. After 5 years you will be re-billed for a further 5 years, for the costs of storage only.
Example fees for guidance only (exclusive of VAT):
1. A database under 1 GB in which key technical information is provided: £680.
2. A database over 1 GB but under 20 GB in which key technical information is provided: £1,140.
3. Unstructured text at 50 GB in which key technical information is provided: £1,540.
Terms and conditions
Inclusion within Connected Histories does not in any way infringe your rights as the owner of your data. While we need a copy of the data for indexing purposes, this is discarded once the index is completed and functioning correctly on our website (within a year). Connected Histories never directly gives full access to data; it only provides snippet results with live links to the full text held on owners' websites.
How to propose new content
Please send an email to Dr Jane Winters (email@example.com) with the following information:
- a brief description of the resource, with the URL and an explanation of how the resource was created
- whether the data is unstructured or structured and, if structured, how
- its approximate size, in gigabytes (GB)
- whether the site is freely available or requires subscription access and, if the latter, who can acquire subscription access
If your resource is suitable for inclusion, we will then ask for some sample data and proceed to determine the cost involved. Once the fee is agreed, we will draft a licence agreement for review and signing.
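If you are unsure of the approximate size of your textual data, a short script along the following lines may help you estimate it. This is only a sketch under stated assumptions: the list of media file extensions to skip is a guess and should be adapted to your own resource, the function name `textual_size_gb` is hypothetical, and a gigabyte is taken to be 10^9 bytes. It simply reflects the point made above that images and other media are not counted towards storage.

```python
import os

# Assumed set of media extensions to skip; adjust to suit your own resource.
MEDIA_EXTENSIONS = {".jpg", ".jpeg", ".png", ".gif", ".tif", ".tiff", ".mp3", ".mp4"}


def textual_size_gb(root):
    """Sum the size of all non-media files under root, in gigabytes (10**9 bytes)."""
    total_bytes = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            extension = os.path.splitext(filename)[1].lower()
            if extension in MEDIA_EXTENSIONS:
                continue  # images and other media are left out of the total
            total_bytes += os.path.getsize(os.path.join(dirpath, filename))
    return total_bytes / 10**9
```

Calling `textual_size_gb` with the path to a local copy of your data returns a figure you can quote in your proposal; for a database, the size of an export or dump file gives a comparable estimate.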