Access rights management in compliance with the French Code du patrimoine: a generic approach for the OAIS model run by SLDR
Automatic management of access rights is fully operational at SLDR. This process allows data producers to specify access rights to resources, or documents they contain, at the three stages of their lifecycle: source data, dissemination and long-term preservation.
Long-term preservation implies compliance with the current French Code du patrimoine, which makes it mandatory that a public archive be immediately accessible. However, derogations from this principle have been spelled out, and the durations of their effects defined, for the protection of basic rights, including privacy, medical confidentiality and specific contracts.
SLDR explicitly displays the reasons for access restrictions (as required by law) and the availability of permissions signed by informants/authors. In addition, producers can enter the whole set of parameters that determine the long-term evolution of access rights, plus confidential metadata that will become public after a period that they set in compliance with the law.
The practical handling of access rights is explained to data producers on page Access rights settings, and to developers on page Access procedures on SLDR. Queries for downloading, processing or viewing a document are submitted to a decision tree: SLDR_access_decision_tree.
SLDR (Speech & Language Data Repository, http://sldr.org, formerly CRDO-Aix) offers labs and scholars a free-of-charge service for sharing and preserving audio/video recordings and linguistic resources with scientific and/or patrimonial value. Although resource sharing clearly benefits the community, research scientists and collectors often need a reassessment of their duties and rights before making a decision. The questions they raise deal with intellectual property, copyright, the protection of private life, the storage of personal identity information, informed consent and related legal issues.
Since fall 2010 these questions have been on the agenda of the TGE-Adonis/CINES/CC-IN2P3/CRDO pilot project on the sharing and long-term preservation of oral/linguistic resources. The aim is to design automatic management of access rights for documents, in compliance with the French legislative framework recently updated for public archives.
- A dissemination site hosted by CC-IN2P3 (Centre de calcul de l'Institut national de physique nucléaire et de physique des particules, Lyon, France).
The present page is dedicated to solutions currently implemented at SLDR and submitted to the team of the pilot project. It does not claim to be exhaustive, and no statement should be considered final because of the novelty of our operational framework. (Long-term preservation went into production for SLDR in July 2010.) A comprehensive report on discussions is available here: http://sldr.org/wiki/SyntheseDroitsAccesFevrier2011_en.
SLDR has been registered with "Commission Nationale de l'Informatique et des Libertés" (CNIL) under agreement Nr. 1222972 dated 26 March 2008 giving permission to handle information on private persons if the latter are granted access to editing, modifying or suppressing their personal record.
Items submitted to long-term preservation may comprise confidential metadata along with instructions for dealing with this information in the long term. Examples are (1) the accurate identity of the right-owner of the stored item, and (2) the full names of informants taking part in an interview. These names are expected to become public after the period of access restriction pertaining to the protection of private life (see infra).
By definition, confidential metadata are not accessible from the dissemination site whereas they are preserved by the archival site. (See page 55 of SLDR presentation.)
Relying on an institutional archive for long-term preservation implies compliance with the legal framework of public archives, namely the current version of the French Code du patrimoine, loi du 15 juillet 2008 (the Heritage Code) in its consolidated version of 13 January 2011.
- L213-1: Public archives are in open access if not subject to restrictions as per Article L. 213-2.
- Public archives should be, by default, immediately accessible.
This last statement is a radical change since a period of 30 years had been in effect before 15 July 2008.
These constraints may clash with the interests of research scientists and holders of documents with patrimonial value. However, we will show that a flexible yet lawful interpretation is possible, one that copes with most cases encountered by the research community (see infra).
From an archival perspective, it is said that a document becomes an "archive" at the end of its "normal usage period". We may therefore consider that its long-term preservation is only mandatory after the completion of the related research project and associated publications.
Research scientists eager to preserve and share a document before the completion of its normal usage period may wish to put it in dissemination mode, by which we mean reliable backup and dissemination procedures that do not endow the document with the legal status of a (public or private) "archive". The duration of the normal usage period is indeed project-dependent and it may include:
- when no permission has been signed by participants (for the transfer of their patrimonial rights of reproduction and presentation), the period of restriction of patrimonial rights. (Code de la propriété intellectuelle)
In general, restricting access to a document in dissemination mode is not contrary to Law since the document is not yet in a public archive. Data producers may therefore impose this restriction without spelling out its motivation. Nonetheless they should keep in mind that this status coincides with the normal usage period of documents.
SLDR is unique in using the same OAIS model for dissemination and long-term preservation. Technically, documents are stored on the dissemination site (CC-IN2P3) after a brief transit on the archival site (CINES) during which file formats and tree structure are checked for consistency. This technique is reliable because the entire content of an item may be rebuilt from datastreams found on the dissemination site and checked for consistency with the source material. (See pages 58-60 of SLDR presentation.) Further, SLDR is able to handle URLs that remain unchanged when switching from dissemination to long-term preservation and when uploading new versions of an item (see page 28).
Dissemination is also possible on sites dedicated to medium-term preservation, as exemplified by the ISAAC project recently launched by CINES. Statistics (see ISAAC) show that, on average, 20% of the material stored in a medium-term archive is likely to be preserved in the long term. Since only CINES is qualified for medium-term (and long-term) preservation, we use a distinctive term for dissemination.
Experience shows that once the project has been completed, research scientists lack time and human resources (and indeed motivation) to submit its valuable data and results for long-term preservation. At this stage, for instance, modifying some file formats to make them eligible for long-term preservation (see Formats) may be counterproductive. It may even be the case that primary data required for re-encoding files - e.g. uncompressed video - are no longer available.
For these reasons we believe that the operational mode of SLDR is a good incentive for initiating dissemination or long-term archiving at the very onset of any research project. We even recommend creating records (metadata) for primary data that have not yet been collected, as these records produce identifiers that may be cited by the team in support of its search for collaborators and funding.
- L213-2: Notwithstanding the provisions of Article L. 213-1 (...) public archives are automatically granted open access after a delay of ... (read details).
These motivations have been listed and tokenized in a table (see table). This makes it easy for producers to declare the status of their set of documents. Restriction AR048 (50 years) is applicable to many situations in which data are collected in field research. It covers documents whose disclosure would undermine the protection of privacy, documents that contain an appreciation or value judgment about a named or easily identifiable person, and documents that reveal the behavior of a person under circumstances which might bring him/her injury. For medical data the restriction period (AR061) might be 120 years. Given the restriction code and the date of a recording (or the date of submission to the archival system, if the former is not known), the system automatically sets up access rights at the current date. These parameters may be specified differently for several documents belonging to the same item. By default, documents inherit the access rights of the directory in which they are stored.
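The rule just described (restriction code plus reference date, with directory inheritance as the default) can be sketched as follows. This is a minimal illustration and not the SLDR implementation: the function names, the shape of the `settings` mapping and the choice to open access on the anniversary date itself are assumptions, and the table only holds the three codes quoted on this page.

```python
from datetime import date
from pathlib import PurePosixPath
from typing import Dict, Optional

# Restriction periods in years for the codes quoted on this page.
# The actual SLDR table contains more codes (assumption: same structure).
RESTRICTION_YEARS = {"AR039": 25, "AR048": 50, "AR061": 120}


def add_years(start: date, years: int) -> date:
    """Return the same calendar day `years` later (29 Feb falls back to 28 Feb)."""
    try:
        return start.replace(year=start.year + years)
    except ValueError:
        return start.replace(year=start.year + years, day=28)


def open_access_date(code: str, recorded: Optional[date], submitted: date) -> date:
    """Date on which the legal restriction ends.

    The recording date is used when known, otherwise the submission date.
    """
    reference = recorded if recorded is not None else submitted
    return add_years(reference, RESTRICTION_YEARS[code])


def is_restricted(code: str, recorded: Optional[date], submitted: date, today: date) -> bool:
    """True while `today` is before the end of the restriction period."""
    return today < open_access_date(code, recorded, submitted)


def effective_code(path: str, settings: Dict[str, str]) -> Optional[str]:
    """Restriction code of a document: its own setting, else the closest ancestor's."""
    current = PurePosixPath(path)
    while True:
        if str(current) in settings:
            return settings[str(current)]
        if current == current.parent:  # reached the top without finding a setting
            return None
        current = current.parent
```

For instance, with `settings = {"item": "AR048", "item/medical": "AR061"}`, a file under `item/medical/` would get AR061 while any other file of the item would inherit AR048 from the top directory.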
Article L213-2 is a derogation from article L213-1 (see supra). This means that the owners of a public archive should try their best to get permission for providing free access to its documents before the end of the period of restricted access. To this effect they collect signed permissions from the informants/speakers/authors of these documents.
Permissions should follow the procedure of informed consent: make it explicit that the signee is aware of the type and range of dissemination planned for the document. In French-speaking countries a typical form has been used by many producers (page 115 of Corpus oraux, guide des bonnes pratiques 2006). The ICAR team published its own adapted version: http://icar.univ-lyon2.fr/projets/corinte/recueil/document_informations.htm.
Signees might also decide that a particular audio/video recording is worth sharing with scholars though it shall not be displayed in a public presentation. In this case, the standard SLDR licence signed by users will be completed with an additional licence (for example http://sldr.org/sldr000761/licence/LicenceStRemy.pdf); this additional licence has to be presented to informants/speakers/authors as an annex of their consent form.
A complete consent form taking this additional licence into account is available here: http://sldr.org/doc/forms/ConsentementModele_en.pdf.
With respect to oral corpora, and generally any 'sensitive' primary data, it is necessary to distinguish between the dissemination of content (listening to a recording, viewing a video or photographs) and the distribution of its source file in 'high resolution', which might be reused to produce artifacts with potential damage to speakers (voice imitation etc.). For this reason SLDR makes it possible to trace downloads in a shared visibility (see page 44 of SLDR presentation) even if their contents are freely accessible in streaming.
SLDR makes it possible to define a limit date for a signed permission, beyond which the system will automatically revert to the legal restriction status in case the permission has not been extended. We believe that writing a limit date in a permission does not make sense because informed consent may be revoked at any time. Technically, should this happen, the limit date would be immediately set to the date of revocation so that automatic access rights management can cope with this event.
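The interplay between a signed permission, its optional limit date and a revocation can be illustrated with a short sketch. The class and function names are hypothetical, and the sketch assumes that access closes on the limit date itself, so that a revocation takes effect on the day it is recorded.

```python
from dataclasses import dataclass, replace
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Permission:
    """A signed permission (hypothetical structure, not the SLDR schema)."""
    start: date
    limit: Optional[date] = None  # None means no limit date was set

    def is_valid(self, today: date) -> bool:
        if today < self.start:
            return False
        # Assumption: the permission no longer applies from the limit date on.
        return self.limit is None or today < self.limit


def revoke(permission: Permission, revoked_on: date) -> Permission:
    """Revocation is handled by moving the limit date to the revocation date."""
    return replace(permission, limit=revoked_on)


def is_open(today: date, restriction_end: date, permission: Optional[Permission]) -> bool:
    """Open if the legal restriction has ended or a valid permission lifts it."""
    if today >= restriction_end:
        return True
    return permission is not None and permission.is_valid(today)
```

Once the permission has expired or been revoked, `is_open` falls back to the legal restriction date alone, which is the reverting behavior described above.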
Signed permissions are scanned and stored on the archival site along with confidential data (see supra).
Some documents submitted for long-term preservation might fall under restrictions due to commercial copyright. This is the case with publications in scientific journals. Although it is worth preserving the full text of an article or a book along with the recordings and annotations it is based on, access may be restricted till the end of the embargo period imposed by the publisher.
If the document is also eligible for restricted access under one of the categories defined in Article L213-2 (see supra) and if an informed consent has been signed for its anticipated free access, a comprehensive solution will be to set the start date of the permission at the end of the embargo period. Whenever these conditions cannot be met we recommend not to submit the document immediately for long-term preservation because of the risk of litigation.
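As a rough sketch of this recommendation, the function below returns the start date to give to the permission, or `None` when the two conditions are not both met and immediate submission is therefore not advised. The function name and its parameters are hypothetical and only restate the rule of the paragraph above.

```python
from datetime import date
from typing import Optional


def permission_start_for_embargo(
    restriction_code: Optional[str],
    consent_signed: bool,
    embargo_end: date,
) -> Optional[date]:
    """Start date of the permission for a document under publisher embargo.

    restriction_code: an Article L213-2 category code (e.g. "AR048"),
        or None if the document is not eligible for restricted access.
    Returns None when the conditions are not met, meaning the document
    should not be submitted immediately for long-term preservation.
    """
    if restriction_code is not None and consent_signed:
        # Access stays restricted during the embargo, then the permission opens it.
        return embargo_end
    return None
```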
Whether copyright does or does not qualify as "commercial and industrial secrecy" (AR039, 25 years) remains an open question that we are submitting to lawyers.
This issue is raised by the following rule:
- L213-5: Any Administration holding public or private archives is required to give reasons for objecting to a request for access to archival documents.
The reason for the derogation from the principle of communicability (as per Article L213-2) is displayed on records of archived items and exported in their OLAC metadata. (See for instance accessRights at the top of this page.)