Call 1 FAQs

CLARIN-NL FAQ (Frequently Asked Questions)

Persistent Identifiers

Q: What is a persistent identifier and what should I do for it?

A: A persistent identifier (PID) is a stable (persistent) and unique reference (identifier) to a resource, in the case of CLARIN a digital language resource. A well-known example of a PID outside CLARIN is the ISBN, a persistent identifier for books.

PIDs are needed primarily by the tools, applications and services running on the CLARIN infrastructure, which must be able to refer to each resource unambiguously, but they can be useful for humans as well.

 

Q: Can the title of a resource not serve as its PID?

A: No. A title is probably persistent, but it is not necessarily unique, and it has other disadvantages. There are cases where two different resources happen to have the same title. More importantly, titles tend to be long ("Corpus Gesproken Nederlands"), so people start using abbreviated forms ("CGN"), and they are language-dependent, so translations are often used as well ("Spoken Dutch Corpus").

 

Q: Can the URL of a resource not serve as its PID?

A: No. URLs avoid some of the disadvantages of titles, but they tend not to be persistent: web sites often change, and the related URLs change with them or disappear completely. Humans can cope with missing references; computers cannot.

 

Q: Where do I get a PID for my resource?

A: Later this year, and at the latest by the start of your project, CLARIN-NL will announce a URL and a programming interface through which you can get a PID for your resource from a Persistent Identifier Service.

 

Q: What do I have to do to obtain a PID for my resource?

A: Make a request using the Persistent Identifier Service that CLARIN-NL will provide later this year. In this request you will be asked for some minimal information about your resource, such as a small subset of the metadata that you have to provide anyway in the context of your project. The exact contents of this minimal set of resource metadata will be announced at the latest by the start of your project.

 

Q: How much effort must I plan in my project for obtaining a PID for my resource?

A: It depends somewhat on the nature of your resource, but in general the effort involved will be minimal, typically 1 person day per resource. Normally, there should be a proper repository system with a software component that requests PIDs automatically when new resources are uploaded.

 

Q: If I have a PID, what can I do with it?

A: You can use it in programs to refer uniquely to your resource. The organization that provides the Persistent Identifier Service will also make functionality available so that you can click on a PID in a web browser or another context and be led directly to the resource metadata. In most cases, however, you will find the resource's metadata in other ways (by searching, querying or browsing in metadata overviews), and the CLARIN infrastructure will use the PID behind the scenes to get from the resource's metadata to the resource itself.
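The indirection described in this answer can be sketched in a few lines of Python. This is only an illustration under assumptions: the PidRegistry class, the "hdl:0000/" prefix and the example URLs are hypothetical and do not describe the actual Persistent Identifier Service, whose interface has yet to be announced. The sketch shows why a PID stays valid when the location of a resource changes.

```python
from dataclasses import dataclass
from itertools import count


@dataclass
class Record:
    metadata_url: str   # where the resource's metadata lives
    resource_url: str   # where the resource itself currently lives


class PidRegistry:
    """Toy in-memory stand-in for a PID service (hypothetical)."""

    def __init__(self, prefix: str = "hdl:0000/"):
        self._prefix = prefix
        self._counter = count(1)
        self._records: dict[str, Record] = {}

    def mint(self, metadata_url: str, resource_url: str) -> str:
        """Register a resource and return a new, unique PID."""
        pid = f"{self._prefix}{next(self._counter):06d}"
        self._records[pid] = Record(metadata_url, resource_url)
        return pid

    def resolve(self, pid: str) -> Record:
        """Look up the current locations behind a PID."""
        return self._records[pid]

    def relocate(self, pid: str, new_resource_url: str) -> None:
        """Update the location; the PID itself never changes."""
        self._records[pid].resource_url = new_resource_url


registry = PidRegistry()
pid = registry.mint(
    "https://example.org/metadata/cgn.xml",   # made-up metadata location
    "https://example.org/corpora/cgn",        # made-up resource location
)
# The resource moves to another server, but references to `pid` keep working.
registry.relocate(pid, "https://archive.example.org/cgn")
print(pid, "->", registry.resolve(pid).resource_url)
```

Programs that store only the PID never need to be updated when the resource moves, which is the property that a plain URL cannot offer.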

 

Q: Where can I obtain more introductory information on PIDs?

A: Here

 

Q: Where can I obtain more (technical) information on PIDs?

A: Here (but you really need not know this if you are a project leader or participant)


ISOcat and Semantic Operability

Q: What is ISOcat?

A: ISOcat is a web-based implementation to store and make accessible concepts (a concept registry), more specifically data categories, that are relevant for the CLARIN infrastructure and for encoding linguistic phenomena. Basically it provides a persistent identifier for each data category, and a variety of properties of the data category. It allows one to uniquely refer to a data category (using a PID, e.g. http://www.isocat.org/datcat/DC-1333) under abstraction from language-specific (e.g. English ‘noun’ v. French ‘nom’) and arbitrary differences in notations for data categories (e.g. ‘noun’ v. ‘n’). This will make it possible for all kinds of resources and tools to ‘interoperate’, not only on the format level, but also on the level of content.

A large list of data categories, mainly originating from the ISO TC 37/SC 4 project (which itself based its selection on earlier projects for best practices and standards such as EAGLES and ISLE) has been created. These data categories are currently considered as candidates for official inclusion in ISOcat and some have already been accepted (but all are already accessible for inspection, comments etc.).

Of course, in some cases the same expression is used for two or more different concepts, sometimes depending on a specific theoretical view on the matter. But ISOcat is open: one can add one’s own concepts, and even organize a whole group of related concepts in a so-called ‘profile’.

Currently, ISOcat is basically a flat list of data categories, each with its properties. Data category specifications can be associated with a variety of data element names and with language-specific versions of definitions, names, value domains and other attributes. The intention is to add relations between concepts in a next stage. This will allow one to specify many types of relations between concepts, e.g. that one concept is a hyponym of another, or that two concepts are not completely identical but very close. Using such relations one can specify multiple hierarchical ontologies on these concepts, and so on.

 

Q: How should I work with ISOcat in my project?

A: For each concept that occurs in your resource or in the metadata of your resource, you should check whether a corresponding concept already exists in ISOcat. If this is not the case, you will have to add the concept to ISOcat. You will also have to make a formally represented mapping between the notations for concepts that occur in your resource or its metadata, and the PIDs of the corresponding ISOcat data categories. For example, if you use “zn” as the notation for the concept of ‘noun’, this mapping will have to include:

zn ⇔ http://www.isocat.org/datcat/DC-1333
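One possible way to represent such a mapping formally is a simple lookup table, sketched here in Python. Only the "zn" entry and its PID come from the example above; the names to_isocat and unmapped and the extra sample tags are illustrative assumptions, and the format that CLARIN-NL actually expects for mappings may differ.

```python
# Mapping from project-specific notations to ISOcat data category PIDs.
# Only the "zn" entry is taken from the FAQ; a real project would add
# one entry per notation used in its resource or metadata.
NOTATION_TO_ISOCAT = {
    "zn": "http://www.isocat.org/datcat/DC-1333",
}


def to_isocat(tag: str) -> str:
    """Return the ISOcat PID for a local notation.

    Raises KeyError for unmapped notations so that gaps in the
    mapping are noticed instead of silently ignored.
    """
    try:
        return NOTATION_TO_ISOCAT[tag]
    except KeyError:
        raise KeyError(f"no ISOcat data category mapped for {tag!r}") from None


def unmapped(tags) -> list[str]:
    """List the notations in `tags` that still lack an ISOcat PID."""
    return sorted({t for t in tags if t not in NOTATION_TO_ISOCAT})


print(to_isocat("zn"))
# "ww" and "adj" are hypothetical tags used only to show the gap check.
print(unmapped(["zn", "ww", "zn", "adj"]))
```

Running a check like unmapped over all tags in a resource gives a quick list of the concepts that still need to be looked up in, or added to, ISOcat.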

 

Q: Where can I obtain more information on ISOcat?

A: Concept Registry Short Guide and ISOcat website


Metadata

Q: What is CMDI?

A: Descriptive metadata is used to characterize data resources and tools in order to facilitate discovery and management in large (virtual) infrastructures and repositories, i.e. it makes resources visible to everyone.

CMDI is the CLARIN Component Metadata Framework. The need for a component-based metadata framework has been established in studies and discussions on metadata in the European CLARIN preparatory project. A first version of CMDI has been defined. A project (executed by representatives of the targeted Dutch CLARIN Centres) is about to start that will carry out the first experiments with CMDI on real data, in particular data that have been found to be troublesome for the IMDI framework. If these experiments are successful, a stable and tested version of CMDI can be released for use by others and supporting tools can be developed.

 

Q: Where can I obtain more introductory information on CMDI?

A: In the CMDI Short Guide

 

Q: What is IMDI?

A: The ISLE Meta Data Initiative (IMDI) is a proposed metadata standard to describe multi-media and multi-modal language resources. The standard provides interoperability for browsable and searchable corpus structures and resource descriptions with the help of specific tools.

Q: How do I make CLARIN-compliant Metadata?

A: As long as CMDI has not been released officially, you should make metadata for resources in your project in accordance with IMDI. You will be assisted by an IS-specialist who is an expert in the area of metadata. It will be guaranteed that IMDI descriptions of metadata can be automatically converted into CMDI descriptions.

If CMDI has been officially released before or at the very beginning of your project, you will have to make a description in accordance with CMDI. CLARIN-NL will provide ample educational and training opportunities to become acquainted with CMDI and the tools that support it.


CLARIN Centres

Q: Which are the CLARIN centres in the Netherlands and how can I contact them?

A: The status and types of CLARIN centres have been described in this document. There is also a Short Guide on CLARIN centres. There are currently no officially recognized CLARIN centres, whether in the Netherlands or abroad. However, a number of organizations intend to become a CLARIN Centre and are working towards this status. In the Netherlands there are currently four such centres, each with their own expertise and specialization: INL, Meertens, MPI and DANS.

You can approach these institutes in relation to projects for the CLARIN-NL first call via their contact persons, who are listed under "CLARIN Centres in the Netherlands: Contact Details" at the end of this FAQ.

Other organizations that aim to become an officially recognized CLARIN Centre can contact the CLARIN-NL Office.

 


Web Services

Q: What are web services, and how are they relevant to my project?

A: Web services are programs that can be called from other programs that reside somewhere on the World Wide Web. They differ from other programs in that (1) they must communicate with other programs, and (2) they must do so over the World Wide Web, which requires special protocols (SOAP is one important example of such a protocol). It is expected that most web applications that consist of two clearly separated parts, viz. a web-based user interface part and a core functionality part with a well-defined API, can easily be turned into web services using a generic wrapper.
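As a rough illustration of the generic-wrapper idea, the Python sketch below exposes a core function with a well-defined API as a small HTTP service, using only the standard library. It is a sketch under assumptions: it uses plain HTTP with JSON rather than SOAP, and the names tokenize, make_service and call_locally are hypothetical and not part of any CLARIN software.

```python
import io
import json
from wsgiref.util import setup_testing_defaults


def tokenize(text: str) -> list[str]:
    """Core functionality: a deliberately naive whitespace tokenizer."""
    return text.split()


def make_service(core):
    """Wrap any text -> JSON-serializable function as a WSGI application."""

    def app(environ, start_response):
        # The request body carries the input text.
        length = int(environ.get("CONTENT_LENGTH") or 0)
        text = environ["wsgi.input"].read(length).decode("utf-8")
        payload = json.dumps(core(text)).encode("utf-8")
        start_response(
            "200 OK",
            [
                ("Content-Type", "application/json"),
                ("Content-Length", str(len(payload))),
            ],
        )
        return [payload]

    return app


def call_locally(app, text: str):
    """Invoke the WSGI app in-process, the way a client would over HTTP."""
    body = text.encode("utf-8")
    environ = {}
    setup_testing_defaults(environ)
    environ.update(REQUEST_METHOD="POST", CONTENT_LENGTH=str(len(body)))
    environ["wsgi.input"] = io.BytesIO(body)
    status = []
    chunks = app(environ, lambda s, headers: status.append(s))
    return status[0], json.loads(b"".join(chunks))


service = make_service(tokenize)
print(call_locally(service, "Corpus Gesproken Nederlands"))
```

The core function knows nothing about the web; the wrapper alone handles the protocol. To serve the same object over a network, it could be passed to wsgiref.simple_server.make_server.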

If your application includes software tools and services, you should contact the infrastructure team as early as possible to discuss how your software can best be made available to other users within the CLARIN infrastructure.

Web services are important because a lot of the functionality offered in the CLARIN infrastructure will take the form of web services. This will make it possible to set up workflows of interacting programs, e.g. a pipeline of actions that have to be carried out in sequence, such as text cleaning, text normalization, tokenization, PoS tagging, lexicalization, Named Entity Recognition and full parsing web services applied to a text corpus.
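The workflow idea can be sketched as function composition. In this hypothetical Python example each step is a local stand-in for a remote web service; the step implementations (clean, normalize, tokenize) are toy assumptions and not actual CLARIN services.

```python
from functools import reduce
from typing import Any, Callable

Step = Callable[[Any], Any]


def clean(text: str) -> str:
    """Collapse runs of whitespace into single spaces."""
    return " ".join(text.split())


def normalize(text: str) -> str:
    """Lower-case the text."""
    return text.lower()


def tokenize(text: str) -> list[str]:
    """Split on single spaces."""
    return text.split(" ")


def pipeline(*steps: Step) -> Step:
    """Chain steps so that each one consumes the previous one's output."""
    return lambda data: reduce(lambda acc, step: step(acc), steps, data)


process = pipeline(clean, normalize, tokenize)
print(process("  Corpus   Gesproken\nNederlands "))
```

Because every step has the same shape (one input, one output), steps can be reordered or swapped for remote calls without changing the pipeline function itself.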


Participation of Foreigners and/or Foreign Organizations

Q: Can foreign persons participate in CLARIN-NL?

A: Foreign persons (i.e. persons with a nationality other than Dutch) can participate in a CLARIN-NL project if they carry out the work for the project as an employee of a CLARIN-NL partner that has signed the CLARIN-NL consortium agreement.

 

Q: Can foreign organizations participate in CLARIN-NL?

A: Organizations from outside of the Netherlands can in principle participate in proposals for CLARIN-NL, but:

·         They should preferably do so in cooperation with one of the CLARIN-NL partners (all of which are, and must be, organizations from the Netherlands).

·         It must be justified why certain work must be done by a foreign organization (and not by a CLARIN-NL partner).

·         No funding can be provided by CLARIN-NL to organizations that are not CLARIN-NL partners.


Funding of an awarded project

Q: Which costs are eligible for funding in projects for the CLARIN-NL First Call?

A: Two types of costs are eligible for funding in these projects:

·         Personnel costs directly related to the project up to a maximum of €60,000 in accordance with the Akkoord NWO-VSNU 2008 (and any additions to it). See also http://www.nwo.nl/nwohome.nsf/pages/NWOP_67QK4E.

·         A fee of at most €3,000 per FTE per year (or a pro rata part for less than 1 FTE per year) for covering travel and subsistence costs.
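As a worked example of the pro rata rule, the Python sketch below computes the maximum travel and subsistence fee. It assumes that "pro rata" means scaling linearly with both the FTE fraction and the duration of the appointment; that reading, and the function name max_travel_fee, are assumptions rather than official CLARIN-NL rules.

```python
from fractions import Fraction

MAX_FEE_PER_FTE_YEAR = 3000  # euros per FTE per year, from the FAQ


def max_travel_fee(fte: Fraction, months: int) -> Fraction:
    """Maximum fee in euros for `fte` FTE employed for `months` months.

    Assumes linear scaling in both the FTE fraction and the duration.
    """
    return MAX_FEE_PER_FTE_YEAR * fte * Fraction(months, 12)


# Half an FTE for six months: 3000 * 1/2 * 6/12 = 750 euros.
print(max_travel_fee(Fraction(1, 2), 6))
```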

 

Q: Are “vakantiegeld” (holiday allowance) and “eindejaarsuitkering” (end-of-year bonus) related to personnel costs eligible for funding?

A: Yes. Apart from the “basissalaris” (base salary), personnel costs also include “vakantiegeld” and “eindejaarsuitkering”. The applicable percentages are mentioned in the Akkoord NWO-VSNU 2008.

 

Q: Is the “opslag voor werkgeverslasten” (surcharge for employer's charges) eligible for funding?

A: Yes. See the Akkoord NWO-VSNU 2008, in particular 2.1b and ad b.

 

Q: Is the “opslag overige personeelskosten” (surcharge for other personnel costs) eligible for funding?

A: Yes, see the Akkoord NWO-VSNU 2008.

 

Q: Is the “einde projectvergoeding” (end-of-project compensation) eligible for funding?

A: In principle yes, but the projects in this call will be shorter than one year, so this does not apply (see Akkoord NWO-VSNU 2008, 2.1, ad d).

Q: Is the “bench fee” of €5000 mentioned in the Akkoord NWO-VSNU 2008 eligible for funding?

A: NO. Instead a yearly “travel and subsistence costs” fee is eligible for funding.

 

Q: Are material costs other than the ones falling under the "travel and subsistence cost" fee eligible for funding?

A: For projects in this First Call such costs are NOT eligible for funding. This is deliberate: the nature of the projects is such that material costs of this kind should not be necessary. If you can argue convincingly that such costs are necessary anyway, please contact the CLARIN-NL office as soon as possible so that the CLARIN-NL Executive Board can assess the issue.

 

Q: When and how do I get the actual money for my project?

A: If your project is awarded, you will receive a commitment letter describing in detail what your rights and obligations are in relation to your project. This letter will also contain a list (that must be provided by the project leader) of the goals of the project and a list of deliverables, their target dates and types (documents, software, etc.).

·         This letter has to be signed for approval and returned to the CLARIN-NL office.

·         Once the signed commitment letter has been received by the CLARIN-NL office, you are entitled to an advance payment of at most 75% of the costs of your project.

·         The funding will be declared definitive by the CLARIN-NL board if

o    Your project has finished, achieved the goals it has set and delivered all deliverables, and

o    A final technical report and a financial report on the project have been made available to the CLARIN-NL office, and

o    The CLARIN-NL board has approved these technical and financial reports.

·         The remaining 25% of the funding for your project will be paid as soon as the funding for your project has been declared definitive.
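The payment schedule above amounts to simple arithmetic, sketched here in Python. The project cost of €40,000 is a made-up figure and payment_schedule is a hypothetical helper; the sketch assumes that the advance is paid at the 75% maximum.

```python
from decimal import Decimal

MAX_ADVANCE_RATE = Decimal("0.75")  # advance is capped at 75% per the FAQ


def payment_schedule(total_cost: Decimal) -> tuple[Decimal, Decimal]:
    """Return (advance, remainder), assuming the maximum advance is paid."""
    advance = (total_cost * MAX_ADVANCE_RATE).quantize(Decimal("0.01"))
    remainder = total_cost - advance
    return advance, remainder


# Hypothetical project costing 40,000 euros.
advance, remainder = payment_schedule(Decimal("40000"))
print(advance, remainder)
```

The remainder is only paid once the funding has been declared definitive, so a project should not count on it before the final reports have been approved.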

 

Q: Who will the advance payment and the remaining funding be paid to?

A: To the organization where the project leader is carrying out the project as an employee. This organization must be a CLARIN-NL partner that has signed the CLARIN-NL consortium agreement.

 

Q: How do I get my money if I am a project participant?

A: The organization of the project leader will receive the money for the whole project from CLARIN-NL, and this organization must pass on the portion assigned to your organization in accordance with the commitment letter.

 


IPR

Q: What are the general rules that apply in relation to IPR and Ethical Issues for projects and project proposals in the First Call?

A: These are described in the Call Text and here

 

Q: The IPR of my resource is arranged in a way as required by CLARIN-NL but not all subjects whose speech or audio occur in it have given explicit permission. Can I put such data on a server of a CLARIN centre?

A: In principle yes, though it may depend somewhat on the policy of the specific CLARIN centre where the data reside, and provided you have a procedure in place so that subjects who object can make this known to your organization and measures can be taken to accommodate their objections (e.g. by restricting access to these data or, in the extreme case, by removing the relevant data from a CLARIN server). You will also have to describe this procedure in your project deliverables and list its functionality as one of the functionalities that should be offered by the CLARIN infrastructure.

 

Q: What happens to derivatives created by CLARIN partners that have made use of data for which a subject has requested removal?

A: If such a derivative contains or presents the relevant data in a recognizable way, it is subject to the same measures as applied to the original data (see the answer to the previous question). However, if such a derivative does not contain the relevant data in a recognizable way, CLARIN-NL will support all parties involved in reaching an agreement that allows the derivative to stay available and maximally accessible in the CLARIN infrastructure.


Project Proposal and Template

Q: Must I use the project proposal template as provided on the CLARIN-NL website?

A: Yes, this is obligatory.

 

Q: Must I use the table for calculating the costs in the project proposal template?

A: This table has been added for your convenience. We strongly advise you to use this table. If, for some reason, this table is not suited to your needs (e.g. you want to compute the costs by another unit than Person Months (PM)), you are allowed to replace the table by another table. Of course, the overview of the costs must in all cases be clearly specified and the calculations must be correct.

 

Q: In what format must I submit my proposal?

A: The mandatory format is PDF. Though the template is in the MS Word format, you must convert your finalized proposal from Word into PDF before submitting it.

 

Q: When must I submit my proposal?

A: The deadline for submitting proposals is Monday August 17, 2009, 13:00 hrs. At this time the web submission form will be closed and submissions via the web form will no longer be possible. All proposals submitted after the deadline will be considered formally non-compliant and will not be taken into consideration.


CLARIN Centres in the Netherlands: Contact Details

CLARIN Centres: Contact Persons INL

Coordination: Jan Theo Bakker

Technical Matters: Sebastiaan Jansen

Postal Address: Matthias de Vrieshof 2-3, 2311 BZ Leiden, The Netherlands

Tel: +31 71 5141648

Fax: +31 71 5272115

e-mail: Jan Theo Bakker and Sebastiaan Jansen

CLARIN Centres: Contact Persons Meertens

Coordination: Douwe Zeldenrust

Technical Matters: Jan Pieter Kunst

Postal Address: Postbus 94264, 1090 GG Amsterdam, The Netherlands

Tel: +31 20 4628500

Fax: +31 20 462 85 55

e-mail: douwe.zeldenrust@meertens.knaw.nl and jan.pieter.kunst@meertens.knaw.nl

CLARIN Centres: Contact Person MPI

Name: Daan Broeder

Postal Address: PO Box 310, 6500 AH Nijmegen, The Netherlands

Tel: +31-24-3521103

Fax: +31-24-3521213

e-mail: daan.broeder@mpi.nl

 

CLARIN Centres: Contact Person DANS

Name: Dirk Roorda

Postal Address: P.O. Box 93067, 2509 AB Den Haag, The Netherlands

Tel: +31-70 3494450

Fax: +31-070 3494451

e-mail: dirk.roorda@dans.knaw.nl