CALL FOR PARTICIPATION

Workshop on Juris-informatics (JURISIN) 2014

Competition on Legal Information Extraction/Entailment (COLIEE-14)

November 23-24, 2014
Raiosha Building, Keio University, Kanagawa, Japan

Submission deadline: September 17, 2014 (extended)
Submission page: https://easychair.org/conferences/conference_info.cgi?a=7296960


For the paper format and submission instructions, please refer to JURISIN 2014.

Sponsored by the
National Institute of Informatics (NII)

The Juris-informatics (JURISIN) workshop series was created to discuss both fundamental and practical issues among people from various backgrounds, such as law, social science, information and intelligent technology, logic, and philosophy, including the conventional "AI and law" area.
Information extraction and reasoning from legal data is one of the important targets of JURISIN, including legal information representation, relation extraction, textual entailment, summarization, and their applications. Participants in the JURISIN workshops have examined a wide variety of information extraction techniques and environments with a variety of purposes, including retrieval of relevant articles, entity/relation extraction from legal cases, reference extraction from legal cases, finding relevant precedents, summarization of legal cases, and question answering. Details about JURISIN can be found at the JURISIN web site:

http://www.jaist.ac.jp/jurisin2014


This year, JURISIN invites participation in a competition on legal information extraction/entailment. Previous conferences/workshops have not conducted such a shared task on a large legal data collection, so we hope that the 2014 workshop will help establish a major experimental effort in the legal information extraction/retrieval field. The motivation for the competition is to help create a research community of practice for the capture and use of legal information.

The Legal Question Answering Task

This first competition focuses on two aspects of legal information processing related to answering yes/no questions from Japanese legal bar exams (the relevant data sets have been translated from Japanese to English).

1) Phase One of the legal question answering task involves reading a legal bar exam question Q, and extracting from the entire Civil Code the subset of articles S1, S2, ..., Sn that is appropriate for answering the question, such that

Entails(S1, S2, ..., Sn, Q) or Entails(S1, S2, ..., Sn, not Q).

Given a question Q and the entire set of Civil Code articles, participants must retrieve the set "S1, S2, ..., Sn" as the answer for this phase.

2) Phase Two of the legal question answering task involves the identification of an entailment relationship such that

Entails(S1, S2, ..., Sn, Q) or Entails(S1, S2, ..., Sn, not Q).

Given a question Q and relevant articles S1, S2, ..., Sn, participants must determine whether the relevant articles entail "Q" or "not Q". The answer for this phase is binary: "YES" ("Q") or "NO" ("not Q").

The Legal Question Answering Data Corpus

The corpus of legal questions is drawn from Japanese legal bar exams, and the relevant Japanese Civil Code articles are also provided (file format and access are described below).

1) The Phase One problem is to use an identified set of legal yes/no questions to retrieve the relevant Civil Code articles. In this case, the correct answers have been determined by a group of law students, and those answers are used to calibrate the performance of programs solving Phase One.

2) The Phase Two task requires some method of information extraction from both the question and the relevant articles, followed by confirmation of a simple entailment relationship as described above: the articles confirm either "yes" or "no" as the answer to the yes/no question.

Measuring the Competition Results

The measures for ranking competition participants are intended only to calibrate the set of competition submissions, rather than to provide any deep performance measure. The data sets for Phases One and Two are annotated, so simple information retrieval measures (precision, recall, F-measure, accuracy), described in detail below, can be used to rank each submission.

Wider dissemination of the JURISIN challenge results is welcome, but the conditions of participation specifically preclude any advertising claims based on JURISIN competition rankings.

As noted above, the intention is to start building a community of practice around legal textual entailment, so that general methods from a variety of fields can be adopted and adapted, and so that participants share their approaches, problems, and results.

We expect that all competition results submitted to JURISIN will be published in the proceedings and archived on the JURISIN web site.

Schedules

Submit your application to participate in JURISIN 2014 as described below.

Submitting an application will add you to the active participants' mailing list.

July 21, 2014: Training corpus available.
September 17, 2014 (extended): Paper submission deadline. The submission page is https://easychair.org/conferences/conference_info.cgi?a=7296960 (we use the same submission page as JURISIN 2014, and the deadline is the same as the JURISIN 2014 paper deadline). Participants need to include their evaluation results using cross-validation on the training data.
September 22, 2014: Test queries available.
October 6, 2014: Results submission deadline.
November 23-24, 2014: JURISIN 2014 workshop; assessments returned to participants.

Phase One Details

Our goal is to explore and evaluate legal document retrieval technologies that are both effective and reliable.

The task investigates the performance of systems that search a static set of Civil Code articles using previously unseen queries. The goal of the task is to return the articles in the collection that are relevant to a query. We call an article "Relevant" to a query iff the query sentence can be answered Yes/No as entailed from the meaning of the article. If combining the meanings of more than one article (e.g., "A", "B", and "C") can answer a query sentence, then all of those articles ("A", "B", and "C") are considered "Relevant". If a query can be answered by an article "D", and it can also be answered by another article "E" independently, then both articles "D" and "E" are considered "Relevant". This task requires the retrieval of all the articles that are relevant to answering a query.

Japanese Civil Code articles (with an English translation alongside the Japanese) will be provided, and the training data consists of pairs of a query and its relevant articles. The process of executing the queries over the articles and generating the experimental runs should be entirely automatic. Test data will include only queries, but no relevant articles.

There should be no human intervention at any stage, including modifications to your retrieval system motivated by an inspection of the queries. You should not materially modify your retrieval system between the time you download the queries and the time you submit your runs.

One run from each group will be assessed. The submission format and evaluation methods are described below.

Phase Two Details

Our goal is to construct Yes/No question answering systems for legal queries, by entailment from the relevant articles.

The task investigates the performance of systems that answer "Y" or "N" to previously unseen queries by comparing the meanings of queries and relevant articles.

Training data consists of triples of a query, relevant articles and a correct answer "Y" or "N". The process of executing the queries over the relevant articles and generating the experimental runs should be entirely automatic. Test data will include only queries and relevant articles, but no "Y/N" label.

There should be no human intervention at any stage, including modifications to your retrieval system motivated by an inspection of the queries. You should not materially modify your retrieval system between the time you download the queries and the time you submit your runs.

One run from each group will be assessed. The submission format and evaluation methods are described below.

Corpus Structure

The structure of the test corpora is derived from a general XML representation developed for use in RITEVAL, one of the tasks of the NII Testbeds and Community for Information access Research (NTCIR) project, as described at the following URL:

http://sites.google.com/site/ntcir11riteval/

The RITEVAL format was developed for the general sharing of information retrieval data across a variety of domains.

The format of the JURISIN competition corpora is derived from an NTCIR representation of confirmed relationships between questions and the articles and cases relevant to answering those questions, as in the following example:

<pair label="Y" id="H18-1-2">
<t1>
(Seller's Warranty in cases of Superficies or Other Rights)Article 566 (1)In cases where the subject matter of the sale is encumbered with for the purpose of a superficies, an emphyteusis, an easement, a right of retention or a pledge, if the buyer does not know the same and cannot achieve the purpose of the contract on account thereof, the buyer may cancel the contract. In such cases, if the contract cannot be cancelled, the buyer may only demand compensation for damages. (2)The provisions of the preceding paragraph shall apply mutatis mutandis in cases where an easement that was referred to as being in existence for the benefit of immovable property that is the subject matter of a sale, does not exist, and in cases where a leasehold is registered with respect to the immovable property.(3)In the cases set forth in the preceding two paragraphs, the cancellation of the contract or claim for damages must be made within one year from the time when the buyer comes to know the facts.
(Seller's Warranty in cases of Mortgage or Other Rights)Article 567(1)If the buyer loses his/her ownership of immovable property that is the object of a sale because of the exercise of an existing statutory lien or mortgage, the buyer may cancel the contract.(2)If the buyer preserves his/her ownership by incurring expenditure for costs, he/she may claim reimbursement of those costs from the seller.(3)In the cases set forth in the preceding two paragraphs, the buyer may claim compensation if he/she suffered loss.
</t1>
<t2>
There is a limitation period on pursuance of warranty if there is restriction due to superficies on the subject matter, but there is no restriction on pursuance of warranty if the seller's rights were revoked due to execution of the mortgage.
</t2>
</pair>

The above is an example in which query id "H18-1-2" is confirmed to be answerable from article numbers 566 and 567 (relevant to Phase One). The pair label "Y" in this example means that the answer to the query is "Yes", as entailed from the relevant articles (relevant to Phase Two).

For Phases One and Two, the training data will be the same. Groups participating in only Phase One can disregard the pair label.

For Phase One, the test corpora will include only the query field, but no articles. For Phase Two, the test corpora will include both the query and the article fields, but no pair label.
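
As an illustration, the pairs in a corpus file of this format can be read with standard XML tooling. The following is a minimal Python sketch; the file name is hypothetical, and we assume the pair elements are wrapped in a single root element:

import xml.etree.ElementTree as ET

# Read every <pair> element from a RITEVAL-style corpus file.
# The file name is hypothetical; the label attribute and the <t1>
# field are absent in some test corpora, as described above.
tree = ET.parse("coliee_training.xml")
for pair in tree.getroot().iter("pair"):
    qid = pair.get("id")            # e.g. "H18-1-2"
    label = pair.get("label")       # "Y"/"N" in training data; None if absent
    articles = pair.findtext("t1")  # relevant Civil Code article text(s)
    query = pair.findtext("t2")     # the bar exam question
    print(qid, label, (query or "").strip()[:60])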

Competition Results Submission Format

For Phase One, a submission should consist of a single ASCII text file. Use a single space to separate the columns, with three columns per line, as follows:

H18-1-2 566 univABC
H18-1-2 567 univABC
H18-5-A 322 univABC
H19-19-I 433 univABC
H21-5-3 110 univABC
.
.
.
where:

1. The first column is the query id.
2. The second column is the official article number of the retrieved article.
3. The third column is called the "run tag" and should be a unique identifier for the submitting group, i.e., each run should have a different tag that identifies the group. Please restrict run tags to 12 or fewer letters and numbers, with no punctuation.
In this example of a submission, you can see that H18-1-2 has multiple relevant articles (566 and 567).
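
A Phase One run in this format could be serialized as in the following minimal Python sketch (the retrieval results, file name, and run tag are hypothetical):

# Hypothetical retrieval output: query id -> retrieved article numbers.
retrieved = {
    "H18-1-2": ["566", "567"],
    "H18-5-A": ["322"],
}
run_tag = "univABC"  # unique group identifier: at most 12 letters/digits

with open("phase1_run.txt", "w") as f:
    for qid, articles in retrieved.items():
        for article in articles:
            f.write("{0} {1} {2}\n".format(qid, article, run_tag))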

For Phase Two, again a submission should consist of a single ASCII text file. Use a single space to separate the columns, with three columns per line, as follows:

H18-1-2 Y univABC
H18-5-A N univABC
H19-19-I Y univABC
H21-5-3 N univABC
.
.
.
where:

1. Columns 1 and 3 are as for Phase One.
2. The second column is "Y" or "N", indicating whether the yes/no question was confirmed to be true ("Y") or false ("N") by the relevant articles.

Competition Evaluation Measures

For Phase One, the evaluation measures will be precision, recall, and F-measure:

Precision = (the number of correctly retrieved articles for all queries) / (the number of retrieved articles for all queries)

Recall = (the number of correctly retrieved articles for all queries) / (the number of relevant articles for all queries)

F-measure = (2 x Precision x Recall) / (Precision + Recall)

For Phase Two, the evaluation measure will be accuracy, with respect to whether the yes/no question was correctly confirmed:
Accuracy = (the number of queries which were correctly confirmed as true or false) / (the number of all queries)
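
These measures are straightforward to compute once the gold annotations and a run have been loaded. The following is a minimal Python sketch, not the official evaluation script; the dictionary layout (keyed by query id) is our assumption:

def phase_one_scores(gold, run):
    """gold, run: dicts mapping query id -> set of article numbers."""
    retrieved = sum(len(arts) for arts in run.values())
    relevant = sum(len(arts) for arts in gold.values())
    correct = sum(len(gold[qid] & run.get(qid, set())) for qid in gold)
    precision = float(correct) / retrieved  # assumes a non-empty run
    recall = float(correct) / relevant
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure

def phase_two_accuracy(gold, run):
    """gold, run: dicts mapping query id -> "Y" or "N"."""
    correct = sum(1 for qid, label in gold.items() if run.get(qid) == label)
    return float(correct) / len(gold)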

Task coordinators:

Mi-Young Kim, Randy Goebel, University of Alberta, Canada
Ken Satoh, National Institute of Informatics, Japan

Workshop Format

The workshop itself will be used as a forum both for presentation of results (including failure analysis and system comparisons), and for more lengthy system presentations describing legal information retrieval techniques used, experiments run using the data, and other issues of interest to researchers in legal information retrieval. All groups will be invited to present their results in the workshop. Papers of exceptional quality will be included in LNAI post-proceedings after a second review process.

Application Details

Organizations wishing to participate in JURISIN 2014 should respond to this call for participation by submitting an application. To apply, submit the application form and the memorandums at the following URLs to miyoung2@ualberta.ca:

Application:
http://webdocs.cs.ualberta.ca/~miyoung2/jurisin_task/application.pdf
Memorandum for Japanese Data
http://webdocs.cs.ualberta.ca/~miyoung2/jurisin_task/JA_memorandum.pdf
Memorandum for English Data
http://webdocs.cs.ualberta.ca/~miyoung2/jurisin_task/EN_memorandum.pdf

We will send an acknowledgement to the email address supplied in the form once we have processed it.

Any questions about conference participation should be sent to the general JURISIN email address, tojo(at)jaist.ac.jp.


Last updated: 18-July-2014