
COLIEE-2016 CALL FOR TASK PARTICIPATION

Competition on Legal Information Extraction/Entailment (COLIEE)

run in association with Workshop on Juris-informatics (JURISIN) 2016

November 14-16, 2016
Raiosha Building, Keio University Kanagawa, Japan

Registration Due: July 15, 2016 (extended from June 30, 2016)

Those who wish to use the COLIEE-2015 data for a trial should contact miyoung2(at)ualberta.ca.

Sponsored by the
National Institute of Informatics (NII)

The Juris-informatics (JURISIN) workshop series was created to discuss both fundamental and practical issues among people from various backgrounds, such as law, social science, information and intelligent technology, logic, and philosophy, including the conventional "AI and law" area.
Information extraction and reasoning from legal data are among the important targets of JURISIN, including legal information representation, relation extraction, textual entailment, summarization, and their applications. Participants in the JURISIN workshops have examined a wide variety of information extraction techniques and environments with a variety of purposes, including retrieval of relevant articles, entity/relation extraction from legal cases, reference extraction from legal cases, finding relevant precedents, summarization of legal cases, and question answering.
In 2016, JURISIN invites participation in a competition on legal information extraction/entailment. We held the first Competition on Legal Information Extraction/Entailment (COLIEE-2014) on a legal data collection in 2014, and it helped establish a major experimental effort in the legal information extraction/retrieval field. We will hold the third competition (COLIEE-2016) in 2016; the motivation for the competition is to help create a research community of practice for the capture and use of legal information.

The Legal Question Answering Task

This competition focuses on two aspects of legal information processing related to answering yes/no questions from Japanese legal bar exams (the relevant data sets have been translated from Japanese to English).

1) Phase One of the legal question answering task involves reading a legal bar exam question Q and extracting from the entire Civil Code the subset of Japanese Civil Code Articles S1, S2, ..., Sn that is appropriate for answering the question, such that

Entails(S1, S2, ..., Sn , Q) or Entails(S1, S2, ..., Sn , not Q).

Given a question Q and the entire set of Civil Code Articles, we have to retrieve the set "S1, S2, ..., Sn" as the answer for this track.

2) Phase Two of the legal question answering task involves the identification of an entailment relationship such that

Entails(S1, S2, ..., Sn , Q) or Entails(S1, S2, ..., Sn , not Q).

Given a question Q and relevant articles S1, S2, ..., Sn, we have to determine whether the relevant articles entail "Q" or "not Q". The answer for this track is binary: "YES" ("Q") or "NO" ("not Q").

The Legal Question Answering Data Corpus

The corpus of legal questions is drawn from Japanese Legal Bar exams, and the relevant Japanese Civil Law articles are also provided (file format and access described below).

1) The Phase One problem is to use an identified set of legal yes/no questions to retrieve relevant Civil Law articles. The correct answers have been determined by a group of law students, and those answers are used to calibrate the performance of programs that solve Phase One.

2) The Phase Two task requires some method of information extraction from both the question and the relevant articles, followed by confirmation of a simple entailment relationship as described above: the articles confirm either "yes" or "no" as the answer to the yes/no question.

3) The Phase Three task is a combination of Phase One and Phase Two. It requires both a legal information retrieval system and a textual entailment system. Given a set of legal yes/no questions, your legal information retrieval system retrieves relevant Civil Law articles, and you then confirm a 'Yes/No' entailment relationship between the input yes/no question and your retrieved articles.

Participants can choose which of the three sub-tasks they will apply for, as follows:

1. Sub-task 1: Legal information retrieval task. Input is a bar exam 'Yes/No' question and output should be relevant civil law articles. (Phase One)

2. Sub-task 2: Recognizing Entailment between law articles and queries. Input is a pair of a question and relevant article(s), and output should be 'Yes' or 'No'. (Phase Two)

3. Sub-task 3: Combination of sub-task 1 and sub-task 2. Input is a bar exam 'Yes/No' question and output should be 'Yes' or 'No'. (Phase Three)

Measuring the Competition Results

The measures for ranking competition participants are intended only to calibrate the set of competition submissions, rather than to provide any deep performance measure. The data sets for Phases One and Two are annotated, so simple information retrieval measures (precision, recall, F-measure, accuracy), described in detail below, can be used to rank each submission.

Wider dissemination of the JURISIN challenge results is welcome, but the conditions of participation specifically preclude any advertising claims based on JURISIN competition rankings.

As noted above, the intention is to start to build a community of practice regarding legal textual entailment, so that the adoption and adaptation of general methods from a variety of fields can be considered and participants can share their approaches, problems, and results.

We expect that all competition results submitted to JURISIN will be published in the proceedings and archived on the JURISIN web site.

Schedules

Submit your application to participate in COLIEE-2016 as described below.

Submitting an application will add you to the active participants' mailing list.

July 2, 2016 (changed from June 30, 2016): Dry run data release.
July 15, 2016 (extended from June 30, 2016): Task registration due.
August 5, 2016 (changed from August 1, 2016): Formal run data release.
August 20, 2016 (extended from August 10, 2016): Formal run submission due.
August 26, 2016: Paper submission due (please submit papers to the JURISIN 2016 workshop).
November 14-16, 2016: JURISIN Workshop with COLIEE-2016; assessment returned to participants.

Phase One Details

Our goal is to explore and evaluate legal document retrieval technologies that are both effective and reliable.

The task investigates the performance of systems that search a static set of civil law articles using previously unseen queries. The goal of the task is to return the articles in the collection that are relevant to a query. We call an article "Relevant" to a query iff the query sentence can be answered Yes/No by entailment from the meaning of the article. If combining the meanings of more than one article (e.g., "A", "B", and "C") can answer a query sentence, then all of those articles ("A", "B", and "C") are considered "Relevant". If a query can be answered by an article "D", and it can also be answered independently by another article "E", then both articles "D" and "E" are considered "Relevant". This task requires the retrieval of all the articles that are relevant to answering a query.

Japanese civil law articles (an English translation alongside the Japanese) will be provided, and the training data consists of pairs of a query and its relevant articles. The process of executing the queries over the articles and generating the experimental runs should be entirely automatic. Test data will include only queries, with no relevant articles.

There should be no human intervention at any stage, including modifications to your retrieval system motivated by an inspection of the queries. You should not materially modify your retrieval system between the time you download the queries and the time you submit your runs.

One run from each group will be assessed. The submission format and evaluation methods are described below.
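
For illustration only, a minimal Phase One baseline might rank articles by TF-IDF cosine similarity between the query and each article. The Python sketch below assumes scikit-learn is installed; the article texts, the query, and the top_k cutoff are hypothetical placeholders, not part of the task definition, and any retrieval approach is acceptable.

# Illustrative Phase One baseline: rank Civil Code articles by TF-IDF
# cosine similarity to a query. The articles and query here are
# hypothetical placeholders, not COLIEE data.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

articles = {
    "566": "If the subject matter of the sale is encumbered with a superficies ...",
    "567": "If the buyer loses ownership of immovable property because of a mortgage ...",
}

def retrieve(query, articles, top_k=2):
    ids = list(articles)
    vectorizer = TfidfVectorizer()
    matrix = vectorizer.fit_transform([articles[i] for i in ids] + [query])
    # Last row is the query; score it against every article row.
    scores = cosine_similarity(matrix[-1], matrix[:-1]).ravel()
    ranked = sorted(zip(ids, scores), key=lambda pair: pair[1], reverse=True)
    return [article_id for article_id, _ in ranked[:top_k]]

print(retrieve("Is there a limitation period on warranty claims?", articles))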

Phase Two Details

Our goal is to construct Yes/No question answering systems for legal queries, by entailment from the relevant articles.

The task investigates the performance of systems that answer "Y" or "N" to previously unseen queries by comparing the meanings between queries and relevant articles.

Training data consists of triples of a query, relevant articles and a correct answer "Y" or "N". The process of executing the queries over the relevant articles and generating the experimental runs should be entirely automatic. Test data will include only queries and relevant articles, but no "Y/N" label.

There should be no human intervention at any stage, including modifications to your retrieval system motivated by an inspection of the queries. You should not materially modify your retrieval system between the time you download the queries and the time you submit your runs.
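
Again for illustration only, the simplest conceivable Phase Two baseline is word overlap between the question and the relevant article text; in the sketch below the 0.6 threshold is an arbitrary illustrative choice, not a COLIEE setting.

# Illustrative Phase Two baseline: answer "Y" iff the question shares
# enough words with the relevant article text.
def entails(articles_text, question, threshold=0.6):
    article_words = set(articles_text.lower().split())
    question_words = set(question.lower().split())
    overlap = len(question_words & article_words) / max(len(question_words), 1)
    return "Y" if overlap >= threshold else "N"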

Phase Three Details

Our goal is to construct both a Phase One and a Phase Two system. Given a 'Yes/No' legal bar exam question, your legal information retrieval system retrieves relevant Civil Law articles. The task then investigates the performance of systems that answer 'Y' or 'N' to previously unseen queries by comparing the meanings of the queries and the retrieved Civil Law articles. Training data consists of triples of a query, relevant article(s), and a correct answer "Y" or "N". Test data will include only queries: no 'Y/N' labels and no relevant articles.

One run from each group will be assessed. The submission format and evaluation methods are described below.
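
Since Phase Three is the composition of the two sub-tasks, a system can simply chain any Phase One and Phase Two components; a minimal sketch, where retrieve_fn and entail_fn stand for arbitrary systems (e.g., the hypothetical baselines sketched above):

# Phase Three as a composition of the two sub-tasks. articles is a
# {article number: text} dictionary as in the earlier sketches.
def phase_three(query, articles, retrieve_fn, entail_fn):
    relevant_ids = retrieve_fn(query, articles)
    combined_text = " ".join(articles[i] for i in relevant_ids)
    return entail_fn(combined_text, query)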

Corpus Structure

The structure of the test corpora is derived from a general XML representation developed for use in RITEVAL, one of the tasks of the NII Testbeds and Community for Information access Research (NTCIR) project, as described at the following URL:

http://sites.google.com/site/ntcir11riteval/

The RITEVAL format was developed for the general sharing of information retrieval data across a variety of domains.

The format of the JURISIN competition corpora is derived from an NTCIR representation of confirmed relationships between questions and the articles and cases relevant to answering them, as in the following example:

<pair label="Y" id="H18-1-2">
<t1>
(Seller's Warranty in cases of Superficies or Other Rights)Article 566 (1)In cases where the subject matter of the sale is encumbered with for the purpose of a superficies, an emphyteusis, an easement, a right of retention or a pledge, if the buyer does not know the same and cannot achieve the purpose of the contract on account thereof, the buyer may cancel the contract. In such cases, if the contract cannot be cancelled, the buyer may only demand compensation for damages. (2)The provisions of the preceding paragraph shall apply mutatis mutandis in cases where an easement that was referred to as being in existence for the benefit of immovable property that is the subject matter of a sale, does not exist, and in cases where a leasehold is registered with respect to the immovable property.(3)In the cases set forth in the preceding two paragraphs, the cancellation of the contract or claim for damages must be made within one year from the time when the buyer comes to know the facts.
(Seller's Warranty in cases of Mortgage or Other Rights)Article 567(1)If the buyer loses his/her ownership of immovable property that is the object of a sale because of the exercise of an existing statutory lien or mortgage, the buyer may cancel the contract.(2)If the buyer preserves his/her ownership by incurring expenditure for costs, he/she may claim reimbursement of those costs from the seller.(3)In the cases set forth in the preceding two paragraphs, the buyer may claim compensation if he/she suffered loss.
</t1>
<t2>
There is a limitation period on pursuance of warranty if there is restriction due to superficies on the subject matter, but there is no restriction on pursuance of warranty if the seller's rights were revoked due to execution of the mortgage.
</t2>
</pair>

The above is an example where query id "H18-1-2" is confirmed to be answerable from article numbers 566 and 567 (relevant to Phase One). The pair label "Y" in this example means that the answer to the query is "Yes", as entailed from the relevant articles (relevant to Phase Two).
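
As a sketch of how such a file might be read with Python's standard library, assuming the <pair> elements are wrapped in a single root element (the wrapping and the file name below are hypothetical, not confirmed details of the release):

import xml.etree.ElementTree as ET

# Sketch of reading the pair format shown above.
tree = ET.parse("coliee_training.xml")  # hypothetical file name
for pair in tree.getroot().iter("pair"):
    label = pair.get("label")            # "Y" or "N"; absent in test data
    query_id = pair.get("id")            # e.g. "H18-1-2"
    article_text = pair.findtext("t1")   # relevant Civil Code article(s)
    question_text = pair.findtext("t2")  # bar exam question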

For Phases One, Two, and Three, the training data will be the same. Groups participating in only Phase One can disregard the pair label.

For Phase One, the test corpora will include only the query field, but no articles and no pair label. For Phase Two, the test corpora will include both the query and the article fields, but no pair label.

For Phase Three, the format of the test corpora will be the same as that of Phase One.

Competition Results Submission Format

For Phase One, a submission should consist of a single ASCII text file with three columns per line, separated by single spaces, as follows:

H18-1-2 566 univABC
H18-1-2 567 univABC
H18-5-A 322 univABC
H19-19-I 433 univABC
H21-5-3 110 univABC
.
.
.
where:

1. The first column is the query id.
2. The second column is the official article number of the retrieved article.
3. The third column is called the "run tag" and should be a unique identifier for the submitting group, i.e., each run should have a different tag that identifies the group. Please restrict run tags to 12 or fewer letters and numbers, with no punctuation.
In this example of a submission, you can see that H18-1-2 has multiple relevant articles (566 and 567).
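
As a sketch, such a file can be produced with a few lines of Python; the results dictionary and the run tag "univABC" below are placeholders:

# Sketch of writing a Phase One run file: one line per retrieved article.
results = {"H18-1-2": ["566", "567"], "H18-5-A": ["322"]}
with open("phase_one_run.txt", "w") as run_file:
    for query_id, article_ids in results.items():
        for article_id in article_ids:
            run_file.write(f"{query_id} {article_id} univABC\n")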

For Phase Two, a submission should again consist of a single ASCII text file with three columns per line, separated by single spaces, as follows:

H18-1-2 Y univABC
H18-5-A N univABC
H19-19-I Y univABC
H21-5-3 N univABC
.
.
.
where:

1. Columns 1 and 3 are as for Phase One.
2. The second column is "Y" or "N", indicating whether the Y/N question was confirmed to be true ("Y") or false ("N") by the relevant articles.

For Phase Three, the submission format will be the same as that of Phase Two.

Competition Evaluation Measures

For Phase One, the evaluation measures will be precision, recall, and F-measure:

Precision = (the number of correctly retrieved articles for all queries) / (the number of retrieved articles for all queries)

Recall = (the number of correctly retrieved articles for all queries) / (the number of relevant articles for all queries)

F-measure = (2 x Precision x Recall) / (Precision + Recall)
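
These counts are pooled over the (query id, article number) pairs of all queries. A sketch of the computation, where the toy gold/run sets are illustrative only:

# Sketch of the Phase One measures over pooled (query id, article) pairs.
def precision_recall_f(gold_pairs, retrieved_pairs):
    correct = len(gold_pairs & retrieved_pairs)
    precision = correct / len(retrieved_pairs) if retrieved_pairs else 0.0
    recall = correct / len(gold_pairs) if gold_pairs else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall else 0.0)
    return precision, recall, f_measure

gold = {("H18-1-2", "566"), ("H18-1-2", "567")}
run = {("H18-1-2", "566"), ("H18-5-A", "322")}
print(precision_recall_f(gold, run))  # (0.5, 0.5, 0.5)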

For Phase Two, the evaluation measure will be accuracy, with respect to whether the yes/no question was correctly confirmed:
Accuracy = (the number of queries correctly confirmed as true or false) / (the number of all queries)
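
A sketch of the computation, where gold and predicted are hypothetical {query id: "Y"/"N"} dictionaries:

# Sketch of the Phase Two/Three accuracy measure.
def accuracy(gold, predicted):
    correct = sum(1 for qid, label in gold.items() if predicted.get(qid) == label)
    return correct / len(gold)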

For Phase Three, the evaluation measure will be the same as that of Phase Two.

Task coordinators:

Mi-Young Kim, Randy Goebel, University of Alberta, Canada
Ken Satoh, National Institute of Informatics, Japan
Yoshinobu Kano, Shizuoka University, Japan

Workshop Format

The workshop itself will be used as a forum both for presentation of results (including failure analysis and system comparisons), and for more lengthy system presentations describing legal information retrieval techniques used, experiments run using the data, and other issues of interest to researchers in legal information retrieval. All groups will be invited to present their results in the workshop. Papers of exceptional quality will be included in LNAI post-proceedings after a second review process.

Application Details

Organizations wishing to participate in COLIEE-2016 should respond to this call for participation by submitting an application. To apply, submit the application form and the memorandums at the following URLs to miyoung2@ualberta.ca:

Application:
http://webdocs.cs.ualberta.ca/~miyoung2/COLIEE2016/application.pdf
Memorandum for Japanese Data
http://webdocs.cs.ualberta.ca/~miyoung2/COLIEE2016/JA_memorandum_2016.pdf
Memorandum for English Data
http://webdocs.cs.ualberta.ca/~miyoung2/COLIEE2016/EN_memorandum_2016.pdf

We will send an acknowledgement to the email address supplied in the form once we have processed the form.

Any questions about conference participation should be sent to the general JURISIN 2016 email address, mnakamur(at)nagoya-u.jp


Last updated: May, 2016