Call to ICAIL 2017


COLIEE-2017 CALL FOR TASK PARTICIPATION

Competition on Legal Information Extraction/Entailment (COLIEE)

run in association with the International Conference on Artificial Intelligence and Law (ICAIL) 2017

COLIEE-2017 Workshop: June 12, 2017
COLIEE-2017 Live Competition: June 13, 2017
London, UK

Workshop Program is now available.

COLIEE competition Registration Due: December 15, 2016
For the workshop registration, please check the ICAIL 2017 registration information.

Those who wish to use previous COLIEE data for a trial should contact miyoung2(at)ualberta.ca .

Sponsored by the
National Institute of Informatics (NII)

In 2017, the International Conference on Artificial Intelligence and Law (ICAIL) newly invites participation in a competition on legal information extraction/entailment. The previous three competitions on legal information extraction/entailment (COLIEE 2014-2016) were held on a legal data collection with the JURISIN workshop, and they helped establish a major experimental effort in the legal information extraction/retrieval field. The fourth competition (COLIEE-2017) will be held in 2017 with the ICAIL conference; its motivation is to help create a research community of practice for the capture and use of legal information.

The Legal Question Answering Task

This competition focuses on two aspects of legal information processing related to answering yes/no questions from Japanese legal bar exams (the relevant data sets have been translated from Japanese to English).

1) Phase one of the legal question answering task involves reading a legal bar exam question Q and extracting from the entire Civil Code the subset of Japanese Civil Code Articles S1, S2, ..., Sn appropriate for answering the question, such that

Entails(S1, S2, ..., Sn , Q) or Entails(S1, S2, ..., Sn , not Q).

Given a question Q and the entire set of Civil Code Articles, the system must retrieve the set "S1, S2, ..., Sn" as the answer for this track.

2) Phase two of the legal question answering task involves the identification of an entailment relationship such that

Entails(S1, S2, ..., Sn , Q) or Entails(S1, S2, ..., Sn , not Q).

Given a question Q, relevant articles S1, S2, ..., Sn are first retrieved through phase one, and then the system must determine whether the relevant articles entail "Q" or "not Q". The answer for this track is binary: "YES" ("Q") or "NO" ("not Q").

The Legal Question Answering Data Corpus

The corpus of legal questions is drawn from Japanese legal bar exams, and all the Japanese Civil Law articles are also provided (file format and access described below).

1) The Phase One problem is to use an identified set of legal yes/no questions to retrieve relevant Civil Law articles. In this case, the correct answers have been determined by a collection of law students, and those answers are used to calibrate the performance of a program to solve Phase One.

2) The Phase Two task requires some method of information extraction from both the question and the relevant articles, and then confirmation of a simple entailment relationship as described above: the articles confirm either "yes" or "no" as an answer to the yes/no question.

Participants can choose which of the two sub-tasks to apply for, as follows:

1. Sub-task 1: legal information retrieval Task. Input is a bar exam 'Yes/No' question and output should be relevant civil law articles. (Phase One)

2. Sub-task 2: Recognizing Entailment between law articles and queries. Input is a bar exam 'Yes/No' question. After retrieving relevant articles using your method, you have to determine 'Yes' or 'No' as the output. (Phase Two)

Measuring the Competition Results

The measures for ranking competition participants are intended only to calibrate the set of competition submissions, rather than to provide any deep performance measure. The data sets for Phases One and Two are annotated, so simple information retrieval measures (precision, recall, F-measure, accuracy), described in detail below, can be used to rank each submission.

As noted above, the intention is to build a community of practice regarding legal textual entailment, so that the adoption and adaptation of general methods from a variety of fields is considered, and that participants share their approaches, problems, and results.

Submission details

Participants are required to submit a paper on their method and experimental results. We plan to publish a selection of the papers in a volume of the CEUR-WS proceeding series. At least one of the authors of an accepted paper has to present the paper at the special COLIEE session of ICAIL 2017. The paper(s) by the winner(s) of the competition will be included in the main ICAIL 2017 proceedings, and a cash prize will be awarded to the winner(s).

Papers should not exceed 10 pages in the approved style. Please use the paper2 style (two-column style without page numbers) from the CEUR-WS style templates (http://ceur-ws.org/Vol-XXX/samplestyles/). While papers can be prepared using LaTeX, all papers should be converted to PDF prior to submission. You can submit your paper via the COLIEE 2017 EasyChair submission webpage (https://easychair.org/conferences/?conf=coliee2017).

Live competition

In addition to the publication of winning papers, the intention is to have a live demonstration competition during the ICAIL 2017 Conference. The details of the live demonstration competition are being developed, and will be made available well in advance of the conference. Participants should declare their intention to participate and prepare their software accordingly. Another cash prize will be awarded to the top live demonstration competitor(s).

Our intention for the demonstration competition is to showcase the methods and results obtained by COLIEE participants, in order to help attract a broad community of support. We welcome suggestions for how COLIEE competitors would like to deliver their results in a live demonstration (e.g., will you have the ability to display a live running of your system on a laptop, what are your data requirements to enable real time output, etc). To exchange details on possible requirements, please contact Yoshinobu Kano (kanoyoshinobu(at)gmail.com) .

Schedules


Dec. 3, 2016 Dry run data release (rescheduled from Nov. 30, 2016).
Dec. 15, 2016 Task Registration Due.
Feb. 15, 2017 Formal run data release.
Feb. 28, 2017 Formal run submission due.
Mar. 7, 2017 Notification of Evaluation results and Release of gold standard answers
Mar. 26, 2017 Paper submission due.
(Please submit your paper to the COLIEE 2017 EasyChair submission webpage)
Apr. 23, 2017 Notification of acceptance.
Apr. 30, 2017 Camera ready papers due.
June 12, 2017 COLIEE-2017 Workshop in the ICAIL 2017 Conference.
June 13, 2017 COLIEE-2017 Live Competition in the ICAIL 2017 Conference.

Phase One Details

Our goal is to explore and evaluate legal document retrieval technologies that are both effective and reliable.

The task investigates the performance of systems that search a static set of civil law articles using previously unseen queries. The goal of the task is to return the articles in the collection that are relevant to a query. An article is "Relevant" to a query iff the query sentence can be answered Yes/No by entailment from the meaning of the article. If combining the meanings of more than one article (e.g., "A", "B", and "C") can answer a query sentence, then all those articles ("A", "B", and "C") are considered "Relevant". If a query can be answered by an article "D", and it can also be answered by another article "E" independently, then both articles "D" and "E" are considered "Relevant". This task requires the retrieval of all the articles that are relevant to answering a query.

Japanese civil law articles (English translations alongside the Japanese) will be provided, and the training data consists of pairs of a query and its relevant articles. The process of executing the queries over the articles and generating the experimental runs should be entirely automatic. Test data will include only queries, with no relevant articles.

There should be no human intervention at any stage, including modifications to your retrieval system motivated by an inspection of the queries. You should not materially modify your retrieval system between the time you downloaded the queries and the time you submit your runs.

At most three runs from each group will be assessed. The submission format and evaluation methods are described below.

Phase Two Details

Our goal is to construct Yes/No question answering systems for legal queries, by entailment from the relevant articles.

Given a 'Yes/No' legal bar exam question, your legal information retrieval system retrieves relevant Civil Law articles. The task then investigates the performance of systems that answer 'Yes' or 'No' to previously unseen queries by comparing the meanings of the queries and the retrieved Civil Law articles. Training data consists of triples of a query, its relevant article(s), and a correct answer ("Y" or "N"). Test data will include only queries, with no 'Y/N' label and no relevant articles.

There should be no human intervention at any stage, including modifications to your retrieval system motivated by an inspection of the queries. You should not materially modify your retrieval system between the time you downloaded the queries and the time you submit your runs.

At most three runs from each group will be assessed. The submission format and evaluation methods are described below.

Corpus Structure

The structure of the test corpora is derived from a general XML representation developed for use in RITEVAL, one of the tasks of the NII Testbeds and Community for Information access Research (NTCIR) project, as described at the following URL:

http://sites.google.com/site/ntcir11riteval/

The RITEVAL format was developed for the general sharing of information retrieval data across a variety of domains.

The format of the COLIEE competition corpora is derived from an NTCIR representation of confirmed relationships between questions and the articles and cases relevant to answering them, as in the following example:

<pair label="Y" id="H18-1-2">
<t1>
(Seller's Warranty in cases of Superficies or Other Rights)Article 566 (1)In cases where the subject matter of the sale is encumbered with for the purpose of a superficies, an emphyteusis, an easement, a right of retention or a pledge, if the buyer does not know the same and cannot achieve the purpose of the contract on account thereof, the buyer may cancel the contract. In such cases, if the contract cannot be cancelled, the buyer may only demand compensation for damages. (2)The provisions of the preceding paragraph shall apply mutatis mutandis in cases where an easement that was referred to as being in existence for the benefit of immovable property that is the subject matter of a sale, does not exist, and in cases where a leasehold is registered with respect to the immovable property.(3)In the cases set forth in the preceding two paragraphs, the cancellation of the contract or claim for damages must be made within one year from the time when the buyer comes to know the facts.
(Seller's Warranty in cases of Mortgage or Other Rights)Article 567(1)If the buyer loses his/her ownership of immovable property that is the object of a sale because of the exercise of an existing statutory lien or mortgage, the buyer may cancel the contract.(2)If the buyer preserves his/her ownership by incurring expenditure for costs, he/she may claim reimbursement of those costs from the seller.(3)In the cases set forth in the preceding two paragraphs, the buyer may claim compensation if he/she suffered loss.
</t1>
<t2>
There is a limitation period on pursuance of warranty if there is restriction due to superficies on the subject matter, but there is no restriction on pursuance of warranty if the seller's rights were revoked due to execution of the mortgage.
</t2>
</pair>

The above is an example where query id "H18-1-2" is confirmed to be answerable from article numbers 566 and 567 (relevant to Phase One). The pair label "Y" in this example means the answer to the query is "Yes", which is entailed by the relevant articles (relevant to Phase Two).

For Phases One and Two, the training data will be the same. Groups who participate in only Phase One can disregard the pair label.

For Phases One and Two, the test corpora will include only the query field, but no articles and no pair label.
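As a concrete illustration, the pair format above can be read with Python's standard library. This is a minimal sketch, not official tooling: the root element name ("dataset") and the inline sample text are assumptions, so adjust them to match the released corpus files.

```python
# Sketch of reading COLIEE <pair> elements. The "dataset" root element
# and the shortened sample below are assumptions for illustration.
import xml.etree.ElementTree as ET

def read_pairs(xml_text):
    """Parse <pair> elements into (id, label, article_text, query_text)."""
    root = ET.fromstring(xml_text)
    return [(p.get("id"), p.get("label"),
             p.findtext("t1", "").strip(), p.findtext("t2", "").strip())
            for p in root.iter("pair")]

sample = """<dataset>
<pair label="Y" id="H18-1-2">
<t1>Article 566 (1) In cases where the subject matter of the sale ...</t1>
<t2>There is a limitation period on pursuance of warranty ...</t2>
</pair>
</dataset>"""

pairs = read_pairs(sample)
# pairs[0][0] is the query id, pairs[0][1] the "Y"/"N" label (absent in
# test data), pairs[0][2] the article text, pairs[0][3] the question.
```

In test corpora, where the label attribute and the t1 field are absent, `p.get("label")` simply returns None and `findtext` returns an empty string, so the same reader works for both training and test files.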

Competition Results Submission Format

For Phase One, a submission should consist of a single ASCII text file with three columns per line, separated by single spaces, as follows:

H18-1-2 566 univABC
H18-1-2 567 univABC
H18-5-A 322 univABC
H19-19-I 433 univABC
H21-5-3 110 univABC
.
.
.
where:

1. The first column is the query id.
2. The second column is the official article number of the retrieved article.
3. The third column is called the "run tag" and should be a unique identifier for the submitting group, i.e., each run should have a different tag that identifies the group. Please restrict run tags to 12 or fewer letters and numbers, with no punctuation.
In this example of a submission, you can see that H18-1-2 has multiple relevant articles (566 and 567).

For Phase Two, a submission should again consist of a single ASCII text file with three columns per line, separated by single spaces, as follows:

H18-1-2 Y univABC
H18-5-A N univABC
H19-19-I Y univABC
H21-5-3 N univABC
.
.
.
where:

1. Columns 1 and 3 are as for Phase One.
2. The second column is "Y" or "N", indicating whether the Y/N question was confirmed to be true ("Y") or false ("N") by the relevant articles.
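A small line checker for the two submission formats above could look like the following. This is an illustrative sketch rather than an official validation tool; the run-tag rule (at most 12 letters and digits, no punctuation) is taken from the description in this call.

```python
# Sketch of a per-line checker for COLIEE submission files.
import re

RUN_TAG = re.compile(r"^[A-Za-z0-9]{1,12}$")  # 12 or fewer letters/digits

def check_line(line, phase):
    """Return True if one submission line matches the three-column format.

    Phase 1: "<query_id> <article_number> <run_tag>"
    Phase 2: "<query_id> <Y|N> <run_tag>"
    """
    cols = line.rstrip("\n").split(" ")
    if len(cols) != 3:
        return False
    query_id, answer, run_tag = cols
    if not RUN_TAG.match(run_tag):
        return False
    if phase == 2 and answer not in ("Y", "N"):
        return False
    return bool(query_id)

ok1 = check_line("H18-1-2 566 univABC", phase=1)    # True
ok2 = check_line("H18-1-2 Y univABC", phase=2)      # True
bad = check_line("H18-1-2 maybe univABC", phase=2)  # False
```

Running such a check before submitting catches formatting slips (extra spaces, punctuation in the run tag, stray labels) that would otherwise cost a run during assessment.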

Competition Evaluation measures

For Phase One, the evaluation measures will be precision, recall, and F-measure:

Precision = (the number of correctly retrieved articles for all queries) / (the number of retrieved articles for all queries)

Recall = (the number of correctly retrieved articles for all queries) / (the number of relevant articles for all queries)

F-measure = (2 x Precision x Recall) / (Precision + Recall)

For Phase Two, the evaluation measure will be accuracy, with respect to whether the yes/no question was correctly confirmed:
Accuracy = (the number of queries correctly confirmed as true or false) / (the number of all queries)
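These measures reduce to a few lines of Python. The sketch below assumes Phase One results are kept as sets of (query_id, article_number) pairs and Phase Two results as dicts mapping query ids to "Y"/"N" labels; those representations are assumptions for illustration, not part of the official evaluation scripts.

```python
# Sketch of the COLIEE evaluation measures, micro-averaged over all queries.

def precision_recall_f1(retrieved, relevant):
    """Precision, recall, F-measure over (query_id, article) pairs."""
    correct = len(retrieved & relevant)
    precision = correct / len(retrieved) if retrieved else 0.0
    recall = correct / len(relevant) if relevant else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

def accuracy(predicted, gold):
    """Fraction of queries whose Y/N answer matches the gold label."""
    return sum(predicted[q] == gold[q] for q in gold) / len(gold)

# Hypothetical example results: 2 of 3 retrieved articles are correct.
retrieved = {("H18-1-2", "566"), ("H18-1-2", "567"), ("H18-5-A", "300")}
relevant  = {("H18-1-2", "566"), ("H18-1-2", "567"), ("H18-5-A", "322")}
p, r, f = precision_recall_f1(retrieved, relevant)   # p = r = f = 2/3

acc = accuracy({"H18-1-2": "Y", "H18-5-A": "N"},
               {"H18-1-2": "Y", "H18-5-A": "Y"})     # 0.5
```

Note that the F-measure rewards retrieving all relevant articles per query without over-retrieving, which matters for queries like H18-1-2 that have multiple relevant articles.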

COLIEE Program in ICAIL 2017 (June 12, 2017)

13:30-14:00 Overview of COLIEE 2017
Yoshinobu Kano, Mi-Young Kim, Randy Goebel and Ken Satoh

14:00-14:30 A Civil Code Article Information Retrieval System based on Phrase Alignment with Article Structure Analysis and Ensemble Approach
Masaharu Yoshioka and Daiki Onodera

Multiple Agent Based Entailment System (MABES) for RTE
Byungtaek Jung, Chiseung Soh, Kihyun Hong, Seungtak Lim and Young-Yik Rhim

15:00-15:30 Recognizing entailments in legal texts using sentence encoding-based and decomposable attention models
Truong-Son Nguyen, Viet-Anh Phan and Le-Minh Nguyen

15:30-16:00 Coffee Break

16:00-16:30 Improving Legal Information Retrieval by Distributional Composition with Term Order Probabilities
Danilo S. Carvalho, Vu Tran, Khanh Van Tran and Le Minh Nguyen

16:30-17:00 Analyzable Legal Yes/No Question Answering System using Linguistic Structures
Yoshinobu Kano, Reina Hoshino and Ryosuke Taniguchi

17:00-17:30 Legal Information Retrieval Using Topic Clustering and Neural Networks
Rohan Nanda, Adebayo Kolawole John, Luigi Di Caro, Guido Boella and Livio Robaldo

17:30-18:00 Legal Question Answering System using Neural Attention
Ayaka Morimoto, Daiki Kubo, Motoki Sato, Hiroyuki Shindo and Yuji Matsumoto

Live competition will be in the afternoon on June 13. Please see the ICAIL 2017 Conference program for details.

Live Competition Results

The live competition ended successfully. You can check the live competition results here.

Program Chairs

Ken Satoh, ksatoh(at)nii.ac.jp, National Institute of Informatics and Sokendai, Japan
Randy Goebel, rgoebel(at)ualberta.ca, University of Alberta, Canada
Mi-Young Kim, miyoung2(at)ualberta.ca, Department of Computing Science, University of Alberta, Canada
Yoshinobu Kano, kanoyoshinobu(at)gmail.com, Shizuoka University, Japan

Program Committee Members

Yusuke Miyao, yusuke(at)nii.ac.jp, National Institute of Informatics, Japan
Bernardo Magnini, magnini(at)fbk.eu, Fondazione Bruno Kessler, Italy
Nguyen Le Minh, nguyenml(at)jaist.ac.jp, JAIST, Japan
Satoshi Tojo, tojo(at)jaist.ac.jp, JAIST, Japan
Livio Robaldo, livio.robaldo(at)uni.lu, University of Luxembourg, Luxembourg
Adam Wyner, azwyner(at)abdn.ac.uk, University of Aberdeen, UK
Douglas Oard, oard(at)umd.edu, University of Maryland, USA
Akira Shimazu, shimazu(at)jaist.ac.jp, JAIST, Japan
Kentaro Inui, inui(at)ecei.tohoku.ac.jp, Tohoku University, Japan
Katsumasa Yoshikawa, KATSUY(at)jp.ibm.com, IBM Research, Tokyo, Japan
Marie-Francine Moens, sien.moens(at)cs.kuleuven.be, KU Leuven, Belgium
Jack G. Conrad, Jack.G.Conrad(at)ThomsonReuters.com, Thomson Reuters

Questions and further information

miyoung2(at)ualberta.ca

Application Details

Potential participants in COLIEE-2017 should respond to this call for participation by submitting an application. To apply, submit the application form and the memorandums at the following URLs to miyoung2(at)ualberta.ca:

Application:
http://webdocs.cs.ualberta.ca/~miyoung2/COLIEE2017/application.pdf
Memorandum for Japanese Data
http://webdocs.cs.ualberta.ca/~miyoung2/COLIEE2017/JA_memorandum_2017.pdf
Memorandum for English Data
http://webdocs.cs.ualberta.ca/~miyoung2/COLIEE2017/EN_memorandum_2017.pdf

We will send an acknowledgement to the email address supplied in the form once we have processed the form.

Any questions about conference participation should be sent to the general ICAIL 2017 email address, guido.governatori(at)data61.csiro.au


Last updated: Nov., 2016