Call to JURISIN 2015
COLIEE-2015 CALL FOR TASK PARTICIPATION
Competition on Legal Information Extraction/Entailment (COLIEE)
Run in association with the Workshop on Juris-informatics (JURISIN) 2015
November 16-18, 2015
The Juris-informatics (JURISIN) workshop series was created to discuss both fundamental and practical issues among people from a variety of backgrounds, such as law, social science, information and intelligent technology, logic, and philosophy, including the conventional "AI and law" area. Information extraction and reasoning from legal data is one of the important targets of JURISIN, including legal information representation, relation extraction, textual entailment, summarization, and their applications. Participants in the JURISIN workshops have examined a wide variety of information extraction techniques and environments with a variety of purposes, including retrieval of relevant articles, entity/relation extraction from legal cases, reference extraction from legal cases, finding relevant precedents, summarization of legal cases, and question answering. In 2015, JURISIN invites participation in a competition on legal information extraction/entailment. We held the first competition on legal information extraction/entailment (COLIEE-2014) in 2014 on a legal data collection, and it helped establish a major experimental effort in the legal information extraction/retrieval field. We will hold the second competition (COLIEE-2015) in 2015; the motivation for the competition is to help create a research community of practice for the capture and use of legal information.

The Legal Question Answering Task

The competition focuses on two aspects of legal information processing related to answering yes/no questions from Japanese legal bar exams (the relevant data sets have been translated from Japanese to English).

1) Phase One of the legal question answering task involves reading a legal bar exam question Q and extracting a subset of Japanese Civil Code articles S1, S2, ..., Sn from the entire Civil Code that is appropriate for answering the question, such that Entails(S1, S2, ..., Sn, Q) or Entails(S1, S2, ..., Sn, not Q).
Given a question Q and the entire set of Civil Code articles, we have to retrieve the set S1, S2, ..., Sn as the answer for this track.

2) Phase Two of the legal question answering task involves the identification of an entailment relationship such that Entails(S1, S2, ..., Sn, Q) or Entails(S1, S2, ..., Sn, not Q). Given a question Q and relevant articles S1, S2, ..., Sn, we have to determine whether the relevant articles entail "Q" or "not Q". The answer for this track is binary: "YES" ("Q") or "NO" ("not Q").

The Legal Question Answering Data Corpus

The corpus of legal questions is drawn from Japanese legal bar exams, and the relevant Japanese Civil Law articles are also provided (file format and access described below).

1) The Phase One problem is to use an identified set of legal yes/no questions to retrieve the relevant Civil Law articles. The correct answers have been determined by a collection of law students, and those answers are used to calibrate the performance of a program that solves Phase One.

2) The Phase Two task requires some method of information extraction from both the question and the relevant articles, and then confirmation of a simple entailment relationship as described above: the articles confirm either "yes" or "no" as an answer to the yes/no question.

3) The Phase Three task is a combination of Phase One and Phase Two. It requires both a legal information retrieval system and a textual entailment system. You are given a set of legal yes/no questions; your legal information retrieval system retrieves relevant Civil Law articles, and you then confirm a yes/no entailment relationship between the input question and your retrieved articles.

Participants can choose which of the three sub-tasks they will apply for:
1. Sub-task 1: Legal information retrieval. Input is a bar exam yes/no question, and output should be the relevant Civil Law articles. (Phase One)
2.
Sub-task 2: Recognizing entailment between law articles and queries. Input is a pair of a question and its relevant article(s), and output should be "Yes" or "No". (Phase Two)
3. Sub-task 3: A combination of Sub-task 1 and Sub-task 2. Input is a bar exam yes/no question, and output should be "Yes" or "No". (Phase Three)

Measuring the Competition Results

The measures for ranking competition participants are intended only to calibrate the set of competition submissions, rather than to provide any deep performance measure. The data sets for Phases One and Two are annotated, so simple information retrieval measures (precision, recall, F-measure, accuracy) can be used to rank each submission; these are described in detail below. Wider dissemination of the JURISIN challenge results is welcome, but the conditions of participation specifically preclude any advertising claims based on JURISIN competition rankings. As noted above, the intention is to start to build a community of practice regarding legal textual entailment, so that the adoption and adaptation of general methods from a variety of fields is considered, and so that participants share their approaches, problems, and results. We expect that all competition results submitted to JURISIN will be published in the proceedings and archived on the JURISIN web site.

Schedules

Submit your application to participate in COLIEE-2015 as described below. Submitting an application will add you to the active participants' mailing list.
Phase One Details

Our goal is to explore and evaluate legal document retrieval technologies that are both effective and reliable. The task investigates the performance of systems that search a static set of Civil Law articles using previously unseen queries. The goal of the task is to return the articles in the collection that are relevant to a query. An article is "relevant" to a query iff the query sentence can be answered yes/no by entailment from the meaning of the article. If combining the meanings of more than one article (e.g., "A", "B", and "C") can answer a query sentence, then all of the articles ("A", "B", and "C") are considered relevant. If a query can be answered by an article "D", and it can also be answered by another article "E" independently, then both articles "D" and "E" are considered relevant. The task requires the retrieval of all the articles that are relevant to answering a query. Japanese Civil Law articles (an English translation alongside the Japanese) will be provided, and the training data consists of pairs of a query and its relevant articles. The process of executing the queries over the articles and generating the experimental runs should be entirely automatic. Test data will include only queries, with no relevant articles. There should be no human intervention at any stage, including modifications to your retrieval system motivated by an inspection of the queries. You should not materially modify your retrieval system between the time you download the queries and the time you submit your runs. One run from each group will be assessed. The submission format and evaluation methods are described below.

Phase Two Details

Our goal is to construct yes/no question answering systems for legal queries, by entailment from the relevant articles. The task investigates the performance of systems that answer "Y" or "N" to previously unseen queries by comparing the meanings of queries and relevant articles.
Training data consists of triples of a query, its relevant articles, and a correct answer ("Y" or "N"). The process of executing the queries over the relevant articles and generating the experimental runs should be entirely automatic. Test data will include only queries and relevant articles, with no "Y/N" label. There should be no human intervention at any stage, including modifications to your system motivated by an inspection of the queries. You should not materially modify your system between the time you download the queries and the time you submit your runs.

Phase Three Details

Our goal is to construct both a Phase One and a Phase Two system. Given a yes/no legal bar exam question, your legal information retrieval system retrieves relevant Civil Law articles. The task then investigates the performance of systems that answer "Y" or "N" to previously unseen queries by comparing the meanings of the queries and your retrieved Civil Law articles. Training data consists of triples of a query, its relevant article(s), and a correct answer ("Y" or "N"). Test data will include only queries, with no "Y/N" label and no relevant articles. One run from each group will be assessed. The submission format and evaluation methods are described below.

Corpus Structure

The structure of the test corpora is derived from a general XML representation developed for use in RITEVAL, one of the tasks of the NII Testbeds and Community for Information access Research (NTCIR) project, as described at the following URL: http://sites.google.com/site/ntcir11riteval/ The RITEVAL format was developed for the general sharing of information retrieval data on a variety of domains.
The format of the JURISIN competition corpora is derived from an NTCIR representation of confirmed relationships between questions and the articles and cases relevant to answering the questions, as in the following example:

<pair label="Y" id="H18-1-2">
<t1>
(Seller's Warranty in cases of Superficies or Other Rights)
Article 566
(1) In cases where the subject matter of the sale is encumbered with for the purpose of a superficies, an emphyteusis, an easement, a right of retention or a pledge, if the buyer does not know the same and cannot achieve the purpose of the contract on account thereof, the buyer may cancel the contract. In such cases, if the contract cannot be cancelled, the buyer may only demand compensation for damages.
(2) The provisions of the preceding paragraph shall apply mutatis mutandis in cases where an easement that was referred to as being in existence for the benefit of immovable property that is the subject matter of a sale does not exist, and in cases where a leasehold is registered with respect to the immovable property.
(3) In the cases set forth in the preceding two paragraphs, the cancellation of the contract or claim for damages must be made within one year from the time when the buyer comes to know the facts.
(Seller's Warranty in cases of Mortgage or Other Rights)
Article 567
(1) If the buyer loses his/her ownership of immovable property that is the object of a sale because of the exercise of an existing statutory lien or mortgage, the buyer may cancel the contract.
(2) If the buyer preserves his/her ownership by incurring expenditure for costs, he/she may claim reimbursement of those costs from the seller.
(3) In the cases set forth in the preceding two paragraphs, the buyer may claim compensation if he/she suffered loss.
</t1>
<t2>
There is a limitation period on pursuance of warranty if there is restriction due to superficies on the subject matter, but there is no restriction on pursuance of warranty if the seller's rights were revoked due to execution of the mortgage.
</t2>
</pair>

The above is an example where the query with id "H18-1-2" is confirmed to be answerable from article numbers 566 and 567 (relevant to Phase One). The pair label "Y" in this example means the answer to the query is "Yes", which is entailed from the relevant articles (relevant to Phase Two). For Phases One, Two, and Three, the training data will be the same. Groups who participate only in Phase One can disregard the pair label. For Phase One, the test corpora will include only the query field, with no articles and no pair label. For Phase Two, the test corpora will include both the query and the article fields, but no pair label. For Phase Three, the format of the test corpora will be the same as that of Phase One.

Competition Results Submission Format

For Phase One, a submission should consist of a single ASCII text file. Use a single space to separate columns, with three columns per line as follows:

H18-1-2 566 univABC
H18-1-2 567 univABC

where:
1. The first column is the query id.
2. The second column is the official article number of the retrieved article.
3. The third column is called the "run tag" and should be a unique identifier for the submitting group, i.e., each run should have a different tag that identifies the group. Please restrict run tags to 12 or fewer letters and numbers, with no punctuation.

In this example of a submission, you can see that H18-1-2 has multiple relevant articles (566 and 567).

For Phase Two, again a submission should consist of a single ASCII text file. Use a single space to separate columns, with three columns per line as follows:

H18-1-2 Y univABC

where columns 1 and 3 are as for Phase One, and column 2 is:
"Y" or "N", indicating whether the yes/no question was confirmed to be true ("Y") by the relevant articles, or confirmed to be false ("N").

For Phase Three, the submission format will be the same as that of Phase Two.

Competition Evaluation Measures

For Phase One, the evaluation measures will be precision, recall, and F-measure:

Precision = (the number of correctly retrieved articles for all queries) / (the number of retrieved articles for all queries)
Recall = (the number of correctly retrieved articles for all queries) / (the number of relevant articles for all queries)
F-measure = (2 x Precision x Recall) / (Precision + Recall)

For Phase Two, the evaluation measure will be accuracy:

Accuracy = (the number of queries correctly confirmed as true or false) / (the number of all queries)

For Phase Three, the evaluation measure will be the same as that of Phase Two.

Task Coordinators

Mi-Young Kim, Randy Goebel, University of Alberta, Canada
Ken Satoh, National Institute of Informatics, Japan
Yoshinobu Kano, Shizuoka University, Japan

Workshop Format

The workshop itself will be used as a forum both for presentation of results (including failure analysis and system comparisons) and for more lengthy system presentations describing the legal information retrieval techniques used, experiments run using the data, and other issues of interest to researchers in legal information retrieval. All groups will be invited to present their results at the workshop. Papers of exceptional quality will be included in the LNAI post-proceedings after a second review process.

Application Details

Organizations wishing to participate in JURISIN 2015 should respond to this call for participation by submitting an application. To apply, submit the application and memorandums at the following URLs to miyoung2@ualberta.ca:

Application: http://webdocs.cs.ualberta.ca/~miyoung2/COLIEE2015/application.pdf
Memorandum for Japanese Data: http://webdocs.cs.ualberta.ca/~miyoung2/COLIEE2015/JA_memorandum_2015.pdf
Memorandum for English Data: http://webdocs.cs.ualberta.ca/~miyoung2/COLIEE2015/EN_memorandum_2015.pdf

We will send an acknowledgement to the email address supplied in the form once we have processed the form.
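As an informal illustration of the evaluation measures and the three-column submission format described above, the following sketch computes micro-averaged precision, recall, and F-measure for Phase One and accuracy for Phase Two. The gold-standard file format and all function names here are illustrative assumptions, not part of the official evaluation tools:

```python
def load_pairs(lines):
    """Parse 'query_id value run_tag' lines into (query_id, value) pairs."""
    pairs = set()
    for line in lines:
        parts = line.split()
        if len(parts) == 3:
            qid, value, _run_tag = parts
            pairs.add((qid, value))
    return pairs

def phase_one_scores(gold_lines, run_lines):
    """Precision/recall/F-measure over (query, article) pairs for all queries."""
    gold = load_pairs(gold_lines)          # relevant articles (assumed same format)
    retrieved = load_pairs(run_lines)      # submitted articles
    correct = len(gold & retrieved)        # correctly retrieved articles
    precision = correct / len(retrieved) if retrieved else 0.0
    recall = correct / len(gold) if gold else 0.0
    f = (2 * precision * recall / (precision + recall)) if precision + recall else 0.0
    return precision, recall, f

def phase_two_accuracy(gold_lines, run_lines):
    """Fraction of queries whose Y/N answer matches the gold label."""
    gold = dict(load_pairs(gold_lines))
    run = dict(load_pairs(run_lines))
    correct = sum(1 for qid, label in gold.items() if run.get(qid) == label)
    return correct / len(gold) if gold else 0.0

# Hypothetical example: one of two relevant articles retrieved correctly.
p, r, f = phase_one_scores(
    ["H18-1-2 566 gold", "H18-1-2 567 gold"],
    ["H18-1-2 566 univABC", "H18-1-2 570 univABC"])
print(p, r, f)  # each is 0.5 here
```

This treats evaluation as micro-averaged over all (query, article) pairs; per-query averaging would be an equally plausible reading of the formulas above.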
Any questions about conference participation should be sent to the general JURISIN 2015 email address, ksatoh(at)nii.ac.jp. |
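For participants new to the RITEVAL-style XML shown in the Corpus Structure section, training pairs can be read with a standard XML parser. This is a minimal sketch; the top-level <dataset> wrapper and the inline sample are assumptions for illustration (only the <pair>, <t1>, and <t2> elements come from the example above):

```python
import xml.etree.ElementTree as ET

# Abbreviated stand-in for a training file; adjust to the actual distribution.
sample = """<dataset>
<pair label="Y" id="H18-1-2">
<t1>Article 566 ... Article 567 ...</t1>
<t2>There is a limitation period on pursuance of warranty ...</t2>
</pair>
</dataset>"""

def read_pairs(xml_text):
    """Yield (id, label, articles, query) tuples from RITEVAL-style XML."""
    root = ET.fromstring(xml_text)
    for pair in root.iter("pair"):
        yield (pair.get("id"), pair.get("label"),
               pair.findtext("t1").strip(), pair.findtext("t2").strip())

for pid, label, articles, query in read_pairs(sample):
    print(pid, label)  # the pair id and its Y/N gold label
```

Note that for Phase One test data the label attribute and <t1> field will be absent, so `pair.get("label")` and `findtext` should be guarded accordingly in real use.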