The objective of this research project is to develop efficient top-k query evaluation algorithms for big data with uncertainty. (By big data with uncertainty we mean a collection of uncertain data sets so large and complex that it becomes difficult to process using traditional query evaluation algorithms for uncertain databases.) According to IBM research manager Gabi Zodik, speaking at an IBM Innovate 2012 conference, roughly 80 percent of the data sitting in enterprise databases is of uncertain value. Consequently, data veracity, that is, the handling of uncertain or imprecise data, represents a critical challenge to the big data revolution. We have developed an efficient pruning algorithm for top-k ranking (Pruning for Top-K Ranking in Uncertain Databases, VLDB 2011). This research aims to extend or modify the existing top-k query algorithms for uncertain databases so that they are suitable for big data with uncertainty.
C. Wang, L.Y. Yuan, J.H. You, O.R. Zaiane and J. Pein, Pruning for Top-K Ranking in Uncertain Databases, Proceedings of the VLDB Endowment 4 (July 2011), 598-609.
C. Wang, L.Y. Yuan and J.H. You, Top-k ranking for uncertain data, Proc. of the Seventh International Conference on Fuzzy Systems and Knowledge Discovery (FSKD), 363-368, Yantai, China, Aug. 10, 2010.
C. Wang, L.Y. Yuan and J. You, A ranking theory for uncertain data with constraints, Proc. of the 2nd IEEE International Conference on Computer Science and Information Technology, 104-108, Beijing, China, Aug. 8, 2009.