International Journal of Advanced and Applied Sciences
Int. j. adv. appl. sci.
EISSN: 2313-3724
Print ISSN: 2313-626X
Volume 3, Issue 8 (August 2016), Pages: 78-84
Title: Data fusion in data federation using modified discriminative Markov logic networks
Authors: M. S. Hema 1, *, M. Nageswara Guptha 2
Affiliation(s):
1 Department of CSE, Sri Venkateshwara College of Engineering, Bengaluru, India
2 Department of ISE, Sri Venkateshwara College of Engineering, Bengaluru, India
Abstract:
High-quality integrated data is crucial for the data mining process. Existing approaches resolve data conflicts using the "trust your friends" and "cry with the wolves" principles, which take the value of a preferred source and the most frequent value, respectively. However, choosing the most trustworthy data source is a challenge for data integration, and trusting only certain sources is arbitrary. To mitigate these issues, the Data Fusion in Data Federation using Modified Discriminative Markov Logic Networks (DF-MDMLN) approach is proposed. Data fusion resolves conflicts among data from different heterogeneous databases by utilizing multi-angle features and the knowledge of a discriminative Markov Logic Network (MLN). Data fusion is used to improve the precision and recall of the end users’ data set. An e-shopping application for computer peripherals is used for experimentation to analyze the performance of the DF-MDMLN approach. Experiments on the e-shopping data sets show the effectiveness of the DF-MDMLN approach: the precision and recall of data fusion improved by 40% and 27%, respectively.
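The two baseline conflict-resolution principles named in the abstract can be illustrated with a minimal Python sketch. It is not the DF-MDMLN method. The source names, the prices, and the function names `trust_your_friends` and `cry_with_the_wolves` are assumptions made up for illustration, and ties in the frequency count are broken by first occurrence.

```python
from collections import Counter

# Hypothetical conflicting values for one attribute (the price of one product)
# reported by three illustrative sources. Names and numbers are assumptions.
claims = {"shop_a": 49.99, "shop_b": 45.00, "shop_c": 45.00}


def trust_your_friends(claims, preferred_source):
    """Resolve the conflict by taking the value of a preferred source."""
    return claims[preferred_source]


def cry_with_the_wolves(claims):
    """Resolve the conflict by taking the most frequent value."""
    # most_common(1) returns a list holding one (value, count) pair.
    return Counter(claims.values()).most_common(1)[0][0]


print(trust_your_friends(claims, "shop_a"))
print(cry_with_the_wolves(claims))
```

With these assumed inputs the two principles disagree, which shows why picking a single preferred source can be an arbitrary choice.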
© 2016 The Authors. Published by IASE.
This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
Keywords: Data federation, Data conflicts, Data fusion, Markov logic networks, Weight learning
Article History: Received 2 July 2016, Received in revised form 10 September 2016, Accepted 12 September 2016
Digital Object Identifier: http://dx.doi.org/10.21833/ijaas.2016.08.013
Citation:
Hema MS and Guptha MN (2016). Data fusion in data federation using modified discriminative Markov logic networks. International Journal of Advanced and Applied Sciences, 3(8): 78-84
http://www.science-gate.com/IJAAS/V3I8/Hema.html