Abstract
In the context of knowledge management, ontology construction can be considered part of capturing the body of knowledge of a particular problem domain. Traditionally, ontology construction assumes a tedious codification of the domain expert's knowledge. In this paper, we describe a new approach to ontology engineering that has the potential to bridge the dichotomy between codification and collaboration by turning to Web 2.0 technology. We propose to shift the primary source of ontology knowledge from the expert to socially emergent bodies of knowledge such as Wikipedia. Using Wikipedia as an example, we demonstrate how core terms and relationships of a domain ontology can be distilled from this socially constructed source. As an illustration, we describe how our approach achieved over 90% conceptual coverage compared with gold-standard hand-crafted ontologies, such as Cyc. What emerges is not a folksonomy, but rather a formal ontology that has nonetheless found its roots in social knowledge.
Cite this article
Guo, T., Schwartz, D., Burstein, F. et al. Codifying collaborative knowledge: using Wikipedia as a basis for automated ontology learning. Knowl Manage Res Pract 7, 206–217 (2009). https://doi.org/10.1057/kmrp.2009.14