Evaluation of Effectiveness of Large Language Models in Ontology and Knowledge Graph Creation

Authors

  • B. Dutta, DRTC, Indian Statistical Institute, Bangalore
  • D. Bain, DRTC, Indian Statistical Institute, Bangalore
  • N. Khamaru, DRTC, Indian Statistical Institute, Bangalore

DOI:

https://doi.org/10.17821/srels/2025/v62i2/171792

Keywords:

ChatGPT, Evaluation, Knowledge Graph, Large Language Models, Ontology Development, Perplexity AI, Prompt Engineering, Semantic Web Senior

Abstract

This study investigates how effectively selected Large Language Models (LLMs), namely ChatGPT 3.5, Semantic Web Senior (GPT-4), and Perplexity AI, automate the creation of ontologies and knowledge graphs from unstructured, paragraph-style text using prompt engineering techniques. The research follows a five-phase methodology that includes identifying suitable LLMs, developing tailored prompts for interaction, and assessing model performance against specific parameters. The findings reveal that Perplexity AI outperforms the other LLMs in comprehensiveness, query handling, and the definition of data properties. While the ChatGPT models can generate ontologies, they show limitations in defining subclass relationships and in specifying the domain and range of properties. The study emphasises the critical role of prompt engineering in optimising the capabilities of LLMs for ontology and knowledge graph creation, and it offers a structured evaluation methodology that sheds light on the strengths and weaknesses of each LLM. The research thus illustrates how systematically chosen LLMs, guided by prompt engineering, can facilitate the development of ontologies and knowledge graphs from unstructured text, and it identifies the most effective models through detailed performance analysis.
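
To make the limitations named in the abstract concrete (subclass relationships, and the domain and range of properties), the following minimal Python sketch audits a small set of ontology triples. It is not the study's evaluation procedure: the `ex:` terms, the `SAMPLE_TRIPLES` data, and the `audit` function are all hypothetical, hand-written stand-ins for what an LLM might return, and the checks are only one plausible way to inspect such output.

```python
# Illustrative only: hand-written triples standing in for LLM output.
# Nothing here is data or code from the study.
OWL_PROPERTY_TYPES = {"owl:ObjectProperty", "owl:DatatypeProperty"}

SAMPLE_TRIPLES = [
    ("ex:Person", "rdf:type", "owl:Class"),
    ("ex:Author", "rdf:type", "owl:Class"),
    ("ex:Book", "rdf:type", "owl:Class"),
    ("ex:Author", "rdfs:subClassOf", "ex:Person"),
    ("ex:writes", "rdf:type", "owl:ObjectProperty"),
    ("ex:writes", "rdfs:domain", "ex:Author"),
    ("ex:writes", "rdfs:range", "ex:Book"),
    ("ex:hasTitle", "rdf:type", "owl:DatatypeProperty"),
    ("ex:hasTitle", "rdfs:domain", "ex:Book"),
    # The range of ex:hasTitle is deliberately left out.
]


def audit(triples):
    """Count subclass links and list declared properties lacking a domain or range."""
    properties = {
        s for s, p, o in triples
        if p == "rdf:type" and o in OWL_PROPERTY_TYPES
    }
    with_domain = {s for s, p, _ in triples if p == "rdfs:domain"}
    with_range = {s for s, p, _ in triples if p == "rdfs:range"}
    subclass_links = [(s, o) for s, p, o in triples if p == "rdfs:subClassOf"]
    return {
        "subclass_links": len(subclass_links),
        "missing_domain": sorted(properties - with_domain),
        "missing_range": sorted(properties - with_range),
    }


if __name__ == "__main__":
    print(audit(SAMPLE_TRIPLES))
```

In this sample, the audit finds one subclass link and flags `ex:hasTitle` as a data property without a range, which is the kind of gap the abstract attributes to some of the evaluated models.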

Published

2025-04-29

How to Cite

Dutta, B., Bain, D., & Khamaru, N. (2025). Evaluation of Effectiveness of Large Language Models in Ontology and Knowledge Graph Creation. Journal of Information and Knowledge, 62(2), 103–118. https://doi.org/10.17821/srels/2025/v62i2/171792

Section

Articles