Cukurova NLP (CuNLP) Research Group
Cukurova NLP (CuNLP) is a Natural Language Processing research group operating within the Department of Computer Engineering at Cukurova University.
Our group focuses primarily on Turkish NLP, developing innovative solutions at the intersection of artificial intelligence and linguistics.
Current Focus
We are currently working on Large Language Models (LLMs), particularly on hallucination and reasoning problems.
We welcome hardworking and motivated students who are passionate about LLMs and eager to explore these topics.
Undergraduate and graduate students interested in joining our research are encouraged to apply.
Prior experience or coursework in Machine Learning, Natural Language Processing, or related areas will be considered an advantage.
Collaborations with researchers and groups sharing similar interests are also highly welcome.
Our Team
Current Members
Dr. Umut Orhan – Group Lead
Research Interests: Natural Language Processing, Large Language Models, and Semantic Representation
Dr. B. Tahir Tahiroğlu – Turkish Linguist
Focus: Turkish syntax, morphology, and linguistic resources for NLP
Ferhat Albayrak (Ph.D. Student) – Evaluating reasoning capabilities of Large Language Models (LLMs)
Melisa Biçer (Ph.D. Student) – Enhancing reasoning in LLMs through graph-based systems
Eren Demir (M.Eng. Student) – Measuring semantic similarity in LLMs
Arda Mülayim (M.Eng. Student) – Detecting hallucinations in LLMs
Çağrı Sayallar (B.Sc. Student) – BERT-based Turkish lemmatization
Former Members
Çağatay N. Tülü (Ph.D.) – Developed the SemSpace semantic space model using WordNet relations
Enis Arslan (Ph.D.) – Focused on Turkish morphology and morphological parsing
Erhan Turan (Ph.D.) – Designed a machine-readable dictionary for Turkish
Elif Gülfidan Dayıoğlu (Ph.D.) – Worked on open-ended exam evaluation using SemSpace vectors and deep learning
Resources and Tools
CU-CE – An LLM-based chatbot built for students of the Çukurova University Computer Engineering Department.
🔗 https://t.me/CU_CengBOT
Turkish Corpus for Morphological Disambiguation
If you use this corpus in your publication, please cite:
U. Orhan, E. Arslan. “Learning Word-Vector Quantization: A Case Study in Morphological Disambiguation,” Transactions on Asian and Low-Resource Language Information Processing, 19(5), 72, 2020.
🔗 Download Dataset
Learning Word-Vector Quantization (in MATLAB)
If you use this code or dataset in your publication, please cite the same paper as above.
🔗 Download Code
CU-NLP Dataset for Automatic Short Answer Grading
If you use this dataset in your publication, please cite:
C.N. Tulu, O. Ozkaya, U. Orhan. “Short Answer Grading with SemSpace Sense Vectors and MaLSTM,” IEEE Access, 9, 19270–19280, 2021.
🔗 Download Dataset
Synset Vectors Computed by Generalized SemSpace
If you use this dataset in your publication, please cite:
U. Orhan, E.G. Tosun, O. Ozkaya. “Intent Detection Using Contextualized Deep SemSpace,” Arabian Journal for Science and Engineering, 48, 2009–2020, 2023.
🔗 Download Dataset
Join Us
We are looking for motivated master’s and Ph.D. students who are passionate about Artificial Intelligence and Natural Language Processing.
If you are interested in joining our research group or collaborating with us, please contact Dr. Umut Orhan directly via email at uorhan@cu.edu.tr.
Undergraduate students with a strong interest in Large Language Models (LLMs), reasoning, or hallucination detection are also welcome to get involved in our ongoing projects.
