Cukurova NLP (CuNLP) Research Group

Cukurova NLP (CuNLP) is a Natural Language Processing research group operating within the Department of Computer Engineering at Cukurova University.

Our group focuses primarily on Turkish NLP, developing innovative solutions at the intersection of artificial intelligence and linguistics.

Current Focus

We are currently working on Large Language Models (LLMs), with a particular focus on hallucination and reasoning.
We welcome hardworking and motivated students who are passionate about LLMs and eager to explore these topics.

Undergraduate and graduate students interested in joining our research are encouraged to apply.
Prior experience or coursework in Machine Learning, Natural Language Processing, or related areas will be considered an advantage.

Collaborations with researchers and groups sharing similar interests are also highly welcome.

Our Team

Current Members

  • Dr. Umut Orhan – Group Lead
    Research Interests: Natural Language Processing, Large Language Models, and Semantic Representation

  • Dr. B. Tahir Tahiroğlu – Turkish Linguist
    Focus: Turkish syntax, morphology, and linguistic resources for NLP

  • Ferhat Albayrak (Ph.D. Student) – Evaluating reasoning capabilities of Large Language Models (LLMs)

  • Melisa Biçer (Ph.D. Student) – Enhancing reasoning in LLMs through graph-based systems

  • Eren Demir (M.Eng. Student) – Measuring semantic similarity in LLMs

  • Arda Mülayim (M.Eng. Student) – Detecting hallucinations in LLMs

  • Çağrı Sayallar (B.Sc. Student) – BERT-based Turkish lemmatization


Former Members

  • Çağatay N. Tülü (Ph.D.) – Developed the SemSpace semantic space model using WordNet relations

  • Enis Arslan (Ph.D.) – Focused on Turkish morphology and morphological parsing

  • Erhan Turan (Ph.D.) – Designed a machine-readable dictionary for Turkish

  • Elif Gülfidan Dayıoğlu (Ph.D.) – Worked on open-ended exam evaluation using SemSpace vectors and deep learning

Resources and Tools

  • CU-CE – A Large Language Model (LLM)-based chatbot built for students of the Çukurova University Computer Engineering Department.
    🔗 https://t.me/CU_CengBOT

  • Turkish Corpus for Morphological Disambiguation
    If you use this corpus in your publication, please cite:
    U. Orhan, E. Arslan. “Learning Word-Vector Quantization: A Case Study in Morphological Disambiguation,” Transactions on Asian and Low-Resource Language Information Processing, 19(5), 72, 2020.
    🔗 Download Dataset

  • Learning Word-Vector Quantization (in MATLAB)
    If you use this code or dataset in your publication, please cite the same paper as above.
    🔗 Download Code

  • CU-NLP Dataset for Automatic Short Answer Grading
    If you use this dataset in your publication, please cite:
    C.N. Tulu, O. Ozkaya, U. Orhan. “Short Answer Grading with SemSpace Sense Vectors and MaLSTM,” IEEE Access, 9, 19270–19280, 2021.
    🔗 Download Dataset

  • Synset Vectors Computed by Generalized SemSpace
    If you use this dataset in your publication, please cite:
    U. Orhan, E.G. Tosun, O. Ozkaya. “Intent Detection Using Contextualized Deep SemSpace,” Arabian Journal for Science and Engineering, 48, 2009–2020, 2023.
    🔗 Download Dataset

Join Us

We are looking for motivated master’s and Ph.D. students who are passionate about Artificial Intelligence and Natural Language Processing.
If you are interested in joining our research group or collaborating with us, please contact Dr. Umut Orhan directly via email at uorhan@cu.edu.tr.

Undergraduate students with a strong interest in Large Language Models (LLMs), reasoning, or hallucination detection are also welcome to get involved in our ongoing projects.