CV: NLP Focus

Professional Summary

Senior computer scientist and academic leader with over 20 years of experience in artificial intelligence. Currently serving as the Head of the Computer Engineering Department at Cukurova University. While my background includes extensive work in machine learning and pattern recognition, my recent research agenda focuses on Natural Language Processing (NLP), specifically Large Language Models (LLMs), hallucination mitigation, and knowledge graph-based reasoning. I am seeking to collaborate with the Insight Centre to investigate Cultural Analytics and Narrative Reasoning in Generative AI through a visiting scholar position funded by TUBITAK (The Scientific and Technological Research Council of Turkiye).

Research Interests

Natural Language Processing (NLP): Semantic Vector Spaces (SemSpace), Intent Detection, Semantic Similarity Measurement.
Large Language Models (LLMs): Hallucination Detection, Reasoning Capabilities, Creativity by Inference.
Graph Theory: Knowledge Graphs, Structured Knowledge Integration, Graph Representation Learning.

Current Position

Professor & Head of Department, Cukurova University, Turkiye (2022 – Present)

Leading the Computer Engineering Department and coordinating research strategies.
Key Courses Taught (PhD/MSc Level):
- Computational Linguistics (MSc)
- Advanced Machine Learning (MSc)
- Graph Theory (PhD)
- Text Vectorization (PhD)

Selected Project Experience: NLP Focus

Principal Investigator, Live Turkish Dictionary Network Design with Weighted Graphs (Funded by TUBITAK - Budget: ~€60K)
- Implemented morphological disambiguation algorithms to identify correct word stems, ensuring accurate semantic context determination for Turkish.
- Developed an automated framework to evaluate the machine readability of definition sentences and construct a weighted graph network representing semantic relationships.
Researcher, AI-Based Diagnostic Models for Liver Cancer using Cell-Free DNA Analysis (Funded by TUBITAK + Cukurova University Scientific Research Projects Fund - Budget: ~€50K)
- Investigating the detection of Hepatocellular Carcinoma by analyzing cell-free DNA (cfDNA) sequences obtained from blood samples (liquid biopsy).
- Developing diagnostic models that primarily employ BERT-based architectures to encode gene sequences as textual representations, with classical machine learning techniques serving as robust baselines.
Principal Investigator (Proposal Under Review), A Multi-Layered Framework for Hallucination Detection and Mitigation in Large Language Models (Submitted to Cukurova University Scientific Research Projects Fund - Budget: ~€8K)
- Fine-tuning the base model of an open-source LLM on a revised version of a known supervised fine-tuning (SFT) dataset.
- Evaluating the fine-tuned model on a new, specially constructed dataset.

Selected Publications: NLP & AI Focus

Under Review / Current Research
- Albayrak, F., Orhan, U. (2025). “Evaluating Reasoning Skills, Not Memorized Answers: A New Experimental Design for LLMs”. Submitted to ACM Transactions on Intelligent Systems and Technology.
- Tahiroglu, B.T., Sayallar, C., Turan, E., Orhan, U. (2025). “Unmasking Accuracy Illusions in Turkish Lemmatization: A New Evaluation Framework for Morphologically Rich Languages”. Submitted to ACM Transactions on Asian and Low-Resource Language Information Processing.
Novel Embedding: The “SemSpace” Framework
- Orhan, U., & Tulu, C. N. (2021). “A novel embedding approach to learn word vectors by weighting semantic relations: SemSpace”. Expert Systems with Applications, 185, 115146.
- Orhan, U., Tosun, E. G., & Ozkaya, O. (2022). “Intent Detection Using Contextualized Deep SemSpace”. Arabian Journal for Science and Engineering, 48(2), 2009-2020.
- Tulu, C. N., Ozkaya, O., & Orhan, U. (2021). “Automatic Short Answer Grading With SemSpace Sense Vectors and MaLSTM”. IEEE Access, 9, 19270-19280.
Low-Resource Language Processing: Turkish
- Turan, E., & Orhan, U. (2022). “Confidence Indexing of Automated Detected Synsets: A Case Study on Contemporary Turkish Dictionary”. ACM Transactions on Asian and Low-Resource Language Information Processing, 21(2), 1-19.
- Arslan, E., & Orhan, U. (2020). “Learning Word-Vector Quantization: A Study in Morphological Disambiguation of Turkish”. ACM Transactions on Asian and Low-Resource Language Information Processing, 19(6), 1-22.
Biomedical AI: EEG & ECG
- Orhan, U. (2013). “Real-time CHF detection from ECG signals using a novel discretization method”. Computers in Biology and Medicine, 43(10), 1556-1562.
- Orhan, U., Hekim, M., Ozer, M. (2011). “EEG signals classification using the K-means clustering and a multilayer perceptron neural network model”. Expert Systems with Applications, 38(10), 13475-13481.

Education

Ph.D. in Electrical-Electronics Engineering, Bulent Ecevit University, 2011.
- Title: Novel approaches for diagnosis of epilepsy disease from EEG signals.
- Proposed three novel discretization frameworks specifically adapted for non-stationary EEG time-series analysis. By transforming continuous bio-signals into discrete probabilistic distributions, the study established a robust classification model for epilepsy diagnosis, demonstrating the efficacy of discrete modeling in handling complex physiological data.
M.Sc. in Mathematics, Gaziosmanpasa University, 2007.
- Title: Computer simulation of no-chance backgammon with fuzzy logic.
- Developed a human-vs-computer game simulation for a novel backgammon game variant, utilizing fuzzy inference algorithms to model adaptive decision-making and strategic reasoning against human opponents.
B.Eng. in Computer Engineering, Karadeniz Technical University, 2000.

Additional Information

Reviewer

Permanent reviewer for TUBITAK on machine learning and NLP-based commercial R&D projects.

Sci-Fi Author (Amazon KDP)

Author of the “Artificial World Colony” series, published globally on Amazon (AWC - The First Journey, AWC - The Blue Collapse)
Investigating the boundaries of algorithmic storytelling by bridging my technical research with my experience as a Sci-Fi novelist. I also aim to develop metrics for evaluating “machine creativity” and to explore whether generative models can construct coherent, long-form narratives that rival human imagination.