Publications
You can also find my articles on my Google Scholar profile.
Conferences & Journals
- Gradient boosting decision trees on medical diagnosis over tabular data [Paper] [Code]
- Published in IEEE ICAD, 2025
- Medical diagnosis is a crucial task in the medical field, in terms of providing accurate classification and respective treatments. Decisions based on a correct diagnosis can affect a patient’s life, and a misclassification can, in extreme cases, be catastrophic. Several traditional machine learning (ML) methods, such as support vector machines (SVMs) and logistic regression, and state-of-the-art tabular deep learning (DL) methods, including TabNet and TabTransformer, have been proposed and used over tabular medical datasets. Additionally, due to their superior performance, lower computational costs, and easier optimization over different tasks, ensemble methods have been used in the field more recently. They offer a powerful alternative in terms of providing successful medical decision-making processes in several diagnosis tasks. In this study, we investigated the benefits of ensemble methods, especially the Gradient Boosting Decision Tree (GBDT) algorithms in medical classification tasks over tabular data, focusing on XGBoost, CatBoost, and LightGBM. The experiments demonstrate that GBDT methods outperform traditional ML and deep neural network architectures and have the highest average rank over several benchmark tabular medical diagnosis datasets. Furthermore, they require much less computational power compared to DL models, creating the optimal methodology in terms of high performance and lower complexity.
- Recommended citation: A. Y. Yıldız and A. Kalayci, “Gradient boosting decision trees on medical diagnosis over tabular data.” arXiv preprint arXiv:2410.03705 (2024). Copy BibTeX from here.
- Detection and classification architecture for sdr based radar electronic support measure systems [Paper]
- Published in IEEE SIU, 2024
- Electronic Support Measures (ESM) devices are key to situational awareness of the electromagnetic environment in the field. However, the current ESM systems tend to be physically large and cumbersome. To mitigate this problem, a portable ESM device is proposed. In this work, a compact single-board computer (SBC) is coupled with a Software Defined Radio (SDR) to create such a device. Signals received by the SDR are sampled within the SDR and sent to the SBC. Those signals are then processed with various signal processing and machine learning algorithms to perform detection, measurement, and classification tasks. Later, these results are reported to the user.
- Recommended citation: G. S. Yavuz, B. Sayğılı, Y. Aydınlı, R. Dalkıran, İ. Eşin, M. Uluçay, B. Uykulu, S. S. Kıyma, O. Arikan, and A. Y. Yıldız, “Detection and classification architecture for sdr based radar electronic support measure systems.” 2024 32nd Signal Processing and Communications Applications Conference (SIU). IEEE, 2024. Copy BibTeX from here.
- Multivariate time series imputation with transformers [Paper] [Code]
- Published in IEEE SPL, 2022
- Processing time series with missing segments is a fundamental challenge that hinders advanced analysis in various disciplines such as engineering, medicine, and economics. One of the remedies is imputation, which fills the missing values based on observed values without undermining performance. We propose the Multivariate Time-Series Imputation with Transformers (MTSIT), a novel method that uses transformer architecture in an unsupervised manner for missing value imputation. Unlike the existing transformer architectures, this model only uses the encoder part of the transformer due to computational benefits. Crucially, MTSIT trains the autoencoder by jointly reconstructing and imputing stochastically-masked inputs via an objective designed for multivariate time-series data. The trained autoencoder is then evaluated for imputing both simulated and real missing values. Experiments show that MTSIT outperforms state-of-the-art imputation methods over benchmark datasets.
- Recommended citation: A. Y. Yıldız, E. Koç, and A. Koç. “Multivariate time series imputation with transformers.” IEEE Signal Processing Letters 29 (2022): 2517-2521. Copy BibTeX from here.
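The stochastic masking idea mentioned in the MTSIT abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: the helper names `mask_series`, `column_mean_impute`, and `masked_mse` are hypothetical, the masking ratio is an arbitrary choice, and a simple column-mean filler stands in for the transformer encoder. It only shows how hiding known values yields a target against which an imputation can be scored.

```python
import random


def mask_series(series, ratio, rng):
    """Randomly hide a fraction of entries in a multivariate series.

    Returns (masked_copy, mask); mask[t][d] is True where a value was hidden.
    Hidden values are represented by None (an assumption of this sketch).
    """
    masked, mask = [], []
    for row in series:
        hide = [rng.random() < ratio for _ in row]
        masked.append([None if h else v for v, h in zip(row, hide)])
        mask.append(hide)
    return masked, mask


def column_mean_impute(masked):
    """Stand-in for a learned model: fill hidden entries with the column mean."""
    n_dims = len(masked[0])
    means = []
    for d in range(n_dims):
        visible = [row[d] for row in masked if row[d] is not None]
        means.append(sum(visible) / len(visible) if visible else 0.0)
    return [[means[d] if v is None else v for d, v in enumerate(row)]
            for row in masked]


def masked_mse(target, prediction, mask):
    """Mean squared error computed over the hidden positions only."""
    errors = [
        (t - p) ** 2
        for t_row, p_row, m_row in zip(target, prediction, mask)
        for t, p, m in zip(t_row, p_row, m_row)
        if m
    ]
    return sum(errors) / len(errors) if errors else 0.0


# Usage: hide roughly 30% of a toy series and score the stand-in imputer.
rng = random.Random(0)
toy_series = [[float(t), float(10 * t)] for t in range(1, 9)]
toy_masked, toy_mask = mask_series(toy_series, 0.3, rng)
toy_filled = column_mean_impute(toy_masked)
toy_error = masked_mse(toy_series, toy_filled, toy_mask)
```

In a training setup along these lines, the error over hidden positions would be the quantity minimised, with the column-mean filler replaced by a learned model.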
Preprints
- T-PRIME: Transformer-based Protocol Identification for Machine-learning at the Edge [Paper] [Code]
- Published in arXiv, 2024 (submitted to IEEE/ACM Transactions on Networking)
- Spectrum sharing allows different protocols of the same standard (e.g., 802.11 family) or different standards (e.g., LTE and DVB) to coexist in overlapping frequency bands. As this paradigm continues to spread, wireless systems must also evolve to identify active transmitters and unauthorized waveforms in real time under intentional distortion of preambles, extremely low signal-to-noise ratios and challenging channel conditions. We overcome limitations of correlation-based preamble matching methods in such conditions through the design of T-PRIME: a Transformer-based machine learning approach. T-PRIME learns the structural design of transmitted frames through its attention mechanism, looking at sequence patterns that go beyond the preamble alone. The paper makes three contributions: First, it compares Transformer models and demonstrates their superiority over traditional methods and state-of-the-art neural networks. Second, it rigorously analyzes T-PRIME’s real-time feasibility on DeepWave’s AIR-T platform. Third, it utilizes an extensive 66 GB dataset of over-the-air (OTA) WiFi transmissions for training, which is released along with the code for community use. Results reveal nearly perfect (i.e. >98%) classification accuracy under simulated scenarios, showing 100% detection improvement over legacy methods in low SNR ranges, 97% classification accuracy for OTA single-protocol transmissions and up to 75% double-protocol classification accuracy in interference scenarios.
- Recommended citation: M. Belgiovine, J. Groen, M. Sirera, C. Tassie, A. Y. Yıldız, S. Trudeau, S. Ioannidis, and K. Chowdhury, “T-PRIME: Transformer-based Protocol Identification for Machine-learning at the Edge.” arXiv preprint arXiv:2401.04837 (2024). Copy BibTeX from here.
- sEMG motion classification via few-shot learning with applications to sports science [Paper]
- Published in Authorea Preprints, 2024
- Motion classification with surface electromyography (sEMG) has been studied for practical applications in prosthesis limb control and human-machine interaction. Recent studies have shown that feature learning with deep neural networks (DNN) reaches considerable accuracy in motion classification tasks. However, DNNs require large datasets for acceptable performance and fail for tasks with few data samples available for training. Professional athlete training includes hundreds of exercises, and coupled with privacy and confidentiality issues, acquiring a large dataset for all the exercises is not feasible. As a result, state-of-the-art DNN architectures are unsuitable for real-life sports applications. We utilise few-shot learning (FSL) techniques to overcome the small dataset problem of sports-related motion classification tasks. The employed methodology uses the knowledge gathered from a large set of tasks to classify unseen tasks with a few data samples. The FSL approach with a siamese network and triplet loss reached the best performance with a median F1-score of 72.01%, 76%, and 79% for 1, 5 and 10 shot datasets that include an unseen set of tasks, respectively. In contrast, DNN with transfer learning (TF) reached 49.27%, 51.58%, and 67.66% for the same set of tasks, respectively.
- Recommended citation: M. Ergeneci, E. Bayram, A. Y. Yıldız, D. Carter, and P. Kosmas, “sEMG motion classification via few-shot learning with applications to sports science.” Authorea Preprints (2023). Copy BibTeX from here.
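The triplet loss named in the sEMG few-shot abstract above can be sketched in a few lines. This is a generic illustration and not the paper's code: the function names, the margin of 1.0, the Euclidean distance, and the nearest-support classification rule are all assumptions, and the embeddings are hand-written vectors rather than outputs of a siamese network.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet loss: push the anchor closer to the positive
    than to the negative by at least `margin`."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


def classify_few_shot(query, support):
    """Assign the label of the closest support embedding.

    `support` maps each label to its few labelled embeddings (the "shots").
    """
    best_label, best_dist = None, float("inf")
    for label, embeddings in support.items():
        for emb in embeddings:
            dist = euclidean(query, emb)
            if dist < best_dist:
                best_label, best_dist = label, dist
    return best_label


# Usage with hypothetical 2-D embeddings and made-up exercise labels.
support_set = {
    "squat": [[0.0, 0.0], [0.1, 0.0]],
    "lunge": [[1.0, 1.0]],
}
predicted = classify_few_shot([0.9, 1.0], support_set)
```

With an embedding trained under such a loss, a new exercise would need only its few support examples to be classified, which is the few-shot setting the abstract describes.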