This volume constitutes the proceedings of the 23rd International Symposium on Intelligent Data Analysis, IDA 2025, which was held in Konstanz, Germany, during May 7–9, 2025.
The 35 full papers included in the proceedings were carefully reviewed and selected from 91 submissions. They are organized in topical sections as follows: applications of data science; foundations of data science; natural language processing; temporal and streaming data; and explainable and interpretable data science.
Applications of Data Science.- Credal Knowledge Tracing for Imprecise and Uncertain MCQ.- Development of Models to Quantify Training Load in Outdoor Running using Inertial Sensors.- Estimating the Learning Capacity of Bacterial Metabolic Networks.- Semi-supervised learning with pairwise instance comparisons for medical instance classification.- Local-global Data Augmentation for Contrastive Learning in Static Sign Language Recognition.- SiamCircle: Trajectory Representation Learning in Free Settings.- Synthetic Tabular Data Detection In the Wild.- Assessing the Impact of Graph Structure Learning in Graph Deviation Networks.
Foundations of Data Science.- The When and How of Target Variable Transformations.- Balancing performance and scalability of demand forecasting ML models.- Balancing global importance and source proximity for personalized recommendations using random walk length.- Counterintuitive Behavior of Clustering Quality: Findings for K-Means on Synthetic and Real Data.- BOWSA: a contribution of sensitivity analysis to improve Bayesian optimization for parameter tuning.- Overfitting in Combined Algorithm Selection and Hyperparameter Optimization.- Local Subgroup Discovery on Attributed Network Graphs.- Imposing Constraints in Probabilistic Circuits via Gradient Optimization.
Natural Language Processing.- Improving Next Tokens via Second-Last Predictions with ‘Generate and Refine’.- Detection of Large Language Model Contamination with Tabular Data.- Imbalanced Data Clustering via Targeted Data Augmentation Using GMM and LLM.- Make Literature-Based Discovery Great Again through Reproducible Pipelines.- Extracting information in a low-resource setting: case study on bioinformatics workflows.- Vocabulary Quality in NLP Datasets: An Autoencoder-Based Framework Across Domains and Languages.
Temporal and Streaming Data.- Expertise Prediction of Tetris Players Using Eye Tracking Information.- Integrating Inverse and Forward Modeling for Sparse Temporal Data from Sensor Networks.- Bridging Spatial and Temporal Contexts: Sparse Transfer Learning.- Meta-learning and Data Augmentation for Stress Testing Forecasting Models.- Pragmatic Paradigm for Multi-stream Regression.- Two-in-one Models for Event Prediction and Time Series Forecasting.- Comparison of Four Deep Learning Approaches to Simulate a Digital Patient under Anesthesia.- An Analysis of Temporal Dropout in Earth Observation Time Series for Regression Tasks.- Performative Drift Resistant Classification using Generative Domain Adversarial Networks.
Explainable and Interpretable Data Science.- Extracting Moore Machines from Transformers using Queries and Counterexamples.- Obtaining Example-Based Explanations from Deep Neural Networks.- Relevance-aware Algorithmic Recourse.- Expanding Polynomial Kernels for Global and Local Explanations of Support Vector Machines.- A Constrained Declarative Based Approach for Explainable Clustering.