TITLE: Identifying underutilized land by eXplainable artificial intelligence and geographic similarity ensemble model with limited samples
ABSTRACT: As cities worldwide confront the dual challenges of scarce spatial resources and aging urban fabric, precise identification of underutilized land has become a critical pathway toward sustainable urban regeneration. However, three unresolved scientific problems hinder such identification: (1) multifactorial spatial complexity that obscures how determinants interact, (2) limited sample availability that constrains machine learning efficacy, and (3) opaque decision-making in conventional algorithms. This study addresses these problems with an eXplainable Artificial Intelligence-Geographic Similarity Reasoning (XAIGSR) model that integrates three innovations: a multidimensional indicator system quantifying land-use efficiency across morphological, economic, social, and ecological dimensions; XGBoost-SHAP interpretation elucidating nonlinear factor contributions; and geospatial analogical reasoning that overcomes sample scarcity. Applied to Shenzhen, the model achieved 82.9% accuracy and identified 9668 underutilized blocks (25.44% of the total) with a distinct typological distribution: the small share of Type 1 (6.99%) reflects central district efficiency, whereas the dominance of Type 2 (56.17%) reveals suburban improvement potential, and Type 3 (27.18%) and Mixed-type (9.67%) clusters predominantly occupy eastern and northern low-density zones. Compared with existing methods, the framework advances underutilized land detection by simultaneously addressing sample limitations through geospatial similarity reasoning and enhancing reliability through uncertainty-quantified similarity metrics, providing urban planners with an empirically validated decision-support tool for targeted regeneration strategies.
Keywords: Urban renewal; Underutilized land; Machine learning; XAI; Geographic similarity; Shenzhen
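The abstract describes similarity-based reasoning with an uncertainty measure but does not give its formulation. The following is a minimal Python sketch of one common way such reasoning can be set up, not the authors' implementation. The Gaussian similarity function, the minimum (limiting-factor) rule across indicators, the uncertainty defined as one minus the highest similarity, the indicator names, and all numeric values are assumptions made purely for illustration.

```python
import math

def indicator_similarity(a, b, scale):
    """Gaussian similarity in (0, 1] between two values of one indicator (assumed form)."""
    return math.exp(-((a - b) ** 2) / (2 * scale ** 2))

def block_similarity(x, y, scales):
    """Overall similarity of two blocks: the minimum across indicators (assumed rule)."""
    return min(indicator_similarity(a, b, s) for a, b, s in zip(x, y, scales))

def predict(block, samples, scales):
    """Return (label, uncertainty) for one unlabeled block.

    samples is a list of (indicator_vector, label) pairs. The label is taken
    from the most similar sample; uncertainty is 1 minus that similarity.
    """
    best_vec, best_label = max(
        samples, key=lambda s: block_similarity(block, s[0], scales)
    )
    return best_label, 1.0 - block_similarity(block, best_vec, scales)

# Hypothetical indicators: [floor area ratio, night-light intensity, vegetation index]
samples = [
    ([0.4, 0.20, 0.60], "underutilized"),
    ([3.0, 0.90, 0.10], "efficient"),
]
scales = [1.0, 0.3, 0.3]  # hypothetical per-indicator kernel widths

label, uncertainty = predict([0.5, 0.25, 0.55], samples, scales)
print(label, round(uncertainty, 3))
```

Because the label is borrowed from the most similar labeled sample rather than fitted from a large training set, a scheme of this kind can work with few samples, and a high uncertainty value flags blocks that resemble none of them.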