REGIONAL TYPOLOGY OF E-COMMERCE BUSINESS CONSTRAINTS IN INDONESIA: A MACHINE LEARNING APPROACH

Fathur Rachman; Harun Al Azies

Authors

Fathur Rachman, Universitas Airlangga
Harun Al Azies, Universitas Dian Nuswantoro; Institut Teknologi Bandung

Keywords:

E-commerce, machine learning, clustering analysis, regional typology

Abstract

This study analyzes regional disparities in e-commerce business constraints in Indonesia using an unsupervised machine learning approach. The analysis uses province-level data from Statistik E-Commerce 2024, published by Statistics Indonesia, and covers 38 provinces. It examines seven major constraints: funding limitations, skilled labor shortages, limited internet access, fraud, marketing challenges, delivery constraints, and other operational barriers. K-Means clustering with z-score standardization is applied to identify regional typologies of e-commerce business constraints. The optimal number of clusters is determined using the elbow method, the silhouette score, the Davies-Bouldin index, and the Calinski-Harabasz index. The results reveal five distinct regional clusters, each with a different combination of constraints. Provinces in Java and Bali are mainly constrained by capital and marketing pressures despite relatively advanced digital infrastructure. Several regions outside Java face balanced structural constraints involving multiple interrelated obstacles, while capital-related constraints dominate in others. In contrast, Papua Pegunungan and Papua Tengah exhibit severe digital infrastructure constraints, indicating a persistent digital divide. The study contributes a province-level typology of e-commerce business constraints built from official statistics and machine learning, offering a data-driven basis for designing region-specific strategies to support inclusive e-commerce development in Indonesia.
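The workflow named in the abstract (z-score standardization, K-Means, and selection of the number of clusters from several validity measures) can be illustrated with a minimal sketch. This is not the authors' code or data. The toy matrix, the function names, the use of the population standard deviation, the farthest-first initialization, and the choice of picking k by silhouette alone are all assumptions made for illustration. Only the Python standard library is used so that the calculations are visible; in practice a library such as scikit-learn would normally be used instead.

```python
import math
import statistics

def zscore(rows):
    """Standardize each column to mean 0 and population standard deviation 1."""
    cols = list(zip(*rows))
    mus = [statistics.fmean(c) for c in cols]
    sds = [statistics.pstdev(c) or 1.0 for c in cols]  # guard constant columns
    return [[(v - m) / s for v, m, s in zip(r, mus, sds)] for r in rows]

def kmeans(X, k, iters=100):
    """Lloyd's algorithm with a deterministic farthest-first start (an assumption)."""
    cents = [list(X[0])]
    while len(cents) < k:
        cents.append(list(max(X, key=lambda p: min(math.dist(p, c) for c in cents))))
    labels = None
    for _ in range(iters):
        new = [min(range(k), key=lambda j: math.dist(p, cents[j])) for p in X]
        if new == labels:  # assignments stable: converged
            break
        labels = new
        for j in range(k):
            members = [p for p, l in zip(X, labels) if l == j]
            if members:  # keep the old centroid if a cluster empties
                cents[j] = [statistics.fmean(c) for c in zip(*members)]
    return labels

def scores(X, labels):
    """Return (inertia, silhouette, Davies-Bouldin, Calinski-Harabasz) for k >= 2."""
    groups = {}
    for p, l in zip(X, labels):
        groups.setdefault(l, []).append(p)
    cents = {l: [statistics.fmean(c) for c in zip(*g)] for l, g in groups.items()}
    n, k = len(X), len(groups)
    # Inertia (within-cluster sum of squares) is the quantity plotted in an elbow chart.
    inertia = sum(math.dist(p, cents[l]) ** 2 for p, l in zip(X, labels))
    # Silhouette: own-cluster cohesion (a) versus nearest other cluster (b).
    sil = []
    for p, l in zip(X, labels):
        if len(groups[l]) == 1:
            sil.append(0.0)  # convention for singleton clusters
            continue
        a = sum(math.dist(p, q) for q in groups[l]) / (len(groups[l]) - 1)
        b = min(statistics.fmean(math.dist(p, q) for q in g)
                for m, g in groups.items() if m != l)
        sil.append((b - a) / max(a, b))
    # Davies-Bouldin: mean worst-case ratio of scatter to centroid separation (lower is better).
    scat = {l: statistics.fmean(math.dist(p, cents[l]) for p in g)
            for l, g in groups.items()}
    db = statistics.fmean(
        max((scat[i] + scat[j]) / math.dist(cents[i], cents[j])
            for j in groups if j != i)
        for i in groups)
    # Calinski-Harabasz: between-cluster over within-cluster dispersion (higher is better).
    mean = [statistics.fmean(c) for c in zip(*X)]
    between = sum(len(g) * math.dist(cents[l], mean) ** 2 for l, g in groups.items())
    ch = (between / (k - 1)) / (inertia / (n - k))
    return inertia, statistics.fmean(sil), db, ch

# Hypothetical toy data: rows are made-up regions, columns are made-up shares (%)
# of businesses reporting three constraints. These are NOT the study's figures.
rows = [
    [60, 10, 5], [62, 12, 6], [58, 11, 4],
    [20, 55, 8], [22, 57, 9], [19, 53, 7],
    [15, 12, 70], [17, 10, 72], [14, 13, 68],
]
X = zscore(rows)
results = {k: scores(X, kmeans(X, k)) for k in range(2, 6)}
for k, (inertia, sil, db, ch) in results.items():
    print(f"k={k}  inertia={inertia:.2f}  silhouette={sil:.3f}  DB={db:.3f}  CH={ch:.1f}")
best_k = max(results, key=lambda k: results[k][1])  # illustration: pick k by silhouette only
print("k with the highest silhouette:", best_k)
```

On this toy matrix the three groups are well separated by construction, so the silhouette peaks at k = 3. With real survey data the four measures can disagree, which is presumably why the study reports all of them rather than relying on one.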
