The Implementation of a Logistic Regression Algorithm and Gradient Boosting Classifier for Predicting Telco Customer Churn
DOI:
https://doi.org/10.51903/pixel.v17i1.2006
Keywords:
Customer Churn, Logistic Regression, Gradient Boosting Classifier
Abstract
This research predicts customer churn at a telecommunications company using the Logistic Regression (LR) and Gradient Boosting Classifier (GBC) algorithms. Customer churn is a significant challenge because acquiring new customers costs more than retaining existing ones. The dataset, obtained from Kaggle, comprises 7,043 records and 21 attributes. The process includes data pre-processing, cleaning, transformation, and normalization with a Min-Max Scaler. The data is split into features (X) and target (y), then divided into training and testing sets in an 80:20 ratio. Both models were trained and then evaluated using a confusion matrix. The GBC model outperforms the LR model, reaching an accuracy of 83% compared with 81% for LR, which demonstrates the effectiveness of GBC in predicting customer churn.
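The pre-processing and evaluation steps named in the abstract (Min-Max normalization, an 80:20 train/test split, and accuracy read off a confusion matrix) can be illustrated with a minimal plain-Python sketch. This is not the study's code and does not reproduce its models or data: the helper names, the fixed random seed, and the toy values are hypothetical and chosen only to make the arithmetic visible.

```python
import random


def min_max_scale(column):
    """Rescale a list of numbers to the range [0, 1].

    Each value v becomes (v - min) / (max - min).
    A constant column is mapped to all zeros to avoid dividing by zero.
    """
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]


def split_80_20(features, target, seed=0):
    """Shuffle row indices with a fixed seed, then split them 80:20."""
    idx = list(range(len(features)))
    random.Random(seed).shuffle(idx)
    cut = (len(idx) * 4) // 5  # 80% of the rows go to training
    train, test = idx[:cut], idx[cut:]
    return (
        [features[i] for i in train],
        [features[i] for i in test],
        [target[i] for i in train],
        [target[i] for i in test],
    )


def confusion_counts(y_true, y_pred):
    """Return (tn, fp, fn, tp) for binary labels, where 1 means churn."""
    tn = fp = fn = tp = 0
    for actual, predicted in zip(y_true, y_pred):
        if actual == 1 and predicted == 1:
            tp += 1
        elif actual == 0 and predicted == 0:
            tn += 1
        elif actual == 0 and predicted == 1:
            fp += 1
        else:
            fn += 1
    return tn, fp, fn, tp


def accuracy(tn, fp, fn, tp):
    """Share of all predictions that were correct."""
    return (tp + tn) / (tn + fp + fn + tp)


# Toy data (hypothetical): one numeric feature and a binary churn label.
tenure_months = [0, 36, 72, 12, 24, 60, 48, 6, 18, 30]
churn = [1, 0, 0, 1, 0, 0, 0, 1, 1, 0]

X = min_max_scale(tenure_months)
X_train, X_test, y_train, y_test = split_80_20(X, churn)

# Hypothetical predictions, standing in for the output of a trained model.
y_true = [1, 0, 1, 0, 0]
y_pred = [1, 0, 0, 0, 1]
counts = confusion_counts(y_true, y_pred)
print(counts, accuracy(*counts))
```

Accuracy alone can be misleading when the churn class is much smaller than the non-churn class, which is why the four confusion-matrix counts are kept separate here rather than collapsed into a single number.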