Mengoptimalkan Proses Pembersihan Data dalam Analisis Big Data Menggunakan Pipeline Berbasis AI [Optimizing the Data Cleaning Process in Big Data Analysis Using an AI-Based Pipeline]

Authors

  • Lukman Santoso, Universitas Sains dan Teknologi Komputer
  • Priyadi Priyadi, Universitas Sains dan Teknologi Komputer

DOI:

https://doi.org/10.51903/elkom.v17i2.2311

Keywords:

Machine Learning, Deep Learning, Data Preprocessing

Abstract

This study develops an automated data cleaning pipeline using Pandas and Scikit-learn. Data cleaning is often performed manually, which is time-consuming and error-prone. The study uses a quantitative experimental method on a dataset of 100,000 rows of e-commerce transaction data. The results show that the automated pipeline reduces missing values by 95.7% and outliers by 91.7%, and cuts processing time by 35% compared with manual methods. The data distribution after cleaning is more stable, which allows for more accurate analysis. The study contributes to the development of a more efficient and accurate automated approach to data cleaning.
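
The abstract names Pandas and Scikit-learn but does not specify the cleaning steps, so the following is only a minimal sketch of what such a pipeline might look like. It assumes median imputation for missing values and IQR-based clipping for outliers. The `IQRClipper` class, the `clean` helper, the column names, and the toy data are all hypothetical and are not taken from the paper.

```python
import numpy as np
import pandas as pd
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline


class IQRClipper(BaseEstimator, TransformerMixin):
    """Hypothetical transformer: clip each column to
    [Q1 - k*IQR, Q3 + k*IQR], with bounds learned during fit."""

    def __init__(self, k=1.5):
        self.k = k

    def fit(self, X, y=None):
        X = np.asarray(X, dtype=float)
        # Per-column quartiles; X must already be free of NaN here.
        q1, q3 = np.percentile(X, [25, 75], axis=0)
        iqr = q3 - q1
        self.lower_ = q1 - self.k * iqr
        self.upper_ = q3 + self.k * iqr
        return self

    def transform(self, X):
        return np.clip(np.asarray(X, dtype=float), self.lower_, self.upper_)


def build_cleaning_pipeline():
    # Assumed steps: impute first so the quartiles are computed on complete data.
    return Pipeline([
        ("impute", SimpleImputer(strategy="median")),
        ("clip_outliers", IQRClipper(k=1.5)),
    ])


def clean(df, numeric_cols):
    """Return a copy of df with the numeric columns imputed and clipped."""
    cleaned = df.copy()
    cleaned[numeric_cols] = build_cleaning_pipeline().fit_transform(df[numeric_cols])
    return cleaned


# Toy stand-in for transaction data (hypothetical values, not the study's dataset).
raw = pd.DataFrame({
    "price": [10.0, 12.0, np.nan, 11.0, 500.0, 9.0],
    "quantity": [1.0, 2.0, 2.0, np.nan, 3.0, 1.0],
})
cleaned = clean(raw, ["price", "quantity"])
print(cleaned)
```

Clipping keeps every row and only caps extreme values. A pipeline that drops outlier rows instead would change the row count, and the abstract does not say which approach the authors used.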

Published

2024-12-12

How to Cite

[1]
L. Santoso and Priyadi Priyadi, “Mengoptimalkan Proses Pembersihan Data dalam Analisis Big Data Menggunakan Pipeline Berbasis AI”, ELKOM, vol. 17, no. 2, pp. 657–666, Dec. 2024.