Sonia Sheryr

Dr. Bradley

CODES 123

25 February 2025

Article Annotation

This article is about data cleaning and why it is such a crucial step in research. Data cleaning can also be described as data transformation, because it means taking the data and preparing it as well as possible so that it can be analyzed correctly. The authors argue that data cleaning is vital, but that it can also strip away important information. Data cleaning often involves hidden choices: small changes or “clean-ups,” sometimes made without noticing, that can have an outsized effect on the dataset. Removing any data, even data that seems irrelevant or extra, can change the results and the evaluations based on them.

The authors suggest a method called indexing as an alternative to data cleaning. Indexing consists of structuring information in a system that keeps two topics of information separate while still showing how they relate to each other. This captures the complete picture and the diversity of the information without mistakenly removing anything crucial.

This article is rich in detail and considers several different perspectives. It is valuable because it introduces indexing as an alternative to data cleaning. There is a slight bias when the authors state that every little bit of data must be conserved and used, since this is not true in all situations. The article presents several main ideas and supports each of them with plenty of detail. I did not know much about data cleaning or its positive and negative effects before reading this article. It was a highly informative piece of writing, and it made me think about the impacts of data cleaning.