Why Are Raw Data Not Readily Usable by Analytics

1) Why are the original/raw data not readily usable by analytics tasks? What are the main data preprocessing steps? List and explain their importance in analytics.

2) What are the privacy issues with data mining? Do you think they are substantiated?

Instructions (for each question): 1) WORDS and two peer reply posts of 100 words each; 2) no plagiarism; 3) APA format; 4) two references are required and must be cited properly.

Paper for the Above Instructions

Original (raw) data are often not immediately suitable for analytics tasks because they tend to be inconsistent, noisy, incomplete, and unstructured. Raw data therefore require preprocessing to improve their quality and usability. The main data preprocessing steps are data cleaning, integration, transformation, reduction, and discretization. Data cleaning removes errors and duplicates and handles missing values; it is crucial because dirty data can lead to inaccurate insights and flawed decision-making (Kotu & Deshpande, 2019). Data integration combines data from various sources into a consistent, coherent dataset. Transformation converts data into appropriate formats or scales, for example by normalizing numeric attributes to a common range, so that they can be analyzed together. Reduction shrinks the data by eliminating redundant attributes or records, which decreases the computational load. Discretization converts continuous variables into categorical bins, which can make analytical models more efficient (Han et al., 2011). Together, these steps improve data quality and make analytics more accurate and efficient, ultimately leading to more reliable insights.

Privacy issues in data mining primarily concern the potential for misuse of, and unauthorized access to, personal data. They include de-anonymization, data breaches, and violations of informed consent. Because data mining draws on extensive datasets that often contain sensitive information, it risks infringing on individual privacy rights. These concerns are substantiated: records can be re-identified even after explicit identifiers have been removed, by linking the remaining attributes to other available data (Sweeney, 2002). Safeguarding privacy therefore requires rigorous security protocols, anonymization techniques, and transparent data policies. Without such safeguards, data mining can lead to significant ethical and legal consequences, so these privacy issues need to be addressed proactively.
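The five preprocessing steps and the re-identification concern can be illustrated with a small sketch. It is only a minimal example under stated assumptions: the records, field names, age bins, and helper functions below are all hypothetical and are not taken from the cited sources. It uses only the Python standard library and assumes the incomes are not all identical, since min-max scaling would otherwise divide by zero.

```python
from collections import Counter
from statistics import mean

# Hypothetical raw records from two made-up sources.
CUSTOMERS = [
    {"id": 1, "age": 34, "zip": "02139"},
    {"id": 2, "age": None, "zip": "02139"},   # missing value
    {"id": 2, "age": None, "zip": "02139"},   # exact duplicate
    {"id": 3, "age": 58, "zip": "02141"},
    {"id": 4, "age": 23, "zip": "02141"},
]
INCOMES = {1: 52000.0, 2: 61000.0, 3: 87000.0, 4: 30000.0}


def clean(records):
    """Cleaning: keep the first record per id, fill missing ages with the mean."""
    unique = {}
    for rec in records:
        unique.setdefault(rec["id"], dict(rec))
    rows = list(unique.values())
    fill = round(mean(r["age"] for r in rows if r["age"] is not None))
    for r in rows:
        if r["age"] is None:
            r["age"] = fill
    return rows


def integrate(rows, incomes):
    """Integration: attach income from the second source, matched on id."""
    return [{**r, "income": incomes[r["id"]]} for r in rows]


def transform(rows):
    """Transformation: min-max scale income into the range [0, 1]."""
    lo = min(r["income"] for r in rows)
    hi = max(r["income"] for r in rows)
    return [{**r, "income_scaled": (r["income"] - lo) / (hi - lo)} for r in rows]


def discretize(rows):
    """Discretization: map continuous age onto three illustrative bins."""
    def bin_age(age):
        if age < 30:
            return "young"
        return "middle" if age < 50 else "senior"
    return [{**r, "age_group": bin_age(r["age"])} for r in rows]


def reduce_columns(rows, keep):
    """Reduction: keep only the attributes the analysis needs."""
    return [{k: r[k] for k in keep} for r in rows]


def smallest_group(rows, quasi_identifiers):
    """Return k, the size of the smallest group sharing quasi-identifier values."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in rows)
    return min(groups.values())


prepared = reduce_columns(
    discretize(transform(integrate(clean(CUSTOMERS), INCOMES))),
    keep=["age_group", "zip", "income_scaled"],
)
k = smallest_group(prepared, ["age_group", "zip"])
```

In this toy dataset `k` comes out as 1: even after the `id` column is dropped, at least one record is unique on the combination of age group and ZIP code, which is the kind of linkage risk that k-anonymity (Sweeney, 2002) is meant to measure. Coarser bins or suppressed values would be needed to raise `k`.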

References

  • Han, J., Kamber, M., & Pei, J. (2011). Data mining: Concepts and techniques (3rd ed.). Morgan Kaufmann.
  • Kotu, V., & Deshpande, B. (2019). Data science: Concepts and practice (2nd ed.). Morgan Kaufmann.
  • Sweeney, L. (2002). Achieving k-anonymity privacy protection using generalization and suppression. International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems, 10(5), 571–588.