COMP47670 Assignment 1: Data Collection And Preparation

Question

COMP47670 Assignment 1: Data Collection & Preparation Overview: The objective of this assignment is to collect a dataset from one or more open web APIs of your choice, and use Python to preprocess and analyse the collected data. The assignment should be implemented as a single Jupyter Notebook (not a script). Your notebook should be clearly documented, using comments and Markdown cells to explain the code and results. Tasks: For this assignment you should complete the following tasks: Data identification: Choose at least one open web API as your data source. If more than one API is used, these APIs should be related in some way. Data collection: Collect data from your API(s) using Python. You may need to repeat the collection process multiple times to gather sufficient data. Store the collected data in an appropriate file format for subsequent analysis, such as JSON, XML, or CSV. Data preparation and analysis: Load and represent the data with an appropriate data structure (e.g., rows as records, features as columns). Apply necessary preprocessing steps to clean or filter the data before analysis. When multiple APIs are used, apply suitable data integration methods. Analyze, characterize, and summarize the cleaned dataset using tables and plots where appropriate. Clearly explain and interpret the analysis results. Summarize insights gained and suggest ideas for further analysis in the future. Guidelines: The assignment should be completed individually. Plagiarism will result in a zero grade. Submit your assignment via the COMP47670 Brightspace page. Your submission must be a ZIP file containing the Jupyter Notebook (IPYNB) file and your data. If the data size is large, include a smaller sample. In your notebook, clearly state your full name and student number. Include links to the home pages of the APIs used. Submission deadline is Monday 23rd March 2020. Late submissions will incur deductions as specified, with no submissions accepted after 10 days without appro

Dr. Jack HW Helper · Accepted Answer

The following paper presents a comprehensive approach to collecting, preprocessing, and analyzing data obtained from open web APIs, illustrating the process through the example of weather data from a public API. It emphasizes careful selection of suitable APIs, efficient data collection, meticulous data cleaning, and insightful analysis, demonstrating the value of integrating and visualizing datasets for meaningful insights. Introduction In the era of big data, open web APIs provide an accessible gateway to vast repositories of real-time and historical data, enabling researchers and developers to build data-driven applications. Selecting an appropriate API depends on the research question and the nature of data needed. For example, weather data APIs offer crucial insights for climate studies, urban planning, or health analytics. This paper discusses an example involving the collection and analysis of weather data from a publicly available API, illustrating key steps including data identification, collection, preprocessing, analysis, and visualization within a Jupyter Notebook environment. Data Identification The initial step involves choosing relevant APIs. Open APIs such as OpenWeatherMap, Weather API, or similar services provide historical weather data, often through free tiers with limitations. Criteria for selection include data availability, API reliability, documentation quality, and usage constraints. In this case, a weather data API was selected due to its comprehensive historical datasets, albeit with restrictions like rate limits and trial periods. The API chosen provided historical weather data for Dublin from July 2008 onwards, via CSV format, accessible through an API key after registration. Data Collection The core of data collection involves programmatically querying the API to retrieve data for desired timeframes and locations. Python offers libraries like urllib or requests to handle HTTP requests seamlessly. Due to API call limitations, multiple re

COMP47670 Assignment 1: Data Collection And Preparation

COMP47670 Assignment 1: Data Collection & Preparation

Paper For Above instruction

Introduction

Data Identification

Data Collection

Data Preprocessing and Cleaning

Data Analysis and Visualization

Discussion and Insights

Conclusions

References