Make Sure You've Read And Understand These Basic Pages
Make Sure Youve Read And Understand These Pagesbasic Project Requir
Make sure you've read and understand these pages: · Basic Project Requirements (Read This) · Week 2 - Keeping Data Separate from Analysis Project Instructions: This project focuses on data wrangling and analysis using pivot tables and complex formulas. The dataset is scraped from IMDB websites and was downloaded from early 2017. Download and load the dataset "movie_metadata.csv" into Excel. The submission must be in .xlsx format.
Part A Wrangle:
1. Extract a specific character sequence ("tt") from the "Movie_IMDB_Link" URL to create a unique primary key for each row. Place this in a new column called "Primary Key" as the first column, leave the original column untouched, and create a named range "PrimaryKey" for this column.
2. Format the dataset as an Excel table and create named ranges for "Director," "Country," "Gross," "Budget," and "Title Year."
3. Copy the "Primary Key" and "genres" columns to a new worksheet called "Genre worksheet," then use an Excel feature to assign each genre its own column labeled "Genre1," "Genre2," etc. Count how many movies are described with at least three genres.
4. Remove any special characters from the "movie_title" column.
5. Format the "Budget" column as U.S. Dollars. Apply conditional formatting to highlight differences between amounts. Reflect on whether dollar formatting is appropriate and justify your answer.
Part B Analysis:
6. Create a "Countries" worksheet listing distinct countries. Use formulas to count movies by country, rank them, and identify the top five countries by number of movies. Similarly, analyze total gross revenue and budgets by country, rank them, and compare top countries based on these measures.
7. Discuss potential issues in using totals for gross revenue and budgets across countries and suggest solutions.
8. Create an "Actors" worksheet listing all unique actors from three actor columns, count their total appearances, and flag actors with more than 25 films. Count how many actors appear in 30 or more films.
9. In a "Directors" worksheet, use a pivot table to analyze the number of movies directed in the US from 2010 onwards, counting the primary key in each year.
10. Develop a complex, original analysis related to the data, describing the problem, methodology, and conclusions.
Ensure data organization allows for easy updates and refreshes without breaking formulas. Leave formulas visible in cells so the process can be traced. Use the learning module techniques to complete formulas and analyses. Filename format: projectOne_LastName_FirstName.xlsx. All work must be your own, with proper citations if external sources are used.