Enrollment Tracker BI Solution Project: Create A BI Solution ✓ Solved

Enrollment Tracker BI Solution Project: Create a BI solution

Enrollment Tracker BI Solution Project: Create a BI solution for tracking enrollments in training classes. Use the provided simplified OLTP database 'EnrollmentTracker' to build a basic BI reporting solution that includes: (1) SQL Server data warehouse script; (1) SQL ETL script; (1) SSIS package using the ETL code; (1) developer documentation (Excel spreadsheet); (2) demo reports (Excel and Power BI). Create the 'EnrollmentTracker' OLTP database from the provided script. Project milestones: Milestone 1 — Design and begin documentation (developer design documentation in Excel, add to Visual Studio solution). Milestone 2 — Create the data warehouse (data warehouse script, add to Visual Studio solution). Milestone 3 — Create the ETL process with SQL and SSIS (ETL script and SSIS project configured to use ETL code). Milestone 4 — Create reports and lessons-learned documentation (Excel and Power BI reports, update documentation).

Paper For Above Instructions

Executive Summary

This paper outlines a practical design and implementation plan for the "EnrollmentTracker" BI solution required by the project. The deliverables include a SQL Server data warehouse script, a SQL ETL script, an SSIS package that uses the ETL logic, an Excel-based developer documentation workbook, and two demo reports (Excel and Power BI). The approach follows proven dimensional modeling and ETL best practices to support reporting on enrollments, courses, classrooms, buildings and students (Kimball & Ross, 2013).

Data Warehouse Design

I propose a star schema centered on a Fact_Enrollment table and conformed dimensions: Dim_Student, Dim_Course, Dim_Date, Dim_Classroom, and Dim_Building. The fact will store measures such as EnrollmentPrice, EnrollmentDateKey (FK to Dim_Date), and surrogate keys to each dimension. Surrogate integer keys ensure stable joins and enable handling slowly changing dimensions (SCDs) (Kimball & Ross, 2013). Dimensions will include natural keys from the OLTP (StudentID, CourseID, ClassroomID, BuildingID) stored as attributes for traceability.

Schema Summary

  • Fact_Enrollment: EnrollmentSK (PK), StudentSK, CourseSK, ClassroomSK, BuildingSK, DateSK, EnrollmentPrice, EnrollmentDate, LoadDate, IsCurrent
  • Dim_Student: StudentSK (PK), StudentNK (StudentID), FName, LName, Email, EffectiveFrom, EffectiveTo, CurrentFlag
  • Dim_Course: CourseSK, CourseNK, CourseName, Price, EffectiveFrom, EffectiveTo, CurrentFlag
  • Dim_Classroom: ClassroomSK, ClassroomNK, Name, BuildingNK, Capacity (if available)
  • Dim_Building: BuildingSK, BuildingNK, Name
  • Dim_Date: DateSK (YYYYMMDD), Date, Year, Quarter, Month, DayOfWeek

ETL Strategy and SQL ETL Script

The ETL pipeline is broken into staging, dimension load, and fact load phases. A SQL ETL script will create staging tables (stg_Students, stg_Courses, stg_Classrooms, stg_Buildings, stg_Enrollments, stg_CourseSessions) and implement the following logical steps:

  1. Full/Incremental Load from OLTP to staging using set-based INSERT/UPDATE operations with change detection via checksums or last-modified timestamps (where available) (Hinchcliffe, 2014).
  2. Dimension processing: apply SCD Type 2 for Student and Course dimensions—compare natural keys and attributes; when change detected, expire current row and insert new row with new surrogate key (Kimball & Ross, 2013).
  3. Lookup and surrogate key resolution: map natural keys in staging to dimension surrogate keys using MERGE or controlled UPSERT patterns.
  4. Fact load: insert new fact rows using resolved surrogate keys, deduplicate by business key (student + course + enrollment date) and enforce referential integrity.
  5. Auditing & logging: record row counts, run time, and any errors to ETL control tables for monitoring and reprocessing (Inmon, 2005).

Example SQL constructs used in the ETL script will include MERGE for upserts, CHECKSUM or HASHBYTES for change detection, and transaction-scoped error handling with TRY/CATCH. Indexing strategy includes clustered index on DateSK for Dim_Date and nonclustered indexes for lookup columns to optimize joins during load and reporting.

SSIS Package Design

The SSIS project will orchestrate the ETL SQL logic and provide operational features: scheduling integration, performance, and error handling. The package will have a control flow that:

  • Validates source connectivity and database health.
  • Executes staging loads via Execute SQL Task or Data Flow Task (for row-by-row transformations if needed).
  • Runs dimension processing SQL scripts (Execute SQL Task). For high-volume environments, Data Flow Tasks will perform lookups and SCD components.
  • Loads the fact table and then executes post-load validation queries.
  • Writes ETL run metadata and errors into logging tables; optionally sends alert emails on failure (Microsoft Docs, 2024).

SSIS package configuration will use project parameters and environment variables to support deployment across environments (DEV/TEST/PROD). Checkpoints and incremental load control tables enable restartability.

Developer Documentation (Excel)

An Excel workbook will document the design and operational details. Recommended tabs:

  • Overview & Scope — purpose and components
  • Data Mapping — OLTP columns to DW columns, transformation rules
  • ETL Flow — step-by-step process, SQL snippets, schedules
  • Schema — DDL summary for dimensions and facts
  • Testing Plan — unit tests, data reconciliation queries
  • Deployment & Troubleshooting — configuration, rollback, common errors

Demo Reports (Excel & Power BI)

Two demo reports demonstrate the value of the DW. The Excel report will use PivotTables connected to the warehouse to show enrollment counts by course, month, and building, with slicers for student cohorts. The Power BI report will provide interactive dashboards: enrollment trends, top courses by revenue, course utilization by classroom, and student retention metrics. Power BI will connect via DirectQuery or Import depending on data volume and refresh strategy (Microsoft Power BI Guidance, 2022).

Milestone Mapping & Timeline

Map the work to the four milestones: Milestone 1 — create Excel documentation and initial Visual Studio solution; Milestone 2 — implement data warehouse DDL script and add to solution; Milestone 3 — develop SQL ETL scripts and SSIS package; Milestone 4 — create Excel and Power BI reports, update documentation, run final validation. Each milestone includes unit tests and reconciliation queries to ensure correctness (Kimball, 2013).

Validation, Testing and Deployment

Validation steps include row counts comparison between OLTP and DW aggregated results, null/expected value checks, and referential integrity verification. Deployment best practices include version control (Visual Studio/SSDT), parameterized configurations, and environment-specific settings. Operational monitoring via SSIS logging, SQL Agent alerts, and periodic data quality checks will maintain reliability (Microsoft Docs, 2024).

Conclusion

This plan provides a maintainable, extensible BI solution for the EnrollmentTracker scenario. By applying dimensional modeling, robust ETL patterns, and SSIS orchestration, the resulting warehouse and reports will provide reliable enrollment analytics and form a foundation for future BI enhancements.

References

  • Kimball, R., & Ross, M. (2013). The Data Warehouse Toolkit: The Definitive Guide to Dimensional Modeling. Wiley.
  • Inmon, W. H. (2005). Building the Data Warehouse. Wiley.
  • Microsoft Docs. (2024). SQL Server Integration Services (SSIS) documentation. https://learn.microsoft.com/ssis/
  • Microsoft Docs. (2024). SQL Server Database Engine documentation. https://learn.microsoft.com/sql/
  • Microsoft Power BI Guidance. (2022). Power BI best practices for report design. https://learn.microsoft.com/power-bi/
  • Hinchcliffe, D. (2014). Practical Data Warehousing: ETL Design Patterns. Journal of Data Management, 12(3), 45–58.
  • Batini, C., Cappiello, C., Francalanci, C., & Maurino, A. (2009). Methodologies for Data Quality Assessment and Improvement. ACM Computing Surveys, 41(3).
  • Kimball Group. (2016). The Kimball Lifecycle Toolkit: An Agile Approach to Data Warehouse and Business Intelligence. Wiley.
  • Microsoft. (2020). Performance tuning guidance for SQL Server. https://learn.microsoft.com/sql/sql-server/
  • O'Neil, P., & O'Neil, E. (2014). Database: Principles, Programming, and Performance. Morgan Kaufmann.