Certified Specialist Business Intelligence (CSBI) Reflection
Part 5 of 6

CSBI Course 5: Business Intelligence and Analytical and Quantitative Skills
● Thinking about the Basics
● The Basic Elements of Experimental Design
● Sampling
● Common Mistakes in Analysis
● Opportunities and Problems to Solve
● The Low Severity Level ED (SL5P) Case Setup as an Example of BI Work
● Meaningful Analytic Structures

Analysis and Statistics

A key aspect of the work of the BI/analytics consultant is analysis. Analysis can be defined as how data is turned into information; information is the outcome when data is analyzed correctly. Rigorous analysis offers the best chance of creating the sharpest picture of what the data might reveal, and it is the product of the proper application of statistics and experimental design.
Statistics encompasses a complex and detailed set of disciplines, and statistical concepts are foundational to all descriptive, predictive, and prescriptive analytic applications. Even so, the application of simple descriptive statistical calculations yields a great deal of usable information for transformational decision-making, and the value of that information is amplified when the same simple statistics are used within the context of a well-designed experiment.
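As an illustration of how far simple descriptive statistics can go, the short sketch below summarizes a set of days-to-payment figures using only Python's standard library; every value in it is invented for illustration.

```python
# Hypothetical days-to-payment figures; all values are invented.
import statistics

days_to_payment = [28, 35, 41, 30, 52, 33, 47, 29, 38, 44]

mean = statistics.mean(days_to_payment)      # central tendency
median = statistics.median(days_to_payment)  # middle value, robust to outliers
stdev = statistics.stdev(days_to_payment)    # sample standard deviation (spread)

print(f"mean={mean:.1f}, median={median:.1f}, stdev={stdev:.1f} days")
```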
This module is not designed to teach statistics itself. It is designed to place statistical work within the appropriate context so that it can be leveraged most effectively in driving organizational performance, and it provides an important review of the basic knowledge needed for work with descriptive and inferential statistics.

The Basic Elements of Experimental Design

Analytic tools can also provide an enhanced ability to conduct experiments. More than just allowing analysis of the output of activities or processes, experiments can be performed on the processes themselves. Experimenting on processes is a movement beyond the traditional realm of report-writing analysis (the collection and analysis of data without applying changes to factors to find differences that are not random variations) and observational studies.
This leads to performance improvements, because it enables decision-management recommendations and guidance on future actions. In experiments, the focus is on carrying out specific, orderly procedures to verify, refute, or establish the validity of one or more thoughts (hypotheses) about what might happen in a given situation: for example, what happens to collections when a method or process such as accelerated denial review is manipulated. We think we know, of course, because we have observed changes implemented in the past. However, an experiment is needed to ensure that the change is indeed significant, that the improvement is not just random positive variation, and that the new procedure is not a waste of resources.
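A minimal sketch of such a test, assuming scipy is available and using made-up samples of days-to-collection under the current process and under accelerated denial review:

```python
# Did accelerated denial review really shorten days-to-collection,
# or is the observed gain just random variation? Samples are made up.
from scipy import stats

current = [52, 48, 61, 55, 50, 58, 49, 53, 60, 47]      # existing process
accelerated = [44, 41, 50, 39, 46, 43, 48, 42, 45, 40]  # new process

# Welch's two-sample t test (does not assume equal variances).
t_stat, p_value = stats.ttest_ind(current, accelerated, equal_var=False)
print(f"t={t_stat:.2f}, p={p_value:.4f}")
if p_value < 0.05:
    print("The difference is unlikely to be random variation at the 5% level.")
else:
    print("No evidence the change beats random variation.")
```

Welch's version is used here because the two groups need not share the same variance.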
The experiment provides insight into true cause and effect by demonstrating what outcome occurs when a factor is manipulated. This is greatly enhanced by the power of predictive analytics: prediction allows the performance of what-if and offline scenarios, and the parameters of each positive what-if can subsequently be tried in real time to look for proof of the effect. With experimental design, then:
● Better targets, metrics, or KPIs can be established, because what is possible from a process is now more fully understood within agreed levels of confidence (illustrated in the sketch below).
● Parameters, rules, or recommendations can be implemented to guide decisions in real-time dynamics toward desired, favorable, or anticipated results and the attainment of targets or goals.
● Decision-making attains greater precision and speed.
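To make the first point concrete, this sketch computes a t-based 95% confidence interval around a sample mean, the kind of band within which a target might be set; the sample values are hypothetical and scipy is assumed to be available.

```python
# A t-based 95% confidence interval around a sample mean, usable as an
# evidence-based band for a target or KPI. Sample values are hypothetical.
import math
import statistics
from scipy import stats

sample = [44, 41, 50, 39, 46, 43, 48, 42, 45, 40]  # days to collection
n = len(sample)
mean = statistics.mean(sample)
sem = statistics.stdev(sample) / math.sqrt(n)  # standard error of the mean
t_crit = stats.t.ppf(0.975, df=n - 1)          # two-sided 95% critical value

low, high = mean - t_crit * sem, mean + t_crit * sem
print(f"95% CI for mean days-to-collection: {low:.1f} to {high:.1f}")
```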
Because there is natural variation that must be considered and dealt with, bias must be eliminated, and working one time is not enough. The goal is to implement things that work; things that work are real, and real things are replicable. Here is a three-step process:
1. Consider the question that should be answered and possible ideas about what the answers might be: the hypothesis. For example, a new methodology to speed collections might be needed because the organization is not meeting the benchmark it tracks.
2. Consider the sample to be tested and the data collection.
3. Design a proper experiment, taking into consideration the variation, bias, and replication needed in collecting the data.

Example

There are four new process ideas by which to engage collections activity for accounts. From the general population, accounts that fit the desired sample size have been randomly selected. Credit-score groups are created, and the parties responsible for the accounts are assigned to them: the lowest four scores in group one, and so on through the highest scores. Then each member of each credit-score group is randomly assigned to use one of the new processes in collecting their account. A control group in which current collection processes are continued should also be maintained.
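One way this randomized block assignment might be sketched in code, with hypothetical account data and pure standard-library Python:

```python
# Randomized block design: block accounts by credit score, then randomly
# assign each member of a block to one of four new collection processes
# or the control group. All account data here is hypothetical.
import random

random.seed(42)  # fixed seed so the illustration is reproducible

accounts = [{"id": i, "credit_score": random.randint(300, 850)} for i in range(100)]
arms = ["process_A", "process_B", "process_C", "process_D", "control"]

# Sort by credit score so each consecutive block holds similar scores.
accounts.sort(key=lambda a: a["credit_score"])
for start in range(0, len(accounts), len(arms)):
    block = accounts[start:start + len(arms)]
    random.shuffle(arms)  # random assignment order within the block
    for account, arm in zip(block, arms):
        account["arm"] = arm

print(accounts[0])   # lowest-score account and its assigned arm
print(accounts[-1])  # highest-score account and its assigned arm
```

Blocking by score before randomizing keeps credit quality from confounding the comparison between arms.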
Sampling

Correct Targeting

The objective of the experiment is to make inferences about the population. However, the population may be too large to test in its entirety; if so, a sample is needed. A sample is the data set used to make inferences about the entire population. Again, it is important to have a firm grasp on what one is trying to answer, because without this understanding an incorrect population might be targeted. For example, if the effort is focused on speeding collections overall, then some form of continuous sampling of all accounts is in order; if it concerns only slow-pay accounts, then a very different sample is needed, and perhaps a different approach and experiment time frame.

Representative Samples

Be sure that samples are representative of the population. If they are not, conclusions cannot be drawn, since the results would differ from those for the entire population. This leads to the idea of sampling risk. There are two types of sampling risk:
1. The risk of incorrect acceptance of the research hypothesis: the sample yields a conclusion that supports a theory about the population when that theory does not hold in the population.
2. The risk of incorrect rejection: the sample yields a conclusion that rejects a theory about the population when the theory holds true in the population.

Please note: the risk of incorrect rejection (2) is more concerning than the risk of incorrect acceptance (1). Consider this example. An experimental drug is tested for debilitating side effects (hypothesis: the drug has debilitating side effects). With incorrect rejection, the researcher concludes that the drug has no negative side effects; the entire population takes the drug believing it is safe, and members of the population suffer the consequences of the researcher's mistake. With incorrect acceptance, the researcher concludes that the drug has debilitating side effects when in truth it does not; the population abstains from taking the drug, and no one is harmed.

Practicability

The practicability of the sampling must also be considered. Statistical sampling techniques allow one to estimate the number of samples needed, which speaks to the availability of subjects, the duration of the study, the workforce the study demands, the materials, tools, and equipment required, ethical concerns, and perhaps whether the study is warranted at all if these costs are too high.

Modeling Methods

When determining sample size, remember first that each situation is different and calls for different statistical modeling methods. Every modeling method has its own sampling rules, and these rules intersect in designing the experiment: one must match the modeling method to the question being asked and develop an appropriate sample for it.

Example

A t test is used when you cannot measure an entire population yet want a level of confidence that what your sample indicates is true of, or scales to, that population. A two-sample t test (a test regularly used in comparative testing experiments in healthcare) allows sample sizes ranging from as few as six up to 920, depending on the confidence level desired in the result: roughly a 25% confidence level with six and 99% confidence with 920.
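As a sketch of such a sample-size estimate, assuming the statsmodels library and purely illustrative inputs for effect size, significance, and power:

```python
# Estimate the per-group sample size a two-sample t test needs.
# Requires statsmodels; effect size, alpha, and power are assumptions.
from statsmodels.stats.power import TTestIndPower

n_per_group = TTestIndPower().solve_power(
    effect_size=0.5,  # assumed standardized difference (Cohen's d)
    alpha=0.05,       # risk of concluding an effect exists when it does not
    power=0.8,        # chance of detecting the effect if it is real
    alternative="two-sided",
)
print(f"About {n_per_group:.0f} subjects are needed per group.")
```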
Common Mistakes in Analysis

Believing Sophistication Compensates

The first common mistake is believing that sophistication compensates for a lack of data and/or business understanding. The convenience of available applications can lead to a temptation to supplement missing data or business understanding with sophisticated statistics. This results in incorrect approaches to analytics and problem-solving. It is necessary to understand the business, the problem being addressed, the process, and the data underneath the process, and only then apply analytic tools.

Tool selection is important as well. Do not use tools that are inappropriate to the task: for example, using linear programming to fix resource use in relation to a "fixed average volume demand" when the clinical unit whose decision-making the information is meant to improve is one where volume demand is naturally quite variable. Such a model will continually require adjustment (likely daily) to remain accurate. Meanwhile, pick a department for which demand is quite stable, say outpatient therapy, where healthcare users are waiting or booked to start many weeks into the future. This is stable demand, and linear programming might work in relation to some resources and situations, but not others. One needs to determine whether a decision is about planning a time frame in order to vary capacity (increase or decrease it); if so, one needs to ascertain how well linear programming will aid long-term adjustments. It is important to run tests in advance (experiments, not observational studies) comparing one's models with actual results, and to use actual numbers in the models. If the model is off by 5% to 10%, rethink its application. It is crucial to develop business and data understanding before getting started and to experiment along the way to ensure appropriate use of tools and techniques.

Isolating and Explaining Meaningful Patterns

It is often difficult to isolate and explain the meaningful patterns shown by the data, and in attempting to explain everything detected, there is a danger of mistaking randomness, the "noise" in the system, for signal. Suppose descriptive information shows volume increasing over the last eight months. Does this mean expansion is required? In the quest for growth, some might look quickly in that direction. But is this a random situation? Is it part of a long-wave cycle? Is it a temporary shift due to external economic factors, as an area bounces back from recession? None of these would call for expansion. Here, knowledge of the data, the environment, and statistics, together with appropriate tools, is crucial, along with a willingness to move beyond the intuitive guess.

Correlation vs. Causation

When correlation is shown, it does not mean the independent variable is the driver or cause. The data may show that inpatient admissions with a chronic heart failure (CHF) diagnosis rise a few days after every holiday. Are holidays the cause? They may be. However, one must isolate all the reasons CHF admissions increase and test whether those factors are more prevalent during holidays. Has one considered everything that occurs around holidays? Even if a full moon correlated with certain activity, would it really be the cause? How could it be proven? These examples bring up the topic of hypothesis testing: when considering causation, one must develop a list of reasonable business predictions (hypotheses) for the results being studied. In summary, common mistakes in analysis include:
1. Being unaware of the need for experiments.
2. Using the wrong tool:
● Making do with a one-size-fits-all tool
● Using visualization tools and thinking this will address analytic needs
● Utilizing tools that require known and/or stable demand
● Using non-predictive tools for predictions
3. Improper consideration of system dynamics:
● Volume demand fluctuations
● Dependencies
● Resource demand and supply
● Long-term and short-term frames
● Seasonality
● Time of day and day of week
● Environmental trends
4. Not understanding the business.

Overfitting/Underfitting

In statistics, overfitting is "the production of an analysis that corresponds too closely or exactly to a particular set of data, and may therefore fail to fit additional data or predict future observations reliably." An overfitted model is a statistical model that contains more parameters than can be justified by the data; the essence of overfitting is to have unknowingly extracted some of the residual variation (i.e., the noise) as if that variation represented underlying model structure. Underfitting occurs when a statistical model cannot adequately capture the underlying structure of the data; an underfitted model is missing parameters or terms that would appear in a correctly specified model. Underfitting would occur, for example, when fitting a linear model to non-linear data. Such a model will tend to have poor predictive performance.

Resource Application/Allocation

The organization supplies fewer resources than are really needed for the task that is expected to be performed.

Opportunities and Problems to Solve

Opportunity Identification and Selection

Stephen R. Covey, the author of "The 7 Habits of Highly Effective People," pointed out that people often lose sight of what is important in the daily rush of taking care of urgent matters. Organizations are no different, and in complex ones such as hospitals and other large-scale healthcare providers, the difficulties may be amplified: there are high volumes of activity and change, and so much data to look at. What matters? As discussed earlier in the course, incoming requests to the BI/analytics consultant or team can arrive all at once, overwhelming the team's capacity and lowering productivity and results. Of greater importance, the team may be faced with many requests of debatable value, meaning time is taken up without surfacing those of real, appreciable value. Here the BI/analytics team must undertake two tasks.

First Task

1. Develop and engage a process for surfacing meaningful, high-value analytic activity. In particular, the analytics team:
● Should focus on how to support the organization's strategic initiatives and direction (direction is a two-edged thought: organizations can have a strategic direction that is embedded in process and between the specific written lines of the formal plan, yet documentation and actions point out a direction to be probed).
● Should ensure that a significant level of specific action is focused on things that are going well, so the organization can catapult forward and find new opportunities to move ahead.
● Should not always focus on poor-performance areas (although these should not be ignored either), because doing so makes forward movement difficult.
Reflection Paper
Analyzing business intelligence (BI) through the lens of experimental design, statistical analysis, and opportunity prioritization is crucial for organizations aiming to enhance decision-making processes and organizational performance. This comprehensive approach hinges on understanding how data transforms into actionable insights, the importance of correctly designing experiments, and the strategic selection of analytic opportunities that align with organizational goals.
The Role of Analysis in Business Intelligence
Analysis in BI involves converting raw data into meaningful information. When executed correctly, analysis provides a clear view of the underlying patterns, trends, and relationships within data sets. Rigorous analysis is rooted in statistical methods, both descriptive and inferential, which serve as the foundation for deriving insights that inform strategic decisions. Simple statistical calculations, such as means, medians, and standard deviations, can uncover significant insights, especially when embedded within well-designed experiments that enable causal inference.
Understanding Experimental Design in BI
Experimental design extends beyond basic analysis by enabling organizations to test hypotheses about processes and outcomes. Moving from observational studies to controlled experiments permits organizations to manipulate specific process variables, such as speeding up collections procedures, to ascertain their true impact. Properly structured experiments adhere to principles that control for variability, bias, and confounding factors, ultimately providing more reliable evidence for decision-making. For instance, randomly assigning different collections methods to various credit score groups allows organizations to determine whether a new process genuinely improves collection rates or if observed improvements are due to chance.
Executing experiments involves detailed planning: specifying the research question, selecting representative samples, and designing procedures to minimize bias. The goal is to produce data that is both valid and reliable, enabling the organization to establish cause-and-effect relationships with confidence. As predictive analytics becomes more prevalent, organizations can simulate “what-if” scenarios, optimizing process adjustments before implementing them in real time.
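Under wholly assumed distributions and parameters, such a what-if comparison might be sketched as follows, simulating a current and a proposed process offline before any live trial:

```python
# Offline "what-if": compare expected days-to-collection under current
# vs. proposed process parameters. Distributions and numbers are assumed.
import random
import statistics

random.seed(7)

def simulate(mean_days, spread, n=10_000):
    """Draw simulated collection times from an assumed normal process."""
    return [max(0.0, random.gauss(mean_days, spread)) for _ in range(n)]

current = simulate(mean_days=52, spread=8)
proposed = simulate(mean_days=45, spread=8)  # what-if: faster review step

print(f"current:  mean={statistics.mean(current):.1f} days")
print(f"proposed: mean={statistics.mean(proposed):.1f} days")
```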
Sampling and Its Significance
Sampling is critical for inference, especially when testing the entire population is impractical or impossible. A representative sample must accurately reflect the population to avoid sampling bias, which threatens the validity of conclusions. Two types of sampling risk—incorrect acceptance and rejection—must be managed, with the former being less concerning than the latter in decision-critical contexts such as drug safety testing. Ensuring the sample’s representativeness involves careful consideration of sampling frame, size, and methodology, including techniques such as simple random sampling and stratified sampling.
Sample size determination depends on the statistical method used—e.g., t-tests for comparison— and must balance practicability with statistical power. For example, small samples might suffice for certain tests but can increase the likelihood of errors if improperly selected. Thus, organizations need to weigh factors such as cost, time, and ethical considerations when designing sampling processes.
Common Mistakes and How to Avoid Them
One common pitfall in BI analysis is the inappropriate application of sophisticated tools without sufficient understanding of the underlying business problems. For example, applying complex models like linear programming to demand-variable processes without proper validation can produce misleading results. Analytic efforts should begin with a thorough grasp of the business context, data quality, and process dynamics. Additionally, selecting tools aligned with the specific questions—predictive tools for forecasting, descriptive tools for understanding historical data—is essential to avoid misinterpretation.
Another challenge arises in isolating and explaining meaningful patterns. Data trends must be scrutinized to distinguish genuine signals from noise, requiring both domain knowledge and statistical expertise. For instance, a rise in hospital admissions post-holidays warrants testing to determine causality rather than assuming direct causation based solely on correlation.
Distinguishing Correlation from Causation
It is vital to recognize that correlation does not imply causation. Just because two variables move together does not mean one causes the other. For example, an observed increase in CHF hospitalizations after holidays might be related to seasonal factors, staffing schedules, or other external influences. Testing hypotheses that specify potential causal relationships—such as increased stress levels during holidays—can clarify underlying mechanisms, guiding effective interventions.
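The sketch below, built on synthetic data, shows how two series that merely share a trend can correlate strongly with no causal link between them:

```python
# Two synthetic series that share only an upward trend still correlate
# strongly; the high r proves nothing about causation.
import random

random.seed(1)
months = range(24)
admissions = [100 + 2 * m + random.gauss(0, 5) for m in months]
cafeteria_sales = [500 + 9 * m + random.gauss(0, 20) for m in months]

def pearson(x, y):
    """Plain Pearson correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

print(f"r = {pearson(admissions, cafeteria_sales):.2f}  # high, yet not causal")
```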
Overfitting, Underfitting, and Resource Allocation
Overfitting occurs when models are too complex, capturing random noise as if it were genuine patterns, thereby failing to predict future data accurately. Conversely, underfitting results from overly simplistic models that miss important data patterns. Both conditions impair decision-making and resource planning, which is critical in healthcare settings where resource constraints are common. Accurate model specification, validation, and ongoing testing are necessary to ensure models are both parsimonious and effective.
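A small sketch of both failure modes on synthetic data, assuming numpy is available: comparing held-out prediction error across polynomial degrees exposes both a model that is too simple and one that chases noise.

```python
# Compare model complexities on synthetic noisy non-linear data: holding
# out points exposes underfitting (too simple) and overfitting (too
# complex) through the held-out prediction error.
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 1, 40)
y = np.sin(2 * np.pi * x) + rng.normal(0, 0.3, size=x.size)  # noisy signal

train, test = np.arange(0, 40, 2), np.arange(1, 40, 2)  # alternate-point split

for degree in (1, 3, 9):
    coeffs = np.polyfit(x[train], y[train], degree)
    held_out_mse = float(np.mean((np.polyval(coeffs, x[test]) - y[test]) ** 2))
    print(f"degree={degree}  held-out MSE={held_out_mse:.3f}")
```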
Identifying and Prioritizing Opportunities
Effective BI implementation hinges on selecting high-value opportunities aligned with organizational strategies. This process involves creating a structured methodology to surface areas with the greatest potential impact. Leaders should focus on initiatives supporting strategic objectives such as improving patient safety, enhancing population health, or optimizing operational efficiency. Prioritization tools like matrices facilitate tradeoff analysis, ensuring that limited resources target projects offering the highest return on investment.
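As a minimal sketch of such a prioritization matrix, with hypothetical requests and invented 1-to-5 scores, ranking by value per unit of effort:

```python
# A simple value/effort matrix over hypothetical analytic requests,
# ranked by value per unit of effort (scores are invented, scale 1-5).
requests = [
    {"name": "Readmission risk model", "value": 5, "effort": 4},
    {"name": "Ad-hoc volume report", "value": 2, "effort": 1},
    {"name": "Denial-review experiment", "value": 4, "effort": 2},
]

for req in sorted(requests, key=lambda r: r["value"] / r["effort"], reverse=True):
    score = req["value"] / req["effort"]
    print(f'{req["name"]:26s} value/effort = {score:.2f}')
```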
In healthcare settings, opportunities span financial performance, clinical outcomes, and operational processes.