Course Taxonomy: Data & AI

Introduction to Data Analysis

Part 1: Data and Information

  1. Data in the Real World
  2. Data vs. Information
  3. The Many “Vs” of Data
  4. Structured Data and Unstructured Data
  5. Types of Data

Part 2: Data Analysis Defined

  1. Why Do We Analyze Data?
  2. Data Analysis Mindset
  3. Data Analysis Steps
  4. Data Analysis Defined
  5. Descriptive Statistics vs. Inferential Statistics

Part 3: Types of Variables

  1. Categorical vs. Numerical
  2. Nominal Variables
  3. Ordinal Variables
  4. Interval Variables
  5. Ratio Variables

Part 4: Central Tendency of Data

  1. (Arithmetic) Mean
  2. Median
  3. Mode

Part 5: Basic Probability

  1. Uses of Probability in Business
  2. Ways We Can Calculate Probability
  3. Probability Terms
  4. Calculating Probability
  5. Calculating Probability from a Contingency Table
  6. Conditional Probability
  7. Frequency Distribution

Part 6: Distributions, Variance, and Standard Deviation

  1. Discrete Distributions
  2. Continuous Distributions
  3. Range
  4. Quartiles
  5. Variance
  6. Standard Deviation
  7. Population vs. Sample
  8. Application of the Standard Deviation
    • Standard Deviation and the Normal Distribution
    • Sigma (σ) Values (Standard Deviations)
  9. Bimodal Distribution
  10. Skew and Summary
  11. Other Distributions
    • Poisson Distribution
    • Exponential Distribution
    • Pareto Distribution (“80/20”)
    • Log-Normal Distribution
  12. Distributions in Excel

Part 7: Fitting Data

  1. Bivariate Data (Two Variables)
  2. Covariance and Correlation
  3. Simple Linear Regression
  4. Linear Regression
  5. Fitting Functions
    • Linear Fit
    • Polynomial Fit
    • Power-Law Fit

Part 8: Predictive Analytics Overview

  1. Monte Carlo Method

Data Analysis Boot Camp

Part 1: The Value and Challenges of Data-Driven Disruption

  1. Objectives and expectations
  2. Hurdles to becoming a data-driven organization
  3. Data empowerment
  4. Instilling data practices in the organization
  5. The CRISP-DM model of data projects

Part 2: Tying Data to Business Value

  1. What constitutes data-driven value
  2. Requirements gathering: How to approach it
  3. Kanban for data analysis
  4. Know your customers
  5. Stakeholder cheat sheets
  • EXERCISE: Data-driven project checklist
  • LAB: Data analysis techniques: Aggregations

Part 3: Understanding Your Data

  1. Data defined
  2. Data versus information
  3. Types of data
    1. Unstructured vs. Structured
    2. Time scope of data
    3. Sources of data
  4. Data in the real world
  5. The 3 V’s of data
  6. Data Quality
    1. Cleansing
    2. Duplicates
    3. SSOT
    4. Field standardization
    5. Identify sparsely populated fields
    6. How to fix common issues
  • LAB: Prioritizing data quality

Part 4: Analyzing Data

  1. Analysis foundations
    1. Comparing programs and tools
    2. Words in English vs. data
    3. Concepts specific to data analysis
    4. Domains of data analysis
    5. Descriptive statistics
    6. Inferential statistics
    7. Analytical mindset
    8. Describing and solving problems
  2. Averages in data
    1. Mean
    2. Median
    3. Mode
    4. Range
  3. Central tendency
    1. Variance
    2. Standard deviation
    3. Sigma values
    4. Percentiles
  4. Demystifying statistical models
  5. Data analysis techniques
  • LAB: Central tendency
  • LAB: Variability
  • LAB: Distributions
  • LAB: Sampling
  • LAB: Feature engineering
  • LAB: Univariate linear regression
  • LAB: Prediction
  • LAB: Multivariate linear regression
  • LAB: Monte Carlo simulation

Part 5: Thinking Critically About Your Analysis

  1. Descriptive analysis
  2. Diagnostic analysis
  3. Predictive analysis
  4. Prescriptive analysis

Part 6: Data Analysis in the Real World

  1. Deployment of analyses
  2. Best practices for BI
  3. Technology ecosystems
    1. Relational databases
    2. NoSQL databases
    3. Big data tools
    4. Statistical tools
    5. Machine learning
    6. Visualization and reporting tools
  4. Making data usable

Part 7: Data Visualization & Reporting

  1. Best practices for data visualizations
    1. Visualization essentials
    2. Users and stakeholders
    3. Stakeholder cheat sheet
  2. Common presentation mistakes
  3. Goals of visualization
    1. Communication and narrative
    2. Decision enablement
    3. Critical characteristics
  4. Communicating data-driven knowledge
    1. Formats and presentation tools
    2. Design considerations

Part 8: Hands-On Introduction to R and RStudio

  1. What is R?
  • LAB: Intro to RStudio
  • LAB: Univariate linear regression in R
  • LAB: Multivariate linear regression in R