Course Taxonomy: Data & AI

Artificial Intelligence Implementation Boot Camp

Part 1: Introduction

  1. Working definitions: AI, Machine Learning, Deep Learning, Data Science & Big Data
  2. State of AI: summarizing major analysts’ statistics & predictions
  3. Summarizing AI misinformation
  4. Effects on the job market
  5. Today’s AI use cases
    • Where it works well
    • Where it doesn’t work well
  6. What do high-profile uses have in common?
  7. Addressing legitimate concerns & risks

Case study break: We will introduce the class to three real-world use cases – one in finance, one in health science, and one in general operations. In small groups, you will discuss implications of the cases and see if you and your peers can spot any parallel opportunities in your own business.

Part 2: The Big Data Prerequisite

  1. Evaluating your big data practice
  2. State of tools – understanding intelligent big data stacks
    • Visualization and Analytics
    • Computing
    • Storage
    • Distribution and Data Warehousing
  3. Strategically restructuring enterprise data architecture for AI
  4. Unifying data engineering practices
  5. Datasets as learning data
  6. Defeating Bias in your Datasets
  7. Optimizing Information Analysis
  8. Utilizing the IoT to amass a large amount of data

Part 3: Implementing Machine Learning

  1. Examining the pillars of a practicing AI team
    • Business case
    • Domain expertise
    • Data science
    • Algorithms
    • Application integration
  2. Improving Machine Learning Model Management
  3. State of tools – understanding intelligent machine learning stacks
  4. Machine Learning Methods and Algorithms
    • Decision Trees
    • Support Vector Machines
    • Regression
    • Naïve Bayes Classification
    • Hidden Markov Models
    • Random Forest
    • Recurrent Neural Networks
    • Convolutional Neural Networks
  5. Developing Validation Sets
  6. Developing Training Sets
  7. Accelerating Training
  8. Encoding Domain Expertise in Machine Learning
  9. Automating Data Science
  10. Deep Learning

Example: TensorFlow – We will take a look at Google’s TensorFlow as a tool for integrating machine learning features. We’ll come away from the exercise with an understanding of the programming skills needed to leverage TensorFlow and of its impact on normal application workflow.

Part 4: Creating Concrete Value

  1. Opportunities for automation
  2. Understanding automation vs. job displacement vs. job creation
  3. Finding hidden opportunities through improved forecasting
  4. Production and operations
  5. Adding AI to the Supply Chain
  6. Marketing and Sales Applications
    • Predict Customer Behavior
    • Target Customers Efficiently
    • Manage Leads
    • AI-powered content creation
  7. Enhancing UX and UI
  8. Next-Generation Workforce Management
  9. Explaining Results

Use case breakout: Scoring the criteria for three potential applications. In groups, we’ll evaluate application use cases for machine learning: medical imaging, electronic medical records, and genomics. We’ll grade each use case on a scorecard covering the following:

  • Quantity of data
  • Quality of data
  • ML techniques

Part 5: Machine intelligence as part of the customer experience

  1. IoT and the role of machine learning
  2. Projects based on customer & user needs
  3. Handling customer inquiries with AI
  4. Creating empathy-driven customer-facing actions
  5. Narrowing down intent
  6. AI as part of your channel strategy

Part 6: Machine Intelligence & Cybersecurity

  1. How can ML help with security?
    • Advancing cybersecurity analytics
    • Developing defensive strategies
    • Automating repetitive security tasks
    • Closing zero-day vulnerabilities
  2. How are attackers leveraging ML and AI?
  3. Building trust in automated security decisions and actions
  4. Automated application monitoring as a security layer
  5. Identifying Vulnerabilities
  6. Automating Red Team/Blue Team Testing Scenarios
  7. Modeling AI after previous security breaches
  8. Automating and streamlining Incident Responses
  9. How to use deep learning to detect and prevent malware and APTs
  10. Using natural language processing
  11. Fraud detection
  12. Reducing compliance testing & cost

Part 7: Filling the Internal Capability Gap

  1. Assessing your technological and business processes
  2. Building your AI and machine learning toolchain
  3. Hiring the right talent
  4. Developing talent
  5. How to make AI more accessible to people who are not data scientists
  6. Launching pilot projects

Part 8: Conclusion and Charting Your Course

  1. Review
  2. Charting Your Course
    • Establishing a timeline
  3. Open Discussion

Introduction to Data Analysis

Part 1: Data and Information

  1. Data in the Real World
  2. Data vs. Information
  3. The Many “Vs” of Data
  4. Structured Data and Unstructured Data
  5. Types of Data

Part 2: Data Analysis Defined

  1. Why do we analyze data?
  2. Data Analysis Mindset
  3. Data Analysis Steps
  4. Data Analysis Defined
  5. Descriptive Statistics vs Inferential Statistics

Part 3: Types of Variables

  1. Categorical vs Numerical
  2. Nominal Variables
  3. Ordinal Variables
  4. Interval Variables
  5. Ratio Variables

Part 4: Central Tendency of Data

  1. (Arithmetic) Mean
  2. Median
  3. Mode
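
As a rough illustration of the three measures listed in this part, the sketch below computes each one with Python's standard `statistics` module. Python is an assumption made for illustration only (the outline itself names Excel and R), and the sample values are made up.

```python
import statistics

# Hypothetical sample: daily order counts for one week (made-up values).
orders = [12, 15, 15, 18, 22, 15, 40]

mean_orders = statistics.mean(orders)      # sum of the values / number of values
median_orders = statistics.median(orders)  # middle value of the sorted data
mode_orders = statistics.mode(orders)      # most frequently occurring value

print("mean:", mean_orders)
print("median:", median_orders)
print("mode:", mode_orders)
```

In this sample the single large value (40) pulls the mean above the median, which is one reason the three measures are usually examined together.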

Part 5: Basic Probability

  1. Uses of Probability in Business
  2. Ways We Can Calculate Probability
  3. Probability Terms
  4. Calculating Probability
  5. Calculating Probability from a Contingency Table
  6. Conditional Probability
  7. Frequency Distribution
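
The contingency-table and conditional-probability topics above can be sketched in a few lines of Python. Everything in this example is hypothetical: the table, its counts, and the helper name `prob` are illustrative assumptions, not course material.

```python
# Hypothetical contingency table (made-up counts): customers by plan type
# and by whether they renewed.
table = {
    ("basic", "renewed"): 30,
    ("basic", "churned"): 20,
    ("premium", "renewed"): 40,
    ("premium", "churned"): 10,
}

total = sum(table.values())


def prob(plan=None, outcome=None):
    """Probability of the cells matching the given plan and/or outcome."""
    matching = sum(
        count
        for (p, o), count in table.items()
        if (plan is None or p == plan) and (outcome is None or o == outcome)
    )
    return matching / total


p_premium = prob(plan="premium")                    # marginal probability
p_premium_and_renewed = prob("premium", "renewed")  # joint probability

# Conditional probability: P(renewed | premium) = P(premium and renewed) / P(premium)
p_renewed_given_premium = p_premium_and_renewed / p_premium

print(p_premium, p_premium_and_renewed, p_renewed_given_premium)
```

The conditional probability restricts attention to the "premium" row, so it equals 40 out of the 50 premium customers in this made-up table.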

Part 6: Distributions, Variance, and Standard Deviation

  1. Discrete Distributions
  2. Continuous Distributions
  3. Range
  4. Quartiles
  5. Variance
  6. Standard Deviation
  7. Population vs. Sample
  8. Application of the Standard Deviation
    • Standard Deviation and the Normal Distribution
    • Sigma (σ) Values (Standard Deviations)
  9. Bimodal distribution
  10. Skew and Summary
  11. Other Distributions
    • Poisson Distribution
    • Exponential Distribution
    • Pareto Distribution (“80/20”)
    • Log Normal Distribution
  12. Distributions in Excel

Part 7: Fitting Data

  1. Bivariate Data (Two Variables)
  2. Covariance and Correlation
  3. Simple Linear Regression
  4. Linear Regression
  5. Fitting Functions
    • Linear Fit
    • Polynomial Fit
    • Power-Law Fit
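
A minimal sketch of two of the fits listed above, written in plain Python as an assumption for illustration. The function names `linear_fit` and `power_law_fit` and the data are hypothetical. The power-law fit shown here uses one common approach, a linear fit on the logarithms of the data, which minimizes error in log space rather than in the original units.

```python
import math


def linear_fit(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sum of squared x-deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def power_law_fit(xs, ys):
    """Fit y = a * x**b via a linear fit on (log x, log y); needs x, y > 0."""
    b, log_a = linear_fit([math.log(x) for x in xs], [math.log(y) for y in ys])
    return math.exp(log_a), b


# Made-up data constructed to follow each model exactly.
xs = [1, 2, 3, 4, 5]
ys_linear = [3, 5, 7, 9, 11]   # y = 2x + 1
ys_power = [2, 8, 18, 32, 50]  # y = 2 * x**2

slope, intercept = linear_fit(xs, ys_linear)
a, b = power_law_fit(xs, ys_power)

print("linear fit: slope", slope, "intercept", intercept)
print("power-law fit: a", a, "b", b)
```

Real data will not sit exactly on either curve, so in practice the fitted parameters are estimates and the residuals are worth inspecting.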

Part 8: Predictive Analytics Overview

  1. Monte Carlo Method
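
To give a sense of what the Monte Carlo method involves, the sketch below estimates a probability by repeated random simulation and compares it with the exact answer. The dice scenario, the function name `estimate_probability`, and the fixed seed are illustrative assumptions; Python is used here only for illustration.

```python
import random


def estimate_probability(trials, seed=42):
    """Estimate P(sum of two fair dice >= 10) by simulating many rolls."""
    rng = random.Random(seed)  # local generator with a fixed seed for repeatability
    hits = 0
    for _ in range(trials):
        total = rng.randint(1, 6) + rng.randint(1, 6)
        if total >= 10:
            hits += 1
    return hits / trials


estimate = estimate_probability(100_000)
exact = 6 / 36  # 6 of the 36 equally likely outcomes sum to 10 or more

print("simulated:", estimate)
print("exact:", exact)
```

The estimate approaches the exact value as the number of trials grows. The same pattern applies when no closed-form answer is available, which is where the method is typically most useful.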

Data Analysis Boot Camp

Part 1: The Value and Challenges of Data-Driven Disruption

  1. Objectives and expectations
  2. Hurdles to becoming a data-driven organization
  3. Data empowerment
  4. Instilling data practices in the organization
  5. The CRISP-DM model of data projects

Part 2: Tying Data to Business Value

  1. What constitutes data-driven value
  2. Requirements gathering: How to approach it
  3. Kanban for data analysis
  4. Know your customers
  5. Stakeholder cheat sheets
  • EXERCISE: Data-driven project checklist
  • LAB: Data analysis techniques: Aggregations

Part 3: Understanding Your Data

  1. Data defined
  2. Data versus information
  3. Types of data
    1. Unstructured vs. Structured
    2. Time scope of data
    3. Sources of data
  4. Data in the real world
  5. The 3 V’s of data
  6. Data Quality
    1. Cleansing
    2. Duplicates
    3. SSOT
    4. Field standardization
    5. Identify sparsely populated fields
    6. How to fix common issues
  • LAB: Prioritizing data quality

Part 4: Analyzing Data

  1. Analysis foundations
    1. Comparing programs and tools
    2. Words in English vs. data
    3. Concepts specific to data analysis
    4. Domains of data analysis
    5. Descriptive statistics
    6. Inferential statistics
    7. Analytical mindset
    8. Describing and solving problems
  2. Averages in data
    1. Mean
    2. Median
    3. Mode
    4. Range
  3. Variability
    1. Variance
    2. Standard deviation
    3. Sigma values
    4. Percentiles
  4. Demystifying statistical models
  5. Data analysis techniques
  • LAB: Central tendency
  • LAB: Variability
  • LAB: Distributions
  • LAB: Sampling
  • LAB: Feature engineering
  • LAB: Univariate linear regression
  • LAB: Prediction
  • LAB: Multivariate linear regression
  • LAB: Monte Carlo simulation

Part 5: Thinking Critically About Your Analysis

  1. Descriptive analysis
  2. Diagnostic analysis
  3. Predictive analysis
  4. Prescriptive analysis

Part 6: Data Analysis in the Real World

  1. Deployment of analyses
  2. Best practices for BI
  3. Technology ecosystems
    1. Relational databases
    2. NoSQL databases
    3. Big data tools
    4. Statistical tools
    5. Machine learning
    6. Visualization and reporting tools
  4. Making data useable

Part 7: Data Visualization & Reporting

  1. Best practices for data visualizations
    1. Visualization essentials
    2. Users and stakeholders
    3. Stakeholder cheat sheet
  2. Common presentation mistakes
  3. Goals of visualization
    1. Communication and narrative
    2. Decision enablement
    3. Critical characteristics
  4. Communicating data-driven knowledge
    1. Formats and presentation tools
    2. Design considerations

Part 8: Hands-On Introduction to R and RStudio

  1. What is R?
  • LAB: Intro to RStudio
  • LAB: Univariate linear regression in R
  • LAB: Multivariate linear regression in R