Splunk for Analytics and Data Science

This 13.5-hour module is for users who want to attain operational intelligence level 4 (business insights). It covers implementing analytics and data science projects using Splunk's statistics, machine learning, and built-in and custom visualization capabilities.

Days : 3

This product is currently out of stock and unavailable.

Description

Course Content

  • Analytics Framework
  • Exploratory Data Analysis
  • Machine Learning
  • Using Algorithms to Build Models
  • Market Segmentation
  • Transactional Analysis
  • Anomaly Detection
  • Estimation and Prediction
  • Classification

Prerequisites

To be successful, students should have a solid understanding of the following modules:

  • Fundamentals 1, 2, & 3 (Retired)
  • Advanced Searching & Reporting

Or the following single-subject modules:

  • What is Splunk? (WIS)
  • Intro to Splunk (ITS)
  • Using Fields (SUF)
  • Scheduling Reports & Alerts (SRA)
  • Visualizations (SVZ)
  • Working with Time (WWT)
  • Statistical Processing (SSP)
  • Comparing Values (SCV)
  • Result Modification (SRM)
  • Leveraging Lookups and Subsearches (LLS)
  • Correlation Analysis (SCLAS)
  • Search Under the Hood (SUH)
  • Multivalue Fields (SMV)
  • Intro to Knowledge Objects (IKO)
  • Creating Knowledge Objects (CKO)
  • Creating Field Extractions (CFE)
  • Enriching Data with Lookups (EDL)
  • Data Models (SDM)
  • Introduction to Dashboards (ITD)
  • Dynamic Dashboards (SDD)
  • Using Choropleth (SUC)
  • Search Optimization (SSO)

Course Objectives

This 13.5-hour module is for users who want to attain operational intelligence level 4 (business insights). It covers implementing analytics and data science projects using Splunk's statistics, machine learning, and built-in and custom visualization capabilities.

Please note that this course may run over three days, with one 4.5-hour session each day.

Outline: Splunk for Analytics and Data Science (SADS)

Topic 1 – Analytics Workflow

  • Define terms related to analytics and data science
  • Describe the analytics workflow
  • Describe common usage scenarios
  • Navigate the Splunk Machine Learning Toolkit

Topic 2 – Exploratory Data Analysis

  • Describe the purpose of data exploration
  • Identify SPL commands for data exploration
  • Split data for testing and training using the sample command
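
As an illustration of why data is split before modeling, here is a minimal Python sketch of a reproducible train/test split. It is not the SPL sample command: the function name split_train_test, the 70/30 ratio, and the made-up events are all assumptions chosen for the example.

```python
import random


def split_train_test(rows, train_fraction=0.7, seed=42):
    """Shuffle a copy of rows and return (train, test) lists."""
    rng = random.Random(seed)      # fixed seed makes the split reproducible
    shuffled = list(rows)          # copy so the caller's data is untouched
    rng.shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical events, invented for this example
events = [{"id": i, "bytes": i * 10} for i in range(10)]
train, test = split_train_test(events)
print(len(train), "training events,", len(test), "test events")
```

Fixing the seed means the same events land in the test set on every run, so different models can be compared on identical held-out data.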

Topic 3 – Predict Numeric Fields with Regression

  • Differentiate predictions from estimates
  • Identify prediction algorithms and assumptions
  • Describe the fit and apply commands
  • Model numeric predictions in the MLTK and Splunk Enterprise
  • Use the score command to evaluate models
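
The fit, apply, and score steps listed above can be pictured with a small Python sketch of one-variable least-squares regression. This is a conceptual stand-in, not the MLTK implementation: the function names fit_line, apply_model, and r2_score, and the sample numbers, are assumptions made for illustration.

```python
from statistics import mean


def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept; returns the model."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return {"slope": slope, "intercept": y_bar - slope * x_bar}


def apply_model(model, xs):
    """Use a fitted model to predict y for new x values."""
    return [model["slope"] * x + model["intercept"] for x in xs]


def r2_score(actual, predicted):
    """Coefficient of determination: 1 - residual SS / total SS."""
    y_bar = mean(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - y_bar) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


# Made-up training and test data
model = fit_line([1, 2, 3, 4, 5], [3.1, 4.9, 7.2, 8.8, 11.0])
predicted = apply_model(model, [6, 7])
print(model, r2_score([13.0, 14.8], predicted))
```

Scoring on values the model did not see during fitting gives a more honest picture of prediction quality than scoring on the training data.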

Topic 4 – Clean and Preprocess the Data

  • Define preprocessing and describe its purpose
  • Describe algorithms that preprocess data for use in models
  • Use FieldSelector to choose relevant fields
  • Use PCA and ICA to reduce dimensionality
  • Normalize data with StandardScaler and RobustScaler
  • Preprocess text using Imputer, NPR, TF-IDF, HashingVectorizer, and the cluster command
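
To show the difference between the two normalization approaches named above, here is a hedged Python sketch of the arithmetic usually associated with standard scaling (mean and standard deviation) and robust scaling (median and interquartile range). The helper names standard_scale and robust_scale and the latency values are assumptions for illustration; the MLTK algorithms may differ in detail.

```python
from statistics import mean, median, pstdev, quantiles


def standard_scale(values):
    """Center on the mean and divide by the population standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


def robust_scale(values):
    """Center on the median and divide by the interquartile range."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    med = median(values)
    return [(v - med) / (q3 - q1) for v in values]


# Made-up latencies with one extreme value
latency_ms = [10, 12, 11, 13, 12, 500]
print([round(v, 2) for v in standard_scale(latency_ms)])
print([round(v, 2) for v in robust_scale(latency_ms)])
```

Because the median and interquartile range are barely moved by the single 500 ms value, robust scaling keeps the typical values spread out, whereas standard scaling squeezes them together.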

Topic 5 – Cluster Data

  • Define clustering
  • Identify clustering methods, algorithms, and use cases
  • Use the Smart Clustering Assistant to cluster data
  • Evaluate clusters using silhouette score
  • Validate cluster coherence
  • Describe clustering best practices
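
The silhouette score mentioned above compares how close each point is to its own cluster with how close it is to the nearest other cluster. The following Python sketch computes it with Euclidean distance; the function name, the singleton-cluster convention, and the sample points are assumptions for illustration rather than the MLTK's own code.

```python
from math import dist
from statistics import mean


def silhouette_score(points, labels):
    """Mean silhouette coefficient over all points (Euclidean distance)."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    if len(clusters) < 2:
        raise ValueError("silhouette score needs at least two clusters")

    scores = []
    for point, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            scores.append(0.0)  # common convention for singleton clusters
            continue
        # a: mean distance to the other members of the point's own cluster
        a = sum(dist(point, other) for other in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            mean([dist(point, other) for other in members])
            for other_label, members in clusters.items()
            if other_label != label
        )
        scores.append((b - a) / max(a, b))
    return mean(scores)


# Made-up points forming two well-separated groups
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
print(silhouette_score(points, [0, 0, 0, 1, 1, 1]))
```

Scores near 1 indicate tight, well-separated clusters; scores near 0 or below suggest overlapping or poorly assigned clusters.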

Topic 6 – Anomaly Detection

  • Define anomaly detection and outliers
  • Identify anomaly detection use cases
  • Use the Splunk Machine Learning Toolkit Smart Outlier Assistant
  • Detect anomalies using the Density Function algorithm
  • Optimize anomaly detection with Local Outlier Factor
  • View results with the Distribution Plot visualization
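
As a simplified picture of density-based anomaly detection, the sketch below fits a normal distribution to baseline values and flags new values that fall in its low-probability tails. It is a stand-in for the general idea only: the functions fit_density and is_anomaly, the 0.01 threshold, and the baseline numbers are assumptions, and the normal distribution is chosen here for simplicity.

```python
from statistics import NormalDist


def fit_density(baseline):
    """Estimate a normal distribution from baseline (non-anomalous) values."""
    return NormalDist.from_samples(baseline)


def is_anomaly(model, value, threshold=0.01):
    """Flag values in the two tails that together hold `threshold` probability."""
    lower = model.inv_cdf(threshold / 2)
    upper = model.inv_cdf(1 - threshold / 2)
    return value < lower or value > upper


# Made-up baseline response times
baseline = [50, 52, 49, 51, 50, 48, 53, 51, 50, 49]
model = fit_density(baseline)
for value in (51, 75):
    print(value, is_anomaly(model, value))
```

Lowering the threshold flags fewer values; raising it flags more, at the cost of more false alarms.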

Topic 7 – Estimation and Prediction

  • Differentiate predictions from forecasts
  • Use the Smart Forecasting Assistant
  • Use the StateSpaceForecast algorithm
  • Forecast multivariate data
  • Account for periodicity in each time series

Topic 8 – Classification

  • Define key classification terms
  • Use classification algorithms: AutoPrediction, LogisticRegression, SVM (Support Vector Machines), and RandomForestClassifier
  • Evaluate classifier tradeoffs
  • Evaluate results of multiple algorithms
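
When comparing the results of several classifiers, precision, recall, and F1 are commonly used measures. The Python sketch below computes them for one positive class; the function name classification_metrics and the labels are assumptions invented for this example.

```python
def classification_metrics(actual, predicted, positive):
    """Return precision, recall, and F1 for the given positive class."""
    pairs = list(zip(actual, predicted))
    tp = sum(a == positive and p == positive for a, p in pairs)
    fp = sum(a != positive and p == positive for a, p in pairs)
    fn = sum(a == positive and p != positive for a, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    return {"precision": precision, "recall": recall, "f1": f1}


# Made-up labels from a hypothetical classifier
actual = ["fraud", "ok", "fraud", "ok", "ok", "fraud"]
predicted = ["fraud", "ok", "ok", "ok", "fraud", "fraud"]
print(classification_metrics(actual, predicted, positive="fraud"))
```

Precision penalizes false alarms and recall penalizes misses, so which one matters more depends on the cost of each kind of error in the use case.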