Predictive Analytics Platform

Predictive Analytics is a forecasting technology that uses artificial intelligence and machine-learning algorithms to anticipate equipment problems and anomalies before they actually occur. It reduces emergency maintenance costs, improves operational efficiency, and prevents disruptions to production systems in industrial plants.

  • 92% Prediction Accuracy (ML-powered Analysis)
  • 45 Days Early Warning (Advance Notification)
  • 70% Cost Savings (Maintenance Reduction)
  • 99.5% Uptime Rate (System Availability)

What is Predictive Analytics?

Predictive Analytics combines historical and current data with Statistical Modeling, Machine Learning, and Artificial Intelligence techniques to analyze patterns and forecast future events or trends with high accuracy.

In the GaugeSnap system, we use Predictive Analytics to analyze data from meters and sensors in order to forecast equipment anomalies, upcoming maintenance needs, and future usage trends, so that problems can be planned for and prevented before they actually occur.

Key Components

  • Historical Data Analysis: analyzing past data to uncover patterns and trends
  • Machine Learning Models: using ML algorithms to build predictive models
  • Real-time Monitoring: continuous real-time monitoring and prediction
  • Historical Data: past readings that serve as training input
  • Predictive Models: trained models ready for forecasting
  • Real-time Processing: processing incoming data as it arrives

Applications in Gauge Reading

  • Predict equipment failures in advance
  • Plan preventive maintenance
  • Tune operations for maximum efficiency

Prediction Process

1. Data Collection: Historical Gauge Data
2. Feature Engineering: Pattern & Trend Extraction
3. Model Training: ML Algorithm Training
4. Validation: Accuracy Assessment
5. Prediction: Future Value Forecasting
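
As a rough end-to-end illustration of these five stages, here is a minimal sketch using scikit-learn; the data is simulated and the helper make_windows is an illustrative placeholder, not a GaugeSnap API:

# Minimal end-to-end sketch of the five prediction stages (illustrative only)
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score

def make_windows(series, window=24, horizon=1):
    """Feature Engineering: turn a 1-D series into (window -> future value) pairs."""
    rng = range(window, len(series) - horizon + 1)
    X = np.array([series[i - window:i] for i in rng])
    y = np.array([series[i + horizon - 1] for i in rng])
    return X, y

series = np.sin(np.arange(500) / 10) + np.random.normal(0, 0.1, 500)    # 1. Data Collection (simulated)
X, y = make_windows(series)                                             # 2. Feature Engineering
split = int(0.8 * len(X))                                               # chronological split
model = RandomForestRegressor(n_estimators=50, random_state=0)
model.fit(X[:split], y[:split])                                         # 3. Model Training
print("Validation R2:", r2_score(y[split:], model.predict(X[split:])))  # 4. Validation
print("Next-step forecast:", model.predict(X[-1:])[0])                  # 5. Prediction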

Prediction Types

Different forms of prediction, depending on the objective.

Equipment Health

Predict equipment health and remaining service life.

  • Remaining Useful Life (RUL)
  • Degradation Patterns
  • Performance Decline
  • Maintenance Scheduling

Failure Prediction

Assess the risk of failures occurring.

  • Failure Probability
  • Critical Component Analysis
  • Risk Assessment
  • Emergency Planning

Performance Optimization

Find the operating points that yield maximum efficiency.

  • Optimal Operating Points
  • Efficiency Prediction
  • Resource Allocation
  • Cost Optimization

Demand Forecasting

Forecast future demand and usage.

  • Load Prediction
  • Seasonal Patterns
  • Capacity Planning
  • Supply Chain Optimization

Machine Learning Models

Algorithms used for prediction.

Time Series Models

ARIMA

AutoRegressive Integrated Moving Average, a classical forecasting method.

Best for: Linear trends
Accuracy: 75-85%
Speed: Fast
Interpretability: High
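
For reference, a minimal ARIMA sketch with statsmodels (assuming it is installed); the order (2, 1, 2) is an illustrative choice, not a tuned one:

# Minimal ARIMA forecasting sketch (assumes statsmodels is installed)
import numpy as np
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA

series = pd.Series(
    100 + np.cumsum(np.random.normal(0, 0.5, 300)),   # simulated gauge readings
    index=pd.date_range('2023-01-01', periods=300, freq='h'),
)
model = ARIMA(series, order=(2, 1, 2))  # illustrative (p, d, q); tune per dataset
fit = model.fit()
print(fit.forecast(steps=24))           # forecast 24 steps ahead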

LSTM Networks

Long Short-Term Memory networks for complex patterns (implemented in full in the example further below).

Best for: Complex patterns
Accuracy: 85-95%
Speed: Medium
Interpretability: Low

Prophet

Facebook's time series forecasting tool, which handles seasonality well.

Best for: Seasonal data
Accuracy: 80-90%
Speed: Fast
Interpretability: High
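
A minimal Prophet sketch (assuming the prophet package is installed); Prophet expects a dataframe with ds and y columns:

# Minimal Prophet forecasting sketch (assumes the prophet package is installed)
import numpy as np
import pandas as pd
from prophet import Prophet

df = pd.DataFrame({
    'ds': pd.date_range('2023-01-01', periods=500, freq='h'),
    'y': 100 + 5 * np.sin(2 * np.pi * np.arange(500) / 24) + np.random.normal(0, 1, 500),
})
m = Prophet()  # daily and weekly seasonality are detected automatically
m.fit(df)
future = m.make_future_dataframe(periods=24, freq='h')
forecast = m.predict(future)
print(forecast[['ds', 'yhat', 'yhat_lower', 'yhat_upper']].tail(24))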

Regression Models

Random Forest

An ensemble method built from many decision trees.

Advantages:
  • High accuracy
  • Feature importance ranking
  • Handles missing values
  • Robust to outliers

Gradient Boosting

A boosting technique that improves performance iteratively; a short sketch follows the list of variants below.

Variants:
  • XGBoost
  • LightGBM
  • CatBoost
  • AdaBoost
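
As a sketch of the gradient boosting family, a minimal XGBoost regressor (assuming the xgboost package is installed); the synthetic features and hyperparameters are illustrative only:

# Minimal gradient boosting sketch with XGBoost (assumes xgboost is installed)
import numpy as np
from xgboost import XGBRegressor
from sklearn.metrics import r2_score

rng = np.random.default_rng(0)
X = rng.random((1000, 14))                    # stand-in for engineered gauge features
y = X @ rng.random(14) + rng.normal(0, 0.1, 1000)
split = int(0.8 * len(X))                     # chronological split, as for time series

model = XGBRegressor(n_estimators=200, learning_rate=0.05, max_depth=4)
model.fit(X[:split], y[:split])
print('Test R2:', r2_score(y[split:], model.predict(X[split:])))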

Support Vector Regression

SVR for high-accuracy prediction.

Kernels: Linear, RBF, Polynomial
Use case: Non-linear relationships
Performance: High accuracy, slower training
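
A minimal SVR sketch with scikit-learn; feature scaling matters because SVR is scale-sensitive, hence the pipeline, and the C and epsilon values are illustrative:

# Minimal SVR sketch; StandardScaler is included because SVR is scale-sensitive
import numpy as np
from sklearn.svm import SVR
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.random((500, 5))                              # stand-in for gauge features
y = np.sin(6 * X[:, 0]) + 0.1 * rng.normal(size=500)  # non-linear target

model = make_pipeline(StandardScaler(), SVR(kernel='rbf', C=10.0, epsilon=0.1))
model.fit(X, y)
print('First prediction:', model.predict(X[:1])[0])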

Industrial Applications

Examples of how the platform is applied in real-world scenarios.

Predictive Maintenance

Predict the optimal time for equipment maintenance; a simple RUL sketch follows the impact figures below.

Data Sources:

  • Vibration sensors
  • Temperature readings
  • Pressure measurements
  • Current/voltage monitoring
  • Operating hours

Prediction Output:

  • Remaining useful life (RUL)
  • Failure probability
  • Optimal maintenance window
  • Component replacement schedule

Business Impact:

Downtime: -40%
Maintenance Cost: -25%
Equipment Life: +20%
ROI: 300%
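
To illustrate the RUL idea above, a minimal sketch that fits a linear degradation trend to a health index and extrapolates it to a failure threshold; the index, threshold, and linearity assumption are all simplifications:

# Minimal RUL sketch: extrapolate a linear degradation trend to a failure threshold
import numpy as np

rng = np.random.default_rng(0)
health = 1.0 - 0.002 * np.arange(200) + rng.normal(0, 0.01, 200)  # degrading health index
failure_threshold = 0.5  # illustrative failure level

slope, intercept = np.polyfit(np.arange(len(health)), health, 1)
if slope < 0:
    # Time step at which the fitted trend crosses the threshold, minus "now"
    rul_steps = (failure_threshold - intercept) / slope - (len(health) - 1)
    print(f'Estimated RUL: {max(rul_steps, 0):.0f} time steps')
else:
    print('No degradation trend detected')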

Energy Optimization

Forecast energy consumption and find the optimal points for energy savings; a peak-hour sketch follows the results below.

Monitoring Points:

  • Power consumption meters
  • Load demand patterns
  • Environmental conditions
  • Production schedules
  • Equipment efficiency

Optimization Strategies:

  • Peak demand prediction
  • Load shifting recommendations
  • Equipment scheduling
  • Energy storage optimization

Results:

Energy Savings: 15-30%
Peak Reduction: 20%
Cost Savings: $100K/year
CO2 Reduction: 25%
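
As a sketch of peak demand prediction, flag the hours that historically carry the highest load; a production system would also feed weather and production schedules into a trained model:

# Minimal peak-demand sketch: find likely peak hours from the historical hourly profile
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
idx = pd.date_range('2023-01-01', periods=24 * 60, freq='h')
load = 100 + 30 * np.sin(2 * np.pi * (idx.hour - 14) / 24) + rng.normal(0, 5, len(idx))
demand = pd.Series(load, index=idx)

hourly_profile = demand.groupby(demand.index.hour).mean()
peak_hours = sorted(hourly_profile.nlargest(3).index.tolist())
print('Likely peak hours:', peak_hours)  # candidates for load shifting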

Implementation Example

Code for building a Predictive Analytics system.

# Predictive Analytics System for Industrial Gauges
import pandas as pd
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split, TimeSeriesSplit
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.preprocessing import StandardScaler
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense, Dropout
import matplotlib.pyplot as plt
import warnings
warnings.filterwarnings('ignore')

class PredictiveAnalytics:
    def __init__(self, prediction_horizon=30):
        """
        Initialize Predictive Analytics system
        
        Args:
            prediction_horizon: Number of time steps to predict ahead
        """
        self.prediction_horizon = prediction_horizon
        self.models = {}
        self.scalers = {}
        self.feature_importance = {}
        self.is_trained = False
        
    def create_features(self, data, window_size=24):
        """Create predictive features from time series data"""
        features = []
        targets = []
        
        for i in range(window_size, len(data) - self.prediction_horizon + 1):
            # Historical window
            window = data.iloc[i - window_size:i]
            
            # Statistical features
            feature_vector = [
                window.mean(),
                window.std(),
                window.min(),
                window.max(),
                np.percentile(window, 25),
                np.percentile(window, 75),
                window.iloc[-1],  # Current value
                window.iloc[-1] - window.iloc[-2] if len(window) > 1 else 0,  # Recent change
            ]
            
            # Trend features
            if len(window) > 2:
                # Linear trend
                x = np.arange(len(window))
                slope = np.polyfit(x, window, 1)[0]
                feature_vector.append(slope)
                
                # Moving averages
                ma_short = window.iloc[-5:].mean() if len(window) >= 5 else window.mean()
                ma_long = window.iloc[-12:].mean() if len(window) >= 12 else window.mean()
                feature_vector.extend([ma_short, ma_long])
                
                # Volatility
                feature_vector.append(window.rolling(5).std().iloc[-1] if len(window) >= 5 else 0)
            else:
                feature_vector.extend([0, 0, 0, 0])
            
            # Seasonal features (hour of day, day of week if timestamp available)
            # This is simplified - in real implementation, use proper datetime features
            hour_feature = i % 24 / 24.0  # Simulated hour cycle
            day_feature = (i // 24) % 7 / 7.0  # Simulated day cycle
            feature_vector.extend([hour_feature, day_feature])
            
            features.append(feature_vector)
            
            # Target: value at prediction horizon
            target = data.iloc[i + self.prediction_horizon - 1]
            targets.append(target)
        
        return np.array(features), np.array(targets)
    
    def create_lstm_sequences(self, data, sequence_length=60):
        """Create sequences for LSTM model"""
        sequences = []
        targets = []
        
        for i in range(sequence_length, len(data) - self.prediction_horizon + 1):
            # Input sequence
            seq = data.iloc[i - sequence_length:i].values.reshape(-1, 1)
            sequences.append(seq)
            
            # Target: value at prediction horizon
            target = data.iloc[i + self.prediction_horizon - 1]
            targets.append(target)
        
        return np.array(sequences), np.array(targets)
    
    def train_random_forest(self, X, y, gauge_id):
        """Train Random Forest model"""
        print(f"Training Random Forest for gauge {gauge_id}")
        
        # Split data chronologically
        split_idx = int(0.8 * len(X))
        X_train, X_test = X[:split_idx], X[split_idx:]
        y_train, y_test = y[:split_idx], y[split_idx:]
        
        # Scale features
        scaler = StandardScaler()
        X_train_scaled = scaler.fit_transform(X_train)
        X_test_scaled = scaler.transform(X_test)
        
        # Train model
        rf_model = RandomForestRegressor(
            n_estimators=100,
            max_depth=15,
            min_samples_split=5,
            min_samples_leaf=2,
            random_state=42,
            n_jobs=-1
        )
        
        rf_model.fit(X_train_scaled, y_train)
        
        # Evaluate
        train_pred = rf_model.predict(X_train_scaled)
        test_pred = rf_model.predict(X_test_scaled)
        
        train_score = r2_score(y_train, train_pred)
        test_score = r2_score(y_test, test_pred)
        
        print(f"Random Forest - Train R2: {train_score:.3f}, Test R2: {test_score:.3f}")
        
        # Store model and scaler
        self.models[f"{gauge_id}_rf"] = rf_model
        self.scalers[f"{gauge_id}_rf"] = scaler
        
        # Feature importance
        feature_names = [
            'mean', 'std', 'min', 'max', 'q25', 'q75', 'current',
            'recent_change', 'trend', 'ma_short', 'ma_long', 'volatility',
            'hour_cycle', 'day_cycle'
        ]
        
        importance = dict(zip(feature_names, rf_model.feature_importances_))
        self.feature_importance[f"{gauge_id}_rf"] = importance
        
        return {
            'train_score': train_score,
            'test_score': test_score,
            'train_mae': mean_absolute_error(y_train, train_pred),
            'test_mae': mean_absolute_error(y_test, test_pred)
        }
    
    def train_lstm(self, sequences, targets, gauge_id):
        """Train LSTM model"""
        print(f"Training LSTM for gauge {gauge_id}")
        
        # Split data chronologically
        split_idx = int(0.8 * len(sequences))
        X_train = sequences[:split_idx]
        X_test = sequences[split_idx:]
        y_train = targets[:split_idx]
        y_test = targets[split_idx:]
        
        # Scale data
        scaler = StandardScaler()
        X_train_scaled = scaler.fit_transform(X_train.reshape(-1, 1)).reshape(X_train.shape)
        X_test_scaled = scaler.transform(X_test.reshape(-1, 1)).reshape(X_test.shape)
        
        y_scaler = StandardScaler()
        y_train_scaled = y_scaler.fit_transform(y_train.reshape(-1, 1)).flatten()
        y_test_scaled = y_scaler.transform(y_test.reshape(-1, 1)).flatten()
        
        # Build LSTM model
        model = Sequential([
            LSTM(50, return_sequences=True, input_shape=(X_train_scaled.shape[1], 1)),
            Dropout(0.2),
            LSTM(50, return_sequences=False),
            Dropout(0.2),
            Dense(25),
            Dense(1)
        ])
        
        model.compile(optimizer='adam', loss='mse', metrics=['mae'])
        
        # Train model
        history = model.fit(
            X_train_scaled, y_train_scaled,
            epochs=50,
            batch_size=32,
            validation_data=(X_test_scaled, y_test_scaled),
            verbose=0
        )
        
        # Evaluate
        train_pred_scaled = model.predict(X_train_scaled)
        test_pred_scaled = model.predict(X_test_scaled)
        
        train_pred = y_scaler.inverse_transform(train_pred_scaled).flatten()
        test_pred = y_scaler.inverse_transform(test_pred_scaled).flatten()
        
        train_score = r2_score(y_train, train_pred)
        test_score = r2_score(y_test, test_pred)
        
        print(f"LSTM - Train R2: {train_score:.3f}, Test R2: {test_score:.3f}")
        
        # Store model and scalers
        self.models[f"{gauge_id}_lstm"] = model
        self.scalers[f"{gauge_id}_lstm_X"] = scaler
        self.scalers[f"{gauge_id}_lstm_y"] = y_scaler
        
        return {
            'train_score': train_score,
            'test_score': test_score,
            'train_mae': mean_absolute_error(y_train, train_pred),
            'test_mae': mean_absolute_error(y_test, test_pred),
            'history': history.history
        }
    
    def train(self, data, gauge_id):
        """Train all predictive models"""
        print(f"Starting training for gauge {gauge_id}")
        print(f"Data shape: {data.shape}")
        print(f"Prediction horizon: {self.prediction_horizon} steps")
        
        if len(data) < 100:
            raise ValueError("Need at least 100 data points for training")
        
        results = {}
        
        # 1. Random Forest with engineered features
        try:
            X_rf, y_rf = self.create_features(data)
            rf_results = self.train_random_forest(X_rf, y_rf, gauge_id)
            results['random_forest'] = rf_results
        except Exception as e:
            print(f"Random Forest training failed: {e}")
            results['random_forest'] = {'error': str(e)}
        
        # 2. LSTM for time series
        try:
            X_lstm, y_lstm = self.create_lstm_sequences(data)
            lstm_results = self.train_lstm(X_lstm, y_lstm, gauge_id)
            results['lstm'] = lstm_results
        except Exception as e:
            print(f"LSTM training failed: {e}")
            results['lstm'] = {'error': str(e)}
        
        self.is_trained = True
        return results
    
    def predict(self, data, gauge_id, method='ensemble'):
        """Make predictions using trained models"""
        if not self.is_trained:
            raise ValueError("Models not trained yet")
        
        predictions = {}
        
        # Random Forest prediction
        rf_key = f"{gauge_id}_rf"
        if rf_key in self.models:
            try:
                # Create features for latest window
                X_rf, _ = self.create_features(data)
                if len(X_rf) > 0:
                    X_latest = X_rf[-1:] # Latest feature vector
                    X_scaled = self.scalers[rf_key].transform(X_latest)
                    rf_pred = self.models[rf_key].predict(X_scaled)[0]
                    predictions['random_forest'] = rf_pred
            except Exception as e:
                predictions['random_forest'] = {'error': str(e)}
        
        # LSTM prediction
        lstm_key = f"{gauge_id}_lstm"
        if lstm_key in self.models:
            try:
                X_lstm, _ = self.create_lstm_sequences(data)
                if len(X_lstm) > 0:
                    X_latest = X_lstm[-1:] # Latest sequence
                    X_scaled = self.scalers[f"{gauge_id}_lstm_X"].transform(
                        X_latest.reshape(-1, 1)
                    ).reshape(X_latest.shape)
                    
                    lstm_pred_scaled = self.models[lstm_key].predict(X_scaled)
                    lstm_pred = self.scalers[f"{gauge_id}_lstm_y"].inverse_transform(
                        lstm_pred_scaled
                    )[0][0]
                    predictions['lstm'] = lstm_pred
            except Exception as e:
                predictions['lstm'] = {'error': str(e)}
        
        # Ensemble prediction
        # Keep only numeric predictions; np.floating catches NumPy scalar types
        valid_preds = [pred for pred in predictions.values()
                       if isinstance(pred, (int, float, np.floating))]
        
        if valid_preds:
            if method == 'ensemble':
                ensemble_pred = np.mean(valid_preds)
            elif method == 'weighted':
                # Weight by historical accuracy (simplified)
                weights = [0.6, 0.4] # RF, LSTM
                if len(valid_preds) == 2:
                    ensemble_pred = np.average(valid_preds, weights=weights)
                else:
                    ensemble_pred = valid_preds[0]
            else:
                ensemble_pred = valid_preds[0]
            
            predictions['ensemble'] = ensemble_pred
        
        # Add confidence estimation based on agreement between models
        if len(valid_preds) > 1:
            std_pred = np.std(valid_preds)
            mean_pred = np.mean(valid_preds)
            # Guard against division by a near-zero mean
            confidence = max(0, 1 - std_pred / abs(mean_pred)) if mean_pred else 0.0
        else:
            confidence = 0.7  # Default confidence for single model
        
        return {
            'predictions': predictions,
            'confidence': confidence,
            'prediction_horizon': self.prediction_horizon,
            'timestamp': pd.Timestamp.now().isoformat()
        }
    
    def generate_maintenance_recommendations(self, current_value, predicted_value, 
                                           gauge_id, threshold_ranges):
        """Generate maintenance recommendations based on predictions"""
        
        # Calculate predicted change relative to the current value,
        # guarding against division by zero
        if current_value == 0:
            change_percent = 0.0
        else:
            change_percent = ((predicted_value - current_value) / current_value) * 100
        
        recommendations = []
        priority = "LOW"
        
        # Check against threshold ranges
        if gauge_id in threshold_ranges:
            ranges = threshold_ranges[gauge_id]
            
            # Current status
            if predicted_value > ranges.get('critical_high', float('inf')):
                recommendations.append("CRITICAL: Predicted value exceeds critical threshold")
                priority = "CRITICAL"
            elif predicted_value < ranges.get('critical_low', float('-inf')):
                recommendations.append("CRITICAL: Predicted value below critical threshold")
                priority = "CRITICAL"
            elif predicted_value > ranges.get('warning_high', float('inf')):
                recommendations.append("WARNING: Predicted value approaching high limit")
                priority = "HIGH"
            elif predicted_value < ranges.get('warning_low', float('-inf')):
                recommendations.append("WARNING: Predicted value approaching low limit")
                priority = "HIGH"
        
        # Trend analysis; escalate priority with an explicit ordering, since
        # lexicographic max() would incorrectly rank "LOW" above "HIGH"
        priority_rank = {"LOW": 0, "MEDIUM": 1, "HIGH": 2, "CRITICAL": 3}
        if abs(change_percent) > 20:
            recommendations.append(f"Large change predicted: {change_percent:+.1f}%")
            if priority_rank[priority] < priority_rank["HIGH"]:
                priority = "HIGH"
        elif abs(change_percent) > 10:
            recommendations.append(f"Moderate change predicted: {change_percent:+.1f}%")
            if priority_rank[priority] < priority_rank["MEDIUM"]:
                priority = "MEDIUM"
        
        # Default recommendations
        if not recommendations:
            recommendations.append("No immediate action required")
        
        return {
            'priority': priority,
            'recommendations': recommendations,
            'predicted_change': change_percent,
            'action_required': priority in ['CRITICAL', 'HIGH']
        }

# Usage example
def main():
    # Generate sample data (in real application, load from database)
    np.random.seed(42)
    
    # Simulate gauge data with trend and seasonality
    n_points = 1000
    time_index = pd.date_range('2023-01-01', periods=n_points, freq='h')  # lowercase 'h'; 'H' is deprecated in pandas 2.2+
    
    # Base signal with trend and seasonality
    trend = np.linspace(100, 110, n_points)
    seasonal = 5 * np.sin(2 * np.pi * np.arange(n_points) / 24)  # Daily cycle
    noise = np.random.normal(0, 2, n_points)
    
    gauge_data = pd.Series(trend + seasonal + noise, index=time_index)
    
    # Initialize and train predictor
    predictor = PredictiveAnalytics(prediction_horizon=24)  # 24 hours ahead
    
    # Train models
    training_results = predictor.train(gauge_data, gauge_id="TEMP_SENSOR_01")
    
    print("Training Results:")
    for model, results in training_results.items():
        if 'error' not in results:
            print(f"{model}: Test R2 = {results['test_score']:.3f}, "
                  f"Test MAE = {results['test_mae']:.2f}")
    
    # Make prediction
    prediction_result = predictor.predict(gauge_data, "TEMP_SENSOR_01")
    
    print(f"\nPrediction Results:")
    print(f"Current Value: {gauge_data.iloc[-1]:.2f}")
    for method, pred in prediction_result['predictions'].items():
        if isinstance(pred, (int, float)):
            print(f"{method}: {pred:.2f}")
    print(f"Confidence: {prediction_result['confidence']:.2%}")
    
    # Generate recommendations
    threshold_ranges = {
        "TEMP_SENSOR_01": {
            'critical_high': 120,
            'warning_high': 115,
            'warning_low': 95,
            'critical_low': 90
        }
    }
    
    recommendations = predictor.generate_maintenance_recommendations(
        gauge_data.iloc[-1],
        prediction_result['predictions'].get('ensemble', 0),
        "TEMP_SENSOR_01",
        threshold_ranges
    )
    
    print(f"\nMaintenance Recommendations:")
    print(f"Priority: {recommendations['priority']}")
    print(f"Predicted Change: {recommendations['predicted_change']:+.1f}%")
    for rec in recommendations['recommendations']:
        print(f"- {rec}")

if __name__ == "__main__":
    main()