 Container ai-scientist-python-workspace  Created
 Container ai-scientist-semantic-scholar-mcp-1  Running
 Container ai-scientist-latex-compiler-1  Running
Attaching to latex-compiler-1, ai-scientist-python-workspace, semantic-scholar-mcp-1
ai-scientist-python-workspace  | 2025-09-16 04:58:07,162 - WARNING - sklearn not available. Some baseline methods will be disabled.
ai-scientist-python-workspace  | 2025-09-16 04:58:07,163 - WARNING - sklearn not available. Using custom metric implementations.
ai-scientist-python-workspace  | 2025-09-16 04:58:07,666 - INFO - ================================================================================
ai-scientist-python-workspace  | 2025-09-16 04:58:07,666 - INFO - PHISHING DETECTION - ACADEMIC METHODS COMPARISON
ai-scientist-python-workspace  | 2025-09-16 04:58:07,666 - INFO - ================================================================================
ai-scientist-python-workspace  | 2025-09-16 04:58:07,666 - INFO - 
ai-scientist-python-workspace  | [Step 1/4] Loading and preparing datasets...
ai-scientist-python-workspace  | 2025-09-16 04:58:07,667 - INFO - Downloading and preparing real phishing datasets...
ai-scientist-python-workspace  | 2025-09-16 04:58:07,667 - INFO - Loading cached combined dataset...
ai-scientist-python-workspace  | 2025-09-16 04:58:07,668 - INFO - Loaded 1002 emails from cache
ai-scientist-python-workspace  | 2025-09-16 04:58:07,668 - INFO - Dataset statistics:
ai-scientist-python-workspace  | 2025-09-16 04:58:07,668 - INFO -   Training samples: 701
ai-scientist-python-workspace  | 2025-09-16 04:58:07,668 - INFO -   Validation samples: 150
ai-scientist-python-workspace  | 2025-09-16 04:58:07,668 - INFO -   Test samples: 151
ai-scientist-python-workspace  | 2025-09-16 04:58:07,668 - INFO - 
ai-scientist-python-workspace  | [Step 2/4] Initializing detection methods...
ai-scientist-python-workspace  | 2025-09-16 04:58:07,668 - WARNING - TF-IDF + SVM not available due to missing sklearn
ai-scientist-python-workspace  | 2025-09-16 04:58:07,669 - INFO -   - PhishIntention adapter (USENIX 2022)...
ai-scientist-python-workspace  | 2025-09-16 04:58:07,669 - INFO -   - CNN-BiGRU detector (Sensors 2024)...
ai-scientist-python-workspace  | 2025-09-16 04:58:07,676 - INFO - CNN-BiGRU model initialized on cpu
ai-scientist-python-workspace  | 2025-09-16 04:58:07,676 - INFO -   - Feature Ensemble detector (uOttawa 2023)...
ai-scientist-python-workspace  | 2025-09-16 04:58:07,676 - INFO -   - Hybrid LLM-Regex detector (Ours)...
ai-scientist-python-workspace  | 2025-09-16 04:58:07,677 - INFO - 
ai-scientist-python-workspace  | [Step 3/4] Running evaluations...
ai-scientist-python-workspace  | 2025-09-16 04:58:07,677 - INFO - 
ai-scientist-python-workspace  | Evaluating all detection methods:
ai-scientist-python-workspace  | 2025-09-16 04:58:07,677 - INFO - 
ai-scientist-python-workspace  |   Testing Rule-based Baseline...
ai-scientist-python-workspace  | 2025-09-16 04:58:07,678 - INFO -     Accuracy: 0.000
ai-scientist-python-workspace  | 2025-09-16 04:58:07,678 - INFO -     Precision: 0.000
ai-scientist-python-workspace  | 2025-09-16 04:58:07,678 - INFO -     Recall: 0.000
ai-scientist-python-workspace  | 2025-09-16 04:58:07,678 - INFO -     F1-Score: 0.000
ai-scientist-python-workspace  | 2025-09-16 04:58:07,678 - INFO -     Time: 0.00s
ai-scientist-python-workspace  | 2025-09-16 04:58:07,678 - INFO - 
ai-scientist-python-workspace  |   Testing Regex Pattern Baseline...
ai-scientist-python-workspace  | 2025-09-16 04:58:07,678 - INFO -     Accuracy: 0.000
ai-scientist-python-workspace  | 2025-09-16 04:58:07,678 - INFO -     Precision: 0.000
ai-scientist-python-workspace  | 2025-09-16 04:58:07,678 - INFO -     Recall: 0.000
ai-scientist-python-workspace  | 2025-09-16 04:58:07,678 - INFO -     F1-Score: 0.000
ai-scientist-python-workspace  | 2025-09-16 04:58:07,678 - INFO -     Time: 0.00s
ai-scientist-python-workspace  | 2025-09-16 04:58:07,678 - INFO - 
ai-scientist-python-workspace  |   Testing PhishIntention (USENIX'22)...
ai-scientist-python-workspace  | 2025-09-16 04:58:07,679 - INFO - PhishIntention adapter initialized (no training required)
ai-scientist-python-workspace  | 2025-09-16 04:58:07,680 - INFO - Validation accuracy on sample: 0.46
ai-scientist-python-workspace  | 2025-09-16 04:58:07,680 - ERROR -     Error evaluating PhishIntention (USENIX'22): 'list' object has no attribute 'get'
ai-scientist-python-workspace  | 2025-09-16 04:58:07,680 - INFO - 
ai-scientist-python-workspace  |   Testing CNN-BiGRU (Sensors'24)...
ai-scientist-python-workspace  | 2025-09-16 04:58:07,680 - INFO - Training CNN-BiGRU phishing detector...
ai-scientist-python-workspace  | 2025-09-16 04:58:07,681 - INFO - Vocabulary built with 327 words
ai-scientist-python-workspace  | 2025-09-16 04:58:08,496 - INFO - Epoch 1/5: Loss=0.6938, Accuracy=0.545
ai-scientist-python-workspace  | 2025-09-16 04:58:09,127 - INFO - Epoch 2/5: Loss=0.6937, Accuracy=0.540
ai-scientist-python-workspace  | 2025-09-16 04:58:09,715 - INFO - Epoch 3/5: Loss=0.6918, Accuracy=0.530
ai-scientist-python-workspace  | 2025-09-16 04:58:10,289 - INFO - Epoch 4/5: Loss=0.6876, Accuracy=0.535
ai-scientist-python-workspace  | 2025-09-16 04:58:10,915 - INFO - Epoch 5/5: Loss=0.6965, Accuracy=0.530
ai-scientist-python-workspace  | 2025-09-16 04:58:10,951 - INFO - Validation accuracy: 0.460
ai-scientist-python-workspace  | 2025-09-16 04:58:10,952 - ERROR -     Error evaluating CNN-BiGRU (Sensors'24): 'list' object has no attribute 'get'
ai-scientist-python-workspace  | 2025-09-16 04:58:10,952 - INFO - 
ai-scientist-python-workspace  |   Testing Feature Ensemble (uOttawa'23)...
ai-scientist-python-workspace  | 2025-09-16 04:58:10,952 - INFO - Training Feature-based Ensemble Detector...
ai-scientist-python-workspace  | 2025-09-16 04:58:10,963 - ERROR -     Error evaluating Feature Ensemble (uOttawa'23): No module named 'sklearn'
ai-scientist-python-workspace  | 2025-09-16 04:58:10,963 - INFO - 
ai-scientist-python-workspace  |   Testing Hybrid LLM-Regex (Ours)...
ai-scientist-python-workspace  | 2025-09-16 04:58:10,963 - INFO - Training Hybrid LLM-Regex Detector...
ai-scientist-python-workspace  | 2025-09-16 04:58:10,963 - INFO - Optimized regex threshold: 2.00
ai-scientist-python-workspace  | 2025-09-16 04:58:10,963 - INFO -     Accuracy: 0.000
ai-scientist-python-workspace  | 2025-09-16 04:58:10,963 - INFO -     Precision: 0.000
ai-scientist-python-workspace  | 2025-09-16 04:58:10,963 - INFO -     Recall: 0.000
ai-scientist-python-workspace  | 2025-09-16 04:58:10,963 - INFO -     F1-Score: 0.000
ai-scientist-python-workspace  | 2025-09-16 04:58:10,963 - INFO -     Time: 0.00s
ai-scientist-python-workspace  | 2025-09-16 04:58:10,963 - INFO - 
ai-scientist-python-workspace  | [Step 4/4] Generating results report and visualizations...
ai-scientist-python-workspace  | 2025-09-16 04:58:10,964 - INFO - Generating visualizations...
ai-scientist-python-workspace  | 2025-09-16 04:58:11,619 - INFO - Visualizations saved to results_20250916_045807
ai-scientist-python-workspace  | /app/workspace/main.py:221: RuntimeWarning: invalid value encountered in scalar divide
ai-scientist-python-workspace  |   improvement = ((academic_avg - baseline_avg) / baseline_avg) * 100
ai-scientist-python-workspace  | 2025-09-16 04:58:11,620 - INFO - 
ai-scientist-python-workspace  | ================================================================================
ai-scientist-python-workspace  | 2025-09-16 04:58:11,620 - INFO - EXPERIMENT COMPLETED SUCCESSFULLY
ai-scientist-python-workspace  | 2025-09-16 04:58:11,620 - INFO - Results saved to: results_20250916_045807
ai-scientist-python-workspace  | 2025-09-16 04:58:11,620 - INFO - ================================================================================
ai-scientist-python-workspace  | 
ai-scientist-python-workspace  | ====================================================================================================
ai-scientist-python-workspace  | FINAL RESULTS COMPARISON - ACADEMIC METHODS
ai-scientist-python-workspace  | ====================================================================================================
ai-scientist-python-workspace  | 
ai-scientist-python-workspace  | Method                              Accuracy     Precision    Recall       F1-Score     Time(s)   
ai-scientist-python-workspace  | ----------------------------------------------------------------------------------------------------
ai-scientist-python-workspace  | Rule-based Baseline                 0.000        0.000        0.000        0.000        0.00      
ai-scientist-python-workspace  | Regex Pattern Baseline              0.000        0.000        0.000        0.000        0.00      
ai-scientist-python-workspace  | PhishIntention (USENIX'22)          0.000        0.000        0.000        0.000        0.00      
ai-scientist-python-workspace  | CNN-BiGRU (Sensors'24)              0.000        0.000        0.000        0.000        0.00      
ai-scientist-python-workspace  | Feature Ensemble (uOttawa'23)       0.000        0.000        0.000        0.000        0.00      
ai-scientist-python-workspace  | Hybrid LLM-Regex (Ours)             0.000        0.000        0.000        0.000        0.00      
ai-scientist-python-workspace  | 
ai-scientist-python-workspace  | ====================================================================================================
ai-scientist-python-workspace exited with code 0
