# Bitcoin Price Prediction with PowerSigJax

This example demonstrates how to use PowerSigJax to predict Bitcoin prices using signature kernels, replicating the approach from Chris Salvi's work.

## Overview

The example implements:
- **36-day sliding window** approach for time series prediction
- **PowerSigJax** for computing signature kernel gram matrices
- **Support Vector Regression (SVR)** with a precomputed signature kernel
- **Time augmentation** to include temporal information
- **Real Bitcoin data** downloaded via yfinance

## Key Features

### 1. Sliding Window Approach
- Uses 36-day windows to predict the average of the following two days' Bitcoin prices
- Each window is normalized to [0,1] range for better kernel performance
- Creates overlapping windows for dense prediction
- Training period: June 2, 2017 to May 2018 (346 days)
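
The windowing step can be sketched as follows. This is a minimal illustration rather than the code in `bitcoin.py`: the helper name `make_windows` is hypothetical, and scaling the target with the window's own min/max is an assumption, since this README does not say how targets are scaled.

```python
import numpy as np

def make_windows(prices, window_size=36, target_horizon=2):
    """Build min-max normalized windows and mean-of-next-days targets."""
    prices = np.asarray(prices, dtype=float)
    n_samples = len(prices) - window_size - target_horizon + 1
    if n_samples <= 0:
        raise ValueError("price series is too short for the requested window")
    windows, targets = [], []
    for start in range(n_samples):
        end = start + window_size
        window = prices[start:end]
        lo, hi = window.min(), window.max()
        scale = hi - lo if hi > lo else 1.0  # guard against flat windows
        windows.append((window - lo) / scale)
        # Assumption: the target is scaled with the window's own min/max.
        future = prices[end:end + target_horizon]
        targets.append((future.mean() - lo) / scale)
    return np.stack(windows), np.array(targets)

prices = np.linspace(100.0, 200.0, 50)  # synthetic stand-in for closing prices
X, y = make_windows(prices)
print(X.shape, y.shape)  # (13, 36) (13,)
```

Consecutive windows are shifted by one day, so a series of `n` prices yields `n - window_size - target_horizon + 1` overlapping samples.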

### 2. Signature Kernel Computation
- Uses PowerSigJax to compute signature kernels between time series
- Implements the static kernel approach (as opposed to the dynamic one)
- Computes both training and test gram matrices efficiently
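
The shapes of the two gram matrices can be illustrated with a stand-in kernel. The sketch below does not call PowerSigJax, whose API is not described in this README: `toy_kernel` is a hypothetical RBF kernel on flattened paths, used only to show that the training matrix is square and symmetric while the test matrix has one row per test path and one column per training path.

```python
import numpy as np

def toy_kernel(x, y):
    """Stand-in kernel (RBF on flattened paths); NOT a signature kernel."""
    diff = np.ravel(x) - np.ravel(y)
    return float(np.exp(-0.5 * diff @ diff))

def gram_matrix(kernel, A, B=None):
    """Return K[i, j] = kernel(A[i], B[j]); exploits symmetry when B is None."""
    if B is None:
        n = len(A)
        K = np.empty((n, n))
        for i in range(n):
            for j in range(i, n):  # upper triangle only, then mirror
                K[i, j] = K[j, i] = kernel(A[i], A[j])
        return K
    return np.array([[kernel(a, b) for b in B] for a in A])

rng = np.random.default_rng(0)
X_train = rng.random((5, 36, 2))  # 5 paths, 36 steps, 2 channels
X_test = rng.random((3, 36, 2))   # 3 paths

K_train = gram_matrix(toy_kernel, X_train)         # shape (5, 5)
K_test = gram_matrix(toy_kernel, X_test, X_train)  # shape (3, 5)
print(K_train.shape, K_test.shape)
```

In the real example the kernel evaluations would come from PowerSigJax; only the matrix layout carries over from this sketch.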

### 3. SVR with Custom Kernel
- Uses scikit-learn's SVR with `kernel='precomputed'`
- Uses `C=1.0` and `epsilon=0.1`, which are also scikit-learn's defaults for `SVR`
- Ready for hyperparameter optimization
- Evaluates performance using MAPE (Mean Absolute Percentage Error)
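
A minimal sketch of fitting `SVR` on a precomputed kernel follows. The RBF gram matrices and random data are stand-ins chosen so the snippet runs without PowerSigJax; in this example they would be replaced by signature kernel gram matrices. The helper `rbf_gram` is hypothetical.

```python
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(0)
X_train = rng.random((40, 5))   # toy features standing in for price windows
X_test = rng.random((10, 5))
y_train = X_train.sum(axis=1)

def rbf_gram(A, B, gamma=0.5):
    """Stand-in gram matrix; NOT a signature kernel."""
    sq_dists = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * sq_dists)

K_train = rbf_gram(X_train, X_train)  # shape (n_train, n_train)
K_test = rbf_gram(X_test, X_train)    # shape (n_test, n_train)

model = SVR(kernel="precomputed", C=1.0, epsilon=0.1)
model.fit(K_train, y_train)
y_pred = model.predict(K_test)
print(y_pred.shape)  # (10,)
```

Note that `predict` expects the kernel between test and training samples, so the test matrix has `n_train` columns, not `n_test`.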

### 4. Time Augmentation
- Adds a time feature as an additional path dimension
- Makes the kernel sensitive to when price changes happen (signatures are otherwise invariant to time reparameterization)
- Standard approach in signature kernel literature
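
Time augmentation can be sketched in a few lines. The helper name `add_time` and the choice of a time channel running from 0 to 1 are assumptions made for illustration; the actual script may scale or order the channels differently.

```python
import numpy as np

def add_time(windows):
    """Prepend a time channel in [0, 1] to each window.

    Input shape:  (n_windows, length)
    Output shape: (n_windows, length, 2), channels ordered (time, value)
    """
    windows = np.asarray(windows, dtype=float)
    n_windows, length = windows.shape
    t = np.linspace(0.0, 1.0, length)
    time_channel = np.broadcast_to(t, (n_windows, length))
    return np.stack([time_channel, windows], axis=-1)

paths = add_time(np.zeros((4, 36)))
print(paths.shape)  # (4, 36, 2)
```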

## Files

- `bitcoin.py` - Main implementation with real Bitcoin data
- `test_bitcoin.py` - Test version with synthetic data for faster testing
- `download_bitcoin_data.py` - Standalone Bitcoin data downloader with caching
- `README_bitcoin.md` - This documentation

## Usage

### Prerequisites

Install the required dependencies:
```bash
pip install -r requirements.txt
```

### Running the Example

1. **Download Bitcoin data** (optional, runs automatically):
```bash
cd examples
python download_bitcoin_data.py
```

2. **Full Bitcoin prediction** (with real data):
```bash
cd examples
python bitcoin.py
```

3. **Test version** (with synthetic data, faster):
```bash
cd examples
python test_bitcoin.py
```

### Data Download Options

The Bitcoin data downloader supports several options:

```bash
# Download data with custom date range
python download_bitcoin_data.py --start-date 2018-01-01 --end-date 2024-01-01

# Force re-download (ignore cache)
python download_bitcoin_data.py --force-download

# Set maximum cache age to 30 days
python download_bitcoin_data.py --cache-days 30

# Validate existing cached data
python download_bitcoin_data.py --validate-only

# Custom output filename
python download_bitcoin_data.py --output my_bitcoin_data.pkl
```

## Parameters

### PowerSigJax Parameters
- `order=16` - Polynomial order for signature computation (higher = more accurate but slower)
- `device` - Automatically selects CUDA if available, otherwise CPU

### SVR Parameters
- `C=1.0` - Regularization parameter (larger values penalize errors more heavily, i.e. weaker regularization)
- `epsilon=0.1` - Half-width of the epsilon-insensitive tube (errors smaller than `epsilon` are not penalized)

### Data Parameters
- `window_size=36` - Size of sliding window (days)
- `train_start_date=2017-06-02` - Start date for training period
- `train_days=346` - Training period length (June 2, 2017 to May 2018)
- `target_horizon=2` - Predicts average of next 2 days
- Date range: 2017-06-02 to present

## Expected Output

The script will output:
1. Data loading and preprocessing information
2. Gram matrix computation progress and statistics
3. SVR training progress
4. Model performance metrics (MSE, MAE, MAPE, R²)
5. Sample predictions
6. Model summary
7. Visualization plots
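
For reference, the four reported metrics can be computed as in the sketch below. The function name `regression_metrics` is hypothetical and the script may compute them differently (for example via `sklearn.metrics`); the formulas themselves are the standard definitions.

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """Return MSE, MAE, MAPE (in percent) and R^2 for two 1-D arrays."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_true - y_pred
    ss_res = np.sum(err ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return {
        "MSE": float(np.mean(err ** 2)),
        "MAE": float(np.mean(np.abs(err))),
        "MAPE": float(np.mean(np.abs(err / y_true)) * 100),  # needs y_true != 0
        "R2": float(1.0 - ss_res / ss_tot),
    }

metrics = regression_metrics([100.0, 200.0, 400.0], [110.0, 190.0, 400.0])
print(metrics)
```

MAPE divides by the true value, so it is only meaningful when targets are expressed in price units well away from zero, not on min-max normalized targets that can be 0.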

## Performance Considerations

### Data Management
- Bitcoin data is automatically cached to avoid re-downloading
- Cache expires after 7 days by default (configurable)
- Data validation ensures quality and completeness
- Multiple fallback sources for reliable downloads
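
A cache-expiry check of this kind can be sketched with the standard library alone. The helper `cache_is_fresh` and the filename below are hypothetical and only illustrate the idea; `download_bitcoin_data.py` may implement its check differently.

```python
import os
import tempfile
import time

def cache_is_fresh(path, max_age_days=7.0):
    """Return True if `path` exists and was modified within `max_age_days`."""
    if not os.path.exists(path):
        return False
    age_seconds = time.time() - os.path.getmtime(path)
    return age_seconds <= max_age_days * 86400

with tempfile.TemporaryDirectory() as tmp:
    cache_file = os.path.join(tmp, "bitcoin_data.pkl")  # hypothetical filename
    missing = cache_is_fresh(cache_file)   # no file yet
    with open(cache_file, "wb") as fh:
        fh.write(b"placeholder")
    fresh = cache_is_fresh(cache_file)     # just written
    ten_days_ago = time.time() - 10 * 86400
    os.utime(cache_file, (ten_days_ago, ten_days_ago))
    stale = cache_is_fresh(cache_file)     # older than 7 days

print(missing, fresh, stale)  # False True False
```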

### Computation Time
- Gram matrix computation is the most time-consuming part
- Scales quadratically with number of samples
- Use smaller `order` for faster computation during development

### Memory Usage
- Gram matrices can be large for many samples
- Consider using smaller datasets for initial testing
- Test script uses synthetic data and fewer samples
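
A quick estimate shows how gram matrix memory grows with the number of samples. The sketch assumes dense storage with 8 bytes per entry (float64); with 4-byte float32 entries, which is JAX's default precision, the figures halve. The function name is hypothetical.

```python
def gram_matrix_megabytes(n_rows, n_cols=None, bytes_per_entry=8):
    """Approximate size of a dense gram matrix in megabytes (10**6 bytes)."""
    n_cols = n_rows if n_cols is None else n_cols
    return n_rows * n_cols * bytes_per_entry / 1e6

for n in (300, 3_000, 30_000):
    print(f"{n:>6} samples -> {gram_matrix_megabytes(n):,.1f} MB")
```

Ten times more samples means a hundred times more memory, which is why a few hundred training windows are cheap while tens of thousands are not.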

### GPU Acceleration
- Automatically uses CUDA if available
- Falls back to CPU if no GPU is available
- JAX's JIT compilation speeds up repeated kernel evaluations after the first (compiling) call

## Hyperparameter Optimization

The current implementation uses reasonable defaults, but you can optimize:

1. **PowerSigJax order**: Try values 8, 16, 32 (higher = more accurate but slower)
2. **SVR C parameter**: Try values 0.1, 1.0, 10.0, 100.0
3. **SVR epsilon**: Try values 0.01, 0.1, 0.5
4. **Window size**: Try values 20, 36, 50 days
5. **Target horizon**: Currently 2 days, could be extended to 3-7 days
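
The candidate values above can be enumerated as a grid. This sketch only builds the list of configurations; the evaluation loop is left out because it depends on the functions in `bitcoin.py`, which are not shown here.

```python
from itertools import product

grid = {
    "order": [8, 16, 32],
    "C": [0.1, 1.0, 10.0, 100.0],
    "epsilon": [0.01, 0.1, 0.5],
}

# One dict per combination, e.g. {'order': 8, 'C': 0.1, 'epsilon': 0.01}
configs = [dict(zip(grid, values)) for values in product(*grid.values())]
print(len(configs))  # 36
print(configs[0])
```

Because the gram matrix depends on `order` (and on the window size) but not on `C` or `epsilon`, it is worth computing it once per `order` and reusing it across all SVR settings. Selecting the best configuration on a held-out validation split, not on the test period, keeps the reported test metrics honest.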

## Extensions

### Multi-feature Input
Currently uses only closing prices. Can be extended to include:
- Open, High, Low prices
- Volume data
- Technical indicators

### Dynamic Kernel
The current implementation uses static kernels. Can be extended to:
- Implement dynamic signature kernels
- Use different kernel types
- Add kernel hyperparameters

### Ensemble Methods
Can be combined with:
- Multiple window sizes
- Different feature sets
- Other kernel methods

## References

- Chris Salvi's work on signature kernels for time series prediction
- PowerSigJax implementation details
- SVR with custom kernels in scikit-learn

## Troubleshooting

### Common Issues

1. **Import Error**: Make sure PowerSigJax is properly installed
2. **CUDA Error**: Usually no action is needed, because the script falls back to CPU automatically
3. **Memory Error**: Use smaller dataset or reduce order
4. **Slow Computation**: Use test script first, reduce order

### Performance Tips

1. Start with test script to verify setup
2. Use smaller order (8) for development
3. Use synthetic data for initial testing
4. Monitor memory usage with large datasets

## License

This example is part of the PowerSigJax project and follows the same license terms. 