Starting Step 3: Heterogeneity Metrics Calculation
============================================================
Loading data from: outputs/step2_percentile_rankings_20250707_114353.csv
Data loaded successfully. Shape: (1083, 32)
Columns: ['user_id', 'age', 'gender', 'education_level', 'country', 'test_run_id', 'battery_id', 'time_of_day', 'grand_index', 'subtest_36_score', 'subtest_39_score', 'subtest_40_score', 'subtest_29_score', 'subtest_28_score', 'subtest_33_score', 'subtest_30_score', 'subtest_27_score', 'subtest_32_score', 'subtest_38_score', 'subtest_37_score', 'age_bin', 'percentile_36', 'percentile_39', 'percentile_40', 'percentile_29', 'percentile_28', 'percentile_33', 'percentile_30', 'percentile_27', 'percentile_32', 'percentile_38', 'percentile_37']
First 3 rows:
   user_id  age gender  ...  percentile_32 percentile_38  percentile_37
0    68983   50      m  ...      84.976526     40.610329      67.840376
1   106315   23      m  ...      52.839117     33.911672      60.410095
2   334338   60      m  ...      48.913043     60.144928      27.898551

[3 rows x 32 columns]
All required columns present: 32 columns validated

Unique values for key experimental design parameters:
gender: ['m' 'f']
education_level: [6 8 4 1 2 3 7 5]
country: ['US' 'NZ' 'AU' 'CA']
battery_id: [26]
time_of_day: [13 19  5 20  6 22  7  8 10  9 11 15 12 23 16 21 18 17  4  0 14  2  1  3]
age_bin: ['50-59' '18-29' '60-69' '40-49' '30-39' '70-99']

Data types for measured variables:
age: int64
grand_index: float64
subtest_36_score: float64
subtest_39_score: float64
subtest_40_score: float64
subtest_29_score: float64
subtest_28_score: float64
subtest_33_score: float64
subtest_30_score: float64
subtest_27_score: float64
subtest_32_score: float64
subtest_38_score: float64
subtest_37_score: float64
percentile_36: float64
percentile_39: float64
percentile_40: float64
percentile_29: float64
percentile_28: float64
percentile_33: float64
percentile_30: float64
percentile_27: float64
percentile_32: float64
percentile_38: float64
percentile_37: float64

Cleaning data...
Initial number of rows: 1083
Final number of rows: 1083
Excluded rows due to missing values: 0

Calculating heterogeneity metrics...
Using percentile columns: ['percentile_36', 'percentile_39', 'percentile_40', 'percentile_29', 'percentile_28', 'percentile_33', 'percentile_30', 'percentile_27', 'percentile_32', 'percentile_38', 'percentile_37']
Calculated percentile_range for 1083 participants
Calculated percentile_iqr for 1083 participants
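
Note: the log does not show how percentile_range and percentile_iqr are computed. The sketch below assumes the conventional definitions: range is the maximum minus the minimum of a participant's 11 subtest percentiles, and IQR is the 75th minus the 25th percentile of those same values. The function name `heterogeneity_metrics` and the example scores are hypothetical, and the pipeline itself may use a different quantile rule.

```python
from statistics import quantiles


def heterogeneity_metrics(percentiles):
    """Return (range, IQR) of one participant's subtest percentiles.

    Assumed definitions, not confirmed by the log:
      range = max - min
      IQR   = Q3 - Q1
    """
    values = sorted(percentiles)
    pct_range = values[-1] - values[0]
    # method="inclusive" interpolates linearly between order statistics,
    # the same rule NumPy and pandas apply by default for quantiles.
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return pct_range, q3 - q1


# Hypothetical participant with 11 percentile scores (not taken from the log).
example = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 70.0, 80.0, 90.0, 95.0, 99.0]
pct_range, pct_iqr = heterogeneity_metrics(example)
print(f"percentile_range={pct_range:.3f}, percentile_iqr={pct_iqr:.3f}")
```

Applied row by row to the 11 percentile columns listed above, a function like this would yield one range and one IQR value per participant, consistent with the two columns the step adds to the output.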

Summary statistics for heterogeneity metrics:
Percentile Range:
  Mean: 74.643
  Std: 13.665
  Min: 20.000
  Max: 98.913
Percentile IQR:
  Mean: 34.059
  Std: 11.603
  Min: 5.836
  Max: 73.438

Validating discriminant validity...
Correlation between percentile_range and grand_index: r = 0.0121, p = 0.6905
Correlation between percentile_iqr and grand_index: r = 0.0633, p = 0.0374
Discriminant validity for percentile_range: Met (|r| = 0.0121)
Discriminant validity for percentile_iqr: Met (|r| = 0.0633)

Validation results:
        metric_name  ...  discriminant_validity_met
0  percentile_range  ...                       True
1    percentile_iqr  ...                       True

[2 rows x 4 columns]

Saving output files...
Main output saved to: outputs/step3_heterogeneity_metrics_20250707_115126.csv
Main output shape: (1083, 34)
Validation results saved to: outputs/step3_validation_results_20250707_115126.csv
Validation results shape: (2, 4)

First 2 rows of main output:
Columns: ['user_id', 'age', 'gender', 'education_level', 'country', 'test_run_id', 'battery_id', 'time_of_day', 'grand_index', 'subtest_36_score', 'subtest_39_score', 'subtest_40_score', 'subtest_29_score', 'subtest_28_score', 'subtest_33_score', 'subtest_30_score', 'subtest_27_score', 'subtest_32_score', 'subtest_38_score', 'subtest_37_score', 'age_bin', 'percentile_36', 'percentile_39', 'percentile_40', 'percentile_29', 'percentile_28', 'percentile_33', 'percentile_30', 'percentile_27', 'percentile_32', 'percentile_38', 'percentile_37', 'percentile_range', 'percentile_iqr']
   user_id  age gender  ...  percentile_37 percentile_range  percentile_iqr
0    68983   50      m  ...      67.840376        44.366197       18.779343
1   106315   23      m  ...      60.410095        76.025237       32.255521

[2 rows x 34 columns]

Validation results:
Columns: ['metric_name', 'correlation_with_grand_index', 'correlation_p_value', 'discriminant_validity_met']
        metric_name  ...  discriminant_validity_met
0  percentile_range  ...                       True
1    percentile_iqr  ...                       True

[2 rows x 4 columns]

============================================================
Step 3 completed successfully!