# LLM-Ellipse-Attack

## Scripts

- `scripts/compare_pythia_ellipses.py`: Compare ellipse parameters across Pythia snapshots; writes Overleaf data files for bias and stretch.
- `scripts/cost_est.py`: Estimate sampling/image costs for various models and emit a LaTeX table (`models.tex`).
- `scripts/decompress_logits.py`: Decompress logits from a compressed NPZ (matrix multiply + softmax) and save logprobs.
- `scripts/disguise_logprobs.py`: Recover surrogate hidden vectors that reproduce target probabilities using LBFGS; saves `disguised_logprobs.npy`.
- `scripts/experiment/generate_outputs.py`: Generate last-token logprobs for a model on a dataset and save to `data/outputs/`.
- `scripts/experiment/run_experiment_fast.py`: Compute ellipses directly from saved model params and tabulate distances (experimental).
- `scripts/experiment/save_models.py`: Extract final-layer params (unembed, norm weight/bias) for several HF models into `data/model/*.npz`.
- `scripts/generate_random_proj.py`: Save a random column-subset projection matrix of shape `(in_size, out_size)` to `data/random_proj.npy`.
- `scripts/hidden_recovery_error.py`: Evaluate hidden-state recovery from logprobs and a predicted ellipse; includes Procrustes alignment baseline.
- `scripts/linear_compress.py`: Low‑rank compress a matrix from stdin, learn a decompressor, verify, and save `compressed` + `decompressor` NPZ to stdout.
- `scripts/logprobs_of_logits.py`: Convert logits from stdin NPZ to logprobs and write a `.npy` array to stdout.
- `scripts/lstsq_fitting/*`: Deprecated least‑squares ellipse fitting utilities (multi‑step pipeline: save logits/params, fit coeffs1/2/3, analyze).
- `scripts/mode_stats.sh`: Quick token statistics for two token lists using the system wordlist.
- `scripts/model_identification.py`: Test whether an estimated ellipse discriminates a target model’s outputs from alternatives; saves error histograms/data.
- `scripts/nanoGPT/nanoGPT.py`: Example loader for a NanoGPT checkpoint and text generation (experimental/incomplete).
- `scripts/nl_pred_multi_seed.sh`: Run `time_ellipse_solving.py` for multiple seeds and sample sizes; saves ellipse predictions under `data/.../ellipse_pred/`.
- `scripts/openai_batch_inference/cancel.py`: Cancel submitted OpenAI Batch jobs listed in `data/batch_log.jsonl`.
- `scripts/openai_batch_inference/make_batch_file.py`: Build JSONL Batch request files to query many single‑token completions with `logprobs=True`.
- `scripts/openai_batch_inference/send_requests.py`: Submit JSONL Batch files to OpenAI and append job metadata to `data/batch_log.jsonl`.
- `scripts/openai_batch_inference/status.py`: Summarize Batch request progress by reading `data/batch_log.jsonl` and polling statuses.
- `scripts/sample_from_model.py`: Run batched forward passes to save final hidden and prenorm activations for a model/dataset to `data/.../outputs.npz`.
- `scripts/save_hidden_size_centered_norms.py`: Compute normalized hidden‑state norms, export Overleaf data files; utilities for mixture modeling (commented out).
- `scripts/save_tinystories_model.py`: Extract final‑layer params from a TinyStories model and save to `data/model/*.npz` (experimental; may require fixes).
- `scripts/tab/ellipse_estimation_errors.py`: Transform `data/error_data.pkl` into a compact table of ellipse estimation errors.
- `scripts/time_ellipse_solving.py`: Solve for ellipse params from hidden→logprobs, optionally with down‑projection/random sampling; saves NPZ + timing.
- `scripts/utils.py`: Small helpers (device selection, argument parsing, second‑order term expansion).
- `scripts/validate_ellipse_pred.py`: Compare predicted ellipse params against the true model‑derived ellipse and save error summaries.
- `scripts/viz/sample_size.py`: Build Overleaf table/data for error vs. sample size from `data/error_data.pkl`.
- `scripts/ying.py`: Synthetic validation of Ying’s method via CVXPY; fits ellipse on generated data and checks factorization properties.

Notes:
- Bytecode caches under `scripts/__pycache__/` are tracked but are not scripts and are excluded from the list above.
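
Several scripts above (`decompress_logits.py`, `logprobs_of_logits.py`) turn logits into logprobs. As a rough illustration of that step, here is a minimal NumPy sketch. The function names, the array sizes, and the `compressed @ decompressor` layout are assumptions made for this example and are not taken from the scripts themselves.

```python
import numpy as np


def logprobs_from_logits(logits: np.ndarray) -> np.ndarray:
    """Row-wise log-softmax, made numerically stable by subtracting the row max."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))


def decompress_logprobs(compressed: np.ndarray, decompressor: np.ndarray) -> np.ndarray:
    """Hypothetical decompression: low-rank codes -> logits (matmul) -> logprobs."""
    return logprobs_from_logits(compressed @ decompressor)


rng = np.random.default_rng(0)
compressed = rng.normal(size=(4, 3))     # 4 samples with rank-3 codes (made-up sizes)
decompressor = rng.normal(size=(3, 10))  # maps codes to a 10-token vocabulary
logprobs = decompress_logprobs(compressed, decompressor)

# Each row should exponentiate to a probability distribution.
print(np.exp(logprobs).sum(axis=-1))
```

Subtracting the row maximum before exponentiating leaves the result unchanged mathematically but avoids overflow for large logits.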

## To do

- [ ] Update cost estimates

## Extracting the ellipse of a real LM

We can isolate a subset of next-token distributions whose norms after centering have low variance.
Does solving on this subset give smaller errors in the recovered bias and singular values?
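
For context, the following is a minimal sketch of why final-layer logits lie on an ellipse with a bias (center) and a set of stretch factors (singular values). It assumes a standard LayerNorm with `eps` ignored, followed by a bias-free unembedding; the sizes, the variable names, and the random parameters are all made up for illustration and do not come from the repository.

```python
import numpy as np

rng = np.random.default_rng(0)
d, vocab, n = 8, 20, 100  # hidden size, vocabulary size, sample count (made up)

W = rng.normal(size=(vocab, d))          # unembedding matrix
gamma = rng.uniform(0.5, 1.5, size=d)    # LayerNorm weight
beta = rng.normal(size=d)                # LayerNorm bias


def layer_norm(x: np.ndarray) -> np.ndarray:
    """LayerNorm without eps: center, scale to unit variance, then affine."""
    centered = x - x.mean(axis=-1, keepdims=True)
    scale = np.sqrt((centered ** 2).mean(axis=-1, keepdims=True))
    return gamma * (centered / scale) + beta


hidden = rng.normal(size=(n, d))
logits = layer_norm(hidden) @ W.T

# logits = A z + bias, where z is mean-zero with norm sqrt(d).
A = W * gamma                 # W @ diag(gamma)
bias = W @ beta               # center of the ellipse
P = np.eye(d) - 1.0 / d       # centering projection I - 11^T / d
stretch = np.sqrt(d) * np.linalg.svd(A @ P, compute_uv=False)  # semi-axis lengths

# Undo the linear map: every sample lands on a sphere of radius sqrt(d).
z = (logits - bias) @ np.linalg.pinv(A).T
radii = np.linalg.norm(z, axis=-1)
print(radii.min(), radii.max(), np.sqrt(d))
```

The last entry of `stretch` is numerically zero because centering removes one direction, so under these assumptions the ellipse has `d - 1` nontrivial axes.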

```sh
# Generate the samples
python scripts/sample_from_model.py

# Filter for outputs with low-variance norms.
python scripts/save_hidden_size_centered_norms.py

# Solve for the ellipse with various numbers of outputs
# and save the solve times.
# The optional `--filter` flag uses the low-variance outputs.
python scripts/time_ellipse_solving.py [--filter]

# Evaluate the goodness of fit
# The `--filter` flag uses the low-variance outputs
python scripts/validate_ellipse_pred.py [--filter]
```
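
The filtering step could look roughly like the following. This is a sketch under assumptions, not the logic of `save_hidden_size_centered_norms.py`: it reads "centered norm" as the norm of a pre-norm activation after subtracting its own mean, and it keeps the samples whose norm is closest to the median. The function names and `keep_fraction` are hypothetical.

```python
import numpy as np


def centered_norms(prenorm: np.ndarray) -> np.ndarray:
    """Norm of each activation after subtracting its own mean over the hidden dimension."""
    centered = prenorm - prenorm.mean(axis=-1, keepdims=True)
    return np.linalg.norm(centered, axis=-1)


def low_variance_mask(norms: np.ndarray, keep_fraction: float = 0.5) -> np.ndarray:
    """Boolean mask keeping the samples whose norm is closest to the median norm."""
    deviation = np.abs(norms - np.median(norms))
    cutoff = np.quantile(deviation, keep_fraction)
    return deviation <= cutoff


rng = np.random.default_rng(0)
# Synthetic stand-in for saved pre-norm activations, with varying per-sample scale.
prenorm = rng.normal(size=(1000, 64)) * rng.uniform(0.5, 2.0, size=(1000, 1))

norms = centered_norms(prenorm)
mask = low_variance_mask(norms, keep_fraction=0.5)
print(mask.sum(), norms.var(), norms[mask].var())
```

Selecting around the median rather than the mean keeps the filter insensitive to a few very large norms.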

## Deprecated files

- `scripts/lstsq_fitting`: Loads samples from a model and computes the ellipse in several least‑squares steps. Deprecated.
