# Cache Eviction Strategies

A cache eviction policy determines which items `vCache` removes when the cache reaches its capacity. The choice of policy significantly affects cache performance because it decides which stored information is retained. `vCache` supports both traditional heuristics (FIFO, LRU, MRU) and a novel, statistically driven strategy (SCU).


## First-In, First-Out (FIFO)

The FIFO policy is the most straightforward eviction method. It evicts items strictly in the order they were added, regardless of how often or how recently they have been accessed. This is achieved by sorting all cached items by their `created_at` timestamp and removing the oldest entries. FIFO is simple, but because it ignores access patterns it can evict older items that are still frequently used.
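The sort-and-trim step described above can be sketched as follows. This is a minimal illustration, not the `vCache` implementation: the `CacheItem` class and the `select_fifo_victims` function are hypothetical names, and only the `created_at` field comes from the text.

```python
from dataclasses import dataclass


@dataclass
class CacheItem:
    # Hypothetical stand-in for a cached entry; only `created_at` is from the docs.
    key: str
    created_at: float


def select_fifo_victims(items: list[CacheItem], num_victims: int) -> list[CacheItem]:
    """Return the `num_victims` oldest items by creation time."""
    if num_victims <= 0:
        return []
    # sorted() is stable, so items with equal timestamps keep their input order.
    return sorted(items, key=lambda item: item.created_at)[:num_victims]


items = [
    CacheItem("b", created_at=20.0),
    CacheItem("a", created_at=10.0),
    CacheItem("c", created_at=30.0),
]
victims = select_fifo_victims(items, num_victims=2)
print([v.key for v in victims])  # the two oldest entries: ['a', 'b']
```

When only a few victims are needed from a large cache, `heapq.nsmallest` with the same key function would avoid sorting the whole population.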



## Least Recently Used (LRU)

The LRU policy operates on the assumption that items that have not been used recently are less likely to be used in the near future. When eviction is necessary, LRU removes the items that have the oldest `last_accessed` timestamp. This strategy can be effective for access patterns where recency is a good predictor of future use.



## Most Recently Used (MRU)

Conversely, the MRU policy evicts the items that have been accessed most recently. This seemingly counterintuitive approach is highly effective for workloads involving iterative scans over large datasets that do not fit in the cache. In such cases, once an item is accessed, it is often not needed again for a long time, making the most recently used item the best candidate for eviction.
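To make the scan argument concrete, here is a toy simulation that compares LRU and MRU on a cyclic scan over four keys with room for only three. The `simulate` function and its logical-clock bookkeeping are illustrative assumptions, not part of `vCache`; they mirror only the `last_accessed` idea described above.

```python
def simulate(policy: str, capacity: int, accesses: list[int]) -> int:
    """Count cache hits for a toy cache that evicts by 'lru' or 'mru'."""
    if policy not in ("lru", "mru"):
        raise ValueError(f"unknown policy: {policy}")
    last_accessed: dict[int, int] = {}  # key -> logical time of last access
    hits = 0
    for tick, key in enumerate(accesses):
        if key in last_accessed:
            hits += 1
        elif len(last_accessed) >= capacity:
            # LRU evicts the smallest timestamp, MRU the largest.
            pick = min if policy == "lru" else max
            victim = pick(last_accessed, key=last_accessed.get)
            del last_accessed[victim]
        last_accessed[key] = tick
    return hits


# Three passes over keys 0..3 with room for only three of them.
scan = list(range(4)) * 3
lru_hits = simulate("lru", capacity=3, accesses=scan)
mru_hits = simulate("mru", capacity=3, accesses=scan)
print(lru_hits, mru_hits)
```

On this workload LRU always evicts exactly the key that the scan needs next, so it never hits, while MRU keeps most of the working set in place and hits on six of the twelve accesses.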



## Sky Confident Utility (SCU)

Traditional cache eviction policies, such as LRU and FIFO, rely purely on temporal heuristics (insertion time or access time). They do not consider the learned, per-item performance characteristics available in `vCache`, which makes them a poor fit for a cache that learns statistics for each item.

The `SkyEvictionPolicy` is a novel eviction strategy designed to leverage the statistical metadata generated by the `VerifiedDecisionPolicy`. It recasts the eviction problem as a multi-objective optimization task, selecting victims based on their globally competitive utility rather than simple heuristics.

### Multi-Objective Optimization Framework

The policy models each cached item as a point in a two-dimensional performance space, with the goal of optimizing two competing objectives:

1.  **Generality ($1 - t'_{prime}$):** The capacity of an item to correctly serve a wide range of queries. This is inversely related to the learned similarity threshold, $t'_{prime}$.
2.  **Confidence ($n_{obs}$):** The statistical confidence in the item's learned parameters, represented by the number of observations, $n_{obs}$.

This framework treats eviction as a global, competitive process where an item's utility is relative to the entire cache population.

### Pareto-Optimal Selection

The `SkyEvictionPolicy` identifies the most valuable items by determining the **Pareto Frontier** of the cache population. An item is dominated if some other item is at least as good on both generality and confidence and strictly better on at least one of them. An item is Pareto-optimal if no other item in the cache dominates it.

These non-dominated items represent the optimal trade-off between the two objectives and are preserved during eviction. Conversely, items that are "most dominated"—i.e., furthest from this frontier—are prioritized for removal.
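A minimal sketch of the dominance test and frontier computation follows. It assumes each item is represented as a `(generality, confidence)` pair in which higher is better on both axes; the function names and the sample values are hypothetical and do not come from the `vCache` code base.

```python
Point = tuple[float, float]  # (generality = 1 - t_prime, confidence = n_obs)


def dominates(a: Point, b: Point) -> bool:
    """True if `a` is at least as good as `b` on both axes and better on one."""
    return a[0] >= b[0] and a[1] >= b[1] and (a[0] > b[0] or a[1] > b[1])


def pareto_frontier(points: dict[str, Point]) -> set[str]:
    """Return the keys of all non-dominated items (simple O(n^2) scan)."""
    return {
        key
        for key, point in points.items()
        if not any(dominates(other, point) for other in points.values())
    }


population = {
    "a": (0.9, 5),   # very general, few observations
    "b": (0.6, 40),  # less general, many observations
    "c": (0.5, 30),  # dominated by "b"
    "d": (0.2, 3),   # dominated by "a"
}
frontier = pareto_frontier(population)
print(sorted(frontier))  # ['a', 'b']
```

The quadratic scan is fine for illustration; a sort-based sweep would be the usual choice for large populations.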

### Utility Calculation and Victim Selection

Victim selection is based on a utility score derived from an item's normalized distance to a theoretical **Ideal Point**. In $(t'_{prime}, n'_{obs})$ coordinates this point is $p_{ideal} = (0, 1)$, which represents maximum generality ($t'_{prime}=0$) and maximum confidence ($n'_{obs}=1$).

#### 1. Normalization

To put both objectives on a comparable $[0, 1]$ scale, the observation count of each item $i$ is divided by the largest observation count in the cache:

$n'_{obs,i} = \frac{n_{obs,i}}{\max(n_{obs})}$

The generality metric, based on $t'_{prime}$, is already within the $[0, 1]$ range.

#### 2. Utility Score

The utility of an item decreases as its Euclidean distance from the Ideal Point grows. We define this distance, $D_i$, as:

$D_i = \sqrt{(t'_{prime,i})^2 + (n'_{obs,i} - 1)^2}$

#### 3. Victim Selection

Items with the largest distance $D_i$ have the lowest utility and are selected as victims for eviction.
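The three steps above can be sketched end to end as follows. This is an illustration under stated assumptions rather than the actual `SkyEvictionPolicy` code: `ItemStats`, `distances_to_ideal`, and `select_victims` are hypothetical names, and treating every normalized count as 0 when no item has any observations is an assumption, since the formula is undefined in that case.

```python
import math
from dataclasses import dataclass


@dataclass
class ItemStats:
    # Hypothetical container for the per-item statistics named in the text.
    key: str
    t_prime: float  # learned similarity threshold, assumed to lie in [0, 1]
    n_obs: int      # number of observations


def distances_to_ideal(items: list[ItemStats]) -> dict[str, float]:
    """Step 1 and 2: normalize n_obs, then compute D_i to the point (0, 1)."""
    max_obs = max((item.n_obs for item in items), default=0)
    result = {}
    for item in items:
        # Assumption: with no observations anywhere, use 0 as normalized count.
        n_norm = item.n_obs / max_obs if max_obs > 0 else 0.0
        result[item.key] = math.hypot(item.t_prime, n_norm - 1.0)
    return result


def select_victims(items: list[ItemStats], num_victims: int) -> list[str]:
    """Step 3: return the keys with the largest distance, farthest first."""
    dist = distances_to_ideal(items)
    ranked = sorted(items, key=lambda item: dist[item.key], reverse=True)
    return [item.key for item in ranked[: max(num_victims, 0)]]


stats = [
    ItemStats("q1", t_prime=0.1, n_obs=100),  # D = 0.1
    ItemStats("q2", t_prime=0.8, n_obs=100),  # D = 0.8
    ItemStats("q3", t_prime=0.3, n_obs=50),   # D = sqrt(0.09 + 0.25)
    ItemStats("q4", t_prime=0.6, n_obs=10),   # D = sqrt(0.36 + 0.81)
]
print(select_victims(stats, num_victims=2))  # ['q4', 'q2']
```

Because the normalization depends on the current maximum observation count, the distances have to be recomputed whenever that maximum changes.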

### Justification

This distance-based utility model provides a robust, data-driven foundation for eviction. Its advantages include:

-   **Global Ranking:** It evaluates items based on their performance relative to the entire cache population, not on isolated metrics.
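-   **Pareto-Implicit:** Because $D_i$ shrinks whenever either objective improves, a dominated item is always farther from the Ideal Point than the item that dominates it. The ranking therefore favors items on or near the Pareto frontier without computing the frontier explicitly.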
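-   **Balanced Penalization:** The model distinguishes between:
    -   **Proven Losers** (high $t'_{prime}$, high $n_{obs}$), for which there is ample evidence of poor generality.
    -   **Suspected Losers** (high $t'_{prime}$, low $n_{obs}$), which the policy intends to treat with less prejudice so that they can accumulate more data. Note that $D_i$ as defined above grows as $n'_{obs}$ falls, so at equal $t'_{prime}$ a Suspected Loser lies farther from the Ideal Point than a Proven Loser; this leniency does not follow from the distance alone.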

By aligning the eviction criteria with the statistical outputs of the caching policy, the `SkyEvictionPolicy` ensures that the most valuable and reliable information is preserved.
