[
    {
        "problem_id": 2432,
        "domain": [
            "Mathematics -> Discrete Mathematics -> Combinatorics"
        ],
        "difficulty": 6.0,
        "problem_text": "Let $S$ be a set of intervals defined recursively as follows: Initially, $[1,1000]$ is the only interval in $S$. If $l \\neq r$ and $[l, r] \\in S$, then both $\\left[l,\\left\\lfloor\\frac{l+r}{2}\\right\\rfloor\\right],\\left[\\left\\lfloor\\frac{l+r}{2}\\right\\rfloor+1, r\\right] \\in S$. An integer $i$ is chosen uniformly at random from the range $[1,1000]$. What is the expected number of intervals in $S$ which contain $i$?",
        "sample_id": 1,
        "final_answer": "The final answer is \\boxed{10.976}",
        "steps": [
            {
                "step_id": 1,
                "edge": "We apply the linearity of expectation, a fundamental principle in probability theory that allows decomposing the expected value of a sum into the sum of expected values. Here, the random variable counting intervals containing i is the sum over all intervals I in S of the indicator variable for i∈I. The expectation of each indicator is P(i∈I), so the total expectation equals the sum of P(i∈I) over all I in S. This approach simplifies the problem by shifting focus from the random integer i to the fixed intervals in S.",
                "direct_dependent_steps": null,
                "node": "The expected number of intervals containing a uniformly random integer i equals the sum over all I in S of P(i∈I)."
            },
            {
                "step_id": 2,
                "edge": "Given that i is chosen uniformly at random from [1,1000], the probability that i falls within any interval I is proportional to I's length. Specifically, for an interval I of length |I| (where length is defined as r−l+1 for [l,r]), there are exactly |I| integers in I, so P(i∈I) = |I|/1000. This follows directly from the definition of uniform probability over a discrete set of integers.",
                "direct_dependent_steps": null,
                "node": "For an interval I of length |I|, P(i∈I)=|I|/1000."
            },
            {
                "step_id": 3,
                "edge": "Combining Step 1 and Step 2, the expected number of intervals containing i is the sum over all I in S of |I|/1000. Factoring out the constant denominator, this equals (sum of |I| for all I in S) divided by 1000. This rephrasing is crucial because it transforms the probabilistic problem into a combinatorial one: computing the total length of all intervals in S.",
                "direct_dependent_steps": [
                    1,
                    2
                ],
                "node": "Therefore the expected number equals the sum of the lengths of all intervals in S divided by 1000."
            },
            {
                "step_id": 4,
                "edge": "The problem statement explicitly defines the initial state of S: it begins with only the interval [1,1000]. This serves as the base case for the recursive construction of S and establishes the starting point for analyzing the interval set.",
                "direct_dependent_steps": null,
                "node": "The set S initially contains only the interval [1,1000]."
            },
            {
                "step_id": 5,
                "edge": "The recursive rule for expanding S is given directly in the problem: any interval [l,r] in S with l≠r (i.e., length greater than 1) is replaced by two subintervals formed by splitting at the midpoint ⌊(l+r)/2⌋. This defines the entire structure of S through repeated application and is essential for understanding how intervals evolve.",
                "direct_dependent_steps": null,
                "node": "If an interval [l,r] with l≠r is in S then [l,⌊(l+r)/2⌋] and [⌊(l+r)/2⌋+1,r] are also in S."
            },
            {
                "step_id": 6,
                "edge": "Using Step 4, the initial interval [1,1000] has length calculated as 1000−1+1=1000. This arithmetic follows the standard definition of interval length for integer endpoints and quantifies the starting size of the single interval in S.",
                "direct_dependent_steps": [
                    4
                ],
                "node": "The initial interval [1,1000] has length 1000."
            },
            {
                "step_id": 7,
                "edge": "From Step 5, when splitting [l,r] with l≠r, the left child is [l, ⌊(l+r)/2⌋] and the right child is [⌊(l+r)/2⌋+1, r]. The length of the left child is ⌊(l+r)/2⌋ − l + 1 = ⌊(r−l+1)/2⌋ = ⌊n/2⌋ where n = r−l+1 is the parent length. Similarly, the right child length is r − (⌊(l+r)/2⌋+1) + 1 = ⌈n/2⌉. This shows how splitting preserves the integer nature of lengths through floor and ceiling operations.",
                "direct_dependent_steps": [
                    5
                ],
                "node": "Whenever an interval of length n>1 is split, its two children have lengths ⌊n/2⌋ and ⌈n/2⌉ respectively."
            },
            {
                "step_id": 8,
                "edge": "Building on Step 7, for a parent interval of length n>1, the sum of the children's lengths is ⌊n/2⌋ + ⌈n/2⌉. Since ⌊n/2⌋ + ⌈n/2⌉ = n for any integer n (a basic property of floor and ceiling functions), this sum equals the parent length n. This conservation of total length is critical for tracking the aggregate size of S during splits.",
                "direct_dependent_steps": [
                    7
                ],
                "node": "The sum of the lengths of those two children equals the parent length n."
            },
            {
                "step_id": 9,
                "edge": "We introduce a systematic way to categorize intervals in S by the number of splits applied to reach them. The k-splits are defined as all intervals generated exactly at the k-th recursive split step. This partitioning allows us to analyze S in layers, where k=0 corresponds to the initial interval, k=1 to the first split's results, and so on.",
                "direct_dependent_steps": null,
                "node": "Define the k-splits to be the set of all intervals obtained after exactly k recursive splits."
            },
            {
                "step_id": 10,
                "edge": "Step 9 defines the 0-splits as the initial interval set. Step 6 gives the length of this single interval as 1000. Therefore, the total length of the 0-splits is 1000. This establishes the baseline total length for k=0.",
                "direct_dependent_steps": [
                    6,
                    9
                ],
                "node": "The sum of lengths of the 0-splits is 1000."
            },
            {
                "step_id": 11,
                "edge": "From Step 6, the initial interval length is 1000, which is clearly greater than 1. This confirms that the initial interval satisfies the splitting condition (l≠r) in Step 5, ensuring the recursion can begin.",
                "direct_dependent_steps": [
                    6
                ],
                "node": "The initial interval has length greater than 1."
            },
            {
                "step_id": 12,
                "edge": "Step 11 confirms the initial interval (0-splits) has length >1, so it splits per Step 5. Step 8 guarantees that splitting preserves total length, so the sum for the 1-splits (the children) equals the parent's length. Step 10 states the 0-splits sum to 1000, and Step 9 identifies the 1-splits as the result of the first split. Thus, the 1-splits also sum to 1000.",
                "direct_dependent_steps": [
                    8,
                    9,
                    10,
                    11
                ],
                "node": "Therefore the sum of lengths of the 1-splits equals 1000."
            },
            {
                "step_id": 13,
                "edge": "Step 9 defines k-splits for any k, so we consider k in the specific range 1≤k≤8. This range is chosen because intervals remain splittable (length >1) up to k=8, as verified later. Restricting k here sets up the induction for Step 15.",
                "direct_dependent_steps": [
                    9
                ],
                "node": "Let k be any integer with 1≤k≤8."
            },
            {
                "step_id": 14,
                "edge": "Given k in [1,8] (Step 13), the k-splits consist of 2^k intervals (by recursive doubling from the initial interval). The total length of these intervals is 1000 (as established for k=0,1 in Steps 10,12 and extended by Step 15). Since 2^k ≤ 256 for k≤8 and 256 < 1000, the average interval length exceeds 1. Moreover, because lengths are integers, the minimal possible length is at least 2 (e.g., for k=8, 1000/256 ≈ 3.9, so minimal length is 3). Thus, every interval in the k-splits has length >1, satisfying the splitting condition in Step 5.",
                "direct_dependent_steps": [
                    13
                ],
                "node": "Every interval in the k-splits has length greater than 1."
            },
            {
                "step_id": 15,
                "edge": "Step 14 ensures every interval in the k-splits (for k in [1,8]) has length >1, so all split per Step 5. Step 8 states that splitting any such interval preserves its length in the children. Step 9 defines the (k+1)-splits as the set of children from splitting the k-splits. Therefore, the total length of the (k+1)-splits equals the total length of the k-splits, as each parent's length is fully distributed to its children.",
                "direct_dependent_steps": [
                    8,
                    9,
                    14
                ],
                "node": "Therefore the sum of lengths of the (k+1)-splits equals the sum of lengths of the k-splits."
            },
            {
                "step_id": 16,
                "edge": "Step 12 shows the sum is 1000 for k=0 and k=1. Step 15 provides the inductive step: if the sum is 1000 for k, it is 1000 for k+1, for k from 1 to 8. Combining these, the sum remains 1000 for all k from 0 to 9 (10 layers: k=0 to k=9). This conservation holds because intervals only stop splitting when length=1, which first occurs at k=9 for some intervals.",
                "direct_dependent_steps": [
                    12,
                    15
                ],
                "node": "Hence the sum of lengths of the k-splits equals 1000 for all k from 0 to 9."
            },
            {
                "step_id": 17,
                "edge": "Step 7 describes the splitting rule: for an interval of length n>1, children have lengths ⌊n/2⌋ and ⌈n/2⌉. When n=2, ⌊2/2⌋=1 and ⌈2/2⌉=1, so both children have length 1. This specific case is vital for handling the final splits where intervals reduce to unit length.",
                "direct_dependent_steps": [
                    7
                ],
                "node": "At the 10th split, intervals of length 2 split into two intervals of length 1."
            },
            {
                "step_id": 18,
                "edge": "Step 5 states splitting occurs only when l≠r. For an interval of length 1, l=r, so it does not split further. This halting condition explains why intervals of length 1 persist in S and are not subdivided, anchoring the recursion.",
                "direct_dependent_steps": [
                    5
                ],
                "node": "Intervals of length 1 do not split further."
            },
            {
                "step_id": 19,
                "edge": "Step 9 defines k-splits, and Step 14 confirms all intervals in splits up to k=8 are splittable (length>1). Starting from one interval at k=0, each split doubles the count. After 9 splits, the number of intervals is 2^9=512. This exponential growth follows directly from the binary splitting process applied uniformly to all intervals at each step.",
                "direct_dependent_steps": [
                    9,
                    14
                ],
                "node": "The number of intervals in the 9-splits is 2^9."
            },
            {
                "step_id": 20,
                "edge": "Step 16 explicitly states that for k=9, the sum of lengths of the 9-splits is 1000. This is consistent with the conservation principle established for k≤9, as Step 16 covers all k from 0 to 9.",
                "direct_dependent_steps": [
                    16
                ],
                "node": "The total sum of lengths of the 9-splits is 1000."
            },
            {
                "step_id": 21,
                "edge": "To analyze the composition of the 9-splits, we define x as the count of intervals with length exactly 2. This variable helps decompose the total length and interval count, as intervals in the 9-splits can only be length 1 or 2 (since 512 intervals summing to 1000 implies no longer intervals are possible).",
                "direct_dependent_steps": null,
                "node": "Let x be the number of intervals of length 2 in the 9-splits."
            },
            {
                "step_id": 22,
                "edge": "Similarly, we define y as the count of intervals with length exactly 1 in the 9-splits. This complements Step 21, as together x and y account for all intervals (since lengths must be 1 or 2 given the total length and count constraints).",
                "direct_dependent_steps": null,
                "node": "Let y be the number of intervals of length 1 in the 9-splits."
            },
            {
                "step_id": 23,
                "edge": "Step 19 gives the total number of intervals in the 9-splits as 2^9=512. Steps 21 and 22 define x and y as the counts of length-2 and length-1 intervals, respectively. Since every interval in the 9-splits has length 1 or 2, the total count is x+y, so x+y=512.",
                "direct_dependent_steps": [
                    19,
                    21,
                    22
                ],
                "node": "We have x+y=2^9."
            },
            {
                "step_id": 24,
                "edge": "Step 20 states the total length of the 9-splits is 1000. Steps 21 and 22 specify that x intervals contribute 2 units each and y intervals contribute 1 unit each to the total length. Therefore, 2x + y = 1000, representing the aggregate length constraint.",
                "direct_dependent_steps": [
                    20,
                    21,
                    22
                ],
                "node": "We also have 2x+y=1000."
            },
            {
                "step_id": 25,
                "edge": "From Step 23, y = 512 − x. Substituting this into Step 24's equation 2x + y = 1000 yields 2x + (512 − x) = 1000. Simplifying gives x + 512 = 1000, which isolates the variable x for solving. This algebraic substitution is standard for solving systems of linear equations.",
                "direct_dependent_steps": [
                    23,
                    24
                ],
                "node": "Substituting y=2^9−x into 2x+y=1000 gives 2x+2^9−x=1000."
            },
            {
                "step_id": 26,
                "edge": "Solving Step 25's equation x + 512 = 1000 by subtracting 512 from both sides gives x = 1000 − 512. This straightforward arithmetic resolves the count of length-2 intervals in the 9-splits.",
                "direct_dependent_steps": [
                    25
                ],
                "node": "Solving yields x=1000−2^9."
            },
            {
                "step_id": 27,
                "edge": "Step 26 gives x = 1000 − 2^9. Computing 2^9 = 512, so x = 1000 − 512 = 488. Sanity check: 488 × 2 = 976, and 512 − 488 = 24 intervals of length 1 would give 976 + 24 = 1000, matching Step 20.",
                "direct_dependent_steps": [
                    26
                ],
                "node": "Since 2^9=512 we have x=488."
            },
            {
                "step_id": 28,
                "edge": "From Step 23, y = 2^9 − x. Step 27 gives x = 488 and 2^9 = 512, so y = 512 − 488 = 24. This counts the intervals of length 1 in the 9-splits, verified by 488 × 2 + 24 × 1 = 976 + 24 = 1000.",
                "direct_dependent_steps": [
                    23,
                    27
                ],
                "node": "Then y=2^9−488=24."
            },
            {
                "step_id": 29,
                "edge": "Step 17 states that intervals of length 2 split into two length-1 intervals. Step 27 identifies 488 such intervals in the 9-splits, and Step 28 confirms the remaining 24 intervals have length 1 (which do not split per Step 18). Thus, only the 488 length-2 intervals split to produce the 10-splits, each yielding two intervals of length 1.",
                "direct_dependent_steps": [
                    17,
                    27,
                    28
                ],
                "node": "Each of the 488 intervals of length 2 in the 9-splits produces two intervals of length 1 in the 10-splits."
            },
            {
                "step_id": 30,
                "edge": "Step 18 confirms length-1 intervals do not split, so they do not contribute to the 10-splits. Step 29 states each of the 488 length-2 intervals produces two length-1 intervals in the 10-splits. Each new interval has length 1, so the total length is 488 × 2 × 1 = 976. This is consistent: 488 splits generate 976 unit-length intervals.",
                "direct_dependent_steps": [
                    18,
                    29
                ],
                "node": "Therefore the sum of lengths of the 10-splits is 2×488=976."
            },
            {
                "step_id": 31,
                "edge": "Step 16 shows the sum of lengths is 1000 for each of the 10 splits (k=0 to k=9). Step 30 gives the sum for the 10-splits (k=10) as 976. Therefore, the total length across all intervals in S is the sum over k=0 to 10: 10 × 1000 + 976 = 10,976. This aggregates the lengths from every layer of splits.",
                "direct_dependent_steps": [
                    16,
                    30
                ],
                "node": "The total sum of lengths across all splits is 10×1000+976=10976."
            },
            {
                "step_id": 32,
                "edge": "Step 3 states the expected number equals the total length of all intervals in S divided by 1000. Step 31 computes this total length as 10,976. Thus, 10,976 / 1000 = 10.976. This final division converts the combinatorial sum into the desired probabilistic expectation, yielding the answer.",
                "direct_dependent_steps": [
                    3,
                    31
                ],
                "node": "Dividing by 1000 yields the expected number 10.976."
            }
        ]
    }
]
