[
    {
        "problem_id": 1676,
        "domain": [
            "Mathematics -> Discrete Mathematics -> Combinatorics"
        ],
        "difficulty": 4.0,
        "problem_text": "There are 100 people standing in a line from left to right. Half of them are randomly chosen to face right (with all $\\binom{100}{50}$ possible choices being equally likely), and the others face left. Then, while there is a pair of people who are facing each other and have no one between them, the leftmost such pair leaves the line. Compute the expected number of people remaining once this process terminates.",
        "sample_id": 1,
        "final_answer": "2^{100}/\\binom{100}{50}-1",
        "steps": [
            {
                "step_id": 1,
                "edge": "We begin by establishing the foundational context: the problem involves 100 people arranged in a line, as explicitly stated in the problem text. This step sets the stage for all subsequent reasoning by confirming the fixed size of the population under consideration, which is essential for combinatorial counting later.",
                "direct_dependent_steps": null,
                "node": "The problem has 100 people in a line."
            },
            {
                "step_id": 2,
                "edge": "Building on Step 1's specification of 100 people, we note the problem states half are randomly chosen to face right. Since 100 is even, exactly 50 face right and 50 face left. This follows directly from the problem's description of the random selection process using $\\binom{100}{50}$ equally likely arrangements, which inherently requires equal counts of both orientations.",
                "direct_dependent_steps": [
                    1
                ],
                "node": "Exactly 50 people face right and 50 face left."
            },
            {
                "step_id": 3,
                "edge": "To model directional relationships mathematically, we introduce a weight assignment: each right-facing person is assigned weight $1$. This is a strategic definition (not derived from prior steps) that converts directional information into numerical values, enabling algebraic manipulation of sequences. The choice of $1$ provides a natural basis for tracking rightward influence in prefix sums.",
                "direct_dependent_steps": null,
                "node": "We represent each right-facing person with weight $1$."
            },
            {
                "step_id": 4,
                "edge": "Complementing Step 3's definition, we assign weight $-1$ to each left-facing person. This deliberate choice creates a symmetric numerical representation where opposing directions have additive inverses, which will later facilitate cancellation during pair removals. Like Step 3, this is a foundational definition introduced to enable mathematical analysis of the directional dynamics.",
                "direct_dependent_steps": null,
                "node": "We represent each left-facing person with weight $-1$."
            },
            {
                "step_id": 5,
                "edge": "Using the weights defined in Steps 3 and 4, we formalize the prefix sum $S_i = \\sum_{j=1}^i w_j$ for position $i$. This cumulative sum tracks the net directional bias up to each point in the line. The definition is crucial because it transforms the spatial arrangement into a numerical sequence whose properties (like minima) will correlate with the termination state of the removal process.",
                "direct_dependent_steps": [
                    3,
                    4
                ],
                "node": "For any arrangement, define the prefix sum at position $i$ as $S_i=\\sum_{j=1}^i w_j$ where $w_j$ is the weight of the $j$th person."
            },
            {
                "step_id": 6,
                "edge": "Referencing Steps 3 and 4's weight assignments, we observe that when a right-facing person (weight $1$) immediately precedes a left-facing person (weight $-1$), their adjacent pair has weights $1$ and $-1$. This specific pairing is the only configuration that triggers removal, as per the problem's process description where 'a pair facing each other' implies right-left adjacency.",
                "direct_dependent_steps": [
                    3,
                    4
                ],
                "node": "Removing an adjacent pair consisting of a right-facing person followed by a left-facing person removes weights $1$ and $-1$."
            },
            {
                "step_id": 7,
                "edge": "From Step 6's identification of the removable pair's weights ($1$ and $-1$), we compute their net contribution: $1 + (-1) = 0$. This algebraic simplification is critical—it shows that removing such a pair leaves the total weight unchanged, which will later help establish invariants during the iterative removal process.",
                "direct_dependent_steps": [
                    6
                ],
                "node": "Removing weights $1$ and $-1$ yields a net weight removal of $0$."
            },
            {
                "step_id": 8,
                "edge": "Combining Step 5's prefix sum definition with Step 7's net-zero removal property, we deduce that removing a right-left pair doesn't alter any prefix sum. Specifically, since the pair's weights sum to zero, deleting them shifts subsequent indices but preserves cumulative sums relative to the new sequence. This invariance is pivotal because it means key properties of the prefix sum sequence remain constant throughout the removal process.",
                "direct_dependent_steps": [
                    5,
                    7
                ],
                "node": "Removing such a pair does not alter any prefix sum of the remaining sequence."
            },
            {
                "step_id": 9,
                "edge": "Building on Step 8's demonstration that prefix sums are invariant under removals, we conclude that the minimum prefix sum value must also remain unchanged. This follows because the set of prefix sums (and thus their minimum) is preserved when removing zero-sum pairs. This invariant will ultimately link the initial arrangement to the final state after all possible removals.",
                "direct_dependent_steps": [
                    8
                ],
                "node": "Therefore the minimum value among all prefix sums is invariant under the removal operations."
            },
            {
                "step_id": 10,
                "edge": "Through analysis of the removal process (not dependent on prior computational steps), we recognize that termination occurs only when no adjacent right-left pairs remain. This forces the final configuration to be a block of left-facing people followed by right-facing people. Given equal initial counts, the remaining counts must be equal—denoted as $k$ left and $k$ right for some $k \\geq 0$—since unmatched orientations would create removable pairs.",
                "direct_dependent_steps": null,
                "node": "The final configuration after all removals consists of $k$ left-facing people followed by $k$ right-facing people for some nonnegative integer $k$."
            },
            {
                "step_id": 11,
                "edge": "Using Step 5's prefix sum definition and Step 10's final configuration structure ($k$ left followed by $k$ right), we compute the prefix sums explicitly. Starting from the left, each left-facing person (weight $-1$) decreases the sum: $-1, -2, \\dots, -k$. Then each right-facing person (weight $1$) increases it: $-k+1, \\dots, 0$. This sequence confirms the minimum prefix sum occurs at the end of the left block.",
                "direct_dependent_steps": [
                    5,
                    10
                ],
                "node": "In a sequence of $k$ left-facing people followed by $k$ right-facing people the prefix sums are $-1,-2,\\dots,-k,-k+1,\\dots,0$."
            },
            {
                "step_id": 12,
                "edge": "From Step 11's enumeration of prefix sums in the final configuration, we identify the minimum value as $-k$ (the last sum in the left block). This direct observation is essential because it connects the structural parameter $k$ (half the remaining people) to a numerical property we can track via invariants.",
                "direct_dependent_steps": [
                    11
                ],
                "node": "Therefore the minimum prefix sum in the final configuration equals $-k$."
            },
            {
                "step_id": 13,
                "edge": "Combining Step 9's invariant (minimum prefix sum unchanged from initial to final) with Step 12's final-state minimum ($-k$), we equate the initial minimum prefix sum to $-k$. Since $2k$ people remain, solving for $2k$ yields $-2 \\times (\\text{minimum prefix sum})$. This algebraic rearrangement establishes the core relationship between the invariant and the quantity we need to compute.",
                "direct_dependent_steps": [
                    9,
                    12
                ],
                "node": "Hence the number of people remaining, which is $2k$, equals $-2$ times the minimum prefix sum."
            },
            {
                "step_id": 14,
                "edge": "Applying linearity of expectation to Step 13's deterministic relationship ($2k = -2 \\times \\min S_i$), we conclude the expected number of remaining people equals $-2$ times the expected minimum prefix sum. This step shifts our focus to computing $E[\\min S_i]$, leveraging probability theory to handle the random initial arrangement.",
                "direct_dependent_steps": [
                    13
                ],
                "node": "Therefore the expected number of people remaining equals $-2$ times the expected minimum prefix sum."
            },
            {
                "step_id": 15,
                "edge": "Referencing Steps 3 and 4's weight definitions, we note that swapping all directions (right $\\leftrightarrow$ left) negates all weights. Since the problem treats both orientations symmetrically and has equal counts, this negation maps arrangements to equally probable arrangements. This symmetry is fundamental for relating minimum and maximum prefix sums later.",
                "direct_dependent_steps": [
                    3,
                    4
                ],
                "node": "Negating all weights in any arrangement yields another arrangement with equal probability."
            },
            {
                "step_id": 16,
                "edge": "Using Step 5's prefix sum definition and Step 15's weight negation, we see that negating all weights transforms $S_i$ to $-S_i$ for every $i$. This follows directly from the linearity of summation: $\\sum (-w_j) = -\\sum w_j$. Consequently, the entire prefix sum sequence is reflected about zero.",
                "direct_dependent_steps": [
                    5,
                    15
                ],
                "node": "Negating all weights changes each prefix sum from $S_i$ to $-S_i$."
            },
            {
                "step_id": 17,
                "edge": "From Step 16's transformation ($S_i \\to -S_i$), we deduce that the maximum prefix sum of the original sequence becomes the negative of the minimum prefix sum after negation. Specifically, $\\max(-S_i) = -\\min(S_i)$, which rearranges to $\\max(S_i) = -\\min(-S_i)$. This duality is crucial for exploiting symmetry in expectations.",
                "direct_dependent_steps": [
                    16
                ],
                "node": "Consequently, negation transforms the maximum prefix sum of an arrangement into the negative of its minimum prefix sum."
            },
            {
                "step_id": 18,
                "edge": "Combining Step 15's distributional symmetry (arrangements are equally likely under negation) with Step 17's transformation ($\\max S_i = -\\min(-S_i)$), we establish $E[\\min S_i] = -E[\\max S_i]$. This equality holds because the distribution of $\\min S_i$ under negation matches $-\\max S_i$, and symmetry implies identical expectations for corresponding quantities.",
                "direct_dependent_steps": [
                    15,
                    17
                ],
                "node": "Because the distribution of arrangements is invariant under weight negation, $E[\\min S_i]=-E[\\max S_i]$."
            },
            {
                "step_id": 19,
                "edge": "Substituting Step 18's identity ($E[\\min S_i] = -E[\\max S_i]$) into Step 14's expression ($-2E[\\min S_i]$) yields $2E[\\max S_i]$. This simplification replaces the minimum prefix sum expectation with the maximum, which is more amenable to combinatorial counting via the reflection principle in subsequent steps.",
                "direct_dependent_steps": [
                    14,
                    18
                ],
                "node": "Therefore $-2E[\\min S_i]=2E[\\max S_i]$."
            },
            {
                "step_id": 20,
                "edge": "We apply the standard expectation formula for nonnegative integer random variables: $E[X] = \\sum_{k=1}^\\infty \\Pr[X \\geq k]$. This holds for $X = \\max S_i$ (which is nonnegative since $S_0=0$ and steps can increase it), converting the expectation into a sum of tail probabilities that are easier to compute combinatorially.",
                "direct_dependent_steps": null,
                "node": "By the definition of expectation for a nonnegative integer random variable, $E[\\max S_i]=\\sum_{k=1}^\\infty \\Pr[\\max S_i\\ge k]$."
            },
            {
                "step_id": 21,
                "edge": "Merging Step 19's result ($-2E[\\min S_i] = 2E[\\max S_i]$) with Step 20's tail-sum formula ($E[\\max S_i] = \\sum_{k=1}^\\infty \\Pr[\\max S_i \\geq k]$), we express the expected remaining people as $2\\sum_{k=1}^\\infty \\Pr[\\max S_i \\geq k]$. This rephrasing focuses our effort on computing these probabilities.",
                "direct_dependent_steps": [
                    19,
                    20
                ],
                "node": "Hence the expected number of people remaining equals $2\\sum_{k=1}^\\infty \\Pr[\\max S_i\\ge k]$."
            },
            {
                "step_id": 22,
                "edge": "By the reflection principle—a combinatorial technique for counting paths with barriers—the number of sequences with 50 $+1$'s and 50 $-1$'s where the maximum prefix sum reaches at least $k$ equals $\\binom{100}{50-k}$. This standard result accounts for paths crossing the level $k$ by reflecting the segment after first passage, yielding the binomial coefficient expression.",
                "direct_dependent_steps": null,
                "node": "By the reflection principle, the number of arrangements of 50 ones and 50 minus ones with maximum prefix sum at least $k$ equals $\\binom{100}{50-k}$."
            },
            {
                "step_id": 23,
                "edge": "Using Step 22's count of favorable arrangements ($\\binom{100}{50-k}$) and Step 2's total arrangements ($\\binom{100}{50}$), we compute the probability as $\\Pr[\\max S_i \\geq k] = \\binom{100}{50-k} / \\binom{100}{50}$. This ratio follows from the problem's uniform probability assumption over all $\\binom{100}{50}$ arrangements.",
                "direct_dependent_steps": [
                    22,
                    2
                ],
                "node": "Hence $\\Pr[\\max S_i\\ge k]=\\binom{100}{50-k}/\\binom{100}{50}$."
            },
            {
                "step_id": 24,
                "edge": "From Step 23's probability expression, we note that $\\binom{100}{50-k} = 0$ when $k > 50$ (since binomial coefficients vanish for negative lower indices or indices exceeding $n$). Thus the infinite sum $\\sum_{k=1}^\\infty \\Pr[\\max S_i \\geq k]$ truncates to $\\sum_{k=1}^{50} \\Pr[\\max S_i \\geq k]$, as terms beyond $k=50$ contribute nothing.",
                "direct_dependent_steps": [
                    23
                ],
                "node": "Because $\\binom{100}{50-k}=0$ for $k>50$, $\\sum_{k=1}^\\infty \\binom{100}{50-k}/\\binom{100}{50}=\\sum_{k=1}^{50}\\binom{100}{50-k}/\\binom{100}{50}$."
            },
            {
                "step_id": 25,
                "edge": "Substituting Step 24's finite sum into Step 21's expression, we rewrite the expected remaining people as $2\\sum_{k=1}^{50} \\binom{100}{50-k} / \\binom{100}{50}$. This consolidation prepares for algebraic simplification by focusing on the computable range $k=1$ to $50$.",
                "direct_dependent_steps": [
                    21,
                    24
                ],
                "node": "Therefore $2\\sum_{k=1}^\\infty \\Pr[\\max S_i\\ge k]=2\\sum_{k=1}^{50}\\binom{100}{50-k}/\\binom{100}{50}$."
            },
            {
                "step_id": 26,
                "edge": "By the symmetry of binomial coefficients ($\\binom{n}{m} = \\binom{n}{n-m}$) and the identity $\\sum_{m=0}^{100} \\binom{100}{m} = 2^{100}$, we derive $\\sum_{m=0}^{49} \\binom{100}{m} = \\frac{2^{100} - \\binom{100}{50}}{2}$. This holds because the symmetric sum excludes the central term $\\binom{100}{50}$ and splits the remaining total equally between lower and upper halves.",
                "direct_dependent_steps": null,
                "node": "By symmetry of binomial coefficients, $\\sum_{m=0}^{49}\\binom{100}{m}=\\frac{2^{100}-\\binom{100}{50}}{2}$."
            },
            {
                "step_id": 27,
                "edge": "Reindexing the sum in Step 25 via $m = 50 - k$ (so $k=1$ gives $m=49$, $k=50$ gives $m=0$), we have $\\sum_{k=1}^{50} \\binom{100}{50-k} = \\sum_{m=0}^{49} \\binom{100}{m}$. Step 26 then provides the closed form $\\frac{2^{100} - \\binom{100}{50}}{2}$ for this sum, enabling direct computation.",
                "direct_dependent_steps": [
                    25,
                    26
                ],
                "node": "Since $\\sum_{k=1}^{50}\\binom{100}{50-k}=\\sum_{m=0}^{49}\\binom{100}{m}$, it equals $\\frac{2^{100}-\\binom{100}{50}}{2}$."
            },
            {
                "step_id": 28,
                "edge": "Multiplying Step 27's sum ($\\sum_{m=0}^{49} \\binom{100}{m} = \\frac{2^{100} - \\binom{100}{50}}{2}$) by 2 yields $2^{100} - \\binom{100}{50}$. This simplification collapses the double sum into a compact expression, verified by noting $2 \\times \\frac{A}{2} = A$ where $A = 2^{100} - \\binom{100}{50}$.",
                "direct_dependent_steps": [
                    27
                ],
                "node": "Multiplying that by $2$ yields $2\\sum_{k=1}^{50}\\binom{100}{50-k}=2^{100}-\\binom{100}{50}$."
            },
            {
                "step_id": 29,
                "edge": "Combining Step 25's expression (expected remaining = $2 \\sum_{k=1}^{50} \\binom{100}{50-k} / \\binom{100}{50}$) with Step 28's result ($2 \\sum_{k=1}^{50} \\binom{100}{50-k} = 2^{100} - \\binom{100}{50}$), we obtain $(2^{100} - \\binom{100}{50}) / \\binom{100}{50}$. This quotient represents the expectation as a ratio of combinatorial quantities.",
                "direct_dependent_steps": [
                    25,
                    28
                ],
                "node": "Therefore the expected number of people remaining equals $(2^{100}-\\binom{100}{50})/\\binom{100}{50}$."
            },
            {
                "step_id": 30,
                "edge": "Algebraically simplifying Step 29's quotient: $(2^{100} - \\binom{100}{50}) / \\binom{100}{50} = 2^{100}/\\binom{100}{50} - \\binom{100}{50}/\\binom{100}{50} = 2^{100}/\\binom{100}{50} - 1$. This final rearrangement isolates the constant term and matches the problem's expected answer format.",
                "direct_dependent_steps": [
                    29
                ],
                "node": "Simplifying gives $(2^{100}-\\binom{100}{50})/\\binom{100}{50}=2^{100}/\\binom{100}{50}-1$."
            },
            {
                "step_id": 31,
                "edge": "Confirming Step 30's simplified expression $2^{100}/\\binom{100}{50} - 1$ as the expected number of remaining people, we box this result per standard mathematical convention for final answers. This matches the problem's stated final answer and completes the derivation.",
                "direct_dependent_steps": [
                    30
                ],
                "node": "The final answer is \\boxed{2^{100}/\\binom{100}{50}-1}."
            }
        ]
    }
]
