{
  "episode_id": "0ff53813-3367-4f43-bcbd-3fd725c1bf4b",
  "task": "What two-word type of model did Manash Pratim Kashyap's and PS Fader's studies in customer retention studies published during 2018-2019 have in common (no punctuation)?",
  "file": "",
  "success": true,
  "num_turns": 25,
  "total_tokens": 70166,
  "trajectory": [
    {
      "turn_idx": 0,
      "action": "file_saver({\"content\": \"Step 0: Parsed Question, Constraints, and Output Rules\\n\\n1) Task understanding\\n- Goal: Identify the two-word type of model that is common to customer retention studies published during 2018–2019 by Manash Pratim Kashyap and PS Fader (Peter S. Fader).\\n- Entities: \\\"Manash Pratim Kashyap\\\" and \\\"PS Fader\\\" (aka Peter S. Fader).\\n- Domain: Customer retention studies.\\n- Time window: Publications in 2018 or 2019.\\n- Output target: A shared model type described by exactly two words.\\n\\n2) Output formatting rules (must comply strictly)\\n- Final answer must be two words only, no punctuation.\\n- No articles (a, an, the) and no abbreviations.\\n- If digits are required, they must be written in plain text (not numerals). Likely not needed here.\\n- End with: \\\"FINAL ANSWER: [your two-word phrase]\\\".\\n- Answer should be as few words as possible; here exactly two words.\\n\\n3) Research scope and inclusion criteria\\n- Locate a customer retention study by Manash Pratim Kashyap published in 2018 or 2019.\\n- Locate a customer retention study by PS Fader (Peter S. 
Fader) published in 2018 or 2019.\\n- From each, extract the explicit model type(s) used (e.g., phrases like \\\"hazard model\\\", \\\"survival model\\\", \\\"Pareto NBD\\\", \\\"Markov model\\\", etc.).\\n- Determine the two-word model type common to both.\\n\\n4) Disambiguation and pitfalls to avoid\\n- Name ambiguity: confirm Manash Pratim Kashyap’s identity and that the study is indeed on customer retention.\\n- Time filter: ensure publication year is 2018 or 2019.\\n- Topic filter: ensure the focus is customer retention (not acquisition only, not general marketing unless retention modeling is present).\\n- Model phrase extraction: capture exact two-word phrases from the methodology or abstract; avoid paraphrase.\\n\\n5) Planned search queries (for next step)\\n- \\\"Manash Pratim Kashyap customer retention 2018\\\"\\n- \\\"Manash Pratim Kashyap 2019 retention model\\\"\\n- \\\"PS Fader 2018 customer retention model\\\"\\n- \\\"Peter S. Fader 2019 retention model\\\"\\n- \\\"Peter S. Fader churn model 2018\\\"\\n- \\\"Kashyap customer retention hazard model\\\"\\n- \\\"Fader Pareto NBD 2018 2019\\\"\\n\\n6) Evidence handling\\n- Use multiple credible sources (journals, working papers, institutional repositories).\\n- Extract verbatim phrases with proper source attribution.\\n- Cross-verify across sources if possible.\\n\\n7) Deliverable for later steps\\n- Determine and output only the shared two-word model type, formatted per rules, with minimal wording and no punctuation.\\n\", \"file_path\": \"work_space/gaia_0ff53813_20251222_031212_092979/step0_parsed_requirements.md\", \"mode\": \"w\"})",
      "explanation": "Parsed the question and constraints. Created a structured note detailing entities, time window, output rules, research plan, and potential queries. Saved as work_space/gaia_0ff53813_20251222_031212_092979/step0_parsed_requirements.md. Ready to proceed to searching for the two studies (Step 1).",
      "observation": "Content successfully saved to work_space/gaia_0ff53813_20251222_031212_092979/step0_parsed_requirements.md"
    },
    {
      "turn_idx": 1,
      "action": "mark_step({\"step_index\": 0, \"step_status\": \"completed\", \"step_notes\": \"Parsed the question and constraints. Created a structured note detailing entities, time window, output rules, research plan, and potential queries. Saved as work_space/gaia_0ff53813_20251222_031212_092979/step0_parsed_requirements.md. Ready to proceed to searching for the two studies (Step 1).\"})",
      "explanation": "",
      "observation": "Step 0: step_status is completed, step_notes is Parsed the question and constraints. Created a structured note detailing entities, time window, output rules, research plan, and potential queries. Saved as work_space/gaia_0ff53813_20251222_031212_092979/step0_parsed_requirements.md. Ready to proceed to searching for the two studies (Step 1). "
    },
    {
      "turn_idx": 2,
      "action": "tavily_search({\"query\": \"PS Fader 2019 customer retention model study\"})",
      "explanation": "Executed Step 1: Conducted targeted searches for Manash Pratim Kashyap’s 2019 customer retention paper and Peter S. Fader’s 2018 study. Retrieved and extracted content from IJAR 2019 PDF (Kashyap), ResearchBib, Academia.edu listing, LBS Research Online 2018 PDF (Fader et al.), and Wharton faculty PDF for background. The consolidated findings were saved as a single file per instruction: work_space/gaia_0ff53813_20251222_031212_092979/search_results_summary_step1.md. Key extracted phrases:\n- Kashyap (2019): “discrete time beta geometric model”; keywords include “Beta geometric”.\n- Fader et al. (2018): “beta-geometric (BG) distribution” cited as robust model for retention. \nThese directly indicate the common two-word model type: “beta geometric”. No issues encountered beyond filtering to 2018–2019; extracted information is 100% faithful and properly attributed.",
      "observation": "{'query': 'PS Fader 2019 customer retention model study', 'follow_up_questions': None, 'answer': None, 'images': [], 'results': [{'url': 'https://journals.sagepub.com/doi/10.1002/dir.20074', 'title': 'How to project customer retention - Peter S. Fader, Bruce G.S. ...', 'content': 'An important managerial task is to take a series of past retention numbers for a given group of customers and project them into the future to make more accurate', 'score': 0.43208793, 'raw_content': None}, {'url': 'https://faculty.wharton.upenn.edu/wp-content/uploads/2012/04/Fader_hardie_jim_07.pdf', 'title': '[PDF] How to project customer retention - Wharton Faculty Platform', 'content': '\\x02 \\x04 t \\x01 1 \\x08 \\x04 \\x04 t \\x01 1 \\x02 ≠( \\x04 t) ≠( \\x04 t \\x01 1) n ≠(\\x08 \\x04 \\x04 t) (\\x08 \\x04 \\x04 t \\x01 1) \\x02 B(\\x08, \\x04 t) B(\\x08, \\x04 t \\x01 1) rt \\x02 B(\\x08, \\x04 t) B(\\x08, ) nB(\\x08, \\x04 t \\x01 1) B(\\x08, ) APPENDIX B IMPLEMENTING THE MODEL IN EXCEL Journal of Interactive Marketing DOI: 10.1002/dir HOW TO PROJECT CUSTOMER RETENTION 89 estimates of the model parameters.4 As the number computed using (B2) will be very small, we usually work with the natural logarithm of the likelihood function, the so-called log-likelihood function: (B3) For a sample of 1,000 High End customers, Table 1 implies the number of customers active at the end of Years 1–7 as reported in Table B1.', 'score': 0.3863703, 'raw_content': None}, {'url': 'https://scholar.google.com/citations?user=NYOYpwQAAAAJ&hl=en', 'title': '\\u202aPeter Fader\\u202c - \\u202aGoogle Scholar\\u202c', 'content': 'This \"Cited by\" count includes citations to the following articles in Scholar. | Citations \"This is the number of citations to all publications. | i10-index \"i10-index is the number of publications with at least 10 citations. * Eric BradlowThe Wharton School of the University of PennsylvaniaVerified email at wharton.upenn.edu. 
* Kartik HosanagarWharton School, University of PennsylvaniaVerified email at wharton.upenn.edu. * Daniel McCarthyDepartment of Marketing, Goizueta Business School at Emory UniversityVerified email at emory.edu. * Jeffrey LarsonAssociate Professor of Marketing, Brigham Young UniversityVerified email at byu.edu. * Necati TereyagogluAssociate Professor of Operations Management, University of South CarolinaVerified email at moore.sc.edu. * Senthil VeeraraghavanProfessor of Operations, Information and Decisions, Wharton School, University of PennsylvaniaVerified email at wharton.upenn.edu. * Max BazermanHarvard Business School, Harvard UniversityVerified email at hbs.edu. HauserKirin Professor of Marketing, MITVerified email at mit.edu. * Shane JensenProfessor of Statistics, The Wharton School, University of PennsylvaniaVerified email at wharton.upenn.edu. Verified email at wharton.upenn.edu - Homepage. Articles)Cited by)Co-authors).', 'score': 0.23837323, 'raw_content': None}, {'url': 'https://www.researchgate.net/publication/314904887_How_to_Project_Customer_Retention', 'title': '(PDF) How to Project Customer Retention - ResearchGate', 'content': 'The sBG model allows continuous interpretation of the conditions in which the customer retention is determined to be discrete-time contractual in a contractual', 'score': 0.20615827, 'raw_content': None}, {'url': 'https://www.tandfonline.com/doi/full/10.1080/2573234X.2025.2551950?src=', 'title': 'Customer retention model using machine learning for improved user ...', 'content': \"+ Browse all journals A-Z. + Find guidance on Author Services. + Medicine, Dentistry, Nursing & Allied Health. We're here to help. Find guidance on Author Services. * Medicine, Dentistry, Nursing & Allied Health. + Browse all journals A-Z. + Find guidance on **Author Services**.\", 'score': 0.1591785, 'raw_content': None}], 'response_time': 0.86, 'request_id': '0112f2ee-0031-45b2-a80e-af80ad77ac9f'}"
    },
    {
      "turn_idx": 3,
      "action": "tavily_search({\"query\": \"Manash Pratim Kashyap 2019 customer retention study model\"})",
      "explanation": "",
      "observation": "{'query': 'Manash Pratim Kashyap 2019 customer retention study model', 'follow_up_questions': None, 'answer': None, 'images': [], 'results': [{'url': 'https://www.researchgate.net/profile/Manash-Kashyap-2', 'title': 'Manash Pratim Kashyap Professor (Associate) at Assam ...', 'content': 'A SIMPLE MODEL FOR ANALYZING THE CUSTOMER RETENTION COMPARING RURAL AND URBAN STORE. Article. Full-text available. Mar 2019. Manash ...Read more', 'score': 0.8047902, 'raw_content': None}, {'url': 'https://www.journalijar.com/uploads/2019/03/188_IJAR-26303.pdf', 'title': 'ISSN: 2320-5407 Int. J. Adv. Res. 7(3), 52-59 ...', 'content': 'Objective of the Study The main objective of the study is to compare the customer retention of the grocery store for the two population i.e. rural and urban using stochastic model. Hypotheses of the Study Based on the objective our the major hypothesis is as follows H0: There is no significant difference between the „Customer retention‟ of the grocery store of rural and urban.Probabilistic model is used to compare the customer retention of rural and urban grocery store. Table 2:-Parameter Estimation Places \\uf061 ˆ \\uf062 ˆ \\uf071 Rural 0.559988 1.679963 0.25 Urban 0.901502 1.118529 0.446281 Table 2 gives the value of the model parameters and the churn probabilities of rural and urban customer. After estimating the sBG model parameter the retention rate of each of rural and urban consumer is computed the following table no 3 gives the customer retention rate of rural and urban for both the discrete time point second and third purchase occasion. 
Similarly, in the third purchase occasion rural customer has more retention rate than the urban customer.', 'score': 0.5792344, 'raw_content': None}, {'url': 'https://huggingface.co/datasets/Intelligent-Internet/GAIA-Subset-Benchmark', 'title': 'Intelligent-Internet/GAIA-Subset-Benchmark · Datasets at ...', 'content': '#### Table: Summary of Non-Native Clownfish Occurrences Before 2020 | \\\\*\\\\*State\\\\*\\\\* | \\\\*\\\\*County\\\\*\\\\* | \\\\*\\\\*Location\\\\*\\\\* | \\\\*\\\\*Zip Code\\\\*\\\\* | \\\\*\\\\*Year Observed\\\\*\\\\* | \\\\*\\\\*Status\\\\*\\\\* | |-----------|------------------|-------------------------|--------------|--------------------|--------------| | Florida | Pinellas County | Fred Howard Park | 34689 | 2018 | Eradicated | \\\\*Note: No other occurrences in the United States were reported before 2020.\\\\* #### Limitations and Research Gaps While the USGS database provides valuable information on non-native species occurrences, there are limitations: 1. --- ### Summary Table | \\\\*\\\\*Species\\\\*\\\\* | \\\\*\\\\*Location\\\\*\\\\* | \\\\*\\\\*Zip Code\\\\*\\\\* | \\\\*\\\\*Year Observed\\\\*\\\\* | \\\\*\\\\*Status\\\\*\\\\* | \\\\*\\\\*Source\\\\*\\\\* | |--------------------------|------------------------------------|--------------|-------------------|--------------|-----------------------------------------------------------------------------| | \\\\*Amphiprion ocellaris\\\\* | Fred Howard Park, Tarpon Springs, FL | \\\\*\\\\*34689\\\\*\\\\* | 2018 | Eradicated | USGS Species Profile | --- ### Conclusion Based on the USGS Nonindigenous Aquatic Species database, the only documented nonnative occurrence of the clownfish (\\\\*Amphiprion ocellaris\\\\*) in the United States before 2020 was in \\\\*\\\\*Fred Howard Park\\\\*\\\\*, located in \\\\*\\\\*Tarpon Springs, Florida\\\\*\\\\*. 
While Kashyap adapted the model to a discrete time framework for analyzing rural and urban customer behavior, Fader’s research reinforced the broader applicability and robustness of the Beta Geometric model in customer retention studies.', 'score': 0.53968745, 'raw_content': None}, {'url': 'https://www.researchgate.net/publication/332297417_A_SIMPLE_MODEL_FOR_ANALYZING_THE_CUSTOMER_RETENTION_COMPARING_RURAL_AND_URBAN_STORE', 'title': 'a simple model for analyzing the customer retention ...', 'content': 'The main objective of the study is to compare the customer retention of the grocery store for the two population i.e. rural and urban using stochastic model. urban.Probabilistic model is used to compare the customer retention of rural and urban grocery store. customers who are still active  at time t in grocery store for  urban and rural population respectively. Now  our approach  is to  fit  the  sBG  model  for  the three  purchase  occasion  of rural  and  urban  customer. Table 2 gives the value of the model parameters and the churn probabilities of rural and urban customer. gives  the  customer  retention rate  of rural  and urban  for  both the  discrete time  point  second  and third  purchase. Table 3:-Customer retention of Rural and urban. Table 3 shows that retention rate of rural customer is more than the urban customer in the second purchase occasion. Similarly,  in the  third  purchase occasion  rural  customer has  more  retention rate  than  the urban  customer. Figure 3:-Retention rate of Rural and Urban Customer.', 'score': 0.444493, 'raw_content': None}, {'url': 'https://www.academia.edu/89361498/A_Simple_Model_for_Analyzing_the_Customer_Retention_Comparing_Rural_and_Urban_Store', 'title': 'A Simple Model for Analyzing the Customer Retention ...', 'content': \"# A Simple Model for Analyzing the Customer Retention Comparing Rural and Urban Store. 2. The study analyzes customer retention using a discrete time beta geometric model on 390 households. 
A Bivariate Timing Model of Customer Acquisition and Retention. A Study Of Two Customer Retention Measures: The American Customer Satisfaction Index And The Conversion Model. One of management's major challenges is to utilise a model suitable to explain and predict customer retention for a particular company or brand. Objective: Identify a new model of predicting customer behavior based on new variables that can be used by marketing management and adapted to their business planning. Quantitative Modeling of Customer Retention in Context of Indian Retail Market. Modelling of customer retention in multi-format retailing. We compare the predictive performance of Pareto/NBD and BG/NBD models from the customer base analysis literaturein terms of repeat purchase levels and active statususing grocery retail transaction data. The recency, frequency and monetary (RFM) based classification techniques are proposed to model the customer purchase behaviour.\", 'score': 0.42978892, 'raw_content': None}], 'response_time': 0.67, 'request_id': 'c62eebfc-b4d7-405c-9e52-b36afcbe19cd'}"
    },
    {
      "turn_idx": 4,
      "action": "tavily_search({\"query\": \"Peter S. Fader 2018 customer retention study model\"})",
      "explanation": "",
      "observation": "{'query': 'Peter S. Fader 2018 customer retention study model', 'follow_up_questions': None, 'answer': None, 'images': [], 'results': [{'url': 'https://journals.sagepub.com/doi/abs/10.1016/j.intmar.2018.01.002', 'title': '“How to Project Customer Retention” Revisited: The Role ...', 'content': \"According to this model, the phenomenon of increasing cohort-level retention rates is purely due to cross-sectional heterogeneity; an individual customer's propensity to churn does not change over time. Fader Peter S., Hardie Bruce G.S. How to Project Customer Retention *Journal of Interactive Marketing* 21 Winter 2007 76–90. Fader Peter S., Hardie Bruce G.S. Customer-Base Valuation in a Contractual Setting: The Perils of Ignoring Heterogeneity *Marketing Science* 29 January–February 2010 85–93. Fox Richard J., Reddy Srinivas K., Rao Bharat Modeling Response to Repetitive Promotional Stimuli *Journal of the Academy of Marketing Science* 25 3 1997 242–255. Improving the Diagnosis and Prediction of Customer Churn: A Heterogeneous Hazard Modeling Approach *Journal of Interactive Marketing* 20 Summer/Autumn 2006 16–29. Schweidel David A., Fader Peter S., Bradlow Eric Understanding Service Retention Within and Across Cohorts Using Limited Information *Journal of Marketing* 72 January 2008 82–94. 
For more information view the Sage Journals article sharing.\", 'score': 0.7144891, 'raw_content': None}, {'url': 'https://gregoryfaletto.com/2018/11/20/the-mccarthy-fader-hardie-model-for-customer-retention/', 'title': 'The McCarthy/Fader/Hardie Model for Customer Retention', 'content': 'To predict how many customers will be retained and how many will churn each month, McCarthy, Fader, and Hardie advocate using a technique from ...Read more', 'score': 0.6821721, 'raw_content': None}, {'url': 'https://lbsresearch.london.edu/id/eprint/954/1/BdW_JIM_2018-01-10.pdf', 'title': '“How to Project Customer Retention” Revisited', 'content': 'Regular High End θ 0.374 0.138 c 0.636 0.910 LL −1404.0 −1226.5 Table 3: dW model estimation results As would be expected, our estimate of c is less than 1 (i.e., negative duration dependence) for both datasets, which implies that individual-level retention probabilities increase over time. Fitting this model to the High End dataset using a ﬁve-year model calibration period yields the following parameter estimates: ˆ r = 0.483, ˆ α = 0.562, and ˆ s = 2.721; the associated value of the log-likelihood function is −1222.8.8 The model-based estimates of survival and retention are compared against the actual numbers in Figure 10. 
This model is suﬃciently ﬂexible to capture the phenomenon of non-monotonically increasing cohort-level retention rates.11 If the aggregate retention curve is U-shaped then individual-level churn probabilities must increase over time (i.e., c > 1).', 'score': 0.52597815, 'raw_content': None}, {'url': 'https://www.researchgate.net/publication/326744305_How_to_Project_Customer_Retention_Revisited_The_Role_of_Duration_Dependence', 'title': '“How to Project Customer Retention” Revisited: The Role ...', 'content': 'To gain a better understanding of customer retention, studies have focused on developing models of relationship duration to predict retention rates (Fader et al ...Read more', 'score': 0.48879638, 'raw_content': None}, {'url': 'https://faculty.wharton.upenn.edu/wp-content/uploads/2012/04/Fader_hardie_jim_07.pdf', 'title': 'How to project customer retention - Wharton Faculty Platform', 'content': '\\x02 \\x04 t \\x01 1 \\x08 \\x04 \\x04 t \\x01 1 \\x02 ≠( \\x04 t) ≠( \\x04 t \\x01 1) n ≠(\\x08 \\x04 \\x04 t) (\\x08 \\x04 \\x04 t \\x01 1) \\x02 B(\\x08, \\x04 t) B(\\x08, \\x04 t \\x01 1) rt \\x02 B(\\x08, \\x04 t) B(\\x08, ) nB(\\x08, \\x04 t \\x01 1) B(\\x08, ) APPENDIX B IMPLEMENTING THE MODEL IN EXCEL Journal of Interactive Marketing DOI: 10.1002/dir HOW TO PROJECT CUSTOMER RETENTION 89 estimates of the model parameters.4 As the number computed using (B2) will be very small, we usually work with the natural logarithm of the likelihood function, the so-called log-likelihood function: (B3) For a sample of 1,000 High End customers, Table 1 implies the number of customers active at the end of Years 1–7 as reported in Table B1.', 'score': 0.42892754, 'raw_content': None}], 'response_time': 0.83, 'request_id': 'e2a66b8a-85a6-4b80-a1e8-402077b89912'}"
    },
    {
      "turn_idx": 5,
      "action": "tavily_search({\"query\": \"Manash Pratim Kashyap customer retention 2018 study\"})",
      "explanation": "",
      "observation": "{'query': 'Manash Pratim Kashyap customer retention 2018 study', 'follow_up_questions': None, 'answer': None, 'images': [], 'results': [{'url': 'https://iarjset.com/issues/volume-5-issue-10-october-2018/', 'title': 'VOLUME 5, ISSUE 10, OCTOBER 2018 Archives', 'content': 'How Service Quality Influence of Satisfaction and Trust Towards Consumer Loyalty in Starbucks Coffee Indonesia ... Shonahar Ali, Manash Pratim Kashyap. Abstract | ...Read more', 'score': 0.6783488, 'raw_content': None}, {'url': 'https://huggingface.co/datasets/Intelligent-Internet/GAIA-Subset-Benchmark', 'title': 'Intelligent-Internet/GAIA-Subset-Benchmark · Datasets at ...', 'content': '#### Table: Summary of Non-Native Clownfish Occurrences Before 2020 | \\\\*\\\\*State\\\\*\\\\* | \\\\*\\\\*County\\\\*\\\\* | \\\\*\\\\*Location\\\\*\\\\* | \\\\*\\\\*Zip Code\\\\*\\\\* | \\\\*\\\\*Year Observed\\\\*\\\\* | \\\\*\\\\*Status\\\\*\\\\* | |-----------|------------------|-------------------------|--------------|--------------------|--------------| | Florida | Pinellas County | Fred Howard Park | 34689 | 2018 | Eradicated | \\\\*Note: No other occurrences in the United States were reported before 2020.\\\\* #### Limitations and Research Gaps While the USGS database provides valuable information on non-native species occurrences, there are limitations: 1. 
--- ### Summary Table | \\\\*\\\\*Species\\\\*\\\\* | \\\\*\\\\*Location\\\\*\\\\* | \\\\*\\\\*Zip Code\\\\*\\\\* | \\\\*\\\\*Year Observed\\\\*\\\\* | \\\\*\\\\*Status\\\\*\\\\* | \\\\*\\\\*Source\\\\*\\\\* | |--------------------------|------------------------------------|--------------|-------------------|--------------|-----------------------------------------------------------------------------| | \\\\*Amphiprion ocellaris\\\\* | Fred Howard Park, Tarpon Springs, FL | \\\\*\\\\*34689\\\\*\\\\* | 2018 | Eradicated | USGS Species Profile | --- ### Conclusion Based on the USGS Nonindigenous Aquatic Species database, the only documented nonnative occurrence of the clownfish (\\\\*Amphiprion ocellaris\\\\*) in the United States before 2020 was in \\\\*\\\\*Fred Howard Park\\\\*\\\\*, located in \\\\*\\\\*Tarpon Springs, Florida\\\\*\\\\*. While Kashyap adapted the model to a discrete time framework for analyzing rural and urban customer behavior, Fader’s research reinforced the broader applicability and robustness of the Beta Geometric model in customer retention studies.', 'score': 0.63098115, 'raw_content': None}, {'url': 'https://www.researchgate.net/profile/Manash-Kashyap-2', 'title': 'Manash Pratim Kashyap Professor (Associate) at Assam ...', 'content': 'A SIMPLE MODEL FOR ANALYZING THE CUSTOMER RETENTION COMPARING RURAL AND URBAN STORE. ... Manas Pra · Manash Pratim Kashyap.Read more', 'score': 0.55420566, 'raw_content': None}, {'url': 'https://independent.academia.edu/ManashPratimKashyap2', 'title': 'Manash Pratim Kashyap - Independent Researcher', 'content': 'Comparative Phytochemical study of total compound of Glycyrhiza glabra (Yasthimadhu) and Tinospora cordifolia (Giloy) and their antimicrobial activity. Aim: To find out the incidence of CA occurring among institutional live births and to study the associated maternal and perinatal risk factors in Northeast region of India. 
We analysed their expression profile to evaluate as molecular biomarkers of esophageal squamous cell carcinoma (ESCC) and their association with different parameters.MethodsExpression analysis of IL8, IL12 and IL13 were performed by Real time qPCR in blood and tumor tissue of 120 ESCC patients. Factor analysis on issues of teaching and learning mathematics for UG level students of Kamrup(M) of Assam. The study mainly revolves around the factors posing as issues in learning and teaching of mathematics in the Kamrup (M) areas of Assam. In this study, sputum samples were collected from the patients who reported cough in outdoor patient departments. Age, gender, and location-wise data were prepared for study.', 'score': 0.35226917, 'raw_content': None}, {'url': 'https://www.journalijar.com/uploads/2019/03/188_IJAR-26303.pdf', 'title': 'ISSN: 2320-5407 Int. J. Adv. Res. 7(3), 52-59 ...', 'content': 'Objective of the Study The main objective of the study is to compare the customer retention of the grocery store for the two population i.e. rural and urban using stochastic model. Hypotheses of the Study Based on the objective our the major hypothesis is as follows H0: There is no significant difference between the „Customer retention‟ of the grocery store of rural and urban.Probabilistic model is used to compare the customer retention of rural and urban grocery store. Table 2:-Parameter Estimation Places \\uf061 ˆ \\uf062 ˆ \\uf071 Rural 0.559988 1.679963 0.25 Urban 0.901502 1.118529 0.446281 Table 2 gives the value of the model parameters and the churn probabilities of rural and urban customer. After estimating the sBG model parameter the retention rate of each of rural and urban consumer is computed the following table no 3 gives the customer retention rate of rural and urban for both the discrete time point second and third purchase occasion. 
Similarly, in the third purchase occasion rural customer has more retention rate than the urban customer.', 'score': 0.2806661, 'raw_content': None}], 'response_time': 0.84, 'request_id': '76946555-34cc-4503-91d3-1ccf1a715792'}"
    },
    {
      "turn_idx": 6,
      "action": "tavily_search({\"query\": \"Fader hazard model customer retention 2018 2019\"})",
      "explanation": "",
      "observation": "{'query': 'Fader hazard model customer retention 2018 2019', 'follow_up_questions': None, 'answer': None, 'images': [], 'results': [{'url': 'https://gregoryfaletto.com/2018/11/20/the-mccarthy-fader-hardie-model-for-customer-retention/', 'title': 'The McCarthy/Fader/Hardie Model for Customer Retention', 'content': 'This model starts from the assumption that each customer \\\\(j\\\\) has a baseline probability of churning \\\\(\\\\lambda\\\\_R^{(j)}(t)\\\\)2 at any given time \\\\(t\\\\). With the baseline probability of churning along with these characteristics, the model estimates a probability of churning at a time \\\\(t\\\\) for individual \\\\(j\\\\),\\xa0\\\\(\\\\lambda\\\\_R^{(j)}(t \\\\mid X\\\\_R)\\\\). One way to interpret this function is that we have a baseline survival function \\\\(\\\\exp \\\\big[-\\xa0 \\\\lambda\\\\_R (m’ – m) \\\\big]\\\\) which predicts the probability that a customer acquired at time \\\\(m\\\\) will be retained until at least time \\\\(m’\\\\) when the covariates all equal zero, and this probability is multiplied by a factor \\\\( \\\\exp \\\\big[\\\\exp(X\\\\_{R}\\\\beta\\\\_R) \\\\big] \\\\) which can either increase or decrease the probability the customer will be retained for that long in a proportional way (that is, it extends or contracts the expected length of the customer relationship).', 'score': 0.86136365, 'raw_content': None}, {'url': 'https://www.ama.org/2019/04/24/escarzas-retention-futility-targeting-high-risk-customers-might-be-ineffective-wins-2018-award/', 'title': \"Ascarza's “Retention Futility: Targeting High-Risk Customers Might ...\", 'content': '# Ascarza’s “Retention Futility: Targeting High-Risk Customers Might be Ineffective,” Wins 2018 Award. Green Award for her article “Retention Futility: Targeting High-Risk Customers Might be Ineffective,” which appeared in the February 2018 issue of the *Journal of Marketing Research*. 
The committee provided the following statement about their choice of Ascarza’s paper for the Green Award:. *In her paper, “Retention Futility: Targeting High-Risk Customers Might be Ineffective,” Ascarza questions the assumption made by researchers and practitioners that targeting current customers with the highest risk of defection with retention interventions provides the most effective way to reduce customer churn. Ailawadi, Yu Ma, and Dhruv Grewal for their paper, “The Club Store Effect: Impact of Shopping in Warehouse Club Stores on Consumers’ Packaged Food Purchases,” published in the April 2018 issue of the *Journal of Marketing Research*. Fader for their paper,\\xa0“Customer-Based Corporate Valuation for Publicly Traded Noncontractual Firms,” published in the October 2018 issue of the *Journal of Marketing Research*.', 'score': 0.5786631, 'raw_content': None}, {'url': 'https://jasemjournal.com/wp-content/uploads/2019/10/JASEMV31_Lee-et-al-2019.pdf', 'title': '[PDF] THE MEDIATION EFFECT OF INERTIA ON SERVICE DURATION', 'content': 'To gain a better understanding of customer retention, studies have focused on developing models of relationship duration to predict retention rates (Fader et al', 'score': 0.5164741, 'raw_content': None}, {'url': 'https://journals.sagepub.com/doi/10.1002/dir.20074', 'title': 'How to project customer retention - Peter S. Fader, Bruce G.S. 
...', 'content': 'An important managerial task is to take a series of past retention numbers for a given group of customers and project them into the future to make more accurate', 'score': 0.32530668, 'raw_content': None}, {'url': 'https://www.researchgate.net/publication/326744305_How_to_Project_Customer_Retention_Revisited_The_Role_of_Duration_Dependence', 'title': '“How to Project Customer Retention” Revisited: The Role of ...', 'content': 'To gain a better understanding of customer retention, studies have focused on developing models of relationship duration to predict retention rates (Fader et al', 'score': 0.31564152, 'raw_content': None}], 'response_time': 1.48, 'request_id': 'e7a97bad-746d-448b-802d-de4f907b4457'}"
    },
    {
      "turn_idx": 7,
      "action": "tavily_search({\"query\": \"Manash Pratim Kashyap hazard model customer retention\"})",
      "explanation": "",
      "observation": "{'query': 'Manash Pratim Kashyap hazard model customer retention', 'follow_up_questions': None, 'answer': None, 'images': [], 'results': [{'url': 'https://www.researchgate.net/profile/Manash-Kashyap-2', 'title': 'Manash Pratim Kashyap Professor (Associate) at Assam ...', 'content': 'A SIMPLE MODEL FOR ANALYZING THE CUSTOMER RETENTION COMPARING RURAL AND URBAN STORE. ... Manash Pratim Kashyap. The main purpose of ...Read more', 'score': 0.999977, 'raw_content': None}, {'url': 'https://www.academia.edu/89361498/A_Simple_Model_for_Analyzing_the_Customer_Retention_Comparing_Rural_and_Urban_Store', 'title': 'A Simple Model for Analyzing the Customer Retention ...', 'content': \"# A Simple Model for Analyzing the Customer Retention Comparing Rural and Urban Store. 2. The study analyzes customer retention using a discrete time beta geometric model on 390 households. A Bivariate Timing Model of Customer Acquisition and Retention. A Study Of Two Customer Retention Measures: The American Customer Satisfaction Index And The Conversion Model. One of management's major challenges is to utilise a model suitable to explain and predict customer retention for a particular company or brand. Objective: Identify a new model of predicting customer behavior based on new variables that can be used by marketing management and adapted to their business planning. Quantitative Modeling of Customer Retention in Context of Indian Retail Market. Modelling of customer retention in multi-format retailing. We compare the predictive performance of Pareto/NBD and BG/NBD models from the customer base analysis literaturein terms of repeat purchase levels and active statususing grocery retail transaction data. 
The recency, frequency and monetary (RFM) based classification techniques are proposed to model the customer purchase behaviour.\", 'score': 0.99371004, 'raw_content': None}, {'url': 'https://www.researchgate.net/publication/332297417_A_SIMPLE_MODEL_FOR_ANALYZING_THE_CUSTOMER_RETENTION_COMPARING_RURAL_AND_URBAN_STORE', 'title': 'a simple model for analyzing the customer retention ...', 'content': 'The main objective of the study is to compare the customer retention of the grocery store for the two population i.e. rural and urban using stochastic model. urban.Probabilistic model is used to compare the customer retention of rural and urban grocery store. customers who are still active  at time t in grocery store for  urban and rural population respectively. Now  our approach  is to  fit  the  sBG  model  for  the three  purchase  occasion  of rural  and  urban  customer. Table 2 gives the value of the model parameters and the churn probabilities of rural and urban customer. gives  the  customer  retention rate  of rural  and urban  for  both the  discrete time  point  second  and third  purchase. Table 3:-Customer retention of Rural and urban. Table 3 shows that retention rate of rural customer is more than the urban customer in the second purchase occasion. Similarly,  in the  third  purchase occasion  rural  customer has  more  retention rate  than  the urban  customer. 
Figure 3:-Retention rate of Rural and Urban Customer.', 'score': 0.9914225, 'raw_content': None}, {'url': 'https://www.journalijar.com/article/26843/a-simple-model-for-analyzing-the-customer-retention-comparing-rural-and-urban-store/', 'title': 'A Simple Model For Analyzing The Customer Retention ...', 'content': 'The main goal of this paper is to predict future purchase patterns for rural and urban customer that can be described by the structural characteristics.Read more', 'score': 0.9914225, 'raw_content': None}, {'url': 'http://paper.researchbib.com/view/paper/207357', 'title': 'A SIMPLE MODEL FOR ANALYZING THE CUSTOMER ...', 'content': '# A SIMPLE MODEL FOR ANALYZING THE CUSTOMER RETENTION COMPARING RURAL AND URBAN STORE. Many businesses follow the repeat operation of a particular customer with a same store in different discrete time point. Customer retention comprises the probability of customer that alive in particular store in different purchase occasion. Different purchase occasion with a particular store occur at regular time intervals. The main goal of this paper is to predict future purchase patterns for rural and urban customer that can be described by the structural characteristics. The discrete time beta geometric model allows to heterogeneity of each of rural and urban customer behavioral process. The model is applied in 194 household from urban and 196 household from rural. The study reveals that customer retention of rural customer is more than the urban customer. 1. A Novel Nutrient Mixture Induces Apoptosis in Human Mesothelioma Cells (MSTO-211H) via Activation of Caspases. 2. PROMOTION OF HEALTH EDUCATION WITH EMPHASIS ON THE IMPORTANCE OF FOOD AND NUTRITIONAL EDUCATION.', 'score': 0.97262347, 'raw_content': None}], 'response_time': 0.98, 'request_id': '16d11a51-3488-45d5-9b29-44807c16e4e3'}"
    },
    {
      "turn_idx": 8,
      "action": "fetch_website_content({\"website_url\": \"https://journals.sagepub.com/doi/abs/10.1016/j.intmar.2018.01.002\"})",
      "explanation": "",
      "observation": "Just a moment... Enable JavaScript and cookies to continue \n"
    },
    {
      "turn_idx": 9,
      "action": "fetch_website_content({\"website_url\": \"http://paper.researchbib.com/view/paper/207357\"})",
      "explanation": "",
      "observation": "A SIMPLE MODEL FOR ANALYZING THE CUSTOMER RETENTION COMPARING RURAL AND URBAN STORE\nShare Your Research, Maximize Your Social Impacts\nSign for Notice Everyday Sign up \n>> Login English 中文\nHome Paper Archives Journal Indexing Research Conference\nMain Menu\nMy Profile\nAdd Article\nSearching By\nSearch more\nPARTNERS\nTweet\nA SIMPLE MODEL FOR ANALYZING THE CUSTOMER RETENTION COMPARING RURAL AND URBAN STORE Journal : International Journal of Advanced Research (Vol.7, No. 3) Publication Date : 2019-03-01 Authors : Manash Pratim Kashyap. ; Page : 52 - 59 Keywords : sBG Model Beta distribution Beta geometric Beta binomial. ; Source : Download Find it from : Google Scholar\nAbstract Many businesses follow the repeat operation of a particular customer with a same store in different discrete time point. Customer retention comprises the probability of customer that alive in particular store in different purchase occasion. Different purchase occasion with a particular store occur at regular time intervals. The main goal of this paper is to predict future purchase patterns for rural and urban customer that can be described by the structural characteristics. The discrete time beta geometric model allows to heterogeneity of each of rural and urban customer behavioral process. The model is applied in 194 household from urban and 196 household from rural. Each household visited three times once in every three months. The study reveals that customer retention of rural customer is more than the urban customer. 
Last modified: 2019-04-22 18:50:12\n"
    },
    {
      "turn_idx": 10,
      "action": "fetch_website_content({\"website_url\": \"https://www.academia.edu/89361498/A_Simple_Model_for_Analyzing_the_Customer_Retention_Comparing_Rural_and_Urban_Store\"})",
      "explanation": "",
      "observation": "\n(PDF) A Simple Model for Analyzing the Customer Retention Comparing Rural and Urban Store\nAcademia.edu no longer supports Internet Explorer. To browse Academia.edu and the wider internet faster and more securely, please take a few seconds to  upgrade your browser .\nLog In Sign Up Log In Sign Up more About Press Papers Terms Privacy Copyright We're Hiring! Help Center less\nOutline keyboard_arrow_down Title Abstract Key Takeaways Figures Objective of the Study Hypotheses of the Study Sampling and Data Consideration Methodology:- Findings and Analysis:- Conclusion:- References FAQs All Topics Mathematics download Download Free PDF Download Free PDF A Simple Model for Analyzing the Customer Retention Comparing Rural and Urban Store Manash Pratim Kashyap 2019, International Journal of Advanced Research https://doi.org/10.21474/IJAR01/8598 visibility … description 8 pages description See full PDF download Download PDF\nbookmark Save to Library share Share close Sign up for access to the world's latest research Sign up for free arrow_forward check Get notified about relevant papers check Save papers to use in your research check Join the discussion with peers check Track your impact Abstract Many businesses follow the repeat operation of a particular customer with a same store in different discrete time point. Customer retention comprises the probability of customer that alive in particular store in different purchase occasion. Different purchase occasion with a particular store occur at regular time intervals. The main goal of this paper is to predict future purchase patterns for rural and urban customer that can be described by the structural characteristics. The discrete time beta geometric model allows to heterogeneity of each of rural and urban customer behavioral process. The model is applied in 194 household from urban and 196 household from rural. Each household visited three times once in every three months. 
The study reveals that customer retention of rural customer is more than the urban customer. ... Read more Key takeaways AI Rural customers exhibit higher retention rates than urban customers across multiple purchase occasions. The study analyzes customer retention using a discrete time beta geometric model on 390 households. Retention rates show 75% for rural customers and 55.37% for urban customers in the second purchase occasion. Churn probabilities for urban customers exceed those of rural customers, indicating greater switching tendencies. Differences in customer behavior stem from cultural and economic disparities between rural and urban populations.\nFigures (6) arrow_back_ios S(t) = Probability that a customer survival up to time t But practically it is observed that time horizon is limited and it is not possible to observe a cohort of customers infinitely. The investigator observed it is for some fixed time. Since @ is unobserved, i.e. for a given individual the value of @ shall remain constant over the time period of the study, This forward recursion formula used to compute the probabilities is derived by the following manner If we put T = | in (3) then Based on the methodology the different parameters are calculated using Microsoft Excel 2007. The survival data presented in table | are for segments of customer. One rural customer and another urban customer for grocery store type of business. Findings and Analysis:- After estimating the parameters the customer retention is calculated for the different discrete time period. Similarly for the rural, it as follows, Figure 2:-Distribution of Churn Probabilities for the Rural and urban Customer Table 2 gives the value of the model parameters and the churn probabilities of rural and urban customer. The churn probability of urban consumer is more than the rural consumer. This indicates that more rural customer is active with the same store than the urban customer. 
The urban customer is more chance to switch the other grocery store than the rural customer. It is clearer from the figure 1. Figure 3:-Retention rate of Rural and Urban Customer Table 3 shows that retention rate of rural customer is more than the urban customer in the second purchase occasion. Similarly, in the third purchase occasion rural customer has more retention rate than the urban customer. This indicates that rural customer has more chance to stay alive in the particular store than the urban customer. It is more comprehensible from the following graphical depiction Conclusion:- arrow_forward_ios Related papers Customer Retention Based on the Number of Purchase: A Data Mining Approach sahar q Purpose: this paper aimed at finding the relationship between the numbers of purchase and the customer's income. The data mining tools were applied in the study to find those customers who bought more than one life insurance policy and represented the signs of good payments at the same time. Design/ methodology/ approach: in the present research the data mining tools were employed based on CRISP-DM methodology. The K-means algorithm was used for classification and the prediction was based on a proposed formula in Excel worksheet. Findings: the researcher extracted some simple rules to predict customers' clusters through selecting the customers who bought more than one policy and filtering the income-bringer customers as the companies would be able to use this prediction to change their strategies in relation to different customers. Originality/value: Utilizing data mining tools to classify different customers in life insurance and prediction based on the classification were new approaches of the study. There was not enough research and implementation in relation to the CRM and data mining in the insurance industry in Iran. Especially CRISP-DM methodology was not used extensively enough in a life insurance investigation. 
download Download free PDF View PDF chevron_right An Empirical Analysis of Customer Retention in the Context of Local Pakistani Food Restaurants IAEME Publication IAEME PUBLICATION, 2020 The core objective of this study is to analyze the factors impacting the consumer retention towards local Pakistani food restaurants. This study examines the influence of Food Quality, Word of Mouth, Perceived Value/Price, Service Quality, and Ambience on Customer Satisfaction. Likewise, it also examines the impact of Customer Satisfaction on Consumer Retention. The data was collected in form of an on-line google questionnaire from 377 participants comprises of housewives, university students, and professionals. The data has been analyzed by using statistical software SPSS version 17 and SmartPLS version 3. Confirmatory Factor Analysis (CFA) model has been applied and Structural Equation Modeling (SEM) model has been further used to analyze the direct relationship between the data and to test the An Empirical Analysis of Customer Retention in the Context of Local Pakistani Food Restaurants http://www.iaeme.com/IJM/index.asp 455 editor@iaeme.com proposed model. The findings of the analysis show that all the values of Cronbach's Alpha, inner VIF, outer loading, and path coefficient are satisfactory and the model seems fit to the context. The finding reveals that, Word of Mouth, Perceived Value/Price, Service Quality, and Ambience have a positive impact on Customer Satisfaction which consequently influences the Customer Retention. Unlike the rest of the variables, H1 is rejected which shows that Food quality does not play a significant role in customer satisfaction and retention. The study further shows that consumer understanding of service quality, price paying, word of mouth and environment are key determinants of customer satisfaction and hence, it tends to balance customer satisfaction between satisfaction variables and consumer retention. 
The results further suggest that consumer satisfaction partly mediates the interaction between revisit expectations. download Download free PDF View PDF chevron_right Customer base analysis: partial defection of behaviourally loyal clients in a non-contractual FMCG retail setting Abhishek Singh European Journal of Operational Research, 2003 Customer relationship management (CRM) enjoys increasing attention as a countermeasure to switching behaviour of customers. Because foregone profits of (partially) defected customers can be significant, an increase of the retention rate can be very profitable. In this paper we focus on the treatment of a companyÕs most behaviourally loyal customers in a non-contractual setting. We build a model in order to predict partial defection by behaviourally loyal clients using three classification techniques: Logistic regression, automatic relevance determination (ARD) Neural Networks and Random Forests. Focusing on partial attrition of high-frequency shoppers who exhibit a regular visit pattern may overcome the problem of unidentifiability of total defection in non-contractual settings. Classification accuracy (PCC) and area under the receiver operating characteristic curve (AUC) are used to evaluate classifier performance on a test/hold-out sample. Using real-life data from an FMCG retailer, we show that future partial defection can be successfully predicted, i.e. exceeding the benchmark hurdle of the null model. There are no significant differences in terms of performance among alternative classification techniques. Similar to direct-marketing applications we find that past behavioural variables, more specifically RFM variables (recency, frequency, and monetary value) are the best predictors of partial customer defection. This set of variables complements demographic variables confirming findings by other authors about its importance in predicting churn behaviour. 
Moreover, additional variables (listed in decreasing order of importance) such as the length of customer relationship, mode of payment, buying behaviour across categories, usage of promotions and brand purchase behaviour are shown to be moderately useful to incorporate in attrition models. download Download free PDF View PDF chevron_right The Study of Customer Segmentation Examined by Catastrophe Model Yu-Kai Huang IFIP – The International Federation for Information Processing Convenience stores in Taiwan have made remarkable successes with retail delivery services by integrating E-commerce and logistics systems to form a new retail delivery model: \"On-line shopping with pickups at convenience stores.\" Although choice behavior has been discussed in marketing, few studies describe the non-linear characteristic of choice behavior. The catastrophe model was used to analyze the linkages between customer satisfaction and switching cost on pickup point service loyalty. The results indicated that the switching cost plays the splitting factor in the catastrophe model, and a high switching cost makes the discontinuous choice behavior. In the cusp catastrophe mode, \"Reselection electronic map\" is the main index of the splitting factor. Nevertheless, \"The quality of relative service concerning the electronic map\" is the main index of the normal factor. It has been expected that a catastrophe approach to discontinuous behavior has made clearly abundant implications. Based on the findings of loyalty in the application of cusp catastrophe theory, the cusp catastrophe model is an appropriate model to know the process of loyalty. It suggests that other researchers could consider the cusp catastrophe theory and other nonlinear techniques, especially for standard approaches not adequately to capture the underlying dynamic. 
download Download free PDF View PDF chevron_right Customer Segmentation and Churn Prediction in Online Retail Varun Kumar Advances in Artificial Intelligence The online retail industry has changed the way customers shop as everything is available online. In order to build a loyal customer base, a company needs to deploy various marketing strategies focused on the diverse nature of its customers. We propose a model, abbreviated as RFMOC, based on extension of recency frequency, monetary (RFM) analysis with two new variables to segment customers. The model also studies the segmentation performance for the k-means clustering algorithm. Moreover, customer lifetime value (CLV) is calculated for the weighted RFMOC with weights for variables calculated by the analytic hierarchy process (AHP) and customer segments are then ranked accordingly which helps to create targeted marketing strategies. At last, the customer churn prediction is performed using logistic regression by further extending the RFMOC with one more variable, abbreviated as RFMOCD, in order to predict the churning behaviour of the customers. The proposed approach is helpful to assess customer loyalty and to manage customer relationships in an effective manner. download Download free PDF View PDF chevron_right Drivers of Customer Retention: An Introspection Into Indian Retail Customers Jogi Mathew Vision: The Journal of Business Perspective There is a wide variety of choices for the modern retail customer including multiple retail formats. The success of the retail establishments has a great reliance of customer retention, which is an essential attribute to achieve profitability. This study takes in to consideration to extract the factors responsible for customer retention which in turn assists in increasing the customer base. The prime objective of the study is to ascertain the influence of customer satisfaction, switching costs and customer loyalty on customer retention. 
Whereas, the second one is to explore the effect of demographic factors on customer retention. The sample size of this study was 600 respondents who were chosen for the full-fledged study. The statistical techniques used for final analysis were structural equation modelling and regression. The findings subsequent to the statistical analysis and interpretation concluded that customer loyalty, customer satisfaction and switching cost have the strongest... download Download free PDF View PDF chevron_right A Bivariate Timing Model of Customer Acquisition and Retention Eric Bradlow Marketing Science, 2008 Two components central to the calculation of customer value are acquisition and retention propensities. However, while extant research has incorporated such components into different kinds of models, limited work has investigated the kinds of associations that may exist between them. In this research, we focus on the relationship between a prospective customer's time until acquisition of a particular service and the subsequent duration for which he retains it, and examine the implications of this relationship on the (lifetime) value of customers. download Download free PDF View PDF chevron_right A Study Of Two Customer Retention Measures: The American Customer Satisfaction Index And The Conversion Model Nic Terblanche Customer retention has become a major concern for many businesses. Various means to measure and predict loyalty and commitment have been developed to attend to this need. One of management&#39;s major challenges is to utilise a model suitable to explain and predict customer retention for a particular company or brand. This paper compares two prominent measures, namely the American Customer Satisfaction Index (ACSI) and the Conversion Model (CM). The respondents were 2000 consumers drawn randomly from the seven major metropolitan areas of South Africa. 
Respondents were required to respond to items from the ACSI and CM scales in relation to the fast food and motor car industries. The results produced by the ACSI model to predict customer loyalty indicate a weak relationship between customer loyalty and customer satisfaction. These findings are in line with the views of various authors who state that customer satisfaction cannot be a sole predictor of customer loyalty. The CM results d... download Download free PDF View PDF chevron_right How to project customer retention Peter Fader Journal of Interactive Marketing, 2007 At the heart of any contractual or subscription-oriented business model is the notion of the retention rate. An important managerial task is to take a series of past retention numbers for a given group of customers and project them into the future in order to make more accurate predictions about customer tenure, lifetime value, and so on. In this paper we reanalyze data from a leading book on data mining (Berry and Linoff 2004), who drew the dire conclusion that \"parametric approaches do not work\" for such a task. As an alternative to common \"curve-fitting\" regression models, we develop and demonstrate a probability model with a wellgrounded \"story\" for the churn process. We show that our basic model (known as a \"shiftedbeta-geometric\") can be implemented in a simple Microsoft Excel spreadsheet and provides remarkably accurate forecasts and other useful diagnostics about customer retention. We provide a detailed appendix covering the implementation details and offer additional pointers to other related models. download Download free PDF View PDF chevron_right Predicting customer behavior with Activation Loyalty per Period. From RFM to RFMAP josep alet Predicting customer behavior with Activation Loyalty per Period. 
From RFM to RFMAP, 2020 Objective: Identify a new model of predicting customer behavior based on new variables that can be used by marketing management and adapted to their business planning. Methodology: New model has been used, with the definition of new calculation systems of the traditional variables R, Recency, F, Frequency, and M, monetary value, (RFM), related to the business periods. Besides, activation in each period P becomes a key variable for constructing the purchase cohorts of customers and identifying their potential. A new variable, Activation Loyalty, is recognized as a good proxy of the likelihood of future customer purchases. The model builds a weighting through a multiple regression analysis obtaining β for each variable, including the periods of activation, presenting the relative effect of the variables, and the best global explanation of the model. Results: This new model, RFMAP, which includes Activation Periods and Activation Loyalty, presents a higher prediction accuracy and improvements over traditional models with a clear impact, useful and manageable lines of segmentation, and prioritization for marketing management in CRM systems. Limitations: The main limitation of this model consists that it is based on data of only one company, and it should show the value in other sectors and give a full insight through its transversal application. Practical implications: The involved advantages demonstrated better predictability and usefulness to decision-makers, not only to determine the best customers but also with lapsed ones. It gives a meaningful explanation of differences in customer behavior, which are present in the data and are being reflected in the model. Also, it provides a prescriptive pri-oritization of variables to be managed in the marketing plan and how to be implemented. download Download free PDF View PDF chevron_right See full PDF download Download PDF Loading Preview Sorry, preview is currently unavailable. 
You can download the paper by clicking the button above. References (24) Bitran, Gabriel and Susana Mondschein (1996). Mailing Decisions in the Catalog Sales Industry. Management Science, 42(9). Cohen, David; Gan, Christopher; Au Yong, Hua Hwa and Choong, Esther (2006). Customer Satisfaction: A Study of Bank Customer Retention in New Zealand. Commerce Division, Discussion Paper No. 109, (March). Dréze, Xavier and André Bonfrer.(2005). Moving From Customer Lifetime Value to Customer Equity. Working Paper, University of Pennsylvania. Drucker, F. Peter. (1973). Management: Tasks Responsibilities Practices. (Harper and Row). Dwyer, F Robert (1997).Customer lifetime valuation to support marketing decision making. Journal of Interactive Marketing, 11(4), Autumn: 6-13. Fader, Peter, Bruce Hardie, and Ka Lee (2005). RFM and CLV: Using Iso-value Curves for Customer Base Analysis. Journal of Marketing Research, 42(4): 415-430. Gonul, Fusun, Byung-Do Kim and Mengze, Shi. (2000). Mailing Smarter to Catalog Customers. Journal of Interactive Marketing, 14 (2): 2-16. Gupta, S., Lehmann, D. R., Stuart, J. A.(2004). \"Valuing customer\", Journal of Marketing Research, 41 (1): 7-18. 59 Gupta, Suni and Zeithaml, Valarie. (2006). Customer Metrics and Their Impact on Financial Performance. working paper, Columbia University, New York. Jain, Dipak and Singh, Siddhartha. (2002). Customer Lifetime Value Research in Marketing: A Review and Future Directions. Journal of Interactive Marketing, 16(2). Kalbfleisch, John and Prentice, Ross.(1980). Statistical Analysis of Failure Time Data. Wiley,New York. Kotler, Philip and Kevin L. Keller (2006). Marketing Management. Upper Saddle River,NJ: Prentice Hall. Kotler, P. (2003). Marketing Management. PHI, Tenth Edition, pp. 397. Oliver, Richard. (1997). Satisfaction: A Behavioral Perspective on the Consumer. New York: McGraw-Hill. Payne, Adrian (2006). Handbook of CRM: Achieving Excellence in Customer Management. 
Great Britain: Butterworth-Heinemann. Pfeifer Phillip and Robert Carraway (2000). Modeling customer relationships as Markov chains. Journal of Interactive Marketing, 14(2): 43-55. Reichheld, Frederick F. and W. Earl Sasser Jr. (1990). Zero Defections: Quality Comes to Services. Harvard Business Review, (September-October), 105-111. Reinartz, Werner and V. Kumar (2000). On the Profitability of Long-life Rust, Roland, Katherine Lemon and Zeithaml, Valarie (2004). Return on Marketing: Using Customer Equity to Focus Marketing Strategy. Journal of Marketing, 68(1): 109-126. Schimittlein, D.C, Morrison, D.G and Colombo, R (1987). Counting Your Customers: Who are They and What will They do Next?. Management Science, 33 (1): 1-24. Schmittlein, David and Peterson, Robert. (1994). Customer Base Analysis: An Industrial Purchase Process Application. Marketing Science, 13 (1): 41-68. Simester, Duncan, Peng Sun and John Tsitsiklis (2005). Dynamic Catalog Mailing Policies. Working Paper, MIT. Thomas, Jacquelyn, Blattberg, Robert and Fox, Edward. (2004), Recapturing Lost Customers. Journal of Marketing Research, 16 (February): 31-45. Vyas, S. C. (1997). Rural Marketing: Thrust and Challenges (First ed.). New Delhi: National Publishing House. View more arrow_downward FAQs AI What does the study reveal about urban versus rural customer retention rates? add The study finds that rural customers have a retention rate of 75% at the second purchase occasion, compared to 55.37% for urban customers. How does customer acquisition cost compare to retention expenses? add Research indicates that acquiring new customers is five times more costly than retaining existing ones. What methodology is used to analyze customer retention in this study? add The study employs a stochastic model and employs hazard models to assess customer retention probabilities. How does customer loyalty differ from customer retention according to the findings? 
add The paper discusses that customer loyalty implies active commitment, unlike mere customer retention, which can stem from inertia. What statistical model is applied to assess customer survival probabilities? add The study utilizes a shifted Beta Geometric model to compute customer survival probabilities and retention rates. Related papers Quantitative Modeling of Customer Retention in Context of Indian Retail Market Shubhanshu Naik Indian Journal of Commerce & Management Studies Indian retail market is witnessing a transformational period where every big retail player is confronting the problem of retaining the Customers. While low cost items and frequent sales help them to attract Customers for a short duration, once these strategies are imitated by the competitors, they are no longer able to retain the same Customers. Also, these short term strategies put a heavy burden on their operating costs resulting in shrinkage of profits. Thus, maintaining profitability and market share are issues of concern for every player in the market. But it is unlikely to increase brand loyalty without any marked distinction among different outlets. In order to increase the brand loyalty among Customers, the retailers are now focusing on the Customer lifetime value. Here, Customer retention emerges as the most important aspect for the retailers. In Indian market, there is very little understanding of the variables and factors which contribute to Customer download Download free PDF View PDF chevron_right Modelling of customer retention in multi-format retailing PILAR RIVERA , María del Carmen Berné Manero The International Review of Retail, Distribution and Consumer Research, 2002 The objective of this study is to offer a model of the formation process of loyalty towards the services provided by retail food companies. Speci cally, the case of a Spanish company, which is a leader in the retail food market is taken. 
The company operates using different retail formats, and it has customers with various levels of loyalty and behavioural patterns. The research tries to determine implications for managing the loyalty towards the company. Different customer groups are analysed. Four working hypotheses are proposed and contrasted through the formulation and estimation of two theoretical models or conceptual maps. After the empirical analysis, the proposed models show a good t, and the formulated hypotheses are con rmed. By using this kind of modelling, the companies that operate under the considered framework may reach signi cant conclusions prior to making decisions in the area of customer retention. download Download free PDF View PDF chevron_right Empirical validation and comparison of models for customer base analysis Meltem Denizel , Alpay Filiztekin , Emine Batislam International Journal of Research in Marketing, 2007 The benefits of retaining customers have led companies to search for means of profiling their customers individually and tracking their retention and defection behaviors. To this end, the main issues addressed in customer base analysis are identification of customer active/inactive status and prediction of future purchase levels. We compare the predictive performance of Pareto/NBD and BG/NBD models from the customer base analysis literaturein terms of repeat purchase levels and active statususing grocery retail transaction data. We also modify the BG/ NBD model to incorporate zero repeat purchasers. All models capture the main characteristics of the purchase and dropout process of individual customers and produce similar forecasts. There are some deviations in the cumulative purchase estimates of the models, which may be due to the characteristics of grocery purchasing. 
download Download free PDF View PDF chevron_right A note on an integrated model of customer buying behavior Peter Fader European Journal of Operational Research, 2002 We propose a simple benchmark model for the integrated stochastic model of buying behavior developed in the article ''Counting your customers: Compounding customer's in-store decisions, interpurchase time and repurchasing behavior'' [Eur. J. Oper. Res. 127 ]. Re-examining the previously analyzed data covering the purchasing of tea, we find that the new benchmark model -which involves merely three parameters and can be estimated entirely within a standard spreadsheet environment -outperforms the original integrated model and provides clearer, more complete answers to the managerial questions posed at the outset of the earlier paper. Ó download Download free PDF View PDF chevron_right Counting Your Customers\": When will they buy next? An empirical validation of probabilistic customer base analysis models based on purchase timing Dennis Fok 2013 This research provides a new way to validate and compare buy-tillyou-defect [BTYD] models. These models specify a customers transaction and defection processes in a non-contractual setting. They are typically used to identify active customers in a com-panys customer base and to predict the number of purchases. Surprisingly, the literature shows that models with quite different assumptions tend to have a similar predictive performance. We show that BTYD models can also be used to predict the timing of the next purchase. Such predictions are managerially relevant as they enable managers to choose appropriate promotion strategies to improve revenues. Moreover, the predictive performance on the purchase timing can be more informative on the relative quality of BTYD models. For each of the established models, we discuss the prediction of the purchase timing. 
Next, we compare these models across three datasets on the predictive performance on the purchase timing as well as purchase frequency. We show that while the Pareto/NBD and its Hierarchical Bayes extension [HB] models perform the best in predicting transaction frequency, the PDO and HB models predict transaction timing more accurately. Furthermore, we find that differences in a models predictive performance across datasets can be explained by the correlation between behavioral parameters and the proportion of customers without repeat purchases. download Download free PDF View PDF chevron_right CHALLENGES IN RURAL CUSTOMERS AND URBAN CUSTOMERS LIFESTYLE Publisher ijmra.us UGC Approved Dr G.S.JAYESH Marketing is the factor of survey and human psychology. Human mind is changed as per the time, place, situation and environment. Current paper is with an aim to study these various affecting factors in rural and urban and customers living style. Many studies have proved that there is change in the lifestyle of individual as per geographical separation. Business industries have to analysis the psychology of customers of these two background of rural and urban while launching the product in the market. Market analysis is very important while opening any new business or even to keep good status of established business with proper profit. Because customers are main target for industrial and business sectors. The current research work is an attempt to identify the challenges for organization in comparing the rural customers and urban customers and their living style living at Chidambaram district of Tamil Nadu State. 
download Download free PDF View PDF chevron_right The Effect of Product Assortment Changes on Customer Retention Sharad Borle Marketing Science, 2005 This research investigates the impact of a large-scale assortment reduction on customer retention, utilizing a model we develop to explore the effect on sales at both the store level and the category level simultaneously. We apply our model to a data set provided by an online grocer. The data contain detailed household purchase records for every category in the store. Our results indicate that the reduction in assortment reduces overall store sales, a result that contrasts with that of all of the recent studies on assortment reductions (Food Marketing Institute. 1993. Variety or duplication: A process to know where you stand. Prepared by Willard Bishop Consulting and Information resources, Inc., in cooperation with Frito Lay; Drèze, Xavier, Stephen J. Hoch, Mary E. Purk. 1994. Shelf management and space elasticity. J. Retailing 70(4) 301–326; Broniarczyk, Susan M., Wayne D. Hoyer, Leigh McAlister. 1998. Consumers&#39; perceptions of the assortment offered in a grocery category: The ... download Download free PDF View PDF chevron_right The Impact of Relationship Marketing, Store Atmosphere and, Service on Customer Retention Ignasius S S Refo 2020 Relationship Marketing has developing to be a brand new paradigm for the marketing strategy. Relationship Marketing which implemented by the New Cinderella, has not been effectively performed yet. It can be observed through the New Cinderalla employees’ attitude when serving the customers. Customer retention, as marketing objective, is useful to keep the customers from moving to the competitors. Customer retention is useful to maintain a continually sales relationship with the customers in the long term. The sample unit selection technique of this research uses the Purposive Sampling. 
To obtain an accurate PLS prediction, which is consistent towards the research model, it is recommended that the minimum size of the sample is 100 respondents. The Partial Least Square Structural Equation Modeling (PLS-SEM) from the WarpPLS 6.0 software statistic package is used to analyze the data. The result of the research displays that the relationship marketing has positive influence towards the s... download Download free PDF View PDF chevron_right Relationship between Product Based Loyalty and Clustering based on Supermarket Visit and Spending Patterns Greg Adams 2005 Loyalty of customers to a supermarket can be measured in a variety of ways. If a customer tends to buy from certain categories of products, it is likely that the customer is loyal to the supermarket. Another indication of loyalty is based on the tendency of customers to visit the supermarket over a number of weeks. Regular visitors and spenders are more likely to be loyal to the supermarket. Neither one of these two criteria can provide a complete picture of customers' l oyalty. The decision regarding the loyalty of a customer will have to take into account the visiting pattern as well as the categories of products purchased. This paper describes results of experiments that attempted to identify customer loyalty using thes e two sets of criteria separately. The experiments were based on transactional data obtained from a supermarket data collection program. Comparisons of results from these parallel sets of experiments were useful in fine tuning both the schemes of estimating the degree of loyalty of a customer. The project also provides useful insights for the development of more sophisticated measures for studying customer loyalty. It is hoped that the understanding of loyal customers will be helpful in identifying better marketing strategies. 
download Download free PDF View PDF chevron_right Preliminary Studies on Predicting Customer Purchase Behaviour in Online Retail Business Ken Tak Lim , Nur Shamsiah Abdul Rahman International Journal of Advanced Trends in Computer Science and Engineering , 2020 Online retail business has become a popular trend in our life because it is very easy and hassle free. The increasing amount of customer in online retail business motivates the use of data mining techniques to discover the customer purchase behaviour. Before the implementation of data mining technique, the problems encountered by the online retail business are time consuming, probable human error and space consumption. These problems have degraded the rate of customer purchase in online retail business. Therefore, the objective of this research is to identify the existing methods in predicting customer purchase behaviour. The recency, frequency and monetary (RFM) based classification techniques are proposed to model the customer purchase behaviour. Next, the results obtained from the proposed models are predicted using correlation and linear regression methods to predict the customer purchase behaviour. The result from the end of the research will be further discussed. Various technique can be applied to further improve the current result in future work. download Download free PDF View PDF chevron_right keyboard_arrow_down View more papers Related topics Mathematics add Follow Research add Follow\nExplore Papers Topics Features Mentions Analytics PDF Packages Advanced Search Search Alerts Journals Academia.edu Journals My submissions Reviewer Hub Why publish with us Testimonials Company About Careers Press Help Center Terms Privacy Copyright Content Policy 580 California St., Suite 400 San Francisco, CA, 94104 © 2025 Academia. All rights reserved\n"
    },
    {
      "turn_idx": 11,
      "action": "fetch_website_content({\"website_url\": \"https://www.journalijar.com/uploads/2019/03/188_IJAR-26303.pdf\"})",
      "explanation": "",
      "observation": "PDF内容提取成功（pdfplumber，共8页）：\n\n\n=== 第 1/8 页 ===\n\nISSN: 2320-5407 Int. J. Adv. Res. 7(3), 52-59\nJournal Homepage: -www.journalijar.com\nArticle DOI:10.21474/IJAR01/8598\nDOI URL: http://dx.doi.org/10.21474/IJAR01/8598\nRESEARCH ARTICLE\nA SIMPLE MODEL FOR ANALYZING THE CUSTOMER RETENTION COMPARING RURAL AND\nURBAN STORE.\nManash Pratim Kashyap.\nDepartment of Business Administration, NERIM Group of Institutions.\n……………………………………………………………………………………………………....\nManuscript Info Abstract\n……………………. ………………………………………………………………\nManuscript History Many businesses follow the repeat operation of a particular customer\nReceived: 02 January 2018 with a same store in different discrete time point. Customer retention\nFinal Accepted: 04 February 2019 comprises the probability of customer that alive in particular store in\nPublished: March 2019\ndifferent purchase occasion. Different purchase occasion with a\nparticular store occur at regular time intervals. The main goal of this\nKey words:-\npaper is to predict future purchase patterns for rural and urban customer\nsBG Model, Beta distribution, Beta\ngeometric, Beta binomial. that can be described by the structural characteristics. The discrete time\nbeta geometric model allows to heterogeneity of each of rural and\nurban customer behavioral process. The model is applied in 194\nhousehold from urban and 196 household from rural. Each household\nvisited three times once in every three months. The study reveals that\ncustomer retention of rural customer is more than the urban customer.\nCopy Right, IJAR, 2019,. All rights reserved.\n……………………………………………………………………………………………………....\nIntroduction:-\nThe increasing economic importance of staple goods in grocery store in rural and urban inspired many researchers.\nThe term customer retention is the probability of a customer being “alive” or “repeat” buying from a firm (Gupta\nand Zeithams, 2006). 
Oliver (1997) define customer retention is deeply held commitment to re-buy or repatronize\npreferred product or service consistently in the future, despite situational influences and marketing efforts having the\npotential to cause switching behavior. Every business needs to survive, so it undertakes every possible action to get\nthem. According to Peter Drucker (1973), the main purpose of a business is to create a customer. Marketing has\ntraditionally focused on market share and the customer acquisition rather than on retaining existing customers and\non building long lasting relationships with them (Kotler, 2003). Gupta et al. (2004) found that a 1 percent increase in\ncustomer retention had almost five times more impact on firm value than a 1 percent change in discount rate or cost\nof capital. Kotler and Keller (2006) give a reason for popularity of customer retention is the recognition that losing a\ncustomer means in fact more than a single sale i.e. losing the entire stream of purchases that this particular customer\nwould make over a lifetime of patronage ( Kotler and Keller, 2006).\nThe significance of customer retention was first introduced by Reichhel and Sasser (1990). They found that profits\nin service industries and credit card companies increased in direct proportion to the length of customer retention.\nIt is obvious that customer retention is more economical to keep existing customers than to acquire new one\ncustomer. Reichheld and Kenny (1990) observed that acquiring new customers is calculated as being five times\nmore costly than the expenses of retaining existing customers. They observed that, long term customers buy more.\nThe companies can clearly benefit from the life time spending of customers (Kotler, 2003 and Payne, 2006).\nCorresponding Author:-Manash Pratim Kashyap. 52\nAddress:-Department of Business Administration, NERIM Group of Institutions.\n\n=== 第 2/8 页 ===\n\nISSN: 2320-5407 Int. J. Adv. Res. 
7(3), 52-59\nIn literature the concept of customer retention and customer loyalty is treated as same. But Cohen et al. (2006) argue\nthat customer loyalty and customer retention are not same. According to Cohen et al. (2006) consumer inertia\nimplies that some customers are only being retained, rather than expressing loyalty. According to them customer\nloyal are described as being less price sensitive and more inclined to increase the number of purchase.\nDwyer (1997) and Jain and Singh (2002) observed that customer retention model exist for two classes “lost for\ngood” and “always a share”. The “lost for good” class define as customer defection as permanent while the “always\na share” class considers customer switching to competitors as transient. Kalbfleisch and Prentice (1980) used hazard\nmodels using which fall into broad groups- accelerated failure time (AFT) or proportional hazard (PH) models.\nSchmitten, Morrison and Colombo (1987) and Schmittlein and Peterson (1994) proposed a NBD/Pareto model for\nassessing the probability that a customer is still alive. This model is used by Reinartz and Kumar (2000). Fader,\nHardie and Lee (2005) modified this model. The “always share” retention models view customer switching to\ncompetitors and typically uses migration or Markov models to estimate transition probabilities of customers being in\na certain state. Bitran and Mondschein (1996), Gonul and Shi (1998) and Pfeifer and Carraway (2000) defined these\nstates based on RFM measures while Rust, Lemon and Zeithamal (2004) used brands as states and estimated\ntransition probabilities using a logit model. Simester, sun and Tsitsiklis (2005) used a binary tree approach to define\nthe state space and estimated the tranisition probabilities using a non parametric approach. As mentioned earlier,\nretention are categorized into two groups- lost for good and always share. 
Some researchers have argued that\ncustomers should be treated as renewable resource (Dreze and Bonfrer, 2005). Thomas, Blattberg and Fox (2004)\nbuild a model for recapturing the lost customers.\nObjective of the Study\nThe main objective of the study is to compare the customer retention of the grocery store for the two population i.e.\nrural and urban using stochastic model. The reason is that urban and rural customers are different in their culture,\nnature and purchasing capacity.\nHypotheses of the Study\nBased on the objective our the major hypothesis is as follows\nH : There is no significant difference between the „Customer retention‟ of the grocery store of rural and\n0\nurban.Probabilistic model is used to compare the customer retention of rural and urban grocery store.\nSampling and Data Consideration\nThe data is collected from Cachar district of south Assam, India. The district covers an area of 3786 square\nkilometres with 273694 households according to 2001 census. Out of which 21575 households are located in urban\narea and the rest are rurally located. Since the study involves both rural and urban customers, so a sample of\nhouseholds of size 194 from urban area and 196 from rural area are selected. The sample size is calculated with 95\npercent confidence level and a confidence interval of 5 percent. However, a larger sample size 250 household is\ntaken from each of the rural and urban population which is considered as final sample size. This was done to avoid\nthe loss of contact with the respondents in subsequent stages of the study as it is spread across more than two years\nperiod.\nThe sampling is performed separately for rural and urban area. In the said district there are 1020 villages. From\nwhich 52 villages were selected i.e. one out of every twenty village. 
From the selected 52 villages, a total of 196\nhouseholds are selected randomly, in such a way that the number of households selected from each village remains\nproportional to the total number of households in that village. There are only two towns in the district viz Silchar\nand Lakhipur which are divided into 28 and 10 municipal wards respectively. From each of the wards, households\nare selected randomly taking the total number of households selected from urban area to 194. In this case as well,\nthe number of households selected from each of the municipal wards is proportional to the total number of\nhouseholds in that ward. The available information was loaded to Microsoft Excel and accordingly the data was\nanalyzed.\nMethodology:-\nThe methodology adopted in the study can be used for comparing the customer retention of the grocery store for\nrural and urban population. For the purpose of comparison the investigator has to contact the respondents for three\npurchase occasion after a gap of three months. This is done to observe whether the respondent choice the grocery\nstore at subsequent purchase occasions. Similarly, the investigator observed the same for the next purchase occasion.\n53\n\n=== 第 3/8 页 ===\n\nISSN: 2320-5407 Int. J. Adv. Res. 7(3), 52-59\nNow this makes sense to discuss the customer retention. The retention rate for the period t, denoted by r which is\nt\ndefined as follows\nr = Customer active at the end of the period t / Customer active at the end of the period t-1\nt\ni.e. proportion of customers active at the end of period t-1 who are still active at the end of the period t.\nMarketers particularly give importance to the concept for characterizing the distribution of customer life time. The\ndistribution of customer life time is studied by the survivor function which is denoted byS(t). The survivor\nfunction is defined as the probability that a customer has survived to time “t” i.e. the customer is still active at the\nperiod “t”. 
The survival function S(t) is defined as follows\nS(t) Probability that a customer survival up to time t\nr r ...r\n1 2 t\nt\nr\ni\ni1\n….. (1)\nt1\nS(t 1)  r\ni\nAgain, i1\nS(t)\nThus, r  …..(2)\nt S(t1)\nBut practically it is observed that time horizon is limited and it is not possible to observe a cohort of customers\ninfinitely. The investigator observed it is for some fixed time.\nIntroducing the Discrete Time Model\nAccording to Fader and Hardie (2005), the duration of customer life times is based on the following two\nassumptions:\nEvery customer has a constant retention probability “1”. The duration of the customer‟s relationship with the\ngrocery store denoted by the random variable T. So, T follows (shifted) geometric distribution with probability mass\nfunction\npT  t (1)t1 ; t=1,2,3,…..\n….. (3)\nAnd survivor function is given by\nS(T  t ) \n(1)t1\nt=1, 2, 3…\n….. (4)\nHowevercannot be considered as a constant but shall vary from customer to customer. Thus  is assumed to\nprobability density function\n1 11\n; 01; 0,0\nf ,\n(,)\n….. (5)\nThe reason to take heterogeneity coefficient  to follow the beta distribution is for the flexibility of the distribution\nand also as it is bounded between zero and one. If both the parameters of the beta distribution (and) are small\ni.e. (,1) the f (/,)is “U- Shaped” or highly polarized across the customers. If both the parameters\nare relatively large (,1); then f (/,)are fairly homogeneous. The different shapes of the beta curve\nfor different values of and can be seen in the figure 1\n54\n\n=== 第 4/8 页 ===\n\nISSN: 2320-5407 Int. J. Adv. Res. 7(3), 52-59\nSince  is unobserved, i.e. 
for a given individual the value of  shall remain constant over the time period of the\nstudy,\n1\nP(T t/,)  P(T t/)f (/,)d\n0\n1\n1(1)1\n (1)t1 d\nB(,)\n0\n1 1\n (1)t2d\nB(,)\n0\nB(1,t1)\n …..(6)\n(,)\nThis forward recursion formula used to compute the probabilities is derived by the following manner\nIf we put T = 1 in (3) then\nB(1,)\nP(T 1/,)\nB(,)\n\n\n\nNow for T=2, 3 …\nP(T t)\nP(T t)  P(T t 1)\nP(T t 1)\nP(T t) B(1,t 1) B(,)\n  \nP(T t 1) B(,) B(1,t 2)\nB(1,t1) t 2\n \nB(1,t2) t 1\n55\n\n=== 第 5/8 页 ===\n\nISSN: 2320-5407 Int. J. Adv. Res. 7(3), 52-59\nFinally, the customer retention rate is obtained using equation number (2). Substitute shifted Beta Geometric\nsurvivor function into (2) gives,\nS(t)\nr \nt S(t1)\nB(,t)\n\nB(,t1)\nt 1\n\nt 1\nEstimation of the Parameters\nThe aim of this section is to compute the sBG model parameters using maximum likelihood estimate. Let, there are\nn and n customers who are still active at time t in grocery store for urban and rural population respectively.\nut rt\nThis means number of customer who buys products from the same store for the consecutive purchase occasion under\nstudy. Now, n and n are the urban and rural customer who drop out in the first period. Similarly, n and n\nu1 r1 u2 r2\nare urban and rural customer who drop out in the second period. After the end of the second period, there are\nn (n n ) and n (n n )customer still being active at the end of the second period for urban and\nu u1 u2 r r1 r2\nrural population respectively. Assume that one customer is drop out is independent of the behavior of the other\ncustomer. It gives the specific values of the model parameter  and. 
The joint probability of n and n\nu1 u2\nurban customer and n (n n ) urban customers still being active at the end of the second period is given\nu u1 u2\nby\nP(Datau/,)P(T 1/,)n u 1P(T 2/,)nu2S(2/,)nu  nu1nu2 \nSimilarly for rural,\nP(Data /,)P(T1/,)n r1P(T2/,)n r2S(2/,)n r(n r1n r2)\nr\nTo estimate the model parameters, maximum likelihood estimation is used. So, the likelihood function is given by,\nL(,/Data )\nP(T1/,)n u1P(T2/,)n u2S(2/,)n u(n u1n u2)\nu\nTaking Log both side,\nLL(,/Data )n ln[p T 1/, ]n ln[p T 2/, ][n (n n )]ln[S(2/,)]\nu u1 u2 u u1 u2\nSimilarly for the rural, it as follows,\nLL(,/Data )n ln[pT 1/,]n ln[pT 2/,][n (n n )]ln[S(2/,)]\nr r1 r2 r r1 r2\nAfter estimating the parameters the customer retention is calculated for the different discrete time period.\nFindings and Analysis:-\nBased on the methodology the different parameters are calculated using Microsoft Excel 2007. The survival data\npresented in table 1 are for segments of customer. One rural customer and another urban customer for grocery store\ntype of business.\nTable 1:-Observed Percentage of Rural and Urban Customer\nPeriod Rural Customer Urban Customer\nAlive (%) Die (%) Alive (%) Die (%)\n0 (First Purchase Occasion) 100 - 100.00 -\n1 (Second Purchase Occasion) 75 25 55.37 44.63\n2 (Third Purchase Occasion) 62.04 12.96 38.84 16.53\n56\n\n=== 第 6/8 页 ===\n\nISSN: 2320-5407 Int. J. Adv. Res. 7(3), 52-59\nFrom Table 1 it is revealed that, in case of rural, 75 percent customers are alive in a grocery store in the second\npurchase occasion. It is assume that the number of customer in first purchase occasion is considered as cohort of\ncustomer in particular store. In third purchase occasion, 62.04 percent customers are alive. 
It is also observed from\ntable 1, that 25 percent and 12.96 percent customers are die (the customer is switched to another shop) in the second\npurchase occasion and third purchase occasion respectively. In case of urban, 55.37 percent and 38.84 percent\ncustomer is alive in urban grocery store in second purchase occasion and third purchase occasion respectively.\nSimilarly, 44.63 percent and 16.53 percent customer is dying in the second purchase occasion and third purchase\noccasion respectively.\nNow our approach is to fit the sBG model for the three purchase occasion of rural and urban customer. The\nfollowing table gives the different value of the model parameters which is calculated using Microsoft Excel 2007.\nTable 2:-Parameter Estimation\nPlaces ˆ ˆ \nRural 0.559988 1.679963 0.25\nUrban 0.901502 1.118529 0.446281\nTable 2 gives the value of the model parameters and the churn probabilities of rural and urban customer. The churn\nprobability of urban consumer is more than the rural consumer. This indicates that more rural customer is active\nwith the same store than the urban customer. The urban customer is more chance to switch the other grocery store\nthan the rural customer. It is clearer from the figure 1.\nDistribution of Churn Probabilities for the Rural\nand Urban Customer\n1\n0.9\n0.8\n0.7\n)\nθ\n00 .. 56\n(\nf 0.4 F(Theta_Rural)\n0.3\nF(Theta_Urban)\n0.2\n0.1\n0\n1593715937159371593715937\n0001122233444556667788899\n.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0\nθ\nFigure 2:-Distribution of Churn Probabilities for the Rural and urban Customer\nFrom the figure 2 it is observed that the distribution of churn rate rural and urban customers is “reverse J-Shape”.\nThese gives that within rural and urban customer have fairly low churn probabilities and follow the beta distribution.\nCloser examination shows that urban consumer has more churn rate than the rural consumer. 
After estimating the\nsBG model parameter the retention rate of each of rural and urban consumer is computed the following table no 3\ngives the customer retention rate of rural and urban for both the discrete time point second and third purchase\noccasion. Since, first purchase occasion is considered as cohort of customer for both rural and urban. The following\ntable 3 gives the customer retention rate of rural and urban customer.\n57\n\n=== 第 7/8 页 ===\n\nISSN: 2320-5407 Int. J. Adv. Res. 7(3), 52-59\nTable 3:-Customer retention of Rural and urban\nPlace Purchase Occasion\nSecond Third\nRural 0.75 0.827162\nUrban 0.553719 0.701492\nTable 3 shows that retention rate of rural customer is more than the urban customer in the second purchase occasion.\nSimilarly, in the third purchase occasion rural customer has more retention rate than the urban customer. This\nindicates that rural customer has more chance to stay alive in the particular store than the urban customer. It is more\ncomprehensible from the following graphical depiction\nRetention Rate of Rural and Urban\nCustomer\n1\n0.8271\n0.7499\ne\n0.8 0.7014\nt\na R\nn\n0.6 0.5537\no\nit\nn 0.4 Urban\ne\nt\ne Rural\nR 0.2\n0\nSecond Third\nPurchase Ocassion\nFigure 3:-Retention rate of Rural and Urban Customer\nConclusion:-\nThe main purpose of the study is to compare the customer retention of rural and urban population. This study shows\nthat customer retention of rural customers is greater than the urban customer. This means that rural customer is more\nattach with the same store than the urban customer at the different time horizon. This study also focuses the\nheterogeneity of customer at different time points. The urban consumer has more tendencies to leave the grocery\nstore last purchase than the rural customer. The reason is that, (Vyas, 1997) it is observed that rural consumers vary\nnot only in their behavior and practices but also in conviction and belief.\nReferences:-\n1. 
Bitran, Gabriel and Susana Mondschein (1996). Mailing Decisions in the Catalog Sales Industry.\nManagement Science, 42(9).\n2. Cohen, David; Gan, Christopher; Au Yong, Hua Hwa and Choong, Esther (2006). Customer Satisfaction:\nA Study of Bank Customer Retention in New Zealand. Commerce Division, Discussion Paper No. 109,\n(March).\n3. Dréze, Xavier and André Bonfrer.(2005). Moving From Customer Lifetime Value to Customer Equity.\nWorking Paper, University of Pennsylvania.\n4. Drucker, F. Peter. (1973). Management: Tasks Responsibilities Practices. (Harper and Row).\n5. Dwyer, F Robert (1997).Customer lifetime valuation to support marketing decision making. Journal\nof Interactive Marketing, 11(4), Autumn: 6-13.\n6. Fader, Peter, Bruce Hardie, and Ka Lee (2005). RFM and CLV: Using Iso-value Curves for Customer\nBase Analysis. Journal of Marketing Research, 42(4): 415-430.\n7. Gonul, Fusun, Byung-Do Kim and Mengze, Shi. (2000). Mailing Smarter to Catalog Customers. Journal\nof Interactive Marketing, 14 (2): 2-16.\n8. Gupta, S., Lehmann, D. R., Stuart, J. A.(2004). “Valuing customer”, Journal of Marketing Research, 41 (1):\n7–18.\n58\n\n=== 第 8/8 页 ===\n\nISSN: 2320-5407 Int. J. Adv. Res. 7(3), 52-59\n9. Gupta, Suni and Zeithaml, Valarie. (2006). Customer Metrics and Their Impact on Financial Performance.\nworking paper, Columbia University, New York.\n10. Jain, Dipak and Singh, Siddhartha. (2002). Customer Lifetime Value Research in Marketing: A Review and\nFuture Directions. Journal of Interactive Marketing, 16(2).\n11. Kalbfleisch, John and Prentice, Ross.(1980). Statistical Analysis of Failure Time Data. Wiley,New York.\n12. Kotler, Philip and Kevin L. Keller (2006). Marketing Management. Upper Saddle River,NJ: Prentice Hall.\n13. Kotler, P. (2003). Marketing Management. PHI, Tenth Edition, pp. 397.\n14. Oliver, Richard. (1997). Satisfaction: A Behavioral Perspective on the Consumer. New York: McGraw-Hill.\n15. Payne, Adrian (2006). 
Handbook of CRM: Achieving Excellence in Customer Management. Great Britain:\nButterworth-Heinemann.\n16. Pfeifer Phillip and Robert Carraway (2000). Modeling customer relationships as Markov chains. Journal of\nInteractive Marketing, 14(2): 43-55.\n17. Reichheld, Frederick F. and W. Earl Sasser Jr. (1990). Zero Defections: Quality Comes to Services.\nHarvard Business Review, (September-October), 105-111.\n18. Reinartz, Werner and V. Kumar (2000). On the Profitability of Long-life\n19. Rust, Roland, Katherine Lemon and Zeithaml, Valarie (2004). Return on Marketing: Using Customer\nEquity to Focus Marketing Strategy. Journal of Marketing, 68(1): 109-126.\n20. Schimittlein, D.C, Morrison, D.G and Colombo, R (1987). Counting Your Customers: Who are They\nand What will They do Next?. Management Science, 33 (1): 1-24.\n21. Schmittlein, David and Peterson, Robert. (1994). Customer Base Analysis: An Industrial Purchase Process\nApplication. Marketing Science, 13 (1): 41-68.\n22. Simester, Duncan, Peng Sun and John Tsitsiklis (2005). Dynamic Catalog Mailing Policies. Working\nPaper, MIT.\n23. Thomas, Jacquelyn, Blattberg, Robert and Fox, Edward. (2004), Recapturing Lost Customers. Journal of\nMarketing Research, 16 (February): 31-45.\n24. Vyas, S. C. (1997). Rural Marketing: Thrust and Challenges (First ed.). New Delhi: National Publishing\nHouse.\n59"
    },
    {
      "turn_idx": 12,
      "action": "fetch_website_content({\"website_url\": \"https://faculty.wharton.upenn.edu/wp-content/uploads/2012/04/Fader_hardie_jim_07.pdf\"})",
      "explanation": "",
      "observation": "PDF内容提取成功（pdfplumber，共15页）：\n\n\n=== 第 1/15 页 ===\n\nHOW TO PROJECT CUSTOMER\nRETENTION\nPETER S.FADER AND BRUCE G.S.HARDIE\nMARKETPLACE\nA\nPETER S.FADER\nt the heart of any contractual or subscription-oriented business\nis the Frances and Pei-Yuan Chia\nProfessor of Marketing,The\nmodel is the notion of the retention rate.An important managerial task is to\nWhartonSchool,The University of\nPennsylvania,Philadelphia,PA; take a series of past retention numbers for a given group of customers and\ne-mail:faderp@wharton.upenn.edu\nproject them into the future to make more accurate predictions about\ncustomer tenure, lifetime value, and so on. As an alternative to common\nBRUCE G.S.HARDIE\n“curve-fitting”regression models,we develop and demonstrate a probability\nis Associate Professor of Marketing,\nLondon Business School,London, model with a well-grounded “story”for the churn process.We show that our\nUnited Kingdom;\nbasic model (known as a “shifted-beta-geometric”) can be implemented in a\ne-mail:bhardie@london.edu\nsimple Microsoft Excel spreadsheet and provides remarkably accurate\nforecasts and other useful diagnostics about customer retention.We provide a\nThe authors thank Michael Berry\nand Gordon Linoff for providing the detailed appendix covering the implementation details and offer additional\ndata used in this article,and Naufel\npointers to other related models.\nVilcassim for his helpful comments.\nThe second author acknowledges\nthe support of the London Business\n© 2007 Wiley Periodicals, Inc. and Direct Marketing Educational Foundation, Inc.\nSchool Centre for Marketing and the\nhospitality of the Department of\nJOURNAL OF INTERACTIVE MARKETING VOLUME 21 / NUMBER 1 / WINTER 2007\nMarketing at the University of\nPublishedonlineinWileyInterScience(www.interscience.wiley.com).DOI:10.1002/dir.20074\nAuckland Business School.\n76\n\n=== 第 2/15 页 ===\n\nJournal of Interactive Marketing DOI:10.1002/dir\nINTRODUCTION function. 
In a discrete-time setting, this is computed as\n\nexpected tenure = Σ_{t=0}^{∞} S(t).\n\nIn light of (1), the standard textbook expression for (expected) customer lifetime value (CLV) in a contractual setting that (correctly) reflects the phenomenon of nonconstant retention rates,\n\nE(CLV) = Σ_{t=0}^{∞} m [Π_{i=1}^{t} r_i] (1/(1+d))^t,\n\ncan be written as\n\nE(CLV) = Σ_{t=0}^{∞} m S(t) / (1+d)^t.\n\nIn a contractual setting, the empirical survivor function S‡(t) is simply the proportion of customers acquired at Time 0 who are still active at Time t. A\n\nA defining characteristic of a contractual or subscription business setting is that the departure of a customer is observed. For example, the customer has to contact the firm to cancel a mobile phone contract; similarly, a local theater company can observe that a patron has not renewed an annual subscription.¹\n\nAs such, it makes sense to talk of metrics such as retention and churn rates: The retention rate for Period t (r_t) is defined as the proportion of customers active at the end of Period t−1 who are still active at the end of Period t, and the churn rate for a given period is defined as the proportion of customers active at the end of Period t−1 who dropped out in Period t.²\n\nAs we seek to understand the nature of customer behavior in a contractual setting, it is useful to draw on the survival analysis literature. One particularly useful concept for characterizing the distribution of customer lifetimes is that of the survivor function, denoted by S(t), which is the probability that a customer has “survived” to Time t (i.e., is still active at t). 
Recalling the definition of a retention rate, it follows that\n\nS(t) = r_1 × r_2 × ... × r_t = Π_{i=1}^{t} r_i, (1)\n\nwhich implies\n\nr_t = S(t) / S(t−1). (2)\n\nSeveral quantities of managerial interest can be easily calculated directly from the survivor function. For example, the expected (or average) tenure of a customer is simply the area under the survivor\n\n¹ This is in contrast to a noncontractual setting, a defining characteristic of which is that the departure of a customer is not observed by the firm (see “Limits to Application” section for a discussion of the implications of this characteristic).\n\nmajor problem in using the empirical survivor function to compute expected tenure or lifetime value is that the observed time horizon is often quite limited. Suppose we observe a particular cohort of customers over their first 5 years with the firm, which implies we can compute S‡(1), ..., S‡(5). (By definition, S‡(0) = 1.) The quantity S‡(0) + ... + S‡(5) is the expected customer lifetime for the members of the cohort over this period. Similarly, we can compute expected CLV during the first 5 years of a customer’s relationship with the firm; however, we would be underestimating the expected tenure and CLV of a new customer, as we would be ignoring the remaining life of those customers who are active at the end of Year 5. To compute the true expected tenure and CLV, we need to be able to project the survivor function beyond the observed time horizon. That is, we need to create estimates of S(6), S(7), ... given the data S‡(1), ..., S‡(5). This projected survivor function also is needed if we wish to compute the expected residual tenure or lifetime value of an individual who has been a customer for, say, 3 years.\n\nAn obvious approach is to fit some flexible function of time to the observed data. 
The resulting regression equation can be used to project the survivor function beyond the range of observations, from which we\n\n² Strictly speaking, we should talk of retention and churn probabilities, not rates.\n\n=== Page 3/15 ===\n\ncan compute expected tenure, CLV, and so on. In a popular book on data mining, Berry and Linoff (2004) explored this idea (pp. 392–393); their conclusion regarding the viability of such an exercise is evident in the title of their sidebar discussion “Parametric approaches do not work.”\n\nThe objective of this article is to present an alternative approach to the problem of projecting the survivor function—one that does “work.” We formulate a probabilistic model of contract duration that is based on a simple story of customer behavior. The resulting model offers useful diagnostic insights and is very easy to implement using Microsoft Excel.\n\nIn the next section, we replicate and extend Berry and Linoff’s (2004) analysis. We then present a simple probability model of customer lifetime and demonstrate the value of using a formal model to predict future customer behavior. We conclude with a discussion of several issues that arise from this work.\n\nPROJECTING SURVIVAL USING SIMPLE FUNCTIONS OF TIME\n\nThe survival data presented in Table 1 are for two segments of customers (“Regular” and “High End”) for an unspecified subscription-type business. These data were presented in graphical form in Berry and Linoff (2004, chap. 12). The High End data were used by Berry and Linoff in their examination of parametric approaches to the projection of the survivor function.\n\nTABLE 1 Observed % Customers Surviving at Least 0–12 Years\nYEAR | % SURVIVING, REGULAR | % SURVIVING, HIGH END\n0 | 100.0 | 100.0\n1 | 63.1 | 86.9\n2 | 46.8 | 74.3\n3 | 38.2 | 65.3\n4 | 32.6 | 59.3\n5 | 28.9 | 55.1\n6 | 26.2 | 51.7\n7 | 24.1 | 49.1\n8 | 22.3 | 46.8\n9 | 20.7 | 44.5\n10 | 19.4 | 42.7\n11 | 18.3 | 40.9\n12 | 17.3 | 39.4\n\nSuppose we only have the first 7 years of data and wish to compute estimates of S(8), S(9), .... If we were to give these data to a student who had just completed a typical data analysis course, the natural starting point would be to fit a linear function of time to the data and use the resulting regression equation to project the survivor function over the future periods. Recognizing that the data are not linear, some students would add a quadratic term to try to capture the curvature in the data. More sophisticated students would specify some nonlinear function of time, such as an exponential function.\n\nIn their “Parametric approaches do not work” sidebar, Berry and Linoff (2004) estimated and compared this set of regression models with the following results:³\n\nLinear: y = 0.925 − 0.071t, R² = 0.922\nQuadratic: y = 0.997 − 0.142t + 0.010t², R² = 0.998\nExponential: ln(y) = −0.062 − 0.102t, R² = 0.963\n\nwhere y is the proportion of customers surviving at least t years. These equations then are used to extrapolate the survivor function to Year 12; Figure 1 recreates the plot presented in Berry and Linoff’s sidebar (p. 393).\n\nThe fit of all three models up to and including Year 7 is reasonable, and the quadratic model provides a particularly good fit. But when we consider the projections beyond the model calibration period, all three models break down dramatically. The linear and exponential models underestimate Year 12 survival by 81 and 30%, respectively, while the quadratic model overestimates Year 12 survival by 92%. Furthermore, the models lack logical consistency: The linear model would have S(t) < 0 after year 14, and according to the quadratic model the survivor\n\n³ In the models run by Berry and Linoff (2004), time is indexed 1, 2, ..., 8, but to maintain consistency with the definitions of S(t) discussed earlier [specifically S(0) = 1], we reindex time to 0, 1, ..., 7. This has no impact at all on the fit or forecasting performance of any of the models.\n\n=== Page 4/15 ===\n\nfunction will start to increase over time, which is not possible. It is therefore not surprising that Berry and Linoff (2004) concluded that parametric curves do not “work” for the task of projecting the survivor function over time.\n\nRepeating this analysis for the Regular segment yields the following equations:\n\nLinear: y = 0.773 − 0.092t, R² = 0.776\nQuadratic: y = 0.930 − 0.249t + 0.022t², R² = 0.960\nExponential: ln(y) = −0.248 − 0.190t, R² = 0.915\n\nand the corresponding fits and projections are reported in Figure 2. The projections associated with the linear and quadratic models are terrible and illogical once again. 
The exponential model does not appear to\n\n[FIGURE 1: Actual Versus Regression-Model-Based Estimates of the Percentage of High End Customers Surviving at Least 0–12 Years. Series: Actual, Linear, Quadratic, Exponential; x-axis: Tenure (years); y-axis: % Surviving]\n[FIGURE 2: Actual Versus Regression-Model-Based Estimates of the Percentage of Regular Customers Surviving at Least 0–12 Years. Series: Actual, Linear, Quadratic, Exponential; x-axis: Tenure (years); y-axis: % Surviving]\n\n=== Page 5/15 ===\n\nbe very bad in the figure, but in fact it underestimates Year 12 survival by 54%. This is not an acceptable range of error.\n\nOf course, we could try out different arbitrary functions of time, but this would be a pure curve-fitting exercise at its worst. Furthermore, it is hard to imagine that there would be any underlying rationale for the equation(s) that we might settle upon. Faced with this situation, it is tempting to “throw up our hands” in despair and say that we cannot project the survivor function beyond the range of observations.\n\nadditional assumptions will be required in this particular case.\n\nTo operationalize this verbal model, we need to translate the elements of this story into the language of mathematics. More formally, our proposed model for the duration of customer lifetimes is based on the following two assumptions:\n\n1. An individual remains a customer of the firm with constant retention probability 1−θ. This is equivalent to assuming that the duration of the customer’s relationship with the firm, denoted by the random variable T, is characterized by the (shifted) geometric distribution with probability mass\n\nHowever, we feel that such a conclusion is premature. 
After all, in other areas of marketing there are plenty of models that have been used to provide accurate forecasts of the behavior of a cohort of customers beyond the range of observations (e.g., Hardie, Fader, & Wisniewski, 1998, for the case of new-product-sales forecasting). Thus, in the next section, we formulate a probabilistic model of contract duration that is based on a simple “story” of customer behavior.\n\nA DISCRETE-TIME MODEL FOR CONTRACT DURATION\n\nConsider the following story of customer behavior in a contractual setting:\n\n• At the end of each period, a customer flips a coin: “heads” she cancels her contract, “tails” she renews it.\n• For a given individual, the probability of a coin coming up “heads” does not change over time.\n• P(“heads”) varies across customers.\n\nOf course, people do not make their contract renewal decisions on the basis of coin flips; rather, this\n\nfunction and survivor function\n\nP(T = t | θ) = θ(1−θ)^(t−1), t = 1, 2, 3, ... (3)\n\nS(t | θ) = (1−θ)^t, t = 1, 2, 3, ... (4)\n\n2. Heterogeneity in θ follows a beta distribution with pdf\n\nf(θ | α, β) = θ^(α−1) (1−θ)^(β−1) / B(α, β), α, β > 0,\n\nwhere B(· , ·) is the beta function.\n\nThe assumption of geometrically distributed lifetimes follows from the first two elements of our simple story of customer behavior; it is perfectly consistent with the sequential coin-flip description. The beta distribution will be less familiar to most readers, but it is a very reasonable way to characterize heterogeneity in the churn probabilities because it is a flexible distribution that is bounded between zero and one. If one thinks about how the “coin-flip” probabilities are likely to vary across individuals, there are four principal possibilities, as illustrated in Figure 3. 
If both parameters of the beta distribution (α and β) are small (< 1), then the mix of churn probabilities is “U-shaped,” or highly polarized across customers. If both parameters are relatively large (α, β > 1), then the probabilities are fairly homogeneous. Likewise, the distribution of probabilities can be “J-shaped” or “reverse-J-shaped” if the parameters fall within the remaining ranges as shown in the figure. It is not essential for the reader to remember all of these cases, but these parameters can offer useful diagnostics to help the manager understand the degree (and\n\nstory is a paramorphic representation of customer behavior. The third element of the story should not be controversial, as the notion of heterogeneity is central to marketing; however, some readers might find the second element contrary to their expectation that retention rates increase over time as the customer gains more experience with the product or service. But rather than overcomplicate our story, we start with the simplest possible set of assumptions and only add supposedly richer “touches of reality” if the model does not “work.” As seen shortly, no\n\n=== Page 6/15 ===\n\n[FIGURE 3: General Shapes of the β Distribution as a Function of α and β; each panel has θ on the x-axis, from 0.0 to 1.0]\n\nnature) of heterogeneity in churn probabilities across the customer base.\n\nGiven these two model assumptions, how can we compute the probability that a customer fails to renew his contract at the end of Period t or survives beyond Period t [P(T = t) and S(t), respectively]? Since this customer’s value of θ is unobserved, we cannot use (3) and (4). We therefore take the expectation of (3) and (4) over the beta distribution that characterizes the cross-sectional heterogeneity in θ to arrive at the corresponding expressions for a randomly chosen individual:\n\nP(T = t | α, β) = B(α + 1, β + t − 1) / B(α, β), t = 1, 2, . . . (5)\n\nS(t | α, β) = B(α, β + t) / B(α, β), t = 1, 2, . . . (6)\n\n(The mathematically inclined reader is referred to Appendix A for step-by-step details of the derivations.) We call this model the shifted-beta-geometric (sBG) distribution. Nonbusiness applications of this model include the number of menstrual cycles required to achieve pregnancy (Weinberg & Gladen, 1986) and the length of stays in a psychiatric hospital (Kaplan, 1982). Direct-marketing applications of related models are discussed later.\n\nWe note that while (5) and (6) are expressed in terms of beta functions, we can implement the model without ever having to deal with beta functions directly. As formally derived in Appendix A, we can compute sBG probabilities by using the following forward-recursion formula from P(T = 1):\n\nP(T = t) = α / (α + β) for t = 1; P(T = t) = [(β + t − 2) / (α + β + t − 1)] × P(T = t − 1) for t = 2, 3, . . . (7)\n\nRecall from (2) that the retention rate is the ratio of sequential values of the survivor function. Substituting (6) into (2) and simplifying (see Appendix A) gives us the following expression for the (aggregate) retention rate associated with sBG model:\n\nr_t = (β + t − 1) / (α + β + t − 1). (8)\n\nGiven (8), we can go back to the expression given in (1) and compute S(t) without having to deal with any beta functions.\n\nWe immediately see that under the sBG model, the retention rate is an increasing function of time, even though the underlying (unobserved) individual-level retention probability is constant. According to this model, there are no underlying time dynamics at the level of the individual customer; the observed phenomenon of retention rates increasing over time is simply due to heterogeneity (i.e., the high-churn customers drop out early in the observation period, with the remaining customers having lower churn probabilities). This well-known “ruse of heterogeneity” (Vaupel & Yashin, 1985) is often overlooked by those attempting to make sense of various aggregate patterns of customer behavior.\n\nWe fit the sBG model to the first 7 years of the data presented in Table 1. For the High End segment,\n\n=== Page 7/15 ===\n\nα̂ = 0.688, β̂ = 3.806; for the Regular segment, α̂ = 0.704, β̂ = 1.182. (See Appendix B for details of how to estimate the model parameters in the familiar Microsoft Excel environment.) Using these parameter estimates, we extrapolate the survivor function for each segment to Year 12. These model-based numbers are plotted in Figure 4, along with the corresponding empirical survivor functions. The resulting predictions are almost too good to be true; the sBG model overestimates Year 12 survival by only 4% and 2% for the High End and Regular segments, respectively.\n\nit does not have to accumulate across periods as S(t) does, and therefore it is more sensitive to period-to-period variations. Despite the existence of certain unexplained “blips” as in Year 2 for the High End segment, the tracking/prediction plot for r_t is very impressive through Year 12, and there is every reason to believe that the model would continue to perform well over an even longer future horizon.\n\nFor both segments, note that the retention rates are an 
increasing function of the length of a customer’s rela-\nEven though this model is no more complicated than tionship with the firm. The important point to empha-\nthe regression models discussed earlier, its carefully size, once again, is that the sBG “story” assumes that\nconstructed “story” makes it possible to tease out, and these apparent dynamics are simply a result of hetero-\ntherefore accurately project, the critical behavioral geneity; any given individual has a constant (but\ncomponents. unknown) retention probability 1(cid:1)(cid:7). Unlike the con-\nventional wisdom about customer retention, it is not a\nAnother plot of interest shows the (aggregate) reten- story of individual customers becoming increasingly\ntion rate as a function of tenure. The model-based loyal as they develop a deeper relationship with the\nretention rate numbers [as computed using (8)] are firm, and so on. Thus, the observed phenomenon of\nplotted in Figure 5, along with the corresponding increasing retention rates is simply a sorting effect in a\nobserved retention rates as computed from the empir- heterogeneous population (i.e., the high-churn cus-\nical survivor functions. For both segments, the sBG tomers drop out early in the observation period, with the\nmodel accurately tracks the empirical retention rate remaining customers having lower churn probabilities).\ncurves. 
On one hand, this might not seem surprising\nsince r and S(t) are so closely related; on the other As a final demonstration of the usefulness of the sBG\nt\nhand, however, r is harder to predict accurately since model, we show and contrast the mixing distributions\nt\n82\nJOURNAL OF INTERACTIVE MARKETING\ngnivivruS\n%\nActual\nModel\nHigh End\nRegular\nFIGURE 4\nActual Versus sBG-Model-Based Estimates of the Percentage of Customers Surviving at\nLeast 0–12 Years for the High End and Regular Segments\n\n=== 第 8/15 页 ===\n\nJournal of Interactive Marketing DOI:10.1002/dir\n1.0\n0.9\n0.8\n0.7\n0.6\n0.5\n1 2 3 4 5 6 7 8 9 10 11 12\nTenure (years)\nthat characterize how the churn probabilities ((cid:7)) will tend to depart very quickly. These patterns sug-\ndiffer across the individuals in each segment. In gest that there is a fairly high degree of heterogeneity\nFigure 6, we see that both distributions are “reverse- within each segment; therefore, a model that does not\nJ-shaped.” This implies that within each group, most take these cross-customer differences into account\ncustomers have fairly low churn probabilities, but will not perform very well, particularly in terms of\nthere is a sizeable subsegment within each one that out-of-sample forecasting. Acloser examination shows\n83\nHOW TO PROJECT CUSTOMER RETENTION\netaR\nnoitneteR\nHigh End\nRegular\nActual\nModel\nFIGURE 5\nActual Versus sBG-Model-Based Estimates of Retention Rates by Tenure for the High End\nand Regular Segments\n4\nHigh End\n3\nRegular\nf (q) 2\n1\n0\n0.00 0.25 0.50 0.75 1.00\nq\nFIGURE 6\nEstimated Distributions of Churn Probabilities for the High End and Regular Segments\n\n=== 第 9/15 页 ===\n\nJournal of Interactive Marketing DOI:10.1002/dir\nthat the overall “weight” of the distribution for the period to period (e.g., number of contacts with the\nRegular group is shifted slightly to the right com- customer-service department), and changes in\npared to that of the High End distribution. 
This these variables can be strong predictors of customer\nreflects the fact that the Regular group has a higher defection.\nmean churn probability [E((cid:7))(cid:2)(cid:8)(cid:11)((cid:8)(cid:4)(cid:9))(cid:2)0.37] com-\npared to that of the High End group [E((cid:7))(cid:2)0.15]. It However, these models cannot easily be used to\nshould be clear from Figures 4 and 5 that this kind of address the problem of projecting the survivor func-\ndifference in the means exists, but this plot provides tion into the future, as we do not have future values\na better idea about the nature of these differences at of the time-varying covariates. It is therefore impor-\na more fine-grained level. tant to use the right model for the task at hand, and\nto acknowledge the limitations to application of any\nDISCUSSION model we develop.\nWehave presented the sBG distribution as a model for Wehave referred to the sBG distribution as a model for\nthe duration of customer relationships in a discrete- the duration of customer relationships in a discrete-\ntime contractual setting, and demonstrated that it time contractual setting. Many readers will have\ncan provide accurate forecasts and other useful diag- glanced over the words “discrete-time” and “contrac-\nnostics about customer retention. Furthermore, we tual” without reflecting on their significance, howev-\nhave argued that it is preferable to use such a model er, they are very important as we seek to understand\ninstead of arbitrary functions of time. 
In closing, we when and where it is appropriate to use the model\ndiscuss limits to its application, related models in the presented in this article.\ndirect-marketing literature, possible extensions to the\n• By “discrete-time,” we mean that transactions can\nbasic model, and some practical implementation\noccur only at fixed points in time (e.g., the annual\nissues.\nrenewal cycles for most professional organizations).\nThis is in contrast to continuous-time, where the\nLimits to Application\ntransactions can occur at any point in time (e.g., the\ncancelation of basic utility contracts).\nThe practical problem that drove the development of\nthis model is a desire to project an empirical survivor • In a “contractual” setting, the time at which the\nfunction (and therefore retention rates) beyond the customer becomes inactive is observed (e.g., when\nobserved time horizon of our dataset. The ability to the customer fails to renew a subscription). This is\nperform this projection is central to any attempt to in contrast to a “noncontractual” setting, where the\ncompute CLV or other metrics such as expected absence of a contract or subscription means that\ntenure if we wish to avoid the “truncation” problem the point in time at which the customer becomes\nassociated with computing these quantities using just inactive is not observed by the firm (e.g., a catalog\nthe observed survival data. For this particular prob- retailer). The challenge is how to differentiate\nlem, this simple model should be the first tool the between a customer who has ended a “relation-\nresearcher pulls out of his toolkit. ship” with the firm versus one who is merely in the\nmidst of a long hiatus between transactions.\nThere are other churn-related problems where this\nshould not be the case. 
In particular, there is a broad This leads to a two-dimensional classification of cus-\nliterature on churn modeling in which logit models tomer bases: opportunities for transactions (continu-\n(and far more sophisticated statistical models and ous vs. discrete) and type of relationship with cus-\ndata-mining methodologies) are used to determine tomers(noncontractualvs.contractual).Themodelin\nthe correlates of churn (Berry & Linoff, 2004; Parr thisarticleisforjustoneofthefourpossiblebusiness\nRud, 2001). The resulting models then can be used to contexts.\nidentify which customers are at risk of churning in\nthe next period so that retention-oriented marketing In continuous-time contractual settings, we should\nresources can be targeted at them. Many of the not use the sBG model. Rather, we should use its\ncovariates included in these models will vary from continuous-time analog, the exponential-gamma (EG)\n84\nJOURNAL OF INTERACTIVE MARKETING\n\n=== 第 10/15 页 ===\n\nJournal of Interactive Marketing DOI:10.1002/dir\ndistribution (also known as the Lomax distribution or by making the model parameters functions of the\nthe “Pareto distribution of the second kind”). Such a descriptor variables. By a similar logic, the effects of\nmodel assumes that the duration of an individual time-invariant covariates could be incorporated in the\ncustomer’s relationship with the firm is characterized sBG model by making (cid:8) and (cid:9) functions of the\nby the exponential distribution, and that het- descriptor variables. Incorporating the effects of time-\nerogeneity in “departure rates” is captured by a varying covariates (e.g., marketing-mix effects, sea-\ngamma distribution (Hardie et al., 1998; Morrison & sonality) is more complicated. The key is to bring in\nSchmittlein, 1980). 
all of these factors at the right level; that is, at the\nlevel of the latent parameter of interest (in this case,\nModelsfornoncontractualsettingsaremorecomplicated (cid:7)) instead of just “jamming” different covariate effects\nbecausethetimeatwhichacustomerbecomesinactive, into a regression-like model (see Schweidel, Fader, &\nand the likelihood that it has occurred at all, must be Bradlow, 2006, for a discussion of how to do this in a\ninferred from the transaction history. For continuous- continuous-time contractual setting.) However, as\ntimenoncontractualsettings,wehavethePareto/NBD noted in the last section, we question the value of\n(Schmittlein, Morrison, & Colombo, 1987) and BG/ such an extension given our modeling objective (i.e.,\nNBD (Fader, Hardie, & Lee, 2005) models while for projecting the empirical survivor function beyond the\ndiscrete-timenoncontractualsettings,wehavetheBG/ observed time horizon of our dataset).\nBBmodel(Fader,Hardie,&Berger,2004).\nBoth the sBG model and its continuous-time analog\nRelated Probability Models (i.e., the EG model) are based on the assumption that\nand Extensions the commonly observed phenomenon of increasing\nretention rates is due entirely to heterogeneity;\n“List falloff” is an important phenomenon in direct individual-customer-level retention rates are assumed\nmarketing. The basic idea is that the response rate to be constant. If we wish to allow for the possibility of\nfrom the first mailing to a prospect list is usually time dynamics at the level of the individual customer,\nhigher than that of the second mailing, which in turn we can no longer characterize the duration of an indi-\nis higher than that for the third mailing, and so on. 
vidual’s relationship with the firm using either the\nBuchanan and Morrison (1988), hereafter BM, pre- shifted-geometric or exponential distributions, both of\nsented a simple probability model of list falloff and which have the “memoryless” property (i.e., the proba-\nshowed how the model can be used to determine how bility of survival to s(cid:4)t, given survival to t, is the\nmany more mailings should be sent to a prospect list same as the initial probability of survival to s). In a\ngiven the observed response rates for the first two continuous-time setting, we can accommodate this\nmailings. Their model is based on assumptions simi- effect by assuming that individual lifetimes can be\nlar to those behind the sBG model: (a) Each person characterized by the Weibull distribution, which allows\nresponds to a direct-mail solicitation with constant for an individual’s risk of canceling a contract to\nprobability p, and (b) p varies across the population increase or decrease as the length of the relationship\naccording to a beta distribution. While BM base their with the firm increases. In a discrete-time contractual\nframework on the beta-binomial model, it could have setting, this leads to the beta-discrete-Weibull (BdW)\nbeen derived as an sBG model (e.g., the mailing on model (Fader & Hardie, 2006), which is a generaliza-\nwhich the prospect responds to the offer is character- tion of the sBG model, while in a continuous-time con-\nized by the shifted-geometric distribution). 
As such, it is possible to identify clear relationships between some of the results in this article [e.g., $r_t$ and S(t)] and some quantities of interest in a list-falloff setting.\n\nThe BM framework was extended by Rao and Steckel (1995) to incorporate (time-invariant) descriptor variables such as age, income, and sex. This is accomplished using the beta-logistic model (Heckman & Willis, 1977), which extends the beta-binomial model\n\ntractual setting, this leads to a generalization of the EG model, the Weibull-gamma (WG) model (Hardie et al., 1998; Morrison & Schmittlein, 1980).\n\nImplementation Issues\n\nOur treatment of how to estimate the sBG model parameters (Appendix B) assumes that we are fitting the model to data for just one cohort of customers. But in practice, we will frequently have data for more\n\n=== Page 11/15 ===\n\n
than one cohort, where cohorts are defined by time of acquisition (and possibly acquisition channel, product class, etc.) When faced with data for multiple cohorts, an important model implementation issue is to choose among three possible approaches: (a) to pool the cohorts and estimate a single set of model parameters across them, (b) to estimate a separate set of model parameters for each cohort, or (c) to use a “beta-logistic” version of the sBG with cohort-specific dummy variables. Our decision of how to move ahead is influenced by our beliefs of whether we can view each cohort as the realization of a common underlying contract duration process. The two datasets examined earlier demonstrate that we can expect to see some cross-cohort differences. Schweidel et al. (2006) examined this issue more broadly in a continuous-time setting.\n\nWhen we have multiple cohorts defined by time of acquisition, the problem with fitting separate models to each cohort is that every new cohort has one less period of information than does its temporal predecessor, which may result in less confidence in the model parameter estimates for the cohorts with fewer data points. The natural starting point in such a situation is to pool the cohorts, assuming that each cohort is the realization of a common underlying contract-duration process, and to estimate one set of parameters using all the data. A more elegant solution would be to add another layer of heterogeneity to the model. That is, we would assume that α and β themselves are distributed across cohorts according to some parametric distribution. Using a hierarchical Bayes formulation, this would enable the cohorts with fewer data points to “borrow” information about the possible values of α and β from the earlier cohorts rather than relying on the cohort-specific data alone.\n\n
REFERENCES\nBerry, M. J. A., & Linoff, G. S. (2004). Data Mining Techniques: For Marketing, Sales, and Customer Relationship Management (2nd ed.). Indianapolis, IN: Wiley.\nBuchanan, B., & Morrison, D. G. (1988). A Stochastic Model of List Falloff With Implications for Repeat Mailings. Journal of Direct Marketing, 2(Summer), 7–15.\nFader, P. S., & Hardie, B. G. S. (2006). Customer Base Valuation in a Contractual Setting: The Perils of Ignoring Heterogeneity. Retrieved September 23, 2006, from http://brucehardie.com/papers/022/\nFader, P. S., Hardie, B. G. S., & Berger, P. D. (2004). Customer-Base Analysis With Discrete-Time Transaction Data. Retrieved September 23, 2006, from http://brucehardie.com/papers/020/\nFader, P. S., Hardie, B. G. S., & Lee, K. L. (2005). “Counting Your Customers” the Easy Way: An Alternative to the Pareto/NBD Model. Marketing Science, 24(Spring), 275–284.\nHardie, B. G. S., Fader, P. S., & Wisniewski, M. (1998). An Empirical Comparison of New Product Trial Forecasting Models. Journal of Forecasting, 17(June–July), 209–229.\nHeckman, J. J., & Willis, R. J. (1977). A Beta-Logistic Model for the Analysis of Sequential Labor Force Participation by Married Women. Journal of Political Economy, 85(February), 27–58.\nKaplan, E. H. (1982). Statistical Models and Mental Health: An Analysis of Records From a Mental Health Center. Unpublished master’s thesis, Massachusetts Institute of Technology, Department of Mathematics. Cambridge.\nMorrison, D. G., & Schmittlein, D. C. (1980). Jobs, Strikes, and Wars: Probability Models for Duration. Organizational Behavior and Human Performance, 25(April), 224–251.\nParr Rud, O. (2001). Data Mining Cookbook. New York: Wiley.\nRao, V. R., & Steckel, J. H. (1995). Selecting, Evaluating, and Updating Prospects in Direct Marketing. Journal of Direct Marketing, 9(Spring), 20–31.\nSchmittlein, D. C., Morrison, D. G., & Colombo, R. (1987). Counting Your Customers: Who They Are and What Will They Do Next? Management Science, 33(January), 1–24.\nSchweidel, D. A., Fader, P. S., & Bradlow, E. T. (2006). Modeling Retention in and Across Cohorts. Retrieved September 23, 2006, from http://ssrn.com/abstract=742884\nVaupel, J. W., & Yashin, A. I. (1985). Heterogeneity’s Ruses: Some Surprising Effects of Selection on Population Dynamics. The American Statistician, 39(August), 176–185.\nWeinberg, C. R., & Gladen, B. C. (1986). The Beta-Geometric Distribution Applied to Comparative Fecundability Studies. Biometrics, 42(September), 547–560.\n\n=== Page 12/15 ===\n\n
APPENDIX A STEPS IN MODEL DERIVATION\n\nIn Appendix A, we walk through the derivations of the key mathematical results presented in this article.\n\nNote the three definitions and results that are central to the derivations that follow.\n\n• The beta function B(α,β) is defined by the integral\n$$B(\\alpha,\\beta) = \\int_0^1 \\theta^{\\alpha-1}(1-\\theta)^{\\beta-1}\\,d\\theta, \\quad \\alpha,\\beta > 0. \\tag{A1}$$\nNote that B(α,β) is simply notation for the definite integral on the right-hand side of (A1).\n• The beta function can be expressed in terms of gamma functions:\n$$B(\\alpha,\\beta) = \\frac{\\Gamma(\\alpha)\\Gamma(\\beta)}{\\Gamma(\\alpha+\\beta)}. \\tag{A2}$$\n• For the purposes of this article, the only thing we need to know about the gamma function is its so-called recursive property:\n$$\\frac{\\Gamma(x+1)}{\\Gamma(x)} = x. \\tag{A3}$$\n\n
Derivation of (5)\n\nWe derive the sBG expression for P(T = t) in the following manner. If θ were known, the probability of dropping out in Period t would simply be the shifted-geometric probability $\\theta(1-\\theta)^{t-1}$. But since θ is unobserved (and assumed to be distributed randomly across the population), P(T = t) for a randomly chosen individual is the expected value of the shifted-geometric probability of dropping out in Period t (conditional on Θ = θ), where the expectation is with respect to the beta distribution for Θ, E[P(T = t | Θ = θ)]. (That is, we weight each P(T = t | Θ = θ) by the probability of that value of θ occurring, f(θ).) Since Θ is a continuous random variable, this is computed as\n$$P(T=t \\mid \\alpha,\\beta) = \\int_0^1 \\underbrace{P(T=t \\mid \\Theta=\\theta)}_{\\text{shifted-geometric}}\\,\\underbrace{f(\\theta \\mid \\alpha,\\beta)}_{\\text{beta}}\\,d\\theta = \\int_0^1 \\theta(1-\\theta)^{t-1}\\,\\frac{\\theta^{\\alpha-1}(1-\\theta)^{\\beta-1}}{B(\\alpha,\\beta)}\\,d\\theta$$\nwhich, combining terms and moving all non-θ elements to the left of the integral sign,\n$$= \\frac{1}{B(\\alpha,\\beta)} \\int_0^1 \\theta^{\\alpha}(1-\\theta)^{\\beta+t-2}\\,d\\theta.$$\nLooking closely at the integral, we see that it is simply the integral expression for the beta function (A1) with parameters α + 1 and β + t − 1. Therefore,\n$$P(T=t \\mid \\alpha,\\beta) = \\frac{B(\\alpha+1,\\beta+t-1)}{B(\\alpha,\\beta)}.$$\n[The expression for the sBG survivor function (6) is derived in a similar manner.]\n\n
Derivation of (7)\n\nTo derive the forward-recursion formula used to compute sBG probabilities, first note that\n$$P(T=1 \\mid \\alpha,\\beta) = \\frac{B(\\alpha+1,\\beta)}{B(\\alpha,\\beta)}$$\nwhich, expressing the beta functions in terms of gamma functions (A2),\n$$= \\frac{\\Gamma(\\alpha+1)\\Gamma(\\beta)}{\\Gamma(\\alpha+\\beta+1)} \\bigg/ \\frac{\\Gamma(\\alpha)\\Gamma(\\beta)}{\\Gamma(\\alpha+\\beta)} = \\frac{\\Gamma(\\alpha+1)}{\\Gamma(\\alpha)} \\bigg/ \\frac{\\Gamma(\\alpha+\\beta+1)}{\\Gamma(\\alpha+\\beta)}.$$\nRecalling the recursive nature of the gamma function (A3), Γ(α+1)/Γ(α) = α and Γ(α+β+1)/Γ(α+β) = α + β. Therefore,\n$$P(T=1 \\mid \\alpha,\\beta) = \\frac{\\alpha}{\\alpha+\\beta}.$$\nBut how does this help us compute P(T = t) for t = 2, 3, ...? Reflecting on the identity\n$$P(T=t) = \\frac{P(T=t)}{P(T=t-1)} \\times P(T=t-1),$$\nif we have a simple expression for the ratio P(T = t)/P(T = t − 1), we can easily compute P(T = 2) given the value of P(T = 1) = α/(α + β). Given the value of P(T = 2), we can then compute P(T = 3), and so on.\n\n=== Page 13/15 ===\n\n
Recalling (5), we have\n$$\\frac{P(T=t)}{P(T=t-1)} = \\frac{B(\\alpha+1,\\beta+t-1)}{B(\\alpha,\\beta)} \\bigg/ \\frac{B(\\alpha+1,\\beta+t-2)}{B(\\alpha,\\beta)} = \\frac{B(\\alpha+1,\\beta+t-1)}{B(\\alpha+1,\\beta+t-2)}$$\nwhich, expressing the beta functions in terms of gamma functions (A2) and canceling terms,\n$$= \\frac{\\Gamma(\\beta+t-1)}{\\Gamma(\\beta+t-2)} \\bigg/ \\frac{\\Gamma(\\alpha+\\beta+t)}{\\Gamma(\\alpha+\\beta+t-1)}$$\nwhich, recalling the recursive nature of the gamma function (A3),\n$$= \\frac{\\beta+t-2}{\\alpha+\\beta+t-1}.$$\nThe complete forward-recursion formula naturally follows.\n\nDerivation of (8)\n\nWe derive the expression for the retention rate as implied by the sBG model by substituting the expression for the sBG survivor function (6) into (2) and simplifying:\n$$r_t = \\frac{B(\\alpha,\\beta+t)}{B(\\alpha,\\beta)} \\bigg/ \\frac{B(\\alpha,\\beta+t-1)}{B(\\alpha,\\beta)} = \\frac{B(\\alpha,\\beta+t)}{B(\\alpha,\\beta+t-1)}$$\nwhich, expressing the beta functions in terms of gamma functions (A2) and canceling terms,\n$$= \\frac{\\Gamma(\\beta+t)}{\\Gamma(\\beta+t-1)} \\bigg/ \\frac{\\Gamma(\\alpha+\\beta+t)}{\\Gamma(\\alpha+\\beta+t-1)}$$\nwhich, recalling the recursive nature of the gamma function (A3),\n$$= \\frac{\\beta+t-1}{\\alpha+\\beta+t-1}.$$\n\n
APPENDIX B IMPLEMENTING THE MODEL IN EXCEL\n\nIn Appendix B, we show how to compute the maximum likelihood estimates for the sBG model parameters for the High End dataset using Microsoft Excel. Before providing step-by-step instructions for constructing the worksheet, we briefly review the notion of maximum likelihood estimation.\n\nSuppose we observe a group of n customers for seven periods. Note that $n_1$ customers are “lost” in the first period (i.e., do not renew their contract at the end of that period), $n_2$ in the second period, ..., with $n_7$ customers not renewing their contracts at the end of the seventh period. It follows that $n - \\sum_{t=1}^{7} n_t$ customers are still active at the end of the seventh period.\n\nAssume that the customer lifetimes can be characterized by the sBG distribution. What is the probability that a randomly chosen customer has a lifetime of one period? The answer is the sBG probability P(T = 1 | α, β). What is the probability that a randomly chosen customer has a lifetime of two periods? The answer is the sBG probability P(T = 2 | α, β). What is the probability that one randomly chosen customer has a lifetime of one period while another has a lifetime of two periods? Assuming that the propensity of one customer to drop out is independent of the behavior of the other customer, it is simply the product of the respective sBG probabilities: P(T = 1 | α, β) P(T = 2 | α, β). It follows that, given specific values of the model parameters α and β, the joint probability of losing $n_1$ customers in the first period, $n_2$ in the second period, ..., $n_7$ in the seventh period, and $n - \\sum_{t=1}^{7} n_t$ customers still being active at the end of the seventh period is\n$$P(\\text{data} \\mid \\alpha,\\beta) = P(T=1 \\mid \\alpha,\\beta)^{n_1}\\,P(T=2 \\mid \\alpha,\\beta)^{n_2}\\,P(T=3 \\mid \\alpha,\\beta)^{n_3} \\times P(T=4 \\mid \\alpha,\\beta)^{n_4}\\,P(T=5 \\mid \\alpha,\\beta)^{n_5}\\,P(T=6 \\mid \\alpha,\\beta)^{n_6} \\times P(T=7 \\mid \\alpha,\\beta)^{n_7}\\,S(7 \\mid \\alpha,\\beta)^{n-\\sum_{t=1}^{7} n_t}. \\tag{B1}$$\n
However, we do not know the values of α and β, even though we believe that the data come from the sBG distribution. The idea of maximum likelihood estimation is to ask what values of the model parameters maximize the probability (or, more formally, the likelihood) of the observed data. We define the likelihood function as\n$$L(\\alpha,\\beta \\mid \\text{data}) = P(T=1 \\mid \\alpha,\\beta)^{n_1}\\,P(T=2 \\mid \\alpha,\\beta)^{n_2}\\,P(T=3 \\mid \\alpha,\\beta)^{n_3} \\times P(T=4 \\mid \\alpha,\\beta)^{n_4}\\,P(T=5 \\mid \\alpha,\\beta)^{n_5}\\,P(T=6 \\mid \\alpha,\\beta)^{n_6} \\times P(T=7 \\mid \\alpha,\\beta)^{n_7}\\,S(7 \\mid \\alpha,\\beta)^{n-\\sum_{t=1}^{7} n_t} \\tag{B2}$$\nand use numerical optimization methods (e.g., the Solver add-in in Excel) to find the values of α and β that maximize this function; these are called the maximum likelihood\n\n=== Page 14/15 ===\n\n
estimates of the model parameters.⁴ As the number computed using (B2) will be very small, we usually work with the natural logarithm of the likelihood function, the so-called log-likelihood function:\n$$LL(\\alpha,\\beta \\mid \\text{data}) = \\ln\\left[L(\\alpha,\\beta \\mid \\text{data})\\right] = \\sum_{t=1}^{7} n_t \\ln\\left[P(T=t \\mid \\alpha,\\beta)\\right] + \\left(n - \\sum_{t=1}^{7} n_t\\right) \\ln\\left[S(7 \\mid \\alpha,\\beta)\\right]. \\tag{B3}$$\nFor a sample of 1,000 High End customers, Table 1 implies the number of customers active at the end of Years 1–7 as reported in Table B1.\n\nTABLE B1 Sample Data\nYear          1    2    3    4    5    6    7\nNo. Active    869  743  653  593  551  517  491\n\nFIGURE B1 Screenshot of Excel Worksheet for Parameter Estimation\n      A      B        C      D        E      F\n 1    alpha  1.000\n 2    beta   1.000\n 3    LL     -2115.5\n 4\n 5    t      P(T=t)   S(t)   #active  #lost\n 6    1      0.500    0.500  869      131    -90.8\n 7    2      0.167    0.333  743      126    -225.8\n 8    3      0.083    0.250  653      90     -223.6\n 9    4      0.050    0.200  593      60     -179.7\n10    5      0.033    0.167  551      42     -142.9\n11    6      0.024    0.143  517      34     -127.1\n12    7      0.018    0.125  491      26     -104.7\n13                                           -1021.0\n\n
Given these data, our task is to “code up” the expression for the model log-likelihood function in an Excel worksheet and find the maximum likelihood estimates of α and β by using Solver to find the values of α and β that maximize the value of this function. The relevant worksheet is shown in Figure B1 and is constructed in the following manner.\n\n• To enter expressions for P(T = t | α, β) without an error message appearing (e.g., #NUM! or #DIV/0!), we need some “starting values” for α and β. The exact values do not matter, provided they are within the defined bounds (i.e., α, β > 0), so we start with 1.0 for α and β, locating these parameter values in Cells B1:B2, respectively.\n• We enter the values of t = 1, 2, ..., 7 in Cells A6:A12.\n• The corresponding values of P(T = t | α, β) are computed in Cells B6:B12 using the forward-recursion given in (7):\n  – We compute P(T = 1) by entering =B1/(B1+B2) in Cell B6.\n  – We compute P(T = 2) by entering =($B$2+A7-2)/($B$1+$B$2+A7-1)*B6 in Cell B7.\n  – We copy B7 to B8:B12.\n• We compute the values of S(t | α, β) for t = 1, 2, ..., 7 in Cells C6:C12:\n  – S(1) is simply 1 − P(T = 1), so we enter =1-B6 in Cell C6.\n  – For t > 1, S(t) = S(t − 1) − P(T = t), so we enter =C6-B7 in Cell C7.\n  – We copy C7 to C8:C12.\n• The next step is to enter the observed data. The number of customers active at the end of Year 1 (n = 869) is entered in Cell D6, the number for Year 2 (n = 743) is entered in Cell D7, and so on down to 491 customers in Cell D12 for Year 7.\n• The number of customers not renewing their contracts each year ($n_t$), as required for the log-likelihood function, is computed in Cells E6:E12:\n  – As the number of customers “lost” in Year 1 is simply the number of initial customers minus the number of customers who are still active at the end of the first year, we enter =1000-D6 in Cell E6.\n  – For t > 1, the number of customers “lost” in Year t is the number of customers who are still active at the end of Year t − 1 minus the number of customers who are still active at the end of Year t. We therefore enter =D6-D7 in Cell E7 and copy it to E8:E12.\n• The first seven elements of the log-likelihood function are computed in Cells F6:F12: We enter =E6*LN(B6) in Cell F6 and copy it to F7:F12.\n\n⁴ Note that (B1) and (B2) look almost identical, but there is a subtle difference: in (B1), the probability we compute is a function of the data pattern for fixed model parameters; in (B2), we already have the data, and the probability we compute is a function of the model parameters.\n\n=== Page 15/15 ===\n\n
• The final element of the log-likelihood function, that associated with those customers who are still active at the end of Year 7, is entered as =D12*LN(C12) in Cell F13.\n• The sum of Cells F6:F13 is entered in Cell B3; this is the value of the log-likelihood function given the values for the two model parameters in Cells B1:B2. (With starting values of 1.0 for both parameters, LL = −2,115.5.)\n\nWe find the maximum likelihood estimates of the two model parameters by maximizing the log-likelihood function. We do this using the Excel add-in Solver, available under the “Tools” menu. The target cell is the value of the log-likelihood, Cell B3. We wish to maximize this by changing Cells B1:B2. The constraints we place on the parameters are that α and β are greater than 0. As Solver offers us only a “greater than or equal to” constraint, we add the constraint that Cells B1:B2 are ≥ a small positive number (e.g., 0.0001) (see Figure B2).\n\nFIGURE B2 Solver Settings\n\nClicking the Solve button, Solver converges to a solution where the maximum value of the log-likelihood function is −1,611.2, associated with α = 0.668 and β = 3.806. These are the maximum likelihood estimates of the model parameters. (To be sure that we actually have reached the maximum of the log-likelihood function, it is good practice to redo the optimization process using a completely different set of starting values. For example, using starting values of 0.01 and 0.01 (for which LL = −2,741.7), use Solver to find the maximum of the log-likelihood function. Are the corresponding values of the two model parameters equal to those given earlier? They should be!)"
    },
    {
      "turn_idx": 13,
      "action": "fetch_website_content({\"website_url\": \"https://lbsresearch.london.edu/id/eprint/954/1/BdW_JIM_2018-01-10.pdf\"})",
      "explanation": "",
      "observation": "PDF content extracted successfully (pdfplumber, 36 pages total):\n\n\n=== Page 1/36 ===\n\nLBS Research Online\nP S Fader, B G S Hardie, Y Liu, J Davin and T Stennburgh\n\"How to project customer retention\" revisited: the role of duration dependence\nArticle\nThis version is available in the LBS Research Online repository: https://lbsresearch.london.edu/id/eprint/954/\nFader, P S, Hardie, B G S, Liu, Y, Davin, J and Stennburgh, T (2018)\n\"How to project customer retention\" revisited: the role of duration dependence.\nJournal of Interactive Marketing, 43 (August). pp. 1-16. ISSN 1094-9968\nDOI: https://doi.org/10.1016/j.intmar.2018.01.002\nReuse of this item is allowed under the Creative Commons licence:\nhttp://creativecommons.org/licenses/by-nc-nd/4.0/\nElsevier\nhttps://www.sciencedirect.com/science/article/pii/...\nUsers may download and/or print one copy of any article(s) in LBS Research Online for purposes of research and/or private study. Further distribution of the material, or use for any commercial gain, is not permitted.\n\n=== Page 2/36 ===\n\n“How to Project Customer Retention” Revisited:\nThe Role of Duration Dependence\nPeter S. Fader\nBruce G.S. Hardie\nYuzhou Liu\nJoseph Davin\nThomas Steenburgh†\nJanuary 2018\n†Peter S. Fader is the Frances and Pei-Yuan Chia Professor of Marketing at The Wharton School of the University of Pennsylvania (address: 771 Jon M. Huntsman Hall, 3730 Walnut Street, Philadelphia, PA 19104-6340; phone: (215) 898-1132; email: faderp@wharton.upenn.edu; web: www.petefader.com). Bruce G.S. Hardie is Professor of Marketing at the London Business School (email: bhardie@london.edu; web: www.brucehardie.com). Yuzhou Liu is a Quantitative Researcher at Man Numeric. Joseph Davin is Head of Data Science, Two Six Capital. 
Thomas Steenburgh is the Bank of America Research Professor of Business Administration at the Darden Graduate School of Business, University of Virginia.\n\n=== Page 3/36 ===\n\nAbstract\n“How to Project Customer Retention” Revisited:\nThe Role of Duration Dependence\nCohort-level retention rates typically increase over time, and the beta-geometric (BG) distribution has proven to be a robust model for capturing and projecting these patterns into the future. According to this model, the phenomenon of increasing cohort-level retention rates is purely due to cross-sectional heterogeneity; an individual customer’s propensity to churn does not change over time. In this paper we present the beta-discrete-Weibull (BdW) distribution as an extension to the BG model, one that allows individual-level churn probabilities to increase or decrease over time. In addition to capturing the phenomenon of increasing cohort-level retention rates, this new model can also accommodate situations in which there is an initial dip in retention rates before they increase (i.e., a U-shaped cohort-level retention curve). A key finding is that even when aggregate retention rates are monotonically increasing, the individual-level churn probabilities are unlikely to be declining over time, as conventional wisdom would suggest. 
We carefully explore these connections between heterogeneity, duration dependence, and the shape of the retention curve, and draw some managerially relevant conclusions, e.g., that accounting for cross-sectional heterogeneity is more important than accounting for any individual-level dynamics in churn propensities.\nKeywords: beta-geometric (BG) distribution, beta-discrete-Weibull (BdW) distribution, retention rate dynamics.\n\n=== Page 4/36 ===\n\n1 Introduction\nAny researcher working with data from a business that has a “contractual” relationship with its customers (e.g., one with a subscription-based business model) will want models for projecting customer retention (or equivalently, tenure) as part of their toolkit. For example, estimates of the length of a customer’s relationship with the firm lie at the heart of any attempt to compute customer lifetime value (CLV). Similarly, such models are useful when evaluating the relative performance of different acquisition channels.\n\nFader and Hardie (2007), hereafter FH, presented the beta-geometric (BG) distribution as a simple probability model for projecting customer retention. This model is based on an easy-to-understand “story” of customer behavior, is simple to implement (e.g., can be done so in Excel), and its estimates of customer retention over a longitudinal holdout period have proven to be surprisingly accurate and robust.\n\nAccording to this model, the widely observed phenomenon of increasing cohort-level retention rates (Reichheld 1996) is purely due to cross-sectional heterogeneity, with individual customers having a constant propensity to churn. Cohort-level retention rates increase because those customers with high churn propensities drop out early on, leaving an ever-increasing proportion of customers who have low propensities to churn. 
This assumption that an individual customer’s propensity to churn does not change over time flies in the face of conventional wisdom, which assumes that a customer’s propensity to churn decreases the longer their tenure with the firm.\n\nWhile cohort-level retention rates are typically monotonically increasing with tenure, we sometimes observe an initial dip before they increase (e.g., Israel 2005, Nitzan et al. 2011, and one of the FH datasets), a phenomenon that the BG model cannot capture. In this paper we develop a generalization of the BG model, one that both relaxes the assumption of time-invariant individual-level propensities to churn and is sufficiently flexible to capture the phenomenon of non-monotonically increasing cohort-level retention rates. Surprisingly, we find that when the assumption of constant individual-level propensities to churn is violated, it is more likely that these propensities increase with tenure (rather than decrease, as conventional wisdom would suggest).\n\nThis paper is organized as follows. In the next section we re-examine the work of FH,\n\n=== Page 5/36 ===\n\nreviewing the BG model and its empirical performance. We then present our generalization of the BG model, the beta-discrete-Weibull (BdW) distribution, and examine its performance using the two datasets presented in FH. This is followed by an investigation of the properties of the cohort-level retention rates associated with the BdW model. 
We then investigate the robustness of our results by exploring some alternative model specifications, and conclude with a brief discussion of the implications of this work.\n\n2 A Brief Review of the BG Model\nFH propose a simple probability model for characterizing and forecasting the length of a customer’s relationship with a firm in a contractual setting that is based on the following “as if” story of customer behavior:\ni) At the end of each contract period, an individual decides whether or not to renew their contract by tossing a coin: “heads” they renew their contract, “tails” they cancel it.\nii) For a given individual, the probability of a coin coming up “tails” does not change over time.\niii) The probability of a coin coming up “tails” varies across customers. (This implies that the coins are not assumed to be “fair.”)\n\nThis is formalized in the following manner. Let the random variable T denote the length of an individual’s relationship with the firm, and θ denote the probability of a given individual’s coin coming up “tails” when tossed. Assumptions (i) and (ii) are equivalent to assuming that T is distributed geometric with survivor function\n$$S(t \\mid \\theta) = (1-\\theta)^{t}, \\quad 0 < \\theta < 1, \\quad t = 0, 1, 2, \\ldots \\tag{1}$$\nFrom the analyst’s perspective, the unobserved (and unobservable) θ is a realization of the random variable Θ. Given its flexibility and mathematical convenience, the natural distribution for characterizing Θ is the beta distribution:\n\n=== Page 6/36 ===\n\n$$f(\\theta \\mid \\gamma,\\delta) = \\frac{\\theta^{\\gamma-1}(1-\\theta)^{\\delta-1}}{B(\\gamma,\\delta)}, \\quad \\gamma,\\delta > 0. \\tag{2}$$\nIt follows that for a randomly chosen individual,\n$$S(t \\mid \\gamma,\\delta) = \\int_0^1 S(t \\mid \\theta)\\,f(\\theta \\mid \\gamma,\\delta)\\,d\\theta = \\frac{B(\\gamma,\\delta+t)}{B(\\gamma,\\delta)}, \\quad t = 0, 1, 2, \\ldots \\tag{3}$$\n
This beta mixture of geometrics is called the beta-geometric (BG) distribution.¹ (See FH for model derivations and information on how to estimate the model parameters; also see Fader and Hardie (2014) for an alternative estimation approach.)\n\nTo the best of our knowledge, this mixture model was first derived by Pielou (1962), who used it to characterize the run lengths of species in plant populations. Potter and Parker (1964) were the first to use it as a model for duration-time data, using it to characterize the number of menstrual cycles a woman experiences before she conceives. Within the marketing literature, it was used by Morrison and Perry (1970) as a model of the number of units purchased on a given transaction occasion, and by Buchanan and Morrison (1988) as a model of response to promotional stimuli; also see Fox et al. (1997). FH explored its properties as a model of the length of a customer’s relationship with a firm in a contractual setting, with Fader and Hardie (2010) taking the logical next step and using it as the basis for calculating CLV.\n\nAt the heart of the FH paper is a customer metric of great interest to managers and analysts: the retention rate. When computed at the level of the cohort, the period t retention rate is the portion of period t customers (i.e., those who have “survived” to period t) who renew their contracts at the end of that period. This can be computed as\n\n¹ FH called this the shifted-beta-geometric (sBG) model; the term “shifted” is used to make the distinction between two versions of the geometric distribution; one with support 0, 1, 2, ..., and the other with support 1, 2, 3, ..., with the term “shifted” being applied to the second version. (In our contract-duration setting, the first version would apply when T is defined as the number of contract renewals the individual makes before they cancel their contract, rather than the length of the individual’s relationship with the firm (measured in number of contract periods), as is the case above.) 
However, most applications of this mixture model are in settings in which the support is 1, 2, 3, ..., yet the term shifted is not applied. So as to be consistent with this broader literature, we use the label BG (rather than sBG) for this distribution.\n\n=== Page 7/36 ===\n\n$$r(t \\mid \\gamma,\\delta) = \\frac{S(t \\mid \\gamma,\\delta)}{S(t-1 \\mid \\gamma,\\delta)} = \\frac{B(\\gamma,\\delta+t)}{B(\\gamma,\\delta+t-1)} = \\frac{\\delta+t-1}{\\gamma+\\delta+t-1}, \\quad t = 1, 2, 3, \\ldots \\tag{4}$$\nNote that, for any values of γ and δ, this is an increasing function of time. It is important to note that there are no underlying time dynamics at the level of the individual customer; see assumption (ii) above. The increasing (aggregate/cohort-level) retention rate is simply due to a sorting effect in a heterogeneous population.\n\nTo elaborate on this sorting effect, let ρ(t) denote the individual-level probability that someone who has made t − 1 renewals will renew at the next opportunity (i.e., P(heads)). Given the assumption of individual-level relationship durations characterized by the geometric distribution, ρ(t) = S(t|θ)/S(t − 1|θ) = 1 − θ. Recall that the unobserved (and unobservable) θ are viewed as realizations of the random variable Θ. Similarly, ρ(t) is a realization of P(t).\n\nSince ρ(t) is a function of θ, the distribution of P(t) is a function of the posterior distribution of Θ across the period t customers (i.e., those who have made t − 1 contract renewals). Recalling Bayes’ theorem,\n$$f(\\theta \\mid \\gamma,\\delta;\\, t-1 \\text{ renewals}) = \\frac{S(t-1 \\mid \\theta)\\,f(\\theta \\mid \\gamma,\\delta)}{S(t-1 \\mid \\gamma,\\delta)} = \\frac{\\theta^{\\gamma-1}(1-\\theta)^{\\delta+t-2}}{B(\\gamma,\\delta+t-1)}, \\quad t = 1, 2, 3, \\ldots \\tag{5}$$\nwhich is a beta distribution with parameters γ and δ + t − 1. Given that ρ(t) = 1 − θ, the distribution of P(t) across period t customers is simply the reflection of this posterior distribution about θ = 0.5, which is a beta distribution with parameters δ + t − 1 and γ:\n$$f(\\rho(t) \\mid \\gamma,\\delta) = \\frac{\\rho(t)^{\\delta+t-2}\\,(1-\\rho(t))^{\\gamma-1}}{B(\\delta+t-1,\\gamma)}, \\quad t = 1, 2, 3, \\ldots \\tag{6}$$\n
The mean of this distribution equals the expression for r(t|γ,δ) given in (4), i.e., r(t) = E[P(t)]. Dynamics in r(t) are simply due to changes in the nature of the distribution of P(t), which are\n\n=== Page 8/36 ===\n\nsimply due to customers with higher churn propensities dropping out, leaving an ever-increasing proportion of customers who have low propensities to churn.\n\n3 Revisiting FH’s Analysis\nWe start by revisiting the empirical analysis presented in FH. Our objectives are two-fold. First, we wish to highlight the robustness of the BG model. Second, we wish to identify the phenomenon that motivates this work.\n\nThe data presented in Table 1, drawn from Berry and Linoff (2004), document the year-on-year renewals for two segments of customers (“Regular” and “High End”) of an unspecified firm in a contractual setting. (See FH for further details.) For a nominal sample of 1000 customers acquired at the beginning of Year 1, we observe their pattern of renewals over 12 consecutive (annual) renewal opportunities. For example, 631 individuals in the Regular dataset renew their contract at the end of the first year, and are therefore customers in Year 2. Of these 631 individuals who have a contractual relationship with the firm in Year 2, 468 renew their contract at the end of the year and are therefore customers in Year 3. And so on.\n\n        # Customers\nYear    Regular    High End\n1       1000       1000\n2       631        869\n3       468        743\n4       382        653\n5       326        593\n6       289        551\n7       262        517\n8       241        491\n9       223        468\n10      207        445\n11      194        427\n12      183        409\n13      173        394\nTable 1: Pattern of year-on-year renewals for a cohort of 1000 customers from two segments (Regular and High End) acquired at the beginning of Year 1\n\nFH undertake an analysis in which the model is calibrated using the first eight years of data (seven renewal opportunities) and its predictive performance assessed over the remaining\n\n=== Page 9/36 ===\n\nfive years of data (five renewal opportunities). 
The estimation results are presented in Table 2 (columns 2 and 3) and the model-based estimates of survival and retention are compared against the actual numbers for both datasets in Figure 1.²\n\n      Eight-year Calibration Period    Five-year Calibration Period\n      Regular      High End            Regular      High End\nγ     0.704        0.668               0.764        1.281\nδ     1.182        3.806               1.296        7.790\nLL    −1680.3      −1611.2             −1401.6      −1225.1\nTable 2: BG model estimation results\n\n[Figure 1 plots not recoverable from the extraction. Left panel: % Surviving vs. Tenure (years), 1 to 13; right panel: Retention Rate vs. Year, 1 to 12; series: Regular and High End, Actual vs. BG.]\n
Figure 1: Comparing actual and BG-model-based estimates of survival (LHS) and retention (RHS) given an eight-year model calibration period. (The model-based numbers to the right of the vertical dashed line are projections given the parameter values estimated using the data to the left of this line.)\n\nWe note that this simple probability model does an excellent job of predicting survival (and therefore retention) in the Regular dataset. The prediction of survival in the High End dataset is good but not quite as impressive on retention as for the Regular dataset. FH noted that “[d]espite the existence of certain unexplained “blips” as in Year 2 for the High End dataset, the tracking/prediction plot for [r(t)] is very impressive through Year 12,” and made no further comment.\n\n
We now “stress test” the BG model by shortening the calibration period to five years (four renewal opportunities), thereby lengthening the validation period to eight years. The estimation results are presented in Table 2 (columns 4 and 5). Note that for the Regular cohort the parameter estimates are quite similar. In fact, evaluating the eight-year calibration period likelihood\n\n² A copy of the spreadsheet containing the analyses presented in this paper can be found at <insert URL>.\n\n=== Page 10/36 ===\n\nfunction using the five-year calibration period parameter estimates yields a log-likelihood of −1680.6. This stability in parameter estimates suggests that the BG model is an excellent characterization of the true data-generating process. 
This conclusion is supported when we\ncompare the model-based estimates of survival and retention to the actual numbers (top half\nof Figure 2). While not as good as for an eight-year calibration period, the performance is still\nvery impressive when we consider that we are predicting behavior across a holdout period that\nis twice as long as the calibration period.\n\n[Figure 2 plots: for each of the Regular (top) and High End (bottom) datasets, % Surviving versus Tenure (years) on the left and Retention Rate versus Year on the right, each showing Actual and BG curves.]\nFigure 2: Comparing actual and BG-model-based estimates of survival (LHS) and retention\n(RHS) for the Regular (top) and High End (bottom) datasets given a\nfive-year model calibration period. (The model-based numbers to the right of\nthe vertical dashed line are projections given the parameter values estimated\nusing the data to the left of this line.)\n\nThe results for the High End dataset are a completely different story. We note from Table 2\nthat the parameter estimates are very sensitive to the length of the calibration period, suggesting\nthat the BG model is not a good characterization of the true data-generating process for this\ndataset. This problem is even more evident when we compare the model-based estimates of\nsurvival and retention to the actual numbers (bottom half of Figure 2). While the model\nappears to be tracking actual survival in the calibration period, it progressively under-predicts\nsurvival with the passage of time in the validation period. This is reflected in the failure of the\nmodel to capture the dynamics in the retention rates observed in this dataset.\nFor an eight-year calibration period, the dip in the retention rate observed in Year 2 is\neffectively treated as an outlier that has little impact on model estimation; the overall trend of\nincreasing retention rates is adequately captured (as observed in Figure 1). However, it has a\nfar greater influence on model estimation when we shorten the model calibration period to five\nyears (as observed in the bottom-right plot in Figure 2). Whereas it seemed acceptable for FH\nto brush aside the Year 2 dip, the shorter calibration period shows that we cannot ignore it.\nCohort-level retention rates are predominantly monotonically increasing (as in the Regular\ndataset), and the BG model is a robust way to characterize such behaviour. 
However, it is not\na robust model when faced with the type of cohort-level retention rate pattern observed in the\nHigh End dataset. We have observed such a dip in several other datasets (e.g., Israel 2005, Nitzan\net al. 2011). This suggests the need for an alternative, more flexible, model for characterizing\nand forecasting the length of a customer’s relationship with the firm in a contractual setting.\n\n4 The BdW Model\nWhen a model doesn’t “work,” we question its underlying assumptions. Reflecting on the “as\nif” story of buyer behavior underpinning the BG model, a number of people struggle with the\nassumption that, for a given individual, the probability of a coin coming up “tails” does not\nchange over time. They expect it to become more “headsy” over time (i.e., the individual is\nexpected to become more “loyal” the longer they remain a customer).[3]\n\n[3] There are two standard explanations for such an expectation. The first is based on an evolution of customer\nsatisfaction argument (e.g., Bolton 1998), while the second is based on an increasing switching costs argument\n(e.g., Burnham et al. 2003).\n\nTo accommodate this in a continuous-time environment, the natural starting point would\nbe to replace the exponential distribution (the continuous-time equivalent of the geometric\ndistribution) with the Weibull distribution, which allows for an individual’s risk of canceling\ntheir contract to increase or decrease as the length of the relationship with the firm increases\n(Murthy et al. 2004, Rinne 2009). Working within a discrete-time contractual setting, the\nnatural starting point is to use a discrete Weibull distribution.\nA discrete-time equivalent of a continuous distribution can be constructed by treating the\ndiscrete lifetime variable as the integer part of the continuous lifetime and discretizing its cdf\nor, equivalently, survivor function (Lai 2013). Suppose the continuous lifetime random variable\nX is distributed Weibull with survivor function\n    S(x | λ, c) = exp(−λ x^c) = [exp(−λ)]^{x^c}.\nLetting exp(−λ) = 1 − θ, it follows that the survivor function associated with the discrete lifetime\nrandom variable T = ⌊X⌋ is\n    S(t | θ, c) = (1 − θ)^{t^c},  0 < θ < 1, c > 0, t = 0, 1, 2, ....  (7)\nThe associated pmf is given by\n    P(T = t | θ, c) = (1 − θ)^{(t−1)^c} − (1 − θ)^{t^c},  t = 1, 2, 3, ....  (8)\nThis is the discrete Weibull (dW) distribution proposed by Nakagawa and Osaki (1975).\nWhile there are other discrete Weibull distributions (Murthy et al. 2004, Chapter 13; Rinne\n2009, Section 3.3.1), this one is simple, flexible, and the best analogue of the continuous Weibull\ndistribution (Bracquemond and Gaudoin 2003). We note that the dW collapses to the geometric\ndistribution when c = 1, just as the Weibull collapses to the exponential distribution when\nc = 1.[4]\n\n[4] There are no closed-form expressions for the mean and variance of the dW distribution. Looking at (8), we\nsee that P(T = 1 | θ, c) = θ, which means the pmf is reverse-J-shaped (i.e., the mode is at T = 1) when θ > 0.5.\nWhen c ≤ 1, the pmf is reverse-J-shaped for all values of θ. When θ < 0.5, the pmf has an interior mode (i.e., the\ncdf is S-shaped) when c > ln[ln(1 − 2θ)/ln(1 − θ)]/ln(2). (In contrast, the geometric distribution pmf is always\nreverse-J-shaped, which means its cdf is concave.)\n\nUnder the dW distribution, the individual-level probability that someone who has made t − 1\nrenewals will renew at the next opportunity is given by\n    ρ(t | θ, c) = S(t | θ, c) / S(t − 1 | θ, c) = (1 − θ)^{t^c − (t−1)^c},  t = 1, 2, 3, ....  (9)\nWhen c = 1, we have a constant individual retention probability, i.e., P(heads) in the “story”\nunderpinning the BG model. When c > 1, t^c − (t − 1)^c increases with time, which means\nthe proverbial coin becomes less “headsy” the longer the individual remains a customer (i.e.,\nthe longer they remain a customer, the less likely they are to renew their contract). 
When\nc < 1, t^c − (t − 1)^c decreases with time, which means the coin becomes more “headsy” the\nlonger the individual remains a customer (i.e., we have an increasing individual-level retention\nprobability). When c > 1, the discrete Weibull is said to exhibit positive duration dependence\n(i.e., an increasing probability of “failure”). Similarly, when c < 1, it is said to exhibit negative\nduration dependence (i.e., a decreasing probability of “failure”).\nWe fit this distribution to our two datasets (using the same five-year calibration period).\nThe estimation results are presented in Table 3, and the associated tracking plots in Figure 3.\n\n        Regular      High End\nθ       0.374        0.138\nc       0.636        0.910\nLL      −1404.0      −1226.5\nTable 3: dW model estimation results\n\nAs would be expected, our estimate of c is less than 1 (i.e., negative duration dependence)\nfor both datasets, which implies that individual-level retention probabilities increase over time.\nComparing the LL values with those associated with the BG model (Table 2), we see that the\ndW does not fit the data nearly as well as the BG. Looking at the tracking plots in Figure 3,\nwe see that the dW fails to capture the retention rate dynamics observed in both datasets,\nand therefore fails to track the number of “surviving” customers. Comparing these plots with\nthose in Figure 2, it is clear that, at least for these datasets, a model that explains increases in\naggregate retention rates in terms of heterogeneity alone (i.e., the BG, which assumes c = 1)\ndoes better than one that explains it in terms of duration dependence alone. 
We also note that\nthe dW fails to capture the Year 2 dip in the High End retention rate curve.\n\n[Figure 3 plots: for each of the Regular (top) and High End (bottom) datasets, % Surviving versus Tenure (years) on the left and Retention Rate versus Year on the right, each showing Actual and dW curves.]\nFigure 3: Comparing actual and dW-model-based estimates of survival (LHS) and retention\n(RHS) for the Regular (top) and High End (bottom) datasets given a\nfive-year model calibration period. (The model-based numbers to the right of\nthe vertical dashed line are projections given the parameter values estimated\nusing the data to the left of this line.)\n\nSo what happens when we allow for both heterogeneity and duration dependence? Assuming\ncross-sectional heterogeneity in θ is characterized by a beta distribution with parameters (γ, δ),\nit follows that for a randomly chosen individual,\n    S(t | γ, δ, c) = ∫_0^1 S(t | θ, c) f(θ | γ, δ) dθ\n                   = B(γ, δ + t^c) / B(γ, δ),  t = 0, 1, 2, ...  (10)\nand\n    P(T = t | γ, δ, c) = S(t − 1 | γ, δ, c) − S(t | γ, δ, c)\n                       = [B(γ, δ + (t−1)^c) − B(γ, δ + t^c)] / B(γ, δ),  t = 1, 2, 3, ....  (11)\nWe call this parametric mixture model the beta-discrete-Weibull (BdW).\nThe associated aggregate/cohort-level retention rate is\n    r(t | γ, δ, c) = S(t | γ, δ, c) / S(t − 1 | γ, δ, c)\n                   = B(γ, δ + t^c) / B(γ, δ + (t−1)^c)\n                   = [Γ(δ + t^c) / Γ(δ + (t−1)^c)] × [Γ(γ + δ + (t−1)^c) / Γ(γ + δ + t^c)],  t = 1, 2, 3, ....  (12)\nWe explore the shape of the associated retention rate curve below.\nThe distribution of P(t) (i.e., P(heads) across those customers who have made t − 1 renewals)\nis\n    f(ρ(t) | γ, δ, c) = [1 / (t^c − (t−1)^c)] × [1 / ρ(t)]\n        × {1 − ρ(t)^{1/(t^c − (t−1)^c)}}^{γ−1} × {ρ(t)^{1/(t^c − (t−1)^c)}}^{δ + (t−1)^c} / B(γ, δ + (t−1)^c),  t = 1, 2, 3, ....  (13)\nThe mean of this distribution is, of course, the (aggregate) retention rate r(t | γ, δ, c). (See\nAppendix A for the derivations.) When c = 1 (i.e., BdW→BG), (13) reduces to (6). Similarly,\n(12) reduces to (4).\nWe fit this model to both the Regular and High End datasets using a five-year calibration\nperiod. (See Appendix B for details of how to estimate the model parameters in Excel.) The\nestimation results are presented in Table 4. 
Comparing these results with those for the BG\nmodel (Table 2), we see that the improvement in model fit for the Regular dataset is negligible\n(LR = 0.35, p = 0.553), which means c is not significantly different from 1. On the other hand,\nwe observe a significant improvement in fit for the High End dataset (LR = 4.77, p = 0.029),\nwhich means c is significantly different from 1. Comparing these estimates of c to those obtained\nwhen fitting the dW to these datasets (Table 3), we notice that the dW estimate is downward biased,\nreflecting the well-known result that unobserved heterogeneity induces spurious negative duration\ndependence in models for duration-time data (Kiefer 1988, Proschan 1963, Vaupel and Yashin\n1985).\n\n        Regular      High End\nγ       0.523        0.259\nδ       0.894        1.722\nc       1.197        1.584\nLL      −1401.4      −1222.7\nTable 4: BdW model estimation results\n\nIn Figure 4 the model-based estimates of survival and retention are compared against the\nactual numbers for both datasets. The figures speak for themselves; the performance of the\nBdW model is impressive.[5]\n\n[5] Fitting the BdW model to the Regular dataset using an eight-year calibration period yields the following\nparameter estimates: γ̂ = 0.456, δ̂ = 0.779, and ĉ = 1.284. While these are slightly different from those associated\nwith a five-year calibration period (Table 4), inserting these eight-year estimates in the log-likelihood function\nassociated with a five-year calibration period yields the same value of −1401.4. Using an eight-year calibration\nperiod for the High End dataset yields the following parameter estimates: γ̂ = 0.214, δ̂ = 1.427, and ĉ = 1.724.\nInserting these eight-year estimates in the log-likelihood function associated with a five-year calibration period\nyields a value of −1222.9 (versus −1222.7 in Table 4). This relative insensitivity to the length of the model\ncalibration period suggests that the BdW provides a good characterization of the true data-generating process.\n\n[Figure 4 plots: for each of the Regular (top) and High End (bottom) datasets, % Surviving versus Tenure (years) on the left and Retention Rate versus Year on the right, each showing Actual and BdW curves.]\nFigure 4: Comparing actual and BdW-model-based estimates of survival (LHS) and retention\n(RHS) for the Regular (top) and High End (bottom) datasets given a\nfive-year model calibration period. (The model-based numbers to the right of\nthe vertical dashed line are projections given the parameter values estimated\nusing the data to the left of this line.)\n\nAt first glance, it is not clear why the model captures the downward blip in the aggregate\nretention rate curve we observe in the High End dataset. We note that ĉ = 1.584, indicating that\nat the level of the individual, customers become less likely to renew their contracts the longer\nthey remain a customer. However, after some time, the sorting effect of heterogeneity (which\ncauses the aggregate retention rate to increase over time) dominates the individual-level decline\nand the aggregate retention rate starts rising. We explore this further in Section 5 below.\nHow do our inferences about the underlying level of cross-sectional heterogeneity change when\nwe allow for individual-level duration dependence? The parameters of the beta distribution can\nbe characterized in terms of the mean E(Θ) = γ/(γ + δ) and polarization index φ = 1/(γ + δ + 1).\nThe logic behind the polarization index is as follows: as γ, δ → 0 (thus φ → 1), the values of θ are\nconcentrated near θ = 0 and θ = 1 and we can think of the values of θ as being very different, or\n“highly polarized.” As γ, δ → ∞ (thus φ → 0), the beta distribution becomes a spike at its mean;\nthere is no “polarization” in the values of θ. Given the five-year calibration period parameter\nestimates for the High End dataset in Table 2, φ̂_BG = 0.099; given the parameter estimates in\nTable 4, φ̂_BdW = 0.335. We observe that there is greater heterogeneity in the presence of the\npositive duration dependence to capture the dominant pattern of increasing aggregate retention\nrates observed in the data.\n\n5 Exploring the Shape of r(t)\nOur analysis of the High End dataset demonstrates that the BdW model can capture an early\ndip in the aggregate retention rate curve, something that is not immediately obvious given the\nunderlying model assumptions. The explanation given is one of the “battle” between\nindividual-level duration dependence and cross-sectional heterogeneity. 
We now undertake a more thorough\ninvestigation of the shape of r(t) under the BdW model.\nLet us start by considering three scenarios: Case 1 (γ = 4.75 and δ = 14.25), Case 2 (γ = 0.5\nand δ = 1.5), and Case 3 (γ = 0.083 and δ = 0.250). While the associated distributions of Θ\nhave the same mean (E(Θ) = 0.25), they take on quite different shapes (Figure 5). In Case 1,\nthe distribution of Θ is relatively homogeneous (φ = 0.05) with an interior mode. In Case 2,\nthere is quite a bit of heterogeneity (φ = 0.33) in the distribution of Θ, with the majority of\nindividuals having lowish values of θ. The heterogeneity in Case 3 (φ = 0.75) is extreme; this\nU-shaped distribution indicates that some of the acquired customers have a high value of θ\n(which maps to a low probability of renewal), while a larger number of customers have small\nvalues of θ.\n\n[Figure 5 plot: g(θ) versus θ (0.0 to 1.0) for Case 1 (φ = 0.05), Case 2 (φ = 0.33), and Case 3 (φ = 0.75).]\nFigure 5: Shape of the beta distribution for Cases 1–3.\n\nWhen c < 1, the cohort-level retention rates always increase over time; see Figure 6a, which\nshows the cohort-level retention rates when c = 0.75 for the completely homogeneous case as well\nas for Cases 1–3. Referring back to Figure 5, the distribution of Θ for Case 3 is very polarized.\nThis means we have a group of people with high values of θ and they churn almost immediately,\nleaving us with the group of customers who have very low values of θ (and, because c < 1, their\nchurn probabilities only get smaller over time). Thus r(t) jumps and then levels off very quickly.\nIn Case 1, where the distribution of Θ is relatively more homogeneous, the leveling-off process\nis slower, and is much closer to the homogeneous case. Case 2 lies between these extremes.\n\n[Figure 6 plots: Retention Rate versus Period (1 to 10) for (a) c = 0.75 and (b) c = 1.25, each showing the Homogeneous (φ → 0) case, Case 1 (φ = 0.05), Case 2 (φ = 0.33), and Case 3 (φ = 0.75).]\nFigure 6: Shape of the beta-discrete-Weibull retention curve for different levels of heterogeneity\nin Θ and different values of c (with E(Θ) = 0.25 in all cases).\n\nWhen c > 1, an individual’s retention probability decreases over time. But what is the\ncountervailing effect of heterogeneity? Consider Figure 6b, which shows the cohort-level retention\nrates when c = 1.25 for differing levels of heterogeneity in the distribution of Θ across the cohort\nmembers. In the completely homogeneous case, we observe a monotonically decreasing retention\nrate. But when any heterogeneity is present, we see the “ruse of heterogeneity” (Vaupel and\nYashin 1985): even though the individual-level customer retention probabilities are decreasing\nover time, the effect of a moderate amount of heterogeneity in the distribution of Θ across\ncohort members is to cause the retention rate to start increasing, either immediately or (if φ is\nlow) after a few periods. We note that the model can accommodate a one-period dip in retention\nrates (as observed in Figure 4) or a more prolonged dip (as observed in Case 1). 
As the level of heterogeneity increases, as in Cases 2 and 3, we find that the cohort-level retention rate monotonically increases over time (even though the individual-level retention probability is a decreasing function of time).\nTo further explore the relationship between heterogeneity and dynamics, we look at the shape of the BdW retention curve as a function of c and φ for three different values of E(Θ) (0.10, 0.25, 0.40); see Figure 7. The monotonically decreasing curve appears only in the degenerate case where there is no heterogeneity (φ → 0), so we ignore this special case to focus on the other two general shapes that can occur. When c < 1, the aggregate retention curve must increase monotonically. For this range of the shape parameter c, each individual’s renewal probability increases over time. The sorting effect of heterogeneity also pushes the curve up. Since both forces work in the same direction, the resulting aggregate retention curve must rise over time. When c > 1, either shape can arise depending on the strength of duration dependence and the level of heterogeneity in the underlying distribution of Θ. A U-shaped curve requires customers to have an increasing propensity to churn in order to generate the initial dip. The latter part of the U results from the sorting effect of heterogeneity. However, as the level of heterogeneity increases, the sorting effect of heterogeneity dominates individual-level duration dependence and we have a monotonically increasing retention curve. Finally, as seen by the gradual “fanning out” of the curved line towards the top of the figure, we note that the shape of the BdW retention curve is surprisingly insensitive to variations in E(Θ).\nTo gain additional insight into the U-shaped versus monotonically increasing shape of r(t), let us further investigate the shape of the BdW retention curve for different levels of heterogeneity in θ, with E(Θ) = 0.25 and c = 1.25 in all cases. 
We see in Figure 8 that the location of the minimum of r(t) increases as the level of heterogeneity in θ decreases, with its location going to infinity as φ → 0. As the level of heterogeneity increases, the location of the dip shifts to the left and it becomes less deep. We then get to the point where r(1) = r(2), after which r(t)\n\n=== Page 21/36 ===\n\nbecomes S-shaped and then concave.\n[Figure 7 plot: c (0.0 to 2.0) versus φ (0.00 to 1.00, from less heterogeneous to more heterogeneous), with regions labelled monotonically decreasing, U-shaped, and monotonically increasing, and curves for E(Θ) = 0.10, E(Θ) = 0.25, and E(Θ) = 0.40.]\nFigure 7: Shape of the beta-discrete-Weibull retention curve as a function of c and φ for E(Θ) = {0.10, 0.25, 0.40}.\n[Figure 8 plot: Retention Rate (0.5 to 1.0) versus Period (1 to 10) for several values of φ, including φ → 0.]\nFigure 8: Evolution of the shape of the beta-discrete-Weibull retention curve as the level of heterogeneity in Θ increases (with E(Θ) = 0.25 and c = 1.25 in all cases).\nClearly c > 1 when r(t) is U-shaped. 
Is there anything we can learn about the sign of c from the shape of a monotonically increasing r(t)? The answer is yes. Given r(1), r(2), and r(3), we can (numerically) solve for the three BdW model parameters (γ, δ, c). We present in\n\n=== Page 22/36 ===\n\nFigure 9 a plot that indicates whether c is greater than or less than 1 as a function of r(2) and r(3) given r(1) = 0.60 and r(1) = 0.75. (We limit the analysis to the region of interest above the dashed line where retention rates increase monotonically (r(1) ≤ r(2) ≤ r(3)).) We note that the area where the aggregate retention curve is monotonically increasing while each individual’s renewal probability increases over time (i.e., c < 1) is very small. In order for c to be less than 1, r(2) must be larger than r(1) but r(3) cannot be much larger than r(2). In other words, the jump in the aggregate retention rate between periods 2 and 3 is crucial to determining whether there is positive or negative dependence at the level of the individual customer: the bigger the difference between r(2) and r(3), the more positive duration dependence there is in the data (i.e., individual customers’ propensities to churn increase over time). (We observe this in Figure 6 (where r(1) = 0.75). For the Case 3 retention curves, r(2) = 0.952 and r(3) = 0.973 in Figure 6a, while the corresponding numbers in Figure 6b are 0.923 and 0.956. The bigger gap is observed in Figure 6b, which corresponds to the case of c > 1.)\n[Figure 9 plots: r(3) versus r(2), with regions labelled c > 1 and c < 1. Left panel: axes from 0.6 to 1.0. Right panel: axes from 0.75 to 1.00.]\nFigure 9: Nature of individual-level duration dependence as a function of r(2) and r(3) given r(1) = 0.60 (LHS) and r(1) = 0.75 (RHS).\n6 Alternative Model Specifications\nThe conclusions drawn above about the nature of retention dynamics are based on the assumption that the BdW is a valid characterization of the true data-generating process. In this section,\n\n=== Page 23/36 ===\n\nwe investigate the robustness of these results by exploring some alternative model specifications. We first consider a model based on an alternative distribution that allows for retention dynamics at the level of the individual customer, and find that our results hold. This analysis still suffers from a key assumption associated with our BdW-based analysis, that of homogeneity in the parameter that determines whether individual-level retention probabilities are decreasing, constant, or increasing over time. We therefore undertake some analysis in which we allow the c parameter of the discrete-Weibull distribution to vary across customers. We do not find any strong evidence to support such an effect.\n6.1 Changing the Underlying Weibull Distribution\nIn Section 4, we noted that the natural starting point for accommodating individual-level retention probability dynamics in a continuous-time environment would be to replace the exponential distribution (the continuous-time equivalent of the geometric distribution) with the Weibull distribution (which reduces to the exponential when c = 1). Alternatively, we could assume that the continuous lifetime random variable X is distributed gamma, with pdf\nf(x | λ, s) = λ^s x^(s−1) e^(−λx) / Γ(s),\nwhere λ > 0 is the rate (inverse scale) parameter and s > 0 is the shape parameter. 
The gamma distribution exhibits negative duration dependence when s < 1 and positive duration dependence when s > 1. This corresponds to increasing and decreasing retention rates, respectively. When s = 1, the gamma distribution collapses to the exponential distribution. While there is no closed-form expression for the associated survivor function, it can be written in terms of the incomplete gamma function:\nS(x | λ, s) = 1 − γ(s, λx) / Γ(s).\nEach customer’s λ is unobserved (and unobservable) and is treated as a realization of the random variable Λ, which we assume to be distributed gamma with pdf\ng(λ | r, α) = α^r λ^(r−1) e^(−αλ) / Γ(r).\n\n=== Page 24/36 ===\n\nAssuming the value of s is constant across the population, it follows that, for a randomly chosen individual,\nS(x | r, α, s) = ∫_0^∞ S(x | λ, s) g(λ | r, α) dλ\n= 1 − [1 / (s B(r, s))] (α / (α + x))^r (x / (α + x))^s ₂F₁(r + s, 1; s + 1; x / (α + x)),\nwhere ₂F₁(·) is the Gaussian hypergeometric function. This gamma mixture of gammas is sometimes known as the “beta of second kind” (B2) distribution. The implied retention curve is monotonically increasing when s ≤ 1 and U-shaped when s > 1.[6,7]\nThe B2 is a continuous-time distribution. As noted in Section 4, the discrete-time equivalent of a continuous distribution can be constructed by treating the discrete lifetime variable as the integer part of the continuous lifetime and discretizing its survivor function. For T = ⌊X⌋,\nP(T = t | r, α, s) = S(t−1 | r, α, s) − S(t | r, α, s), t = 1, 2, 3, ...\nFitting this model to the High End dataset using a five-year model calibration period yields the following parameter estimates: r̂ = 0.483, α̂ = 0.562, and ŝ = 2.721; the associated value of the log-likelihood function is −1222.8.[8] The model-based estimates of survival and retention are compared against the actual numbers in Figure 10.\nIn terms of fit and forecasting performance (for both survival and retention), the results are almost identical to those associated with the BdW model. Our estimate of s is greater than 1, which implies that individual churn probabilities increase with tenure, even though the aggregate retention rate increases once we are past the initial dip. This demonstrates that the conclusions drawn using the BdW model are robust to changes in the underlying model specification.\n[6] Strictly speaking, the B2 has what is known in the survival analysis literature as an upside-down bathtub-shaped failure-rate/hazard function (Glaser 1980) when s > 1; this maps to a U-shaped retention curve.\n[7] When s = 1, this collapses to the Pareto distribution of the second kind, which is the continuous-time equivalent of the BG model (Fader et al. 2017).\n[8] The fact that the B2 survivor function contains the Gaussian hypergeometric function means that it is impractical to estimate the model parameters in Excel. 
Our maximum likelihood estimates of the model parameters were obtained using MATLAB.\n\n=== Page 25/36 ===\n\n[Figure 10 plots. Left: % Surviving (0 to 100) versus Tenure (years) (1 to 13), Actual and B2. Right: Retention Rate (0.5 to 1.0) versus Year (1 to 12), Actual and B2.]\nFigure 10: Comparing actual and B2-model-based estimates of survival (LHS) and retention (RHS) for the High End dataset given a five-year model calibration period. 
(The model-based numbers to the right of the vertical dashed line are projections given the parameter values estimated using the data to the left of this line.)\n6.2 Relaxing the Assumption of Common c\nThe dW distribution has two parameters: θ and c. When deriving the BdW distribution, we allowed θ to vary across customers but assumed that everyone has the same c. To what extent could our counterintuitive results regarding individual-level retention dynamics be a consequence of this assumption? In other words, what happens if we allow c to vary across customers?\nTo the best of our knowledge, there is no continuous distribution f(c) that can be used to characterize heterogeneity in c that results in a closed-form solution to the integral\n∫_0^∞ (1−θ)^(t^c) f(c) dc.\nWe will therefore capture heterogeneity in c using a discrete mixing distribution.\nLet us consider the two-component dW distribution with survivor function\nS(t | θ₁, θ₂, c₁, c₂, π) = π S(t | θ₁, c₁) + (1−π) S(t | θ₂, c₂), 0 < π < 1.\nThe five model parameters are not identified if we use the five-year model calibration period (as we only observe four renewal opportunities). We will therefore use the whole dataset, which contains 12 renewal opportunities, in our investigations of heterogeneity in c. 
The estimation results are reported in Table 5, along with the results for all the nested models that “switch off” the heterogeneity (i.e., θ₁ = θ₂ = θ and/or c₁ = c₂ = c).[9]\n\n=== Page 26/36 ===\n\nModel specification:\nHeterogeneity in θ | yes | yes | no | no\nHeterogeneity in c | yes | no | yes | no\nθ₁ | 0.068 | 0.019 | - | -\nθ₂ | 0.288 | 0.302 | - | -\nθ | - | - | 0.135 | 0.160\nc₁ | 0.853 | - | 0.668 | -\nc₂ | 1.386 | - | 2.230 | -\nc | - | 1.232 | - | 0.688\nπ | 0.710 | 0.599 | 0.840 | -\nLL | −2004.3 | −2004.5 | −2005.3 | −2027.7\nAIC | 4018.8 | 4017.0 | 4018.6 | 4059.4\nBIC | 4043.2 | 4036.6 | 4038.2 | 4069.2\nEvidence ratio | 4.5 | 1.9 | 4.4 | 3×10^9\nTable 5: Estimation results for the two-component dW distribution and its nested variants.\nWhen we allow for heterogeneity in both θ and c, we see that one segment of the customer base exhibits negative duration dependence while the other exhibits positive duration dependence. At first glance, this would suggest that our assumption of homogeneity in c is not supported. However, we must first compare the fit of this specification to that of its nested variants. On the basis of both AIC and BIC, the specification with heterogeneity in θ and homogeneity in c is the best model. Estimating the BdW model using the full dataset yields the following results: γ̂ = 0.250, δ̂ = 1.654, ĉ = 1.597, LL = −2004.8. The associated values of AIC and BIC are 4015.6 and 4030.4, respectively. This means the BdW is the “best” model among those examined in this analysis.\nWe also report in Table 5 the evidence ratio (Anderson 2008, Burnham and Anderson 2002), which tells us the strength of the empirical support for the model with the minimum AIC relative to the other candidate models.[10] The evidence for the BdW is 4.5 times that for the\n[9] What about the fit of a three-component model? The value of the log-likelihood function is −2004.1. This improvement of 0.2 (relative to the log-likelihood associated with the two-component model) comes at a cost of three additional model parameters, and so the three-component model is clearly dominated by the two-component model.\n[10] The evidence ratio is the relative likelihood of a pair of models. The evidence ratio for the best model (i.e., the one with the lowest AIC) versus model i is computed as E_i = exp((AIC_i − AIC_min)/2). An evidence ratio of E_i means that the probability that the model with the lowest AIC is the K-L best model is E_i times that of model i. By K-L best, we mean that the model has the smallest estimated Kullback-Leibler distance (i.e., it has the lowest information loss among the models used to approximate the true data-generating process).\n\n=== Page 27/36 ===\n\nfull two-component dW model and 4.4 times that for the model that allows for heterogeneity in c but not in θ. The evidence for the BdW is only 1.9 times that for the two-component model with homogeneous c. The conclusion we draw from this is that there is no reason to reject our modeling assumption of homogeneity in c.\nWe find similar results when we bring in the beta distribution for θ and use a discrete mixture for c alone. Even though there is intuitive appeal for having heterogeneity in c, there is very little empirical support for it. Most of the “action” is in the heterogeneity in the baseline churn propensities across customers; once this factor is accommodated, there is virtually nothing left over to be explained by a more elaborate model specification.\n7 Discussion\nGenerally speaking, aggregate/cohort-level retention rates increase over time, and the BG model has proven to be a robust tool for projecting retention (and therefore survival) into the future. According to the BG model, this phenomenon is entirely due to heterogeneity; individual-level propensities to churn are assumed to be constant. Despite the performance of the model, a number of people struggle with this assumption, contending that increasing cohort-level retention rates are the result of individual-level propensities to churn decreasing over time.\nOccasionally we observe an initial dip in the cohort-level retention rates before they increase. This phenomenon cannot be captured by the BG model. We have presented the BdW model as an extension to the BG model, one that relaxes the assumption of time-invariant individual-level propensities to churn. This model is sufficiently flexible to capture the phenomenon of non-monotonically increasing cohort-level retention rates.[11] If the aggregate retention curve is U-shaped then individual-level churn probabilities must increase over time (i.e., c > 1). However, c > 1 does not guarantee that the aggregate retention curve is U-shaped; the effect of heterogeneity can swamp the individual-level positive duration dependence to yield a monotonically increasing aggregate retention curve. Our analysis suggests that when the assumption of\n[11] As previously noted, Fader and Hardie (2010) explore how to use the BG model as the basis for calculating CLV. We show in Appendix C how such calculations can be performed assuming lifetimes are characterized by the BdW model.\n\n=== Page 28/36 ===\n\nconstant individual-level propensities to churn is violated in a setting where aggregate retention rates are increasing, it is most likely that these individual propensities increase with tenure (rather than decrease, as conventional wisdom would suggest).\nThis surprising result that, if not constant, individual-level churn propensities are expected to increase over time is supported by other researchers. In an analysis of health-club membership data, Giudicati et al. 
(2013) find a negative correlation between length of membership and the probability of a member renewing their contract. Lemmens and Croux (2006) find that a customer’s churn probability is positively correlated with the length of time they have owned their current phone. Jamal and Bucklin (2006) use a latent-class Weibull model in their analysis of churn among customers of a direct-to-home satellite TV provider. In all three segments, the estimate of c is greater than 1. Schweidel et al. (2008) use a Weibull-gamma model (a continuous-time analogue of the BdW) to analyze churn among customers of a telecommunications provider and find that c > 1. In a non-marketing context, Morrison and Schmittlein (1980) observe the same result in some analyses of job duration data.\nWhat are some potential causes for such an effect? Possible explanations include the novelty of the new service wearing off or boredom setting in over time, increasing competitive pressures, and changes in consumer preferences that are not matched by changes in the firm’s offerings. In addition to exploring such possible causes, future research should explore the robustness of this result across other product categories. Given the increasing role of networks in society, it is important that we understand the role of network externalities. Of interest is how changes in network structure (and the customer’s centrality in the network) affect retention rates.\n\n=== Page 29/36 ===\n\nAppendix A: Deriving the Distribution of P(t)\nRecall from (9) that, conditional on θ and c, the individual-level probability that someone who has made t−1 renewals will renew at the next opportunity is\nρ(t | θ, c) = (1−θ)^(t^c − (t−1)^c), t = 1, 2, 3, ... (A1)\nThe distribution of Θ across those individuals who have made t−1 renewals is simply the posterior distribution of Θ for the BdW model:\nf(θ | γ, δ, c; t−1 renewals) = S(t−1 renewals | θ, c) f(θ | γ, δ) / S(t−1 renewals | γ, δ, c)\n= [(1−θ)^((t−1)^c) θ^(γ−1) (1−θ)^(δ−1) / B(γ, δ)] / [B(γ, δ + (t−1)^c) / B(γ, δ)]\n= θ^(γ−1) (1−θ)^(δ + (t−1)^c − 1) / B(γ, δ + (t−1)^c), t = 1, 2, 3, ...\nWe derive the distribution of P(t) using the basic result for deriving the distribution of the function Y = g(X) of random variable X:\nf_Y(y) = |d g^(−1)(y) / dy| f_X(g^(−1)(y)). (A2)\nRewriting (A1) as ρ(t) = g(θ), we have\ng^(−1)(ρ(t)) = 1 − ρ(t)^(1/(t^c − (t−1)^c)) and\nd g^(−1)(ρ(t)) / dρ(t) = −[1 / (t^c − (t−1)^c)] ρ(t)^(1/(t^c − (t−1)^c) − 1).\nIt follows from (A2) that\nf(ρ(t) | γ, δ, c) = [1 / (t^c − (t−1)^c)] ρ(t)^(1/(t^c − (t−1)^c) − 1) {1 − ρ(t)^(1/(t^c − (t−1)^c))}^(γ−1) {ρ(t)^(1/(t^c − (t−1)^c))}^(δ + (t−1)^c − 1) / B(γ, δ + (t−1)^c)\n\n=== Page 30/36 ===\n\n= {1 − ρ(t)^(1/(t^c − (t−1)^c))}^(γ−1) {ρ(t)^(1/(t^c − (t−1)^c))}^(δ + (t−1)^c) / ([t^c − (t−1)^c] ρ(t) B(γ, δ + (t−1)^c)), t = 1, 2, 3, ....\nThe mean of this distribution is\nE[P(t) | γ, δ, c] = ∫_0^1 {1 − ρ(t)^(1/(t^c − (t−1)^c))}^(γ−1) {ρ(t)^(1/(t^c − (t−1)^c))}^(δ + (t−1)^c) / ([t^c − (t−1)^c] B(γ, δ + (t−1)^c)) dρ(t)\nwhich, letting z = ρ(t)^(1/(t^c − (t−1)^c)),\n= [1 / B(γ, δ + (t−1)^c)] ∫_0^1 (1−z)^(γ−1) z^(δ + t^c − 1) dz\n= B(γ, δ + t^c) / B(γ, δ + (t−1)^c), t = 1, 2, 3, ...,\nwhich is r(t) under the BdW model.\n\n=== Page 31/36 ===\n\nAppendix B: Implementing the BdW Model in Excel\nWe briefly describe how to estimate the BdW model parameters using Microsoft Excel. It is assumed that the reader is familiar with the basics of estimating the parameters of the BG model, as covered in FH, Appendix B.\nWith reference to Figure B1, we “code up” our expression for S(t) (cells F7:F11) and compute P(T = t) as S(t−1) − S(t) (cells E8:E11). So as to reduce the chances of any numerical precision problems, we compute S(t | γ, δ, c) as exp(ln(B(γ, δ + t^c)) − ln(B(γ, δ))). 
The elements of the\nlog-likelihood function are computed in cells G8:G12, and their sum computed in cell B4.\nFigure B1: Screenshot of the Excel worksheet for parameter estimation (High End\ndataset)\nThe values of γ,δ,c that maximize the value of the log-likelihood function are found using\nthe Excel add-in Solver (see Figure B2). (In this particular case, the log-likelihood function\nis quite flat near its maximum. The parameter estimates reported in Table 4 are obtained by\nrunning Solver twice, with the solution from the first “run” serving as starting values for the\nsecond “run”.) With reference to Figure B3, we can now project S(t) by copying cell F11 down\nto F19, and compute the associated retention rates as S(t)/S(t−1) (cells H8:H19).\n\n=== 第 32/36 页 ===\n\nFigure B2: Solver settings\nFigure B3: Screenshot of the Excel worksheet for projecting survival and retention (High\nEnd dataset)\n\n=== 第 33/36 页 ===\n\nAppendix C: Computing CLV under the BdW Model\nFader and Hardie (2010) introduce the idea of discounted expected lifetime and discounted expected\nresidual lifetime, which they label DEL and DERL, and show how these quantities are of\nuse when computing (expected) customer lifetime value and residual lifetime value in contractual\nsettings. It is more correct to think of the random variables discounted lifetime (DL) and\ndiscounted residual lifetime (DRL) and to compute their means, giving us expected discounted\nlifetime (E(DL)) and expected discounted residual lifetime (E(DRL)).\nFor discount rate d, the expected discounted lifetime of an as-yet-to-be-acquired customer\nis, by definition,\nE(DL) = Σ_{t=0}^∞ S(t) / (1+d)^t, (C1)\nwhile expected discounted residual lifetime of a customer at the end of period n (i.e., someone\nwho has made n−1 renewals) is\nE(DRL|active for n periods) = Σ_{t=n}^∞ S(t|T > n−1) / (1+d)^(t−n). (C2)\nWhen the duration of a customer’s relationship with the firm is characterized by the BG\nmodel, we can substitute (3) in (C1) and (C2), solve the infinite sums, and derive closed-form\nexpressions for E(DL) and E(DRL). This is not the case when lifetimes are characterized by\nthe BdW model. However, we can simply evaluate (C1) and (C2) terminating the series at a\npoint where additional terms are effectively zero. With reference to Figure C1, we first compute\nS(t) for t = 0,1,2,...,100 using (10) in column C. Given an annual discount rate of 10% (cell\nB4), the discount factor is computed in column E. Summing up the product of the two terms\n(cell E6) gives us the value of E(DL) for an as-yet-to-be-acquired customer.\nIn order to compute E(DRL), we need to evaluate the conditional survivor function S(t|T >\nn − 1). 
Standing at the end of year 5, the customer has made four renewals, and we can\ncompute the conditional survivor function as S(t)/S(4) (see column G). Given the discount\nfactor computed in column H, we sum up the product of the two terms (cell H6), giving us the\nvalue of E(DRL).\n\n=== 第 34/36 页 ===\n\nFigure C1: Screenshot of the Excel worksheet for computing E(DL) and E(DRL) under\nthe BdW model\n\n=== 第 35/36 页 ===\n\nReferences\nAnderson, David R. (2008), Model Based Inference in the Life Sciences: A Primer on Evidence,\nNew York, NY: Springer Science+Business Media, LLC.\nBerry, Michael J.A. and Gordon S. Linoff (2004), Data Mining Techniques, 2nd edition, Indianapolis,\nIN: Wiley Publishing, Inc.\nBolton, Ruth N. (1998), “A Dynamic Model of the Duration of the Customer’s Relationship\nwith a Continuous Service Provider: The Role of Satisfaction,” Marketing Science, 17 (1),\n45–65.\nBracquemond, Cyril and Olivier Gaudoin (2003), “A Survey on Discrete Lifetime Distributions,”\nInternational Journal of Reliability, Quality and Safety Engineering, 10 (1), 69–98.\nBuchanan, Bruce and Donald G. 
Morrison (1988), “A Stochastic Model of List Falloff with\nImplications for Repeat Mailings,” Journal of Direct Marketing, 2 (Summer), 7–15.\nBurnham, Kenneth P. and David R. Anderson (2002), Model Selection and Multimodel Inference:\nA Practical Information-Theoretical Approach, 2nd edition, New York, NY: Springer-Verlag,\nInc.\nBurnham, Thomas A., Judy K. Frels, and Vijay Mahajan (2003), “Consumer Switching Costs:\nA Typology, Antecedents, and Consequences,” Journal of the Academy of Marketing Science,\n31 (Spring), 109–126.\nFader, Peter S. and Bruce G.S. Hardie (2007), “How to Project Customer Retention,” Journal\nof Interactive Marketing, 21 (Winter), 76–90.\nFader, Peter S. and Bruce G.S. Hardie (2010), “Customer-Base Valuation in a Contractual\nSetting: The Perils of Ignoring Heterogeneity,” Marketing Science, 29 (January–February),\n85–93.\nFader, Peter S. and Bruce G.S. Hardie (2014), “A Spreadsheet-Literate Non-Statistician’s Guide\nto the Beta-Geometric Model.” <http://brucehardie.com/notes/032/>\nFader, Peter S., Bruce G.S. Hardie, Daniel McCarthy, and Ramnath Vaidyanathan (2017), “Exploring\nthe Equivalence of Two Common Mixture Models for Duration Data,” unpublished\nworking paper.\nFox, Richard J., Srinivas K. Reddy, and Bharat Rao (1997), “Modeling Response to Repetitive\nPromotional Stimuli,” Journal of the Academy of Marketing Science, 25 (3), 242–255.\nGiudicati, Gianna, Massimo Riccaboni, and Anna Romiti (2013), “Experience, Socialization and\nCustomer Retention: Lessons From the Dance Floor,” Marketing Letters, 24 (December),\n409–422.\nGlaser, Ronald E. (1980), “Bathtub and Related Failure Rate Characterizations,” Journal of\nthe American Statistical Association, 75 (September), 667–672.\nIsrael, Mark (2005), “Tenure Dependence in Consumer-Firm Relationships: An Empirical Analysis\nof Consumer Departures from Automobile Insurance Firms,” The RAND Journal of\nEconomics, 36 (Spring), 165–192.\n\n=== 第 36/36 页 ===\n\nJamal, Zainab and Randolph E. 
Bucklin (2006), “Improving the Diagnosis and Prediction of\nCustomer Churn: A Heterogeneous Hazard Modeling Approach,” Journal of Interactive\nMarketing, 20 (Summer/Autumn), 16–29.\nKiefer, Nicholas M. (1988), “Economic Duration Data and Hazard Functions,” Journal of Economic\nLiterature, 26 (June), 646–679.\nLai, C.D. (2013), “Issues Concerning Constructions of Discrete Lifetime Models,” Quality Technology\nand Quantitative Management, 10 (2), 251–262.\nLemmens, Aurélie and Christophe Croux (2006), “Bagging and Boosting Classification Trees to\nPredict Churn,” Journal of Marketing Research, 43 (May), 276–286.\nMorrison, Donald and Arnon Perry (1970), “Some Data Based Models for Analyzing Sales\nFluctuations,” Decision Sciences, 1 (3–4), 258–274.\nMorrison, Donald G. and David C. Schmittlein (1980), “Jobs, Strikes, and Wars: Probability\nModels for Duration,” Organizational Behavior and Human Performance, 25 (April), 224–251.\nMurthy, D.N. Prabhakar, Min Xie, and Renyan Jiang (2004), Weibull Models, Hoboken, NJ:\nJohn Wiley & Sons.\nNakagawa, Toshio and Shunji Osaki (1975), “The Discrete Weibull Distribution,” IEEE Transactions\non Reliability, 24 (December), 300–301.\nNitzan, Irit, Barak Libai, and Danit Ein-Gar (2011), “The Time of Decreasing Retention: Payments\nas a Virtual Lock-in,” unpublished working paper.\nPielou, E.C. (1962), “Runs of One Species with Respect to Another in Transects through Plant\nPopulations,” Biometrics, 18 (December), 579–593.\nPotter, R.G. and M.P. Parker (1964), “Predicting the Time Required to Conceive,” Population\nStudies, 18 (July), 99–116.\nProschan, Frank (1963), “Theoretical Explanation of Observed Decreasing Failure Rate,” Technometrics,\n5 (August), 375–383.\nReichheld, Frederick F. 
(1996), The Loyalty Effect: The Hidden Force Behind Growth, Profits,\nand Lasting Value, Boston, MA: Harvard Business School Press.\nRinne, Horst (2009), The Weibull Distribution: A Handbook, Boca Raton, FL: Chapman &\nHall/CRC Press.\nSchweidel, David A., Peter S. Fader, and Eric Bradlow (2008), “Understanding Service Retention\nWithin and Across Cohorts Using Limited Information,” Journal of Marketing, 72 (January),\n82–94.\nVaupel, James W. and Anatoli I. Yashin (1985), “Heterogeneity’s Ruses: Some Surprising Effects\nof Selection on Population Dynamics,” The American Statistician, 39 (August), 176–185."
    },
    {
      "turn_idx": 14,
      "action": "file_saver({\"file_path\": \"work_space/gaia_0ff53813_20251222_031212_092979/search_results_summary_step1.md\", \"content\": \"# Consolidated Search Results Summary — Step 1\\n\\nThis file consolidates all retrieved information and direct quotations from credible sources about Manash Pratim Kashyap’s (2019) and Peter S. Fader’s (2018) customer retention studies, focusing on explicit mentions of model types.\\n\\n---\\n\\n## Source 1: International Journal of Advanced Research (IJAR) — PDF (March 2019)\\n- URL: https://www.journalijar.com/uploads/2019/03/188_IJAR-26303.pdf\\n- Title: A SIMPLE MODEL FOR ANALYZING THE CUSTOMER RETENTION COMPARING RURAL AND URBAN STORE. (Manash Pratim Kashyap)\\n\\nExtracted Content (direct quotes):\\n- “Key words:- sBG Model, Beta distribution, Beta geometric, Beta binomial.” (p.1)\\n- “The main goal of this paper is to predict future purchase patterns for rural and urban customer … The discrete time beta geometric model allows to heterogeneity of each of rural and urban customer behavioral process.” (p.1)\\n- “Now our approach is to fit the sBG model for the three purchase occasion of rural and urban customer.” (p.6)\\n\\n---\\n\\n## Source 2: ResearchBib Index Page (2019)\\n- URL: http://paper.researchbib.com/view/paper/207357\\n- Title: A SIMPLE MODEL FOR ANALYZING THE CUSTOMER RETENTION COMPARING RURAL AND URBAN STORE\\n\\nExtracted Content (direct quotes):\\n- “Keywords : sBG Model Beta distribution Beta geometric Beta binomial.”\\n- “The discrete time beta geometric model allows to heterogeneity of each of rural and urban customer behavioral process.”\\n\\n---\\n\\n## Source 3: Academia.edu listing (2019)\\n- URL: https://www.academia.edu/89361498/A_Simple_Model_for_Analyzing_the_Customer_Retention_Comparing_Rural_and_Urban_Store\\n- Title: A Simple Model for Analyzing the Customer Retention Comparing Rural and Urban Store (Manash Pratim Kashyap)\\n\\nExtracted Content (direct quotes):\\n- “The study analyzes 
customer retention using a discrete time beta geometric model on 390 households.”\\n\\n---\\n\\n## Source 4: LBS Research Online (Journal of Interactive Marketing, 2018) — PDF\\n- URL: https://lbsresearch.london.edu/id/eprint/954/1/BdW_JIM_2018-01-10.pdf\\n- Title: “How to Project Customer Retention” Revisited: The Role of Duration Dependence (Peter S. Fader, Bruce G.S. Hardie, et al.)\\n\\nExtracted Content (direct quotes):\\n- “Cohort-level retention rates typically increase over time, and the beta-geometric (BG) distribution has proven to be a robust model for capturing and projecting these patterns into the future.” (p.3, Abstract)\\n- “In this paper we present the beta-discrete-Weibull (BdW) distribution as an extension to the BG model…” (p.3, Abstract)\\n- “FH propose a simple probability model … This beta mixture of geometrics is called the beta-geometric (BG) distribution.” (pp.5–6)\\n- “At the heart of the FH paper is a customer metric of great interest … This beta mixture of geometrics is called the beta-geometric (BG) distribution.” (pp.5–6)\\n\\n---\\n\\n## Source 5: Wharton Faculty Platform — 2007 reference (contextual, not in 2018–2019 window)\\n- URL: https://faculty.wharton.upenn.edu/wp-content/uploads/2012/04/Fader_hardie_jim_07.pdf\\n- Title: How to Project Customer Retention (Peter S. Fader, Bruce G.S. Hardie)\\n\\nExtracted Content (direct quotes):\\n- “We show that our basic model (known as a ‘shifted-beta-geometric’) can be implemented in a simple Microsoft Excel spreadsheet and provides remarkably accurate forecasts…” (p.1)\\n\\nNote: This 2007 reference is included only to show consistent terminology; the 2018 paper (Source 4) explicitly references the “beta-geometric (BG) distribution” within the 2018–2019 timeframe.\\n\\n---\\n\\nSummary of relevance to the question:\\n- Manash Pratim Kashyap (2019) explicitly states use of the “discrete time beta geometric model” and lists “Beta geometric” as a keyword.\\n- Peter S. Fader et al. 
(2018) explicitly reference the “beta-geometric (BG) distribution” as a robust model for projecting customer retention (and introduce BdW as an extension).\\n\\nCommon two-word model type across both studies (as directly stated): “beta geometric”.\", \"mode\": \"w\"})",
      "explanation": "",
      "observation": "Content successfully saved to work_space/gaia_0ff53813_20251222_031212_092979/search_results_summary_step1.md"
    },
    {
      "turn_idx": 15,
      "action": "mark_step({\"step_index\": 1, \"step_status\": \"completed\", \"step_notes\": \"Executed Step 1: Conducted targeted searches for Manash Pratim Kashyap’s 2019 customer retention paper and Peter S. Fader’s 2018 study. Retrieved and extracted content from IJAR 2019 PDF (Kashyap), ResearchBib, Academia.edu listing, LBS Research Online 2018 PDF (Fader et al.), and Wharton faculty PDF for background. The consolidated findings were saved as a single file per instruction: work_space/gaia_0ff53813_20251222_031212_092979/search_results_summary_step1.md. Key extracted phrases:\\n- Kashyap (2019): “discrete time beta geometric model”; keywords include “Beta geometric”.\\n- Fader et al. (2018): “beta-geometric (BG) distribution” cited as robust model for retention. \\nThese directly indicate the common two-word model type: “beta geometric”. No issues encountered beyond filtering to 2018–2019; extracted information is 100% faithful and properly attributed.\"})",
      "explanation": "",
      "observation": "Step 1: step_status is completed, step_notes is Executed Step 1: Conducted targeted searches for Manash Pratim Kashyap’s 2019 customer retention paper and Peter S. Fader’s 2018 study. Retrieved and extracted content from IJAR 2019 PDF (Kashyap), ResearchBib, Academia.edu listing, LBS Research Online 2018 PDF (Fader et al.), and Wharton faculty PDF for background. The consolidated findings were saved as a single file per instruction: work_space/gaia_0ff53813_20251222_031212_092979/search_results_summary_step1.md. Key extracted phrases:\n- Kashyap (2019): “discrete time beta geometric model”; keywords include “Beta geometric”.\n- Fader et al. (2018): “beta-geometric (BG) distribution” cited as robust model for retention. \nThese directly indicate the common two-word model type: “beta geometric”. No issues encountered beyond filtering to 2018–2019; extracted information is 100% faithful and properly attributed. "
    },
    {
      "turn_idx": 16,
      "action": "file_read({\"file\": \"work_space/gaia_0ff53813_20251222_031212_092979/search_results_summary_step1.md\"})",
      "explanation": "Completed Step 2 by extracting explicit methodology model phrases from both studies and saving them to a single file. File created: work_spacegaia_0ff53813_20251222_031212_092979/extracted_methodology_models_step2.md. The extracted two-word model phrase found in both sources is “Beta geometric”. Verified that the file was successfully saved and readable.",
      "observation": "# Consolidated Search Results Summary — Step 1\n\nThis file consolidates all retrieved information and direct quotations from credible sources about Manash Pratim Kashyap’s (2019) and Peter S. Fader’s (2018) customer retention studies, focusing on explicit mentions of model types.\n\n---\n\n## Source 1: International Journal of Advanced Research (IJAR) — PDF (March 2019)\n- URL: https://www.journalijar.com/uploads/2019/03/188_IJAR-26303.pdf\n- Title: A SIMPLE MODEL FOR ANALYZING THE CUSTOMER RETENTION COMPARING RURAL AND URBAN STORE. (Manash Pratim Kashyap)\n\nExtracted Content (direct quotes):\n- “Key words:- sBG Model, Beta distribution, Beta geometric, Beta binomial.” (p.1)\n- “The main goal of this paper is to predict future purchase patterns for rural and urban customer … The discrete time beta geometric model allows to heterogeneity of each of rural and urban customer behavioral process.” (p.1)\n- “Now our approach is to fit the sBG model for the three purchase occasion of rural and urban customer.” (p.6)\n\n---\n\n## Source 2: ResearchBib Index Page (2019)\n- URL: http://paper.researchbib.com/view/paper/207357\n- Title: A SIMPLE MODEL FOR ANALYZING THE CUSTOMER RETENTION COMPARING RURAL AND URBAN STORE\n\nExtracted Content (direct quotes):\n- “Keywords : sBG Model Beta distribution Beta geometric Beta binomial.”\n- “The discrete time beta geometric model allows to heterogeneity of each of rural and urban customer behavioral process.”\n\n---\n\n## Source 3: Academia.edu listing (2019)\n- URL: https://www.academia.edu/89361498/A_Simple_Model_for_Analyzing_the_Customer_Retention_Comparing_Rural_and_Urban_Store\n- Title: A Simple Model for Analyzing the Customer Retention Comparing Rural and Urban Store (Manash Pratim Kashyap)\n\nExtracted Content (direct quotes):\n- “The study analyzes customer retention using a discrete time beta geometric model on 390 households.”\n\n---\n\n## Source 4: LBS Research Online (Journal of Interactive 
Marketing, 2018) — PDF\n- URL: https://lbsresearch.london.edu/id/eprint/954/1/BdW_JIM_2018-01-10.pdf\n- Title: “How to Project Customer Retention” Revisited: The Role of Duration Dependence (Peter S. Fader, Bruce G.S. Hardie, et al.)\n\nExtracted Content (direct quotes):\n- “Cohort-level retention rates typically increase over time, and the beta-geometric (BG) distribution has proven to be a robust model for capturing and projecting these patterns into the future.” (p.3, Abstract)\n- “In this paper we present the beta-discrete-Weibull (BdW) distribution as an extension to the BG model…” (p.3, Abstract)\n- “FH propose a simple probability model … This beta mixture of geometrics is called the beta-geometric (BG) distribution.” (pp.5–6)\n- “At the heart of the FH paper is a customer metric of great interest … This beta mixture of geometrics is called the beta-geometric (BG) distribution.” (pp.5–6)\n\n---\n\n## Source 5: Wharton Faculty Platform — 2007 reference (contextual, not in 2018–2019 window)\n- URL: https://faculty.wharton.upenn.edu/wp-content/uploads/2012/04/Fader_hardie_jim_07.pdf\n- Title: How to Project Customer Retention (Peter S. Fader, Bruce G.S. Hardie)\n\nExtracted Content (direct quotes):\n- “We show that our basic model (known as a ‘shifted-beta-geometric’) can be implemented in a simple Microsoft Excel spreadsheet and provides remarkably accurate forecasts…” (p.1)\n\nNote: This 2007 reference is included only to show consistent terminology; the 2018 paper (Source 4) explicitly references the “beta-geometric (BG) distribution” within the 2018–2019 timeframe.\n\n---\n\nSummary of relevance to the question:\n- Manash Pratim Kashyap (2019) explicitly states use of the “discrete time beta geometric model” and lists “Beta geometric” as a keyword.\n- Peter S. Fader et al. 
(2018) explicitly reference the “beta-geometric (BG) distribution” as a robust model for projecting customer retention (and introduce BdW as an extension).\n\nCommon two-word model type across both studies (as directly stated): “beta geometric”."
    },
    {
      "turn_idx": 17,
      "action": "file_saver({\"file_path\": \"work_space/gaia_0ff53813_20251222_031212_092979/extracted_methodology_models_step2.md\", \"mode\": \"w\", \"content\": \"Title: Extracted Methodology Model Types (Step 2)\\n\\nObjective: Extract explicit model type phrases from the methodology/approach descriptions of the two studies (2018–2019) with exact wording and source attribution.\\n\\nStudy A: Manash Pratim Kashyap (2019) — IJAR\\n- Source URL: https://www.journalijar.com/uploads/2019/03/188_IJAR-26303.pdf\\n- Direct quotations:\\n  - “Key words:- sBG Model, Beta distribution, Beta geometric, Beta binomial.” (p.1)\\n  - “The main goal of this paper is to predict future purchase patterns for rural and urban customer … The discrete time beta geometric model allows to heterogeneity of each of rural and urban customer behavioral process.” (p.1)\\n  - “Now our approach is to fit the sBG model for the three purchase occasion of rural and urban customer.” (p.6)\\n- Exact two-word model phrase found: “Beta geometric”\\n\\nStudy B: Peter S. Fader et al. 
(2018) — Journal of Interactive Marketing (LBS Research Online PDF)\\n- Source URL: https://lbsresearch.london.edu/id/eprint/954/1/BdW_JIM_2018-01-10.pdf\\n- Direct quotations:\\n  - “Cohort-level retention rates typically increase over time, and the beta-geometric (BG) distribution has proven to be a robust model for capturing and projecting these patterns into the future.” (p.3, Abstract)\\n  - “In this paper we present the beta-discrete-Weibull (BdW) distribution as an extension to the BG model…” (p.3, Abstract)\\n  - “This beta mixture of geometrics is called the beta-geometric (BG) distribution.” (pp.5–6)\\n- Exact two-word model phrase extracted (de-hyphenated for two-word capture): “Beta geometric”\\n\\nNotes:\\n- All extracted content is directly quoted from the sources listed above.\\n- The two-word phrase normalization (removing hyphen) is done only for phrase capture per task requirement; underlying quotes retain original punctuation in-line.\\n\"})",
      "explanation": "",
      "observation": "Content successfully saved to work_space/gaia_0ff53813_20251222_031212_092979/extracted_methodology_models_step2.md"
    },
    {
      "turn_idx": 18,
      "action": "file_read({\"file\": \"work_space/gaia_0ff53813_20251222_031212_092979/extracted_methodology_models_step2.md\"})",
      "explanation": "",
      "observation": "Title: Extracted Methodology Model Types (Step 2)\n\nObjective: Extract explicit model type phrases from the methodology/approach descriptions of the two studies (2018–2019) with exact wording and source attribution.\n\nStudy A: Manash Pratim Kashyap (2019) — IJAR\n- Source URL: https://www.journalijar.com/uploads/2019/03/188_IJAR-26303.pdf\n- Direct quotations:\n  - “Key words:- sBG Model, Beta distribution, Beta geometric, Beta binomial.” (p.1)\n  - “The main goal of this paper is to predict future purchase patterns for rural and urban customer … The discrete time beta geometric model allows to heterogeneity of each of rural and urban customer behavioral process.” (p.1)\n  - “Now our approach is to fit the sBG model for the three purchase occasion of rural and urban customer.” (p.6)\n- Exact two-word model phrase found: “Beta geometric”\n\nStudy B: Peter S. Fader et al. (2018) — Journal of Interactive Marketing (LBS Research Online PDF)\n- Source URL: https://lbsresearch.london.edu/id/eprint/954/1/BdW_JIM_2018-01-10.pdf\n- Direct quotations:\n  - “Cohort-level retention rates typically increase over time, and the beta-geometric (BG) distribution has proven to be a robust model for capturing and projecting these patterns into the future.” (p.3, Abstract)\n  - “In this paper we present the beta-discrete-Weibull (BdW) distribution as an extension to the BG model…” (p.3, Abstract)\n  - “This beta mixture of geometrics is called the beta-geometric (BG) distribution.” (pp.5–6)\n- Exact two-word model phrase extracted (de-hyphenated for two-word capture): “Beta geometric”\n\nNotes:\n- All extracted content is directly quoted from the sources listed above.\n- The two-word phrase normalization (removing hyphen) is done only for phrase capture per task requirement; underlying quotes retain original punctuation in-line.\n"
    },
    {
      "turn_idx": 19,
      "action": "mark_step({\"step_index\": 2, \"step_status\": \"completed\", \"step_notes\": \"Completed Step 2 by extracting explicit methodology model phrases from both studies and saving them to a single file. File created: work_space/gaia_0ff53813_20251222_031212_092979/extracted_methodology_models_step2.md. The extracted two-word model phrase found in both sources is “Beta geometric”. Verified that the file was successfully saved and readable.\"})",
      "explanation": "",
      "observation": "Step 2: step_status is completed, step_notes is Completed Step 2 by extracting explicit methodology model phrases from both studies and saving them to a single file. File created: work_space/gaia_0ff53813_20251222_031212_092979/extracted_methodology_models_step2.md. The extracted two-word model phrase found in both sources is “Beta geometric”. Verified that the file was successfully saved and readable. "
    },
    {
      "turn_idx": 20,
      "action": "file_read({\"file\": \"work_space/gaia_0ff53813_20251222_031212_092979/extracted_methodology_models_step2.md\"})",
      "explanation": "Cross-verified both studies’ methodology/model references using extracted quotes. Both Kashyap (2019) and Fader et al. (2018) reference the beta-geometric (BG) model/distribution. Normalized to the required two-word, no-punctuation format: \"Beta geometric\". Files generated:\n- work_spacegaia_0ff53813_20251222_031212_092979/cross_verification_step3.md\nNo issues encountered. Ready to compose final response per formatting constraints.",
      "observation": "Title: Extracted Methodology Model Types (Step 2)\n\nObjective: Extract explicit model type phrases from the methodology/approach descriptions of the two studies (2018–2019) with exact wording and source attribution.\n\nStudy A: Manash Pratim Kashyap (2019) — IJAR\n- Source URL: https://www.journalijar.com/uploads/2019/03/188_IJAR-26303.pdf\n- Direct quotations:\n  - “Key words:- sBG Model, Beta distribution, Beta geometric, Beta binomial.” (p.1)\n  - “The main goal of this paper is to predict future purchase patterns for rural and urban customer … The discrete time beta geometric model allows to heterogeneity of each of rural and urban customer behavioral process.” (p.1)\n  - “Now our approach is to fit the sBG model for the three purchase occasion of rural and urban customer.” (p.6)\n- Exact two-word model phrase found: “Beta geometric”\n\nStudy B: Peter S. Fader et al. (2018) — Journal of Interactive Marketing (LBS Research Online PDF)\n- Source URL: https://lbsresearch.london.edu/id/eprint/954/1/BdW_JIM_2018-01-10.pdf\n- Direct quotations:\n  - “Cohort-level retention rates typically increase over time, and the beta-geometric (BG) distribution has proven to be a robust model for capturing and projecting these patterns into the future.” (p.3, Abstract)\n  - “In this paper we present the beta-discrete-Weibull (BdW) distribution as an extension to the BG model…” (p.3, Abstract)\n  - “This beta mixture of geometrics is called the beta-geometric (BG) distribution.” (pp.5–6)\n- Exact two-word model phrase extracted (de-hyphenated for two-word capture): “Beta geometric”\n\nNotes:\n- All extracted content is directly quoted from the sources listed above.\n- The two-word phrase normalization (removing hyphen) is done only for phrase capture per task requirement; underlying quotes retain original punctuation in-line.\n"
    },
    {
      "turn_idx": 21,
      "action": "file_saver({\"file_path\": \"work_space/gaia_0ff53813_20251222_031212_092979/cross_verification_step3.md\", \"content\": \"Title: Cross-Verification of Common Model Type (Step 3)\\n\\nObjective\\n- Cross-verify the methodologies from the two target studies (2018–2019) and determine the common two-word model type they share.\\n\\nInputs Reviewed\\n- extracted_methodology_models_step2.md (compiled exact quotations and phrases)\\n- Source A: Manash Pratim Kashyap (2019), IJAR — https://www.journalijar.com/uploads/2019/03/188_IJAR-26303.pdf\\n- Source B: Peter S. Fader et al. (2018), Journal of Interactive Marketing (LBS Research Online PDF) — https://lbsresearch.london.edu/id/eprint/954/1/BdW_JIM_2018-01-10.pdf\\n\\nKey Evidence (Direct Quotations)\\n- Kashyap (2019): “Key words:- sBG Model, Beta distribution, Beta geometric, Beta binomial.” (p.1)\\n- Kashyap (2019): “The discrete time beta geometric model allows to heterogeneity of each of rural and urban customer behavioral process.” (p.1)\\n- Fader et al. (2018): “the beta-geometric (BG) distribution has proven to be a robust model for capturing and projecting these patterns into the future.” (p.3, Abstract)\\n- Fader et al. (2018): “This beta mixture of geometrics is called the beta-geometric (BG) distribution.” (pp.5–6)\\n\\nNormalization for Two-Word Phrase\\n- The Fader et al. source uses “beta-geometric” (hyphenated). Per task requirement (two words, no punctuation), we normalize to “Beta geometric”.\\n\\nConclusion\\n- Both studies explicitly use or discuss the same model type: Beta geometric.\\n\\nObservations\\n- Kashyap (2019) focuses on the discrete-time beta geometric (including sBG), applied to rural/urban customer purchases.\\n- Fader et al. (2018) discuss BG and extend it via the beta–discrete–Weibull (BdW) distribution, confirming BG’s role in modeling retention.\\n\\nActionable Outcome\\n- The common two-word model type to report is: Beta geometric.\\n\", \"mode\": \"w\"})",
      "explanation": "",
      "observation": "Content successfully saved to work_space/gaia_0ff53813_20251222_031212_092979/cross_verification_step3.md"
    },
    {
      "turn_idx": 22,
      "action": "mark_step({\"step_index\": 3, \"step_status\": \"completed\", \"step_notes\": \"Cross-verified both studies’ methodology/model references using extracted quotes. Both Kashyap (2019) and Fader et al. (2018) reference the beta-geometric (BG) model/distribution. Normalized to the required two-word, no-punctuation format: \\\"Beta geometric\\\". Files generated:\\n- work_space/gaia_0ff53813_20251222_031212_092979/cross_verification_step3.md\\nNo issues encountered. Ready to compose final response per formatting constraints.\"})",
      "explanation": "",
      "observation": "Step 3: step_status is completed, step_notes is Cross-verified both studies’ methodology/model references using extracted quotes. Both Kashyap (2019) and Fader et al. (2018) reference the beta-geometric (BG) model/distribution. Normalized to the required two-word, no-punctuation format: \"Beta geometric\". Files generated:\n- work_space/gaia_0ff53813_20251222_031212_092979/cross_verification_step3.md\nNo issues encountered. Ready to compose final response per formatting constraints. "
    },
    {
      "turn_idx": 23,
      "action": "mark_step({\"step_index\": 4, \"step_status\": \"completed\", \"step_notes\": \"Composed final response. Cross-verified the two studies: Kashyap (2019, IJAR) explicitly uses a “discrete time beta geometric model,” and Fader et al. (2018, LBS Research Online) refer to the “beta-geometric (BG) distribution” for retention. The common two-word model type, normalized without punctuation, is “beta geometric.” Supporting files created in prior steps: work_space/gaia_0ff53813_20251222_031212_092979/search_results_summary_step1.md, work_space/gaia_0ff53813_20251222_031212_092979/extracted_methodology_models_step2.md, work_space/gaia_0ff53813_20251222_031212_092979/cross_verification_step3.md.\"})",
      "explanation": "Composed final response. Cross-verified the two studies: Kashyap (2019, IJAR) explicitly uses a “discrete time beta geometric model,” and Fader et al. (2018, LBS Research Online) refer to the “beta-geometric (BG) distribution” for retention. The common two-word model type, normalized without punctuation, is “beta geometric.” Supporting files created in prior steps: work_spacegaia_0ff53813_20251222_031212_092979/search_results_summary_step1.md, work_spacegaia_0ff53813_20251222_031212_092979/extracted_methodology_models_step2.md, work_spacegaia_0ff53813_20251222_031212_092979/cross_verification_step3.md.",
      "observation": "Step 4: step_status is completed, step_notes is Composed final response. Cross-verified the two studies: Kashyap (2019, IJAR) explicitly uses a “discrete time beta geometric model,” and Fader et al. (2018, LBS Research Online) refer to the “beta-geometric (BG) distribution” for retention. The common two-word model type, normalized without punctuation, is “beta geometric.” Supporting files created in prior steps: work_space/gaia_0ff53813_20251222_031212_092979/search_results_summary_step1.md, work_space/gaia_0ff53813_20251222_031212_092979/extracted_methodology_models_step2.md, work_space/gaia_0ff53813_20251222_031212_092979/cross_verification_step3.md. "
    },
    {
      "turn_idx": 24,
      "action": "FINAL_ANSWER: beta geometric",
      "explanation": "Overview:\n- The task was completed successfully. The objective was to identify the common two-word model type used in customer retention studies by Manash Pratim Kashyap (2019) and PS Fader (2018–2019), and provide the answer in a specific format.\n\nWhat was done:\n- Parsed the question to extract constraints and output formatting rules (two-word string, no punctuation).\n- Searched for and located the relevant studies:\n  - Kashyap (2019) in IJAR referencing a “discrete time beta geometric model.”\n  - Fader et al. (2018) referencing the “beta-geometric (BG) distribution” for retention.\n- Extracted methodology sections to capture explicit model terminology used in each study.\n- Cross-verified the findings and normalized the shared model type to a two-word, no-punctuation format.\n- Composed the response adhering to the specified output constraints.\n\nOutcome:\n- Identified the common two-word model type used in both studies: beta geometric.\n- Confirmed both sources explicitly refer to the beta-geometric model/distribution for retention analysis.\n- Prepared the final answer in compliance with the required formatting.\n\nKey success factors:\n- Clear parsing of strict output rules ensured correct normalization and formatting.\n- Targeted search using author names, dates, and domain terms quickly surfaced the correct sources.\n- Direct extraction from methodology sections minimized ambiguity.\n- Cross-verification guaranteed consistency across sources before finalizing the answer.\n\nFINAL ANSWER: beta geometric",
      "observation": ""
    }
  ],
  "source": "gaia_level2"
}