Beyond Predictive Algorithms in Child Welfare

Published: 23 Jan 2024 · Last Modified: 30 May 2024 · GI 2024 · CC BY 4.0
Letter Of Changes: We thank the reviewers for taking the time to review our paper. Following reviewer comments, we have made the changes below in response to specific comments.

**[Meta review] R2 pointed to two important issues with the paper: a) It is unclear how the qualitative evaluation of case notes helps answer the research question, and b) there is missing information about the data used to train the classifier model**

For our changes regarding a), please see point A; for b), please see point B.

**A. [R2] Unclear how the qualitative evaluation of casenotes fits in the overall research, with a disjoint between the quantitative and the qualitative evaluations** & **[R2] The paper should discuss how the qualitative findings may help improve the discharge outcome of the CW ML algorithms**

Reviewing our paper, we realized the description of the study's research questions was unclear, which led to many of R2's concerns about the paper. We wanted to convey that we examine information signals from both risk assessment data and textual data because (1) there are criticisms of the validity of CW algorithms that use quantitative risk assessment data, and (2) recent HCI scholarship has explored incorporating textual data into CW decision-making algorithms. To clarify our research motivation, including how the qualitative evaluation of casenotes fits into the overall study, we have edited the third paragraph of the introduction and clarified our research questions. We have also signalled how each finding and discussion point in Sec 6 and 7 relates to the study's research questions by referencing the RQ number in each section's subheadings. This way, we believe Sections 7.2 and 7.3 now provide clearer signposting on how textual information from casenotes carries inherent limitations in predicting CW discharge outcomes and why caution is needed when using casenotes as a data source for algorithmic prediction tasks.

**B. [R2] Not enough information on the amount of data that were analyzed by open-coding, how the data analysis was carried out, how the themes and topics were generated, and why all the themes and topics presented are positive/neutral, while the paper notes casenotes may be biased, not uniform, or not the ground truth, and may underplay oppression/surveillance/coercion experienced by bio-parents**

We have clarified the data we analyzed for open-coding and the theme/topic generation steps by editing Sec 5.3. Additionally, themes and topics were often neutral because casenotes are written in a neutral tone: caseworkers are trained by child protective services to record facts and observations neutrally. This does not mean the casenotes are unbiased, however, because the information a caseworker chooses to include can differ between caseworkers. To clarify this point, we note the neutral tone of casenotes in Sec 6.3 (1st paragraph) and explain how the casenotes may still be biased in Sec 7.2 (1st paragraph).

**Other comments**

**[R2] This paper does not seem to be a good fit for the HCI community**

We believe our research is well integrated within HCI literature. Our study builds on a range of SIGCHI work that examines stakeholder perspectives and the validity of child welfare decision-making algorithms, which we detail in Sec 3.

**[R2] In the related work, what does "the system itself poses a significant risk to family well-being" mean?**

We have edited the referenced sentence in the second paragraph of Sec 3.2, specifying that systemic risks include the shortage of experienced child welfare staff and of good foster homes.

**[R2] In Research Context, what does "preventing trafficking" mean?**

We have updated the first paragraph of Sec 4 to specify that trafficking refers to sex trafficking.

**[R2] Why were the "numbers" anonymized?**

Some casenotes included telephone numbers or building unit/street numbers. Because we wanted to remove any form of personally identifying information, it was necessary to anonymize these numbers.

**[R2] In 7.2, "Through this study, we add quantitative support to existing qualitative concerns surrounding the use of RAs in algorithmic decision-making tools to affect child welfare case outcomes." This seems to contradict with the introduction, "The US child welfare (CW) sector is one such sector that has extensively adopted risk assessment algorithms to predict child maltreatment risk", and the findings of the research**

We believe this statement does not contradict the introduction: despite the widespread use of algorithms built on RA data in child welfare, there remain concerns around their use.
Keywords: Human-centered computing, Human-computer interaction (HCI), Empirical studies in HCI, Applied computing, Computing in government
TL;DR: We quantitatively deconstruct child welfare risk assessments and casenotes to examine the predictive validity of these data sources and find they cannot predict discharge outcomes for children who are not reunified with their bio-parent.
Abstract: Caseworkers in the child welfare (CW) sector use predictive decision-making algorithms built on risk assessment (RA) data to guide and support CW decisions. Researchers have highlighted that RAs can contain biased signals which flatten CW case complexities and that the algorithms may benefit from incorporating contextually rich case narratives, i.e., the casenotes written by caseworkers. To investigate this hypothesized improvement, we quantitatively deconstructed two commonly used RAs from a United States CW agency. We trained classifier models to compare the predictive validity of RAs with and without casenote narratives and applied computational text analysis on casenotes to highlight topics uncovered in the casenotes. Our study finds that common risk metrics used to assess families and build CWS predictive risk models (PRMs) are unable to predict discharge outcomes for children who are not reunified with their birth parent(s). We also find that although casenotes cannot predict discharge outcomes, they contain contextual case signals. Given the lack of predictive validity of RA scores and casenotes, we propose moving beyond quantitative risk assessments for public sector algorithms and towards using contextual sources of information such as narratives to study public sociotechnical systems.
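The abstract's comparison of classifier predictive validity with and without casenote features can be sketched as follows. This is a hypothetical illustration on synthetic data: the feature shapes, the logistic-regression model, the TF-IDF text representation, and the AUC metric are all our assumptions, not the paper's actual pipeline.

```python
# Hypothetical sketch (not the authors' code): compare predictive validity
# of structured risk-assessment (RA) features with vs. without text
# features from casenotes, using synthetic stand-in data.
import numpy as np
from scipy.sparse import csr_matrix, hstack
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n = 200

ra_scores = rng.integers(0, 4, size=(n, 6))    # synthetic item-level RA ratings
notes = ["parent attended scheduled visit" if flag else "missed scheduled visit"
         for flag in rng.integers(0, 2, size=n)]
labels = rng.integers(0, 2, size=n)            # synthetic binary discharge outcome

# Model 1: RA features only.
auc_ra = cross_val_score(LogisticRegression(max_iter=1000),
                         ra_scores, labels, cv=5, scoring="roc_auc").mean()

# Model 2: RA features concatenated with TF-IDF casenote features.
text_features = TfidfVectorizer().fit_transform(notes)
combined = hstack([csr_matrix(ra_scores.astype(float)), text_features])
auc_both = cross_val_score(LogisticRegression(max_iter=1000),
                           combined, labels, cv=5, scoring="roc_auc").mean()

print(f"AUC (RA only):        {auc_ra:.2f}")
print(f"AUC (RA + casenotes): {auc_both:.2f}")
```

With the random labels used here, both AUCs hover near 0.5, i.e., no predictive signal, which mirrors the shape of the paper's finding that neither data source predicts discharge outcomes for non-reunified children.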
Submission Number: 7