{
  "title": "Please Read This Paper",
  "turns": [
    {
      "turn": 1,
      "user": "> mathematic...3-00949-v2 - PDF\n\nPlease read this paper.",
      "assistant": "Sure, I'll read the paper \"Synergistic Joint Model of Knowledge Graph and LLM for Enhancing XAI-Based Clinical Decision Support Systems\" and summarize the key points for you.\n\nThis paper proposes a new synergistic joint model that combines large language models (LLMs) and knowledge graphs (KGs) to overcome existing limitations and enhance the performance of explainable AI (XAI)-based clinical decision support systems (CDSS).\n\n### Paper Overview\n\n*   **Title:** Strengthening XAI-Based Clinical Decision Support Systems through a Synergistic Joint Model of Knowledge Graph and LLM (Synergistic Joint Model of Knowledge Graph and LLM for Enhancing XAI-Based Clinical Decision Support Systems)\n*   **Authors:** Chae-rim Park, Ha-young Lee, Seong-hee Lee, Ok-ran Jeong\n*   **Affiliations:** Gachon University, Electronics and Telecommunications Research Institute (ETRI)\n*   **Journal:** Mathematics\n*   **Publication date:** March 13, 2025\n\n* * *\n\n### Background and Challenges\n\n*   **Limitations of LLMs:** Although LLMs have strong generalization capabilities, they have serious limitations when applied directly to CDSS.\n    *   **Hallucination:** They can generate information that is untrue or misleading.\n    *   **Lack of domain knowledge:** They lack the specialized medical knowledge needed to accurately interpret complex clinical information.\n    *   **Lack of explainability:** As 'black box' models, they do not transparently reveal how they derive results, reducing trust.\n*   **Limitations of Knowledge Graphs (KGs):** While KGs provide structured knowledge that can complement LLMs, they also have inherent limitations.\n    *   **Static nature:** It is difficult to reflect newly emerging knowledge in real time, such as 'COVID-19'.\n    *   **Complexity of use:** Leveraging KGs requires a high level of technical expertise.\n\n* * *\n\n### Proposed Model: A synergistic joint model with a circular structure\n\nThis study proposes a new 
joint model with a circular structure that leverages the complementary strengths of LLMs and KGs to overcome each other's limitations.\n\n*   **KGs complement LLMs:** KGs provide structured, validated domain knowledge to reduce hallucinations and increase explainability by presenting evidence for decisions.\n*   **LLMs complement KGs:** LLMs extract new information from up-to-date documents to dynamically expand and update the KG, overcoming its static limitations.\n\n#### Model architecture layers\n\nThe proposed model consists of four main layers.\n\n1.  **Sub-Graph Extraction layer:** Use a Medical NER model to extract key terms from the input text and retrieve related subgraphs from a knowledge graph (ConceptNet).\n2.  **Graph Expansion layer:** When new keywords not present in the existing KG (e.g., 'COVID') are found, use an LLM (GPT-3.5-turbo) to extract new knowledge (triples) from relevant documents to expand the KG.\n3.  **Triple Scoring layer:** Among the extracted and expanded knowledge, select the top k triples most related to the input text based on cosine similarity using S-BERT. This reduces 'knowledge noise' from unnecessary information.\n4.  
**Concat layer:** Combine the original text and selected top triples in a structured format using special tokens, then feed them to an LLM acting as a classifier to perform the task.\n\n* * *\n\n### Experiments and Results\n\nTo validate the effectiveness of the proposed model, the authors conducted experiments on subtasks of mental health-related CDSS.\n\n*   **Evaluation tasks:**\n    *   Detection of causes/factors for mental health\n    *   Detection of mental health status/level\n    *   Emotion recognition\n*   **Datasets used:** SAD, CAMS, DR, Dreaddit, GoEmotions\n*   **Baselines:** RoBERTa, Mental RoBERTa (a model fine-tuned on mental-health data)\n*   **Key results:**\n    *   Compared with a standalone LLM, the proposed joint model improved accuracy by up to 12.0% and F1 score by up to 8.6%.\n    *   By integrating a KG, the joint model based on a general LLM (RoBERTa) outperformed a domain-specific LLM (Mental RoBERTa). This shows that integrating structured domain knowledge can significantly improve LLM performance.\n    *   Performance dropped when the Graph Expansion and Sub-Graph Extraction layers were removed, demonstrating the importance of each component.\n\n* * *\n\n### Conclusion and Contributions\n\n*   **Conclusion:** This study proposes a new joint model that integrates LLMs and KGs in a circular manner, improving explainability and trustworthiness of AI models in complex domains such as CDSS, and enhancing adaptability to new knowledge. Experimental results show that the proposed model effectively addresses the limitations of existing systems and has the potential to support reliable decision-making across various medical fields.\n*   **Contributions:**\n    1.  Development of a synergistic joint model that mutually compensates for the limitations of LLMs and KGs\n    2.  Demonstration of improved CDSS performance on clinical tasks such as mental health detection\n    3.  
Construction of a scalable, adaptable system capable of dynamically integrating new knowledge"
    },
    {
      "turn": 2,
      "user": "I don't quite understand the task—could you explain it in detail?",
      "assistant": "Sure, let me explain the 'task(s)' mentioned in the paper in more detail.\n\nIn the paper, a 'task' means a concrete 'problem' or 'test' performed to evaluate how well the proposed AI model (knowledge graph + LLM) performs. Because the study proposes a model tailored to Clinical Decision Support Systems (CDSS), they set text classification problems related to the mental-health domain as the tasks.\n\nThe model reads a given text (mainly social media posts or conversational content) and predicts which of the predefined categories it belongs to. The concrete tasks fall into three types:\n\n### 1\\. Mental Health Cause/Factor Detection (Mental Health Cause/Factor Detection)\n\nThe goal of this task is to analyze the text and classify the \\*\\*'cause'\\*\\* of the user's stress or mental difficulties.\n\n*   SAD dataset :\n    A task that classifies the causes of stress in everyday conversations into nine types.\n    *   **Example:** If the input is \"I'm having such a hard time because of work stress,\" the model should classify it as 'work(work)', choosing correctly among the nine categories such as 'work(work)', 'school(school)', 'financial problems(financial problems)', 'family issues(family issues)', and 'social relationships(social relationships)'.\n*   CAMS dataset :\n    Classifies factors that cause mental illness in social media posts into six categories.\n    *   **Categories:** Includes 'bias/abuse(bias/abuse)', 'jobs/careers(jobs/careers)', 'medication(medication)', 'relationships(relationships)', and 'alienation(alienation)'.\n\n### 2\\. 
Mental Health Status/Level Detection\n\nThe goal of this task is to determine, from the text, the user's current mental-health **'status'** or the **'severity'** of stress.\n\n*   DR dataset:\n    A task that classifies the stress index into four levels.\n    *   **Categories:** 'moderate', 'minimal', 'mild', 'severe'. The model infers how severe the stress is from the expressions in the text.\n*   Dreaddit dataset:\n    A binary classification ('yes/no') task that judges whether a post indicates a stressful situation.\n\n### 3\\. Emotion Recognition\n\nThe goal of this task is to accurately recognize and classify the **'emotion'** expressed in the text.\n\n*   GoEmotions dataset:\n    Classifies emotions in Reddit comments into 28 categories: 27 fine-grained emotions plus 'neutral'.\n    *   **Feature:** This is a **multi-label classification** task where a single text can contain multiple emotions.\n    *   **Example:** Given the text, \"I'm happy you like the gift, but I'm also worried,\" the model should predict both 'joy' and 'nervousness'.\n\nIn short, in this paper, a 'task' means **\"a problem where the AI model reads text and classifies it into the correct category based on certain criteria (cause, status, emotion, etc.)\"**. The researchers demonstrated through experiments that their combined knowledge graph + LLM model solves these problems better than an LLM alone, with gains of up to 12.0% in accuracy and up to 8.6% in F1 score."
    },
    {
      "turn": 3,
      "user": "Please explain the method in more detail.",
      "assistant": "Sure, I’ll explain the proposed method in the paper step by step in detail.\n\nThe core idea of this model is to\n\n**tie together a Knowledge Graph (KG) and a Large Language Model (LLM) in a circular structure**, so they complement each other’s weaknesses and produce more accurate and explainable results. The entire workflow is clearly shown in the paper’s \\*\\*Figure 2(Figure 2)\\*\\* and consists of four layers.\n\n* * *\n\n### **How the model works (4 steps)**\n\nWhen a user inputs text (e.g., \"I’m stressed because news about COVID-19 and lockdowns is keeping me from seeing my friends\"), the model proceeds as follows.\n\n#### **Step 1: Sub-Graph Extraction Layer**\n\nThe goal here is to quickly retrieve knowledge related to the input text from a large, existing knowledge graph.\n\n*   1-1. Keyword extraction:\n    First, identify the key terms (keywords) in the input text.\n    *   **Technique used:** A `Medical NER (medical named-entity recognition)` model specialized for medical-domain text, built by training DeBERTa on the PubMed dataset. If keyword extraction fails, Python’s `SpaCy` library is used as a fallback.\n    *   **Example:** Extract keywords like \"COVID-19,\" \"stress,\" and \"friends.\"\n*   **1-2. 
Retrieve related knowledge:** Use the extracted keywords to search for relevant facts in the KG.\n    *   **KG used:** The commonsense knowledge graph `ConceptNet`, which is composed of triples in the form (subject, relation, object) (e.g., (friend, used for, good mood)).\n    *   **Search result:** Collect all triples connected to the keywords to form a small-scale 'sub-graph.'\n*   **Filtering knowledge noise:** Among ConceptNet’s 34 relation types, exclude overly generic or low-relevance ones like 'related\\_to,' and keep only 16 meaningful relations such as 'causes' and 'created\\_by' to reduce unnecessary information (knowledge noise).\n\n#### **Step 2: Graph Expansion Layer**\n\nThis is crucial for handling **new concepts (e.g., 'COVID')** that don’t exist in a KG like ConceptNet.\n\n*   **Goal:** Dynamically expand the static KG with up-to-date information.\n*   **How it works:**\n    1.  If the sub-graph extraction step finds a keyword absent from the KG, this layer is triggered.\n    2.  Use an LLM like `GPT-3.5-turbo-instruct` to search recent documents or data containing the keyword and generate new knowledge triples.\n    3.  Prompt the LLM to produce triples strictly in the `(subject, relation, object)` format.\n    4.  Validate and clean the generated triples, then add them to the existing sub-graph.\n*   **Example:** Add new knowledge such as (COVID, cause, stress), (COVID, related, pandemic).\n\n#### **Step 3: Triple Scoring Layer**\n\nFrom the triples collected in Steps 1 and 2, select only those that are **most relevant** to the input text.\n\n*   **Goal:** Optimize the amount of knowledge injected into the LLM and minimize knowledge noise for efficiency.\n*   **Technique used:** `S-BERT (Sentence-BERT)`, which excels at measuring semantic similarity between sentences.\n*   **How it works:**\n    1.  Convert the original input text and each knowledge triple into vectors (embeddings) using S-BERT.\n    2.  
Compute `cosine similarity` between the text vector and each triple vector to score them.\n    3.  Select the **top k triples (3 in the paper)** with the highest scores.\n\n#### **Step 4: Concat Layer**\n\nFinally, effectively combine the selected knowledge with the original text and feed it to the final classifier (LLM).\n\n*   **Goal:** Integrate structured knowledge (KG) and unstructured text in a format the LLM can readily understand.\n*   **How it works:**\n    1.  Use the `special_token` feature of the `Huggingface Tokenizer` to explicitly mark the structure of knowledge triples.\n    2.  The final input format looks like:\n        `[CLS] <HEAD> subject <REL> relation <TAIL> object [SEP] original sentence`\n        The special tokens `<HEAD>`, `<REL>`, and `<TAIL>` tell the LLM that these segments are structured knowledge.\n    3.  The processed input is then fed into an LLM classifier such as `RoBERTa` or `Mental RoBERTa` to perform the final task, e.g., \"mental health cause classification.\"\n\nThrough these multi-step processes, the model doesn’t just read the text; it reasons more deeply based on validated and expanded background knowledge, enabling accurate and reliable decisions."
    }
  ]
}