# SSG-ECPE: Semantics-Structured Generation with Alignment for Emotion-Cause Pair Extraction

## 🚀 Introduction
Emotion-Cause Pair Extraction (ECPE) aims to jointly identify emotion clauses and their corresponding cause clauses.  
We propose **SSG-ECPE**, a task-adaptive generative multi-task learning framework that:
- Reformulates ECPE as **structured text-to-text generation**.  
- Designs **semantics-structured outputs** encoding clause roles, emotion types, and triggers.  
- Introduces **Clause Prediction Alignment (CPA)** to ensure faithful predictions and mitigate hallucinations.  

Our framework achieves **state-of-the-art performance** on both Chinese and English ECPE benchmarks.

---

## 📂 Dataset
We evaluate on two widely used benchmarks:
- **Chinese ECPE Dataset** [[Xia et al., ACL 2019]](https://aclanthology.org/P19-1096/)  
- **NTCIR-13 English Emotion Corpus** [[Gao et al., 2017]](https://research.nii.ac.jp/ntcir/workshop/OnlineProceedings13/pdf/ntcir/01-NTCIR13-OV-ECA-GaoQ.pdf)  

Please download the datasets from the official sources and preprocess them using our scripts under `data/ecpe/`.

---

## 📥 Pretrained Models
We initialize from the following Hugging Face checkpoints:

- **Chinese ECPE:** [Randeng-T5-77M-MultiTask-Chinese](https://huggingface.co/IDEA-CCNL/Randeng-T5-77M-MultiTask-Chinese) 
- **English ECPE:** [T5-base](https://huggingface.co/google-t5/t5-base)

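Download each model from Hugging Face before training and place it in its own subdirectory under `pre-models/`, for example `pre-models/Randeng-T5-77M-MultiTask-Chinese`, which is the path used in the training command.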
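
One way to fetch the checkpoints is to clone the model repositories with Git LFS. The sketch below assumes `git` and `git-lfs` are installed. `fetch_model` and `MODEL_ROOT` are hypothetical names used for illustration and are not part of this repository.

```bash
# Sketch: clone a Hugging Face model repo into pre-models/ unless it is already there.
# Assumption: git and git-lfs are installed; fetch_model is a hypothetical helper.
MODEL_ROOT="${MODEL_ROOT:-pre-models}"

fetch_model() {
  local repo_id="$1"                           # e.g. google-t5/t5-base
  local target="${MODEL_ROOT}/${repo_id##*/}"  # keep only the part after the last "/"

  # Skip the download when a config.json is already present in the target directory.
  if [ -f "${target}/config.json" ]; then
    echo "skip: ${target} already present"
    return 0
  fi

  mkdir -p "${MODEL_ROOT}"
  git lfs install
  git clone "https://huggingface.co/${repo_id}" "${target}"
}
```

Running `fetch_model IDEA-CCNL/Randeng-T5-77M-MultiTask-Chinese` from the repository root should produce `pre-models/Randeng-T5-77M-MultiTask-Chinese`, and `fetch_model google-t5/t5-base` should produce `pre-models/t5-base`. The directory name for the English model is an assumption. Adjust `--model_name_or_path` to match the directory you use.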


---

## ⚙️ Requirements
We implement our method with PyTorch, PyTorch Lightning, and Hugging Face Transformers.

```bash
# Create environment
conda create -n ssgecpe python=3.8
conda activate ssgecpe

# Install dependencies
pip install torch==1.13.0
pip install transformers==4.1.0
pip install pytorch_lightning==0.8.1
```

---

## 🏋️ Training & Evaluation
Example command for the Chinese setup, using the Randeng-T5 checkpoint under `pre-models/`:
```bash
python main.py --task ecpe \
            --dataset eca_cn_10 \
            --model_name_or_path ./pre-models/Randeng-T5-77M-MultiTask-Chinese \
            --paradigm multi_task \
            --n_gpu 1 \
            --do_train \
            --do_direct_eval \
            --train_batch_size 16 \
            --gradient_accumulation_steps 2 \
            --eval_batch_size 8 \
            --learning_rate 0.005 \
            --num_train_epochs 20
```
