1. [Works!] Use an LLM (https://arxiv.org/pdf/2401.11505v1) to process the radiology report and extract labels.
2. Take the label dataset from the MIMIC chest X-ray dataset.
3. Follow this causal graph (https://arxiv.org/pdf/2006.02482) to train IDGEN.
4. Use one of these diffusion models to generate images from labels: https://arxiv.org/pdf/2303.11224
   or https://arxiv.org/pdf/2211.12737
5. At test time, we experiment with different types of reports. When a report is partial, we detect the labels it
   mentions and use IDGEN to obtain the remaining labels, because the remaining labels are causally related to the
   given ones.
6. Finally, we use all the labels to generate the image with the diffusion model.

Prof, I want to run the following experiment for IDGEN with the chest X-ray dataset.
The goal is to generate X-ray images from an input prompt, i.e., to sample from P(Image | do(prompt)).
The prompt might mention only some of the labels.

Training time:
Step 1: Train a DCM on the labels from the MIMIC chest X-ray dataset, using the causal graph above.

Inference time:
Step 2: Use an LLM to process the radiology prompt/report and extract its labels.
Step 3: After extracting the labels from the prompt, infer the remaining labels using the DCM trained in Step 1.
Step 4: Use these labels to generate X-ray images with an image generative model.
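
Step 3 can be sketched roughly as follows. This is only a toy illustration, not the IDGEN implementation: the `PARENTS` graph, the `mechanism` function, and its probabilities are hypothetical stand-ins for the trained DCM. Labels given in the prompt are clamped (the do-intervention), and every other label is sampled from its mechanism in topological order.

```python
import random
from graphlib import TopologicalSorter

# Hypothetical parent sets (node -> list of parents), for illustration only.
PARENTS = {
    "Effusion": [],
    "Infiltration": [],
    "Atelectasis": ["Effusion"],
    "Pneumonia": ["Atelectasis", "Infiltration", "Effusion"],
    "Lung Opacity": ["Pneumonia"],
}


def mechanism(node, parent_values, rng):
    """Placeholder for a trained DCM mechanism (assumed, not learned from data)."""
    if parent_values:
        # Probability of the label grows with the fraction of active parents.
        p = 0.1 + 0.8 * sum(parent_values) / len(parent_values)
    else:
        p = 0.2  # arbitrary prior for root nodes
    return int(rng.random() < p)


def complete_labels(given, seed=0):
    """Sample a full label vector under do(given)."""
    rng = random.Random(seed)
    labels = {}
    # static_order() yields parents before their children.
    for node in TopologicalSorter(PARENTS).static_order():
        if node in given:
            labels[node] = given[node]  # clamp labels extracted from the prompt
        else:
            parent_values = [labels[p] for p in PARENTS[node]]
            labels[node] = mechanism(node, parent_values, rng)
    return labels


full = complete_labels({"Pneumonia": 1})
print(full)
```

The completed dictionary would then be passed to the image generator in Step 4. In the real pipeline, `mechanism` would be replaced by the networks trained in Step 1.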

# Causal graph: Symptoms and disease

Cardiomegaly (enlargement of the heart) is an umbrella designation for various conditions leading to enlargement of the
heart, which usually remains undiagnosed until symptoms ensue. It has become increasingly prevalent and carries a high
mortality.

vs

Hyperinflated lungs are lungs that expand beyond their usual size because air is trapped inside. This is common in
people with COPD and other respiratory conditions. It causes symptoms like difficulty inhaling and shortness of breath.

* **Pulmonary edema** happens when fluid collects inside the lungs, in the alveoli, making it difficult to breathe.

* **Lung opacity**: (1) A lack of transparency; an opaque or nontransparent area. (2) On a radiograph, a more
  transparent area is interpreted as an opacity to x-rays in the body.

# Edges

* **Edges**: In the X-ray experiment, we find that edges from Atelectasis, Infiltration, and Effusion are stable
  possible causes of the Pneumonia label. This is reassuring, since these are clinically relevant conditions for
  patients with pneumonia.

* [Pneumonia -> visible symptoms] Pneumonia causes other physical attributes that are more closely related to the
  X-ray images.
* [Effusion -> Atelectasis] External pulmonary compression by pleural fluid or air (i.e., pleural effusion,
  pneumothorax) may also cause atelectasis.
  (https://emedicine.medscape.com/article/296468-overview#:~:text=External%20pulmonary%20compression%20by%20pleural,with%20oxygen%20toxicity%20and%20ARDS, https://www.mayoclinic.org/diseases-conditions/atelectasis/symptoms-causes/syc-20369684)


* [Pneumonia -> Lung opacity] : https://www.healthline.com/health/lung-opacity#causes
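
One way to keep the edge list honest is to write it down and check that it is still a DAG. The sketch below encodes only the edges named in this section (the node spellings are an assumption), and shows that adding a Pneumonia -> Effusion edge, which the Pleural effusion note under Nodes would suggest, introduces a cycle.

```python
from graphlib import CycleError, TopologicalSorter

# (cause, effect) pairs taken from the Edges notes; spellings are assumed.
EDGES = [
    ("Atelectasis", "Pneumonia"),
    ("Infiltration", "Pneumonia"),
    ("Effusion", "Pneumonia"),
    ("Effusion", "Atelectasis"),
    ("Pneumonia", "Lung Opacity"),
]


def is_dag(edges):
    """Return True if the (cause, effect) pairs form an acyclic graph."""
    predecessors = {}
    for cause, effect in edges:
        predecessors.setdefault(effect, set()).add(cause)
        predecessors.setdefault(cause, set())
    try:
        # prepare() raises CycleError when the graph contains a cycle.
        TopologicalSorter(predecessors).prepare()
    except CycleError:
        return False
    return True


print(is_dag(EDGES))
print(is_dag(EDGES + [("Pneumonia", "Effusion")]))
```

Since the DCM needs an acyclic graph, the direction of the Effusion/Pneumonia edge has to be fixed one way before training.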

# Nodes

* **Pneumonia** is an infection. If a person has pneumonia, the alveoli in one or both lungs fill with pus and fluid
  (exudate), which interferes with gas exchange. This is sometimes known as 'consolidation and collapse of the lung'.
  CheX-GPT (case study in the ablation study): in case (a), GPT-4 mapping erroneously labels ‘pneumonia’ as
  ‘Consolidation’, whereas rule-based mapping refrains from categorizing it, recognizing the distinction between the
  terms. In a further case, similarly to case (a), GPT-4 mapping conflates ‘pneumonia’ with ‘Consolidation’;
  additionally, it overlooks ‘Lung opacity’, which is the label applied by rule-based mapping.

* **Pneumothorax** occurs when air leaks from inside the lung to the space between the lung and the chest wall. The
  lung then collapses. It may be a complication of pneumonia (particularly S. pneumoniae) or of the invasive
  procedures used to treat pleural effusion.
  https://www.mountsinai.org/health-library/report/pneumonia#:~:text=Collapsed%20Lung&text=This%20is%20called%20pneumothorax.,The%20lung%20then%20collapses.
  Both COVID-19 pneumonia and pneumothorax share a non-specific but common symptom of dyspnoea. The incidence of
  spontaneous pneumothorax occurring as a complication of pneumonia in adults is apparently very low.
  https://www.acpjournals.org/doi/10.7326/0003-4819-40-1-153#:~:text=The%20incidence%20of%20spontaneous%20pneumothorax,in%20the%20literature%20are%20sparse.&text=Numerous%20reports%20of%20such%20cases%20in%20children%20have%20been%20recorded.

* **Cardiomegaly**: 'widened mediastinal silhouette' according to CheX-GPT.

* **Atelectasis** is one of the most common breathing complications after surgery. (A complication is a medical problem
  that occurs during a disease, or after a procedure or treatment.) The most common cause of atelectasis is surgery
  with anesthesia.
  (https://my.clevelandclinic.org/health/diseases/17699-atelectasis#:~:text=Atelectasis%20happens%20when%20lung%20sacs,atelectasis%20is%20surgery%20with%20anesthesia.)

* **Pleural effusion** occurs when fluid builds up in the space between the lung and the chest wall. This can happen for
  many different reasons, including pneumonia or complications from heart, liver, or kidney disease.
  Pleural effusion also involves fluid in the lung area and is sometimes called “water on the lungs.”
  Pleural effusions are common in patients who develop pneumonia. At least 40-60% of patients with bacterial pneumonia
  will develop a pleural effusion of varying severity.

# Biases

* **Bias/Confounder**: Performance in detecting “pleural effusion” decreased between 10.7% and 11.6% for Black patients.
  https://pubs.rsna.org/doi/full/10.1148/ryai.230060#:~:text=The%20studied%20chest%20radiography%20foundation,be%20unsafe%20for%20clinical%20applications.
  The authors' bias analysis showed significant differences between features related to disease detection across
  biologic sex (P < .001) and race (P < .001).

* **Image and other attributes**: (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8545228/) These biases could arise,
  for example, when specific devices are used to acquire images of patients with a low probability of having the
  disease (mainly controls), and different devices for patients with a high probability of having it (mainly cases).
* **Dataset mixing**: when most patients are screened in certain health services and highly suspicious patients
  are referred to a different area or, even worse, when, aiming to increase the number of controls or cases, a dataset
  is expanded with samples coming from significantly different origins and labeled with unbalanced class identifiers.
  In these cases, a CNN trained to discriminate between cases and controls could learn to differentiate images from
  different origins rather than finding features actually related to the disease. **Text <--> Disease** (due to
  collecting data from different regions)
* Does accuracy increase or drop in some subgroups?
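
The subgroup question can be checked with a per-group accuracy breakdown. The helper below is a minimal sketch; the function name and the toy labels and groups are made up for illustration and do not come from any dataset mentioned here.

```python
from collections import defaultdict


def accuracy_by_group(y_true, y_pred, groups):
    """Return {group: accuracy} for aligned lists of labels, predictions, and group ids."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, groups):
        total[group] += 1
        correct[group] += int(truth == pred)
    return {group: correct[group] / total[group] for group in total}


# Toy example with two hypothetical subgroups "A" and "B".
y_true = [1, 0, 1, 1, 0, 1]
y_pred = [1, 0, 1, 1, 1, 0]
groups = ["A", "A", "A", "B", "B", "B"]
print(accuracy_by_group(y_true, y_pred, groups))
```

A large gap between groups would be a signal to look for one of the confounders listed above, such as acquisition device or data origin.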


  