# 🧠 RacoNet 🔥
## 📌 What’s Available

| Component                        | Status         |
|----------------------------------|----------------|
| 📚 Model Definition              | ✅ Available   |
| 💻 Training Code                 | ✅ Available   |
| 🧪 Inference Code                | ✅ Available   |
| 🎯 Model Weights                 | ✅ Available   |
| 📊 Dataset                       | ⏳ Coming Soon |

---

## 📄 Paper Information

- **Title**: ShadowSpeak: Is It Possible to Communicate Cross-Room Solely by Decoding Gesture Shadows?
- **Status**: Under review at ICLR 2026


## Overview
Accurately decoding the information hidden in dynamic shadows for Non-Line-of-Sight (NLOS) imaging makes it possible to overcome visual occlusions and to perceive or reconstruct obscured targets, with potential applications in disaster rescue, autonomous driving, and security surveillance. Conventional algorithms struggle to model the physical propagation of light in space. Furthermore, the signal distortions introduced by nonlinear transformations discard geometric information about the source scene, which limits sensitivity to subtle shadow variations.

To overcome these challenges, we present the Radiation-constraint Network (RacoNet), which combines physical propagation simulation with geometric-information recovery to interpret minute gesture signals embedded in dynamic shadows. RacoNet has three components:

- **Radiance-Constrained Light-Transportation (RCLT)** optical propagation captures complete light-space information.
- **Geometric Information Aliment Operation (GIAO)** restores source-scene geometry lost in the modulated shadow through layer-by-layer refined prior attention.
- **Kolmogorov-Arnold Enhanced Layerwise Nonlinear Reorganization (KA-ELNR)** fuses light-space and geometric cues to produce the final decoded output.

Extensive experiments show that RacoNet markedly surpasses existing approaches in both accuracy and robustness for dynamic-shadow decoding, confirming the possibility of gesture-based information interaction via shadows.

![](./imgs/intro.png)


## The overall architecture of RacoNet
![](./imgs/all.png)

## ScLT Block architecture
![](./imgs/sclt.png)

## ALT Block architecture
![](./imgs/alt.png)

---

## Pretrained weights (.pth)
### ALT
- Link: https://XXXXXXXXX.com/ALT
- Password: RacoNet

### GIAO
- Link: https://XXXXXXXXX.com/GIAO
- Password: RacoNet

## Our conda environment

- Python *3.11.13* and PyTorch *2.9.0*
- NVIDIA GeForce RTX 5090 GPUs

The full dependency list is in `environment.yml`.
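
To confirm that an activated environment matches the versions listed above, a small check like the following may help. It is a minimal sketch, not part of the repository: the function name `check_environment` and the report keys are hypothetical, and the version prefixes are taken from the list above.

```python
import importlib.util
import platform


def check_environment(expected_python="3.11", expected_torch="2.9"):
    """Compare the running interpreter with the versions named in this README."""
    python_version = platform.python_version()
    report = {
        "python": python_version,
        "python_ok": python_version.startswith(expected_python + "."),
        "torch": None,
        "torch_ok": False,
        "cuda_available": False,
    }
    # Import torch only if it is installed, so the check itself never crashes.
    if importlib.util.find_spec("torch") is not None:
        import torch

        report["torch"] = str(torch.__version__)
        report["torch_ok"] = report["torch"].startswith(expected_torch + ".")
        report["cuda_available"] = bool(torch.cuda.is_available())
    return report


if __name__ == "__main__":
    for key, value in check_environment().items():
        print(f"{key}: {value}")
```

A `False` in `python_ok` or `torch_ok` only means the installed versions differ from the ones we tested with; the code may still run.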


---

## 🛠️ Usage

### 📦 Create Environment

```shell
conda env create -f environment.yml
conda activate RacoNet
```

### 📈 Create a dataset CSV file

```shell
python3 create_csv.py --dataset_dir '[dataset path]'
```
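
For readers who want to see what such an index file can look like, here is a minimal sketch of the general idea. It is an illustration under stated assumptions, not the actual `create_csv.py`: it assumes one sub-directory per class under the dataset root, and the function name `build_index` and the column names `path` and `label` are hypothetical.

```python
import csv
from pathlib import Path

IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".bmp"}


def build_index(dataset_dir, csv_path):
    """Write one (path, label) row per image and return the number of rows."""
    root = Path(dataset_dir)
    # Assumption: every sub-directory of the dataset root is one class.
    classes = sorted(p.name for p in root.iterdir() if p.is_dir())
    label_of = {name: index for index, name in enumerate(classes)}

    rows = []
    for name in classes:
        for image in sorted((root / name).rglob("*")):
            if image.suffix.lower() in IMAGE_SUFFIXES:
                # Store paths relative to the dataset root, with forward slashes.
                rows.append((image.relative_to(root).as_posix(), label_of[name]))

    with open(csv_path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["path", "label"])  # hypothetical column names
        writer.writerows(rows)
    return len(rows)
```

Check the CSV produced by the real script before training, since its columns and split logic may differ from this sketch.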

### 🚀 Train

```shell
#!/bin/bash

# Expose only physical GPU 1; inside the process it is addressed as cuda:0.
export CUDA_VISIBLE_DEVICES=1
export OMP_NUM_THREADS=4

for var in RacoNetClassify
do
    torchrun --nproc_per_node 1 train.py \
        --model "$var" \
        --dataset_name "[dataset]" \
        --train_dataset "data/[dataset]/train_data.csv" \
        --test_dataset "data/[dataset]/test_data.csv" \
        --in_channels 3 \
        --in_size 256 \
        --classify_num 24 \
        --batch_size 16 \
        --lr 0.0005 \
        --epochs 60 \
        --start_epoch 0 \
        --device "cuda:0" \
        --continue_train "false" \
        --checkpoint "[GIAO weight]" \
        --parameter "[RacoNet weight]" \
        --checkpoints_dir "checkpoints"
done
```
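
Because the script sets `CUDA_VISIBLE_DEVICES`, device indices inside the training process are renumbered from zero over the visible GPUs only. The sketch below illustrates that renumbering for plain integer IDs; the helper name `logical_to_physical` is hypothetical and is not part of this repository or of PyTorch.

```python
def logical_to_physical(cuda_visible_devices):
    """Map in-process device names to the physical IDs in CUDA_VISIBLE_DEVICES."""
    ids = [item.strip() for item in cuda_visible_devices.split(",") if item.strip()]
    # The n-th listed physical GPU becomes cuda:n inside the process.
    return {f"cuda:{index}": physical for index, physical in enumerate(ids)}


print(logical_to_physical("1"))    # {'cuda:0': '1'}
print(logical_to_physical("2,3"))  # {'cuda:0': '2', 'cuda:1': '3'}
```

When adapting the script to another GPU, choose the `--device` value from the left-hand side of this mapping.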
---

## Diffractive Text Reassembly via Modulated Shadow
![](./imgs/rnn.png)
![](./imgs/sentence.png)

## Datasets
- [Sign Language for Numbers (S-Numbers)](https://www.kaggle.com/datasets/muhammadkhalid/sign-language-for-numbers)
![](./imgs/numbers.png)

- [Sign Language MNIST (S-MNIST)](https://www.kaggle.com/datasets/datamunge/sign-language-mnist)
![](./imgs/MNIST.png)


## Results

<details>
<summary>Performance (click me)</summary>
<p align="center">
  <img width="900" src="imgs/r1.png">
</p>
</details>

<details>
<summary>Ablation (click me)</summary>
<p align="center">
  <img width="900" src="imgs/r2.png">
</p>
</details>


---

> [!NOTE]
> The datasets we collected and the experiments we conducted strictly complied with local laws.

> [!NOTE]
> This project is released under the Apache 2.0 license.
