\section{Introduction}
\label{sec:intro}
About $13$ million people in Bangladesh suffer from varying degrees of hearing loss, of whom $3$ million have a hearing disability~\cite{alauddin2004deafness}.
Around $1$ million people use Bangladeshi Sign Language (BdSL) in their everyday life~\cite{c2}. When communicating with a signer, a non-signer faces two major tasks: \textit{(i)} understanding the signs and \textit{(ii)} expressing the signs. Researchers have made impressive contributions to task~\textit{(i)} by developing sign letter\footnote{Signs that represent letters only.} recognition techniques from images (Fig.~\ref{fig:teaser}~\protect\includegraphics[scale=.18]{crc1.png}). Several works have been proposed for BdSL letter classification via machine learning techniques~\cite{islam2022improving, rahim2022soft, miah2022bensignnet, hasan2021shongket, khatun2021systematic, talukder2021okkhornama, Hoque_2020_ACCV, BdSLiciet}.
Task~\textit{(ii)} has received less research attention, since it is a difficult process for non-signers. A naive and tiresome approach to expressing signs is to use flashcards with signs and symbols. Inspired by that,~\cite{shishir2020esharagan} proposed a system that generates symbols of signs using generative adversarial networks (GANs)~\cite{goodfellow2014generative}. However, their work only produces symbols of signs, which may raise questions regarding the necessity of such a system. Another way is to use animated avatars of signs, e.g.,~\cite{KippAvat}; however, no such system exists for BdSL. Moreover, an avatar-based system does not provide a realistic environment for communication.

All the above scenarios inspired us to make task~\textit{(ii)} more realistic yet effortless. Hence, we introduce \textit{\textbf{PerSign}: {Pers}onalized Bangladeshi {Sign} Letters Synthesis}, which converts the image of a user into an image showing signs while keeping the person's profile unchanged. Fig.~\ref{fig:teaser}~\protect\includegraphics[scale=.19]{crc2.png} explains the working pipeline of our prototype. A user first uploads their profile photo ($I_P$) to our system; this is needed only once. The $I_P$ must follow a specific rule of showing the hand and palm (as shown in~\protect\includegraphics[scale=.19]{crc2.png}{$^{a}$}). After that, the user enters the desired letter ($L$) to be expressed (e.g., \protect\includegraphics[scale=.17]{Ga.png} in \protect\includegraphics[scale=.19]{crc2.png}{$^{b}$}). Our system converts $I_P$ into $I_L$ by considering $L$. This can be seen as $I_L \leftarrow I_P+L$, where $I_L$ contains the same person as in $I_P$ with unchanged face, skin tone, attire, and background (\protect\includegraphics[scale=.19]{crc2.png}{$^{c}$}). The user therefore does not need any expertise in sign language. We believe that a signer will feel a more natural environment when shown $I_L$, making the communication more realistic and affectionate.

\textbf{\textit{\color{blue}Do we really need such a system?}} To address this question, we surveyed a group of $6$ guardians and $11$ teachers of deaf children, who were also signers, regarding the necessity of our system.
We let participants upload profile images to \textit{PerSign} and asked them to rate the results on a scale of \textcircled{\small 1} to \textcircled{\small 5} according to the \textit{Likert} rating method~\cite{likert1932technique}, with \textcircled{\small 1} being \textit{not necessary at all} and \textcircled{\small 5} being \textit{very necessary}. Out of the $17$ participants, $13$ and $3$ rated \textit{PerSign} \textcircled{\small 5} and \textcircled{\small 4}, respectively, and the average rating over all participants was $4.705$. Most of the sign language teachers commented that \textit{personalized} signs are very helpful for the general public to get closer to signers, especially when it comes to children.
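
As a consistency check on the reported average, the arithmetic can be sketched as follows. The rating $r$ of the one remaining participant is not reported above and is treated here as an unknown; this is an assumption made only for illustration.
\[
\frac{13 \times 5 + 3 \times 4 + r}{17} = \frac{77 + r}{17}.
\]
Setting $r = 3$ gives $80/17 \approx 4.706$, which matches the reported $4.705$ up to rounding, whereas $r = 2$ or $r = 4$ would give about $4.647$ or $4.765$, respectively.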

\section{Implementation}
We employed a \textit{Generative Adversarial Network} (Fig.~\ref{fig:teaser}~\protect\includegraphics[scale=.18]{crc3.png}), an unsupervised deep learning technique that learns patterns from a dataset so that the model can produce new outputs~\cite{goodfellow2014generative}. Our problem falls under a sub-domain of GANs known as \textit{image-to-image translation}~\cite{isola2017image}, and we adopted \textit{GestureGAN}~\cite{tang2018gesturegan}, a gesture-to-gesture translation method, to implement a prototype system. For this purpose, we built a dataset of images with hand gestures of arbitrary poses, sizes, and backgrounds.
Since we needed a paired dataset $\{I_P, I_L\}$, we could not reuse any of the existing unpaired ones. To localize gestures and appearances, we used \textit{OpenPose}~\cite{8765346} to extract skeletons of the hands and face from the images and stored the pairs of input and output images. We then trained and tested the \textit{GestureGAN} model on our dataset and constructed a working prototype of our system. Fig.~\ref{fig:res} presents more results of \textit{PerSign}.
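
As an illustration of the paired setting, the following is a minimal sketch of a generic conditional image-to-image objective in the style of~\cite{isola2017image}; it is not necessarily the exact loss of \textit{GestureGAN} or of our prototype. The generator $G$, the discriminator $D$, the target skeleton $S_L$ for letter $L$, the generated image $\hat{I}_L$, and the weight $\lambda$ are notation assumed here for illustration only.
\[
\hat{I}_L = G(I_P, S_L),
\]
\[
\mathcal{L}_{\mathrm{adv}}(G, D) = \mathrm{E}\left[\log D(I_P, S_L, I_L)\right] + \mathrm{E}\left[\log\left(1 - D(I_P, S_L, \hat{I}_L)\right)\right],
\]
\[
G^{*} = \arg\min_{G}\max_{D}\; \mathcal{L}_{\mathrm{adv}}(G, D) + \lambda\, \mathrm{E}\left[\| I_L - \hat{I}_L \|_{1}\right].
\]
Under these assumptions, the $\ell_1$ term is the part that requires ground-truth pairs $\{I_P, I_L\}$: it compares each generated image with a real image of the same person showing the sign.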
\begin{figure}[htb]
  \includegraphics[width=0.49\textwidth]{res2.png}
  \caption{Result analysis. (a) Input profile image. (b) Generated image with signs. (c) Zoomed view of faces from the input \textit{(left)} and output \textit{(right)}, showing that the face is retained.}
  \label{fig:res}
  \Description {A picture of the result obtained exclusively showing the face and hand-gestures.}
\end{figure}

\section{Conclusion and Future Work}
In this poster, we proposed \textit{PerSign}, a framework for synthesizing Bangladeshi Sign Letters that can be \textit{personalized}. Through our method, anyone will be able to communicate as a signer without having any expertise in BdSL. We built our own dataset and used the \textit{GestureGAN} method to accomplish the task. Our work is still in progress, and we gathered comments from the participants during the survey to find areas for improvement. Most of the users recommended applying the technique to a sequence of images to render video for a stream of input letters. We expect to achieve better results by increasing the number of diverse examples in our dataset. Though our dataset has samples for all BdSL letters, some specific letters need to be treated carefully because of their similar patterns.
Our final aim is to merge tasks (\textit{i}) and (\textit{ii}) into a single system to provide a \textit{one-stop solution} for two-way communication. Some of the users also suggested extending our work to full gestures, rather than letters only. We also intend to implement a better \textit{GUI} with voice input. Last but not least, we plan to conduct a thorough evaluation with experts and signers. We believe this poster opens new avenues for further research in sign language.

\newpage
\bibliographystyle{ACM-Reference-Format}
