<html>

<head>
  <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
  <link rel="stylesheet" type="text/css" href="cid:css-99d622a7-b09d-4ecb-b6de-0cda4c5d89cf@mhtml.blink" />

  <title>Deep Active Speech Cancellation with Mamba-Masking Network</title>
  <meta property="og:title" content="Deep Active Speech Cancellation with Mamba-Masking Network">
  <meta property="og:type" content="article">
  <!-- FIXME(shillingford): add final URL -->
  <meta property="og:url" content="">
  <!--meta property="og:image" content="images/vdtts_teaser.webp"-->
  <meta name="viewport" content="width=device-width, initial-scale=1.0">
  <link rel="preconnect" href="https://fonts.googleapis.com">
  <link rel="preconnect" href="https://fonts.gstatic.com" crossorigin>
  <link href="https://fonts.googleapis.com/css2?family=Roboto+Slab:wght@300&family=Roboto:wght@300&display=swap"
    rel="preload" as="style">
  <link href="https://fonts.googleapis.com/css2?family=Roboto+Slab:wght@300&family=Roboto:wght@300&display=swap"
    rel="stylesheet">


  <link rel="stylesheet" href="style.css">
  <style>
    .centered-paragraph {
      max-width: 600px;
      /* Adjust the width as needed */
      margin: 0 auto;
      padding: 10px;
      text-align: center;
      /* Optional: Add padding for better readability */
    }
  </style>
</head>

<body>
  <p class="main">

  <h1>Deep Active Speech Cancellation with Mamba-Masking Network</h1>

  <div class="fig-teaser">
    <!-- TODO ADD TEASER IMG -->
  </div>

  <div class="abs">
    <p>
      We present a novel deep learning network for Active Speech Cancellation (ASC), advancing beyond Active Noise Cancellation (ANC) 
      methods by effectively canceling both noise and speech signals. The proposed Mamba-Masking architecture introduces a masking 
      mechanism that directly interacts with the encoded reference signal, enabling adaptive and precisely aligned anti-signal 
      generation—even under rapidly changing, high-frequency conditions, as commonly found in speech. Complementing this, a multi-band 
      segmentation strategy further improves phase alignment across frequency bands. 
      Additionally, we introduce an optimization-driven loss function that provides near-optimal supervisory signals for anti-signal 
      generation. Experimental results demonstrate substantial performance gains, achieving up to 7.2dB improvement in ANC scenarios 
      and 6.2dB in ASC, significantly outperforming existing methods.
    </p>
  </div>

  <h1>Model</h1>
  <div class="scroll-container">
    <div class="fig-model">
      <img src="figures/our_arch.png">
    </div>

  </div>

  <a id="ConversationalSTE">
    <h2>Active Noise Cancellation</h2>
  </a>
  <div>
    <p style="font-size: 20px;" class="centered-paragraph"> <strong> Please use headphones, as it may be difficult to
        hear otherwise. </strong>
    </p>
  </div>

  <p class="centered-paragraph">
    The first audio column, labeled <highlighted>Primary signal d(n)</highlighted>, represents the signal that a
    listener would hear without the application of any ANC algorithm. The subsequent column, <highlighted>DeepASC
    </highlighted>, presents the canceling signal e(n) generated by our proposed model in response to the input signal
    from the first column. The following columns, <highlighted>ARN</highlighted>, <highlighted>DeepANC</highlighted>,
    and <highlighted>FxLMS</highlighted>, display the results produced by the ARN, DeepANC, and FxLMS methods,
    respectively, for the same input signal.
  </p>

  <div class="table-container">
    <table class="sample-table" id="samples-table">
      <colgroup>
        <col>
      </colgroup>
      <thead>
        <tr>
          <th colspan="1">Primary signal d(n)</th>
          <th colspan="1">DeepASC</th>
          <th colspan="1">ARN</th>
          <th colspan="1">DeepANC</th>
          <th colspan="1">FxLMS</th>
        </tr>
      </thead>

      <tbody>
        <tr class="audio">
          <td><audio controls="">
              <source src="noises/babble_1/pt.wav" type="audio/wav">
          </td>
          <td><audio controls="">
              <source src="noises/babble_1/our_error.wav" type="audio/wav">
          </td>
          <td><audio controls="">
              <source src="noises/babble_1/arn_error.wav" type="audio/wav">
          </td>
          <td><audio controls="">
              <source src="noises/babble_1/deepanc_error.wav" type="audio/wav">
          </td>
          <td><audio controls="">
              <source src="noises/babble_1/fxlms_error.wav" type="audio/wav">
          </td>
        </tr>
        <tr class="transcript">
          <td></td>
          <td></td>
        </tr>
        <tr class="audio">
          <td><audio controls="">
              <source src="noises/babble_2/pt.wav" type="audio/wav">
          </td>
          <td><audio controls="">
              <source src="noises/babble_2/our_error.wav" type="audio/wav">
          </td>
          <td><audio controls="">
              <source src="noises/babble_2/arn_error.wav" type="audio/wav">
          </td>
          <td><audio controls="">
              <source src="noises/babble_2/deepanc_error.wav" type="audio/wav">
          </td>
          <td><audio controls="">
              <source src="noises/babble_2/fxlms_error.wav" type="audio/wav">
          </td>
        </tr>
        <tr class="transcript">
          <td></td>
          <td></td>
        </tr>
        <tr class="audio">
          <td><audio controls="">
              <source src="noises/factory_1/pt.wav" type="audio/wav">
          </td>
          <td><audio controls="">
              <source src="noises/factory_1/our_error.wav" type="audio/wav">
          </td>
          <td><audio controls="">
              <source src="noises/factory_1/arn_error.wav" type="audio/wav">
          </td>
          <td><audio controls="">
              <source src="noises/factory_1/deepanc_error.wav" type="audio/wav">
          </td>
          <td><audio controls="">
              <source src="noises/factory_1/fxlms_error.wav" type="audio/wav">
          </td>
        </tr>
        <tr class="transcript">
          <td></td>
          <td></td>
        </tr>

        <tr class="audio">
          <td><audio controls="">
              <source src="noises/factory_2/pt.wav" type="audio/wav">
          </td>
          <td><audio controls="">
              <source src="noises/factory_2/our_error.wav" type="audio/wav">
          </td>
          <td><audio controls="">
              <source src="noises/factory_2/arn_error.wav" type="audio/wav">
          </td>
          <td><audio controls="">
              <source src="noises/factory_2/deepanc_error.wav" type="audio/wav">
          </td>
          <td><audio controls="">
              <source src="noises/factory_2/fxlms_error.wav" type="audio/wav">
          </td>
        </tr>
        <tr class="transcript">
          <td></td>
          <td></td>
        </tr>
        <tr class="audio">
          <td><audio controls="">
              <source src="noises/factory_3/pt.wav" type="audio/wav">
          </td>
          <td><audio controls="">
              <source src="noises/factory_3/our_error.wav" type="audio/wav">
          </td>
          <td><audio controls="">
              <source src="noises/factory_3/arn_error.wav" type="audio/wav">
          </td>
          <td><audio controls="">
              <source src="noises/factory_3/deepanc_error.wav" type="audio/wav">
          </td>
          <td><audio controls="">
              <source src="noises/factory_3/fxlms_error.wav" type="audio/wav">
          </td>
        </tr>
        <tr class="transcript">
          <td></td>
          <td></td>
        </tr>
        <tr class="transcript">
          <td></td>
          <td></td>
        </tr>
      </tbody>

    </table>
  </div>

  <a id="wk">
    <h2>Active Speech Cancellation</h2>
  </a>

  <div>
    <p style="font-size: 20px;" class="centered-paragraph"> <strong> Please use headphones, as it may be difficult to
        hear otherwise.
      </strong>
    </p>
  </div>

  <p class="centered-paragraph">
    The first audio column, labeled <highlighted>Primary signal d(n)</highlighted>, represents the signal that a
    listener would hear without the application of any ASC algorithm. The subsequent column, <highlighted>DeepASC
    </highlighted>, presents the canceling signal e(n) generated by our proposed model in response to the input signal
    from the first column. The following columns, <highlighted>ARN</highlighted>, <highlighted>DeepANC</highlighted>,
    and <highlighted>FxLMS</highlighted>, display the results produced by the ARN, DeepANC, and FxLMS methods,
    respectively, for the same input signal.</p>

  <div class="table-container">
    <table class="sample-table" id="samples-table">
      <colgroup>
        <col>
      </colgroup>
      <thead>
        <tr>
          <th colspan="1">Primary signal d(n)</th>
          <th colspan="1">DeepASC</th>
          <th colspan="1">ARN</th>
          <th colspan="1">DeepANC</th>
          <th colspan="1">FxLMS</th>
        </tr>
      </thead>
      <tr class="audio">
        <td><audio controls="">
            <source src="noises/speech_1/pt.wav" type="audio/wav">
        </td>
        <td><audio controls="">
            <source src="noises/speech_1/our_error.wav" type="audio/wav">
        </td>
        <td><audio controls="">
            <source src="noises/speech_1/arn_error.wav" type="audio/wav">
        </td>
        <td><audio controls="">
            <source src="noises/speech_1/deepanc_error.wav" type="audio/wav">
        </td>
        <td><audio controls="">
            <source src="noises/speech_1/fxlms_error.wav" type="audio/wav">
        </td>
      </tr>

      <tr class="transcript">
        <td></td>
        <td></td>
      </tr>
      <tr class="audio">
        <td><audio controls="">
            <source src="noises/speech_2/pt.wav" type="audio/wav">
        </td>
        <td><audio controls="">
            <source src="noises/speech_2/our_error.wav" type="audio/wav">
        </td>
        <td><audio controls="">
            <source src="noises/speech_2/arn_error.wav" type="audio/wav">
        </td>
        <td><audio controls="">
            <source src="noises/speech_2/deepanc_error.wav" type="audio/wav">
        </td>
        <td><audio controls="">
            <source src="noises/speech_2/fxlms_error.wav" type="audio/wav">
        </td>
      </tr>
      <tr class="transcript">
        <td></td>
        <td></td>
      </tr>
      <tr class="audio">
        <td><audio controls="">
            <source src="noises/speech_3/pt.wav" type="audio/wav">
        </td>
        <td><audio controls="">
            <source src="noises/speech_3/our_error.wav" type="audio/wav">
        </td>
        <td><audio controls="">
            <source src="noises/speech_3/arn_error.wav" type="audio/wav">
        </td>
        <td><audio controls="">
            <source src="noises/speech_3/deepanc_error.wav" type="audio/wav">
        </td>
        <td><audio controls="">
            <source src="noises/speech_3/fxlms_error.wav" type="audio/wav">
        </td>
      </tr>
      <tr class="transcript">
        <td></td>
        <td></td>
      </tr>
      <tr class="audio">
        <td><audio controls="">
            <source src="noises/speech_4/pt.wav" type="audio/wav">
        </td>
        <td><audio controls="">
            <source src="noises/speech_4/our_error.wav" type="audio/wav">
        </td>
        <td><audio controls="">
            <source src="noises/speech_4/arn_error.wav" type="audio/wav">
        </td>
        <td><audio controls="">
            <source src="noises/speech_4/deepanc_error.wav" type="audio/wav">
        </td>
        <td><audio controls="">
            <source src="noises/speech_4/fxlms_error.wav" type="audio/wav">
        </td>
      </tr>
      <tr class="transcript">
        <td></td>
        <td></td>
      </tr>
      <tr class="audio">
        <td><audio controls="">
            <source src="noises/speech_5/pt.wav" type="audio/wav">
        </td>
        <td><audio controls="">
            <source src="noises/speech_5/our_error.wav" type="audio/wav">
        </td>
        <td><audio controls="">
            <source src="noises/speech_5/arn_error.wav" type="audio/wav">
        </td>
        <td><audio controls="">
            <source src="noises/speech_5/deepanc_error.wav" type="audio/wav">
        </td>
        <td><audio controls="">
            <source src="noises/speech_5/fxlms_error.wav" type="audio/wav">
        </td>
      </tr>
      <tr class="transcript">
        <td></td>
        <td></td>
      </tr>
      <tbody>

      </tbody>
    </table>
  </div>


</body>

</html>