{
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "qb8eskKzgbj-"
      },
      "source": [
        "# Parallel Adapter Inference\n",
        "\n",
        "The [`Parallel` adapter composition block](https://docs.adapterhub.ml/adapter_composition.html#parallel) allows to forward the same input through multiple adapter blocks in parallel in a single forward pass. This can be useful for multi-task inference to perform multiple tasks on a single input sequence. However, the `Parallel` can also be used to _train_ multiple adapters in parallel.\n",
        "\n",
        "In this example, we use `Parallel` to simulataneously execute named entity recognition (NER) and sentiment classification on some input sentences.\n",
        "We leverage two adapters trained independently on the _CoNLL2003_ task for NER and the _SST-2_ task for sentiment analysis.\n",
        "Both adapters are freely available via [HuggingFace's Model Hub](https://huggingface.co/models?library=adapter-transformers&sort=downloads).\n"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "Xn6C4jMXhrdj"
      },
      "source": [
        "## Installation\n",
        "\n",
        "Let's install the `adapter-transformers` library first:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 73,
      "metadata": {
        "id": "pg8T848Kfjph",
        "pycharm": {
          "is_executing": false
        }
      },
      "outputs": [],
      "source": [
        "!pip install -Uq adapter-transformers"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "AkNeNzq1ukIB"
      },
      "source": [
        "## Usage\n",
        "\n",
        "Before loading the adapter, we instantiate the model we want to use, a pre-trained `roberta-base` model from HuggingFace. We use `adapter-transformers`'s `AutoAdapterModel` class to be able to add a prediction head flexibly."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 74,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "F7FKb5-Wh9Pu",
        "outputId": "5c63bf29-64fc-42a4-c866-a5a09035abdb",
        "pycharm": {
          "is_executing": false
        }
      },
      "outputs": [
        {
          "name": "stderr",
          "output_type": "stream",
          "text": [
            "Some weights of the model checkpoint at roberta-base were not used when initializing RobertaModelWithHeads: ['lm_head.layer_norm.weight', 'lm_head.decoder.weight', 'lm_head.bias', 'lm_head.layer_norm.bias', 'lm_head.dense.weight', 'lm_head.dense.bias']\n",
            "- This IS expected if you are initializing RobertaModelWithHeads from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).\n",
            "- This IS NOT expected if you are initializing RobertaModelWithHeads from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).\n",
            "Some weights of RobertaModelWithHeads were not initialized from the model checkpoint at roberta-base and are newly initialized: ['roberta.embeddings.position_ids']\n",
            "You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.\n"
          ]
        }
      ],
      "source": [
        "from transformers import AutoTokenizer, AutoAdapterModel\n",
        "\n",
        "tokenizer = AutoTokenizer.from_pretrained(\"roberta-base\")\n",
        "model = AutoAdapterModel.from_pretrained(\"roberta-base\")"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "10RO7OsDvF1A"
      },
      "source": [
        "Using `load_adapter()`, we download and add pre-trained adapters. In our example, we use adapters hosted on the HuggingFace Model Hub, therefore we add `source=\"hf\"` to the loading method.\n",
        "\n",
        "Also note that most adapters come with a prediction head included. Thus, this method will also load the heads trained together with each adapter."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 75,
      "metadata": {
        "id": "vANZh6YgjeAQ",
        "pycharm": {
          "is_executing": false
        }
      },
      "outputs": [],
      "source": [
        "ner_adapter = model.load_adapter(\"AdapterHub/roberta-base-pf-conll2003\", source=\"hf\")\n",
        "sentiment_adapter = model.load_adapter(\"AdapterHub/roberta-base-pf-sst2\", source=\"hf\")"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "iCP3xTLXvwt2"
      },
      "source": [
        "Now's when the `Parallel` block comes into play: With `set_active_adapters()`, we specify an adapter setup that uses the two adapters we just loaded in parallel."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 76,
      "metadata": {
        "id": "Khgmm6OQvvku",
        "pycharm": {
          "is_executing": false
        }
      },
      "outputs": [],
      "source": [
        "import transformers.adapters.composition as ac\n",
        "\n",
        "model.set_active_adapters(ac.Parallel(ner_adapter, sentiment_adapter))"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "up1pBqO5qWCR"
      },
      "source": [
        "With everything set up, the only thing left to do is to let our model run. For this purpose, we use a small helper method that calls the model forward pass and processes the outputs of the two prediction heads."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 77,
      "metadata": {
        "id": "wL9UDMapcKtY"
      },
      "outputs": [],
      "source": [
        "import torch\n",
        "\n",
        "def analyze_sentence(sentence):\n",
        "  tokens = tokenizer.tokenize(sentence)\n",
        "  input_ids = torch.tensor(tokenizer.convert_tokens_to_ids(tokens))\n",
        "  outputs = model(input_ids)\n",
        "\n",
        "  # Post-process NER output\n",
        "  ner_labels_map = model.get_labels_dict(ner_adapter)\n",
        "  ner_label_ids = torch.argmax(outputs[0].logits, dim=2).numpy().squeeze().tolist()\n",
        "  ner_labels = [ner_labels_map[id_] for id_ in ner_label_ids]\n",
        "  annotated = []\n",
        "  for token, label_id in zip(tokens, ner_label_ids):\n",
        "    token = token.replace('\\u0120', '')\n",
        "    label = ner_labels_map[label_id]\n",
        "    annotated.append(f\"{token}<{label}>\")\n",
        "  print(\"NER: \" + \" \".join(annotated))\n",
        "\n",
        "  # Post-process sentiment output\n",
        "  sentiment_labels = model.get_labels_dict(sentiment_adapter)\n",
        "  label_id = torch.argmax(outputs[1].logits).item()\n",
        "  print(\"Sentiment: \" + sentiment_labels[label_id])\n",
        "  print()"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "PmNW-8mQquTj"
      },
      "source": [
        "Let's test our pipeline with some example sentences (taken from the XSum training set):"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 78,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "n_YCRbP0dKsU",
        "outputId": "e08ce678-1687-481a-eb37-0da95612b901"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "NER: A<O> man<O> in<O> central<O> Germany<B-LOC> tried<O> to<O> leave<O> his<O> house<O> by<O> the<O> front<O> door<O> only<O> to<O> find<O> a<O> brick<O> wall<O> there<O> .<O>\n",
            "Sentiment: negative\n",
            "\n",
            "NER: The<O> Met<B-ORG> Office<I-ORG> has<O> issued<O> a<O> yellow<O> weather<O> warning<O> for<O> ice<O> across<O> most<O> of<O> Wales<B-LOC> .<O>\n",
            "Sentiment: negative\n",
            "\n",
            "NER: A<O> vibrant<O> animation<O> telling<O> stories<O> of<O> indigenous<O> Australia<B-LOC> will<O> be<O> projected<O> on<O> to<O> the<O> Sydney<B-LOC> Opera<I-ORG> House<I-LOC> every<O> night<O> at<O> sunset<O> .<O>\n",
            "Sentiment: positive\n",
            "\n"
          ]
        }
      ],
      "source": [
        "sentences = [\n",
        "  \"A man in central Germany tried to leave his house by the front door only to find a brick wall there.\",\n",
        "  \"The Met Office has issued a yellow weather warning for ice across most of Wales.\",\n",
        "  \"A vibrant animation telling stories of indigenous Australia will be projected on to the Sydney Opera House every night at sunset.\"\n",
        "]\n",
        "\n",
        "for sentence in sentences:\n",
        "  analyze_sentence(sentence)"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "koFncfqFq5tC"
      },
      "source": [
        "Voilá! Each sentence is annotated using NER tags and classified based on its sentiment.\n",
        "\n",
        "**How does it work?** At the first occurrence of an adapter layer, `adapter-transformers` will automatically replicate the input by the number of adapters. This mechanism is especially useful if only later Transformers layers include adapters as the input will be replicated as late as possible.\n",
        "\n",
        "**Where to go from here?**\n",
        "\n",
        "➡️ Make sure to check out the [corresponding chapter in the documentation](https://docs.adapterhub.ml/adapter_composition.html) to learn more about adapter composition and the `Parallel` block.\n",
        "\n",
        "➡️ Also check out [Rücklé et al., 2021](https://arxiv.org/pdf/2010.11918.pdf) who also use parallel inference in their analysis of adapter efficiency.\n",
        "\n",
        "➡️ To see more `adapter-transformers` features in action, visit our [notebooks folder on GitHub](https://github.com/Adapter-Hub/adapter-transformers/tree/master/notebooks)."
      ]
    }
  ],
  "metadata": {
    "accelerator": "GPU",
    "colab": {
      "collapsed_sections": [],
      "name": "Parallel_Adapter_Inference.ipynb",
      "provenance": [],
      "toc_visible": true
    },
    "kernelspec": {
      "display_name": "Python 3",
      "language": "python",
      "name": "python3"
    },
    "language_info": {
      "codemirror_mode": {
        "name": "ipython",
        "version": 3
      },
      "file_extension": ".py",
      "mimetype": "text/x-python",
      "name": "python",
      "nbconvert_exporter": "python",
      "pygments_lexer": "ipython3",
      "version": "3.7.9"
    },
    "pycharm": {
      "stem_cell": {
        "cell_type": "raw",
        "metadata": {
          "collapsed": false
        },
        "source": []
      }
    }
  },
  "nbformat": 4,
  "nbformat_minor": 0
}
