{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": 38,
   "id": "0a1884c3-daa5-49e3-bddb-b55cb1a3f43a",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Unnamed: 0</th>\n",
       "      <th>document</th>\n",
       "      <th>target</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>0</td>\n",
       "      <td>\\t\\t   \"   On-lineOnline registration at the\\t\\t of the st Indies, Mona pus.\"\\t\\t  Thesis statement: Registering   on-lineonline\\t\\tis better than the previous method used since it is time consuming and easy\\t\\tto work with irrespective of the few hang-ups in the introductory stage.\\t\\t  On hearing the statement,   on-lineonline\\t\\tregistration, the first thing that is conveyed to mind is registering \\t\\tvia on computer.\\t   on-lineonline registration is a new system set\\t\\tup at the  of the...</td>\n",
       "      <td>Jamaica</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>1</td>\n",
       "      <td>24.10.95   Dear Joan  Your mother is trying to get everyone to send you a card for your birthday so as I am sitting here alone on a wet windy Tuesday morning I thought I would send you a few lines  Summer is well and truly gone now   altough   although   October has been the warmest on record.   you know Dad died on the 18th Sept aged 90.  He just died of old age, knew he was going,   any way   anyway   I am sure he is now setting about to organise heaven to his needs   all miss him very m...</td>\n",
       "      <td>Ireland</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>2</td>\n",
       "      <td>-028$A&gt;   CANNABIS    Cannabis is one of the oldest plants cultivated by man.  Archaeological evidence from a  Age village, excavated on the island of , suggests that mankind has been using the plant  Cannabis sativa  from earliest times.   The cannabis plant is a very adaptable annual, which can grow in most parts of the world including .  In its chequered career, it has been grown for its long fibres known as hemp fibre, for its seed (hemp seed), used as a source of oil and for bird seed, ...</td>\n",
       "      <td>Ireland</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>3</td>\n",
       "      <td>The  BLACK LEOPARD\\t\\tBy Risidra s\\t\\tTHEY WERE FIRST SPOTTED in the 1940s and in the 1960s in  and in Hiniduma. wever there is a possibility that these rare animals known as black leopards may have been spotted by villagers even before the 1940s but not recorded in the country's history.\\t\\tBlack pards are considered to be endangered and a rare sighting in the country. wever environmentalists and wildlife experts believe the animal still exists in the country after having discovered three c...</td>\n",
       "      <td>SriLanka</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>4</td>\n",
       "      <td>\\t\\t  The author of this quotation is T.S. .\\t It is taken from his work \" The steland\".\\t The significance is that this is at the end of the poem with reference\\t\\tto the Fisher , the speaker is looking out at the steland and wanders\\t\\twhat to do with the ruins of stern culture that lay before him.\\t Is there anything that he can do with the wreckage of civilization  ?\\t\\t\\t Should he try to build a dam   unclear word \\t\\t to stop culture from crumbling? \\t\\t  The author of this quotation ...</td>\n",
       "      <td>Canada</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2283</th>\n",
       "      <td>2283</td>\n",
       "      <td>\\t\\t\\t Theatre of the brave   \\t\\t\\t  Ko is one of ng Kong  ' s great theatrical showmen but\\tcan he create a local  ?  Desiree   reports  \\t\\t \\t\\t TWO DAYS BEFORE  the premiere of time\\tProduction  ' s latest show ,  Jubilee  , and things are not falling\\tinto place .\\t\\t\\t Is ng Kong going backwards ?   Ko , head of\\ttime Production asks after putting down the receiver .\\t\\t\\t People really need to be more flexible , instead of being so\\tstiff and bureaucratic .  \\t\\t \\t\\tThe problem is ...</td>\n",
       "      <td>HongKong</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2284</th>\n",
       "      <td>2284</td>\n",
       "      <td>TWO WOMEN FOUND DEAD   Hill and Nigel Gould   Carrick murder inquiry   A murder inquiry was launched today after a mother and daughter were found strangled in a house in Carrickfergus.   The bodies of Kate Curran, in her 50s, and her 32-year-old daughter Angela were discovered in Mrs Curran 's vale Avenue home.   Another daughter, Patricia, is recovering from stab wounds after being slashed in the neck and wrists at her tion Road home two miles away.   A man - believed to be her estranged ...</td>\n",
       "      <td>Ireland</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2285</th>\n",
       "      <td>2285</td>\n",
       "      <td>\\t  MICROSOFT CORPORATION\\t  GENERAL\\t\\t  Microsoft Corporation ( the \" Company\" or \" Microsoft\") was\\t\\tfounded as a partnership in 1975 and was incorporated in 1981.\\t The Company operates in one business segment - the development,\\t\\tmanufacture, marketing, licensing, and support of a wide range of software\\t\\tproducts, including operating systems for personal computers ( PCs), office\\t\\tmachines, and personal information devices; applications programs; and\\t\\tlanguages; as well as person...</td>\n",
       "      <td>USA</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2286</th>\n",
       "      <td>2286</td>\n",
       "      <td>\\t\\t\\t16 Oct 2000\\t\\t\\tMINISTRY OF EDUCATION NOTIFICATION ( GS/16/00)\\t\\t\\t( General Information)\\t\\t\\t SCHEME FOR ENHANCING SUBJECT PROFICIENCY SPONSORSHIP FOR BSC (\\tPHYSICS) PROGRAMME ( FULL-TIME) AT THE NATIONAL UNIVERSITY OF SINGAPORE\\tSTARTING JUL 2001 \\t\\t\\tAPPLICATIONS TO BE SUBMITTED BY 6 NOV 2000\\t\\t\\t1) SESP ( Scheme for Enhancing Subject Proficiency) was first launched\\tin 1997 for secondary school graduate teachers to study one subject in the\\tundergraduate programme at NUS or N...</td>\n",
       "      <td>Singapore</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2287</th>\n",
       "      <td>2287</td>\n",
       "      <td>\\t\\t\\t Woman fined for driving with twin sister's IC and licence \\t\\t\\tBy Tan Ooi on\\t\\t\\tA WOMAN who did not have a driving licence thought she could fool the\\tpolice by driving with her twin sister's identity card and licence.\\t\\t\\tBut what she did not count on was the police receiving earlier\\tinformation on the ploy and setting a trap for her.\\t\\t\\tsterday, Lim Yan Peng 21, was fined a total of\\t$7,000 and banned from driving for a year after she pleaded guilty to using\\ther sister's ide...</td>\n",
       "      <td>Singapore</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>2288 rows × 3 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "      Unnamed: 0  \\\n",
       "0              0   \n",
       "1              1   \n",
       "2              2   \n",
       "3              3   \n",
       "4              4   \n",
       "...          ...   \n",
       "2283        2283   \n",
       "2284        2284   \n",
       "2285        2285   \n",
       "2286        2286   \n",
       "2287        2287   \n",
       "\n",
       "                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 document  \\\n",
       "0     \\t\\t   \"   On-lineOnline registration at the\\t\\t of the st Indies, Mona pus.\"\\t\\t  Thesis statement: Registering   on-lineonline\\t\\tis better than the previous method used since it is time consuming and easy\\t\\tto work with irrespective of the few hang-ups in the introductory stage.\\t\\t  On hearing the statement,   on-lineonline\\t\\tregistration, the first thing that is conveyed to mind is registering \\t\\tvia on computer.\\t   on-lineonline registration is a new system set\\t\\tup at the  of the...   \n",
       "1       24.10.95   Dear Joan  Your mother is trying to get everyone to send you a card for your birthday so as I am sitting here alone on a wet windy Tuesday morning I thought I would send you a few lines  Summer is well and truly gone now   altough   although   October has been the warmest on record.   you know Dad died on the 18th Sept aged 90.  He just died of old age, knew he was going,   any way   anyway   I am sure he is now setting about to organise heaven to his needs   all miss him very m...   \n",
       "2     -028$A>   CANNABIS    Cannabis is one of the oldest plants cultivated by man.  Archaeological evidence from a  Age village, excavated on the island of , suggests that mankind has been using the plant  Cannabis sativa  from earliest times.   The cannabis plant is a very adaptable annual, which can grow in most parts of the world including .  In its chequered career, it has been grown for its long fibres known as hemp fibre, for its seed (hemp seed), used as a source of oil and for bird seed, ...   \n",
       "3     The  BLACK LEOPARD\\t\\tBy Risidra s\\t\\tTHEY WERE FIRST SPOTTED in the 1940s and in the 1960s in  and in Hiniduma. wever there is a possibility that these rare animals known as black leopards may have been spotted by villagers even before the 1940s but not recorded in the country's history.\\t\\tBlack pards are considered to be endangered and a rare sighting in the country. wever environmentalists and wildlife experts believe the animal still exists in the country after having discovered three c...   \n",
       "4     \\t\\t  The author of this quotation is T.S. .\\t It is taken from his work \" The steland\".\\t The significance is that this is at the end of the poem with reference\\t\\tto the Fisher , the speaker is looking out at the steland and wanders\\t\\twhat to do with the ruins of stern culture that lay before him.\\t Is there anything that he can do with the wreckage of civilization  ?\\t\\t\\t Should he try to build a dam   unclear word \\t\\t to stop culture from crumbling? \\t\\t  The author of this quotation ...   \n",
       "...                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   ...   \n",
       "2283   \\t\\t\\t Theatre of the brave   \\t\\t\\t  Ko is one of ng Kong  ' s great theatrical showmen but\\tcan he create a local  ?  Desiree   reports  \\t\\t \\t\\t TWO DAYS BEFORE  the premiere of time\\tProduction  ' s latest show ,  Jubilee  , and things are not falling\\tinto place .\\t\\t\\t Is ng Kong going backwards ?   Ko , head of\\ttime Production asks after putting down the receiver .\\t\\t\\t People really need to be more flexible , instead of being so\\tstiff and bureaucratic .  \\t\\t \\t\\tThe problem is ...   \n",
       "2284    TWO WOMEN FOUND DEAD   Hill and Nigel Gould   Carrick murder inquiry   A murder inquiry was launched today after a mother and daughter were found strangled in a house in Carrickfergus.   The bodies of Kate Curran, in her 50s, and her 32-year-old daughter Angela were discovered in Mrs Curran 's vale Avenue home.   Another daughter, Patricia, is recovering from stab wounds after being slashed in the neck and wrists at her tion Road home two miles away.   A man - believed to be her estranged ...   \n",
       "2285  \\t  MICROSOFT CORPORATION\\t  GENERAL\\t\\t  Microsoft Corporation ( the \" Company\" or \" Microsoft\") was\\t\\tfounded as a partnership in 1975 and was incorporated in 1981.\\t The Company operates in one business segment - the development,\\t\\tmanufacture, marketing, licensing, and support of a wide range of software\\t\\tproducts, including operating systems for personal computers ( PCs), office\\t\\tmachines, and personal information devices; applications programs; and\\t\\tlanguages; as well as person...   \n",
       "2286  \\t\\t\\t16 Oct 2000\\t\\t\\tMINISTRY OF EDUCATION NOTIFICATION ( GS/16/00)\\t\\t\\t( General Information)\\t\\t\\t SCHEME FOR ENHANCING SUBJECT PROFICIENCY SPONSORSHIP FOR BSC (\\tPHYSICS) PROGRAMME ( FULL-TIME) AT THE NATIONAL UNIVERSITY OF SINGAPORE\\tSTARTING JUL 2001 \\t\\t\\tAPPLICATIONS TO BE SUBMITTED BY 6 NOV 2000\\t\\t\\t1) SESP ( Scheme for Enhancing Subject Proficiency) was first launched\\tin 1997 for secondary school graduate teachers to study one subject in the\\tundergraduate programme at NUS or N...   \n",
       "2287  \\t\\t\\t Woman fined for driving with twin sister's IC and licence \\t\\t\\tBy Tan Ooi on\\t\\t\\tA WOMAN who did not have a driving licence thought she could fool the\\tpolice by driving with her twin sister's identity card and licence.\\t\\t\\tBut what she did not count on was the police receiving earlier\\tinformation on the ploy and setting a trap for her.\\t\\t\\tsterday, Lim Yan Peng 21, was fined a total of\\t$7,000 and banned from driving for a year after she pleaded guilty to using\\ther sister's ide...   \n",
       "\n",
       "         target  \n",
       "0       Jamaica  \n",
       "1       Ireland  \n",
       "2       Ireland  \n",
       "3      SriLanka  \n",
       "4        Canada  \n",
       "...         ...  \n",
       "2283   HongKong  \n",
       "2284    Ireland  \n",
       "2285        USA  \n",
       "2286  Singapore  \n",
       "2287  Singapore  \n",
       "\n",
       "[2288 rows x 3 columns]"
      ]
     },
     "execution_count": 38,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "import pandas as pd\n",
    "pd.set_option('display.max_colwidth', 500)\n",
    "df = pd.read_csv('new.csv')\n",
    "df"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 39,
   "id": "2c382326-fd90-419b-8aa9-98534744683b",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>Unnamed: 0</th>\n",
       "      <th>document</th>\n",
       "      <th>target</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>4</td>\n",
       "      <td>\\t\\t  The author of this quotation is T.S. .\\t It is taken from his work \" The steland\".\\t The significance is that this is at the end of the poem with reference\\t\\tto the Fisher , the speaker is looking out at the steland and wanders\\t\\twhat to do with the ruins of stern culture that lay before him.\\t Is there anything that he can do with the wreckage of civilization  ?\\t\\t\\t Should he try to build a dam   unclear word \\t\\t to stop culture from crumbling? \\t\\t  The author of this quotation ...</td>\n",
       "      <td>Canada</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>6</th>\n",
       "      <td>6</td>\n",
       "      <td>\\t   dge-based systems and operational hydrology \\t\\t 1      1  The National Research\\t\\tCouncil of 's sociate Committee on Hydrology identifies, solicits,\\t\\tand promotes the preparation of state-of-art papers on hydrological topics\\t\\tthat require research.\\t The Committee has requested the preparation of this report and is\\t\\tpleased to bring it to your attention.\\t The views expressed are those of the author.  \\t Slobodan P. onovic\\t\\t  dge-based systems were brought to the attention of\\...</td>\n",
       "      <td>Canada</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>12</th>\n",
       "      <td>12</td>\n",
       "      <td>\\t  AFFIRMATIVE ACTION: MAKING EQUALS EQUAL \\t\\t  Since the early part of this century women have gradually gained\\t\\taccess to compete for better jobs in .\\t They have however, been prevented from success in their attempt to\\t\\tbecome equals in the workplace through systemic discrimination.\\t  well as women; native dians, visible minorities and persons\\t\\twith disabilities have also been subjected to discriminative hiring policies\\t\\tand practices by a majority of employers in .\\t In a soci...</td>\n",
       "      <td>Canada</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>18</th>\n",
       "      <td>18</td>\n",
       "      <td>\\t   THE LIVING PRAIRIE MUSEUM  \\t   Keeping the tall-grass prairie alive and well in the heart\\t\\tof   \\t   By  R. Kynman  \\t\\t  It's the middle of a hot and breezy summer afternoon on the\\t\\tprairie and all around the hum of grass-hoppers, crickets and sparrows drones\\t\\ton.\\t A small group of school children follows a naturalist on a voyage of\\t\\texploration through one of the most unusual museums in .\\t On all sides the concrete dwellings of a modern city are held at bay\\t\\tby a low chai...</td>\n",
       "      <td>Canada</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>27</th>\n",
       "      <td>27</td>\n",
       "      <td>\\t   A woman, as I see it, is more like moss or lichen ... as\\t\\tshe takes to her husband.  \\t\\t  In Patrick 's  A Fringe of Leaves  , this phrase\\t\\tspoken by Ellen near the end of the novel is something I believe she picked\\t\\tup from her friend when she first escaped the bush.\\t The essence of the phrase is that women cling to men as a source of\\t\\ttheir identity.\\t Throughout the novel, Ellen   percieves   perceives \\t\\t herself as either Mrs ugh or Ellen Gluyas: this distinction in\\t\\tE...</td>\n",
       "      <td>Canada</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2209</th>\n",
       "      <td>2209</td>\n",
       "      <td>\\t Dear    , \\t\\t  I'm delighted that you're putting together a supplement to\\t\\tcelebrate the anniversary of the MBA programme and draw attention to its\\t\\tachievements. \\t\\t   10 isn't very far away, especially with a couple  days'\\t\\t holiday in the middle, and February being a short month!\\t I thought it would be useful to set some time-lines to avoid\\t\\tconfusion. \\t\\t  I understand  Dani Di Franco  is doing the design, and I\\t\\tknow she'll do a good job for you. \\t\\t  She says you wil...</td>\n",
       "      <td>Canada</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2210</th>\n",
       "      <td>2210</td>\n",
       "      <td>\\t  The  shoot \\t\\t  Jock  got the assignment of a lifetime when he was sent to\\t\\tphotograph Marilyn  on the set of .\\t She wasn't at all what he'd expected ... \\t\\t  It was over forty years ago on Thursday, June 5, 1952.\\t Harry Truman was president, the Cold r had escalated into the Korean\\t\\tr, the  had set off its first nuclear bomb, and the popular\\t\\tsongs of the year were \" ses Sweeter Than Wine\" and agy chael's\\t\\tversion of \" In the Cool, Cool, Cool of the Evening\". \\t\\t  I was abo...</td>\n",
       "      <td>Canada</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2243</th>\n",
       "      <td>2243</td>\n",
       "      <td>\\t   Minister, ML meet with ropean parliamentarians in\\t\\t, a  \\t\\t  The fur industry was the topic of conversation as three members of\\t\\tthe legislative assembly met with members of the ro  pean parliament\\t\\tlast dnesday. \\t\\t  Renewable Resources Minister s Allooloo and ML Don in and\\t\\tPeter Ernerk travelled to , ., for one day.\\t MLA John Ningark ( Natilikmeot) decided to stay in llowknife after\\t\\tsome members raised con  cerns about spending. \\t\\t  Allooloo said pressure is mounting ...</td>\n",
       "      <td>Canada</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2252</th>\n",
       "      <td>2252</td>\n",
       "      <td>\\t   LETTER TO THE QUEBEC COUNCIL OF BISHOPS  (   Nov.\\t\\t  November   1996) \\t   Introduction  \\t\\t  While separation of ch and State is a concept which has been\\t\\tassimilated in , and we may agree that the government should not\\t\\tinterfere in religion, and although some may feel that in Quebec the clergy\\t\\thave historically aroused resentment by exercising wide authority on matters\\t\\tboth religious and secular, still it is both false and destructive to hold\\t\\tthat faith and religious ...</td>\n",
       "      <td>Canada</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2270</th>\n",
       "      <td>2270</td>\n",
       "      <td>\\t  Extradition hearing delayed for drug suspects \\t\\t  Three men suspected of heading the dian connection in an\\t\\tinternational drug-smuggling ring were granted a postponement in an\\t\\textradition hearing yesterday. \\t\\t  Michel Chouinard, 46,  Doyer, 43, and Jean uthillier, 59,\\t\\tare wanted by the U.S. drug enforcement authorities.\\t It is alleged they imported tonnes of marijuana to 's Gulf\\t\\tCoast. \\t\\t  Doyer and Chouinard, who operated Domaine Montjoye ski centre near\\t\\tNorth Hatle...</td>\n",
       "      <td>Canada</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>200 rows × 3 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "      Unnamed: 0  \\\n",
       "4              4   \n",
       "6              6   \n",
       "12            12   \n",
       "18            18   \n",
       "27            27   \n",
       "...          ...   \n",
       "2209        2209   \n",
       "2210        2210   \n",
       "2243        2243   \n",
       "2252        2252   \n",
       "2270        2270   \n",
       "\n",
       "                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 document  \\\n",
       "4     \\t\\t  The author of this quotation is T.S. .\\t It is taken from his work \" The steland\".\\t The significance is that this is at the end of the poem with reference\\t\\tto the Fisher , the speaker is looking out at the steland and wanders\\t\\twhat to do with the ruins of stern culture that lay before him.\\t Is there anything that he can do with the wreckage of civilization  ?\\t\\t\\t Should he try to build a dam   unclear word \\t\\t to stop culture from crumbling? \\t\\t  The author of this quotation ...   \n",
       "6     \\t   dge-based systems and operational hydrology \\t\\t 1      1  The National Research\\t\\tCouncil of 's sociate Committee on Hydrology identifies, solicits,\\t\\tand promotes the preparation of state-of-art papers on hydrological topics\\t\\tthat require research.\\t The Committee has requested the preparation of this report and is\\t\\tpleased to bring it to your attention.\\t The views expressed are those of the author.  \\t Slobodan P. onovic\\t\\t  dge-based systems were brought to the attention of\\...   \n",
       "12    \\t  AFFIRMATIVE ACTION: MAKING EQUALS EQUAL \\t\\t  Since the early part of this century women have gradually gained\\t\\taccess to compete for better jobs in .\\t They have however, been prevented from success in their attempt to\\t\\tbecome equals in the workplace through systemic discrimination.\\t  well as women; native dians, visible minorities and persons\\t\\twith disabilities have also been subjected to discriminative hiring policies\\t\\tand practices by a majority of employers in .\\t In a soci...   \n",
       "18    \\t   THE LIVING PRAIRIE MUSEUM  \\t   Keeping the tall-grass prairie alive and well in the heart\\t\\tof   \\t   By  R. Kynman  \\t\\t  It's the middle of a hot and breezy summer afternoon on the\\t\\tprairie and all around the hum of grass-hoppers, crickets and sparrows drones\\t\\ton.\\t A small group of school children follows a naturalist on a voyage of\\t\\texploration through one of the most unusual museums in .\\t On all sides the concrete dwellings of a modern city are held at bay\\t\\tby a low chai...   \n",
       "27    \\t   A woman, as I see it, is more like moss or lichen ... as\\t\\tshe takes to her husband.  \\t\\t  In Patrick 's  A Fringe of Leaves  , this phrase\\t\\tspoken by Ellen near the end of the novel is something I believe she picked\\t\\tup from her friend when she first escaped the bush.\\t The essence of the phrase is that women cling to men as a source of\\t\\ttheir identity.\\t Throughout the novel, Ellen   percieves   perceives \\t\\t herself as either Mrs ugh or Ellen Gluyas: this distinction in\\t\\tE...   \n",
       "...                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   ...   \n",
       "2209   \\t Dear    , \\t\\t  I'm delighted that you're putting together a supplement to\\t\\tcelebrate the anniversary of the MBA programme and draw attention to its\\t\\tachievements. \\t\\t   10 isn't very far away, especially with a couple  days'\\t\\t holiday in the middle, and February being a short month!\\t I thought it would be useful to set some time-lines to avoid\\t\\tconfusion. \\t\\t  I understand  Dani Di Franco  is doing the design, and I\\t\\tknow she'll do a good job for you. \\t\\t  She says you wil...   \n",
       "2210  \\t  The  shoot \\t\\t  Jock  got the assignment of a lifetime when he was sent to\\t\\tphotograph Marilyn  on the set of .\\t She wasn't at all what he'd expected ... \\t\\t  It was over forty years ago on Thursday, June 5, 1952.\\t Harry Truman was president, the Cold r had escalated into the Korean\\t\\tr, the  had set off its first nuclear bomb, and the popular\\t\\tsongs of the year were \" ses Sweeter Than Wine\" and agy chael's\\t\\tversion of \" In the Cool, Cool, Cool of the Evening\". \\t\\t  I was abo...   \n",
       "2243  \\t   Minister, ML meet with ropean parliamentarians in\\t\\t, a  \\t\\t  The fur industry was the topic of conversation as three members of\\t\\tthe legislative assembly met with members of the ro  pean parliament\\t\\tlast dnesday. \\t\\t  Renewable Resources Minister s Allooloo and ML Don in and\\t\\tPeter Ernerk travelled to , ., for one day.\\t MLA John Ningark ( Natilikmeot) decided to stay in llowknife after\\t\\tsome members raised con  cerns about spending. \\t\\t  Allooloo said pressure is mounting ...   \n",
       "2252  \\t   LETTER TO THE QUEBEC COUNCIL OF BISHOPS  (   Nov.\\t\\t  November   1996) \\t   Introduction  \\t\\t  While separation of ch and State is a concept which has been\\t\\tassimilated in , and we may agree that the government should not\\t\\tinterfere in religion, and although some may feel that in Quebec the clergy\\t\\thave historically aroused resentment by exercising wide authority on matters\\t\\tboth religious and secular, still it is both false and destructive to hold\\t\\tthat faith and religious ...   \n",
       "2270  \\t  Extradition hearing delayed for drug suspects \\t\\t  Three men suspected of heading the dian connection in an\\t\\tinternational drug-smuggling ring were granted a postponement in an\\t\\textradition hearing yesterday. \\t\\t  Michel Chouinard, 46,  Doyer, 43, and Jean uthillier, 59,\\t\\tare wanted by the U.S. drug enforcement authorities.\\t It is alleged they imported tonnes of marijuana to 's Gulf\\t\\tCoast. \\t\\t  Doyer and Chouinard, who operated Domaine Montjoye ski centre near\\t\\tNorth Hatle...   \n",
       "\n",
       "      target  \n",
       "4     Canada  \n",
       "6     Canada  \n",
       "12    Canada  \n",
       "18    Canada  \n",
       "27    Canada  \n",
       "...      ...  \n",
       "2209  Canada  \n",
       "2210  Canada  \n",
       "2243  Canada  \n",
       "2252  Canada  \n",
       "2270  Canada  \n",
       "\n",
       "[200 rows x 3 columns]"
      ]
     },
     "execution_count": 39,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "df = df[df.target == \"Canada\"]\n",
    "df"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 35,
   "id": "5d437fdb-e91a-4313-9bf1-20b66f09dc7f",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "['\\t\\t  The first sector to capitalize on genetic engineering was health\\t\\tcare, particularly in the US.\\t In 1983, 10 years after Boyer and Cohen performed the first\\t\\trecombinant DNA experiment, scientists cultivated large quanti  ties of\\t\\tthe human gene responsible for synthesizing insulin and marketed it as a\\t\\ttreatment for diabetics.\\t Among the most commercially significant products to emerge since the\\t\\tearly 1980s, in addition to insulin, are human growth hormone ( which combats\\t\\tdwarfism), alpha interferon ( an immune-system stimulator which helps ward\\t\\toff disease), tissue plasmino  gen activator ( which can dissolve blood\\t\\tclots in heart-attack patients), a hepatitis-B vaccine, Ortho \\t\\tPharmaceuticals\\'  OKT-3 ( which helps prevent kidney transplant\\t\\trejection), and a drug called Epogen ( which fights ane  mia).\\t Annual worldwide sales of just these seven products have reached\\t\\t&dollar;1.25 billion US ( Gianturco 208).\\t Worldwide sales of all biotechnology products last year reached\\t\\t&dollar;6 billion.\\t By the turn of the century sales are expected to soar to more than\\t\\t&dollar;100 billion (  Canadian Biotech  12). 
\\t\\t  That remarkable growth is now being fuelled by rapidly evolving\\t\\tpatent legislation.\\t In perhaps the most sensational event in the brief history of modern\\t\\tbiotechnology, the US Patent and Trademark Office in 1987 passed a policy\\t\\tstating that it  \" considers non-naturally occurring nonhu  man\\t\\tmulticellular living organisms, including animals, to be patentable subject\\t\\tmatter\"  ( Wallis 78).\\t That declaration prompted fierce opposition from animal-rights\\t\\tactivists who feared the policy would promote the exploratory production of\\t\\tmisfit animals incapable of fighting pain or dis  ease.\\t In Canada, after 10 years of controversy, Parliament this summer\\t\\tpassed Bill C-15, the plant  breeders\\'  rights act.\\t Though not as sweeping as the US policy, the legislation for the\\t\\tfirst time puts plant breeders on a par with inventors of machinery and\\t\\tmanufacturing processes, permitting them to protect their new plant varieties\\t\\twith 18-year patents.\\t As a result, plant breeders will not only be able to collect\\t\\troyalties for the seeds they sell but also for the seed adult plants\\t\\tsubsequently produce.\\t Since the bill was introduced in its original form by the liberal\\t\\tgovernment in 1980, some farm groups, led by the National  Farmers\\' \\t\\tUnion, have been insist  ing it will hand over control of the food supply\\t\\tto a handful of large multi-national corporations. 
\\t\\t  Even in the absence of incentive legislation, however, the\\t\\tCanadian biotechnology industry\\'s growth to date has been impressive.\\t Sales of Canadian biotechnology products reached &dollar;660 million\\t\\tin 1988, while the industry as a whole posted a net after-tax loss of only\\t\\t&dollar;3 million, a remarkably strong performance for an infant industry\\t\\tthat is having to invest considerably in manufacturing facilities and\\t\\tmarketing efforts.\\t Dur  ing that same year, Canada\\'s biotechnology companies spent a\\t\\tsubstantial &dollar;275 million on research and development.\\t While many of Canada\\'s biotechnology enterprises are newly created\\t\\tdepartments of large indus  trial firms, most are privately owned\\t\\tstart-ups employing fewer than 50 people. \\t\\t  Building on Canada\\'s historical strength in agricultural,\\t\\tmetallurgical, and forestry research, the industry is distinguished by its\\t\\tprimary focus on resource-based products and by its diversity.\\t A recent survey of the indus  try sponsored by the Department of\\t\\tIndustry, Science and Technology and the National Research Council ( NRC )\\t\\treveals that:   \\t The products and processes invented and sold by Canadian\\t\\tbiotechnology compa  nies involve nearly every industrial sector.\\t They include cloned varieties of orna  mental plants, bioleaching\\t\\tin the mining of uranium and gold, quick tip-of-the-tongue tests to measure\\t\\tblood alcohol, anaerobic digestion systems for the treatment of pulp mill\\t\\teffluents, the world\\'s first conjugate vaccine, cattle improvement through\\t\\tnuclear transplantation and embryo cloning, monoclonal antibodies for blood\\t\\ttyping, soil microbes to improve plant growth, diagnostic kits for AIDS, the\\t\\tbrewing of beer, biological pesticides, and mass production of bio \\t\\tlogical reagents from eggs and plants. 
(  Canadian Biotech  1)\\t\\t  \\t\\t  According to that survey, more than 10,000 products are currently\\t\\tunder development, 76% of which are being developed by seed compa  nies.\\t\\t\\t\\t  The Canadian industry\\'s plans for future growth are equally\\t\\timpres  sive.\\t Companies plan to spend almost &dollar;7 million each on new manufac\\t\\t turing facilities by 1992.\\t By the same year, the industry expects to hire 5,000 new employees,\\t\\tnearly doubling its 1989 work-force.\\t Sales are expected to grow at an average annual rate of 46% between\\t\\t1988 and 1992, reaching &dollar;5 billion.\\t The expectations seem realistic, too.\\t About half of the 220 companies involved in biotechnology in Canada\\t\\treported net profits in 1988.\\t Overall, the firms are less indebted than those of many established\\t\\tindustrial sectors, with an average debt-to-equity ratio in 1988 of 0.47.\\t The average assets-to-liability ratio for the industry was a\\t\\trelatively healthy 2.3 (  Canadian Biotech  6). \\t\\t  Such statistics reveal a vibrant and remarkably strong industry.\\t Yet the overall picture points to more than an increase in efficiency\\t\\tand quantity of research.\\t It also reflects an impressive shift in demographics.\\t Private-sector involvement is rapidly growing against public-sector\\t\\tactivity.\\t Where once research proceeded almost exclusively according to\\t\\tpublic-policy objectives and at the discretion of government funding bodies,\\t\\tprivate capital is fuelling an expanding proportion of the new growth.\\t The dynamics of the marketplace will inevitably play an increasingly\\t\\tvital role in the guidance of that growth.\\t Clearly, the power and priorities of Cana  dian biotechnology\\t\\tresearch are changing.\\t What will some of those early priorities be, and what are the\\t\\timplications for public policy? 
\\t  II \\t\\t  The risks associated with the progress of biotechnology can be\\t\\tdivided into two categories, those that arise from the research activities\\t\\tthemselves and those that arise from the application of its products.\\t With the Canadian biotechnology industry still in its infancy, and\\t\\tthe bulk of its products still in development, the most imminent risks to\\t\\tconsider stem from the pro  cess, as opposed to the application, of\\t\\tresearch.\\t Ominous signs of the level of ignorance among biotechnology\\t\\tresearchers of the regulations that apply to their industry are already\\t\\tevident.\\t In the survey of 84 companies conducted by the federal government and\\t\\tthe NRC from November 1988 to February 1989, only 29 respondents were\\t\\tfamiliar with the Canadian Environmental Protection Act and its implications\\t\\tfor industrial biotech  nology.\\t Although ignorance of regulations will not be tolerated as a plea by\\t\\tthe courts, that can only be cold comfort when irresponsible practices have a\\t\\tvast potential to alter - or even destroy - parts of   the   the\\t\\t  environ  ment.\\t If biotechnology has the capacity to do irrevocable damage to humans\\t\\tor their surroundings, retroactive enforcement through punish  ment could\\t\\tbe irrelevant.\\t Measures to ensure compliance will be - and clearly already are -\\t\\tessential.\\t Regulatory authorities will thus have to work alongside the\\t\\tindustrial biotechnology community to assist in mak  ing every member\\t\\taware of the laws and guidelines that apply to their practices. 
\\t\\t  Enforcement presents another challenge.\\t As in nuclear research, the containment of potentially hazardous\\t\\tmaterial is a genuine problem, albeit only in a very small minority of cases.\\t While established firms are likely to conform voluntarily with public\\t\\tpolicies associated with research methods, smaller firms, which prevail in\\t\\tbiotechnology, are likely to take more risks.\\t This situation stands in stark contrast to the nuclear industry\\t\\twhere, because of the scale required by the research and its complexity,\\t\\tsmall companies are effectively precluded from participating.\\t A sound regulatory policy should contain provisions for random\\t\\tinvestigation of private laboratories - not unlike the scrutiny to which the\\t\\tfood industry is subject - to ensure that proper measures are taken to\\t\\tminimize the acci  dental release of potentially hazardous organisms. \\t\\t  The risks associated with the testing or application of products\\t\\tin the envi  ronment, especially micro-organisms, represent perhaps the\\t\\tmost critical area for regulation ( Doyle 50).\\t After all, the goal of biotechnology  is  to alter the\\t\\tenvironment.\\t Concerns for the long-term, broad ecological impact of biotechnology\\t\\twill have special relevance to Canadian research and development, where the\\t\\tvast majority of products are emerging from the natural-resource sectors,\\t\\tparticularly agriculture.\\t Monitoring and con  trolling the effects of that alteration will\\t\\tbe the key issues. 
\\t  III \\t\\t  When considering ramifications of the products of biotechnology,\\t\\tthe vast majority of which have yet to enter the market, an immediate dilemma\\t\\tpresents itself.\\t Is it possible to regulate effectively an enterprise in which\\t\\tpotential risks have yet to be uncovered?\\t The constant pressure to antici  pate risks means regulatory\\t\\tagencies must place strong emphasis on the evolution of policy itself.\\t The explosive growth of the industry underscores the urgency of that\\t\\tneed.\\t Co-operation from private industry would help identify areas of\\t\\tpotential public concern, but it is insufficient.\\t Neutral expertise is critical.\\t In an era when so many publicly-funded research institutions have\\t\\tseen their direct government support diminish against the growth of even\\t\\tmoderately expanding industries, and have seen their historical neutrality\\t\\tcompromised by the growing tendency to make federal funding for research\\t\\tcontingent upon industrial invest  ment, the capacity of the vibrant\\t\\tprivate biotechnology industry seems poised to outstrip the resources of\\t\\tthose very scientists whose indepen  dent expertise will be critical for\\t\\tinforming public policy.\\t Even if funding for public research were to be maintained at a\\t\\trelatively high level of 4%-6% of real growth, it is hardly a match for the\\t\\t46% annual rate at which the industry plans to expand.\\t Government and university research efforts in biotechnology cannot be\\t\\tallowed to languish in their present state of underfunding.\\t Among other things, public and university laboratories will have to\\t\\tstep up research into mathematical modelling and controlled-field testing in\\t\\torder to establish a predictive ecology before widespread release takes\\t\\tplace. 
\\t\\t  Another systematic trend that may compromise the goal of a sound\\t\\tand publicly supported regulatory system is evident in the the manner in\\t\\twhich the newly formed National Biotechnology Advisory Committee operates -\\t\\tnamely, under the auspices of the federal Department of Industry, Science and\\t\\tTechnology.\\t A founding priority of this department is to promote the  \"\\t\\tdevelopment, exploitation and applications of strategic technologies to\\t\\timprove Canada\\'s international competitiveness\"  (  \" Regu \\t\\tlatory Concerns\"  ) .\\t Environment Canada, on the other hand, has no such mandate, and its\\t\\tadvice ought to take a prime role in the formulation of regulations.\\t Can the Department of Industry, Science and Technology and the\\t\\tregulatory advisory body it oversees continue to be regarded as neutral\\t\\tsources of scrutiny by the biotechnology industry\\'s present critics, while\\t\\tthat department offers direct or indirect subsidies to industry and publicly\\t\\tchampions industry\\'s attempts to become globally competitive? \\t\\t  Even more important, because more than three-quarters of the prod\\t\\t ucts under development in Canada fall into the category of seed research,\\t\\tCanadian regulations will require specific attention to the impact of this\\t\\tresearch on accidental transmission and genetic diversity.\\t Plant research, unlike chemical, pharmaceutical, and animalian\\t\\tresearch, will in many cases require widespread dispersal of experimental\\t\\tgenetic material into the environment.\\t Geographically dispersed material is generally more sus  ceptible\\t\\tto the influences of a host of uncontrollable environmental factors, such as\\t\\tbioleaching into soil and transport by wind, micro-organisms, and animals.\\t Measures for assessing the potential of accidental transport, as well\\t\\tas the eventual fate of experimental material in the environment, will be\\t\\tcrucial. 
\\t\\t  Because agricultural applications of biotechnology can be\\t\\texpected to produce key crops with higher commercial value, it can be\\t\\texpected that biotechnology companies will find willing customers in the\\t\\tCanadian farming industry.\\t Measures should be taken to ensure that any reduction in genetic\\t\\tdiversity through the widespread planting of these key crops will not make\\t\\tthe food supply more susceptible to genetic diseases.\\t This may entail preserving a greater number of existing varieties in\\t\\tseed banks.\\t The potential for lost diversity also raises an interesting question\\t\\tfor the issue of farm support.\\t Will farmers who attempt to remain competitive by planting only those\\t\\tcrops with the highest commercial value be required to bear the burden of\\t\\tsudden losses to their incomes if their crops are wiped out by disease? \\t\\t  Insurance against such catastrophes might be obtained in another,\\t\\tpotentially more controversial way.\\t Even if consumers show a strong demand for new  \" super\"\\t\\t varieties, it is possible to  promote  the desirable goal\\t\\tof genetic diversity by taxing the consumption of these crops.\\t While the idea of limiting consumer choice through disincentives\\t\\tmight seem antithetical to a market-driven economy, it is nevertheless\\t\\tconsistent with a society that places high value on environmental security\\t\\tand human health.\\t Many western societies have already set precedents for\\t\\tmarket-intervention where public health and safety have been concerned.\\t Tobacco taxes, despite their threat to some  farmers\\' \\t\\tinterests, have enjoyed widespread public support in many countries,\\t\\tincluding Canada.\\t Would it be unthinkable, then, to impose taxes on crop varieties\\t\\tthat, because of their popularity, limit diversity and thus threaten the food\\t\\tsupply?\\t It has been suggested in the case of tobacco, and not without\\t\\treasonable justifi  cation, that 
a portion of the proceeds from taxes be\\t\\tdiverted to hospitals for the treatment of cancer and heart disease.\\t Would it be unreasonable to divert the proceeds from taxes on popular\\t\\tcrops toward maintaining seed banks? \\t\\t  As important as any specific regulatory consideration, finally,\\t\\tis the active participation of the public throughout the advisory process.\\t\\t \\t']\n"
     ]
    }
   ],
   "source": [
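    "# Select documents containing the stray token ' dian ' (possibly a truncated 'Canadian');\n",
    "# na=False treats missing documents as non-matches.\n",
    "df_n = df[df[\"document\"].str.contains(\" dian \", na=False)]\n",
    "# Print the first matching document to inspect the raw text.\n",
    "print(df_n[[\"document\"]].values[0])"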
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.11"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
