{
  "id": "tapakah68/email-spam-classification",
  "id_no": 3554536,
  "datasetSlugNullable": "email-spam-classification",
  "ownerUserNullable": "tapakah68",
  "usabilityRatingNullable": 1.0,
  "titleNullable": "Email Spam Classification",
  "subtitleNullable": "Dataset of spam and non-spam emails",
  "descriptionNullable": "# Email Spam Classification\n\nThe dataset consists of a collection of emails categorized into two major classes: **spam** and **not spam**. It is designed to facilitate the development and evaluation of spam detection or email filtering systems. \n\n**The spam emails** in the dataset are typically unsolicited and unwanted messages that aim to promote products or services, spread malware, or deceive recipients for various malicious purposes. These emails often contain misleading subject lines, excessive use of advertisements, unauthorized links, or attempts to collect personal information.\n\nThe **non-spam emails** in the dataset are genuine and legitimate messages sent by individuals or organizations. They may include personal or professional communication, newsletters, transaction receipts, or any other non-malicious content.\n\nThe dataset encompasses emails of varying *lengths, languages, and writing styles*, reflecting the inherent heterogeneity of email communication. This diversity aids in training algorithms that can generalize well to different types of emails, making them robust against different spammer tactics and variations in non-spam email content.\n\n### The dataset's possible applications:\n- spam detection\n- fraud detection\n- email filtering systems\n- customer support automation\n- natural language processing\n\n![](https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F618942%2F4d1fdedb2827152696dd0c0af05fd8da%2Ff.png?generation=1690286497115141&alt=media)\n\n# Get the Dataset\n\n### This is just an example of the data\n\nContact us via **[sales@trainingdata.pro](mailto:sales@trainingdata.pro)** or leave a request on **[trainingdata.pro/data-market](https://trainingdata.pro/data-market?utm_source=kaggle)** to get the dataset\n\n# File with the extension .csv\n\nincludes the following information:\n\n- **title**: title of the email,\n- **text**: text of the email,\n- **type**: type of the email\n\n# Email spam might be collected in accordance with your requirements.\n\n## **[TrainingData](https://trainingdata.pro/data-market?utm_source=kaggle)** provides high-quality data annotation tailored to your needs",
  "datasetId": 3554536,
  "datasetSlug": "email-spam-classification",
  "hasDatasetSlug": true,
  "ownerUser": "tapakah68",
  "hasOwnerUser": true,
  "usabilityRating": 1.0,
  "hasUsabilityRating": true,
  "totalViews": 664,
  "totalVotes": 10,
  "totalDownloads": 96,
  "title": "Email Spam Classification",
  "hasTitle": true,
  "subtitle": "Dataset of spam and non-spam emails",
  "hasSubtitle": true,
  "description": "# Email Spam Classification\n\nThe dataset consists of a collection of emails categorized into two major classes: **spam** and **not spam**. It is designed to facilitate the development and evaluation of spam detection or email filtering systems. \n\n**The spam emails** in the dataset are typically unsolicited and unwanted messages that aim to promote products or services, spread malware, or deceive recipients for various malicious purposes. These emails often contain misleading subject lines, excessive use of advertisements, unauthorized links, or attempts to collect personal information.\n\nThe **non-spam emails** in the dataset are genuine and legitimate messages sent by individuals or organizations. They may include personal or professional communication, newsletters, transaction receipts, or any other non-malicious content.\n\nThe dataset encompasses emails of varying *lengths, languages, and writing styles*, reflecting the inherent heterogeneity of email communication. This diversity aids in training algorithms that can generalize well to different types of emails, making them robust against different spammer tactics and variations in non-spam email content.\n\n### The dataset's possible applications:\n- spam detection\n- fraud detection\n- email filtering systems\n- customer support automation\n- natural language processing\n\n![](https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F618942%2F4d1fdedb2827152696dd0c0af05fd8da%2Ff.png?generation=1690286497115141&alt=media)\n\n# Get the Dataset\n\n### This is just an example of the data\n\nContact us via **[sales@trainingdata.pro](mailto:sales@trainingdata.pro)** or leave a request on **[trainingdata.pro/data-market](https://trainingdata.pro/data-market?utm_source=kaggle)** to get the dataset\n\n# File with the extension .csv\n\nincludes the following information:\n\n- **title**: title of the email,\n- **text**: text of the email,\n- **type**: type of the email\n\n# Email spam might be collected in accordance with your requirements.\n\n## **[TrainingData](https://trainingdata.pro/data-market?utm_source=kaggle)** provides high-quality data annotation tailored to your needs",
  "hasDescription": true,
  "isPrivate": false,
  "keywords": [
    "internet",
    "text",
    "email and messaging",
    "text classification"
  ],
  "licenses": [
    {
      "nameNullable": "Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0)",
      "name": "Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0)",
      "hasName": true
    }
  ],
  "collaborators": [],
  "data": []
}