{
  "question": "How effective are real-time blacklists and other checks in reducing the amount of junk mail, and what impact do delaying tactics have on good and junk mail processing in a corporation's email servers?",
  "answer": "Real-time blacklists and other checks block around 34–36% of incoming messages. Despite this, around 65–70% of the accepted mail is still junk. The proposed delaying tactics involve a 4-hour delay for new senders and a 12-hour delay for predicted junk mail, which would result in 26% of good mail being delayed. Additionally, 95% of junk mail could potentially be rejected if spammers and virus-infected machines do not retry sending.",
  "question_type": "causal_reasoning",
  "contexts": [
    "3 Characteristics of email traffic\nTo evaluate existing responses and develop new ones, we collected data from the log files of two Internet-facing email servers in a large corporation, as detailed in Table 1. Server1 and server2 are the primary and secondary server for a single email domain. These servers ran a virus checker (Sophos [26]), and a spam filter (Spam Assassin [25]), using MailScanner [33], so for each SMTP transaction it is possible to discover from the log files whether received mail contained viruses or was flagged as spam. Server1 and server2 did not store mail for reading or check the addresses of recipients, but forwarded mail to other servers. Undeliverable mail was thus indicated by a failure to deliver to these servers, and was recorded in the logs. The logs also included records from incoming mail blocked because of real-time blacklists and other heuristics. This data gives a good picture of a server’s-eye view of spam.\nThese servers are listed in public MX lookups, but also receive some mail from other mail servers within the corporation (they are part of an internal mail routing chain). Because the intent of this work was to combat spam entering a corporation, we removed the data corresponding to this “indirect” mail, leaving only the “direct” messages: interactions with other mail servers over the Internet.\nBoth servers subscribed to real-time blacklists, and performed other checks on the sender’s address before accepting messages. These mechanisms rejected around 34–36% of incoming messages (see Table 2). However this is only partially effective at reducing spam, as around 65–70% of the accepted mail is junk, as shown in Table 3.\nTable 1: Details of the data collected.\n<<tab-63a7f96ba3480868db9ffad38e97b656>>\n<table>\nserver | length (days) | number of messages | number of recipients\nserver1 | 69 | 855,228 | 1,229,459\nserver2 | 69 | 755,565 | 1,097,169\n</table>\nTable 2: Effectiveness of real-time blacklists and other blocking responses. While blacklists and other checks block around 34–36% of incoming messages, the accepted mail is still mostly junk, as shown in Table 3.\n<<tab-a04d5b60d83ad2388b01d1762e7a610b>>\n<table>\ntype | server1 number | server1 % | server2 number | server2 %\nrbl | 256,700 | 19 | 207,144 | 18\nopen relay | 119,161 | 9 | 95,536 | 8\nother checks | 112,555 | 8 | 81,170 | 7\ntotal blocked | 488,416 | 36 | 383,850 | 34\ntotal accepted | 855,228 | 64 | 755,565 | 66\ntotal attempts | 1,344,960 | 100 | 1,139,415 | 100\n</table>\nTable 3: Breakdown of mail messages by type for each server. The percentages for spam, undeliverable and virus do not sum to 100%, as messages can fall into multiple types. The last two rows show totals of accepted good mail and accepted junk mail, where junk mail is any one of virus, spam or undeliverable.\n<<tab-249fd2e731864f5a1155992dc06114f4>>\n<table>\ntype | server1 number | server1 % | server2 number | server2 %\ngood | 260,348 | 30 | 262,941 | 35\nspam | 497,554 | 58 | 414,234 | 55\nvirus | 2,749 | 0.3 | 2,371 | 0.3\nundeliverable | 364,487 | 43 | 298,169 | 39\ntotal accepted good | 260,348 | 30 | 262,941 | 35\ntotal accepted junk | 594,880 | 70 | 492,624 | 65\n</table>",
    "<<tab-6f469f1f71a37c8151d6154ad5591567>>\n<table>\ntype | number | percentage\ngood delayed by 4 hours | 33,021 | 13% of good\ngood delayed by 12 hours | 34,899 | 13% of good\njunk mails rejected | 565,922 | 95% of junk\n</table>\n5 Testing the responses\nThis section provides some evidence for how much effect the responses described above would have on reducing the amount of spam processed, and on reducing the effect of the volume of junk mail on the flow of good mail through a mail server.\nFirstly, we consider the effect of tempfailing new servers and servers predicted to send junk. Unfortunately it is very difficult to test this just using log data, as the response is to request the remote sending server to retry later, and it is difficult to predict the behavior of the sending server.\nA best-case estimate would be to assume that spammers and virus-infected machines do not retry, but that good senders do. Thus out of 855,228 messages accepted by server 1, if a 4 hour delay was used for new senders and a 12 hour delay for predicted junk mail, the effect would be as in Table 7. Only 26% of good mail would be delayed, and as discussed above, a significant proportion of this mail is likely to be misclassified junk. If this sort of system were widely deployed, it is likely that spammers would implement retries. This would cause the amount of junk mail rejected to decline considerably.\nIt is much easier to predict the effect of post-acceptance responses. We do this by constructing a model of a mail server that allows us to calculate the time taken to process a mail message, under different amounts of loading. We can then alter that model to incorporate prioritization schemes and predict the final performance of the system.\nThe initial model of the system is shown in Figure 6 (a). This is a generic model of a mail server that includes a mail scanner or filter. The incoming mail is handled by an SMTP process that writes the mail to a local disk or spool q_MS, taking time t_IN. The mail is then loaded, scanned and placed in a second spool q_OUT by the mail scanner, marking the mail accordingly (normally by writing a header), and taking on average t_SCAN. The mail is then taken from this second spool and delivered by the outgoing SMTP process, taking t_OUT. The overall time will be t_IN + t_SCAN + t_OUT plus the time that mails spend on the queues waiting to be processed. As each of the spools is effectively a queue, this system can be analyzed using Queueing Theory [10].\n<<fig-6e486d6fd045d780624aa2df44a585c6>>\n<figure> (a) Incoming SMTP (t_IN), queue q_MS, Mail Scanner (t_SCAN), queue q_OUT, Outgoing SMTP (t_OUT). (b) Incoming SMTP (t_IN + t_CLASSIFY), queues q_HI and q_LO, Mail Scanner (t_SCAN + t_SCHEDULE), queue q_OUT, Outgoing SMTP (t_OUT). </figure>\nFigure 6: (a) Model of basic mail server. Incoming mail is handled by an SMTP process, before being scanned for viruses and spam by the mail scanner, and being delivered using a second SMTP process. (b) Model of mail server with prioritization. The incoming process places mail in one of two queues depending on the predicted message type. The mail scanner selects messages from the queues using a scheduling algorithm."
  ],
  "hints": [
    "MailScanner",
    "mail scanner",
    "incoming mail"
  ],
  "rewritten_question_specific": "How effective are real-time blacklists and other checks in reducing junk mail, and what is the impact of delaying tactics on good and junk mail processing in a large corporation's email servers based on server log data?",
  "rewritten_question_obscured": "What is the efficiency of using blacklists and additional checks to decrease spam, and how do delay strategies affect the processing of legitimate and spam emails in a corporation's email infrastructure?",
  "complete_answer": "Real-time blacklists and other checks block around 34–36% of incoming messages. Despite this, around 65–70% of accepted mail is still junk. Delay strategies involve a 4-hour delay for new senders and a 12-hour delay for predicted junk mail, resulting in 26% of good mail being delayed. Additionally, 95% of junk mail could potentially be rejected if spammers do not retry sending. However, spammers may implement retries, reducing the effectiveness of these strategies. The delay strategy assumes good senders will retry, while spammers and virus-infected machines will not, but this might change if such systems are widely adopted."
}