Spam
No discussion of email would be complete without a discussion of spam. Spam's official names are UCE (Unsolicited Commercial Email) or UBE (Unsolicited Bulk Email). Spam is known to most of us as junk mail: millions of messages flooding the Internet that you don't care about and don't want. If you have an email address, it's inevitable that you'll eventually end up on a spammer's list.
The problem with spam is that there are so many people on the Internet that marketers discovered it was an effective way to target their advertisements. Pop-ups, pop-unders, banners, and cookies have surrounded our every Web move, like locusts on a grain field. These were mere annoyances until spam started to flood our email boxes. The legitimate email advertisements from real companies were one thing: you could get off their lists. The illegitimate spammers are the ones who have created the monster.
These senders lie and cheat, and you can't even respond to them. The email headers are forged, and the ISPs do not really exist. Therein lies the problem. Without controls, chaos ensues. Spam is chaos on the Internet. SMTP, the email protocol, was never designed with the idea of cheats and liars. It was designed with the assumption that people would be honest. It was never foreseen as a problem; therefore, there was never a need to verify the sender's identity or location. Always remember that many of these solicitations are cons and scams.
Combating Spam
The Coalition Against Unsolicited Commercial Email Web site (http://www.cauce.org) puts the problem of spam in perspective:
The great economist Ronald Coase won a Nobel Prize talking about exactly this kind of situation. He said that it is particularly dangerous for the free market when an inefficient business (one that can't bear the cost of its own activities) distributes its costs across a greater and greater number of victims. What makes this situation so dangerous is when millions of people only suffer a small amount of damage, it is often more costly for the victims to go out and hire lawyers to recover the few bucks in damages they suffer. That population will likely continue to bear those unnecessary and detrimental costs unless and until their individual damage becomes so great that those costs outweigh the transaction costs of uniting and fighting back. And the spammers are counting on that: they hope that if they steal only a tiny bit from millions of people, very few people will bother to fight back.
In economic terms, this is a prescription for disaster. Because when inefficiencies are allowed to continue, the free market no longer functions at peak efficiency. As you learn in college Microeconomics, the "invisible hands" normally balance the market and keep it efficient, but inefficiencies tip everything out of balance. And in the context of the Internet, these invisible market place forces aren't invisible anymore. The inefficiencies can be seen every time you have trouble accessing a Web site, or whenever your email takes 3 hours to travel from AOL to Prodigy, or when your ISP's server is crashed by a flood of spam.
You can read more on the subject of spam at http://www.cauce.org/about/problem.shtml.
Is the Internet Full Yet?
The biggest problem with spam is that it clogs up the Internet. From small ISPs to large companies, the strain is felt. Mail servers work harder than they need to, server owners need more server space than is really necessary, and end users spend at least 5 minutes a day (and often as long as 20 minutes) sorting and deleting unwanted spam to find their real email.
Just to get a basic idea of the volume, let's take the example of a high-profile company such as AOL. It has around 26 million users, and if each of those users receives an average of 35 spam email messages a day, the total is 910 million spam messages a day systemwide. AOL stores messages on mail servers. If each of those 910 million spam messages is a nominal 5 KB, that would be about 4,550 GB of information that AOL must store and manipulate each day. If this number continues to escalate, you have to wonder at what point the system breaks down. How much can the Internet, even with its vastness, really hold?
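The arithmetic behind that estimate is easy to check. Here is a minimal sketch in Python that uses only the figures quoted above and assumes 1 GB equals 1,000,000 KB; the function name is made up for illustration.

```python
# Back-of-the-envelope estimate using the figures quoted in the text.
USERS = 26_000_000          # approximate AOL user count
SPAM_PER_USER_PER_DAY = 35  # average spam messages per user per day
MESSAGE_SIZE_KB = 5         # nominal size of one spam message

def daily_spam_volume(users, per_user, size_kb):
    """Return (messages per day, storage per day in GB), with 1 GB = 1,000,000 KB."""
    messages = users * per_user
    gigabytes = messages * size_kb / 1_000_000
    return messages, gigabytes

messages, gigabytes = daily_spam_volume(USERS, SPAM_PER_USER_PER_DAY, MESSAGE_SIZE_KB)
print(f"{messages:,} messages, about {gigabytes:,.0f} GB per day")
```

Change any of the three inputs to see how quickly the daily total grows.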
At the "Block All Spam" Web site, Dick Lipton, a Georgia Institute of Technology computer professor, commented on this subject. He said, "If you plot the growth of spam on any reasonable chart, clearly at some point it will exceed the capacity of the entire Internet."
In a report released to the media, Ferris Research claimed that, in 2002, spam accounted for $8.9 billion in cost to U.S. corporations. Spam is growing, and this cost is expected to rise. Various sources estimate that spam growth in the last year has been over 100%, which seems to be a conservative number.
How Did the Term Spam Become Associated with Junk Email?
There is debate as to where the term spam originated. Two of the most likely places are:
An early Monty Python song, phrased as follows:
"Spam spam spam spam, spam spam spam spam, lovely spam, wonderful spam..."
This song is seen by many as an infinite recurrence of insignificant words.
A computer lab at USC (University of Southern California) in Los Angeles that coined the term from the meat by the same name, claiming they shared several of the same characteristics:
Nobody wants it or ever asks for it.
No one ever eats it; it is the first item to be pushed to the side when eating the entree.
Sometimes it is actually tasty, like 1% of junk mail that is really useful to some people.
Spammers' Tools
Spammers are an ingenious group. They have created a number of tools to track down email addresses. It's akin to cancer. The tools creep along the Internet searching for addresses, bombard email servers to try to figure out new email addresses, or send out spy messages to verify your email address and get some idea of what you are interested in viewing. New schemes will come, but the most tried and true are harvesting, dictionary attacks, and HTML mail.
Harvesting and Dictionary Attacks
There are various tools to harvest email addresses to use for more spam generation. The most common of the tools are Web-crawling bots (short for robots), also called spiders. They go from Web site to Web site, looking only for email addresses. They find them in Internet mailing list addresses on Web sites, on forum posts, in chat rooms, and elsewhere. Another spam tool is the dictionary attack, a program that uses various combinations of letters to compose addresses and send emails to see what gets bounced back and what seems to be deliverable.
The Direct Marketing Association's Stance on Spam
The DMA (Direct Marketing Association), founded in 1917, is the largest business trade association interested in direct, database, and interactive global marketing. Its members include catalog companies, direct mailers, teleservices firms, Internet marketers, and other at-distance marketers.
In an April 30, 2003 press release, the DMA announced that commercial messages should not be sent when email addresses have been captured surreptitiously, a practice often called harvesting. In addition, the DMA announced its position against the practice of automatic algorithmic email addressing, also known as dictionary attacks, that spammers use in mass untargeted mailing campaigns or in order to ascertain live addresses.
According to the DMA, both practices constitute abuses of the right to send email legitimately and could ultimately undercut email as a valuable business and communications tool. The DMA's announcement was the latest move in its antispam campaign. In addition, the DMA is calling for bolstered law enforcement of current consumer antifraud laws, as well as federal legislation, among other things, to combat unscrupulous spam.
Legitimate Marketers Do Not Spam
The spam problem is caused by hucksters. "Even other vocal antispam advocates agree that legitimate marketers are NOT the problem," said H. Robert Wientzen, President and CEO of the DMA.
The DMA requires its members' email solicitations to represent the four pillars of reputable email:
Honest subject lines
Accurate header information that has not been forged
A physical street address for consumer redress
An opt out that works (opt is short for option, meaning that recipients are given the option not to be included in the mass mailings)
For more information on the DMA, go to http://www.the-dma.org.
HTML (Spy) Mail
Advertisers and spammers have come up with better ways to track exactly when a message was received and "viewed" (even if you simply opened it by mistake). When you download and open one of their emails, it in turn loads an image that it grabs from the sender's host Web page. At the same time, it sends information about your actions and your machine. In effect, that email sends out information that can be cross-referenced by the sender to track who received the message, when they read it, how many times, and from what IP address. It's used by advertisers to intrude a bit more on your privacy. The spammers verify an email address as active, so they can make money by selling their email lists, which means even more spam will be sent your way.
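One way to see this mechanism at work is to look for images in an HTML message that point at a remote server. The following is only a sketch using Python's standard html.parser module; the class name, function name, sample message, and host name are all made up for illustration, and a real mail client would do considerably more.

```python
from html.parser import HTMLParser

class RemoteImageFinder(HTMLParser):
    """Collect the URLs of images that would be fetched from a remote server."""

    def __init__(self):
        super().__init__()
        self.remote_images = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        src = dict(attrs).get("src") or ""
        # Only images loaded over the network can report back to the sender.
        if src.lower().startswith(("http://", "https://")):
            self.remote_images.append(src)

def find_remote_images(html_body):
    """Return every remote image URL referenced in an HTML message body."""
    finder = RemoteImageFinder()
    finder.feed(html_body)
    return finder.remote_images

# Hypothetical message body; the host and the "id" parameter are invented.
sample = ('<p>Great deals!</p>'
          '<img src="http://ads.example.com/pixel.gif?id=12345" width="1" height="1">')
print(find_remote_images(sample))
```

A unique identifier in the image URL, like the invented id value here, is what lets a sender tie the image request back to one particular recipient. Not loading remote images at all keeps that request from ever being made.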
SPAM Mail versus Opt-in Email
A user can broadcast or multicast an email message to users. These are legitimate and convenient ways to target messages to several people at once. There is a downside, too: the same methods can be used to illegally send unsolicited advertising junk email to millions of users without permission from the service-provider host.
Broadcasts
An email broadcast is when a single individual sends an email to all users on a network. It can be the best way for companies to send bulletins to all their employees. However, simply blasting out a broadcast to the entire subscriber base of a global ISP (such as AOL) is both an amazing annoyance for everyone and explicitly forbidden by the service, because it breaches the service's usage policies. (Unfortunately, this is rarely enforced because of the nebulous nature of spam-mongers.) Broadcasting complicates network traffic and, in some cases, has caused total network meltdowns. In the case of broadcast storms, when a broadcast message receives several responses and those, in turn, receive yet more responses, a snowball syndrome ensues, the effects of which can be catastrophic for a network mail server.
Multicasts (Narrowcasts)
Multicasts, also known as narrowcasts, are when users send email to a select group of recipients. Email lists are a terrific example of multicasting. An email list is a list of people who subscribe and look forward to receiving missives. Most email client applications support mailing lists and give you the ability to forward a single message to every recipient on the list. Sometimes, though, lists containing hundreds to thousands of email addresses are spammed with unauthorized advertisements that take up tons of network bandwidth.
Spam is Junk
A year ago, the estimate was that one in five spam "advertising" messages was from legitimate companies, but the other four fifths were pure junk or frauds. (This number has probably changed to something like one in ten.)
The legitimate companies offer a way to "opt out" of the mailing list. The spammers don't. Occasionally they have an opt out to try to look like a legitimate business. Their links either don't work or take you right to the site you are trying to avoid. As often as not, the addresses the email originated from are bogus, faked, or already closed accounts. The spam just wants you to go to a Web site. The Web site rarely has any information to contact anyone. More often than not, the Web site will, at the very least, plant a cookie on your machine and at the worst, a virus.
The Federal Trade Commission did a study that concluded that as many as two thirds of the spam emails sent out are telling lies, and 96% of those offering ways for people to make money in business or investment are also frauds. Spam (including all those mail-order brides, miracle pills, and invitations to Web sites) totaled 6.7 million emails in the month of March 2003.
Spam Breakdown
On any given day, a typical junk mail breakdown can be as follows:
Chain letters, pyramid schemes, get rich quick: 20%
Porn sites and sex-related solicitations: 20%
Real product advertising: 20%
Financial offers (refinance your home): 10%
Medicines, drugs, and quack remedies: 10%
Figure 10.1 shows an example of an inbox filled with spam.
At least 20 states have passed laws outlawing spam. But how do you bring someone to justice if they are hiding behind spoofed IP addresses and forged email headers? Clearly, something needs to be done.
The Great Nigerian Scam Letter
The following is for those newcomers who have not received this infamous spam letter (or its newer variations) telling you that your name was purportedly given to someone in Nigeria and you were a "trusted person who can help." For some long-winded reason they have access to a large amount of money (usually around $20 million) that they need to get out of the country. You can help them. And to help them, they'll give you 10 to 20% of the money as a fee. Then the letter gets vague with off-the-wall and weird details. Anyway, one version of the scam works to convince you money will be transferred, by wire, into your bank account. You must get all sorts of wire transfer information, and soon your account is drained of whatever money you had. It's transferred out, not in.
Duh!
What makes this interesting is that with spamming techniques, the con men do not have to spot a "mark" and target him or her anymore. Just try to scam everyone in the world and see what happens. This scam actually did begin in Nigeria and predates the Net. Apparently, real letters were sent out by hand with an elaborate package of documents. Care went into finding the right sucker. With spam broadcast mailing, this is no longer necessary. What this says to me is that quality con jobs are going to be a thing of the past, as Darwin takes over. The dumbest get ripped off instead of the richest marks.
This is urban folklore at its finest. The scam has moved from Nigeria to all over the place. (I actually doubt any Nigerians even do this scam anymore.) The last version I received was from a Mr. Nosa (no first name given). He is hiding out in the Benin Republic and has millions of dollars he needs me to help him get out of the country. Now the way this approach used to work is that you'd do a Google search of Benin and Nosa and find out there is some guy in Benin named Nosa who is famous, hiding out there, and loaded with dough. These days if you type this in, you get hit after hit regarding the Nigerian scam.
I'm impressed with the way the classic scam letter has morphed over time but still appreciate the original Nigerian scam where there is a crooked banker trying to move money out of Nigeria. I love the unique names of the supposed letter writers. My favorites include Sandra SaviMBi and Joseph SaviMBi. SaviMBi is a popular name, apparently. Then there is Moses Mutolezi, Helen Khobi, Issa Gwazo (a personal favorite), Dan Ogaga, and Prince Tunji Abu, who apparently can type only in uppercase. Finally, I've received letters from Mahmud Daya, Prince Ahmadu A. Ahmadu (whose middle initial must stand for Ahmadu), and Dr. Francis Oputa. And a last mention goes to two identical letters from two different fakes: Dr. Thomas Okon and Dr. Raymond Okoro, both of whom were the "bank manager of Zenith Bank, Lagos, Nigeria."
Isn't the Net wonderful!
Dvorak Discusses Spam Killer/Blocker Software
The spam situation has worsened, and although people have advocated government intervention, this will just move the worst spammers offshore (where nothing can be done). This is an opportunity for some sharp operator to make money by doing real spam prevention with a good product.
Spamnix is rule-based filtering software (http://www.spamnix.com), and it works as a plug-in for Eudora. What makes Spamnix interesting is that it uses a set of open source governance rules to spot spam. These rules are based on a simple checklist that looks for features commonly found in the majority of spam. If you happen to get a newsletter that appears as spam because of forbidden features (e.g., ALL CAP headlines, unsubscribe comments, weird headers), you can simply put the newsletter in an exception file, and it comes through fine. From my experience, Spamnix manages to stop about 70% of incoming spam, although the company claims 95%. Still, 70% is quite good, and you can adjust the sensitivity levels of the system.
For more information, go to http://www.spamnix.com.
Spam Blockers, Spam Filters, Spam Killers
Filtering email is a hot topic because of spam. There are many different approaches to blocking spam. There is the strategy of filtering messages for specific content, such as email addresses or other types of data. There are whitelist or verification filters, where the sender of the email is checked. Another ploy is blacklists, which block known spam senders. Rule-based filtering evaluates a number of different patterns to try to "figure out" what is spam and what is not. Spam blocking is never 100% effective. Any type of strategy can also block a significant proportion of real email.
Mail Bombs
Mail bombs are another example of email gone awry. A mail bomb is a massive amount of email sent to a specific person or system. The volume of mail will fill up disk space on a server or, in the worst-case scenario, actually crash the server because it's too much for it to handle. Mail bombs have been used to punish Internet users who have angered someone for slights or wrongs, real or imagined. Most mail bombs are defused now by mail server software such as Exim, an open source mail transfer agent.
Header and Text Analysis
Simple strings in headers, the subject line, or the body of the email text can be filtered. This is easy to do. Most email clients have at least this kind of filtering activated. The idea is good, but these filters often have a high false-positive rate. (In other words, they identify real email as spam.)
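Here is a minimal sketch of this kind of string filter in Python. The phrase list and function name are hypothetical, and the second example shows how easily a real message trips the filter.

```python
# Hypothetical phrase list; a real filter would let the user edit it.
BLOCKED_PHRASES = ["get rich quick", "miracle pill", "refinance your home"]

def looks_like_spam(subject, body, phrases=BLOCKED_PHRASES):
    """Return True if any blocked phrase appears in the subject or body."""
    text = f"{subject}\n{body}".lower()
    return any(phrase in text for phrase in phrases)

# An obvious piece of junk mail is caught.
print(looks_like_spam("GET RICH QUICK!!!", "Click here."))

# A false positive: a friend's honest question contains a blocked phrase.
print(looks_like_spam("Mortgage question", "Did the bank ever let you refinance your home?"))
```

Both calls return True, which is exactly the weakness described above: the filter matches words, not intent.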
Email Authentication
Email authentication is a method of blocking spam that requires senders of email to authenticate that they are the originators of the email. An email doesn't get delivered until the email server receives a confirmation from the sender. Once an email server has this information, it will deliver email from that sender as long as the sender continues to use the same email address. Should it change, the sender must reconfirm.
The problem is that this creates a barrier to communicating that some people really resent. The "reply to this to authenticate" message that a sender must answer before the email reaches the recipient has not been well accepted yet. As the spam problem gets worse, this may become the only option.
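The confirm-before-delivery scheme described in this section can be sketched in a few lines of Python. This is an illustration only: the class and method names are made up, and no real mail is sent or received.

```python
class ChallengeResponseFilter:
    """Hold mail from unknown senders until they confirm their address."""

    def __init__(self):
        self.confirmed = set()  # sender addresses that have answered a challenge
        self.held = {}          # sender address -> messages awaiting confirmation

    def receive(self, sender, message):
        sender = sender.lower()
        if sender in self.confirmed:
            return "delivered"
        # Unknown sender: hold the message and ask for confirmation.
        self.held.setdefault(sender, []).append(message)
        return "challenge sent"

    def confirm(self, sender):
        """Record the confirmation and release any held messages."""
        sender = sender.lower()
        self.confirmed.add(sender)
        return self.held.pop(sender, [])

mail_filter = ChallengeResponseFilter()
print(mail_filter.receive("pat@example.com", "Hello"))      # first contact is held
print(mail_filter.confirm("pat@example.com"))               # held mail is released
print(mail_filter.receive("pat@example.com", "Follow-up"))  # later mail goes through
```

Because confirmation is tied to the address, a sender who switches to a new address starts over as an unknown sender.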
Rule-Based Filtering
This is a type of filtering that looks at a large number of patterns and compares them with an incoming message. It ranks the email based on the number of patterns identified, so if there is too high a score, the email is deemed spam and disposed of.
Some scoring schemes are consistent: the use of forged headers and auto-executing JavaScript, for example, will instantly be deemed spam. Other rules are updated as the products change. As spam evolves, so must the rule-based filtering systems.
SpamAssassin: http://spamassassin.org
Spamnix: http://www.spamnix.com
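The scoring idea behind rule-based filtering can be sketched as follows. The rules, weights, and threshold here are hypothetical and chosen only for illustration; they are not the actual rules used by SpamAssassin or Spamnix.

```python
import re

# Hypothetical rules: each takes the subject and body and returns True on a match.
def all_caps_subject(subject, body):
    return subject.isupper()

def unsubscribe_wording(subject, body):
    return "unsubscribe" in body.lower()

def embedded_script(subject, body):
    return re.search(r"<script\b", body, re.IGNORECASE) is not None

# Assumed weights: a heavier weight means a stronger sign of spam.
RULES = [
    (all_caps_subject, 2.0),
    (unsubscribe_wording, 1.0),
    (embedded_script, 5.0),
]
SPAM_THRESHOLD = 5.0  # assumed cutoff

def score_message(subject, body):
    """Return the total score and the names of the rules that matched."""
    matched = [(rule.__name__, weight) for rule, weight in RULES if rule(subject, body)]
    total = sum(weight for _, weight in matched)
    return total, [name for name, _ in matched]

def is_spam(subject, body):
    total, _ = score_message(subject, body)
    return total >= SPAM_THRESHOLD

print(score_message("FREE MONEY", "<script>go()</script> Click to unsubscribe"))
print(is_spam("WEEKLY NEWS", "To unsubscribe, reply to this message."))
```

In this sketch the newsletter matches two rules but stays under the threshold, while the embedded script alone is weighted heavily enough to condemn a message. Raising or lowering the threshold is the sensitivity adjustment.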
Spam-Smart Blocking
Some proprietary spam-blocking products have started to appear that blend a combination of techniques into a dynamic smart-blocking option. This type of strategy was designed to block a polymorphic spam attack.
Polymorphic is a term that means a dynamic, rapidly changing variation. In a polymorphic spam attack, the spam can alter itself, making each version different from others that resemble it. This usually looks like a string of garbled letters. The polymorphic email will continually alter this string of letters with each email. The strategy to deal with these is to look at more information than conventional filtering does, using dynamic features that change as the spam changes, combined with automatic rule updates and other blocking filters.
Other Filtering Methods
One idea bandied about is Bayesian probability modeling of spam. This is the idea that some words occur more often in spam, whereas other words are found more often in legitimate emails. (This is probably true, because none of my legitimate emails from friends, business associates, or relatives have ever arrived using the words "Russian bride" or "rock-hard all night.") A similar concept is Vipul's Razor, a collaborative spam-tracking database. It works on the theory that spam of a specific kind will come in at the same time in an avalanche. The filter will detect the duplicate messages. It will deliver the first but delete the rest.
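To make the word-frequency idea concrete, here is a toy naive Bayes sketch in Python. The training messages are invented, the two classes are assumed equally likely, and add-one smoothing is an assumption of this sketch rather than a description of any particular product.

```python
import math
from collections import Counter

def train(messages):
    """Count word occurrences across a list of message strings."""
    counts = Counter()
    for text in messages:
        counts.update(text.lower().split())
    return counts

def spam_probability(text, spam_counts, ham_counts):
    """Naive Bayes estimate that text is spam (add-one smoothing, equal priors)."""
    vocab = set(spam_counts) | set(ham_counts)
    spam_total = sum(spam_counts.values()) + len(vocab)
    ham_total = sum(ham_counts.values()) + len(vocab)
    log_odds = 0.0
    for word in text.lower().split():
        if word not in vocab:
            continue  # words never seen in training carry no evidence
        p_spam = (spam_counts[word] + 1) / spam_total
        p_ham = (ham_counts[word] + 1) / ham_total
        log_odds += math.log(p_spam / p_ham)
    return 1 / (1 + math.exp(-log_odds))

# Tiny invented training sets; "ham" is the usual nickname for legitimate mail.
spam_counts = train(["miracle pills cheap", "cheap pills online now"])
ham_counts = train(["lunch meeting tomorrow", "project meeting notes attached"])

print(spam_probability("miracle pills", spam_counts, ham_counts) > 0.5)
print(spam_probability("lunch meeting tomorrow", spam_counts, ham_counts) < 0.5)
```

A message made up entirely of unfamiliar words scores exactly 0.5, meaning the filter has no opinion either way. A real filter would train on thousands of the user's own messages.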
Blacklists
Blacklisting isn't very effective, given that most spammers use forged headers and email addresses, but it's a way to block known spam addresses. The idea here is to find IP addresses, sites, or servers that consistently deliver a lot of spam and simply block all traffic from them.
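A blacklist check amounts to a simple lookup. The sketch below uses Python's standard ipaddress module; the blocked ranges are hypothetical and come from address space reserved for documentation, not from any real blacklist.

```python
import ipaddress

# Hypothetical blocked ranges (reserved documentation addresses, not real spammers).
BLACKLIST = [
    ipaddress.ip_network("192.0.2.0/24"),      # an entire network
    ipaddress.ip_network("198.51.100.17/32"),  # a single server
]

def is_blacklisted(ip, networks=BLACKLIST):
    """Return True if the connecting IP address falls in any blocked range."""
    address = ipaddress.ip_address(ip)
    return any(address in network for network in networks)

print(is_blacklisted("192.0.2.55"))     # inside the blocked network
print(is_blacklisted("198.51.100.18"))  # next door to the blocked server
```

The second address shows the limitation: a spammer who moves to an address that isn't on the list gets straight through.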
Hand Filtering
Most people use hand filtering by reading each subject line and deleting the ones that appear rather spammish. Sometimes one may get opened in error, but for the most part, it works.