January 31, 2004

MyDoom: Lessons Learned

MyDoom is the latest worm unleashed on the Internet. It spreads by sending itself as an email attachment. There's nothing unusual here.

But how does it spread, really? I created a "Hello, World" Windows executable file and mailed it to myself as an attachment. When I received the attachment, I tried to open it. Here's what I found out: Outlook would not let me get to the attachment at all. I couldn't open it. I couldn't save it. Nothing. Outlook Express was the same as Outlook. Mozilla Mail would only let me save the attachment; it would not run the executable. IncrediMail, which my wife uses, displays a warning when you open the attachment, and if you choose to disregard the warning, IncrediMail will open the attachment and run the executable.

Considering that Outlook and Outlook Express are really protective, and that Mozilla Mail is probably protective enough, and that IncrediMail is probably not widely deployed enough to spread a worm, how is MyDoom spreading? Two thoughts: First, there may still be a lot of people using older versions of Outlook or Outlook Express. That's bad. Second, MyDoom often arrives as a zip file. It turns out that an executable that arrives inside a zip file takes only about two extra mouse clicks to run, if you have WinZip installed or if you are using Windows XP.

I tried another experiment. I created a zip file that contained the "Hello, World" executable file. Then I sent that zip file as an attachment to myself. Here's what I found out: If you have WinZip installed, when you open the attachment, Outlook starts WinZip and WinZip opens the zip file. Now, if you double click on the icon in the WinZip window, WinZip runs the executable. The whole process takes less than five seconds. If you don't have WinZip installed, but you have Windows XP, then the procedure is similar, except that Windows Explorer takes the place of WinZip. (I checked this with the Windows XP Home Edition.)

So, who's to blame for this MyDoom worm? If you ask Microsoft, it's the person(s) who created the worm. Microsoft is right, of course. But what can we do, if anything, to avoid a repeat of MyDoom? Larry Seltzer, in "MyDoom Lessons: Failures of Education, Antivirus Vendors," claims that user education and anti-virus services have failed. He believes that we cannot possibly rely on user education to stop the spread of worms through email, and that anti-virus services failed to respond in time.

While I think he makes some good points, I have my own ideas. First, it should not be so easy to run an executable file that comes in an email attachment. With Windows XP, or with WinZip installed, it takes only two more mouse clicks -- probably another three seconds -- to run the executable file. There is no excuse for making it that easy. I understand that these utilities, Windows Explorer and WinZip, make it easy for users to view the files that are in a zip file. But there is no reason why they should launch an executable. I believe Microsoft and WinZip should be condemned by the security community until they fix this security hole.

Second, Windows needs to break from its DOS history and change the way executable files become executable. In Windows, a file becomes executable if it has the file extension of an executable file, like .exe, .bat, .scr, .pif, and a few more. Since the file name is never changed, an executable file is always executable, even when one sends it as an attachment.

If DOS and Windows were all we knew, we would probably think that this is all quite normal. But in Linux -- and I assume Mac OS X -- a file is executable if its executable bit is set in the file attributes. When one sends an executable file as an email attachment, the file attributes are not sent, and the attributes are set to a default value when the attachment is saved. The executable bit is off in the default value. The net result is that the sender of a file cannot make that file executable; only the recipient can make the file executable.
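Here is a minimal Python sketch of that behavior on a Linux-like system. The function names are my own, and it assumes the mail client saves the attachment with ordinary default permissions, as described above. It is an illustration, not a description of how any particular mail client works.

```python
import os
import stat
import tempfile


def save_attachment(directory, filename, payload):
    """Write attachment bytes to disk with default permissions.

    A newly created file gets the default mode, which does not
    include any executable bit.
    """
    path = os.path.join(directory, filename)
    with open(path, "wb") as f:
        f.write(payload)
    return path


def has_exec_bit(path):
    """Return True if any owner/group/other executable bit is set."""
    mode = os.stat(path).st_mode
    return bool(mode & (stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH))


def mark_executable(path):
    """The recipient's deliberate step: turn on the owner's executable bit."""
    os.chmod(path, os.stat(path).st_mode | stat.S_IXUSR)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = save_attachment(tmp, "readme", b"#!/bin/sh\necho hello\n")
        print(has_exec_bit(path))  # state right after saving
        mark_executable(path)
        print(has_exec_bit(path))  # state after the recipient's chmod
```

Nothing the sender puts in the payload changes the first result; only the recipient's explicit call to mark_executable does.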

This difference between Linux and Windows is profound. Data and code are different, and it's a fundamental security principle that data and code be treated differently. Open a file explorer application. Double click on one of the icons. What happens? What should happen? If the file is data, it starts an application that knows what to do with that data. If the file is an executable file -- that is, code -- it launches the executable. This may not sound like a big deal. But when we consider email attachments, in the case of Windows, it's the sender who decides if the file is code. In the case of Linux, it's the recipient who decides if the file is code. This matters because ordinary users don't normally make a distinction between data and code. All they know is that they double click on an icon. Utilities like the Windows Explorer and WinZip allow them to think this way. It's the same action to open a data file and to launch an executable file. Before computers were well-connected, this may not have been an issue. But when one can send a file and decide for the recipient whether that file should be treated as data or code, the consequence is the spread of nasty worms.

There's a fairly simple, short-term solution to this problem. Create an anti-virus email scanner that finds every email attachment and changes the file extension so that the file is no longer executable. For example, the scanner should change the file name readme.exe to readme.exe.dat. Then, if the recipient really wanted to make the file executable, he would have to save the file and change the file name. This is similar to a Linux user having to change the executable bit to make a file executable. Because of the WinZip vulnerability, the scanner would also have to look inside every zip file and change the file names of executable files.
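The renaming step could be sketched in Python roughly as follows. This is only an illustration of the idea, not a real scanner: the function names are hypothetical, the extension list is an illustrative subset that I chose, and the code handles only the file names and the zip container, not the parsing of the mail message itself. Password-protected or nested zip files are not handled.

```python
import io
import os
import zipfile

# Extensions treated as executable. An illustrative subset, not a complete list.
EXECUTABLE_EXTENSIONS = {".exe", ".bat", ".scr", ".pif", ".com", ".cmd"}


def neutralize_name(filename):
    """Append ".dat" when the name ends in an executable extension."""
    _, ext = os.path.splitext(filename)
    if ext.lower() in EXECUTABLE_EXTENSIONS:
        return filename + ".dat"
    return filename


def neutralize_zip(zip_bytes):
    """Return a copy of a zip archive with executable members renamed."""
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as src, \
            zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as dst:
        for info in src.infolist():
            # Copy each member's bytes unchanged under its (possibly new) name.
            dst.writestr(neutralize_name(info.filename), src.read(info))
    return out.getvalue()
```

A scanner built this way would call neutralize_name on each attachment's file name, and neutralize_zip on the content of each zip attachment, before delivering the message.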

In summary, I think the MyDoom worm demands two action items. First, security experts should complain to WinZip and Microsoft about their running executable files that have not been extracted first by the user. Second, anti-virus tools should change the file names of executable files in attachments so that the files are not executable.

Posted by Doug Sauder at 08:10 AM | permalink

January 22, 2004

But What About SOAP?

Simon St. Laurent doesn't care too much for XML Schema. I absolutely agree.

Simon's article doesn't mention SOAP. If XML Schema is destined to fail as a technology, won't SOAP fail, too?

Posted by Doug Sauder at 09:33 PM | permalink

Sender Permitted From (SPF) in the News

AOL has started testing Sender Permitted From (SPF).

SPF is an attempt to restore some sanity to the Internet mail system. Under SPF, a mail domain provides a DNS record that lists all the IP addresses that are permitted to send mail as that domain. So, you couldn't send a message claiming to be from aol.com if you were not really aol.com. To most people, this sounds like it's the way things ought to be. But some technically knowledgeable people think otherwise.
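To make the check concrete, here is a simplified Python sketch. It assumes the record has already been fetched from DNS and is passed in as a string, and it understands only the "ip4:" terms of an SPF-style record. The real specification has more mechanisms than this, and the record and addresses shown are hypothetical (they use address ranges reserved for documentation).

```python
import ipaddress


def permitted_networks(spf_record):
    """Extract the ip4: networks from a simplified SPF-style record."""
    networks = []
    for term in spf_record.split():
        if term.startswith("ip4:"):
            # A bare address such as 198.51.100.25 becomes a /32 network.
            networks.append(
                ipaddress.ip_network(term[len("ip4:"):], strict=False)
            )
    return networks


def is_permitted_sender(client_ip, spf_record):
    """True if the connecting IP address falls inside a listed network."""
    address = ipaddress.ip_address(client_ip)
    return any(address in net for net in permitted_networks(spf_record))


# Hypothetical record for a made-up domain.
EXAMPLE_RECORD = "v=spf1 ip4:192.0.2.0/24 ip4:198.51.100.25 -all"
```

A receiving mail server would run a check like is_permitted_sender against the IP address of the connecting client and the record published by the domain named in the envelope sender.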

SPF is so modest in what it tries to do, I really don't understand the objections.

I get the impression that people are starting to get serious about fighting spam through technical means. I predict that some kind of sender authentication will be adopted by the biggest email service providers in 2004. Yahoo likes Domain Keys. AOL apparently likes SPF. If I were a betting man, I would put my money on SPF, or whatever it eventually morphs into.

There are some really nice things about SPF. Foremost, it's such an incremental step that it's easy to see how it would come to be widely deployed. There's no disruption. Besides that, it makes the sending domain more important, which places a cost on spammers. That cost would provide the economic disincentive against spam. Here's how it works: Instead of blacklisting IP addresses, which are plentiful, we blacklist domain names, which are less plentiful. We reject mail from domains that were registered within the last three months, which would prevent the current practice of spammers registering a new domain name every time they begin a new campaign. Or, it means that spammers would have to keep several months' inventory of domain names that are not blacklisted. In any case, a domain name that's eligible to send mail from becomes a scarce resource, which could put the squeeze on spammers.
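The acceptance policy described above could be sketched like this in Python. The function name is my own, "three months" is approximated as 90 days, and the registration date is assumed to come from a whois lookup that is not shown here.

```python
from datetime import date, timedelta

# "Three months", approximated as 90 days for this sketch.
MINIMUM_DOMAIN_AGE = timedelta(days=90)


def accept_sender_domain(domain, registered_on, blacklist, today):
    """Accept mail only from domains that are not blacklisted and old enough.

    registered_on is the domain's registration date (assumed to have been
    looked up elsewhere); blacklist is a set of lowercase domain names.
    """
    if domain.lower() in blacklist:
        return False
    return today - registered_on >= MINIMUM_DOMAIN_AGE
```

For example, with today set to date(2004, 1, 22), a domain registered on date(2004, 1, 2) would be rejected as too new, while one registered in 2001 would be accepted unless it appears on the blacklist.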

I have to wonder about AOL, though. They blacklist every IP address that's assigned to a residential user. And, for their subscribers, they redirect outbound TCP port 25 to their own mail servers. With SPF, they wouldn't need either of these measures. Would they change their policy?

And one final thought about SPF that I have not heard mentioned by anyone else. Domain names are recycled. Before you register a new domain name, how can you know the history of that domain name? More specifically, how can you know whether or not that domain name is on any blacklists? There are just too many blacklists with too many policies, so the chances are, if that domain name was ever used by a spammer, you wouldn't be able to send mail reliably from it.

Posted by Doug Sauder at 08:01 PM | permalink

January 16, 2004

The Importance of Small Talk

Tim Bray mentions Edge.org's collection of "laws" and offers two of his own:

The First Law: When you’re explaining something to somebody and they don’t get it, that’s not their problem, it’s your problem.

The Second Law: When someone’s explaining something to you and you’re not getting it, it’s not your problem, it’s their problem.

I wouldn't exactly call it a "law," but I would like to offer my own contribution:

Never underestimate the importance of small talk.

I learned this lesson at a job interview. It was my second interview with the company, and I thought the first one went very well. At this second interview, I was interviewed by five people at the same time. During those few minutes that it took the five to enter one by one, there was no small talk -- there were greetings as people entered, but a lot of silence. As a result, the nervous tension never seemed to subside. The interview was the worst one I ever had in my life. The lack of small talk at the beginning was not the reason why the interview went so badly. But as I left, I remember thinking about the tension that never subsided, and it seemed very clear to me that a little small talk at the beginning could have helped a lot.

Small talk is important, because it helps to relieve nervous tension. Learn to engage in small talk at job interviews, sales meetings, or other business meetings. If it doesn't come naturally to you, take a few minutes beforehand to think of a few topics. You can always talk about the weather. You can ask a traveler how his flight went. You can ask about another person who is a mutual acquaintance. You can ask someone how long he has lived in the area. Just find something other than business.

By starting off with a couple of minutes of small talk, you can really help to get a meeting off to a good start.

Posted by Doug Sauder at 07:06 AM | permalink

January 15, 2004

XML and Information Theory

Greg Reinacker stirred up a controversy by announcing that he would update the Atom parser in NewsGator to parse and display badly formed XML. Those who criticize Greg's decision point to the fact that if parsers reject badly formed XML, then that puts pressure on content creators to fix their badly formed XML. Greg bases his decision on what he thinks his customers would want.

In this piece, I won't take a position one way or another. But I would like to enter some more information into the debate.

I claim that the ability to parse badly formed files is a feature of all text-based formats, including XML. If Atom were a binary format, like BER-encoded ASN.1, or like PNG, or like zip, then there would be no controversy. When a PNG file is badly formed, no one expects a parser to do anything but reject it. However, with a text-based format, whether HTML, or XML, or MIME, or mbox, or .ini, or any other text-based format, a clever parser may succeed in parsing a badly formed file.

Having a background in information theory, I see the situation from a more theoretical view. One of the features of text-based protocols is that the information rate is low. In other words, the message format contains a lot of redundancy. In XML, we can easily recognize the redundancy: white space between elements is insignificant, the closing tags are more verbose than they need to be (</> would suffice), and overall, XML is more verbose than it needs to be. But even at a lower level, the alphabet XML uses is constrained: most control characters are not allowed, and some characters such as '<' and '&' have special meaning. In contrast, binary protocols have a higher information rate, as redundancy is squeezed out.
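One rough way to see this redundancy is to run a general-purpose compressor over some XML and look at how much it shrinks. The Python sketch below uses zlib for that purpose. The compression ratio is only a crude proxy for information rate, not a measurement of it, and the little feed is made up for the example.

```python
import zlib


def compression_ratio(data):
    """Compressed size divided by original size.

    A lower ratio suggests more redundancy in the input.
    """
    return len(zlib.compress(data, 9)) / len(data)


# A made-up feed with 50 similar entries.
xml_feed = (
    b"<feed>\n"
    + b"".join(
        b"  <entry>\n    <title>Post %d</title>\n    <id>%d</id>\n  </entry>\n"
        % (i, i)
        for i in range(50)
    )
    + b"</feed>\n"
)

if __name__ == "__main__":
    print(compression_ratio(xml_feed))
```

Data that is already dense, such as random bytes or the contents of a zip file, would give a ratio close to 1, because there is little redundancy left to squeeze out.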

It is a basic principle of information theory (Shannon's Theorem) that one can transmit information reliably over a noisy channel, provided one uses a clever encoding, and provided the information rate is below the channel's maximum rate. When we think about text-based protocols, we can think of the badly formed messages as having been affected by a noisy channel -- the defects are "noise." Because text-based protocols have a low information rate, that means, theoretically, it's possible to pass the messages through a noisy channel and convey the information reliably. In other words, the messages can be badly formed, but a parser may be able to recover the information.
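A toy illustration of the principle, in Python: the repetition code below sends every bit three times and decodes by majority vote. It is a deliberately simple and inefficient code that I chose for illustration, not the kind of encoding Shannon's theorem has in mind, but it shows how redundancy lets the receiver recover from noise.

```python
def encode(bits, repeat=3):
    """Repeat every bit `repeat` times."""
    return [b for b in bits for _ in range(repeat)]


def decode(received, repeat=3):
    """Recover each bit by majority vote over its block of repeats."""
    decoded = []
    for i in range(0, len(received), repeat):
        block = received[i:i + repeat]
        decoded.append(1 if sum(block) * 2 > len(block) else 0)
    return decoded


if __name__ == "__main__":
    message = [1, 0, 1, 1]
    sent = encode(message)
    sent[4] ^= 1  # the "noisy channel" flips one bit
    print(decode(sent) == message)
```

The price is that the message is three times as long, which is another way of saying that the information rate is low.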

It's interesting to consider how redundancy works, practically, to make text-based protocols more robust. I present one good example here. In many text-based protocols, CRLF has special meaning: it divides the message into records. CRLF is not allowed within the records. Because CRLF affects the way a message is displayed in a text editor, it is easy to see where CRLF occurs illegally. Therefore these errors occur infrequently. In contrast, in the BER-encoding of ASN.1, records are indicated by a length field that precedes the record. In general, text-based protocols use record separators (such as CRLF) and field separators (such as white space), rather than length indicators, and the separators are more robust. But separators may only be used if the content of the record or field is constrained; hence, the redundancy.
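The difference in robustness can be seen in a small Python experiment. The length-prefixed format here is a simplification of my own (a single length byte before each record), not real BER, and the function names are hypothetical. The experiment damages one structural byte in each encoding and then decodes the result.

```python
def encode_length_prefixed(records):
    """Each record is preceded by a one-byte length (records under 256 bytes)."""
    return b"".join(bytes([len(r)]) + r for r in records)


def decode_length_prefixed(data):
    """Read a length byte, then that many bytes, until the data runs out."""
    records, pos = [], 0
    while pos < len(data):
        length = data[pos]
        records.append(data[pos + 1:pos + 1 + length])
        pos += 1 + length
    return records


def encode_crlf(records):
    """Each record is terminated by CRLF; records must not contain CRLF."""
    return b"".join(r + b"\r\n" for r in records)


def decode_crlf(data):
    """Split on CRLF and drop the empty piece after the final terminator."""
    return data.split(b"\r\n")[:-1]


def corrupt(data, index, value):
    """Return a copy of data with one byte replaced."""
    damaged = bytearray(data)
    damaged[index] = value
    return bytes(damaged)


if __name__ == "__main__":
    records = [b"alpha", b"beta", b"gamma"]
    # Damage the first length byte: every later record is misread.
    print(decode_length_prefixed(corrupt(encode_length_prefixed(records), 0, 3)))
    # Damage the first CR: two records merge, but the parser resynchronizes.
    print(decode_crlf(corrupt(encode_crlf(records), 5, ord("X"))))
```

With the damaged length byte, the decoder loses its place and never finds the later records. With the damaged separator, the first two records run together, but the next intact CRLF puts the decoder back on track and the last record comes through unharmed.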

One more example. In XML, closing tags are verbose. We use </tag> where, for example, </> would suffice. Because we use verbose closing tags, it's possible to write a clever parser that can recover from certain errors, like a missing closing tag.
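As a sketch of what such a clever parser might do, the Python function below walks through the tags, keeps a stack of open elements, and uses the name in each closing tag to work out which closing tags are missing. It is a toy of my own, not how NewsGator or any real parser works: it does not handle comments, CDATA sections, or attribute values that contain '>'.

```python
import re

# Matches an opening, closing, or self-closing tag and captures its name.
TAG = re.compile(r"<(/?)([A-Za-z][\w.-]*)[^<>]*?(/?)>")


def repair(markup):
    """Insert missing closing tags where a named closing tag reveals them."""
    out, stack, pos = [], [], 0
    for m in TAG.finditer(markup):
        out.append(markup[pos:m.start()])  # text between tags, unchanged
        pos = m.end()
        closing = m.group(1) == "/"
        name = m.group(2)
        self_closing = m.group(3) == "/"
        if not closing:
            if not self_closing:
                stack.append(name)
            out.append(m.group(0))
        elif name in stack:
            # Close any elements left open inside the one being closed.
            while stack[-1] != name:
                out.append("</%s>" % stack.pop())
            stack.pop()
            out.append(m.group(0))
        # A closing tag with no matching open element is dropped.
    out.append(markup[pos:])
    # Close whatever is still open at the end of the input.
    while stack:
        out.append("</%s>" % stack.pop())
    return "".join(out)
```

Given "<entry><title>Hello</entry>", the function sees that "entry" is being closed while "title" is still open, and inserts the missing "</title>" first. If the closing tags were all written as </>, there would be no name to compare against the stack, and the error would go undetected.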

So, what's all this have to do with the controversy over the handling of badly formed XML? A simple observation: people will write forgiving XML parsers because they can.

Posted by Doug Sauder at 07:26 AM | permalink

January 13, 2004

What's more fun than blogging?

Some of my blogging time lately has been diverted to watching Friends, in German, on my PC. It's more fun than blogging. And it's more fun than I expected it to be.

Before December, I don't think I had ever watched an entire episode of Friends. I had watched enough only to recognize the main characters. But for a Christmas present, I asked for, and got, a DVD collection of the complete season 6 of Friends in German. It's PAL video format, and it's Region 2 encoding on the DVD, but with a $50 DVD player permanently set to Region 2, watching the show on a PC is no problem.

So, is this a good way to improve one's conversational German? Who cares!? It's fun! This is way better than just listening to audio in the car.

If you want to learn a foreign language, I highly recommend watching videos like this!

And now, if you will excuse me, I'm going to watch Friends.

Posted by Doug Sauder at 10:13 PM | permalink

Domain keys in the news again

Yahoo hasn't released the details of its Domain Keys (DK) plan for signing email messages. But there are a couple of articles online. At eWeek, Larry Seltzer is skeptical. He says:

Something of this magnitude isn't done unless it's really, really necessary. And (this is important) you absolutely have to get it right the first time.

Whether it's really, really necessary is hard to say. But he's right on about the need to get it absolutely right. You can't move millions of people to an updated email system, and then say "oops!"

Larry questions whether DK will stop the spam problem. We know it won't.

However, being able to confirm, with reasonable certainty, the mail domain that a message originated from does have value not directly related to spam. It could help to prevent "Joe Jobs," a situation where an innocent user gets flooded with returned mail messages or hate-mail replies because a spammer forged the return address. It's not a good idea to send a returned mail message from a spam filter because of Joe Jobs. But it's not a good idea to not send a returned mail message either, because sometimes spam filters catch the mail of innocent senders. Either way, returned mail notification, or no returned mail notification, innocent people are wronged. A good solution would be to send a returned mail message if the sender's domain is confirmed through something like DK.

To digress slightly, maybe sending returned mail messages is the wrong approach. Instead of sending these negative acknowledgements, we should be sending positive acknowledgements when mail is successfully received. But that's another discussion for another time.

A second article is at BusinessWeek Online. Here's an interesting quote:

A unilateral move from a powerful commercial entity such as Yahoo, however, threatens to overtake the Internet's governing bodies and could effectively cede control of e-mail technology standards to the mammoth ISPs.

I don't quite agree with the idea expressed. Many of the IETF's approved standards originated when someone created an implementation, then went to the IETF with a draft, which resulted in the formation of a working group, which eventually led to the publishing of an RFC. If Yahoo submits working code and a draft specification, how is that any different?

One final thought about DK. Those of us who are accustomed to using open source software probably see an upgrade to DK as not much of an issue. However, for enterprises that married themselves to Microsoft Exchange, Lotus Notes, or Novell Groupwise, an upgrade to DK may cost them some real money. That would be a significant barrier to broad adoption.

Posted by Doug Sauder at 10:00 PM | permalink

January 07, 2004

The Value of the Whois Service

Wendy Seltzer reports, in an online article, that ICANN is considering limiting access to registrant information. The problem is that marketers mine the domain name registration information, then use that information for abusive marketing. One of the possible solutions being considered is to close TCP port 43, thereby completely disabling the whois service.

I get lots of spam at email addresses I have used in domain name registration. I know this problem first hand.

However, the domain name registration information could be useful in blocking spam. I have recently been looking at the domain names used by spammers in the URLs they send. In a very large percentage of cases, the domain name has been very recently registered. Typically, the domain name was registered only a couple of days ago. Most spammers register a new domain name for each campaign, because to do otherwise would allow domain names to be easily blacklisted. I believe one could use a domain name blacklist, together with the domain name registration information, to put up another barrier to spam.
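Here is a rough Python sketch of that barrier. It pulls the host names out of the URLs in a message body and flags any that are blacklisted or were registered very recently. The function names, the seven-day threshold, and the dictionary of registration dates are all assumptions for the example. A real filter would get the dates from whois, and would have to map each host name to its registered domain, which this sketch skips.

```python
import re
from datetime import date, timedelta
from urllib.parse import urlsplit

# Matches http and https URLs up to the next space, quote, or angle bracket.
URL = re.compile(r"https?://[^\s<>\"']+", re.IGNORECASE)


def url_hosts(message_body):
    """Return the set of lowercase host names found in URLs in the body."""
    hosts = set()
    for url in URL.findall(message_body):
        host = urlsplit(url).hostname
        if host:
            hosts.add(host.lower().rstrip("."))
    return hosts


def suspicious_hosts(message_body, registration_dates, blacklist, today,
                     min_age=timedelta(days=7)):
    """Flag hosts that are blacklisted or registered less than min_age ago.

    registration_dates maps host name to registration date (hypothetical
    data here); hosts with no known date are judged on the blacklist alone.
    """
    flagged = set()
    for host in url_hosts(message_body):
        registered = registration_dates.get(host)
        too_new = registered is not None and today - registered < min_age
        if host in blacklist or too_new:
            flagged.add(host)
    return flagged
```

A message whose only URL points at a domain registered two days ago would be flagged even though that domain has not yet had time to land on any blacklist.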

Posted by Doug Sauder at 07:55 AM | permalink