Because domain names are spelled with Latin characters, the WWW is not truly internationalized, right? Casual thinking says yes. But I disagree.
It's not that I don't understand why China's billion-plus people would want to use familiar names for web sites in their own language. So let me explain a few points that perhaps we ought to consider.
We managed to get by for many decades using phone numbers, which comprise only digits. Phone numbers are not friendly, but they work. The great thing about phone numbers is that it's easy to give one out to someone else over the phone. Let's call this the "phone test": Can you easily give this "name" or "address" out over the phone?
The Internet introduces new addresses that contain not just digits, but also Latin letters and some punctuation marks. As an added benefit, many Internet addresses are pronounceable. Thus, we can say ebay.com or yahoo.com, and not just spell it. Web addresses, in this regard, appear to be a big improvement over phone numbers.
Or are they?
If you are Amazon.com, or drugstore.com, or the owner of a "nice" domain name, you have good reason to be happy. However, there are millions who will never be able to get a "nice" name. This is especially true for email addresses or other online names such as AOL screen names. Is doug3390 a "nice" name? At some point we start to run out of "nice" names. And many Internet names -- think email addresses -- are not particularly good under the "phone test" criterion.
So, what about internationalized domain names? What about a domain name <Chinese word for "weather">.cn? The Chinese language is not English. In fact, the Chinese language is about as different from English as can possibly be. Note that you do not "spell" Chinese words: you write them. If two Chinese speakers talk on the phone, one cannot "spell" a name for the other one. The reason is simple: there are far too many Chinese characters to assign names to them all. It's this feature of the Chinese language that makes Chinese Internet names difficult. Because you cannot spell Chinese words, that limits useful names to combinations of words that the majority of educated Chinese citizens know how to write.
Who wants friendly WWW names in languages other than English? Marketers. Marketers in China want their web address to be <Chinese company name>.cn. This is especially true of the big companies with names that are household words in China. The average educated Chinese citizen knows how to write that company's name, and a friendly web name makes it easy to get to the web site. For lesser-known names, the benefits of friendly Chinese names are more dubious. Consider email addresses: with many millions of Chinese having the name Yau, there can be only one lucky person who gets the address <Chinese character for Yau>@hanmail.net. The other Yaus might do just as well to have an email address that is all digits: at least it easily passes the "phone test."
My opinion is that we should leave the DNS as it currently is, with names that use Latin characters and Arabic numerals. These characters are universal. Yes, these characters are used daily in Chinese newspapers. Because these characters have names in every language, names constructed from these characters can always be spelled, and they can therefore pass the "phone test." Current Internet names will just work, just like telephone numbers just work. The fact that some Internet names happen to be pronounceable in English and a few other Western languages is a bonus, but nothing more. If allowing Latin characters seems too Western, why not fall back to Arabic numerals, so that Internet addresses become like phone numbers?
There is a better solution to "friendly" names: directory services. RealNames would have been the perfect solution. The Chinese government could take some leadership here and create their own RealNames-like service for their people, just as the US government took the lead in creating the Internet. The directory service could be layered on top of the DNS. The result would be that ordinary Chinese citizens could easily get to popular web sites using friendly names, and the rest of the world could get to the same web sites by URIs that can be spelled in any language.
On the other hand, if we allow Internet names to use Unicode characters, we run the risk of the Internet becoming more provincial. Consider the name <some chinese word>.cn. There is very little chance that a person who does not know Chinese will ever be able to type that name. Keeping the DNS as it currently is keeps the number of characters to a minimum, and maintains a lowest common denominator for the entire world.
For a different viewpoint, news.com has this article: Is the Internet truly global? As a simple rebuttal, let me say that making the Internet and the WWW more friendly to everyday people is a worthy goal. But there is a better way to do it than an overhaul of the DNS. The question really comes down to this: How many levels of indirection make sense? Resolving a DNS name leads to an IP address. DNS names are perhaps friendlier than IP addresses, but DNS names also serve a few entirely different purposes, like redundancy, stability, load sharing, and more. These purposes are at a lower level than "friendliness," and I believe that argues for yet another level of indirection. So, rather than overhaul the DNS, I believe it makes sense to have a friendly name that resolves to a DNS name that resolves to an IP address -- two levels of indirection. I get the feeling that the push for IDNs comes from marketing types who care less about technical issues than about marketing issues.
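To make the two levels of indirection concrete, here is a minimal sketch in Python. Everything in it is made up for illustration: the DIRECTORY and DNS_TABLE dictionaries, the function names, and the example host names and addresses are my assumptions, not part of any real directory service, and a plain dictionary stands in for an actual DNS lookup.

```python
# Level 1 (hypothetical directory service): friendly name -> DNS name.
# The entries are invented; a real service would be a shared database.
DIRECTORY = {
    "天气": "weather.example",            # "weather" in Chinese
    "Example Bookstore": "books.example",
}

# Level 2 (stand-in for the DNS): DNS name -> IP address.
# The addresses come from the 192.0.2.0/24 documentation range.
DNS_TABLE = {
    "weather.example": "192.0.2.10",
    "books.example": "192.0.2.20",
}


def lookup_friendly(name):
    """Return the DNS name registered for a friendly name, or None."""
    return DIRECTORY.get(name.strip())


def resolve_dns(host):
    """Return the IP address for a DNS name, or None if unknown."""
    return DNS_TABLE.get(host.lower())


def resolve(name):
    """Follow both levels: friendly name -> DNS name -> IP address."""
    host = lookup_friendly(name)
    if host is None:
        return None
    return resolve_dns(host)


if __name__ == "__main__":
    for friendly in ("天气", "Example Bookstore", "no such name"):
        print(friendly, "->", lookup_friendly(friendly), "->", resolve(friendly))
```

The point of the sketch is that the friendly layer can change, or be replaced by a different directory for a different language, without touching the DNS layer underneath it. Someone who cannot type the friendly name can still use the DNS name directly.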
If your native language is not English -- especially, if your native language is based on characters other than Latin characters -- I would be interested in hearing your viewpoint on this issue. Write me: doug /dot/ sauder /at/ ieee /dot/ org.
Not so many years ago we would pay quite a bit more for a home computer. Example: In 1997 I bought a Pentium Pro 200 for something close to $3000. When you consider how technology was changing at the time, it seems like a lot of money. And it was. Let's suppose you bought a computer at that time for $2400. You were lucky if it had a useful lifetime of four years, or forty-eight months. That works out to $50 per month. All that just to be able to read email, to browse the web, and to use Microsoft Word and Excel to work at home.
Fast forward to 2005. Computers are less expensive. Needs change, too. I'm considering the options for getting a host with a public IP address. One of the options is Linode, which offers a shared virtual Linux host with a public IP address. The prices seem reasonable, compared to what we paid for a PC not so many years ago. For $20 a month, one could run a small web site, a mail server, a teredo server, a STUN server, or other geeky applications.
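The comparison above is simple arithmetic, and a few lines of Python spell it out. The figures are the ones from this post ($2400 over forty-eight months, versus $20 a month for the hosted option); the function name monthly_cost is just something I made up for the sketch.

```python
def monthly_cost(price_dollars, lifetime_months):
    """Spread a one-time purchase price evenly over its useful lifetime."""
    return price_dollars / lifetime_months


# A $2400 PC with a useful lifetime of four years (48 months).
pc_per_month = monthly_cost(2400, 48)

# The hosted virtual Linux box at the quoted $20 per month.
host_per_month = 20

if __name__ == "__main__":
    print(f"PC:   ${pc_per_month:.2f} per month")
    print(f"Host: ${host_per_month:.2f} per month")
    print(f"The PC cost {pc_per_month / host_per_month:.1f} times as much per month")
```

This ignores electricity, the Internet connection, and the fact that the two machines do different jobs, so take it as a rough yardstick and nothing more.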
I just finished reading Peer-to-Peer: Harnessing the Power of Disruptive Technologies. I hope to say more about this book in a future blog post.
The book got me thinking about identity and reputation. These two concepts form the basis for commerce. Because identity is so important, I decided that I ought to solidify my own online identity. I'll start by choosing a public email address: doug /dot/ sauder /at/ IEEE /dot/ org. While this email address is virtually unpublished at the moment, it will eventually be overrun with spam. That's okay. I plan to use a spam filter to toss all spam into the trash. I do not plan to scan any messages marked as spam: I can't afford the time to do so. Still, I believe anyone who wants to reach me via that email address will be able to. Getting past the spam filter is easy: don't include any images, don't send HTML text, mention words that are likely to be interesting to me, etc. Another tip: make the subject line relevant.
Just when I was about to conclude that there would never be a NetBeans or Eclipse for C# development on Windows (not Mono), I learn about #develop (pronounced "sharp develop"). Why would anyone care? Well, an install of Visual Studio .NET 2002 requires 3 GB of space on your hard drive. That's not something you can afford on a laptop that is several years old!
I do pretty well using emacs to edit code. But for debugging, command line tools are not productive. I'm a step-through-the-code kind of guy. I'm definitely not a debug-by-print-statement kind of guy. If only I could just write bug-free code, I wouldn't need to worry about debugging!