If you follow me on Twitter, you will have seen me rant about an alpha version of some piece of software on my phone. While I will not be naming it here, because that is just not the point, it is clearly the reason for this blog post.
The application - let's call it FizzBuzz just to have a generic name - looks good, and seems to work okay (great, if you consider that it's an alpha version) except for one thing that for me made it totally useless. FizzBuzz contacts an internet service through (I think) a REST API. And at least one of the parameters involved is a decimal number. When I used the version of the FizzBuzz application that sparked this post, all I got was a fatal exception message box after which the application was killed, which in itself was fine.
So some application crashed. What gives?
What's not fine however is the fact that the problem that caused the crash was a FizzBuzzException with a message a bit like "Invalid attribute 52,0987654321". The original FizzBuzz API no doubt accepts its decimal values with a dot. I live in the Netherlands however, so my phone is set to the (for me) correct local settings, which include a decimal comma. So, what went wrong?
Simple: the creator of the FizzBuzz phone application seems to have been working from a faulty assumption: in the .NET framework decimal.ToString() always creates its results "the same way I'm used to getting them". And that is plainly wrong. The whole question of globalization is just completely ignored then!
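To make that concrete, here is a minimal sketch of the effect. It is not the FizzBuzz code (which I have not seen); the helper name FormatUnder is made up for illustration, and it assumes the runtime has culture data for en-US and nl-NL available.

```csharp
using System;
using System.Globalization;
using System.Threading;

static class CultureDemo
{
    // Hypothetical helper: returns what a parameterless decimal.ToString()
    // produces while the given culture is the thread's current culture.
    public static string FormatUnder(string cultureName, decimal value)
    {
        CultureInfo previous = Thread.CurrentThread.CurrentCulture;
        try
        {
            Thread.CurrentThread.CurrentCulture = CultureInfo.GetCultureInfo(cultureName);
            return value.ToString(); // no culture given: the current one is used
        }
        finally
        {
            Thread.CurrentThread.CurrentCulture = previous; // always restore
        }
    }

    static void Main()
    {
        decimal value = 52.0987654321m; // sample value taken from the error message
        Console.WriteLine(FormatUnder("en-US", value)); // 52.0987654321
        Console.WriteLine(FormatUnder("nl-NL", value)); // 52,0987654321
    }
}
```

The same line of code, value.ToString(), gives two different strings depending on nothing but the device settings.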
Globalization? Oh no, not that!
Don't get me wrong here. Globalization is a royal pain in the ass. So, I'm not saying that, just because the current .NET framework seems to be able to cope just fine with globalization in almost every way, each and every application written using it should pass the complete Turkey test. Which by the way is in no particular way meant to be negative about Turkey in particular. ;-)
Also, this post is not about the FizzBuzz phone application showing times in 12-hour time (without AM/PM indication even) and dates in M/DD numerical format. I sure would like it very much if it would show this to me in 24-hour time and D-M format (either by making use of the current Culture settings, or by letting the user choose among a number of options in its Settings), but that is again not my point here.
What are you saying, then?
I just think there are minimum levels of "cultural awareness" for each and every .NET programmer. And knowing that ToString() (and anything directly or indirectly using it) is dependent on the current culture is part of that. Not just getting an "aha" feeling when someone (like me) tells you about this, but actually applying it in your daily work.
What really helps with this is static code analysis: if you turn on rules CA1304 and CA1305 it almost becomes impossible to forget about these things. Especially if you turn on the Treat warnings as errors option, which might be a bit much for most.
Another basic that every .NET developer should be aware of is the invariant culture and when to use it.
Every time you are converting a value to a string for the purpose of showing it to the human user of your application, the default behaviour is usually fine. Normally the .NET Framework will silently use the current culture, which follows the regional settings made on the device, and everything should be peachy.
However, when you are creating a string specifically for use in an API to other software - as when the FizzBuzz application was converting decimal values to strings - you normally need a specific format. Or, to be precise: the API you use expects a specific format. And since most software development has historically been English-based, you'd better use dots in your text-formatted decimal numbers, unless specified otherwise. For this, the invariant culture is perfect.
For some small bits of code then. When dealing with API use (not GUI to human user use) I often put the following in string related classes:
    using System.Globalization;
    // ...
    class SomeApiStringHandling
    {
        private static CultureInfo invariant = CultureInfo.InvariantCulture;
        // ...
        void SomeMethodUsingStringFormat()
        {
            string.Format(invariant, "Format", data);
            // instead of the usual string.Format("Format", data);
        }
        // ...
    }
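As a slightly more complete sketch of the same idea: the class below builds a query-string fragment for a made-up "latitude" parameter. Both the parameter name and the helper BuildQuery are assumptions for illustration only, not part of any real API.

```csharp
using System;
using System.Globalization;

static class ApiFormatting
{
    // Hypothetical helper: formats a decimal for a dot-expecting API.
    // Passing the invariant culture makes the result independent of
    // whatever regional settings the device happens to have.
    public static string BuildQuery(decimal latitude)
    {
        return string.Format(CultureInfo.InvariantCulture, "latitude={0}", latitude);
    }

    static void Main()
    {
        Console.WriteLine(BuildQuery(52.0987654321m)); // latitude=52.0987654321
    }
}
```

Calling latitude.ToString(CultureInfo.InvariantCulture) directly works just as well when no surrounding format string is needed.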
Also, don't forget about string.ToLowerInvariant() as opposed to the much better known string.ToLower() (and the same goes for the upper case variants) in situations where use of the invariant culture matters.
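A small sketch of where this bites, using the classic Turkish casing example. The keyword "title" and the helper IsTitleKeyword are made up for illustration, and the first line of output depends on the runtime having Turkish culture data available.

```csharp
using System;
using System.Globalization;

static class CasingDemo
{
    // Hypothetical check: does a protocol keyword match "title", ignoring case?
    // ToLowerInvariant() does not depend on the current culture.
    public static bool IsTitleKeyword(string input)
    {
        return input.ToLowerInvariant() == "title";
    }

    static void Main()
    {
        CultureInfo turkish = CultureInfo.GetCultureInfo("tr-TR");

        // Under Turkish casing rules 'I' normally lowercases to a dotless i,
        // so this comparison typically prints False.
        Console.WriteLine("TITLE".ToLower(turkish) == "title");

        // The invariant variant is the same on every device.
        Console.WriteLine(IsTitleKeyword("TITLE")); // True
    }
}
```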
Conclusion
- Please adjust your basic assumptions (if you haven't already): assume the current culture comes into play when converting to/from strings at all times.
- When working with strings for software-based protocols, where your "old assumptions" should hold, simply specify the invariant culture explicitly.
- Only if you have a compelling need to (or when it interests you) should you actually read up further on the whole globalization situation, including the Turkey test, date/time/currency formats, multiple resource files for different cultures, etc.
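The "from strings" direction deserves the same care. Below is a minimal sketch, with a made-up helper name ParseWireDecimal, of reading a dot-formatted decimal that arrived from some protocol. Without the explicit culture, a device set to a comma culture may either reject the text or quietly read the dot as a group separator and hand you a very different number.

```csharp
using System;
using System.Globalization;

static class ApiParsing
{
    // Hypothetical helper: parses a decimal received from a dot-based protocol.
    // NumberStyles.Float allows a sign, a decimal point and an exponent,
    // but no group separators, so "1,5" is rejected instead of misread.
    public static decimal ParseWireDecimal(string text)
    {
        return decimal.Parse(text, NumberStyles.Float, CultureInfo.InvariantCulture);
    }

    static void Main()
    {
        Console.WriteLine(ParseWireDecimal("52.0987654321") == 52.0987654321m); // True
    }
}
```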
Oh, and if any one of you still thinks that one char equals one byte, let me know. Perhaps I can then start talking about Unicode and other character encodings a bit, alright? Whole different can of worms entirely... ;-)
P.S. To be fair, by the time this blog post is actually published, the FizzBuzz application that prompted it has corrected the problem and is now working fine. I just wanted to mention this explicitly.
Could the world cup application at http://www.mobilepractices.com/2010/06/world-cup-2010-application-for-windows.html perhaps have fallen victim to unintentional "culture bias" as well...?
That app looks stunning by the way. I will be downloading it, even though the soccer world cup means next to nothing to me. At least I'll know when the streets will be quiet the next couple of weeks with the schedule on my phone. ;-)
Did the Windows Phone 7 Marketplace team somehow fall into this trap as well...?
http://twitter.com/peSHIr/status/40664734617305088
http://twitter.com/peSHIr/status/40666699921506304
Speaking about Unicode: http://www.joelonsoftware.com/articles/Unicode.html