Thursday, February 11, 2010

Cultural barbarism in .NET

If you follow me on Twitter, you will have seen me rant about an alpha version of some piece of software on my phone. While I will not be naming it here, because that is just not the point, it is clearly the reason for this blog post.

The application - let's call it FizzBuzz just to have a generic name - looks good, and seems to work okay (great, if you consider that it's an alpha version) except for one thing that for me made it totally useless. FizzBuzz contacts an internet service through (I think) a REST API. And at least one of the parameters involved is a decimal number. When I used the version of the FizzBuzz application that sparked this post, all I got was a fatal exception message box, after which the application was killed, which in itself was fine.

So some application crashed. What gives?

What's not fine however is the fact that the problem that caused the crash was a FizzBuzzException with a message a bit like "Invalid attribute 52,0987654321". The original FizzBuzz API no doubt accepts its decimal values with a dot. I live in the Netherlands however, so my phone is set to the (for me) correct local settings, which include a decimal comma. So, what went wrong?

Simple: the creator of the FizzBuzz phone application seems to have been working from a faulty assumption: in the .NET framework decimal.ToString() always creates its results "the same way I'm used to getting them". And that is plainly wrong. The whole question of globalization is just completely ignored then!
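A minimal sketch of what I suspect happened, not the actual FizzBuzz code. The class and method names below are made up for illustration. The first helper formats a decimal for a named culture, which is also what the parameterless decimal.ToString() does with whatever the current culture happens to be. The second one ignores the device settings completely.

```csharp
using System.Globalization;

// Hypothetical demo class, not taken from any real application.
static class DecimalFormattingDemo
{
    // Formats the value the way a device set to the given culture would.
    public static string FormatFor(decimal value, string cultureName)
    {
        CultureInfo culture = CultureInfo.GetCultureInfo(cultureName);
        return value.ToString(culture);
    }

    // Formats the value independently of any device settings.
    public static string FormatInvariant(decimal value)
    {
        return value.ToString(CultureInfo.InvariantCulture);
    }
}
```

With this, DecimalFormattingDemo.FormatFor(52.0987654321m, "nl-NL") gives "52,0987654321", while FormatFor(52.0987654321m, "en-US") and FormatInvariant(52.0987654321m) both give "52.0987654321". This assumes the runtime has its normal culture data available.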

Globalization? Oh no, not that!

Don't get me wrong here. Globalization is a royal pain in the ass. So, I'm not saying that, just because the current .NET framework seems to be able to cope just fine with globalization in almost every way, each and every application written using it should pass the complete Turkey test. Which by the way is in no way meant to be negative about Turkey in particular. ;-)

Also, this post is not about the FizzBuzz phone application showing times in 12-hour time (without AM/PM indication even) and dates in M/DD numerical format. I sure would like it very much if it would show this to me in 24-hour time and D-M format (either by making use of the current Culture settings, or by letting the user choose among a number of options in its Settings), but that is again not my point here.

What are you saying, then?

I just think there are minimum levels of "cultural awareness" for each and every .NET programmer. And knowing that ToString() on numbers and dates (and anything directly or indirectly using it) depends on the current culture is part of that. Not just getting an "aha" feeling when someone (like me) tells you about this, but actually applying it in your daily work.

What really helps with this is static code analysis: if you turn on rules CA1304 and CA1305 it almost becomes impossible to forget about these things. Especially if you turn on the Treat warnings as errors option, which might be a bit much for most.
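As a sketch of how that could look in a current SDK-style project (an assumption on my part; the older Visual Studio Code Analysis setup used rule set files instead), the two rules can be raised to errors from an .editorconfig file, without turning every other warning into an error:

```ini
# .editorconfig at the root of the solution
[*.cs]
# CA1304: Specify CultureInfo
dotnet_diagnostic.CA1304.severity = error
# CA1305: Specify IFormatProvider
dotnet_diagnostic.CA1305.severity = error
```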

Another basic that every .NET developer should be aware of is the invariant culture and when to use it.

Every time you are converting a value to a string for the purpose of showing it to the human user of your application, relying on the default behavior is usually fine. Normally the .NET Framework will silently use the current culture, which follows the regional settings made on the device, and everything should be peachy.

However, when you are creating a string for the specific use in an API to other software - like in the situation where the FizzBuzz application was converting decimal values to strings - you normally need a specific format. Or to be correct: the API you use expects a specific format. And since most software development has historically been English based, you'd better use dots in your text-formatted decimal numbers, unless specified otherwise. For this, the invariant culture is perfect.

For some small bits of code then. When dealing with API use (not GUI to human user use) I often put the following in string related classes:

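using System.Globalization;

// ...

class SomeApiStringHandling {

   // Cached once so every call in this class formats the same way.
   private static readonly CultureInfo invariant = CultureInfo.InvariantCulture;

   // ...

   string SomeMethodUsingStringFormat(decimal data) {
      // Culture independent: always formats with a decimal dot.
      return string.Format(invariant, "{0}", data);
      // instead of the usual, which silently uses the current culture:
      // return string.Format("{0}", data);
   }

   // ...
}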
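
To make the pattern a bit more concrete, here is a small sketch that covers both directions: writing a decimal for an API and reading one back. All names, including the "lat" parameter, are hypothetical and only there for illustration. I use NumberStyles.Float on the parsing side so that group separators are rejected rather than silently accepted.

```csharp
using System.Globalization;

// Hypothetical helper class for talking to a dot-based API.
static class ApiNumbers
{
    // Turns a decimal into the dot-separated text the API expects.
    public static string ToApiString(decimal value)
    {
        return value.ToString(CultureInfo.InvariantCulture);
    }

    // Reads a dot-separated decimal coming back from the API.
    public static decimal FromApiString(string text)
    {
        return decimal.Parse(text, NumberStyles.Float, CultureInfo.InvariantCulture);
    }

    // Builds a query string fragment; "lat" is a made-up parameter name.
    public static string BuildQuery(decimal latitude)
    {
        return string.Format(CultureInfo.InvariantCulture, "lat={0}", latitude);
    }
}
```

ApiNumbers.BuildQuery(52.0987654321m) gives "lat=52.0987654321" no matter how the phone is configured, and FromApiString("52.0987654321") gives back the same decimal. Parsing with the current culture instead can misread or reject that dot, depending on the settings.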

Also, don't forget about string.ToLowerInvariant() as opposed to the much better known string.ToLower() (and the same goes for the upper case variants) in situations where use of the invariant culture matters.
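
A short sketch of why this matters, using Turkish as the classic example. The class and method names are made up. In the Turkish culture an upper case "I" lower-cases to a dotless "ı", which is a nasty surprise if the result is used as a key or a protocol keyword.

```csharp
using System.Globalization;

// Hypothetical helper class for illustration only.
static class KeyNormalization
{
    // Lower-cases a key the same way on every device.
    public static string NormalizeKey(string key)
    {
        return key.ToLowerInvariant();
    }

    // Lower-cases text following the rules of the given culture.
    public static string LowerFor(string text, string cultureName)
    {
        return text.ToLower(CultureInfo.GetCultureInfo(cultureName));
    }
}
```

KeyNormalization.NormalizeKey("TITLE") gives "title", while KeyNormalization.LowerFor("TITLE", "tr-TR") gives "tıtle" with a dotless ı, again assuming normal culture data is available on the machine.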

Conclusion

  • Please adjust your basic assumptions (if you haven't already): assume the current culture comes into play when converting to/from strings at all times.
  • When working with strings in relation to software-based protocols, where your "old assumptions" should hold, simply specify the invariant culture explicitly.
  • Only if you have a compelling need to (or when it interests you) should you actually read up further on the whole globalization situation, including the Turkey test, date/time/currency formats, multiple resource files for different cultures, etc.

Oh, and if any one of you still thinks that one char equals one byte, let me know. Perhaps I can then start talking about Unicode and other character encodings a bit, alright? Whole different can of worms entirely... ;-)

P.S. To be fair, by the time this blog post is actually published, the problem in the FizzBuzz application that made me write this post has been corrected and it is now working fine. Just wanted to have this mentioned explicitly.

Wednesday, February 3, 2010

My old (make that ancient) website is going down...

If you read this blog and generally follow me online, or did so some time ago, chances are http://www.jarno.demon.nl/ has come up on your screen at some point.

This is the website I started working on when I was at university. This was somewhere in 1996 I think, judging from the copyright message. I wrote an MS-DOS console application myself in some compilable BASIC dialect to create and maintain the site. This application, called JHS2HTML, would take *.jhs files and some general header/footer type information and create an HTML file for each, in essence giving me the website to then go FTP somewhere. The tool had dynamic placeholders for things like the current date or year, easy linking to other *.jhs files (creating mouseover links) and macros for things like the UPDATED images that were all in fashion at the time. Oh yes, this 'sophisticated CMS' included style sheets and master pages and all that. Sort of. <grin>

Let's face it, this website is old, outdated and frankly a bit (lot?) embarrassing even. But still, I never took it down or moved it and I'm still paying the ISP I had at the time just so it remains. That, and because I'm a lazy administrator, and never actually canceled. This is now going to change.

I think I keep getting some job offer calls from the CV on the site. And there are also still things on there that are difficult or impossible to find elsewhere on the web, like the ghost website of RUN Flagazine, information on the HP82240B printer (although I think this is also mirrored in the HP48 FAQ) and my master's thesis.

So, after reading about HTTrack on POKE 53280,0 I am now dragging all the content down so I can store it in my Jungledisk account somewhere so it isn't lost completely. And afterwards I will be taking my Demon account down. So expect my ancient website to soon vanish, at least from the current URL...