Case Sensitivity

The well-known-and-if-it-isn't-well-known-it-ought-to-be Guru of the Week discussions held on Usenet covered this topic in January of 1998. Briefly, the challenge was to write a 'ci_string' class which is identical to the standard 'string' class, but is case-insensitive in the same way as the (common but nonstandard) C function stricmp().

   ci_string s( "AbCdE" );

   // case insensitive
   assert( s == "abcde" );
   assert( s == "ABCDE" );

   // still case-preserving, of course
   assert( strcmp( s.c_str(), "AbCdE" ) == 0 );
   assert( strcmp( s.c_str(), "abcde" ) != 0 ); 

The solution is surprisingly easy. The original answer was posted on Usenet, and a revised version appears in Herb Sutter's book Exceptional C++ and on his website as GotW 29.

See? Told you it was easy!

Added June 2000: The May 2000 issue of C++ Report contains a fascinating article by Matt Austern (yes, the Matt Austern) on why case-insensitive comparisons are not as easy as they seem, and why creating a class is the wrong way to go about it in production code. (The GotW answer mentions one of the principal difficulties; his article mentions more.)

Basically, this is "easy" only if you ignore some things, things which may be too important to your program to ignore. (I chose to ignore them when originally writing this entry, and am surprised that nobody ever called me on it...) The GotW question and answer remain useful instructional tools, however.
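To make the "don't create a class" point concrete, here is a rough sketch of the alternative shape: leave std::string alone and compare through a function that consults a locale's ctype facet. The function name 'ci_equal' is invented for this example, and this is not code from the article. It still folds one char at a time, so it cannot cope with case mappings that change a string's length or with multibyte encodings.

```cpp
#include <locale>
#include <string>

// Hypothetical helper: true when a and b are equal after each char is
// folded to upper case using the ctype facet of the given locale.
inline bool ci_equal(const std::string& a, const std::string& b,
                     const std::locale& loc = std::locale())
{
    if (a.size() != b.size())
        return false;

    const std::ctype<char>& ct = std::use_facet<std::ctype<char> >(loc);
    for (std::string::size_type i = 0; i < a.size(); ++i)
        if (ct.toupper(a[i]) != ct.toupper(b[i]))
            return false;
    return true;
}
```

The design difference is that case-insensitivity becomes a property of one comparison, in one locale, rather than a property baked into the string's type.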

Added September 2000: James Kanze provided a link to a Unicode Technical Report discussing case handling, which provides some very good information.