
Internationalization in C++

Remarks:

The C++ language does not dictate any character set; some compilers may support UTF-8, or even UTF-16. However, there is no guarantee that anything beyond simple ASCII characters will be supported.

Thus all international language support is implementation-defined: it depends on the platform, operating system, and compiler in use.

Several third-party libraries (such as ICU, the International Components for Unicode) can be used to extend the international support of the platform.

Understanding C++ string characteristics

#include <cstring>   // std::strlen
#include <iostream>
#include <string>

int main()
{
    // Plain strings: every character is in the basic ASCII range.
    const char * C_String = "This is a line of text w";
    std::string Std_String("This is a second line of text w");

    // "Problem" strings: these contain characters outside the ASCII range.
    const char * C_Problem_String = "This is a line of text ኚ";
    std::string Std_Problem_String("This is a second line of ϯϵxϯ ኚ");

    // length() and strlen() both count char elements (bytes), not readable characters.
    std::cout << "String Length: " << Std_String.length() << '\n';
    std::cout << "String Length: " << Std_Problem_String.length() << '\n';

    std::cout << "CString Length: " << std::strlen(C_String) << '\n';
    std::cout << "CString Length: " << std::strlen(C_Problem_String) << '\n';
    return 0;
}

Depending on the platform (Windows, macOS, etc.) and compiler (GCC, MSVC, etc.), this program may fail to compile, report different lengths for the plain and extended strings, or report the same lengths for both.

Example output under the Microsoft MSVC compiler:

String Length: 31
String Length: 31
CString Length: 24
CString Length: 24

This shows that under MSVC, as configured for this test, each of the extended characters used was counted as a single "character". This does not by itself demonstrate full support for internationalised languages.
It should be noted that this behaviour is unusual: Unicode characters such as these normally occupy several bytes each, so a one-byte count suggests the compiler converted them to a single-byte character set, possibly losing information along the way. This may cause unexpected errors.

Under the GNU GCC compiler the program output is:

String Length: 31
String Length: 36
CString Length: 24
CString Length: 26

This example demonstrates that while the GCC compiler used on this (Linux) platform does support these extended characters, it also (correctly) uses several bytes to store each of them.
In this case the use of Unicode characters is possible, but the programmer must remember that the length of a "string" in this scenario is its length in bytes, not its length in readable characters.

These differences are due to how international languages are handled on a per-platform basis and, more importantly, to the fact that the C and C++ strings used in this example are simply arrays of bytes: for this usage, the C++ language considers a character (char) to be a single byte.
