In our program, we call std::setlocale(LC_ALL, ".UTF8") to enable UTF-8 encoding, as described in the Microsoft documentation. However, when certain C++ standard library functions fail, the strings returned by message() on the resulting error codes are not UTF-8 encoded.
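One way to confirm the symptom in a log pipeline is to test whether the bytes returned by message() are well-formed UTF-8. The following is a minimal sketch; is_valid_utf8 and message_is_utf8 are hypothetical helper names introduced here for illustration and are not part of the STL.

```cpp
#include <cstddef>
#include <string>
#include <system_error>

// Hypothetical helper: returns true if `s` is well-formed UTF-8.
// Rejects truncated sequences, overlong encodings, UTF-16 surrogates,
// and code points above U+10FFFF.
bool is_valid_utf8(const std::string& s) {
    const unsigned char* p = reinterpret_cast<const unsigned char*>(s.data());
    const std::size_t n = s.size();
    // Smallest code point that legitimately needs a sequence of each length.
    static const unsigned long min_cp[] = {0, 0, 0x80, 0x800, 0x10000};

    std::size_t i = 0;
    while (i < n) {
        const unsigned char c = p[i];
        std::size_t len = 0;
        unsigned long cp = 0;
        if (c < 0x80) {
            len = 1;
            cp = c;
        } else if ((c & 0xE0) == 0xC0) {
            len = 2;
            cp = c & 0x1F;
        } else if ((c & 0xF0) == 0xE0) {
            len = 3;
            cp = c & 0x0F;
        } else if ((c & 0xF8) == 0xF0) {
            len = 4;
            cp = c & 0x07;
        } else {
            return false; // stray continuation byte or invalid lead byte
        }
        if (i + len > n) {
            return false; // sequence is cut off at the end of the string
        }
        for (std::size_t k = 1; k < len; ++k) {
            if ((p[i + k] & 0xC0) != 0x80) {
                return false; // expected a continuation byte
            }
            cp = (cp << 6) | (p[i + k] & 0x3F);
        }
        if (cp < min_cp[len]) {
            return false; // overlong encoding
        }
        if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)) {
            return false; // outside the Unicode scalar value range
        }
        i += len;
    }
    return true;
}

// Hypothetical helper: checks the message text of an error code.
bool message_is_utf8(const std::error_code& ec) {
    return is_valid_utf8(ec.message());
}
```

Calling message_is_utf8(std::error_code(2, std::system_category())) after the setlocale call is one way to check a given machine. Pure ASCII messages always pass, so the check only says something on systems whose messages contain non-ASCII characters.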
After investigating the source code, I found that the error messages are always formatted according to the system's default locale, as shown in this STL implementation:
```cpp
[[nodiscard]] size_t __CLRCALL_PURE_OR_STDCALL __std_system_error_allocate_message(
    const unsigned long _Message_id, char** const _Ptr_str) noexcept {
    // convert to name of Windows error, return 0 for failure, otherwise return number of chars in buffer
    // __std_system_error_deallocate_message should be called even if 0 is returned
    // pre: *_Ptr_str == nullptr
    DWORD _Lang_id;
    const int _Ret = GetLocaleInfoEx(LOCALE_NAME_SYSTEM_DEFAULT, LOCALE_ILANGUAGE | LOCALE_RETURN_NUMBER,
        reinterpret_cast<LPWSTR>(&_Lang_id), sizeof(_Lang_id) / sizeof(wchar_t));
    if (_Ret == 0) {
        _Lang_id = 0;
    }
    const unsigned long _Chars =
        FormatMessageA(FORMAT_MESSAGE_ALLOCATE_BUFFER | FORMAT_MESSAGE_FROM_SYSTEM | FORMAT_MESSAGE_IGNORE_INSERTS,
            nullptr, _Message_id, _Lang_id, reinterpret_cast<char*>(_Ptr_str), 0, nullptr);

    return _CSTD __std_get_string_size_without_trailing_whitespace(*_Ptr_str, _Chars);
}
```
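Until this is resolved, an application can fetch the text itself instead of relying on message(). The sketch below is a possible workaround, not the STL's behavior: it calls FormatMessageW and converts the UTF-16 result to UTF-8 with WideCharToMultiByte. The name system_message_utf8 is hypothetical, and passing 0 as the language ID (rather than the system default language the STL looks up) is an assumption made to keep the example short. On non-Windows platforms it simply falls back to std::system_category().

```cpp
#include <cstddef>
#include <string>
#include <system_error>

#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper (not part of the STL): returns the text for a system
// error code as UTF-8, independent of the process's ANSI code page.
std::string system_message_utf8(unsigned long message_id) {
    std::string utf8;
#ifdef _WIN32
    wchar_t* wide = nullptr;
    // Assumption: language ID 0 lets FormatMessageW use its default language
    // search order; the STL instead passes the system default language ID.
    const DWORD chars = FormatMessageW(
        FORMAT_MESSAGE_ALLOCATE_BUFFER | FORMAT_MESSAGE_FROM_SYSTEM | FORMAT_MESSAGE_IGNORE_INSERTS,
        nullptr, message_id, 0, reinterpret_cast<LPWSTR>(&wide), 0, nullptr);
    if (chars != 0 && wide != nullptr) {
        // First call computes the required size in bytes, second call converts.
        const int bytes = WideCharToMultiByte(
            CP_UTF8, 0, wide, static_cast<int>(chars), nullptr, 0, nullptr, nullptr);
        if (bytes > 0) {
            utf8.resize(static_cast<std::size_t>(bytes));
            WideCharToMultiByte(
                CP_UTF8, 0, wide, static_cast<int>(chars), &utf8[0], bytes, nullptr, nullptr);
        }
    }
    if (wide != nullptr) {
        LocalFree(wide); // buffer was allocated by FORMAT_MESSAGE_ALLOCATE_BUFFER
    }
#else
    // Fallback for other platforms so the sketch stays portable.
    utf8 = std::system_category().message(static_cast<int>(message_id));
#endif
    // System messages typically end with "\r\n"; strip trailing whitespace.
    while (!utf8.empty()) {
        const char last = utf8.back();
        if (last != '\r' && last != '\n' && last != ' ' && last != '\t') {
            break;
        }
        utf8.pop_back();
    }
    return utf8;
}
```

For an error_code ec whose category is std::system_category(), system_message_utf8(static_cast<unsigned long>(ec.value())) can then be logged in place of ec.message(). Codes from other categories still need ec.message(), because their values are not Windows error codes.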
This would be acceptable if the system locale used UTF-8 as its code page, because the messages could then be written to log files correctly, but that is not the case on our machines. The "Beta: Use Unicode UTF-8 for worldwide language support" option in the Region settings does produce correct output, but toggling it requires a reboot, which does not meet our requirements.
Question
Should the std::error_code::message function respect the locale set by the user (e.g., via std::setlocale), or should it continue to use the system's default locale? If the former, is there a plan to implement this behavior in the Microsoft STL implementation?
Related Information
- According to the Microsoft documentation, the locale applies to the whole program by default, unless per-thread locale is enabled via _configthreadlocale.
- The fact that system_error does not honor the current thread's locale is discussed in this issue.
Please let me know if you need any additional information or clarification.
Code referenced above: STL/stl/src/syserror_import_lib.cpp, lines 38 to 54 in e36ee6c.