A Conversion Story: Improving from_chars and to_chars in C++17

The Dome of The Pantheon, Rome

Note: You can get source code and examples on GitHub at https://github.com/ljestrada/charconvutils and watch my CppCon 2019 Lightning Challenge talk on YouTube here, and the Spanish version of this post here.

Nope, this is not about a religious experience, epiphany, change of faith or meeting my CMaker. Rather, it’s about C++.

C++17 sports two low-level character conversion functions, std::from_chars and std::to_chars, but their usage model can easily be improved using the power of templates.

from_chars and to_chars are appealing because they have a number of nice guarantees: locale-independent, non-allocating and non-throwing. Compare std::to_chars to std::to_string, which is locale-dependent and can throw bad_alloc. That said, the functions are in reality sets of overloaded functions for integer types, char, and floating point types (float, double and long double). Notice that bool is not supported, and neither are the other character types.

Nicolai Josuttis has excellent coverage and devotes a full chapter to these functions in his C++17: The Complete Guide eBook. There you can find examples for from_chars and to_chars using structured bindings and if with initialization, two other new additions to C++17. Moreover, you can reference the documentation at cppreference.com for a few examples.

The function signatures for the character conversion functions from_chars are as follows:

std::from_chars_result from_chars(
  const char* first, const char* last, /*see below*/& value,
  int base = 10);

std::from_chars_result from_chars(
  const char* first, const char* last, float& value, 
  std::chars_format fmt = std::chars_format::general);

std::from_chars_result from_chars(
  const char* first, const char* last, double& value,
  std::chars_format fmt = std::chars_format::general);

std::from_chars_result from_chars(
  const char* first, const char* last, long double& value,
  std::chars_format fmt = std::chars_format::general);

struct from_chars_result {
    const char* ptr;
    std::errc ec;
};  

The first signature is for integer types. The /*see below*/ comment refers to the integer types that are supported, as well as char (but, again, not bool or any of the other character types). The rest of the signatures are for floating point types. The structure from_chars_result is used to hold the result. Although the function signatures are not marked with noexcept, the documentation states that they do not throw: errors are communicated via the ec member in the from_chars_result structure. Anecdotally, Microsoft’s implementation “strengthens” the functions by marking them as noexcept.

Similarly, the function signatures for the character conversion functions to_chars are as follows:

std::to_chars_result to_chars(
  char* first, char* last, /*see below*/ value, int base = 10);

std::to_chars_result to_chars(char* first, char* last,
  float value);

std::to_chars_result to_chars(char* first, char* last,
  double value);

std::to_chars_result to_chars(char* first, char* last,
  long double value);

std::to_chars_result to_chars(char* first, char* last,
  float value, std::chars_format fmt);

std::to_chars_result to_chars(char* first, char* last,
  double value, std::chars_format fmt);

std::to_chars_result to_chars(char* first, char* last,
  long double value, std::chars_format fmt);

std::to_chars_result to_chars(char* first, char* last,
  float value, std::chars_format fmt, int precision);

std::to_chars_result to_chars(char* first, char* last,
  double value, std::chars_format fmt, int precision);

std::to_chars_result to_chars(char* first, char* last,
  long double value, std::chars_format fmt, int precision);

struct to_chars_result {
    char* ptr;
    std::errc ec;
}; 

The usage model requires that you always provide two pointers to char. Consider converting a character sequence to a floating point type using from_chars:

#include <charconv>

int main() {
  char a[] = "3.141592";
  float pi;
  std::from_chars_result res = std::from_chars(a, a+9, pi);
  if (res.ec != std::errc{} ) {
    // CONVERSION FAILED
  }
}

The overloaded function taking a float will be selected and the result will be placed in the variable pi.

Similarly, to turn a floating point variable into a character sequence using to_chars:

#include <charconv>

int main() {
  char a[10];
  float pi = 3.141592f;

  std::to_chars_result res = std::to_chars(a, a+10, pi);
  if (res.ec != std::errc{} ) {
    // CONVERSION FAILED
  }
}

See the documentation for the value of the ptr member upon success or failure.

While the usage model is consistent, one cannot help but notice that for cases where the array size is known, one must pass one pointer too many. What if we could use instead:

#include "charconvutils.hpp" // templates and pixie dust

using charconvutils::from_chars, charconvutils::to_chars;

int main() {
  char in[] = "3.141592";
  char out[50]; 
  float pi;

  // Look, ma, no tracking array size
  std::from_chars_result res1 = from_chars(in, pi);
  // The floating point to_chars templates take an explicit format
  std::to_chars_result   res2 = to_chars(out, pi,
                                         std::chars_format::general);
}

The usage model is simplified: you no longer need to keep track of the array size. Furthermore, you could use other classes such as std::array, std::string, std::string_view or std::vector. Interested? Read on.

Function templates to the rescue

By looking carefully at the function signatures, you can come up with the following observations:

  • All the functions take a pair of [const] char* pointers that constitute a valid range. In the case of from_chars, the range is read-only; in the case of to_chars, it is where the output will be placed.
  • The functions are overloaded for integer types, char and floating point types.

A first cut at a function template for from_chars could look like this (the ??? is information that will be provided later):

template<std::size_t N, typename T>
std::from_chars_result
from_chars(const char(&a)[N], T& value, ???)
{
  return std::from_chars(a, a+N, value, ???);
}

This initial stub takes advantage of template argument deduction to determine the array size. Because the array is passed by reference (that’s what the syntax const char(&a)[N] does), it won’t decay to a pointer, and that’s what we want.

Similarly, a first cut at a function template for to_chars could look like this:

template<std::size_t N, typename T>
std::to_chars_result
to_chars(char(&a)[N], T value, ???)
{
  return std::to_chars(a, a+N, value, ???);
}

Here’s a non-working version of the function templates using this approach, limited to from_chars for simplicity:

// For integral types
template<std::size_t N, typename T>
std::from_chars_result
from_chars(const char(&a)[N], T& value, int base = 10)
{
  return std::from_chars(a, a + N, value, base);
}

// For floating-point types
template<std::size_t N, typename T> 
std::from_chars_result
from_chars(const char(&a)[N], T& value,
           std::chars_format fmt = std::chars_format::general)
{
  return std::from_chars(a, a + N, value, fmt);
}

This version of the function templates somewhat works, but it breaks when you call it with a floating point type and no third argument, expecting the second function template to be selected. That is not what happens. Nothing in either signature ties the first template to integer types or the second to floating point types, so with two arguments both templates are candidates for any T, and the compiler cannot pick the one we intend. And if the first template were instantiated with a float, double or long double, its call to std::from_chars with a base argument would not match any overload.

Reversing the function template definitions–first the one for floating point types, then the one for integral types–does not work, either.

What we want is a way to instantiate the first function template when we pass an integer type, and the second when we pass a floating point type. What we need are templates and a little bit of pixie dust:

#include "charconvutils.hpp" // templates and pixie dust

using charconvutils::from_chars, charconvutils::to_chars;

int main() {
  char a[] = "365";
  int daysPerYear;

  // Instantiates a function template for integer types
  std::from_chars_result res1 = from_chars(a, daysPerYear);

  char b[] = "3.141592";
  float pi;

  // Instantiates a function template for floating point types
  std::from_chars_result res2 = from_chars(b, pi);
}

Of course, using the function with additional parameters would let the compiler pick the right function, too, such as providing a base for the integer types or fmt for the floating point types.

#include "charconvutils.hpp" // templates and pixie dust

using charconvutils::from_chars, charconvutils::to_chars;

int main() {
  char a[] = "365";
  int daysPerYear;

  // Instantiates a function template for integer types
  std::from_chars_result res1 = from_chars(a, daysPerYear, 10);

  char b[] = "3.141592";
  float pi;

  // Instantiates a function template for floating point types
  std::from_chars_result res2 = from_chars(b, pi,
                                           std::chars_format::general);
}

Enter enable_if

The solution is to use the type trait enable_if, available in the <type_traits> header, or rather its alias template enable_if_t. enable_if takes a compile-time condition and an optional type: when the condition is true it yields that type (void if none is given), and when the condition is false it yields no type at all.

The pixie dust behind enable_if is that when the condition is false, substituting the template arguments fails, and instead of reporting an error the compiler silently drops that template from the set of candidates (a rule known as SFINAE, “substitution failure is not an error”). Nifty. By using two other type traits, is_integral and is_floating_point, we define two helper alias templates that let us discriminate between integral types and floating point types and select the proper function template. A minor caveat is that is_integral includes bool and character types other than char, but for our purposes it should suffice; we can let the compiler reject those instantiations. Moreover, we will use the variable templates is_integral_v and is_floating_point_v, available since C++17, for simplification.

// Alias template for integer types
template<typename T>
using EnableIfIntegral = 
  std::enable_if_t<std::is_integral_v<T>>;

// Alias template for floating point types
template<typename T>
using EnableIfFloating =
  std::enable_if_t<std::is_floating_point_v<T>>;


NOTE: Integer or integral? The type trait std::is_integral encompasses bool, character types and integer types. The std::from_chars and std::to_chars character conversion functions consider only integer types, char and floating point types. Keep in mind that the alias template EnableIfIntegral accepts everything that std::is_integral accepts, including bool and the other character types.

With the alias templates handy, we define two function templates for the from_chars character conversion function (we will extend to types other than arrays of characters later):

template<std::size_t N, typename T, typename=EnableIfIntegral<T>>
std::from_chars_result
from_chars(const char(&a)[N], T& value, int base = 10)
{
  return std::from_chars(a, a + N, value, base);
}

template<std::size_t N, typename T, typename=EnableIfFloating<T>>
std::from_chars_result
from_chars(const char(&a)[N], T& value,
   std::chars_format fmt = std::chars_format::general)
{
  return std::from_chars(a, a + N, value, fmt);
}

Similarly, we define three function templates for the to_chars character conversion function. Notice that the function templates for floating types do not have a default format or precision:

template<std::size_t N, typename T, typename=EnableIfIntegral<T>>
std::to_chars_result
to_chars(char(&a)[N], T value, int base = 10)
{
  return std::to_chars(a, a + N, value, base);
}

template<std::size_t N, typename T, typename=EnableIfFloating<T>> 
std::to_chars_result
to_chars(char(&a)[N], T value, std::chars_format fmt)
{
  return std::to_chars(a, a + N, value, fmt);
}

template<std::size_t N, typename T, typename=EnableIfFloating<T>>  
std::to_chars_result
to_chars(char(&a)[N], T value, std::chars_format fmt,
         int precision)
{
  return std::to_chars(a, a + N, value, fmt, precision);
}

Extending to std::array

It is easy to extend the functionality to std::array. The corresponding from_chars and to_chars for std::array are straightforward. Here are a couple of examples for integer types:

template<std::size_t N, typename T, typename=EnableIfIntegral<T>>
std::from_chars_result
from_chars(const std::array<char, N>& a, T& value, int base = 10)
{
  return std::from_chars(a.data(), a.data() + N, value, base);
}

template<std::size_t N, typename T, typename=EnableIfIntegral<T>> 
std::to_chars_result
to_chars(std::array<char, N>& a, T value, int base = 10)
{
  return std::to_chars(a.data(), a.data() + N, value, base);
}

Extending to Random Access Containers

In Generic Programming and the STL, Matt Austern describes random access containers. The book relies on the SGI STL documentation (or vice versa) and Martin Broadhurst kindly preserved it here. You can find the description of random access containers here. With that said, std::string, std::vector and std::array are random access containers that store their elements contiguously and expose them through data() and size(), so they can be used as the source or destination of from_chars and to_chars. Simply allocate the storage needed for the conversion ahead of time. std::deque is also a random access container, but its storage is not contiguous and it has no data() member, so it does not work with the function templates below. std::string_view, although strictly speaking not a random access container, qualifies as a source for from_chars because it provides the same read-only interface over a character sequence that it does not own. Of course, one must guarantee that the containers won’t change while the conversion takes place.

With the definitions that follow, the function templates for std::array defined above can be removed and replaced.

The function template calculates the range at runtime using data() and size(). In addition, there’s a check to avoid passing containers whose value type is not char. Here are a couple of examples for integer types:

template<typename Cont, typename T, typename=EnableIfIntegral<T>>
std::from_chars_result
from_chars(const Cont& c, T& value, int base = 10)
{
  static_assert(std::is_same_v<char, typename Cont::value_type>,
                "Container value type must be char.");
  return std::from_chars(c.data(), c.data() + c.size(), 
                         value, base);
}

template<typename Cont, typename T, typename=EnableIfIntegral<T>> 
std::to_chars_result
to_chars(Cont& c, T value, int base = 10)
{
  static_assert(std::is_same_v<char, typename Cont::value_type>,
                "Container value type must be char.");
  return std::to_chars(c.data(), c.data() + c.size(),
                       value, base);
}

Kicking the tires…

You would expect that after all of this template brouhaha things would just magically work, but alas, the compiler support for C++17 is, let’s say, unbalanced? I ran into different problems and decided to see if support for these character conversion functions was really there. I ran two programs with Compiler Explorer, one to test from_chars, which you can find here, and one to test to_chars, which you can find here.

I tested with C++ Compiler Explorer using the latest versions of MSVC, Clang and GCC, which support C++17 (and therefore <charconv>).

For MSVC I used the flag /std:c++17, for Clang I used the flags -std=c++17 -stdlib=libc++ and for GCC I used the flag -std=c++17. If you run Clang without -stdlib=libc++, it picks up GCC’s standard library, so effectively you’re using Clang as a front end to GCC.

The test program for from_chars is reproduced here. I’ve marked the problematic portions with numbers (1 through 4). These tests are not exhaustive and you may try other types (e.g., wchar_t) to see if there are other inconsistencies in the support.

#include <charconv>

using std::from_chars, std::from_chars_result, 
      std::chars_format;

int main() {
  char const bstr[] = "true";
  bool bValue;

  // 1: bool should not be supported
  from_chars_result bres = from_chars(bstr, bstr+5, bValue);

  char const istr[] = "12345";
  int iValue;
  from_chars_result ires = from_chars(istr, istr+6, iValue);

  char const lstr[] = "12345";
  long lValue;
  from_chars_result lres = from_chars(lstr, lstr+6, lValue);

  char const fstr[] = "3.141592";
  float fValue{};

  // 2: Should pick overload that takes float
  from_chars_result fres = from_chars(fstr, fstr+9, fValue);

  char const dstr[] = "3.14159265358979";
  double dValue;

  // 3: Should pick overload that takes double
  from_chars_result dres = from_chars(dstr, dstr+17, dValue,
                                      chars_format::general);

  char const ldstr[] = "1234567890.50";
  long double ldValue;
  
  // 4: Should pick overload that takes long double
  from_chars_result ldres = from_chars(ldstr, ldstr+14, ldValue);
}

The test program for to_chars is reproduced here.

#include <charconv>

using std::to_chars, std::to_chars_result, std::chars_format;

int main() {
  char bstr[10];
  bool bValue{true};

  // 1: bool should not be supported
  to_chars_result bres = to_chars(bstr, bstr+5, bValue);

  char istr[6];
  int iValue{12345};
  to_chars_result ires = to_chars(istr, istr+6, iValue);

  char lstr[6];
  long lValue{12345};
  to_chars_result lres = to_chars(lstr, lstr+6, lValue);

  char fstr[9];
  float fValue{3.141592f};

  // 2: Should pick overload that takes float, chars_format
  to_chars_result fres = to_chars(fstr, fstr+9, fValue,
                                  chars_format::general);

  char dstr[17];
  double dValue{3.14159265358979};

  // 3: Should pick overload that takes double, chars_format,
  //    precision
  to_chars_result dres = to_chars(dstr, dstr+17, dValue,
                                  chars_format::general, 10);

  char ldstr[14];
  long double ldValue{1234567890.50};
  
  // 4: Should pick overload that takes long double, chars_format,
  //    precision
  to_chars_result ldres = to_chars(ldstr, ldstr+14, ldValue,
                                   chars_format::general, 10);
} 

Testing charconv support in Clang 8.0.0

The first thing I found is that there seems to be a bug in Clang’s implementation when using libc++. It correctly flags the conversion to bool, on both tests, but incorrectly flags the calls using floating point types at #2, #3 and #4.

Testing charconv support in GCC 9.1

Right off, GCC cannot find std::chars_format. Similar results when using Clang as a front end to GCC. std::chars_format seems to be in another header and most probably has not been ported fully. It correctly flags the conversion to bool at #1, but incorrectly flags the calls using floating point types at #2, #3 and #4.

Testing charconv support in MSVC 19.20

The best results. It correctly flags the conversion to bool at #1 for the from_chars test, but allows conversion from bool to char[] for the to_chars test. Incidentally, Visual Studio 2019 v16.2 Preview 1 adds additional support for floating point to_chars overloads and the feature test macro __cpp_lib_to_chars, a welcome improvement.

…and taking it for a spin

With the compilers out of the way, we can now turn towards testing the function templates from_chars and to_chars. Here’s the full "charconvutils.hpp" file, which you can find in Compiler Explorer here:

#ifndef CHARCONVUTILS_HPP
#define CHARCONVUTILS_HPP

#include <charconv>
#include <cstddef>
#include <type_traits>

namespace charconvutils {

// -----
// Helper alias templates

template<typename T>
using EnableIfIntegral = 
  std::enable_if_t<std::is_integral_v<T>>;

template<typename T>
using EnableIfFloating = 
  std::enable_if_t<std::is_floating_point_v<T>>;

// -----
// from_chars function templates for C-style arrays

template<std::size_t N, typename T,
         typename = EnableIfIntegral<T>>
std::from_chars_result
from_chars(const char(&a)[N], T& value, int base = 10) {
  return std::from_chars(a, a + N, value, base);
}

template<std::size_t N, typename T, 
         typename = EnableIfFloating<T>>
std::from_chars_result
from_chars(const char(&a)[N], T& value,
           std::chars_format fmt = std::chars_format::general) {
  return std::from_chars(a, a + N, value, fmt);
}

// -----
// from_chars function templates for random access containers

template<typename Cont, typename T,
         typename = EnableIfIntegral<T>>
std::from_chars_result
from_chars(const Cont& c, T& value, int base = 10) {
  static_assert(std::is_same_v<char, typename Cont::value_type>, 
                "Container value type must be char.");
  return std::from_chars(c.data(), c.data() + c.size(),
                         value, base);
}

template<typename Cont, typename T,
         typename = EnableIfFloating<T>>
std::from_chars_result
from_chars(const Cont& c, T& value,
           std::chars_format fmt = std::chars_format::general) {
  static_assert(std::is_same_v<char, typename Cont::value_type>, 
                "Container value type must be char.");
  return std::from_chars(c.data(), c.data() + c.size(),
                         value, fmt);
}

// -----
// to_chars function templates for C-style arrays

template<std::size_t N, typename T,
         typename = EnableIfIntegral<T>>
std::to_chars_result
to_chars(char(&a)[N], T value, int base = 10) {
  return std::to_chars(a, a + N, value, base);
}

template<std::size_t N, typename T,
         typename = EnableIfFloating<T>>
std::to_chars_result
to_chars(char(&a)[N], T value, std::chars_format fmt) {
  return std::to_chars(a, a + N, value, fmt);
}

template<std::size_t N, typename T,
         typename = EnableIfFloating<T>>
std::to_chars_result
to_chars(char(&a)[N], T value, std::chars_format fmt,
         int precision) {
  return std::to_chars(a, a + N, value, fmt, precision);
}

// -----
// to_chars function templates for random access containers

template<typename Cont, typename T,
         typename = EnableIfIntegral<T>>
std::to_chars_result
to_chars(Cont& c, T value, int base = 10) {
  static_assert(std::is_same_v<char, typename Cont::value_type>, 
                "Container value type must be char.");
  return std::to_chars(c.data(), c.data() + c.size(),
                       value, base);
}

template<typename Cont, typename T,
         typename = EnableIfFloating<T>>
std::to_chars_result
to_chars(Cont& c, T value, std::chars_format fmt) {
  static_assert(std::is_same_v<char, typename Cont::value_type>, 
                "Container value type must be char.");
  return std::to_chars(c.data(), c.data() + c.size(),
                       value, fmt);
}

template<typename Cont, typename T,
         typename = EnableIfFloating<T>> 
std::to_chars_result
to_chars(Cont& c, T value, std::chars_format fmt, int precision) {
  static_assert(std::is_same_v<char, typename Cont::value_type>,
                "Container value type must be char.");
  return std::to_chars(c.data(), c.data() + c.size(),
                       value, fmt, precision);
}

} // namespace charconvutils

#endif // CHARCONVUTILS_HPP

With this file in place, you can run the sample programs outlined above.

Afternotes

Jens Maurer proposed the character conversion functions in this paper. It’s interesting to see the proposed interfaces, and I could not find out why templates were not considered (albeit std::string_view was). Perhaps in a different iteration of the paper this idea was rejected and I’m missing something (e.g., considerations for the containers not to be accessed by multiple threads while the conversion takes place).

What about structured bindings? Sure, you can use structured bindings to capture the resulting structure–as previously mentioned, Nicolai Josuttis has several examples.

Another open question: if the functions as declared are non-throwing, why not mark them with noexcept? That’s what Microsoft did, and it seems only logical. Perhaps a consideration is that the range would become invalid while performing the conversion?

In closing, this post explored how one can improve the usage model for the character conversion functions by investing in function templates.

And remember, vote for Pedro.

Updated on June 20, 2019

Posted a bug report to Microsoft for taking the bool parameter and they report that the behavior is as designed (that is, the bool is promoted to an integer). You can find the problem report here.

Clang’s library takes a different approach: it marks the bool overload as deleted, so calling std::to_chars with a bool does not compile.

While logical, this discrepancy among compilers is problematic–which implementation is correct?

The problem I see with allowing the conversion to chars from bool is that there is no symmetry: Calling std::from_chars with “true” or “false” does not result in an int 1 or 0, or a bool. Now, that’s understandable, since there can be many representations for true and false strings (e.g., “yes” and “no”), but the opposite should hold, too.

Updated on August 28, 2019

An issue has been entered in the LWG that will delete the std::to_chars overload for bool. It can be found here:
https://cplusplus.github.io/LWG/issue3266
and the corresponding documentation in cppreference.com reflects it.

Updated on September 16, 2019

Reflects comments and simplification by using the data() member for containers, and now preserving the same semantics as the std::from_chars and std::to_chars functions.
