C++ Core Guidelines: Rules for Strings


The C++ Core Guidelines use the term string for a sequence of characters. Consequently, the guidelines cover C-strings, C++-strings, the C++17 std::string_view, and std::byte.

 


In this post, I will refer to the guidelines only loosely and ignore the strings that are part of the Guidelines Support Library, such as gsl::string_span, zstring, and czstring. For short, I call a std::string a C++-string and a const char* a C-string.

Let me start with the first rule:

SL.str.1: Use std::string to own character sequences

Maybe you know another string that owns its character sequence: a C-string. Don't use a C-string! Why? Because you have to take care of the memory management, the string termination character, and the length of the string.

 

// stringC.c

#include <stdio.h>
#include <string.h>
 
int main( void ){
 
  char text[10];
 
  strcpy(text, "The Text is too long for text.");   // (1) the literal does not fit into text
  printf("strlen(text): %zu\n", strlen(text));      // (2) text has no termination character '\0'
  printf("%s\n", text);
 
  text[sizeof(text)-1] = '\0';
  printf("strlen(text): %zu\n", strlen(text));
 
  return 0;
}

 

The simple program stringC.c has undefined behaviour in line (1) and line (2). Compiling it with a rusty GCC 4.8 seems to work fine.

The C++ variant does not have the same issues.

// stringCpp.cpp

#include <iostream>
#include <string>

int main(){
 
  std::string text{"The Text is not too long."};  
 
  std::cout << "text.size(): " << text.size() << std::endl;
  std::cout << text << std::endl;
 
  text +=" And can still grow!";
 
  std::cout << "text.size(): " << text.size() << std::endl;
  std::cout << text << std::endl;
 
}

 

The output of the program should not surprise you: text.size() grows from 25 to 45, and both versions of the string are printed in full.

In the case of a C++-string, I cannot make these errors because the std::string itself takes care of the memory management and the termination character. Additionally, if you access the elements of the C++-string with the member function at instead of the index operator, an out-of-range index does not go unnoticed: at throws a std::out_of_range exception instead of causing undefined behaviour. You can read the details in my previous post: C++ Core Guidelines: Avoid Bounds Errors.

Do you know what was strange in C++, including C++11? There was no way to create a C++-string without a C-string. This is strange because we want to get rid of the C-string. This inconsistency is gone with C++14.

SL.str.12: Use the s suffix for string literals meant to be standard-library strings 

With C++14 we got C++-string literals. It's a C-string literal with the suffix s: "cStringLiteral"s.

Let me show you an example that makes my point: C-string literals and C++-string literals are different.

 

// stringLiteral.cpp

#include <iostream>
#include <string>
#include <utility>

int main(){
    
    using namespace std::string_literals;                         // (1)

    std::string hello = "hello";                                  // (2)
    
    auto firstPair = std::make_pair(hello, 5);
    auto secondPair = std::make_pair("hello", 15);                // (3)
    // auto secondPair = std::make_pair("hello"s, 15);            // (4)
    
    if (firstPair < secondPair) std::cout << "true" << std::endl; // (5)
    
}

 

It's a pity; I need the using directive for the namespace std::string_literals in line (1) to use the C++-string-literals. Line (2) is the critical line in the example. I use the C-string-literal "hello" to create a C++-string. This is the reason that the type of firstPair is std::pair<std::string, int>, but the type of secondPair is std::pair<const char*, int>. In the end, the comparison in line (5) fails to compile because you cannot compare pairs of different types. The compiler's error message reports the conflicting types.

When I use the C++-string-literal in line (4) instead of the C-string-literal in line (3), the program behaves as expected and prints true.

C++-string-literals were a C++14 feature. Let's jump three years further. With C++17 we got std::string_view and std::byte. I have already written about std::string_view in detail. Therefore, I will only recap the most important facts.

SL.str.2: Use std::string_view or gsl::string_span to refer to character sequences

Okay, a std::string_view only refers to the character sequence. To say it more explicitly: a std::string_view does not own the character sequence. It represents a view of a sequence of characters. This sequence of characters can be a C++-string or a C-string. A std::string_view only needs two pieces of information: the pointer to the character sequence and its length. It supports the reading part of the interface of std::string. In addition, std::string_view has two modifying operations that std::string lacks: remove_prefix and remove_suffix. Both only shrink the view; the characters it refers to stay untouched.

Maybe you wonder: why do we need a std::string_view? A std::string_view is quite cheap to copy and allocates no memory. My previous post C++17 - Avoid Copying with std::string_view shows the impressive performance numbers of a std::string_view.

As I already mentioned, C++17 also gave us std::byte.

SL.str.4: Use char* to refer to a single character and SL.str.5: Use std::byte to refer to byte values that do not necessarily represent characters

If you don't follow rule SL.str.4 and use a const char* as a C-string, you may end up with critical issues such as the following one.

 

#include <iostream>

char arr[] = {'a', 'b', 'c'};   // no terminating '\0'

void print(const char* p)
{
    std::cout << p << '\n';     // reads until it finds a '\0'
}

void use()
{
    print(arr);   // run-time error; potentially very bad
}

 

arr decays to a pointer when used as an argument of the function print. The behaviour is undefined because arr is not zero-terminated: the output operator reads past the end of the array until it happens to find a '\0'. If you now have the impression that you can use std::byte as a character, you are wrong.

std::byte is a distinct type implementing the concept of a byte as specified in the C++ language definition. This means a byte is neither an integer nor a character and is, therefore, less open to programmer errors. Its job is to access object storage. Consequently, its interface consists only of operators for shifts and bitwise logical operations.

 

namespace std { 

    template <class IntType> 
        constexpr byte operator<<(byte b, IntType shift); 
    template <class IntType> 
        constexpr byte operator>>(byte b, IntType shift); 
    constexpr byte operator|(byte l, byte r); 
    constexpr byte operator&(byte l, byte r); 
    constexpr byte operator~(byte b); 
    constexpr byte operator^(byte l, byte r); 

} 

 

You can use the function std::to_integer<IntegerType>(b) to convert a std::byte b to an integer type and the expression std::byte{integer} to do it the other way around. integer has to be a non-negative value no greater than std::numeric_limits<unsigned char>::max().

What's next?

I'm almost done with the rules for the standard library. Only a few rules for iostreams and the C standard library are left. So you know what I will write about in my next post.

 
