C++ Core Guidelines: Rules for Strings


The C++ Core Guidelines use the term string for a sequence of characters. Consequently, the guidelines are about C-strings, C++-strings, the C++17 std::string_view, and std::byte.

 


In this post, I will refer to the guidelines only loosely and ignore the strings that are part of the Guidelines Support Library, such as gsl::string_span, zstring, and czstring. For short, I call a std::string a C++-string and a const char* a C-string in this post.

Let me start with the first rule:

SL.str.1: Use std::string to own character sequences

Maybe you know another string that owns its character sequence: a C-string. Don't use a C-string! Why? Because you have to take care of the memory management, the string termination character, and the string length yourself.

 

// stringC.c

#include <stdio.h>
#include <string.h>
 
int main( void ){
 
  char text[10];
 
  strcpy(text, "The Text is too long for text.");   // (1) the string is too long for text
  printf("strlen(text): %zu\n", strlen(text));      // (2) strlen reads beyond the end of text
  printf("%s\n", text);
 
  text[sizeof(text)-1] = '\0';
  printf("strlen(text): %zu\n", strlen(text));
 
  return 0;
}

 

The simple program stringC.c has undefined behavior in line (1) and line (2). Compiling it with a rusty GCC 4.8 seems to work fine.

The C++ variant does not have the same issues.

// stringCpp.cpp

#include <iostream>
#include <string>

int main(){
 
  std::string text{"The Text is not too long."};  
 
  std::cout << "text.size(): " << text.size() << std::endl;
  std::cout << text << std::endl;
 
  text +=" And can still grow!";
 
  std::cout << "text.size(): " << text.size() << std::endl;
  std::cout << text << std::endl;
 
}

 

The output of the program should not surprise you.


In the case of a C++ string, I cannot make these errors because the C++ runtime takes care of the memory management and the termination character. Additionally, if you access the elements of the C++ string with the member function at instead of the index operator, an out-of-range access does not go unnoticed: at throws a std::out_of_range exception. You can read the details of at in my previous post: C++ Core Guidelines: Avoid Bounds Errors.

Do you know what was strange in C++, including C++11? There was no literal for a C++ string; you had to create a C++ string from a C-string literal. This is strange because we want to get rid of the C-string. This inconsistency is gone with C++14.

 


SL.str.12: Use the s suffix for string literals meant to be standard-library strings 

With C++14, we got C++-string literals. A C++-string literal is a C-string literal with the suffix s: "cStringLiteral"s.

Let me show you an example that makes my point: C-string literals and C++-string literals are different.

 

// stringLiteral.cpp

#include <iostream>
#include <string>
#include <utility>

int main(){
    
    using namespace std::string_literals;                         // (1)

    std::string hello = "hello";                                  // (2)
    
    auto firstPair = std::make_pair(hello, 5);
    auto secondPair = std::make_pair("hello", 15);                // (3)
    // auto secondPair = std::make_pair("hello"s, 15);            // (4)
    
    if (firstPair < secondPair) std::cout << "true" << std::endl; // (5)
    
}

 

It's a pity; I have to introduce the namespace std::string_literals in line (1) to use the C++-string-literals. Line (2) is the critical line in the example. I use the C-string-literal "hello" to create a C++ string. This is why the type of firstPair is std::pair<std::string, int>, but the type of secondPair is std::pair<const char*, int>. Ultimately, the comparison in line (5) fails to compile because you cannot compare pairs of different types.

When I use the C++-string-literal in line (4) instead of the C-string-literal in line (3), the program behaves as expected: it compiles and displays true.

C++-string-literals were a C++14 feature. Let's jump three years further. With C++17, we got std::string_view and std::byte. I already wrote about std::string_view in particular. Therefore, I will only recap the most important facts.

SL.str.2: Use std::string_view or gsl::string_span to refer to character sequences

Okay, a std::string_view only refers to the character sequence. To say it more explicitly: A std::string_view does not own the character sequence. It represents a view of a sequence of characters. This sequence of characters can be a C++ string or a C-string. A std::string_view only needs two pieces of information: the pointer to the character sequence and its length. It supports the reading part of the interface of the std::string. In addition to the interface of a std::string, std::string_view has two modifying operations: remove_prefix and remove_suffix. Both only shrink the view; they do not touch the characters.

Maybe you wonder: Why do we need a std::string_view? A std::string_view is cheap to copy and allocates no memory. My previous post C++17 - Avoid Copying with std::string_view shows the impressive performance numbers of a std::string_view.

As I already mentioned, we also got std::byte with C++17.

SL.str.4: Use char* to refer to a single character and SL.str.5: Use std::byte to refer to byte values that do not necessarily represent characters

If you don't follow rule str.4 and use a const char* as a C-string, you may end up with critical issues.

 

#include <iostream>

char arr[] = {'a', 'b', 'c'};   // no termination character '\0'

void print(const char* p)
{
    std::cout << p << '\n';     // reads until it finds a '\0'
}

void use()
{
    print(arr);   // run-time error; potentially very bad
}

 

arr decays to a pointer when used as an argument of the function print. The behavior is undefined because arr is not zero-terminated, and the output operator reads past its end. You're mistaken if you now think you can use std::byte as a character.

std::byte is a distinct type implementing the concept of a byte as specified in the C++ language definition. This means a byte is not an integer or a character and is, therefore, not open to the programmer errors that come with arithmetic or character semantics. Its job is to access object storage. Consequently, its interface consists only of operators for bitwise logical operations.

 

namespace std { 

    template <class IntType> 
        constexpr byte operator<<(byte b, IntType shift); 
    template <class IntType> 
        constexpr byte operator>>(byte b, IntType shift); 
    constexpr byte operator|(byte l, byte r); 
    constexpr byte operator&(byte l, byte r); 
    constexpr byte operator~(byte b); 
    constexpr byte operator^(byte l, byte r); 

} 

 

You can use the function std::to_integer<IntType>(std::byte b) to convert a std::byte to an integer type and the call std::byte{integer} to do it the other way around. integer has to be a non-negative value not larger than std::numeric_limits<unsigned char>::max().

What's next?

I'm almost done with the rules for the standard library. Only a few rules for iostreams and the C standard library are left. So you know what I will write about in my next post.

 

