C++ Core Guidelines: When RAII breaks


Before I write about the very popular RAII idiom in C++, I want to present a trick that is often quite handy when you repeatedly search for a text pattern: use negative search.


Often, the text patterns or tokens you are looking for follow a repetitive structure. Here, negative search comes into play.

Use Negative Search if applicable

The general idea is easy to explain. You could define a complicated regular expression to search for the tokens. Quite often, however, the tokens are separated by delimiters such as colons, commas, or spaces. In this case, it is easier to search for the delimiters; the tokens you are interested in are just the text between the delimiters. Let's see what I mean.

 

// regexTokenIterator.cpp

#include <iostream>
#include <string>
#include <regex>
#include <vector>

std::vector<std::string> splitAt(const std::string &text,                     // (6)
                                 const std::regex &reg) {
  std::sregex_token_iterator hitIt(text.begin(), text.end(), reg, -1);
  const std::sregex_token_iterator hitEnd;
  std::vector<std::string> resVec;
  for (; hitIt != hitEnd; ++hitIt)
    resVec.push_back(hitIt->str());
  return resVec;
}

int main() {

  std::cout << std::endl;

  const std::string text("3,-1000,4.5,-10.5,5e10,2e-5");                      // (1)

  const std::regex regNumber(
      R"([-+]?([0-9]+\.?[0-9]*|\.[0-9]+)([eE][-+]?[0-9]+)?)");                // (2)
  std::sregex_iterator numberIt(text.begin(), text.end(), regNumber);         // (3)
  const std::sregex_iterator numberEnd;
  for (; numberIt != numberEnd; ++numberIt) {                        
    std::cout << numberIt->str() << std::endl;                                // (4)
  }

  std::cout << std::endl;

  const std::regex regComma(",");
  std::sregex_token_iterator commaIt(text.begin(), text.end(), regComma, -1); // (5)
  const std::sregex_token_iterator commaEnd;
  for (; commaIt != commaEnd; ++commaIt) {
    std::cout << commaIt->str() << std::endl;
  }

  std::cout << std::endl;

  std::vector<std::string> resVec = splitAt(text, regComma);                  // (7)
  for (auto s : resVec)
    std::cout << s << " ";
  std::cout << "\n\n";

  resVec = splitAt("abc5.4def-10.5hij2e-5klm", regNumber);                    // (8)
  for (auto s : resVec)
    std::cout << s << " ";
  std::cout << "\n\n";

  std::regex regSpace(R"(\s+)");
  resVec = splitAt("abc  123  456\t789    def hij\nklm", regSpace);           // (9)
  for (auto s : resVec)
    std::cout << s << " ";
  std::cout << "\n\n";
}

Line 1 contains a string of numbers separated by commas. To get all numbers, I define in line 2 a regular expression that matches each number. The numbers include integers, floating-point numbers, and numbers written in scientific notation. Line 3 defines an iterator of type std::sregex_iterator, which gives me all matches; line 4 displays them. The std::sregex_token_iterator in line 5 is more powerful. It searches for commas and gives me back the text between the commas because I used the negative index -1.

This pattern is so convenient that I put it into the function splitAt (line 6). splitAt takes a text and a regular expression, applies the regular expression to the text, and pushes the text between the matches onto the std::vector<std::string> resVec. Now, it's quite easy to split a text on commas (line 7), on numbers (line 8), and on whitespace (line 9).

As Martin Stockmayer suggested, you can write the function splitAt more concisely, because a std::vector can be constructed directly from a begin and an end iterator.

std::vector<std::string> splitAt(const std::string &text, 
                                 const std::regex &reg) {
  return std::vector<std::string>(std::sregex_token_iterator(text.begin(), text.end(), reg, -1), 
                                  std::sregex_token_iterator());
}

The output of the program shows the expected behaviour:

[Screenshot: output of regexTokenIterator]

Okay, this was my last rule on the regular expression library, and I have therefore finished the rules on the standard library in the C++ Core Guidelines. But hold on, there is one rule on the C standard library.

SL.C.1: Don’t use setjmp/longjmp

The reason for this rule is quite concise: a longjmp ignores destructors, thus invalidating all resource-management strategies relying on RAII. I hope you know RAII. If not, here is the gist. 

RAII stands for Resource Acquisition Is Initialization. This is probably the most crucial idiom in C++: a resource should be acquired in the constructor and released in the destructor of an object. The key idea is that the destructor is automatically called when the object goes out of scope.

The following example shows the deterministic behaviour of RAII in C++.

 

// raii.cpp

#include <iostream>
#include <new>
#include <string>

class ResourceGuard{
  private:
    const std::string resource;
  public:
    ResourceGuard(const std::string& res):resource(res){
      std::cout << "Acquire the " << resource << "." <<  std::endl;
    }
    ~ResourceGuard(){
      std::cout << "Release the "<< resource << "." << std::endl;
    }
};

int main(){

  std::cout << std::endl;

  ResourceGuard resGuard1{"memoryBlock1"};            // (1)

  std::cout << "\nBefore local scope" << std::endl;
  {
    ResourceGuard resGuard2{"memoryBlock2"};          // (3)
  }                                                   // (4)
  std::cout << "After local scope" << std::endl;
  
  std::cout << std::endl;

  
  std::cout << "\nBefore try-catch block" << std::endl;
  try{
      ResourceGuard resGuard3{"memoryBlock3"};
      throw std::bad_alloc();                        // (5)           
  }   
  catch (std::bad_alloc& e){                         // (6)
      std::cout << e.what();
  }
  std::cout << "\nAfter try-catch block" << std::endl;
  
  std::cout << std::endl;

}                                                     // (2)

ResourceGuard is a guard that manages its resource. In this case, the string stands for the resource. ResourceGuard acquires the resource in its constructor and releases it in its destructor. It does its job very well.

The destructor of resGuard1 (line 1) is called at the end of the main function (line 2). The lifetime of resGuard2 (line 3) already ends in line 4, so its destructor is executed automatically there. Even throwing an exception (line 5) does not affect the reliability of resGuard3. Its destructor is called during stack unwinding when the try block is left, before the handler in line 6 runs.

The screenshot shows the lifetimes of the objects.

[Screenshot: output of raii]

I want to emphasize the key idea of RAII: the lifetime of a resource is bound to the lifetime of a local variable, and C++ automatically manages the lifetime of locals.

Okay, but how can setjmp/longjmp break this automatism? Here is what the macro setjmp and the function std::longjmp do:

int setjmp(std::jmp_buf env):

  • saves the execution context in env for std::longjmp
  • returns 0 in its first, direct invocation
  • returns in further invocations, triggered by std::longjmp, the value set by std::longjmp
  • is the target of the std::longjmp call
  • corresponds to catch in exception handling

void std::longjmp(std::jmp_buf env, int status):

  • restores the execution context stored in env 
  • sets the status for the setjmp call
  • corresponds to throw in exception handling

Okay, this was quite technical. Here is a simple example.

 

// setJumpLongJump.cpp

#include <cstdlib>
#include <iostream>
#include <csetjmp>
#include <string>

class ResourceGuard{
  private:
    const std::string resource;
  public:
    ResourceGuard(const std::string& res):resource(res){
      std::cout << "Acquire the " << resource << "." <<  std::endl;
    }
    ~ResourceGuard(){
      std::cout << "Release the "<< resource << "." << std::endl;
    }
};

int main(){

  std::cout << std::endl;
  
  std::jmp_buf env;
  volatile int val;
  
  val = setjmp(env);                                   // (1)
  
  if (val){
      std::cout << "val: " <<  val << std::endl;
      std::exit(EXIT_FAILURE);
  }
  
  {
    ResourceGuard resGuard3{"memoryBlock3"};           // (2)
    std::longjmp(env, EXIT_FAILURE);                   // (3)
  }                                                    // (4)

}

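The call in line 1 saves the execution environment and returns 0. This execution environment is restored in line 3. The critical observation is that the destructor of resGuard3 (line 2) is not invoked in line 4. In a concrete case, this means you would get a memory leak, or a mutex would never be unlocked. Strictly speaking, the C++ standard even makes such a std::longjmp call undefined behaviour, because it skips the destructor of an automatic object that a throw to the same point would have destroyed.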

[Screenshot: output of setJumpLongJump]

EXIT_FAILURE is the value that setjmp (line 1) returns the second time, and it is also the exit status of the executable.

What's next?

DONE, but not completely! I have written more than 100 posts on the main sections of the C++ Core Guidelines and learned a lot. Besides the main sections, the guidelines also have supporting sections, which sound very interesting. I will write about them in my next post.

 

 

 


 

   
