The C language is dangerous

albinopapa
Posts: 4373
Joined: February 28th, 2013, 3:23 am
Location: Oklahoma, United States

The C language is dangerous

Post by albinopapa » March 18th, 2018, 10:53 pm

I have been messing around with C for a few days and crap it's dangerous.

No constructors nor destructors.
One of the benefits of C++ is the use of constructors to initialize resources and destructors to release those resources. A resource can be memory, a thread lock or even a file handle. The nice thing about this mechanism is the compiler calls these special functions for you. An example would be the std::string class. Say you want to assign string A to string B.

Code: Select all

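#include <iostream>
#include <string>

void Fun()
{
     std::string A = "Some text to get you started.";
     std::string C = "Here comes the pain.";
     {
          std::string B = A;       // copy constructor: B gets its own buffer
          std::cout << B << '\n';
          C = B;                   // copy assignment: operator= deals with C's old contents
     }                             // B's destructor runs here
}                                  // A's and C's destructors run here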
So here A is constructed from the "Some text ..." string literal, and then inside the {} B is constructed by making a copy of A. Since B is being constructed, this is no big deal because B isn't holding any resources yet. Now, what happens when B is copied to C? Well, in C++, the copy assignment operator takes care of C's old resources and then B is copied over. In C, however, there are no operator overloads, so you can't tell C to release its resources before copying. You end up overwriting the pointer C was holding with B's pointer, without ever releasing the memory C's pointer referred to.

C requires the programmer to be more cautious and aware of what they are doing. Functions should probably not return by value in C if their structures contain resources to avoid the above scenario.
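
To make the hazard concrete, here is a minimal sketch in plain C. The names `text`, `text_make` and `shallow_copy_demo` are made up for illustration and are not part of any library. It shows that struct assignment copies only the pointer, so both variables end up sharing one buffer, and the old buffer gets freed only because the code saved its address first.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical resource-owning struct, for illustration only. */
typedef struct
{
     size_t length;
     char* buffer;
} text;

/* Allocates a buffer and copies str into it, terminator included. */
text text_make( const char* str )
{
     text t;
     assert( str != NULL );
     t.length = strlen( str );
     t.buffer = (char*)malloc( t.length + 1 );
     if( t.buffer != NULL )
     {
          memcpy( t.buffer, str, t.length + 1 );
     }
     else
     {
          t.length = 0;   /* allocation failed: leave an empty object */
     }
     return t;
}

/* Returns 1 if plain assignment left both structs sharing one buffer. */
int shallow_copy_demo( void )
{
     text b = text_make( "Some text to get you started." );
     text c = text_make( "Here comes the pain." );
     char* old_c = c.buffer;   /* save it, or it is unreachable after the assignment */

     c = b;                    /* member-wise copy: c.buffer and b.buffer are now equal */
     const int shared = ( c.buffer == b.buffer );

     free( old_c );            /* nothing in the language does this for us */
     free( b.buffer );         /* free the shared buffer exactly once */
     return shared;
}
```

Without the `old_c` line the second allocation would simply leak, and calling `free` on both `b.buffer` and `c.buffer` afterwards would be a double free.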

I bring this up because something I've been playing around with is writing something similar to std::string, but all in C. I was going to have a construct function just return a cstring object, but quickly realized how asinine that would have been.
If you think paging some data from disk into RAM is slow, try paging it into a simian cerebrum over a pair of optical nerves. - gameprogrammingpatterns.com

nG Inverse
Posts: 115
Joined: April 27th, 2012, 11:49 pm

Re: The C language is dangerous

Post by nG Inverse » March 18th, 2018, 11:15 pm

I'm not sure I follow. I tried your code using "C-strings" (char*) and got the correct results, as well as with the standard library strings. I guess "std::string" is a syntax macro you created in C to represent C++? I don't see how B is overwriting C in your example.

albinopapa
Posts: 4373
Joined: February 28th, 2013, 3:23 am
Location: Oklahoma, United States

Re: The C language is dangerous

Post by albinopapa » March 19th, 2018, 2:02 am

Well, I guess I wasn't very clear in separating the two. In the example, std::string IS the C++ std::string, and the point was to show that in C++ resources such as memory are allocated and released for you if you use constructors and destructors for that purpose. So copying B to C in C++ will release the memory in C, then copy B over into C, thus preventing memory leaks.

Now, as I said, I have a struct called cstring in the C language. If I create/construct one of these cstring objects and just use the return value, then there is a chance for memory leaks.

Code: Select all

#include <stdlib.h>   // malloc, free
#include <string.h>   // strlen

typedef struct 
{
     size_t length, capacity;
     char* buffer;
}cstring;

cstring cs_default_construct()
{
     cstring cs;
     cs.length = 0;
     cs.capacity = 1;
     cs.buffer = (char*)malloc(1);

     return cs;
}

cstring cs_string_construct( const char* str )
{
     const size_t len = strlen( str );
     cstring cs;
     cs.length = len;
     cs.capacity = len;
     cs.buffer = (char*)malloc(len);
     for(size_t i = 0; i < len; ++i)
     {
          cs.buffer[i] = str[i];
     }
     return cs;
}
cstring cs_copy(const cstring* _this)
{
     cstring cs;
     cs.length = _this->length;
     cs.capacity = _this->capacity;
     // Allocate capacity bytes so the buffer size matches the capacity we report
     cs.buffer = (char*)malloc(_this->capacity);
     for(size_t i = 0; i < _this->length; ++i)
     {
          cs.buffer[i] = _this->buffer[i];
     }
     return cs;
}
void cs_destroy(cstring* _this)
{
     free( _this->buffer );
     _this->buffer = NULL;
     _this->length = 0;
     _this->capacity = 0;
}
int main( int argc, char* argv[] )
{
     cstring a = cs_string_construct("This is a line of text.");
     cstring c = cs_string_construct("Here comes the pain.");
     {
          cstring b = a;  // Problem here is b.buffer shares the same pointer as a.buffer
          c = b; // Here, c.buffer is now pointing to the same buffer as b.buffer and the original c.buffer data is lost...memory leak.
          c = cs_copy(&b); // c now owns a fresh copy, but the original c.buffer is still leaked, because the cs_copy function doesn't know about c so it can't release its resources.
     }
     
     cs_destroy(&a); // Should be fine
     cs_destroy(&c); // Fine only because of the cs_copy above; after just "c = b" this would free a.buffer a second time and likely crash.
     return 0;
}
This can still happen in C++ if you play with raw pointers, but if you use objects with constructors, destructors and an overloaded operator=, you can avoid these types of memory leaks. This is why I say C is dangerous, and if you are going to abstract a resource into a struct like this, you probably shouldn't return the resource wrapper by value as shown.

So I chose to do this instead:

Code: Select all

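// Note: a cstring must be zero-initialized ( cstring s = {0}; ) before its first use,
// because these functions inspect the struct before writing to it.
void cs_default_construct(cstring*const _this)
{
     // Check the pointer, not the length: a default-constructed string has length 0 but still owns a buffer
     if(_this->buffer != NULL)
     {
          cs_destroy(_this);
     }

     _this->length = 0;
     _this->capacity = 1;
     _this->buffer = (char*)malloc(1);
}

void cs_string_construct( cstring*const _this, const char* str )
{
     if(_this->buffer != NULL)
     {
          cs_destroy(_this);
     }
     const size_t len = strlen( str );

     _this->length = len;
     _this->capacity = len;
     _this->buffer = (char*)malloc(len);
     for(size_t i = 0; i < len; ++i)
     {
          _this->buffer[i] = str[i];
     }
}
void cs_copy(const cstring* const _this, cstring* const other)
{
     // Copying an object onto itself would destroy the source first, so do nothing
     if( other == _this )
     {
          return;
     }
     // Clean up resources if other contains something
     if( other->buffer != NULL )
     {
          cs_destroy( other );
     }

     other->length = _this->length;
     other->capacity = _this->capacity;
     // Allocate capacity bytes so the buffer size matches the capacity we report
     other->buffer = (char*)malloc(_this->capacity);

     for(size_t i = 0; i < _this->length; ++i)
     {
          other->buffer[i] = _this->buffer[i];
     }
}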
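
One thing this style depends on: the struct has to start in a known state, because the functions look at it before writing to it. Here is a minimal sketch of that idea, using made-up names (`str`, `str_assign`, `str_free`, `str_demo`) rather than the functions above, and assuming the caller zero-initializes with `{ 0 }`:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical string wrapper; a zeroed struct means "owns nothing". */
typedef struct
{
     size_t length;
     char* buffer;
} str;

/* Releases the buffer and returns the struct to the empty state. */
void str_free( str* s )
{
     assert( s != NULL );
     free( s->buffer );   /* free(NULL) is a no-op */
     s->buffer = NULL;
     s->length = 0;
}

/* Replaces the contents of dst with a copy of src.
   Returns 0 on success, -1 on allocation failure (dst is left unchanged). */
int str_assign( str* dst, const char* src )
{
     assert( dst != NULL && src != NULL );
     const size_t len = strlen( src );
     char* fresh = (char*)malloc( len + 1 );
     if( fresh == NULL )
     {
          return -1;
     }
     memcpy( fresh, src, len + 1 );

     free( dst->buffer );   /* release the old contents, the job operator= does in C++ */
     dst->buffer = fresh;
     dst->length = len;
     return 0;
}

/* Assigns to the same object twice; returns its final length, or 0 on failure. */
size_t str_demo( void )
{
     str s = { 0 };   /* required: an uninitialized struct holds garbage */
     size_t result = 0;
     if( str_assign( &s, "first" ) == 0 && str_assign( &s, "second one" ) == 0 )
     {
          result = s.length;
     }
     str_free( &s );
     return result;
}
```

Allocating the new buffer before freeing the old one keeps the object intact if `malloc` fails, and it also stays safe when `src` points into `dst`'s own buffer.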

nG Inverse
Posts: 115
Joined: April 27th, 2012, 11:49 pm

Re: The C language is dangerous

Post by nG Inverse » March 19th, 2018, 4:00 am

Ahh, okay. This makes sense now. This could all be fixed if you could have a custom assignment operator for structs like you said.

So to wrap up, stick to C++. :)

albinopapa
Posts: 4373
Joined: February 28th, 2013, 3:23 am
Location: Oklahoma, United States

Re: The C language is dangerous

Post by albinopapa » March 19th, 2018, 5:06 am

Lol, yeah, stick with C++.

This was just a rant against the C language and how spoiled C++ programmers are.

nG Inverse
Posts: 115
Joined: April 27th, 2012, 11:49 pm

Re: The C language is dangerous

Post by nG Inverse » March 19th, 2018, 2:40 pm

I think a proper garbage collector is what would count as spoiled! We still have to make sure to free/delete our dynamically allocated memory. I think Java is the language where you don't have to worry about this?

Smart pointers can make this much easier to deal with.

albinopapa
Posts: 4373
Joined: February 28th, 2013, 3:23 am
Location: Oklahoma, United States

Re: The C language is dangerous

Post by albinopapa » March 19th, 2018, 6:04 pm

C# also has garbage collection.

Since C++11, you need to call new/malloc and delete/free ... uh ... never. Just use std::vector or std::unique_ptr and be done with manual memory management. OK, maybe if you are creating a custom allocator. Though C++17 has polymorphic memory resources (std::pmr), so you may not even need to worry about memory management when writing a custom allocator for much longer.

chili
Site Admin
Posts: 3948
Joined: December 31st, 2011, 4:53 pm
Location: Japan

Re: The C language is dangerous

Post by chili » March 24th, 2018, 1:53 am

Most languages don't have an idea of RAII, and this makes me sad. RAII is love, RAII is life (once you put in the hours of effort to understand all its mechanics properly that is :lol:)
Chili

albinopapa
Posts: 4373
Joined: February 28th, 2013, 3:23 am
Location: Oklahoma, United States

Re: The C language is dangerous

Post by albinopapa » March 24th, 2018, 3:59 am

Yes, sad indeed.

I can see why so many C APIs are riddled with error code return values and inout parameters for functions.

Using the return value of a function for returning useful information makes code so much more elegant and less error prone. Just to test something out, I used these 'cstring' objects and did return them upon 'construction'. This avoided having to declare a cstring object and then pass its address to the construct function. It also meant I didn't have to check for a NULL parameter, which cleaned up the complexity of the internal library implementation a bit. Without exceptions though, there are three options that come to mind.
  1. Use the return value for success/error codes and have the user pass in pointers to previously declared objects. As the API author, you have to decide whether users are going to pass in the address of an object, or preallocate and pass a pointer to uninitialized memory.
  2. Use the return value for returning data, like objects or ints or whatever, and have the user pass in a pointer for the error result, like an int* or a custom error code enum*. This seems like the least complex solution, though it seems the idiom is to use inout parameters for objects and return values for error control.
  3. Use the return value for returning data, like objects or ints or whatever, and maintain a global error state, like the Win32 API with its GetLastError/SetLastError functions.
The Win32 library uses a mixture of all three: many functions return BOOL ( a typedef for int ), the API maintains an error state ( kept per thread ), and it uses inout parameters for objects. If the BOOL is 0 ( FALSE ), then the programmer knows to check GetLastError for the error code and not to use the returned object.
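
To compare the three options side by side, here is a small sketch that converts one character to a digit three different ways. Everything in it (`err_code`, `digit_status`, `digit_value`, `digit_global`, `get_last_error`) is invented for illustration and is not taken from Win32 or any other real API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical error codes, for illustration only. */
typedef enum { ERR_NONE = 0, ERR_NOT_DIGIT = 1 } err_code;

/* Option 1: status in the return value, result through an out parameter. */
err_code digit_status( char c, int* out )
{
     assert( out != NULL );
     if( c < '0' || c > '9' )
     {
          return ERR_NOT_DIGIT;   /* *out is left untouched on failure */
     }
     *out = c - '0';
     return ERR_NONE;
}

/* Option 2: result in the return value, status through an out parameter. */
int digit_value( char c, err_code* err )
{
     assert( err != NULL );
     if( c < '0' || c > '9' )
     {
          *err = ERR_NOT_DIGIT;
          return 0;
     }
     *err = ERR_NONE;
     return c - '0';
}

/* Option 3: result in the return value, status in shared state.
   A plain global keeps the sketch short; it is not thread-safe. */
static err_code g_last_error = ERR_NONE;

err_code get_last_error( void )
{
     return g_last_error;
}

int digit_global( char c )
{
     if( c < '0' || c > '9' )
     {
          g_last_error = ERR_NOT_DIGIT;
          return 0;
     }
     g_last_error = ERR_NONE;
     return c - '0';
}
```

With options 2 and 3 a returned 0 is ambiguous on its own, since '0' is a valid digit, so the caller still has to look at the error before trusting the value. That may be part of why option 1 is so common in C APIs.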

I pity the C programmers of the world. At least other languages have garbage collection to reduce or eliminate memory leaks. The C programmer really has to know what they are doing, keeping track of all their allocations and deallocations, file pointers and so on.