Dlls and ABI

FirepathAgain
Posts: 30
Joined: March 20th, 2021, 3:01 am

Post by FirepathAgain » March 30th, 2021, 4:10 am

I probably don't have time to explain everything I want to ask right now, but I'm having a hard time getting the right answers in my brain with how this should work and should be done.

I have so many questions on this. I want to make it correctly, but can't find the answers. It all seems very vague. At the very least it seems compromised.


Say I want to make a utility dll to put things in to use from my exe.

Reading I have done is not giving me 100% answers. Every option seems compromised.


To make this dll project and get the things I want to use available outside the dll, I found 3 main options.

1 - Decorate (my term) classes in the dll with __declspec(dllexport/import) (depending on #define flag in .cpp file) to export the class and import where I use / include it.

2 - Similar, but only decorating the class functions (I think) with extern "C" (I think) to export them as C functions.

3 - Create factory functions and a DEF file to export the function (export by ordinal / assigning a number to each function).
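For reference, options 1 and 2 usually look something like the sketch below. MYUTIL_API, MYUTIL_EXPORTS, Counter and MyUtilAdd are made-up names for illustration, and the macro expands to nothing off Windows so the snippet builds anywhere.

```cpp
#include <cassert>

// Normally set in the dll project's preprocessor settings, not in source.
// Defined here only so this single-file sketch builds on its own.
#define MYUTIL_EXPORTS

#if defined(_WIN32)
  #if defined(MYUTIL_EXPORTS)
    #define MYUTIL_API __declspec(dllexport)   // building the dll
  #else
    #define MYUTIL_API __declspec(dllimport)   // building something that uses it
  #endif
#else
  #define MYUTIL_API                           // expands to nothing elsewhere
#endif

// Option 1: export the whole class. The mangled C++ names of its members
// become part of the dll's interface.
class MYUTIL_API Counter {
public:
    int Next();
private:
    int value_ = 0;
};

// Option 2: export plain functions with C linkage. Names are not mangled,
// so there can only be one function called MyUtilAdd (no overloads).
extern "C" MYUTIL_API int MyUtilAdd(int a, int b);

// Definitions, which would live in the dll's .cpp file.
int Counter::Next() { return ++value_; }
extern "C" int MyUtilAdd(int a, int b) { return a + b; }

// What calling code looks like in either case.
void Example() {
    Counter counter;
    assert(counter.Next() == 1);
    assert(MyUtilAdd(2, 3) == 5);
}
```

With option 1 the exe depends on the class layout and the compiler's mangling scheme; with option 2 it only depends on a function name and a calling convention.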


I'm not clear on this (hence this topic) but from my fried memory on it:

The compromises I found are:

1 - name mangling means projects that reference the dll may need to be rebuilt when it changes, even if the parts they use are unchanged.

2 - with C functions, overloading is not possible. And you're not exporting the class, just the functions in it?

3 - Similar to C functions, exported functions must be unique throughout the entire dll project.


Benefits:

1 - Seems to be the neater, C++-ier way of doing it. Convenient.

2 - I'm not very clear on this as I felt it was too compromised between both 1 and 3 options.

3 - Has the benefit of being able to add exported functions without needing to rebuild referencing projects. This to me (coming from C#) seems like a great benefit. You can fix bugs and add features without fear of breaking (by needing to recompile) referencing projects. However, in my use, it is probably irrelevant, but I want it.



I want to be able to change and build the dll project as flexibly as possible. Meaning I don't want to have to rebuild referencing projects if the interfaces they deal with aren't changing. Having a setup whereby compiling doesn't break that would be great.

I want to retain C++ and not have to resort to C (no overloading, odd (to me) function signatures and relying on lots of free and factory functions).

I want to export interfaces (pure virtual ABCs) but don't like the idea too much of free factory functions to pump them out. Maybe I just need to get used to the idea and accept it is what it is, if that is the way.



Is this a pipe dream? How can I best do it? Don't get me wrong I've gotten stuff working, but I feel it's not being done properly. I've even bounced back and forth between two different ways of doing it.

I'm considering pimpl pure virtual ABCs now, however I don't quite understand the dll boundary yet, and don't want to spend hours trying things and finding the compromises near the end.



Memory management - what is the best way to do this? Factory methods that generate the objects in the dll and give you a pointer seem the way. A pimpl ABC is basically this where the implementation is hidden under a pointer.
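A rough sketch of that idea, assuming a hypothetical IWidget interface with CreateWidget / DestroyWidget exports (none of these names come from a real library). The dll hands out a raw pointer, and the exe wraps it in a unique_ptr whose deleter calls back into the dll, so the object is always freed by the module that allocated it. The g_liveWidgets counter exists only so the example can show the delete happening.

```cpp
#include <cassert>
#include <memory>

// --- public header: everything the exe ever sees ---
struct IWidget {
    virtual int Value() const = 0;
protected:
    ~IWidget() = default;   // protected, so callers cannot `delete` it directly
};

extern "C" IWidget* CreateWidget(int value);
extern "C" void DestroyWidget(IWidget* widget);

// --- dll side: the hidden implementation ---
namespace {
int g_liveWidgets = 0;      // illustration only

class Widget final : public IWidget {
public:
    explicit Widget(int value) : value_(value) { ++g_liveWidgets; }
    ~Widget() { --g_liveWidgets; }
    int Value() const override { return value_; }
private:
    int value_;
};
}  // namespace

extern "C" IWidget* CreateWidget(int value) { return new Widget(value); }
extern "C" void DestroyWidget(IWidget* widget) { delete static_cast<Widget*>(widget); }

// --- exe side: RAII wrapper that routes destruction back to the dll ---
struct WidgetDeleter {
    void operator()(IWidget* widget) const { DestroyWidget(widget); }
};
using WidgetPtr = std::unique_ptr<IWidget, WidgetDeleter>;

WidgetPtr MakeWidget(int value) { return WidgetPtr(CreateWidget(value)); }

void Example() {
    {
        WidgetPtr widget = MakeWidget(7);
        assert(widget->Value() == 7);
        assert(g_liveWidgets == 1);
    }   // DestroyWidget runs here, inside the dll's code
    assert(g_liveWidgets == 0);
}
```

A plain std::unique_ptr<IWidget> with the default deleter would call delete in the exe, which is exactly the cross-module free this is trying to avoid.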



Dll boundary - types:

I'm not getting what is safe to "pass" to a dll. Do I only need to worry about interchange of data in the exported factory functions? Once I get an object using a factory function, is it now OK that the interface of that object has all sorts of STL types as parameters?

As I understand the problem lies in different compilers, or compiler versions, or compiler options being used to compile the different module?/translation unit?/project? This can lead to the types not being structured the same, even ints (for example) not being the same size (one side might expect 16 bit and one 32).


Dll boundary - memory management:

Everything needs to be constructed and destructed on the same heap. Does it matter if an object constructed on the dll's heap is passed a pointer to something created on the exe (referencing project)'s heap? Or should this object (or its creating dll) be constructing its own things only?

What does it mean to destroy something on another heap? Is calling delete inside an object created on one heap, on something created on another the definition of this? Is newing something from a referenced project bad or is it only bad if that is then deleted on another heap / from something created on another heap?
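As far as I understand it, passing a pointer across is fine as long as the receiving side only reads or writes through it and never frees or reallocates it. A minimal sketch of that pattern, assuming a hypothetical GetDeviceName export: the exe owns the buffer, the dll only fills it in.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// dll side (hypothetical export): writes into memory the caller owns.
// Returns the number of bytes required, including the terminator, so the
// caller can size its buffer. It never allocates and never frees.
extern "C" std::size_t GetDeviceName(char* buffer, std::size_t bufferSize) {
    static const char kName[] = "Speakers";
    if (buffer != nullptr && bufferSize >= sizeof(kName)) {
        std::memcpy(buffer, kName, sizeof(kName));
    }
    return sizeof(kName);
}

// exe side: the std::string is created, resized and destroyed only here.
// Only a char* and a size_t cross the boundary.
std::string QueryDeviceName() {
    std::string name(GetDeviceName(nullptr, 0), '\0');  // ask for the size first
    GetDeviceName(&name[0], name.size());
    name.resize(name.size() - 1);                       // drop the terminator
    return name;
}

void Example() {
    assert(QueryDeviceName() == "Speakers");
}
```

The problem case is the opposite one: the dll returning a std::string or a new'd pointer that the exe then destroys with its own runtime.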

Some stackoverflow information points to linking to STL (? I think) by dll not statically on every project to ensure a common heap is used, to make sure there are no cross-heap memory problems. This seems like a bandaid; I'd rather do it right. Thoughts? I do like the idea of linking this way to not bake in the STL used into the binaries, but also it requires the user to have the MSVC++ redistributable installed. Thoughts on this?




I have looked into making COM objects and found an older tutorial that glossed over the more technical parts. It was interesting to see that the way I assumed you would make them is basically how it was done.

I watched a 2017 cppcon video on audio engines last night and found that the general design of them is very similar to how I have been developing mine, based on basic system / learnings from chili. Just mentioning these to, I dunno, give me a bit of a pat on the back for seeming like I'm on the right path to doing things right as I'm feeling quite overwhelmed with trying to get these dll projects right.


I hope someone can point me in the right direction. I feel chili probably has the best idea, but at the same time, I'm not sure this is something he's even looked at much. Plus time man, lol this is deep stuff.

FirepathAgain
Posts: 30
Joined: March 20th, 2021, 3:01 am

Re: Dlls and ABI

Post by FirepathAgain » March 30th, 2021, 10:07 am

So I've found more info, one a cppcon talk on dlls by an expert (from MS, works on debug stuff) https://www.youtube.com/watch?v=JPQWQfDhICA


Kind of puts me at ease about a lot of it, but I still have questions about understanding what is exported and why and what is not, but you can still use it?

So for starters, his advice mirrored what I initially did when I made my dll.

1 - Used the DEF file to declare exported functions, by ordinal. This is the most stable way to ensure rebuilding the dll doesn't break others using it. (As long as you don't change existing exports' signatures or their ordinal values...)

2 - Export a factory function for each type you want to generate.

3 - Declare those types in interfaces.
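The three steps above can be sketched roughly as follows. The DEF file is shown as a comment, and IEngine, CreateEngine and GetApiVersion are hypothetical names rather than anything taken from the talk. Note that nothing in the header is marked dllexport; the interface is just a declaration the exe compiles against.

```cpp
#include <cassert>
#include <cstdint>

// audio.def (module-definition file, given to the linker with /DEF:audio.def)
//
//   LIBRARY audio
//   EXPORTS
//       CreateEngine   @1
//       GetApiVersion  @2
//
// New exports get new ordinals at the end; existing ordinals are never reused.

// audio_api.h: the only header the exe needs.
struct IEngine {
    virtual std::uint32_t SampleRate() const = 0;
    virtual void Release() = 0;      // the dll deletes its own object
protected:
    ~IEngine() = default;
};

extern "C" IEngine* CreateEngine(std::uint32_t sampleRate);
extern "C" std::uint32_t GetApiVersion();

// audio.cpp: lives inside the dll and is never seen by the exe.
namespace {
class Engine final : public IEngine {
public:
    explicit Engine(std::uint32_t rate) : rate_(rate) {}
    std::uint32_t SampleRate() const override { return rate_; }
    void Release() override { delete this; }
private:
    std::uint32_t rate_;
};
}  // namespace

extern "C" IEngine* CreateEngine(std::uint32_t sampleRate) { return new Engine(sampleRate); }
extern "C" std::uint32_t GetApiVersion() { return 1; }

void Example() {
    IEngine* engine = CreateEngine(48000);
    assert(engine->SampleRate() == 48000);
    engine->Release();               // freed by the module that allocated it
}
```

Calls through the interface go via the vtable, which is why the exe can use the methods without any of them being exported. Adding a new virtual function anywhere but the end of the interface would still break existing callers.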


Things I learned or had reinforced (but also still confuse me) from the video:

Don't export C++ classes. That was his advice. Don't. He showed how to, then said don't. I have read this before a few times too. This is where a lot of my queries still lie.

So don't export C++ classes, OK. What about interfaces / ABCs? Do you even need to export them or can you just use them? (I feel you don't need to export them, they just declare the object you're going to get from the factory function.)

From my research it seems I should have C-style factory functions exported to produce (as ABCs) pointers to objects generated in the dll. Is it then OK to use the ABC's from inside the dll?

This was how I was doing it, but wasn't sure if I was just getting lucky.

I did do a test project where I took the dll built in release (pretty sure) and put it in the main project folder and included only the header files for the ABCs and the factory function from the dll. I think it needed the lib file as well. It is in the folder, pretty sure I needed it. I also built it statically linked to the runtime library. This worked.

For further robustness / if I was expecting others to use it I would think dll link to runtime would be better, for the tradeoff they need it installed. Files (on disk and in memory) would be smaller though, which is a win.



What about the types that are safe to pass to a dll? From reading, it makes sense now that (as an example) directx makes use of descriptor structs with simple types in it as parameters to pass to make a function do more than one thing / return various types that all inherit the same interface.

I've heard that the only truly safe types to use are BYTE, WORD, DWORD. Then there was talk about BOOL, as a typedef of BYTE (or WORD?).
But in directx descriptors I'm pretty sure I've seen float, and maybe UINT?

What is the real deal on this? I will likely start using struct descriptors with simple types to make my exported functions more useful. But yeh, dunno if it is correct.
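A sketch of what such a descriptor might look like, assuming a hypothetical BufferDesc and ValidateBufferDesc (loosely in the style of the DirectX *_DESC structs, not copied from them). It uses only fixed-width integers and float, carries its own size so the dll can reject a struct built against a different header, and pins the layout with static_assert.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

struct BufferDesc {
    std::uint32_t structSize;     // caller sets this to sizeof(BufferDesc)
    std::uint32_t sampleRate;
    std::uint16_t channels;
    std::uint16_t bitsPerSample;
    float         gain;
};

// Fail the build if the layout ever stops being plain data of the expected size.
static_assert(std::is_standard_layout<BufferDesc>::value, "must be plain data");
static_assert(std::is_trivially_copyable<BufferDesc>::value, "must be plain data");
static_assert(sizeof(BufferDesc) == 16, "layout changed, bump the API version");

// Hypothetical export. Returns 0 on success, a nonzero code otherwise.
extern "C" std::int32_t ValidateBufferDesc(const BufferDesc* desc) {
    if (desc == nullptr || desc->structSize != sizeof(BufferDesc)) return 1;
    if (desc->sampleRate == 0 || desc->channels == 0) return 2;
    return 0;
}

void Example() {
    BufferDesc desc{};
    desc.structSize = sizeof(BufferDesc);
    desc.sampleRate = 44100;
    desc.channels = 2;
    desc.bitsPerSample = 16;
    desc.gain = 1.0f;
    assert(ValidateBufferDesc(&desc) == 0);
}
```

The size field is the same trick many Win32 structs use: it lets a newer dll tell an old caller from a new one without changing the function signature.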



And (assuming it is OK to use interfaces from a dll) once you have your object, is it now OK to pass it types that may be a different size to what it was compiled with? Say the header of this object returned from a dll declares the function DoStuff(int numberOfTimes). My code using this dll might be compiling with int being a different size to what the object will be created with right? (Unless dynamic link to the runtime...) How can that work without dynamic runtime linking. And does that even solve it?



Also, finding out which types are guaranteed to be a fixed size is so hard. Some sources say that uint32_t is definitely 32-bit, while some say not necessarily... When I look at the definition of uint32_t it is just unsigned int. But using unsigned int isn't guaranteed a certain width... I've even heard that BYTE, WORD, DWORD may not be 100% as they are based on the definition of char(?) and that a byte isn't necessarily 8 bits.




Then there's the memory allocation & de-allocation must be in the same module(?) or heap corruption. Still unsure of what this means exactly. I feel if I use unique_ptr to generate and automatically destroy it will be fine, but for members that are pointers that I want to hold the objects they use externally so they can be shared, is that OK?



Also so much of stack overflow is about making sure everything is generic and templated and works on all operating systems for every scenario... why? Is everyone developing libraries for any and all developers to use? I would think that developing software for users is the vast majority of what developers and programmers do.


Also I guess I forgot the other option of loading the dll using LoadLibrary, which seems very similar to C# using reflection to dig around and get functions out that you haven't been told are there but think might be. This seems like a last resort to me, but also probably the most robust in terms of rebuilding not breaking things as it looks for functions by name. I imagine this might not work if you export with NONAME in the def file? Or name mangling? Not sure how that affects getting things by name.
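For what it's worth, explicit loading looks roughly like the sketch below. LoadFactory and the CreateFn signature are hypothetical; LoadLibraryA, GetProcAddress and FreeLibrary are the real Win32 calls. The non-Windows branch is only there so the snippet compiles elsewhere.

```cpp
#include <cassert>

#if defined(_WIN32)
  #include <windows.h>
#endif

// Signature of the hypothetical factory we expect the dll to export.
using CreateFn = void* (*)(int);

// Returns the factory, or nullptr if the dll or the export is missing.
// On success the library is deliberately left loaded so the pointer stays valid.
CreateFn LoadFactory(const char* dllPath, const char* exportName) {
#if defined(_WIN32)
    HMODULE module = ::LoadLibraryA(dllPath);
    if (module == nullptr) return nullptr;

    FARPROC proc = ::GetProcAddress(module, exportName);
    if (proc == nullptr) {
        ::FreeLibrary(module);
        return nullptr;
    }
    return reinterpret_cast<CreateFn>(proc);
#else
    (void)dllPath;
    (void)exportName;
    return nullptr;   // LoadLibrary is Windows-only
#endif
}

void Example() {
    // A missing dll is reported as nullptr rather than a load-time failure.
    assert(LoadFactory("no_such_library_for_this_example.dll", "CreateWidget") == nullptr);
}
```

Lookup by name needs the export to have a name: an export marked NONAME can only be found by ordinal, by passing MAKEINTRESOURCEA(ordinal) in place of the name. A C++-mangled export can be found by name, but only by spelling out the decorated name exactly, which is one more reason the factories are usually extern "C".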

albinopapa
Posts: 4373
Joined: February 28th, 2013, 3:23 am
Location: Oklahoma, United States

Re: Dlls and ABI

Post by albinopapa » April 8th, 2021, 10:30 pm

crazy thing about name mangling is I think each compiler writer mangles names differently. C functions don't get mangled nearly as bad as C++ functions.

each dll and exe is a separate module, though a loaded dll runs inside the exe's process and shares its address space. If each module links its own copy of the runtime, each gets its own heap, so memory allocated by one module should be freed by that same module. If you allocate memory from the exe then hand ownership of it to the dll or vice versa, then this is where you might run into issues. Both sides also have to agree on the layout of any data passed across, including structure packing. If a structure was packed so that the members are not naturally aligned in one but not both, then the sizes could be different.
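To make the packing point concrete, here is a small sketch with two hypothetical structs that have identical members and differ only in packing. #pragma pack(push, 1) is accepted by MSVC, GCC and Clang.

```cpp
#include <cstddef>
#include <cstdint>

// Default packing: the compiler pads after `kind` so `length` is aligned.
struct NaturalHeader {
    std::uint8_t  kind;
    std::uint32_t length;
};

// Same members, packed to 1-byte boundaries: no padding at all.
#pragma pack(push, 1)
struct PackedHeader {
    std::uint8_t  kind;
    std::uint32_t length;
};
#pragma pack(pop)

static_assert(sizeof(PackedHeader) == 5, "1 + 4 bytes, no padding");
static_assert(offsetof(PackedHeader, length) == 1, "length directly follows kind");

// On the usual desktop ABIs, sizeof(NaturalHeader) is 8 and `length` sits at
// offset 4. A module built with one definition would read garbage from a
// struct filled in by a module built with the other.
```

Putting the pack pragma (or none) in the shared header itself, rather than in project settings, keeps both modules on the same layout.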

I would assume most people want generic code so that they can use it in different platforms. Also, it's nice to get answers about multiple operating environments.

The fixed sized integers are
8 bits = std::uint8_t, std::int8_t
16 bits = std::uint16_t, std::int16_t
32 bits = std::uint32_t, std::int32_t
64 bits = std::uint64_t, std::int64_t

The standard only guarantees minimum sizes for types like int and long ( at least 16 bits and 32 bits respectively ). So now these aliases are used to guarantee an exact width. They are technically optional, so the only times I know of that one might be missing is on unusual hardware such as small appliances, where a type of that exact width may not exist.

If you look into it, UINT is an alias for unsigned int and DWORD is an alias for unsigned long; both are 32 bits on Windows.
WORD is an alias for unsigned short;
BYTE is an alias for unsigned char;
So they are the same widths as: std::uint32_t, std::uint16_t and std::uint8_t
I've read that data shared across process borders should be handled similar to transferring data over the internet, in bytes, though this seems silly if you are the one in control of all the material.
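If you want the compiler to confirm those widths rather than take anyone's word for it, a few static_asserts in the shared header will do it. This is just a sketch; BitsOf is a made-up helper, and the Windows block only compiles where windows.h exists.

```cpp
#include <climits>
#include <cstddef>
#include <cstdint>

// Number of bits in T on this compiler.
template <typename T>
constexpr std::size_t BitsOf() { return sizeof(T) * CHAR_BIT; }

static_assert(CHAR_BIT == 8, "this code assumes 8-bit bytes");
static_assert(BitsOf<std::uint8_t>()  == 8,  "uint8_t is 8 bits");
static_assert(BitsOf<std::uint16_t>() == 16, "uint16_t is 16 bits");
static_assert(BitsOf<std::uint32_t>() == 32, "uint32_t is 32 bits");
static_assert(BitsOf<std::uint64_t>() == 64, "uint64_t is 64 bits");

#if defined(_WIN32)
  #include <windows.h>
  static_assert(BitsOf<BYTE>()  == 8,  "BYTE is unsigned char");
  static_assert(BitsOf<WORD>()  == 16, "WORD is unsigned short");
  static_assert(BitsOf<DWORD>() == 32, "DWORD is unsigned long");
  static_assert(BitsOf<UINT>()  == 32, "UINT is unsigned int");
  static_assert(BitsOf<BOOL>()  == 32, "BOOL is int, not BYTE");
#endif
```

If any of these ever fails, the build stops, which is a lot better than finding out at runtime.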
If you think paging some data from disk into RAM is slow, try paging it into a simian cerebrum over a pair of optical nerves. - gameprogrammingpatterns.com

FirepathAgain
Posts: 30
Joined: March 20th, 2021, 3:01 am

Re: Dlls and ABI

Post by FirepathAgain » May 21st, 2021, 11:16 pm

Thanks papa.

It's been a while and I arrived at solutions / understanding that works for me and makes me feel ok.

In my application it basically doesn't matter but I've done all the things to make it as compatible as possible. At least if it needs fixing, it doesn't need much.

My interface basically ends up like COM stuff without the COM registration stuff and the whole reference counting thing, which I didn't need as I generate new objects where I need them. I can see why things like DirectX are done that way, with their descriptors passed in to generate a thing.


When I get bothered enough I'll come back and update with my findings and understanding.
