memcpy

npissoawsome
Posts: 114
Joined: June 8th, 2012, 3:01 pm

Re: memcpy

Post by npissoawsome » October 27th, 2012, 2:43 am

chili wrote: Yup, the string instructions work in 64-bit mode.

Okay sweet, I need to research this instruction lol

Too busy reading NVIDIA's CUDA C manual. We should do some tutorials on OpenCL/CUDA.

chili
Site Admin
Posts: 3948
Joined: December 31st, 2011, 4:53 pm
Location: Japan

Re: memcpy

Post by chili » October 27th, 2012, 2:51 am

Yeah, movsd (and movsq in 64-bit mode) are pretty good when used with the REP prefix. Not as fast as SSE or AVX, but the next best thing.
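
For anyone following along, here is a rough sketch of what a REP MOVSQ copy could look like from C. It assumes GCC or Clang inline assembly on x86-64 and falls back to plain memcpy everywhere else. The name copy_rep_movs is made up for this example, and it is not meant as a replacement for the library memcpy.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not from this thread): copies n bytes from src to dst.
   The two regions must not overlap, same as memcpy. */
static void *copy_rep_movs(void *dst, const void *src, size_t n)
{
#if (defined(__GNUC__) || defined(__clang__)) && defined(__x86_64__)
    void *d = dst;
    const void *s = src;
    size_t qwords = n / 8; /* number of whole 8-byte chunks */
    size_t tail = n % 8;   /* leftover bytes after the chunks */

    /* REP MOVSQ copies RCX quadwords from [RSI] to [RDI].
       The ABI guarantees the direction flag is clear, so the copy runs forward. */
    __asm__ volatile("rep movsq"
                     : "+D"(d), "+S"(s), "+c"(qwords)
                     :
                     : "memory");

    /* REP MOVSB finishes the remaining 0..7 bytes; RDI and RSI
       already point just past the quadwords copied above. */
    __asm__ volatile("rep movsb"
                     : "+D"(d), "+S"(s), "+c"(tail)
                     :
                     : "memory");
    return dst;
#else
    /* Assumption: on other compilers or targets, just use the library routine. */
    return memcpy(dst, src, n);
#endif
}
```

Calling it looks the same as memcpy, e.g. copy_rep_movs(dst, src, sizeof src). Whether it beats the library version depends on the CPU and the copy size, so measure before relying on it.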

I actually wouldn't mind doing GPGPU someday (although it has little to do with game dev). I once wrote a neuro-genetic simulator that offloaded the synapse step processing to the GPU via CUDA. Worked pretty nicely.
Chili

Asimov
Posts: 814
Joined: May 19th, 2012, 11:38 pm

Re: memcpy

Post by Asimov » October 27th, 2012, 8:41 am

Hi all,

I am following this thread with interest, but I haven't got a clue what you are talking about LOL
----> Asimov
"You know no matter how much I think I have learnt. I always end up hitting brick walls"
http://www.asimoventerprises.co.uk

npissoawsome
Posts: 114
Joined: June 8th, 2012, 3:01 pm

Re: memcpy

Post by npissoawsome » October 27th, 2012, 10:25 pm

chili wrote: Yeah, movsd (and movsq in 64-bit mode) are pretty good when used with the REP prefix. Not as fast as SSE or AVX, but the next best thing.

I actually wouldn't mind doing gpgpu someday (although it has little to do with game dev). I once wrote neuro-genetic simulator that offloaded the synapse step processing to GPU via CUDA. Worked pretty nice.

GPGPU would work because you can accelerate your own custom code on your GPU. DirectX only accelerates its own code. For example, if you used CUDA for alpha blending, it would be incredibly fast.

chili
Site Admin
Posts: 3948
Joined: December 31st, 2011, 4:53 pm
Location: Japan

Re: memcpy

Post by chili » October 28th, 2012, 1:44 am

Realtime graphics processing works much better (faster) if written as shaders as opposed to a CUDA kernel.

Also, the rasterization engine will do alpha blending faster than any shader program or CUDA solution ever could.
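
For reference, the usual "source over" alpha blend is just result = src * alpha + dst * (1 - alpha) per color channel. Here is a rough CPU-side sketch in C to show the arithmetic. The function names and the 0xAARRGGBB pixel layout are assumptions made for this example, and the rounding is one common choice, not a description of what any particular GPU does.

```c
#include <stdint.h>

/* Hypothetical helper: blend one 8-bit channel with alpha in 0..255.
   result = (src * alpha + dst * (255 - alpha)) / 255, rounded to nearest. */
static uint8_t blend_channel(uint8_t src, uint8_t dst, uint8_t alpha)
{
    unsigned v = (unsigned)src * alpha + (unsigned)dst * (255u - alpha);
    return (uint8_t)((v + 127u) / 255u);
}

/* Hypothetical helper: blend a 0xAARRGGBB source pixel over an opaque
   0x00RRGGBB destination pixel and return the new 0x00RRGGBB value. */
static uint32_t blend_pixel(uint32_t src, uint32_t dst)
{
    uint8_t a = (uint8_t)(src >> 24);
    uint32_t r = blend_channel((uint8_t)(src >> 16), (uint8_t)(dst >> 16), a);
    uint32_t g = blend_channel((uint8_t)(src >> 8), (uint8_t)(dst >> 8), a);
    uint32_t b = blend_channel((uint8_t)src, (uint8_t)dst, a);
    return (r << 16) | (g << 8) | b;
}
```

A fully opaque source replaces the destination and a fully transparent one leaves it untouched; everything in between is a weighted mix of the two.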
Chili

npissoawsome
Posts: 114
Joined: June 8th, 2012, 3:01 pm

Re: memcpy

Post by npissoawsome » October 28th, 2012, 6:28 am

chili wrote: Realtime graphics processing works much better (faster) if written as shaders as opposed to a CUDA kernel.

I assume that means that for graphics, DirectX is better to use than CUDA. If so, then yes, I totally agree. I was just saying you can't make custom code run on your GPU without CUDA or OpenCL.

chili wrote: Also, the rasterization engine will do alpha blending faster than any shader program or CUDA solution ever could.

I have no idea what a rasterization engine is. Could you explain?

chili
Site Admin
Posts: 3948
Joined: December 31st, 2011, 4:53 pm
Location: Japan

Re: memcpy

Post by chili » October 31st, 2012, 3:02 pm

Yeah, you can make custom code run on a GPU without GPGPU. That's exactly what a shader is: code you write to run on a GPU.

The rasterizer is the stage that comes after the pixel shader in a modern video card rendering pipeline. It handles writing to the framebuffer.
Chili
