#1 (permalink)
Guest
Posts: n/a
Pixel Shaders, Floating Point Precision. Morally Grungy Issue
(I apologise for uploading a few pictures, but I have no place to "store" them.)
OK, the first thing that needs to be understood here: precision in bits does not equate to color depth. It cannot be compared to 16-, 24- and 32-bit color. Precision is just that: how accurately the shader can do its calculations when shading textures, without visible error. Let's get down to why the shaders are such a morally grungy issue.

First, the DirectX 9.0 specification. FX12 (12-bit integer) is not supported by the DX 9.0 specification. It gives 12 bits of precision, but it is not floating point at all; it is completely fixed point. 16-bit floating point precision is fully supported by the DirectX 9.0 specification, should the application "request it". 24-bit is the minimum required precision for floating point data where nothing is "specified", and 32-bit is the recommended precision. This is probably done to keep rendering errors from occurring in large scenes of floating point data. An example would be the shaded sky in 3DMark03, which is a "large" spread of pixels requiring high precision.

Now to the issue of optimising code, and "proper" shading code on the NV30 architecture compared to the R300, and the way each handles precision instructions. The common misconception about the NV30 architecture is that it is required to operate at one specific precision all the time. This is false. The NV30 can operate at 32-bit, 16-bit and FX12 simultaneously.

So say you're rendering three objects in a scene. Imagine a desk with an apple on it. The apple's surface is shaded entirely at full 32-bit floating point, giving ultimate precision for that apple. The stem is shaded at FX12 fixed-point precision; the stem doesn't need a great deal of precision, so this would be fine for that object. The desk surface is rendered at 16-bit precision, offering the precision required for a shiny desktop. Or whatever you want.
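As a rough illustration of the precisions discussed above, the following Python sketch rounds a value to the number of mantissa bits each format carries. The bit layouts are assumptions not stated in the post: 10 explicit mantissa bits for 16-bit float, 16 for the R300's 24-bit float, 23 for 32-bit float, and 10 fractional bits over a [-2, 2) range for the NV30's 12-bit fixed-point format. The function names are hypothetical, and exponent range limits are ignored.

```python
import math

# Assumed explicit mantissa widths; the post does not give bit layouts.
MANTISSA_BITS = {"fp16": 10, "fp24": 16, "fp32": 23}


def quantize_float(x: float, mantissa_bits: int) -> float:
    """Round x to the nearest float with `mantissa_bits` explicit mantissa bits.

    Exponent range (overflow, denormals) is ignored in this sketch.
    """
    if x == 0.0:
        return 0.0
    m, e = math.frexp(x)  # x == m * 2**e with 0.5 <= |m| < 1
    scale = 2 ** (mantissa_bits + 1)  # +1 accounts for the implicit leading bit
    return math.ldexp(round(m * scale) / scale, e)


def quantize_fixed(x: float, frac_bits: int = 10, lo: float = -2.0) -> float:
    """Round x to a fixed-point grid with `frac_bits` fractional bits,
    clamped to an assumed [-2, 2) range (12 bits total)."""
    step = 2.0 ** -frac_bits
    hi = -lo - step  # largest representable value
    return min(max(round(x / step) * step, lo), hi)


if __name__ == "__main__":
    for value in (0.3, 1000.3):
        print(f"value = {value}")
        print(f"  fx12: {quantize_fixed(value)!r}")
        for name, bits in MANTISSA_BITS.items():
            q = quantize_float(value, bits)
            print(f"  {name}: {q!r} (error {abs(q - value):.3g})")
```

For a coordinate-sized value such as 1000.3, the 16-bit result lands on 1000.5, because neighbouring values at that magnitude are 0.5 apart, while the fixed-point format cannot represent the value at all and clamps it. Small values such as 0.3 survive in every format with only a slight error.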
Mixing precisions per object like this is what Demigod and I would refer to as "optimal coding": using only as many resources as necessary to render a given scene, etc. Now, the R300 would handle that same scene differently. The apple would be downscaled to 24-bit, and the stem and the desk would both be upscaled to 24-bit. This could be "ideal", and possibly produce better results for the desk and stem, so you would receive higher image quality on those objects. However, some precision on the apple itself would be lost, and the apple would not look quite as "good".

Now here's the question: is ATI's 24-bit "ideal", or is Nvidia's method ideal? Once again, this comes down to what is being rendered. At a distance you probably will not see a difference between the objects. However, if you are close to the object and staring at it, you will see a difference.

Here's an example: 32-bit precision, with objects zoomed in and blown up, rendered where high precision is very necessary.

Here is 24-bit.

As you can see in this case, high precision is very necessary for rendering something this sensitive to precision over a certain variance of space. This application is designed to be sensitive to precision. Here's FX12 integer compared to Radeon 24-bit floating point:

FX 12-bit

Radeon 9600 Pro 24-bit

Now here's a situation where FX12 (aka 12-bit) would be fine, offering marginal differences in quality from 24-bit:

12-bit compared to 24-bit

As you can see from the demonstrations I've given you, shaders are sensitive to "what" is being textured, and to "how much" precision that object needs and how far it is zoomed into. Anyway, I hope this post gives some of you an idea of how shaders operate and a better understanding of the differences between the various precisions :P

Last edited by ChrisRay; June 1st, 2003 at 12:54.
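The desk-and-apple comparison in the post above can be put into numbers with a small Python sketch. It compares the spacing between neighbouring representable values just below 1.0 in each format, under assumed bit layouts that the post does not give (floats with 10, 16 and 23 explicit mantissa bits, and a 12-bit fixed-point format with 10 fractional bits). The names are hypothetical.

```python
# Spacing between neighbouring representable values just below 1.0.
# Assumed layouts (not given in the post): floats with 10/16/23 explicit
# mantissa bits, and a 12-bit fixed-point format with 10 fractional bits.
STEP_BELOW_ONE = {
    "fx12": 2.0 ** -10,
    "fp16": 2.0 ** -11,
    "fp24": 2.0 ** -17,
    "fp32": 2.0 ** -24,
}

# Per-object precision from the desk-and-apple example.
SCENE = {"apple": "fp32", "stem": "fx12", "desk": "fp16"}


def coarsening_at_uniform(scene: dict, uniform: str = "fp24") -> dict:
    """For each object, return how many times coarser (>1) or finer (<1)
    the value spacing becomes when everything runs at `uniform` precision."""
    return {
        obj: STEP_BELOW_ONE[uniform] / STEP_BELOW_ONE[requested]
        for obj, requested in scene.items()
    }


if __name__ == "__main__":
    for obj, ratio in coarsening_at_uniform(SCENE).items():
        change = "coarser" if ratio > 1 else "finer"
        factor = ratio if ratio > 1 else 1 / ratio
        print(f"{obj}: {factor:g}x {change} at 24-bit")
```

Under these assumptions the apple's values become 128 times coarser when forced to 24-bit, while the stem and desk become 128 and 64 times finer. Whether either change is visible depends, as the post says, on what is being shaded and how closely it is viewed.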

#2 (permalink)
Flood Yourself
Join Date: Aug 2001
Location: Adelaide, Australia
Posts: 1,342
In the end I think it's down to the level of precision that the actual coders want to include in their code. Better coders should appreciate the fact that the NV30 has the ability for them to choose the level of precision that they want to use; lazy coders, however, will probably welcome the fact that the ATi makes everything 24-bit and is therefore easier to code for (at the cost of flexibility).

#3 (permalink)
...
Join Date: Jan 2003
Location: Desa Park, Key Ell, Malaysia
Posts: 2,301
...I'm impressed by you, ChrisRay. Nice thread btw
__________________
System specs P4 2.6c | Abit AI7 | Radeon 9800pro | 1024mb kingston ddr400 | 120gb seagate sata | 40mb wd | And all the minor bits and pieces 3dMark2001SE

#4 (permalink)
Advanced Newbie
Join Date: Jul 2001
Location: Bogotá... not that it matters...
Posts: 4,902
Why impressed? We all know that ChrisRay knows what he is talking about.
__________________
Main Rig: Pentium Dual-Core E2160 @ 2.5GHz -- 7600GT @ 650/800MHz -- 2GB DDR2-667 4-4-4-12 -- Windows XP Pro SP3 Collecting dust: AMD Athlon XP 2600+ -- ATI Radeon 9500 Pro -- SB Live 5.1 Digital -- 2.5GB DDR 2-2-2-5 Ram -- Windows 2003 server r2 SP1

#5 (permalink)
...
Join Date: Jan 2003
Location: Desa Park, Key Ell, Malaysia
Posts: 2,301
Ah well, how many ppl in this forum would know as much as our dear ChrisRay here?

#6 (permalink)
sakabatou_x
Join Date: Apr 2001
Location: TEXAS (everything's bigger)
Posts: 1,256
OK, I was thrown by the end of your spiel. So Nvidia can use the different levels of precision, while ATI will down- or upscale the portions to one standard? If so, why are the FX images so poor in comparison to the Radeon 9600?
edit: is it because in the end, the Radeon ended up upscaling the entire scene?
edit2: OK, just saw the line describing that the pics are done in 24-bit by ATI and 12x by Nvidia...
edit3: I guess the ideal method all comes down to how much strain ATI's method puts on its card. If for the most part it ends up upscaling the majority of scenes and doesn't take serious performance hits for it, then more power to them; else Nvidia's method would seem to be the ideal, as it doesn't do more work than necessary. The issue for ATI would be the frequency with which its method actually downscales a scene. How common is it for images to use 32-bit precision?
__________________
andré:.thrasher10:.sakabatou_x:. emulator:epsxe:.os:xp_pro:.processor:amd_athlonxp_3000+:.mobo:epox_8rda3+:. gfx_card:radeon9800pro_256mb_ddr:.ram:1.5gb_ddr:.pad:super_dual_box_2p_usb_adapter:.
Last edited by andré; May 31st, 2003 at 18:00.

#7 (permalink)
Ah!
Join Date: Dec 2001
Location: Evanston Il
Posts: 545
Sounds like ATI has the right idea in general. It is nice that you can get 32-bit precision at all with nVidia cards, but with GPUs, if a small increase in rendering quality ain't fast, people don't really care. ATI's IQ is very nice, and you'd really have to try pretty hard in general situations to make ATI's path look much worse than nVidia's at 32-bit, and I'm sure the extra FPS are more desirable than the extra precision that you could potentially have. By giving an acceptable level of precision in the shader calculations, ATI has a nice speed/quality compromise that gets the job done w/o the complexity of dynamically executing calculations at multiple precisions, which could lead to shader compiling issues. It sounds like NVidia dug themselves into a complicated hole with their approach.
There also tend to be GPU-specific rendering paths in many games, since GPUs are nowhere near as predictable in their output as CPUs (you just need to cater to the limits of human visual perception), and that's what makes them more interesting. It boils down more to the specific hardware than to the DX9 abstraction layer, which seems to have failed in its task of completely abstracting away hardware implementations. The good programmers probably will know this, so it'd be the fault of the programmers rather than the cards if shader code came out looking like crap in their game. Again, even limited to 24-bit precision, there is plenty of room to maneuver so long as you don't write code specifically designed to break ATI's implementation. It's very nice that NVidia gives you plenty of precision options to render with, but I'd favor ATI's well-considered compromise in this situation.
Nonetheless, it still seems like a small point considering how overkill all these DX9 cards are for games of today, and that when DX9 itself becomes a widely used standard in things besides benchmarking, these 1st gen DX9 cards will undoubtedly be too slow and just be the cards you ran "Dawn" or "3dmark03" on. New DX10 architectures will probably allow for very fast native DX9 32-bit shader precision with no compromise at all in speed and quality, and they will be cards that you'll actually play games with. There definitely will be give and take in initial implementations of DX10, I'm sure, so I'd rather consider DX(N) cards polished DX(N-1) cards and ignore the latest DX(N). I've seen shots and vids of Dawn until I was sick of her, and I have the "nature" benchmark on a video file, so I've missed nothing.
__________________
Laptop: 1.6@2.13 Dothan 2MB (Pinmod), 2 GB 533mhz DDR2, GeForce 6800 360/730 256MB, 100 GB 5400 RPM, 17" UXGA+
Last edited by Raqia; June 1st, 2003 at 11:35.

#11 (permalink)
band
Join Date: Apr 2002
Location: HERE
Posts: 4,574
>When DX9 itself becomes a widely used standard in things besides benchmarking, these 1st gen DX9 cards will undoubtedly be too slow and just be the cards you ran "Dawn" or "3dmark03" on
So true...but still, I want a 9800 pro
__________________
1. Small, Cheap, Powerful - you can have only two 2. Everytime you engineer something foolproof, the world comes up with a better idiot. 3. 「学問とは虚栄である。」 The PS3 Cell Processor Explained Page, IBM Research's Introduction to Cell Multiprocessor and Cell Processor Programming Guide

#12 (permalink)
Nutcase
Join Date: Dec 2002
Location: Great Southern Land
Posts: 703
Given the amount of programmability going into current graphics hardware, we'll basically be going back to all-software rendering within a few years. We may as well forget about making 3d graphics hardware and just make all our software do its rendering with the CPU.
__________________
"I feel that if a person can't communicate, the very least he can do is to shut up!" - Tom Lehrer

#13 (permalink)
これはバタスです
Join Date: Jun 2001
Location: Toronto, Ontario, Canada
Posts: 5,811
But why do that when we can accelerate graphics by dividing up rendering tasks among the PC's components? That is essentially what hardware rendering is. It doesn't matter if you're using fixed function rendering or pixel shaders, it's still all hardware rendering. Consoles and arcade machines are built this way and they are very efficient.
I personally think GPU hardware should just be redesigned from the ground up with simpler functions and with infinite programmability (like CPUs). I'm sort of leaning towards a "RISC movement" for GPUs.
__________________
CPU: Intel Core 2 Quad Q9450 @ 2.66 Ghz (Yorkfield) Mobo: Intel DX48BT2 Memory: 2048 MB PC10600 DDR3 Videocard: PNY Geforce 9800 GX2 PCIe w/ 1024 MB GDDR3 Soundcard: On-board SigmaTel High Definition Audio Hard drive: 300 MB Maxtor & 1 TB Hitachi Optical drive: LG GGW-H20L (2x BD-R DL) OS: Microsoft Windows Vista (32-bit)

#14 (permalink)
Nutcase
Join Date: Dec 2002
Location: Great Southern Land
Posts: 703
So basically, we're going to end up with multiprocessor systems with cpus on the motherboard and graphics card?
And speaking of multiprocessors, take a look at the thread "New Computer Architecture" in this forum before it gets deleted.

#15 (permalink)
Flood Yourself
Join Date: Aug 2001
Location: Adelaide, Australia
Posts: 1,342
Not exactly. The GPU would still be used strictly for graphics calculations only. There's no point in giving a GPU instructions for doing anything other than graphics-related calculations; in fact, I think it should be avoided, to force programmers to stick to using the CPU for those calculations. However, with more ideas being thought up for graphics rendering, the GPU will eventually become some kind of RISC style GPU. Who knows though; it also depends on how far the line between the APIs and the cards stretches.

#16 (permalink)
Nutcase
Join Date: Dec 2002
Location: Great Southern Land
Posts: 703
Okay, what falls into the area of graphics calculations, and what doesn't?
>eventually become some kind of RISC style GPU.
Don't you mean RISC style CPU? Or am I reading that wrong?

#17 (permalink)
これはバタスです
Join Date: Jun 2001
Location: Toronto, Ontario, Canada
Posts: 5,811
Quote:
Quote:
Last edited by Demigod; June 5th, 2003 at 18:38.

#18 (permalink)
Nutcase
Join Date: Dec 2002
Location: Great Southern Land
Posts: 703
I think I understand now. The death of APIs is one thing I'm sure many programmers would be happy to see.
With this proposed new architecture, do you think it would be possible to have the GPU do raytracing fast enough for real-time applications?

#19 (permalink)
Puchiko-nyu!
Join Date: Jul 2001
Location: 49° 11' N 123° 10' W
Posts: 2,854
Quote:
Short answer: no, not until your basic CPU can handle tens of thousands more instructions per second than what we have now. Having said that, this does not prevent other shading/rendering techniques from being used instead to achieve real-time, lifelike, cinema-quality animations.
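For a sense of scale, the workload behind that answer can be estimated with simple arithmetic. The Python sketch below counts rays only; the resolution, frame rate and bounce count are illustrative assumptions rather than figures from the thread, and the function name is hypothetical.

```python
def rays_per_second(width: int, height: int, fps: int,
                    rays_per_pixel: int = 1, bounces: int = 0) -> int:
    """Count rays traced per second: each primary ray spawns one
    secondary ray per bounce (a deliberately simple model)."""
    rays_per_frame = width * height * rays_per_pixel * (1 + bounces)
    return rays_per_frame * fps


if __name__ == "__main__":
    print(rays_per_second(1024, 768, 30))             # primary rays only
    print(rays_per_second(1024, 768, 30, bounces=2))  # with two bounces
```

At an assumed 1024x768 and 30 frames per second that is about 23.6 million primary rays every second, each of which needs its own scene-intersection tests, before any reflections or shadows are added.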
__________________
"Not every ejaculation deserves a name." --- George Carlin

#20 (permalink)
Nutcase
Join Date: Dec 2002
Location: Great Southern Land
Posts: 703
Quote:
Quote:
Would it be possible to use the GPU and the CPU as a kind of multiprocessor system for graphics stuff?