Emuforums.com

Old October 1st, 2009   #21 (permalink)
Behind ur girlfriend :D
 
Squall-Leonhart's Avatar
 
Join Date: Feb 2006
Location: Sydney, Australia
Posts: 18,910
The scheduler has evolved to fix that, RAP.

btw, those reports of 8800s falling to pieces are caused by vendors cheaping out on build quality. BFG is well known for having 7xxx, 8xxx and 9xxx cards that fall to bits (even having replacement cards fall to pieces in the box during RMA).
__________________


VBA-M | Xtemu | NGOHQ | Post Impact Productions | TNHW | XBCD 0.2.6 | Satanic666's Emulator Compiles
Don't be a NOOB, READ THE NGEmu/EmuForums Rules of Conduct
Need Help with ePSXe? This is your first stop!.

If you don't post all the required information, you don't get help.
Everytime someone posts a romsite, God kills a beautiful woman.
Squall-Leonhart is offline   Reply With Quote

Old October 1st, 2009   #22 (permalink)
Registered User
 
KrossX's Avatar
 
Join Date: Mar 2006
Location: Argentina
Posts: 926
Quote:
Originally Posted by masta.g.86 View Post
Personally, I think the native execution of C++ and Fortran code to be the most interesting aspect. But why is there native execution of Fortran code? Isn't Fortran Intel's native CPU compiler? Is nVidia teaming with Intel (unlikely I think) or is this a response to Larrabee?

I think it's a response to Larrabee. Seems like nVidia is just kicking Intel in the balls with a faster, better and more importantly, a product that will actually be in production for purchase within a reasonable timeframe rather than just appearing in a few crappy tech demos. (Larrabee)
Fortran is not Intel's native compiler. It is a language, and Intel has a compiler for it, as do a bunch of others. But now one for nVIDIA has joined the bunch, which I think will go nicely with HP Fortran ^_^. And yes, I also see that as a response to Larrabee.
__________________
KrossX is offline   Reply With Quote
Old October 1st, 2009   #23 (permalink)
Behind ur girlfriend :D
 
Squall-Leonhart's Avatar
 
Join Date: Feb 2006
Location: Sydney, Australia
Posts: 18,910
Fermi, once again, fixes this. Fermi's global dispatch logic can now issue multiple kernels in parallel to the entire system. At more than twice the size of GT200, the likelihood of idle SMs went up tremendously. NVIDIA needs to be able to dispatch multiple kernels in parallel to keep Fermi fed.
Application switch time (moving between GPU and CUDA mode) is also much faster on Fermi. NVIDIA says the transition is now 10x faster than GT200, and fast enough to be performed multiple times within a single frame. This is very important for implementing more elaborate GPU accelerated physics (or PhysX, great …).
The connections to the outside world have also been improved. Fermi now supports parallel transfers to/from the CPU. Previously CPU->GPU and GPU->CPU transfers had to happen serially.


AnandTech: NVIDIA's Fermi: Architected for Tesla, 3 Billion Transistors in 2010
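
To put the parallel-transfer point in rough numbers, here is a minimal Python sketch of a batch of chunks that each need one upload and one download. The function names and the per-chunk times are made up for illustration and are not measurements of any card, and the model ignores kernel execution time entirely.

```python
def serial_bus_time(n_chunks, upload_s, download_s):
    """Only one direction is active at a time, so every transfer queues up."""
    return n_chunks * (upload_s + download_s)


def duplex_bus_time(n_chunks, upload_s, download_s):
    """Uploads and downloads may overlap, but a chunk's download
    still has to wait until that chunk's own upload has finished."""
    upload_done = 0.0
    download_done = 0.0
    for _ in range(n_chunks):
        upload_done += upload_s
        download_done = max(upload_done, download_done) + download_s
    return download_done


if __name__ == "__main__":
    # Hypothetical figures: 4 chunks, 2 time units up and 3 down per chunk.
    print("serial :", serial_bus_time(4, 2, 3))
    print("overlap:", duplex_bus_time(4, 2, 3))
```

With these made-up figures the serial model needs 20 time units and the overlapped one 14, so the gain depends on how evenly traffic is split between the two directions.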
Squall-Leonhart is offline   Reply With Quote
Old October 1st, 2009   #24 (permalink)
Hackin 'n Slashin
 
SCHUMI_4EVER's Avatar
 
Join Date: Jan 2007
Location: Corrupt Rapist run South Africa
Posts: 11,346
Quote:
Originally Posted by Cid Highwind View Post
The build quality of ATi's cards of these past generations is much higher apparently.
Or they are just throttling all their cards keeping them away from the breaking point.
__________________
Intel Core2Quad Q9550 (2.83Ghz stock) | ASUS P5Q | 2x2GB Transcend JetRam DDR2-800 | ASUS ENGTX260\HDTP\896M | Windows Vista Home Premium 64bit SP1
The Champ has retired but may his Legacy live on FOREVER !!!!
Get it right fools! The glass is HALF-EMPTY, not half-full!!!
!!! WARNING: Emulation requires a brain !!! WARNING: Emulation =/= Piracy !!!
SCHUMI_4EVER is online now   Reply With Quote
Old October 1st, 2009   #25 (permalink)
Human Metal
 
Cid Highwind's Avatar
 
Join Date: Oct 2002
Location: Holland / Hungary
Posts: 13,548
Quote:
Originally Posted by SCHUMI_4EVER View Post
Or they are just throttling all their cards keeping them away from the breaking point.
Well then we've found one big difference:

ATi decided to limit their hardware in such a way that artificial stress tests that don't translate to real gaming situations won't run at full speed, thus limiting your benchmark number. In order to still break your card you would have to rename the executable and then run something as trivial as that. Seriously, who runs these tests in the first place?

I feel pretty confident about my GPU knowing that this "crappily built" card's health is only at risk if I decide to do something reckless with it. If I had an nVidia card I'd worry about whether they had messed up the soldering on my card, something that will have consequences even if I limit myself to running the applications my GPU was designed for: games.

So, let me quote from the Anandtech article linked by Squall.
Quote:
Originally Posted by Anandtech
NVIDIA's architecture is designed to address its primary deficiency: the company's lack of a general purpose microprocessor. As such, Fermi's enhancements over GT200 address that issue. While Fermi will play games, and NVIDIA claims it will do so better than the Radeon HD 5870, it is designed to be a general purpose compute machine.

ATI's approach is much more cautious. While Cypress can run DirectX Compute and OpenCL applications (the former faster than any NVIDIA GPU on the market today), ATI's use of transistors was specifically targeted to run the GPU's killer app today: 3D games.
Interesting read, it just boils down to the same questions as with the GTX2XX series: Is it relevant for gaming? And, is it available at a competitive pricepoint? Because all these new innovations seem to be targeted at the professional market, something ATi has ignored with the 4k and 5k series, and which has gained them a good advantage in the gaming business.
__________________

PC Specs:
CPU: Intel Q8200 @ 2.8GHz
GPU: Sapphire ATi HD4870 / 1024MB / Core: 801 / Mem: 1000
Mobo: Gigabyte EP35-DS3 (rev 2.1)
SPU: Creative X-Fi Xtreme Music
RAM: 2GB Kingston HyperX DDR2 1066 @ 4-4-4-12 ~ 800MHz
HDD: 1TB Samsung Spinpoint, 32MB
PSU: Hiper Type-R 580W
Monitor: Iiyama B2403WS / 1920*1200

Geometry Wars: 198.400
Lumines: 999.999

Join the NGEmu Folding @ Home community NOW!
Cid Highwind is offline   Reply With Quote
Old October 1st, 2009   #26 (permalink)
Registered User
 
Join Date: Aug 2009
Location: UK
Posts: 30
Who cares if this isn't just for gaming, I wonder (or drool) about the possibilities for Video Encoding, graphic/photoshop work along with cuda filters, Avisynth cuda-accelerated filters like FF3DCuda, or a hardware-assisted De-Interlacer with the quality of MCBob but the speed of a dumb bobber.
DKT70 is offline   Reply With Quote
Old October 1st, 2009   #27 (permalink)
No sir. I don't like it.
 
masta.g.86's Avatar
 
Join Date: Oct 2008
Location: Alabama
Posts: 2,456
A CUDA or DirectCompute version of AVISynth would be absolutely friggin awesome.
__________________
Quote:
The truth is there for those who choose to see it.
Phenom II X4 @ 3.6GHz | 4GB OCZ Dominator DDR3 @ 1600MHz
Sapphire Vapor-X Radeon HD4850 | Samsung TOC 24" 1920x1200
Auzentech X-Fi Forte 7.1 | Klipsch Promedia 5.1 THX
LG H20L BD-RE | WD Caviar Black 1TB 7200RPM
GIGABYTE GA-MA790FXT-UD5P | Windows 7 x64 Ultimate

Join the NGEmu Folding@Home Team! Info
Download the standard client here OR preferably download the GPU or SMP client here.
Set your team ID to: 161326
NGEmu Stats Page: Here
masta.g.86 is offline   Reply With Quote
Old October 1st, 2009   #28 (permalink)
Mobile Fanatic
 
runawayprisoner's Avatar
 
Join Date: Nov 2006
Location: Santa Cruz, CA
Posts: 6,207
Quote:
Originally Posted by Squall-Leonhart View Post
The scheduler has evolved to fix that, RAP.
It can only do so much. If anything, it means you won't clog the system trying to communicate with the GPU, but it all boils down to one thing: the bus, PCI-E 2.0 x16 or not, just isn't fast enough to make GPU offloading pay off if the CPU itself can already do the same thing in a few clock cycles. That's why very calculation-intensive tasks such as video encoding and folding would see a performance boost if you offload them to the GPU, but if you offload something like 1 + 1 over and over again then doing so on your CPU would be substantially faster.

Quote:
Originally Posted by Squall-Leonhart View Post
Fermi now supports parallel transfers to/from the CPU. Previously CPU->GPU and GPU->CPU transfers had to happen serially.
Great, better utilization of PCI-E bandwidth. But quite honestly... I can't see how this would be relevant to gaming or anything else outside of whatever the hell CUDA has allowed up to now.

Well, it might be relevant if you do ray-tracing on 3 or more of these units, though. Beats having to connect several PS3s through LAN because you can't be arsed to optimize a ray-tracer specifically for only one of them. (the HELL with having only 256MB of RAM to work)

Quote:
Originally Posted by DKT70 View Post
Who cares if this isn't just for gaming, I wonder (or drool) about the possibilities for Video Encoding, graphic/photoshop work along with cuda filters, Avisynth cuda-accelerated filters like FF3DCuda, or a hardware-assisted De-Interlacer with the quality of MCBob but the speed of a dumb bobber.
CUDA filters might not be faster than filters running on the fastest CPU on the planet... say... Core i7 at 4.2GHz, because filters typically branch a lot. The branching has to happen on CPU IIRC, and the calculations are usually not too intensive so the CPU can take care of all of that by itself.

It's like you... have the math problem "1 + 1 = ?". Typically... you can solve it yourself in less than a sec (this is doing it on CPU), but you send an email to your friend who is half the globe away, and you have to wait a day or even a week to receive an answer (this is doing it on CUDA).

Obviously if the question was like "PI to the 1 billionth digit?" then you might as well send the email... but typically, filters don't go that far. Since your color range doesn't stray too far from 0 - 255, you only have to calculate that much, which means modern CPUs are way too capable of doing so all by themselves, and offloading to CUDA is a waste of time.

Video encoding would be faster, though, because it's just brute-forcing numbers. You just have to send an entire frame over... and you get an entire frame back, since all of it is processed the same way. That's fast!
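
The "1 + 1 by email" trade-off can be sketched as a back-of-the-envelope cost model in Python. Everything here is an assumption for illustration: the function names, the 10 microsecond per-transfer latency, the 8 GB/s per-direction figure (the theoretical peak of PCI-E 2.0 x16) and the CPU/GPU compute times are placeholders, not measurements.

```python
def offload_time(bytes_each_way, gpu_compute_s,
                 latency_s=10e-6, bandwidth_bytes_per_s=8e9):
    """Estimated wall time to ship data to the GPU, compute, and ship it back.

    latency_s and bandwidth_bytes_per_s are assumed values, not measured ones.
    """
    one_way = latency_s + bytes_each_way / bandwidth_bytes_per_s
    return 2 * one_way + gpu_compute_s


def worth_offloading(cpu_compute_s, bytes_each_way, gpu_compute_s, **bus):
    """True when the round trip plus GPU work beats doing it on the CPU."""
    return offload_time(bytes_each_way, gpu_compute_s, **bus) < cpu_compute_s


if __name__ == "__main__":
    # "1 + 1": about a nanosecond of work on either side, 8 bytes moved.
    print(worth_offloading(1e-9, 8, 1e-9))
    # One 1920x1080 RGBA frame, hypothetically 50 ms on CPU vs 5 ms on GPU.
    print(worth_offloading(0.050, 1920 * 1080 * 4, 0.005))
```

Under these assumed numbers the tiny job loses to the fixed round-trip latency, while the whole-frame job wins because the transfer cost is small next to the compute saved.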
__________________
cChip interpreter WIP - current status: Release Candidate
LRx Filter RC - current performance rating: 9/10
runawayprisoner is offline   Reply With Quote
Old October 1st, 2009   #29 (permalink)
Registered User
 
Join Date: Aug 2009
Location: UK
Posts: 30
Quote:
CUDA filters might not be faster than filters running on the fastest CPU on the planet... say... Core i7 at 4.2GHz, because filters typically branch a lot. The branching has to happen on CPU IIRC, and the calculations are usually not too intensive so the CPU can take care of all of that by itself.
FFT3dGPU says otherwise. I have yet to see a CPU run fft3dfilter as fast as my aging 9800GT runs FFT3dGPU. A fully multi-threaded fft3dfilter might run a bit faster than its GPU version, but such a version doesn't exist, only a sort of hacked-in workaround that plays more nicely with MT, and it's still as unpredictable as it ever was. Sometimes, running with 4 threads, I have to leave the room in case I sneeze and avisynth crashes out in the middle of a 4-hour encode.
DKT70 is offline   Reply With Quote
Old October 1st, 2009   #30 (permalink)
Behind ur girlfriend :D
 
Squall-Leonhart's Avatar
 
Join Date: Feb 2006
Location: Sydney, Australia
Posts: 18,910
RAP

the 8800 series were the first to allow full fps in pj64 with copy framebuffer to rdram enabled

the GT300 will expand this to more games :P

Quote:
Originally Posted by Cid Highwind View Post
Well then we've found one big difference:

ATi decided to limit their hardware in such a way that artificial stress tests that don't translate to real gaming situations won't run at full speed, thus limiting your benchmark number. In order to still break your card you would have to rename the executable and then run something as trivial as that. Seriously, who runs these tests in the first place?

I feel pretty confident about my GPU knowing that this "crappily built" card's health is only at risk if I decide to do something reckless with it. If I had an nVidia card I'd worry about whether they had messed up the soldering on my card, something that will have consequences even if I limit myself to running the applications my GPU was designed for: games.

So, let me quote from the Anandtech article linked by Squall.
Interesting read, it just boils down to the same questions as with the GTX2XX series: Is it relevant for gaming? And, is it available at a competitive pricepoint? Because all these new innovations seem to be targeted at the professional market, something ATi has ignored with the 4k and 5k series, and which has gained them a good advantage in the gaming business.
Well, since games are dependent on float, SM and SP performance, there's no doubt that games will benefit from the design and improvements.

....I'm thinking it's time they cut the SM3< chips from the drivers though.... since that'll remove a bulk of code that needs to be checked over every new build and allow for faster and less buggy drivers to be released

Last edited by Squall-Leonhart; October 1st, 2009 at 21:51.. Reason: Automerged Doublepost
Squall-Leonhart is offline   Reply With Quote
Old October 1st, 2009   #31 (permalink)
&-)---|--<
 
fivefeet8's Avatar
 
Join Date: Apr 2001
Location: Smallville
Posts: 7,698
Quote:
Originally Posted by runawayprisoner View Post
So I'd say native support for C++ and Fortran will... depend on what exactly are supported. Otherwise it's just your general stuffs, which is why even with all this talk about how awesome CUDA is, you can only "fold" or encode videos on it to name a few tasks.
The programming model has changed a bit with Fermi.
*Indirect branching
*Fine grained Exception handling (C++)
*Recursion
*Pointers
*Object references

Quote:
Originally Posted by runawayprisoner View Post
That's one thing. The other is... while the specs look impressive, the important thing is the performance, and I don't have to remind anyone that this time around SPs ain't SPs anymore (lest nVidia is a good liar), and the GPU itself will be geared more towards extreme computing rather than extreme graphics.
An SP is still an SP. From all the technical articles, they are still similar to the SPs in previous hardware. They have been enhanced to a point though, and from what's been detailed, it shouldn't lower performance per SP compared to previous hardware. One example is that in previous cards, each SP could only do 1 MAD all of the time and dual-issue a MUL sometimes. With Fermi, each SP can do a MAD operation and dual-issue another MAD all the time.
__________________
Play emulated games online
Main Rig||Intel Q6600@3.2 ghertz|4x1gb DDR2 1066|Asrock 1600sli 110db LGA 775|EVGA 8800gtx@620/1450/975|2x Seagate 160gb SATA150 Raid0|250 gb Samsung SATA2 HD|Seagate 7000.10 500gb HD|NEC 3520 4x/8x DVD+R/RW DL burner|Antec TP 650 watt|40" Sony Bravia 1080p|20.1" 5ms LCD 1680x1050 Native|Logitec 5.1 Speaker System w/15" Sub|Dual Boot Ubuntu 64bit/Vista 32bit||


SimplyBuyIt - Health&Nutrition

Last edited by fivefeet8; October 1st, 2009 at 22:59..
fivefeet8 is offline   Reply With Quote
Old October 1st, 2009   #32 (permalink)
Behind ur girlfriend :D
 
Squall-Leonhart's Avatar
 
Join Date: Feb 2006
Location: Sydney, Australia
Posts: 18,910
got any secret info FiveFeet8? or are you under nda?
Squall-Leonhart is offline   Reply With Quote
Old October 1st, 2009   #33 (permalink)
No sir. I don't like it.
 
masta.g.86's Avatar
 
Join Date: Oct 2008
Location: Alabama
Posts: 2,456
Oh... my... gawd!



I hope Sony feels ashamed for requiring multiple PS3 units to do something similar. Better yet, I wonder if Larrabee can do this?
masta.g.86 is offline   Reply With Quote
Old October 1st, 2009   #34 (permalink)
Behind ur girlfriend :D
 
Squall-Leonhart's Avatar
 
Join Date: Feb 2006
Location: Sydney, Australia
Posts: 18,910
water shaders + compute + physics.
Squall-Leonhart is offline   Reply With Quote
Old October 2nd, 2009   #35 (permalink)
Registered User
 
Join Date: Apr 2001
Posts: 47
Quote:
Originally Posted by fivefeet8 View Post


An SP is still an SP. From all the technical articles, they are still similar to the SPs in previous hardware. They have been enhanced to a point though, and from what's been detailed, it shouldn't lower performance per SP compared to previous hardware. One example is that in previous cards, each SP could only do 1 MAD all of the time and dual-issue a MUL sometimes. With Fermi, each SP can do a MAD operation and dual-issue another MAD all the time.
The dual-issue MUL in the G80/GT200 series was done on the SFUs and was almost impossible to use in real-life scenarios. Fermi removes this capability. It's true that each SP has two data paths, one for integers (ALU) and another for floating point (FPU), but they share a single data port, so dual-issue isn't possible.

Real World Technologies - Inside Fermi: Nvidia's HPC Push
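
For a sense of what the MAD/MUL question means for headline numbers, peak throughput is simply SP count times shader clock times floating-point operations per SP per clock. A minimal Python sketch follows, with the caveat that the figures are not from this thread: the GTX 280 values (240 SPs, 1.296 GHz shader clock) are NVIDIA's published specs, the 512-SP count is from NVIDIA's Fermi announcement, and the Fermi clock is a purely hypothetical placeholder because no clock had been announced.

```python
def peak_gflops(sp_count, shader_clock_ghz, flops_per_sp_per_clock):
    """Theoretical single-precision peak; real workloads land well below this."""
    return sp_count * shader_clock_ghz * flops_per_sp_per_clock


MAD = 2  # a multiply-add counts as two floating-point operations
MUL = 1  # the extra dual-issued multiply on G80/GT200

if __name__ == "__main__":
    # GTX 280 with and without the hard-to-use dual-issue MUL.
    print(peak_gflops(240, 1.296, MAD + MUL))
    print(peak_gflops(240, 1.296, MAD))
    # Fermi with one multiply-add per SP per clock; 1.5 GHz is a guess.
    print(peak_gflops(512, 1.5, MAD))
```

The first two lines show why the MUL matters on paper: it is the difference between roughly 933 and 622 GFLOPS for the same chip, even though little real code could reach the higher figure.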
wlee15 is offline   Reply With Quote
Old October 2nd, 2009   #36 (permalink)
&-)---|--<
 
fivefeet8's Avatar
 
Join Date: Apr 2001
Location: Smallville
Posts: 7,698
Quote:
Originally Posted by wlee15 View Post
The dual-issue MUL in the G80/GT200 series was done on the SFUs and was almost impossible to use in real-life scenarios. Fermi removes this capability. It's true that each SP has two data paths, one for integers (ALU) and another for floating point (FPU), but they share a single data port, so dual-issue isn't possible.

Real World Technologies - Inside Fermi: Nvidia's HPC Push
Must have read it wrong in the other articles. Interesting, but going by that diagram of the SP, Fermi has 2 ALUs in each SP compared to 1 ALU for G8x/GTX2xx.

Last edited by fivefeet8; October 2nd, 2009 at 01:31..
fivefeet8 is offline   Reply With Quote
Old October 2nd, 2009   #37 (permalink)
&-)---|--<
 
fivefeet8's Avatar
 
Join Date: Apr 2001
Location: Smallville
Posts: 7,698
Quote:
Originally Posted by Squall-Leonhart View Post
got any secret info FiveFeet8? or are you under nda?
Not anymore than others.
fivefeet8 is offline   Reply With Quote
Old October 2nd, 2009   #38 (permalink)
Mobile Fanatic
 
runawayprisoner's Avatar
 
Join Date: Nov 2006
Location: Santa Cruz, CA
Posts: 6,207
Yeah, apparently there's quite a number of changes in the GT300 SPs compared to your traditional SPs. I won't go too deep because I'm clueless as to how those things will perform now, but after seeing the docs, I'm not at all amazed. I'd say that this card will perform just a bit over GTX 285 but under GTX 295 at most.

Couple that with yield problems and other factors, and I dare say nVidia is in quite a pinch this generation.
runawayprisoner is offline   Reply With Quote
Old October 2nd, 2009   #39 (permalink)
Behind ur girlfriend :D
 
Squall-Leonhart's Avatar
 
Join Date: Feb 2006
Location: Sydney, Australia
Posts: 18,910
xD we were going over the details of this MIMD (and by definition the gpu has been mimd since G80) chip and discussing its capabilities in pcsx2 chan.

It's got a massive amount of potential for execution of native C.

I think it will be 1.2-1.5x the 295 at full performance.
Squall-Leonhart is offline   Reply With Quote
Old October 2nd, 2009   #40 (permalink)
Registered User
 
KrossX's Avatar
 
Join Date: Mar 2006
Location: Argentina
Posts: 926
Looks pretty interesting for non-gaming purposes though.
__________________
KrossX is offline   Reply With Quote
All times are GMT.