News Project Cars: Developers and AMD are working on performance improvements

berkeley · May 11, 2015

All of that may well have some practical relevance today. But let me simply quote:

"Many years ago, I briefly worked at NVIDIA on the DirectX driver team. This is Vista era, when a lot of people were busy with the DX10 transition......"

".....[Edit/update] I'm definitely not suggesting that the APIs have been made artificially difficult, by any means - the engineering work is solid in its own right. It's also become clear, since this post was originally written, that there's a commitment to continuing DX11 and OpenGL support for the near future. That also helped the decision to push these new systems out, I believe...."

Vetinari · May 11, 2015

DX11 didn't change much on the points that matter (it is ancient, after all); otherwise the GPU vendors wouldn't have to work so closely with the game studios. You can see what still comes out of that here, among other places. And otherwise certain game and engine developers wouldn't have jumped on Vulkan/Mantle.

berkeley · May 11, 2015

It should be clear that DX11 isn't exactly the fastest or the best option (compared to low-level APIs).
Quite a bit has happened since DX10 was introduced in 2006, though. A few more versions came out: 10, 10.1, 11, 11.1 and most recently 11.2 in 2013. Those 9 years weren't a total "waste" after all.

Vetinari · May 11, 2015

From my frog's-eye view, the D3D API has only been "bloated" further going from 10 to 11, and Wikipedia confirms that for me:

https://en.wikipedia.org/wiki/DirectX#DirectX_11 wrote:
Direct3D 11 is a strict superset of Direct3D 10.1 — all hardware and API features of version 10.1 are retained, and new features are added only when necessary for exposing new functionality. This helps to keep backwards compatibility with previous versions of DirectX.

Daedal · May 11, 2015

Could you quote what you mean?
That's a 3-page thread.

Addendum (May 12, 2015)

Vetinari wrote:
@smilefaker:
Have a read of this comment.

You've linked that twice already and I have no idea what you mean.

Addendum (May 12, 2015)

Vetinari wrote:
From my frog's-eye view, the D3D API has only been "bloated" further going from 10 to 11, and Wikipedia confirms that for me:

Your interpretation of the quoted passage suggests that you didn't translate it sensibly from the English. What you wrote in German is not what the English says.

Vetinari · May 11, 2015

I mean the linked post, not the whole thread.

Daedal · May 11, 2015

The first one?
What is supposed to be relevant in there? He's only asking questions himself. Quote the most important part.

ZeT · May 11, 2015

IMHO the idea that anyone deliberately programs games badly just to spite a hardware vendor is absurd. You'd be sawing off the branch your own game sales sit on. You only have to look at how many AMD cards are installed to see that nobody voluntarily gives up that group of buyers.

On the other hand, it's quite understandable that a studio doesn't optimize its software for everything and, as in this case, simply uses DX11. It's well known that DX11 has problems with draw calls; NVIDIA was simply smart enough to work around that in the driver. AMD was asleep at the wheel, and in the end you can now see very clearly what nonsense Microsoft built with DX11. As soon as there is no special optimization, the cards buckle.

That they developed on NVIDIA cards may well be an advantage, but IMHO the poor performance of the AMD cards is primarily down to DX11 and the drivers.
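To make the draw-call point a bit more concrete, here is a minimal Python sketch. It is not D3D code: `DrawItem`, `naive_call_count` and `batch_by_state` are made-up names, and the scene is invented. It only shows the general idea that objects sharing the same mesh and material can be grouped so that fewer submissions are issued. Whether a driver or an engine does this kind of merging in any particular game is an assumption here, not something the thread confirms.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class DrawItem:
    """One object to render (hypothetical structure, for illustration only)."""
    mesh: str
    material: str
    position: tuple


def naive_call_count(items):
    """One submission per object: the pattern that gets expensive with many objects."""
    return len(items)


def batch_by_state(items):
    """Group objects that share mesh and material into one batch each.

    Returns a dict mapping (mesh, material) to the list of positions,
    which an instanced draw could consume in a single submission.
    """
    batches = defaultdict(list)
    for item in items:
        batches[(item.mesh, item.material)].append(item.position)
    return dict(batches)


# Invented scene: four identical wheels and one car body.
scene = [DrawItem("wheel", "rubber", (float(i), 0.0, 0.0)) for i in range(4)]
scene.append(DrawItem("body", "paint", (0.0, 0.0, 0.0)))

batches = batch_by_state(scene)
print("naive submissions:", naive_call_count(scene))
print("batched submissions:", len(batches))
```

The design choice is simply to key batches on everything that would force a state change; anything with the same key can share a submission.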

Vetinari · May 12, 2015

@Daedal:

Promit wrote:
Many years ago, I briefly worked at NVIDIA on the DirectX driver team (internship). This is Vista era, when a lot of people were busy with the DX10 transition, the hardware transition, and the OS/driver model transition. My job was to get games that were broken on Vista, dismantle them from the driver level, and figure out why they were broken. While I am not at all an expert on driver matters (and actually sucked at my job, to be honest), I did learn a lot about what games look like from the perspective of a driver and kernel.

The first lesson is: Nearly every game ships broken. We're talking major AAA titles from vendors who are everyday names in the industry. In some cases, we're talking about blatant violations of API rules - one D3D9 game never even called BeginFrame/EndFrame. Some are mistakes or oversights - one shipped bad shaders that heavily impacted performance on NV drivers. These things were day to day occurrences that went into a bug tracker. Then somebody would go in, find out what the game screwed up, and patch the driver to deal with it. There are lots of optional patches already in the driver that are simply toggled on or off as per-game settings, and then hacks that are more specific to games - up to and including total replacement of the shipping shaders with custom versions by the driver team. Ever wondered why nearly every major game release is accompanied by a matching driver release from AMD and/or NVIDIA? There you go.

The second lesson: The driver is gigantic. Think 1-2 million lines of code dealing with the hardware abstraction layers, plus another million per API supported. The backing function for Clear in D3D 9 was close to a thousand lines of just logic dealing with how exactly to respond to the command. It'd then call out to the correct function to actually modify the buffer in question. The level of complexity internally is enormous and winding, and even inside the driver code it can be tricky to work out how exactly you get to the fast-path behaviors. Additionally the APIs don't do a great job of matching the hardware, which means that even in the best cases the driver is covering up for a LOT of things you don't know about. There are many, many shadow operations and shadow copies of things down there.

The third lesson: It's unthreadable. The IHVs sat down starting from maybe circa 2005, and built tons of multithreading into the driver internally. They had some of the best kernel/driver engineers in the world to do it, and literally thousands of full blown real world test cases. They squeezed that system dry, and within the existing drivers and APIs it is impossible to get more than trivial gains out of any application side multithreading. If Futuremark can only get 5% in a trivial test case, the rest of us have no chance.

The fourth lesson: Multi GPU (SLI/CrossfireX) is fucking complicated. You cannot begin to conceive of the number of failure cases that are involved until you see them in person. I suspect that more than half of the total software effort within the IHVs is dedicated strictly to making multi-GPU setups work with existing games. (And I don't even know what the hardware side looks like.) If you've ever tried to independently build an app that uses multi GPU - especially if, god help you, you tried to do it in OpenGL - you may have discovered this insane rabbit hole. There is ONE fast path, and it's the narrowest path of all. Take lessons 1 and 2, and magnify them enormously.

Deep breath.

Ultimately, the new APIs are designed to cure all four of these problems.

* Why are games broken? Because the APIs are complex, and validation varies from decent (D3D 11) to poor (D3D 9) to catastrophic (OpenGL). There are lots of ways to hit slow paths without knowing anything has gone awry, and often the driver writers already know what mistakes you're going to make and are dynamically patching in workarounds for the common cases.

* Maintaining the drivers with the current wide surface area is tricky. Although AMD and NV have the resources to do it, the smaller IHVs (Intel, PowerVR, Qualcomm, etc) simply cannot keep up with the necessary investment. More importantly, explaining to devs the correct way to write their render pipelines has become borderline impossible. There's too many failure cases. it's been understood for quite a few years now that you cannot max out the performance of any given GPU without having someone from NVIDIA or AMD physically grab your game source code, load it on a dev driver, and do a hands-on analysis. These are the vanishingly few people who have actually seen the source to a game, the driver it's running on, and the Windows kernel it's running on, and the full specs for the hardware. Nobody else has that kind of access or engineering ability.

* Threading is just a catastrophe and is being rethought from the ground up. This requires a lot of the abstractions to be stripped away or retooled, because the old ones required too much driver intervention to be properly threadable in the first place.

* Multi-GPU is becoming explicit. For the last ten years, it has been AMD and NV's goal to make multi-GPU setups completely transparent to everybody, and it's become clear that for some subset of developers, this is just making our jobs harder. The driver has to apply imperfect heuristics to guess what the game is doing, and the game in turn has to do peculiar things in order to trigger the right heuristics. Again, for the big games somebody sits down and matches the two manually.

Part of the goal is simply to stop hiding what's actually going on in the software from game programmers. Debugging drivers has never been possible for us, which meant a lot of poking and prodding and experimenting to figure out exactly what it is that is making the render pipeline of a game slow. The IHVs certainly weren't willing to disclose these things publicly either, as they were considered critical to competitive advantage. (Sure they are guys. Sure they are.) So the game is guessing what the driver is doing, the driver is guessing what the game is doing, and the whole mess could be avoided if the drivers just wouldn't work so hard trying to protect us.

So why didn't we do this years ago? Well, there are a lot of politics involved (cough Longs Peak) and some hardware aspects but ultimately what it comes down to is the new models are hard to code for. Microsoft and ARB never wanted to subject us to manually compiling shaders against the correct render states, setting the whole thing invariant, configuring heaps and tables, etc. Segfaulting a GPU isn't a fun experience. You can't trap that in a (user space) debugger. So ... the subtext that a lot of people aren't calling out explicitly is that this round of new APIs has been done in cooperation with the big engines. The Mantle spec is effectively written by Johan Andersson at DICE, and the Khronos Vulkan spec basically pulls Aras P at Unity, Niklas S at Epic, and a couple guys at Valve into the fold.

Three out of those four just made their engines public and free with minimal backend financial obligation.

Now there's nothing wrong with any of that, obviously, and I don't think it's even the big motivating raison d'etre of the new APIs. But there's a very real message that if these APIs are too challenging to work with directly, well the guys who designed the API also happen to run very full featured engines requiring no financial commitments*. So I think that's served to considerably smooth the politics involved in rolling these difficult to work with APIs out to the market, encouraging organizations that would have been otherwise reticent to do so.

[Edit/update] I'm definitely not suggesting that the APIs have been made artificially difficult, by any means - the engineering work is solid in its own right. It's also become clear, since this post was originally written, that there's a commitment to continuing DX11 and OpenGL support for the near future. That also helped the decision to push these new systems out, I believe.

The last piece to the puzzle is that we ran out of new user-facing hardware features many years ago. Ignoring raw speed, what exactly is the user-visible or dev-visible difference between a GTX 480 and a GTX 980? A few limitations have been lifted (notably in compute) but essentially they're the same thing. MS, for all practical purposes, concluded that DX was a mature, stable technology that required only minor work and mostly disbanded the teams involved. Many of the revisions to GL have been little more than API repairs. (A GTX 480 runs full featured OpenGL 4.5, by the way.) So the reason we're seeing new APIs at all stems fundamentally from Andersson hassling the IHVs until AMD woke up, smelled competitive advantage, and started paying attention. That essentially took a three year lag time from when we got hardware to the point that compute could be directly integrated into the core of a render pipeline, which is considered normal today but was bluntly revolutionary at production scale in 2012. It's a lot of small things adding up to a sea change, with key people pushing on the right people for the right things.

Phew. I'm no longer sure what the point of that rant was, but hopefully it's somehow productive that I wrote it. Ultimately the new APIs are the right step, and they're retroactively useful to old hardware which is great. They will be harder to code. How much harder? Well, that remains to be seen. Personally, my take is that MS and ARB always had the wrong idea. Their idea was to produce a nice, pretty looking front end and deal with all the awful stuff quietly in the background. Yeah it's easy to code against, but it was always a bitch and a half to debug or tune. Nobody ever took that side of the equation into account. What has finally been made clear is that it's okay to have difficult to code APIs, if the end result just works. And that's been my experience so far in retooling: it's a pain in the ass, requires widespread revisions to engine code, forces you to revisit a lot of assumptions, and generally requires a lot of infrastructure before anything works. But once it's up and running, there's no surprises. It works smoothly, you're always on the fast path, anything that IS slow is in your OWN code which can be analyzed by common tools. It's worth it.

That is the post the link points to.

Faust2011 · May 12, 2015

To ask a deliberately provocative question: could it simply be that the developer quoted here is just out of his depth?

My view and interpretation: he describes his impressions and how complex DX9, DX11 and OpenGL are. Yes, they are. OpenGL as a state machine, and the DX APIs with their cumbersome C-style procedural API. But if you have a problem with that, you write yourself a higher abstraction layer around it...
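What such a "higher abstraction layer" might look like, as a rough Python sketch: `FakeStateMachine` is a stand-in for a state-machine style API (it is not OpenGL and the method names are invented), and `Renderer` is a hypothetical wrapper that remembers the current state and skips redundant state changes. It only illustrates the idea and says nothing about what the driver does underneath.

```python
class FakeStateMachine:
    """Stand-in for a state-machine style graphics API (hypothetical)."""

    def __init__(self):
        self.calls = []  # log of every call that reached the "API"

    def use_program(self, program_id):
        self.calls.append(("use_program", program_id))

    def bind_texture(self, unit, texture_id):
        self.calls.append(("bind_texture", unit, texture_id))

    def draw(self, vertex_count):
        self.calls.append(("draw", vertex_count))


class Renderer:
    """Thin wrapper: callers describe what to draw, the wrapper manages state."""

    def __init__(self, api):
        self._api = api
        self._program = None   # currently active program, as far as we know
        self._textures = {}    # texture unit -> currently bound texture

    def draw_mesh(self, program_id, texture_id, vertex_count):
        # Only touch the underlying state when it actually changes.
        if self._program != program_id:
            self._api.use_program(program_id)
            self._program = program_id
        if self._textures.get(0) != texture_id:
            self._api.bind_texture(0, texture_id)
            self._textures[0] = texture_id
        self._api.draw(vertex_count)


api = FakeStateMachine()
renderer = Renderer(api)
renderer.draw_mesh(program_id=1, texture_id=7, vertex_count=36)
renderer.draw_mesh(program_id=1, texture_id=7, vertex_count=36)  # no state change
renderer.draw_mesh(program_id=1, texture_id=8, vertex_count=36)  # texture change only
print(len(api.calls), "API calls for 3 meshes")
```

A wrapper like this makes the API nicer to use, but it cannot see which paths are fast inside the driver, which is the part the quoted post complains about.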

dominiczeth · May 12, 2015

PiPaPa wrote:
It's not that AMD only gets moving afterwards; with nvidia-sponsored titles AMD isn't allowed to look at anything in advance....

Then why did AMD get 20 keys for pCars as far back as 2 years ago so that they could optimize?

Daedal · May 12, 2015

Faust2011 wrote:
To ask a deliberately provocative question: could it simply be that the developer quoted here is just out of his depth?

My view and interpretation: he describes his impressions and how complex DX9, DX11 and OpenGL are. Yes, they are. OpenGL as a state machine, and the DX APIs with their cumbersome C-style procedural API. But if you have a problem with that, you write yourself a higher abstraction layer around it...

I'd say the same. And most of the comments that follow say exactly that, if somewhat more diplomatically.
Much of what is written here gets explained to him and corrected.

@Vetinari
And when I ask you to copy the most important part of such a long passage, the part you are actually referring to, it helps nobody if you quote everything. I could already read all of it, and I couldn't find the passages relevant to this discussion. I rather get the feeling that you are just pasting everything in here without knowing what it actually says.

Vetinari · May 12, 2015

If someone like ArasP, who linked the post here, essentially agrees with its content, then there are probably no major corrections to be made. The point was that it is hard to write very fast code without support from nvidia/AMD, precisely because the API and driver (have to) do a good deal of work in the background and it is partly not transparent to the programmer what is actually fast.

@Daedal:
Some (most) people here would do well to read a bit more reflectively than they write. And that includes you.

@dominiczeth:
You don't optimize a game with 20 keys.

Daedal · May 12, 2015

What am I supposed to reflect on if you don't say concretely what you want to express with the link? If you took my follow-up questions as an attack, I apologize. I'm quite direct in how I put things, but I don't intend to belittle anyone.

Still, the fact remains that I can't see what contribution or input the link is supposed to offer here.

Cat Toaster · May 12, 2015

The point is that NORMALLY you don't just take the program and "reverse engineer" what it is doing, which nVidia's closed-source libraries in GameWorks make even harder than it already is. But here there is no other way, and nobody would want to start such a process before the game is finished.

The studio itself has to grant access to the relevant source code, but the studio may not even have that itself if it compiles with GameWorks.

For the other market participant this scenario is the worst possible punishment: having to work out, without access to anything, why a product sponsored by the competitor doesn't perform properly on its own hardware.

The studio made this one-sided decision. Saying that they tossed AMD 20 keys two years before the end of development, and that AMD could have used those to figure things out for themselves, is simply not good form. They could also have stuck to the API without one-sided optimizations. Of course that would have been more expensive than accepting GPUs and development resources as gifts from nVidia and plastering everything with the corresponding logos in return. They might not even have been able to afford it.

That's why I give credit to Star Citizen: CIG openly stated that it received such "promotional gifts" from AMD AND nVidia and intends to support both equally.

RayKrebs · May 12, 2015

And now, after the exchange of blows? Let's see what comes of it. A lot of speculation has been voiced. That NVIDIA has an advantage over AMD here can't be denied, if only because of the (still closed) GameWorks library. How exactly PhysX or CUDA figure into pCars, only the developer knows. We are only guessing, just like my guess that surely only NVIDIA GPUs were used for development.

One hint was interesting, though: Maxwell's better compute performance, which would explain why Kepler can't keep up either.

No idea how CUDA is emulated on non-NVIDIA hardware.

dominiczeth · May 12, 2015

Vetinari wrote:
@dominiczeth:
You don't optimize a game with 20 keys.

AMD could have had more at any time if they had only asked. But they didn't.

@RayKrebs: No, the Lead Graphics Programmer actually used mainly AMD hardware.

PiPaPa · May 12, 2015

But keys aren't much use if you have no view of the relevant game code.
Keys let you test a particular build of the game. Nothing more.

And if it was developed mainly on AMD hardware, that is all the more damning for the developer.

RayKrebs · May 12, 2015

dominiczeth wrote:
AMD could have had more at any time if they had only asked. But they didn't.

@RayKrebs: No, the Lead Graphics Programmer actually used mainly AMD hardware.

Where does that information come from? I really don't understand that. In that case SMS would have known about the weak AMD performance long ago.

I don't own pCars myself, so I can't test it. But there have always been special effects that slow the hardware down massively.

Cat Toaster · May 12, 2015

dominiczeth wrote:
AMD could have had more at any time if they had only asked. But they didn't.

Head -> desk.