AMD ZEN - new X86 cores @ Slo-Tech

Predator X :: 16. mar 2017, 11:14

First we were at 22.6 GB/s.
Then over 40 GB/s. Now it's as fast as the L3, it just has a lower clock.

x45 :: 16. mar 2017, 11:18

Has anyone bought a Ryzen yet and could run a home test?

AMD x4 965 BE, 8 gb ddr3 1600, SSD Samsung, 960 GT
Xeon x3430, 8 GB DDR3, Matrox graphics card

gddr85 :: 16. mar 2017, 11:23

We're waiting for predatorX to make the first move ;)

D3m :: 16. mar 2017, 11:27

About three people on here have already bought one.

|HP EliteBook|R7 8840U|

iloveboobz :: 16. mar 2017, 12:20

clix wrote on 16. mar 2017 at 11:09:

You can potentially gain some L3 cache with 2+2, but that is also the only advantage.

If AMD decided to give the 4-core parts 16 MB as well. But it's still better to have all cores on the same CCX.

smoki

Predator X :: 16. mar 2017, 12:24

iloveboobz wrote on 16. mar 2017 at 12:20:

clix wrote on 16. mar 2017 at 11:09:
You can potentially gain some L3 cache with 2+2, but that is also the only advantage.

If AMD decided to give the 4-core parts 16 MB as well. But it's still better to have all cores on the same CCX.

Nobody explains this part well.
As you can see, there aren't many problems in games, only a few exceptions.

iloveboobz :: 16. mar 2017, 12:25

The APUs certainly won't have problems, since they'll have only one CCX. The only question is whether they'll have any L3.

smoki

Predator X :: 16. mar 2017, 12:25

iloveboobz wrote on 16. mar 2017 at 12:25:

The APUs certainly won't have problems, since they'll have only one CCX. The only question is whether they'll have any L3.

That CCX has more than 100 GB/s.

iloveboobz :: 16. mar 2017, 12:28

What 100 GB/s are you dreaming about?

smoki

D3m0r4l1z3d :: 16. mar 2017, 12:29

iloveboobz wrote on 16. mar 2017 at 12:28:

What 100 GB/s are you dreaming about?

His brother has an R7 1700X and told him everything.

ETN Wallet addr.: etnkGuvhDzR7Dh8us4e69VStubGbmQHrh5pe2fnpNDhEhX5
A1nCWrFBMK2NmkycgVN4sAwhvY8YyNNbF6KUSJyFZ99QKU8phCn
Cryptopia ref. link: https://www.cryptopia.co.nz/Register?referrer=Anymalus

Predator X :: 16. mar 2017, 12:31

iloveboobz wrote on 16. mar 2017 at 12:28:

What 100 GB/s are you dreaming about?

The data fabric moves 32 B/cycle (stock 1033 MHz).
The L3 cache moves 32 B/cycle (stock = core clock = 3 GHz+).
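
The two figures in this post share the same per-cycle width but run at very different clocks. A minimal Python sketch of that arithmetic, assuming the numbers as posted (32 bytes per cycle, a 1033 MHz fabric clock) and taking "3 GHz+" as exactly 3000 MHz, which is an assumption; `peak_gbs` is a hypothetical helper name, not anything from AMD documentation:

```python
def peak_gbs(bytes_per_cycle: int, clock_mhz: float) -> float:
    """Peak bandwidth in GB/s (1 GB = 1e9 bytes) for a link that moves
    a fixed number of bytes on every clock cycle."""
    return bytes_per_cycle * clock_mhz * 1e6 / 1e9


# Figures taken from the post; 3000 MHz stands in for "3 GHz+" (assumption).
fabric = peak_gbs(32, 1033)
l3 = peak_gbs(32, 3000)

print(f"data fabric: {fabric:.1f} GB/s")
print(f"L3 cache:    {l3:.1f} GB/s")
print(f"ratio:       {l3 / fabric:.1f}x")
```

Under these assumptions only the L3 figure gets near 100 GB/s; the fabric figure scales with whatever fabric clock is plugged in.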

iloveboobz :: 16. mar 2017, 12:32

Communication between the CCXs is a plain 22 GB/s.

smoki

Predator X :: 16. mar 2017, 12:32

iloveboobz wrote on 16. mar 2017 at 12:32:

Communication between the CCXs is a plain 22 GB/s.

No, it isn't.
32 B/cycle.

iloveboobz :: 16. mar 2017, 12:39

32 B/cycle is for the L3, not for the CCX.

https://www.techpowerup.com/forums/thre...

However, even if we were to distribute workload in-between two different cores from each CCX, so as to be able to access the entirety of the 1800X's 16 MB cache... we'd still be somewhat constrained by the inter-CCX bandwidth achieved by AMD's Data Fabric interconnect... 22 GB/s, which is much lower than the L3 cache's 175 GB/s - and even lower than RAM bandwidth. That the Data Fabric interconnect also has to carry data from AMD's IO Hub PCIe lanes also potentially interferes with the (already meagre) available bandwidth

smoki

gddr85 :: 16. mar 2017, 12:39

The inter-CCX bandwidth, i.e. what you get when using the whole L3 cache across both CCXs, is 22 GB/s.

Predator X :: 16. mar 2017, 12:42

iloveboobz wrote on 16. mar 2017 at 12:39:

32 B/cycle is for the L3, not for the CCX.

https://www.techpowerup.com/forums/thre...

However, even if we were to distribute workload in-between two different cores from each CCX, so as to be able to access the entirety of the 1800X's 16 MB cache... we'd still be somewhat constrained by the inter-CCX bandwidth achieved by AMD's Data Fabric interconnect... 22 GB/s, which is much lower than the L3 cache's 175 GB/s - and even lower than RAM bandwidth. That the Data Fabric interconnect also has to carry data from AMD's IO Hub PCIe lanes also potentially interferes with the (already meagre) available bandwidth

The Stilt

The data fabric

The northbridge of Zeppelin is officially called as the data fabric (DF). The DF frequency is always linked to the operating frequency of the memory controller with a ratio of 1:2 (e.g. DDR4-2667 MEMCLK = 1333MHz DFICLK). This means that the memory speed will directly affect the data fabric performance as well. In some cases, it may appear that the performance of Zeppelin scales extremely well with the increased memory speed, however that is necessarily not the case.

In many of these cases the abnormally good scaling is caused by the higher data fabric clock (DFICLK) resulting from the higher memory speed, rather than the increased performance of the memory itself.

The highest officially supported memory speed for consumer (AM4) Zeppelin parts is 2667MHz (two single rank / sided modules in total) or 2400MHz (two dual rank / sided modules in total), however memory ratios for 2933MHz and 3200MHz speeds are available (not officially supported), at least on some motherboards.
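
The clock relationship in the quoted text can be sketched in a few lines of Python. This assumes only the 1:2 ratio stated above (data fabric clock = half the DDR4 transfer rate); `dficlk_mhz` is a hypothetical helper name and the rates are the four speeds mentioned in the quote:

```python
def dficlk_mhz(ddr_rate_mts: float) -> float:
    """Data fabric clock in MHz for a DDR4 transfer rate in MT/s.

    Assumes the 1:2 ratio from the quoted post: the fabric runs at
    MEMCLK, which is half the double-data-rate transfer rate.
    """
    return ddr_rate_mts / 2


# Officially supported and unofficial speeds named in the quote.
for rate in (2400, 2667, 2933, 3200):
    print(f"DDR4-{rate}: DFICLK about {dficlk_mhz(rate):.1f} MHz")
```

Note that "2667" is itself a rounded marketing figure, so the result for that speed differs from the quoted 1333 MHz by rounding only.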

iloveboobz :: 16. mar 2017, 12:45

A 1:2 ratio with 2667 MT/s RAM means effectively 667 MHz, not 1333 MHz, since DDR4 is double data rate.

smoki

Predator X :: 16. mar 2017, 12:46

https://www.techpowerup.com/forums/prox...

iloveboobz wrote on 16. mar 2017 at 12:45:

A 1:2 ratio with 2667 MT/s RAM means effectively 667 MHz, not 1333 MHz, since DDR4 is double data rate.

You don't understand.

iloveboobz :: 16. mar 2017, 12:46

You don't understand.

You're the one who doesn't understand the fundamentals of DRAM.

2667 MT/s RAM runs internally at 1333 MHz, not 2667.

smoki

Predator X :: 16. mar 2017, 12:47

iloveboobz wrote on 16. mar 2017 at 12:46:

You don't understand.

You're the one who doesn't understand the fundamentals of DRAM.

Okay

iloveboobz wrote on 16. mar 2017 at 12:46:

You don't understand.

You're the one who doesn't understand the fundamentals of DRAM.

2667 MT/s RAM runs internally at 1333 MHz, not 2667.

So you corrected yourself after all. I already wrote earlier that data fabric clock = 1:2 DDR.

iloveboobz :: 16. mar 2017, 12:50

In case math isn't your thing: the bandwidth between the CCXs is half of the DRAM bandwidth. So the 100 GB/s you claim is simply impossible.

smoki

Predator X :: 16. mar 2017, 12:55

iloveboobz wrote on 16. mar 2017 at 12:50:

In case math isn't your thing: the bandwidth between the CCXs is half of the DRAM bandwidth. So the 100 GB/s you claim is simply impossible.

50 GB/s one way and 50 GB/s the other.

iloveboobz :: 16. mar 2017, 12:55

And where did you pull 50 GB/s from now?

smoki

Predator X :: 16. mar 2017, 12:56

iloveboobz wrote on 16. mar 2017 at 12:55:

And where did you pull 50 GB/s from now?

DDR4 3200 MHz
Just look at the AIDA benchmarks: with 2667/2933/3200 MHz DDR4 the L3 latency hovers around 20 ns.

iloveboobz :: 16. mar 2017, 12:58

Now halve that and you get 25 GB/s in one direction.
Do the calculation for 2667 (the officially supported maximum) and you get 21.3 GB/s, which is very close to the 22 GB/s everyone keeps talking about.
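
The halving arithmetic in this post can be reproduced with a short Python sketch. It assumes a dual-channel setup with a 64-bit (8-byte) bus per channel, which is not stated in the thread, and `dram_peak_gbs` is a hypothetical helper name:

```python
def dram_peak_gbs(ddr_rate_mts: float, channels: int = 2, bus_bytes: int = 8) -> float:
    """Theoretical peak DRAM bandwidth in GB/s (1 GB = 1e9 bytes).

    Assumes every transfer moves bus_bytes on each channel; dual channel
    with an 8-byte bus is an assumption, not a figure from the thread.
    """
    return ddr_rate_mts * 1e6 * bus_bytes * channels / 1e9


for rate in (2667, 3200):
    peak = dram_peak_gbs(rate)
    print(f"DDR4-{rate}: peak {peak:.1f} GB/s, half of that {peak / 2:.1f} GB/s")
```

With these assumptions DDR4-3200 gives 51.2 GB/s (the "50 GB/s" above) and half of the DDR4-2667 figure lands at about 21.3 GB/s.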

Just look at the AIDA benchmarks: with 2667/2933/3200 MHz DDR4 the L3 latency hovers around 20 ns.

L3 latency has nothing to do with DDR4 speed, because L3 speed is tied to the core clock, not to MEMCLK.

smoki

Predator X :: 16. mar 2017, 13:00

iloveboobz wrote on 16. mar 2017 at 12:58:

Now halve that and you get 25 GB/s in one direction.
Do the calculation for 2667 (the officially supported maximum) and you get 21.3 GB/s, which is very close to the 22 GB/s everyone keeps talking about.

What nonsense are you talking again? You don't read anything... you just ramble on your own.

iloveboobz :: 16. mar 2017, 13:01

You're the one rambling, because you're making up your own definition of bandwidth.

Fact of the matter is, data between the CCXs is _never_ transferred at 100 GB/s.

smoki

Predator X :: 16. mar 2017, 13:02

iloveboobz wrote on 16. mar 2017 at 13:01:

You're the one rambling, because you're making up your own definition of bandwidth.

Fact of the matter is, data between the CCXs is _never_ transferred at 100 GB/s.

One last time

The Data Fabric is reponsible for the core’s communication with the memory controller, and more importantly, inter-CCX communication. As previously explained, AMD’s Ryzen is built in modular blocks called CCX’s, each containing four cores and its own bank of L3 cache. An 8 core chip like Ryzen contains two of these. In order for CCX to CCX communication to take place, such as when a core from CCX 0 attempts to access data in the L3 cache of CCX 1, it has to do so through the Data Fabric. Assuming a standard 2667MT/s DDR4 kit, the Data Fabric has a bandwidth of 41.6GB/s in a single direction, or 83.2GB/s when transfering in both directions. This bandwidth has to be shared between both inter-CCX communication, and DRAM access, quickly creating data contention whenever a lot of data is being transfered from CCX to CCX at the same time as reading or writing to and from memory.

2667MHz DDR4

iloveboobz :: 16. mar 2017, 13:05

Assuming a standard 2667MT/s DDR4 kit, the Data Fabric has a bandwidth of 41.6GB/s in a single direction, or 83.2GB/s when transfering in both directions.

That's a screw-up, because the data fabric runs at half the DRAM frequency, so its bandwidth can't possibly equal the DRAM's. Do the calculation before you blindly believe everything.

smoki

Predator X :: 16. mar 2017, 13:05

https://www.techpowerup.com/231268/amds...

Look at page 5, 1600 MHz memory bus.
20 ns L3, not 42 ns or so.

iloveboobz wrote on 16. mar 2017 at 13:05:

Assuming a standard 2667MT/s DDR4 kit, the Data Fabric has a bandwidth of 41.6GB/s in a single direction, or 83.2GB/s when transfering in both directions.

That's a screw-up, because the data fabric runs at half the DRAM frequency, so its bandwidth can't possibly equal the DRAM's. Do the calculation before you blindly believe everything.

The data fabric does 64 B/cycle counting both directions, i.e. 32 B/cycle one way.

iloveboobz :: 16. mar 2017, 13:06

Once again, where do you get the idea that the data fabric is 32 B/cycle?

smoki

Predator X :: 16. mar 2017, 13:06

iloveboobz wrote on 16. mar 2017 at 13:06:

Once again, where do you get the idea that the data fabric is 32 B/cycle?

The AMD slide. I've sent it to you 10 times already.

This is the last time.

iloveboobz :: 16. mar 2017, 13:08

OK, then explain why L1/L2 has so much higher bandwidth and lower latency if it's still 32 B/cycle :\

smoki

Predator X :: 16. mar 2017, 13:10

iloveboobz wrote on 16. mar 2017 at 13:08:

OK, then explain why L1/L2 has so much higher bandwidth and lower latency if it's still 32 B/cycle :\

Huh? Are you slowly starting to realize that you're wrong?
https://thetechaltar.com/amd-ryzen-cloc...

https://i1.wp.com/thetechaltar.com/wp-c...

iloveboobz :: 16. mar 2017, 13:13

More people who have trouble with math.

smoki

Predator X :: 16. mar 2017, 13:14

iloveboobz wrote on 16. mar 2017 at 13:13:

More people who have trouble with math.

Looks like we're all wrong and only you are right.

iloveboobz :: 16. mar 2017, 13:15

Which part of "half the bandwidth" do you and your no-name websites not understand?

smoki

Predator X :: 16. mar 2017, 13:17

iloveboobz wrote on 16. mar 2017 at 13:15:

Which part of "half the bandwidth" do you and your no-name websites not understand?

That you're this slow... fine then.

Regards

iloveboobz :: 16. mar 2017, 13:19

The core-to-core ping graphs from PCPer say it all anyway. There's no need to fantasize about 100 GB/s or 22 GB/s; the fact is that the data fabric is slow and in certain cases a serious bottleneck.

smoki

Predator X :: 16. mar 2017, 13:20

iloveboobz wrote on 16. mar 2017 at 13:19:

The core-to-core ping graphs from PCPer say it all anyway. There's no need to fantasize about 100 GB/s or 22 GB/s; the fact is that the data fabric is slow and in certain cases a serious bottleneck.

Which RAM did they use?

D3m :: 16. mar 2017, 13:20

People are asking PCPer for the source code of their test so that others can try it too.

So far there has been nothing from their side.

|HP EliteBook|R7 8840U|

iloveboobz :: 16. mar 2017, 13:22

This one will be a bit easier for you.

Interesting stuff. I think your conclusion is bang-on (that overall L3 cache performance will be affected greatly by memory clock), but your math in the first 3 paragraphs might be a little off.

The way I see it, the total bandwidth of the infinity fabric is related to the memclk, but the total bandwidth of the RAM itself and the dual-channel dual-CCX configuration is only semi-relevant. The fabric moves 32 bytes per cycle from each L3 - for a memclk of 1333 (to take your example), that means a peak performance of 42.6GB/s bandwidth per CCX.

So the question then, is why does the article say the inter-CCX bandwidth is only 22GB/s? The answer to that is that the bandwidth from L3 must be shared between the interconnect and the main memory. Consider a case where you've queued data to move to RAM and the same data to be moved to the other CCX - great! We get 32 bytes from L3 and move it to both locations simultaneously. No harm, no foul. Now consider a case where you've queued data to move to RAM and different data in the next cycle needs to move to the other CCX. You've basically just cut the fabric's efficiency in half compared to the previous scenario.
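
The sharing argument in the quoted explanation can be turned into a toy model. This is only a sketch under stated assumptions: 32 bytes per fabric cycle and a 1333 MHz fabric clock are taken from the quote, while `dram_share` (the fraction of fabric cycles spent on unrelated DRAM transfers) and both function names are hypothetical and not measured values:

```python
def fabric_peak_gbs(fabric_clock_mhz: float, bytes_per_cycle: int = 32) -> float:
    """Peak one-direction fabric bandwidth in GB/s (1 GB = 1e9 bytes)."""
    return bytes_per_cycle * fabric_clock_mhz * 1e6 / 1e9


def inter_ccx_gbs(fabric_clock_mhz: float, dram_share: float) -> float:
    """Bandwidth left for CCX-to-CCX traffic in this toy model.

    dram_share is the hypothetical fraction of fabric cycles used by
    unrelated DRAM transfers (0.0 = none, 1.0 = all of them).
    """
    if not 0.0 <= dram_share <= 1.0:
        raise ValueError("dram_share must be between 0 and 1")
    return fabric_peak_gbs(fabric_clock_mhz) * (1.0 - dram_share)


# Best case (no competing DRAM traffic) versus every other cycle going to DRAM.
for share in (0.0, 0.5):
    print(f"DRAM share {share:.0%}: {inter_ccx_gbs(1333, share):.1f} GB/s inter-CCX")
```

In this model the best case is about 42.7 GB/s and an even split leaves about 21.3 GB/s, which is one way to read the gap between the two numbers argued over in the thread; real contention is unlikely to be this tidy.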

smoki

Predator X :: 16. mar 2017, 13:25

D3m wrote on 16. mar 2017 at 13:20:

People are asking PCPer for the source code of their test so that others can try it too.

So far there has been nothing from their side.

Nothing new. I'm also not really clear on what exactly they're trying to say with the gaming part.
After all, a core basically doesn't share its L cache.

iloveboobz wrote on 16. mar 2017 at 13:22:

This one will be a bit easier for you.

Interesting stuff. I think your conclusion is bang-on (that overall L3 cache performance will be affected greatly by memory clock), but your math in the first 3 paragraphs might be a little off.

The way I see it, the total bandwidth of the infinity fabric is related to the memclk, but the total bandwidth of the RAM itself and the dual-channel dual-CCX configuration is only semi-relevant. The fabric moves 32 bytes per cycle from each L3 - for a memclk of 1333 (to take your example), that means a peak performance of 42.6GB/s bandwidth per CCX.

So the question then, is why does the article say the inter-CCX bandwidth is only 22GB/s? The answer to that is that the bandwidth from L3 must be shared between the interconnect and the main memory. Consider a case where you've queued data to move to RAM and the same data to be moved to the other CCX - great! We get 32 bytes from L3 and move it to both locations simultaneously. No harm, no foul. Now consider a case where you've queued data to move to RAM and different data in the next cycle needs to move to the other CCX. You've basically just cut the fabric's efficiency in half compared to the previous scenario.

Take a good look at the picture.

https://i1.wp.com/thetechaltar.com/wp-c...

Look really carefully.

iloveboobz :: 16. mar 2017, 13:30

Fine, you're right, so you can sleep easier. In a perfect world you really can move data at 41 GB/s; The Stilt's explanation is a bit clumsy, because at a quick glance it sounds as if the bandwidth really is halved, but it actually isn't, since he takes "effective frequency / 2" as the base memory frequency. And the effective frequency in this case really is 2667 MHz.

Still, the worst-case scenario can indeed be half the bandwidth.

smoki

Predator X :: 16. mar 2017, 13:32

iloveboobz wrote on 16. mar 2017 at 13:30:

Fine, you're right, so you can sleep easier. In a perfect world you really can move data at 41 GB/s (The Stilt's explanation is a bit clumsy, because at a quick glance it sounds as if the bandwidth really is halved).
Still, the worst-case scenario can indeed be half the bandwidth.

Not true, the picture shows you everything.
Nobody will sleep easier because of these conversations, since everyone has entirely different problems.

iloveboobz :: 16. mar 2017, 13:32

Not true, the picture shows you everything.

32 B * 1333 MHz = 42656 MB/s

smoki

Predator X :: 16. mar 2017, 13:33

iloveboobz wrote on 16. mar 2017 at 13:32:

Not true, the picture shows you everything.

32 B * 1333 MHz = 42656 MB/s

In one direction.

Do you really think the thing would be this fast with 22.6 GB/s? That would be odd; then everyone would be waiting for a 1xCCX chip. They muddled the whole thing a bit.

iloveboobz :: 16. mar 2017, 13:36

I don't know how fast the thing would be at 22 GB/s, but it's clear that this can be the worst-case scenario, as the post above explains.

smoki

Predator X :: 16. mar 2017, 13:38

iloveboobz wrote on 16. mar 2017 at 13:36:

I don't know how fast the thing would be at 22 GB/s, but it's clear that this can be the worst-case scenario, as the post above explains.

PCPer messed up.

The data fabric communicating more slowly than two chips with faster DDR3. Sure.

iloveboobz :: 16. mar 2017, 13:42

Meh, I don't know why AMD couldn't just make a monolithic 8-core; then there would be no problems.

smoki
