Performance problems with dd and an NVMe SSD

I'm unknown

Hello everyone,

While working on the reader review of the Crucial P5, I've run into a luxury problem: with dd I can't read from the Samsung 970 Evo+ 1TB at more than 1.9 to 2.1 GB (not GiB) per second.

System: Ubuntu 21.04 from a live USB stick, Asus Crosshair Hero VI, 3700X, 3200 MHz RAM (4x16 GiB), and the Samsung 970 Evo+ in the board's NVMe slot, attached via PCIe 3.0 x4.

Code:
ubuntu@ubuntu:~$ sudo dd if=/dev/nvme0n1 of=/dev/null bs=1M count=10000
10000+0 Datensätze ein
10000+0 Datensätze aus
10485760000 Bytes (10 GB, 9,8 GiB) kopiert, 5,99427 s, 1,7 GB/s

iostat shows the same plausible values while this runs:
Code:
Linux 5.11.0-16-generic (ubuntu)     06.05.2021     _x86_64_    (16 CPU)

Device             tps    kB_read/s    kB_wrtn/s    kB_dscd/s    kB_read    kB_wrtn    kB_dscd
nvme0n1          91,19     57380,22         0,00         0,00   10306634          0          0


Device             tps    kB_read/s    kB_wrtn/s    kB_dscd/s    kB_read    kB_wrtn    kB_dscd
nvme0n1         928,00    633344,00         0,00         0,00     633344          0          0


Device             tps    kB_read/s    kB_wrtn/s    kB_dscd/s    kB_read    kB_wrtn    kB_dscd
nvme0n1        2519,00   1719296,00         0,00         0,00    1719296          0          0


Device             tps    kB_read/s    kB_wrtn/s    kB_dscd/s    kB_read    kB_wrtn    kB_dscd
nvme0n1        2541,00   1734656,00         0,00         0,00    1734656          0          0


Device             tps    kB_read/s    kB_wrtn/s    kB_dscd/s    kB_read    kB_wrtn    kB_dscd
nvme0n1        2560,00   1747968,00         0,00         0,00    1747968          0          0


Device             tps    kB_read/s    kB_wrtn/s    kB_dscd/s    kB_read    kB_wrtn    kB_dscd
nvme0n1        2648,00   1807360,00         0,00         0,00    1807360          0          0


Device             tps    kB_read/s    kB_wrtn/s    kB_dscd/s    kB_read    kB_wrtn    kB_dscd
nvme0n1        2633,00   1797632,00         0,00         0,00    1797632          0          0


Device             tps    kB_read/s    kB_wrtn/s    kB_dscd/s    kB_read    kB_wrtn    kB_dscd
nvme0n1        1173,00    800768,00         0,00         0,00     800768          0          0

The link itself works, though: the integrated disk benchmark reads significantly faster at the file-system level:
[Screenshot: disk benchmark results]


iostat is again consistent with the values shown:
Code:
ubuntu@ubuntu:~$ sudo iostat nvme0n1 -d 1
Linux 5.11.0-16-generic (ubuntu)     06.05.2021     _x86_64_    (16 CPU)

Device             tps    kB_read/s    kB_wrtn/s    kB_dscd/s    kB_read    kB_wrtn    kB_dscd
nvme0n1         132,84     86977,90         0,00         0,00   20547658          0          0


Device             tps    kB_read/s    kB_wrtn/s    kB_dscd/s    kB_read    kB_wrtn    kB_dscd
nvme0n1        1084,00   1380748,00         0,00         0,00    1380748          0          0


Device             tps    kB_read/s    kB_wrtn/s    kB_dscd/s    kB_read    kB_wrtn    kB_dscd
nvme0n1        2057,00   2625288,00         0,00         0,00    2625288          0          0


Device             tps    kB_read/s    kB_wrtn/s    kB_dscd/s    kB_read    kB_wrtn    kB_dscd
nvme0n1        2365,00   3015692,00         0,00         0,00    3015692          0          0


Device             tps    kB_read/s    kB_wrtn/s    kB_dscd/s    kB_read    kB_wrtn    kB_dscd
nvme0n1        1911,00   2434572,00         0,00         0,00    2434572          0          0


Device             tps    kB_read/s    kB_wrtn/s    kB_dscd/s    kB_read    kB_wrtn    kB_dscd
nvme0n1        2063,00   2632968,00         0,00         0,00    2632968          0          0


Device             tps    kB_read/s    kB_wrtn/s    kB_dscd/s    kB_read    kB_wrtn    kB_dscd
nvme0n1        1482,00   1889288,00         0,00         0,00    1889288          0          0


Device             tps    kB_read/s    kB_wrtn/s    kB_dscd/s    kB_read    kB_wrtn    kB_dscd
nvme0n1        1729,00   2205448,00         0,00         0,00    2205448          0          0


Device             tps    kB_read/s    kB_wrtn/s    kB_dscd/s    kB_read    kB_wrtn    kB_dscd
nvme0n1        2035,00   2593292,00         0,00         0,00    2593292          0          0


Device             tps    kB_read/s    kB_wrtn/s    kB_dscd/s    kB_read    kB_wrtn    kB_dscd
nvme0n1        2136,00   2726788,00         0,00         0,00    2726788          0          0


Device             tps    kB_read/s    kB_wrtn/s    kB_dscd/s    kB_read    kB_wrtn    kB_dscd
nvme0n1        2068,00   2635152,00         0,00         0,00    2635152          0          0


Device             tps    kB_read/s    kB_wrtn/s    kB_dscd/s    kB_read    kB_wrtn    kB_dscd
nvme0n1        2099,00   2675212,00         0,00         0,00    2675212          0          0


Device             tps    kB_read/s    kB_wrtn/s    kB_dscd/s    kB_read    kB_wrtn    kB_dscd
nvme0n1        2167,00   2766088,00         0,00         0,00    2766088          0          0


Device             tps    kB_read/s    kB_wrtn/s    kB_dscd/s    kB_read    kB_wrtn    kB_dscd
nvme0n1        2033,00   2590732,00         0,00         0,00    2590732          0          0


Device             tps    kB_read/s    kB_wrtn/s    kB_dscd/s    kB_read    kB_wrtn    kB_dscd
nvme0n1        2397,00   3056652,00         0,00         0,00    3056652          0          0

There is also no general performance problem with the system:
Code:
ubuntu@ubuntu:~$ sudo dd if=/dev/zero of=/dev/null bs=1M count=10000
10000+0 Datensätze ein
10000+0 Datensätze aus
10485760000 Bytes (10 GB, 9,8 GiB) kopiert, 0,406004 s, 25,8 GB/s

And the dd values for the 850 Evo 1TB look right, too:
Code:
ubuntu@ubuntu:~$ sudo dd if=/dev/sda of=/dev/null bs=1M count=10000
10000+0 Datensätze ein
10000+0 Datensätze aus
10485760000 Bytes (10 GB, 9,8 GiB) kopiert, 21,3434 s, 491 MB/s

I'd be grateful for any hints as to why dd runs into a hard limit; otherwise I can't fully saturate the write cache of the Crucial SSD and the results will be worthless.
 
The GUI tool and dd don't measure the same partition, but that shouldn't make a difference.

Code:
# sync; echo 3 > /proc/sys/vm/drop_caches; dd if=/dev/nvme0n1p5 of=/dev/null bs=1M count=40000
40000+0 records in
40000+0 records out
41943040000 bytes (42 GB, 39 GiB) copied, 13.4444 s, 3.1 GB/s
root@srv-d11-t:~# hwinfo --short --disk /dev/nvme0n1
disk:                                                        
  /dev/nvme0n1         Samsung Electronics NVMe SSD Controller SM961/PM961
root@srv-d11-t:~# lscpu
Architecture:                    x86_64
CPU op-mode(s):                  32-bit, 64-bit
Byte Order:                      Little Endian
Address sizes:                   43 bits physical, 48 bits virtual
CPU(s):                          16
On-line CPU(s) list:             0-15
Thread(s) per core:              2
Core(s) per socket:              8
Socket(s):                       1
NUMA node(s):                    1
Vendor ID:                       AuthenticAMD
CPU family:                      23
Model:                           113
Model name:                      AMD Ryzen 7 3700X 8-Core Processor
Stepping:                        0
Frequency boost:                 enabled
CPU MHz:                         2199.659
CPU max MHz:                     5224.2178
CPU min MHz:                     2200.0000
BogoMIPS:                        7199.66
Virtualization:                  AMD-V
L1d cache:                       256 KiB
L1i cache:                       256 KiB
L2 cache:                        4 MiB
L3 cache:                        32 MiB
NUMA node0 CPU(s):               0-15
Vulnerability Itlb multihit:     Not affected
Vulnerability L1tf:              Not affected
Vulnerability Mds:               Not affected
Vulnerability Meltdown:          Not affected
Vulnerability Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl and seccomp
Vulnerability Spectre v1:        Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Vulnerability Spectre v2:        Mitigation; Full AMD retpoline, IBPB conditional, STIBP conditional, RSB filling
Vulnerability Srbds:             Not affected
Vulnerability Tsx async abort:   Not affected
Flags:                           fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt
                                 pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid aperfmperf pni pclmulqdq monitor ssse3 fma cx16 sse4_
                                 1 sse4_2 movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch os
                                 vw ibs skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb cat_l3 cdp_l3 hw_pstate sme ssbd mba sev ibpb
                                  stibp vmmcall sev_es fsgsbase bmi1 avx2 smep bmi2 cqm rdt_a rdseed adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 xsav
                                 es cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local clzero irperf xsaveerptr rdpru wbnoinvd arat npt lbrv svm_lock nrip_save ts
                                 c_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold avic v_vmsave_vmload vgif umip rdpid overflow_recov succor
                                 smca
# cat /etc/debian_version
bullseye/sid
# uname -a
Linux XXXX 5.10.0-6-amd64 #1 SMP Debian 5.10.28-1 (2021-04-09) x86_64 GNU/Linux
# lspci -vt
-[0000:00]-+-00.0  Advanced Micro Devices, Inc. [AMD] Starship/Matisse Root Complex
           +-00.2  Advanced Micro Devices, Inc. [AMD] Starship/Matisse IOMMU
           +-01.0  Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Host Bridge
           +-01.2-[20-2c]----00.0-[21-2c]--+-01.0-[23]----00.0  Samsung Electronics Co Ltd NVMe SSD Controller SM961/PM961/SM963
           |                               +-02.0-[24]--+-00.0  Intel Corporation Ethernet Controller 10G X550T
           |                               |            \-00.1  Intel Corporation Ethernet Controller 10G X550T
           |                               +-04.0-[26]----00.0  Intel Corporation I210 Gigabit Network Connection
           |                               +-05.0-[27]----00.0  Intel Corporation I210 Gigabit Network Connection
           |                               +-06.0-[28-29]----00.0-[29]----00.0  ASPEED Technology, Inc. ASPEED Graphics Family
           |                               +-08.0-[2a]--+-00.0  Advanced Micro Devices, Inc. [AMD] Starship/Matisse Reserved SPP
           |                               |            +-00.1  Advanced Micro Devices, Inc. [AMD] Matisse USB 3.0 Host Controller
           |                               |            \-00.3  Advanced Micro Devices, Inc. [AMD] Matisse USB 3.0 Host Controller
           |                               +-09.0-[2b]----00.0  Advanced Micro Devices, Inc. [AMD] FCH SATA Controller [AHCI mode]
           |                               \-0a.0-[2c]----00.0  Advanced Micro Devices, Inc. [AMD] FCH SATA Controller [AHCI mode]
           +-02.0  Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Host Bridge
           +-03.0  Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Host Bridge
           +-04.0  Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Host Bridge
           +-05.0  Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Host Bridge
           +-07.0  Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Host Bridge
           +-07.1-[2d]----00.0  Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Function
           +-08.0  Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Host Bridge
           +-08.1-[2e]--+-00.0  Advanced Micro Devices, Inc. [AMD] Starship/Matisse Reserved SPP
           |            +-00.1  Advanced Micro Devices, Inc. [AMD] Starship/Matisse Cryptographic Coprocessor PSPCPP
           |            +-00.3  Advanced Micro Devices, Inc. [AMD] Matisse USB 3.0 Host Controller
           |            \-00.4  Advanced Micro Devices, Inc. [AMD] Starship/Matisse HD Audio Controller
           +-14.0  Advanced Micro Devices, Inc. [AMD] FCH SMBus Controller
           +-14.3  Advanced Micro Devices, Inc. [AMD] FCH LPC Bridge
           +-18.0  Advanced Micro Devices, Inc. [AMD] Matisse Device 24: Function 0
           +-18.1  Advanced Micro Devices, Inc. [AMD] Matisse Device 24: Function 1
           +-18.2  Advanced Micro Devices, Inc. [AMD] Matisse Device 24: Function 2
           +-18.3  Advanced Micro Devices, Inc. [AMD] Matisse Device 24: Function 3
           +-18.4  Advanced Micro Devices, Inc. [AMD] Matisse Device 24: Function 4
           +-18.5  Advanced Micro Devices, Inc. [AMD] Matisse Device 24: Function 5
           +-18.6  Advanced Micro Devices, Inc. [AMD] Matisse Device 24: Function 6
           \-18.7  Advanced Micro Devices, Inc. [AMD] Matisse Device 24: Function 7
#

Take a look at the CPU load with top or "sar -P ALL 3" while reading; on my machine one CPU gets stuck like this:

top:

Code:
%Cpu11 :  0.0 us, 59.7 sy,  0.0 ni,  0.0 id, 40.3 wa,  0.0 hi,  0.0 si,  0.0 st

sar:

Code:
10:50:18 PM     CPU     %user     %nice   %system   %iowait    %steal     %idle
10:50:21 PM     all      0.06      0.00      3.71      2.44      0.00     93.78
10:50:21 PM       0      0.00      0.00      0.00      0.00      0.00    100.00
10:50:21 PM       1      0.00      0.00      0.00      0.00      0.00    100.00
10:50:21 PM       2      0.00      0.00      0.00      0.00      0.00    100.00
10:50:21 PM       3      1.00      0.00      0.00      0.00      0.00     99.00
10:50:21 PM       4      0.00      0.00      0.00      0.00      0.00    100.00
10:50:21 PM       5      0.00      0.00      0.00      0.00      0.00    100.00
10:50:21 PM       6      0.00      0.00      0.00      0.00      0.00    100.00
10:50:21 PM       7      0.00      0.00      0.00      0.00      0.00    100.00
10:50:21 PM       8      0.00      0.00      0.00      0.33      0.00     99.67
10:50:21 PM       9      0.00      0.00      0.00      0.00      0.00    100.00
10:50:21 PM      10      0.00      0.00      0.00      0.00      0.00    100.00
10:50:21 PM      11      0.00      0.00      0.00      0.00      0.00    100.00
10:50:21 PM      12      0.00      0.00     60.54     39.46      0.00      0.00
10:50:21 PM      13      0.00      0.00      0.00      0.00      0.00    100.00
10:50:21 PM      14      0.00      0.00      0.00      0.00      0.00    100.00
10:50:21 PM      15      0.00      0.00      0.00      0.00      0.00    100.00

10:50:21 PM     CPU     %user     %nice   %system   %iowait    %steal     %idle
10:50:24 PM     all      0.04      0.00      3.57      2.53      0.00     93.86
10:50:24 PM       0      0.00      0.00      0.00      0.00      0.00    100.00
10:50:24 PM       1      0.00      0.00      0.00      0.00      0.00    100.00
10:50:24 PM       2      0.00      0.00      0.00      0.00      0.00    100.00
10:50:24 PM       3      0.67      0.00      0.00      0.00      0.00     99.33
10:50:24 PM       4      0.00      0.00      0.00      0.00      0.00    100.00
10:50:24 PM       5      0.00      0.00      0.00      0.00      0.00    100.00
10:50:24 PM       6      0.00      0.00      0.00      0.00      0.00    100.00
10:50:24 PM       7      0.00      0.00      0.00      0.00      0.00    100.00
10:50:24 PM       8      0.00      0.00      0.00      0.00      0.00    100.00
10:50:24 PM       9      0.00      0.00      0.00      0.00      0.00    100.00
10:50:24 PM      10      0.00      0.00      0.00      0.00      0.00    100.00
10:50:24 PM      11      0.00      0.00      0.00      0.00      0.00    100.00
10:50:24 PM      12      0.00      0.00     58.56     41.44      0.00      0.00
10:50:24 PM      13      0.00      0.00      0.00      0.00      0.00    100.00
10:50:24 PM      14      0.00      0.00      0.00      0.00      0.00    100.00
10:50:24 PM      15      0.00      0.00      0.00      0.00      0.00    100.00

10:50:24 PM     CPU     %user     %nice   %system   %iowait    %steal     %idle
10:50:27 PM     all      0.06      0.00      3.41      2.63      0.00     93.90
10:50:27 PM       0      0.00      0.00      0.00      0.00      0.00    100.00
10:50:27 PM       1      0.00      0.00      0.00      0.00      0.00    100.00
10:50:27 PM       2      0.00      0.00      0.00      0.00      0.00    100.00
10:50:27 PM       3      1.00      0.00      0.00      0.00      0.00     99.00
10:50:27 PM       4      0.00      0.00      0.00      0.00      0.00    100.00
10:50:27 PM       5      0.00      0.00      0.00      0.00      0.00    100.00
10:50:27 PM       6      0.00      0.00      0.00      0.00      0.00    100.00
10:50:27 PM       7      0.00      0.00      0.00      0.00      0.00    100.00
10:50:27 PM       8      0.00      0.00      0.00      0.00      0.00    100.00
10:50:27 PM       9      0.00      0.00      0.00      0.00      0.00    100.00
10:50:27 PM      10      0.00      0.00      0.00      0.00      0.00    100.00
10:50:27 PM      11      0.00      0.00      0.00      0.00      0.00    100.00
10:50:27 PM      12      0.00      0.00     56.40     43.60      0.00      0.00
10:50:27 PM      13      0.00      0.00      0.00      0.00      0.00    100.00
10:50:27 PM      14      0.00      0.00      0.00      0.00      0.00    100.00
10:50:27 PM      15      0.00      0.00      0.00      0.00      0.00    100.00

10:50:27 PM     CPU     %user     %nice   %system   %iowait    %steal     %idle
10:50:30 PM     all      0.08      0.00      3.65      2.84      0.00     93.43
10:50:30 PM       0      0.00      0.00      0.00      0.00      0.00    100.00
10:50:30 PM       1      0.00      0.00      0.33      0.00      0.00     99.67
10:50:30 PM       2      0.00      0.00      0.00      0.00      0.00    100.00
10:50:30 PM       3      1.00      0.00      0.00      0.00      0.00     99.00
10:50:30 PM       4      0.00      0.00      2.01      0.00      0.00     97.99
10:50:30 PM       5      0.00      0.00      4.00      0.00      0.00     96.00
10:50:30 PM       6      0.00      0.00      0.00      0.00      0.00    100.00
10:50:30 PM       7      0.00      0.00      0.00      0.00      0.00    100.00
10:50:30 PM       8      0.00      0.00      0.00      0.00      0.00    100.00
10:50:30 PM       9      0.00      0.00      0.00      0.00      0.00    100.00
10:50:30 PM      10      0.00      0.00      0.00      0.00      0.00    100.00
10:50:30 PM      11      0.00      0.00      0.00      0.00      0.00    100.00
10:50:30 PM      12      0.34      0.00     52.19     45.79      0.00      1.68
10:50:30 PM      13      0.00      0.00      0.00      0.00      0.00    100.00
10:50:30 PM      14      0.00      0.00      0.33      0.00      0.00     99.67
10:50:30 PM      15      0.00      0.00      0.00      0.00      0.00    100.00
^C


dd itself runs at ~60% during this.

I can't reproduce the GUI test; post your values and then we'll see where to go from there.
 
You can try dd with bs=64K or 256K instead of 1M.

Then check whether a readahead buffer exists for the block device (blockdev).
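
For example (a minimal sketch; the device path is taken from this thread, the readahead value is just illustrative):

Code:
# show the current readahead, in 512-byte sectors
sudo blockdev --getra /dev/nvme0n1
# experiment with a larger readahead, e.g. 4 MiB (8192 sectors)
sudo blockdev --setra 8192 /dev/nvme0n1
# then repeat the read test with a smaller block size
sudo dd if=/dev/nvme0n1 of=/dev/null bs=256K count=40000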

dd is generally a poor benchmark. It only reads the next block once the previous one has been written, so there is no parallelism, or only as much as the kernel provides through buffering. That's also why it can get slower again with large blocks.

On top of that, you may not be reading data at all but hitting "free" areas of the SSD (TRIM); the SSD returns zeros there without reading anything, and that can be slower than actually reading data. Free areas are normally never accessed at all (the file system only reads where it previously wrote something), so it's quite possible that this path isn't optimized.

With the file-system benchmark, on the other hand, you have the whole RAM cache etc. in the mix, unless the benchmark program bypasses it; but then the result doesn't reflect the real-world performance of the file system either.

In the end it doesn't really matter anyway, since access times/IOPS matter far more than this sequential speed.

You can also check with lspci whether the link is negotiated correctly (for PCIe SSDs), or otherwise check the link of the controller.
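
A quick way to do that (sketch; the 01:00.0 bus address matches the lspci output posted later in this thread):

Code:
# locate the NVMe controller on the bus
lspci | grep -i non-volatile
# compare the negotiated link speed and width against the capability
sudo lspci -vv -s 01:00.0 | grep -E 'LnkCap|LnkSta'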
 
kieleich wrote:
You can try dd with bs=64K or 256K instead of 1M.

Then check whether a readahead buffer exists for the block device (blockdev).

dd is generally a poor benchmark. It only reads the next block once the previous one has been written, so there is no parallelism, or only as much as the kernel provides through buffering. That's also why it can get slower again with large blocks.

lsblk -t prints the readahead values.

kieleich wrote:
In the end it doesn't really matter anyway, since access times/IOPS matter far more than this sequential speed.

Then you should measure with fio. (Warning: it's a monster of a tool.)
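A starting point might look like this (a sketch; --readonly guards against accidentally writing to the device, libaio assumed to be available):

Code:
# 30 s sequential read directly from the block device, bypassing the page cache
sudo fio --name=seqread --filename=/dev/nvme0n1 --readonly \
         --rw=read --bs=1M --ioengine=libaio --iodepth=32 \
         --direct=1 --runtime=30 --time_based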
Addendum:

Just found this; someone seems to have built something very nice here: https://www.thomas-krenn.com/de/wiki/TKperf
 
Thanks for the replies. What I forgot in the opening post:

Block size doesn't matter; whether 128k or 1M, it only varies between 1.7 and 2.2 GB/s. At least according to the GNOME System Monitor, no CPU core/thread was fully loaded; I can look at that in more detail tomorrow. Also, the SSD is filled to >95%, so there are hardly any trimmed, empty sectors in the visible NAND.

The SSD's link is fine, at least under Windows; I'll check that again tomorrow (it should be fine, though, since nothing had been read before the read-only benchmark ran, which is why its values look plausible to me).

For background: I'm going to clone the 1TB SSD bit-for-bit onto the 2TB SSD (both NVMe PCIe 3.0 x4) and want to do that at the maximum possible speed. If there's something better than dd for this, that would be good too :).
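The naive plan would look roughly like this (a sketch only; nvme1n1 is assumed to be the 2TB target and would be overwritten, so the device names need double-checking first):

Code:
# bit-for-bit clone, bypassing the page cache, with progress output
sudo dd if=/dev/nvme0n1 of=/dev/nvme1n1 bs=4M iflag=direct oflag=direct status=progress
sync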
Addendum:

foofoobar wrote:
The GUI tool and dd don't measure the same partition, but that shouldn't make a difference.
Yes, I had already tested the same partition before; the values stay the same.

Here's the output:
Code:
root@ubuntu:/home/ubuntu# sync; echo 3 > /proc/sys/vm/drop_caches; dd if=/dev/nvme0n1p5 of=/dev/null bs=1M count=40000
40000+0 Datensätze ein
40000+0 Datensätze aus
41943040000 Bytes (42 GB, 39 GiB) kopiert, 17,1307 s, 2,4 GB/s


root@ubuntu:/home/ubuntu# hwinfo --short --disk /dev/nvme0n1
disk:                                                           
  /dev/nvme0n1         Samsung Electronics NVMe SSD Controller SM981/PM981/PM983

root@ubuntu:/home/ubuntu# lscpu
Architektur:                     x86_64
CPU Operationsmodus:             32-bit, 64-bit
Byte-Reihenfolge:                Little Endian
Adressgrößen:                    43 bits physical, 48 bits virtual
CPU(s):                          16
Liste der Online-CPU(s):         0-15
Thread(s) pro Kern:              2
Kern(e) pro Socket:              8
Sockel:                          1
NUMA-Knoten:                     1
Anbieterkennung:                 AuthenticAMD
Prozessorfamilie:                23
Modell:                          113
Modellname:                      AMD Ryzen 7 3700X 8-Core Processor
Stepping:                        0
Frequenzanhebung:                aktiviert
CPU MHz:                         2200.000
Maximale Taktfrequenz der CPU:   5224,2178
Minimale Taktfrequenz der CPU:   2200,0000
BogoMIPS:                        7200.28
Virtualisierung:                 AMD-V
L1d Cache:                       256 KiB
L1i Cache:                       256 KiB
L2 Cache:                        4 MiB
L3 Cache:                        32 MiB
NUMA-Knoten0 CPU(s):             0-15
Vulnerability Itlb multihit:     Not affected
Vulnerability L1tf:              Not affected
Vulnerability Mds:               Not affected
Vulnerability Meltdown:          Not affected
Vulnerability Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl and seccomp
Vulnerability Spectre v1:        Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Vulnerability Spectre v2:        Mitigation; Full AMD retpoline, IBPB conditional, STIBP conditional, RSB filling
Vulnerability Srbds:             Not affected
Vulnerability Tsx async abort:   Not affected
Markierungen:                    fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe
                                 1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid aperfmperf pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2
                                  movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skini
                                 t wdt tce topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb cat_l3 cdp_l3 hw_pstate sme ssbd mba sev ibpb stibp vmmcall s
                                 ev_es fsgsbase bmi1 avx2 smep bmi2 cqm rdt_a rdseed adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup
                                 _llc cqm_mbm_total cqm_mbm_local clzero irperf xsaveerptr rdpru wbnoinvd arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flush
                                 byasid decodeassists pausefilter pfthreshold avic v_vmsave_vmload vgif umip rdpid overflow_recov succor smca

ubuntu@ubuntu:~$ cat /etc/debian_version 
bullseye/sid

ubuntu@ubuntu:~$ uname -a
Linux ubuntu 5.11.0-16-generic #17-Ubuntu SMP Wed Apr 14 20:12:43 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

ubuntu@ubuntu:~$ lspci -vt
-[0000:00]-+-00.0  Advanced Micro Devices, Inc. [AMD] Starship/Matisse Root Complex
           +-00.2  Advanced Micro Devices, Inc. [AMD] Starship/Matisse IOMMU
           +-01.0  Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Host Bridge
           +-01.1-[01]----00.0  Samsung Electronics Co Ltd NVMe SSD Controller SM981/PM981/PM983
           +-01.3-[02-0a]--+-00.0  Advanced Micro Devices, Inc. [AMD] X370 Series Chipset USB 3.1 xHCI Controller
           |               +-00.1  Advanced Micro Devices, Inc. [AMD] X370 Series Chipset SATA Controller
           |               \-00.2-[03-0a]--+-00.0-[04]----00.0  ASMedia Technology Inc. ASM1143 USB 3.1 Host Controller
           |                               +-02.0-[05]----00.0  Intel Corporation I211 Gigabit Network Connection
           |                               +-03.0-[06]--
           |                               +-04.0-[07]--
           |                               +-05.0-[08]--
           |                               +-06.0-[09]--
           |                               \-07.0-[0a]--
           +-02.0  Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Host Bridge
           +-03.0  Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Host Bridge
           +-03.1-[0b-0d]----00.0-[0c-0d]----00.0-[0d]--+-00.0  Advanced Micro Devices, Inc. [AMD/ATI] Navi 21 [Radeon RX 6800/6800 XT / 6900 XT]
           |                                            +-00.1  Advanced Micro Devices, Inc. [AMD/ATI] Device ab28
           |                                            +-00.2  Advanced Micro Devices, Inc. [AMD/ATI] Device 73a6
           |                                            \-00.3  Advanced Micro Devices, Inc. [AMD/ATI] Device 73a4
           +-04.0  Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Host Bridge
           +-05.0  Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Host Bridge
           +-07.0  Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Host Bridge
           +-07.1-[0e]----00.0  Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Function
           +-08.0  Advanced Micro Devices, Inc. [AMD] Starship/Matisse PCIe Dummy Host Bridge
           +-08.1-[0f]--+-00.0  Advanced Micro Devices, Inc. [AMD] Starship/Matisse Reserved SPP
           |            +-00.1  Advanced Micro Devices, Inc. [AMD] Starship/Matisse Cryptographic Coprocessor PSPCPP
           |            +-00.3  Advanced Micro Devices, Inc. [AMD] Matisse USB 3.0 Host Controller
           |            \-00.4  Advanced Micro Devices, Inc. [AMD] Starship/Matisse HD Audio Controller
           +-08.2-[10]----00.0  Advanced Micro Devices, Inc. [AMD] FCH SATA Controller [AHCI mode]
           +-08.3-[11]----00.0  Advanced Micro Devices, Inc. [AMD] FCH SATA Controller [AHCI mode]
           +-14.0  Advanced Micro Devices, Inc. [AMD] FCH SMBus Controller
           +-14.3  Advanced Micro Devices, Inc. [AMD] FCH LPC Bridge
           +-18.0  Advanced Micro Devices, Inc. [AMD] Matisse Device 24: Function 0
           +-18.1  Advanced Micro Devices, Inc. [AMD] Matisse Device 24: Function 1
           +-18.2  Advanced Micro Devices, Inc. [AMD] Matisse Device 24: Function 2
           +-18.3  Advanced Micro Devices, Inc. [AMD] Matisse Device 24: Function 3
           +-18.4  Advanced Micro Devices, Inc. [AMD] Matisse Device 24: Function 4
           +-18.5  Advanced Micro Devices, Inc. [AMD] Matisse Device 24: Function 5
           +-18.6  Advanced Micro Devices, Inc. [AMD] Matisse Device 24: Function 6
           \-18.7  Advanced Micro Devices, Inc. [AMD] Matisse Device 24: Function 7

sar -P ALL 3 also shows a high I/O load on one core for me:
Code:
22:25:14        CPU     %user     %nice   %system   %iowait    %steal     %idle
22:25:17        all      0,86      0,00      1,48      1,73      0,00     95,93
22:25:17          0      2,99      0,00      0,33      0,00      0,00     96,68
22:25:17          1      0,33      0,00      1,00      0,00      0,00     98,66
22:25:17          2      3,00      0,00      3,33      0,00      0,00     93,67
22:25:17          3      0,34      0,00      0,00      0,00      0,00     99,66
22:25:17          4      0,00      0,00      0,00      0,00      0,00    100,00
22:25:17          5      0,00      0,00      0,33      1,33      0,00     98,34
22:25:17          6      1,66      0,00      1,99      0,66      0,00     95,70
22:25:17          7      0,00      0,00      9,22     19,11      0,00     71,67
22:25:17          8      0,67      0,00      0,67      0,00      0,00     98,66
22:25:17          9      1,00      0,00      0,33      0,00      0,00     98,66
22:25:17         10      0,00      0,00      0,00      0,00      0,00    100,00
22:25:17         11      2,68      0,00      0,00      0,00      0,00     97,32
22:25:17         12      0,00      0,00      0,00      0,00      0,00    100,00
22:25:17         13      0,00      0,00      0,33      0,33      0,00     99,33
22:25:17         14      1,00      0,00      2,01      0,00      0,00     96,99
22:25:17         15      0,00      0,00      4,29      6,60      0,00     89,11

22:25:17        CPU     %user     %nice   %system   %iowait    %steal     %idle
22:25:20        all      0,96      0,00      2,50      4,25      0,00     92,29
22:25:20          0      7,64      0,00      1,33      0,00      0,00     91,03
22:25:20          1      1,00      0,00      0,33      0,00      0,00     98,66
22:25:20          2      2,02      0,00      0,00      0,00      0,00     97,98
22:25:20          3      0,33      0,00      0,33      0,00      0,00     99,33
22:25:20          4      0,00      0,00      0,00      0,00      0,00    100,00
22:25:20          5      0,00      0,00      0,00      0,00      0,00    100,00
22:25:20          6      1,00      0,00      1,00      0,33      0,00     97,67
22:25:20          7      0,00      0,00      0,00      0,00      0,00    100,00
22:25:20          8      0,00      0,00      0,00      0,00      0,00    100,00
22:25:20          9      1,00      0,00      1,00      0,00      0,00     98,01
22:25:20         10      0,00      0,00      0,66      0,00      0,00     99,34
22:25:20         11      0,00      0,00      0,33      0,00      0,00     99,67
22:25:20         12      0,00      0,00      0,34      0,00      0,00     99,66
22:25:20         13      0,66      0,00      1,66      0,00      0,00     97,67
22:25:20         14      1,66      0,00      1,33      0,33      0,00     96,68
22:25:20         15      0,00      0,00     31,99     68,01      0,00      0,00

22:25:20        CPU     %user     %nice   %system   %iowait    %steal     %idle
22:25:23        all      1,29      0,00      2,66      3,99      0,00     92,06
22:25:23          0      4,61      0,00      0,99      0,00      0,00     94,41
22:25:23          1      5,25      0,00      1,64      0,00      0,00     93,11
22:25:23          2      2,99      0,00      0,00      0,00      0,00     97,01
22:25:23          3      0,33      0,00      0,00      0,00      0,00     99,67
22:25:23          4      0,00      0,00      0,00      0,00      0,00    100,00
22:25:23          5      0,00      0,00      0,00      0,00      0,00    100,00
22:25:23          6      0,00      0,00      0,33      0,00      0,00     99,67
22:25:23          7      0,00      0,00      0,00      0,00      0,00    100,00
22:25:23          8      0,33      0,00      0,66      0,00      0,00     99,00
22:25:23          9      2,31      0,00      0,33      0,00      0,00     97,36
22:25:23         10      0,34      0,00      0,00      0,00      0,00     99,66
22:25:23         11      1,32      0,00      0,33      0,00      0,00     98,34
22:25:23         12      0,33      0,00      0,00      0,00      0,00     99,67
22:25:23         13      0,00      0,00      0,00      0,33      0,00     99,67
22:25:23         14      2,67      0,00      3,00      0,00      0,00     94,33
22:25:23         15      0,00      0,00     35,69     64,31      0,00      0,00

22:25:23        CPU     %user     %nice   %system   %iowait    %steal     %idle
22:25:26        all      1,04      0,00      2,77      3,83      0,00     92,35
22:25:26          0      0,00      0,00      0,99      0,00      0,00     99,01
22:25:26          1      8,97      0,00      0,00      0,00      0,00     91,03
22:25:26          2      1,00      0,00      0,00      0,00      0,00     99,00
22:25:26          3      0,00      0,00      0,00      0,00      0,00    100,00
22:25:26          4      0,00      0,00      0,00      0,33      0,00     99,67
22:25:26          5      0,33      0,00      0,00      0,00      0,00     99,67
22:25:26          6      0,00      0,00      0,00      0,00      0,00    100,00
22:25:26          7      0,00      0,00      0,00      0,00      0,00    100,00
22:25:26          8      0,00      0,00      0,00      0,00      0,00    100,00
22:25:26          9      1,00      0,00      0,33      0,00      0,00     98,67
22:25:26         10      2,00      0,00      0,33      0,00      0,00     97,67
22:25:26         11      0,34      0,00      0,00      0,00      0,00     99,66
22:25:26         12      0,00      0,00      0,00      0,00      0,00    100,00
22:25:26         13      0,33      0,00      0,00      0,00      0,00     99,67
22:25:26         14      2,68      0,00      3,01      0,00      0,00     94,31
22:25:26         15      0,00      0,00     39,40     60,60      0,00      0,00

With top I didn't see anything unusual.

@foofoobar lsblk -t:
Code:
ubuntu@ubuntu:~$ lsblk -t
NAME        ALIGNMENT MIN-IO OPT-IO PHY-SEC LOG-SEC ROTA SCHED       RQ-SIZE  RA WSAME
loop0               0    512      0     512     512    1 mq-deadline     256 128    0B
loop1               0    512      0     512     512    0 mq-deadline     256 128    0B
loop2               0    512      0     512     512    0 mq-deadline     256 128    0B
loop3               0    512      0     512     512    0 mq-deadline     256 128    0B
loop4               0    512      0     512     512    0 mq-deadline     256 128    0B
loop5               0    512      0     512     512    0 mq-deadline     256 128    0B
sda                 0    512      0     512     512    0 mq-deadline      64 128    0B
├─sda1              0    512      0     512     512    0 mq-deadline      64 128    0B
└─sda2              0    512      0     512     512    0 mq-deadline      64 128    0B
sdb                 0    512      0     512     512    1 mq-deadline       2 128    0B
├─sdb1              0    512      0     512     512    1 mq-deadline       2 128    0B
└─sdb2              0    512      0     512     512    1 mq-deadline       2 128    0B
sdc                 0    512      0     512     512    1 mq-deadline       2 128    0B
└─sdc1              0    512      0     512     512    1 mq-deadline       2 128    0B
sr0                 0    512      0     512     512    1 mq-deadline      64 128    0B
nvme0n1             0    512      0     512     512    0 none           1023 128    0B
├─nvme0n1p1         0    512      0     512     512    0 none           1023 128    0B
├─nvme0n1p2         0    512      0     512     512    0 none           1023 128    0B
├─nvme0n1p3         0    512      0     512     512    0 none           1023 128    0B
├─nvme0n1p4         0    512      0     512     512    0 none           1023 128    0B
└─nvme0n1p5         0    512      0     512     512    0 none           1023 128    0B
 
And here are the details including link speed:
Code:
01:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller SM981/PM981/PM983 (prog-if 02 [NVM Express])
    Subsystem: Samsung Electronics Co Ltd NVMe SSD Controller SM981/PM981/PM983
    Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
    Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
    Latency: 0, Cache Line Size: 64 bytes
    Interrupt: pin A routed to IRQ 47
    NUMA node: 0
    IOMMU group: 16
    Region 0: Memory at fcf00000 (64-bit, non-prefetchable) [size=16K]
    Capabilities: [40] Power Management version 3
        Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
        Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
    Capabilities: [50] MSI: Enable- Count=1/1 Maskable- 64bit+
        Address: 0000000000000000  Data: 0000
    Capabilities: [70] Express (v2) Endpoint, MSI 00
        DevCap:    MaxPayload 256 bytes, PhantFunc 0, Latency L0s unlimited, L1 unlimited
            ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset+ SlotPowerLimit 0.000W
        DevCtl:    CorrErr+ NonFatalErr+ FatalErr+ UnsupReq+
            RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+ FLReset-
            MaxPayload 256 bytes, MaxReadReq 512 bytes
        DevSta:    CorrErr- NonFatalErr- FatalErr- UnsupReq- AuxPwr- TransPend-
        LnkCap:    Port #0, Speed 8GT/s, Width x4, ASPM L1, Exit Latency L1 <64us
            ClockPM+ Surprise- LLActRep- BwNot- ASPMOptComp+
        LnkCtl:    ASPM Disabled; RCB 64 bytes, Disabled- CommClk+
            ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
        LnkSta:    Speed 8GT/s (ok), Width x4 (ok)
            TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
        DevCap2: Completion Timeout: Range ABCD, TimeoutDis+ NROPrPrP- LTR+
             10BitTagComp- 10BitTagReq- OBFF Not Supported, ExtFmt- EETLPPrefix-
             EmergencyPowerReduction Not Supported, EmergencyPowerReductionInit-
             FRS- TPHComp- ExtTPHComp-
             AtomicOpsCap: 32bit- 64bit- 128bitCAS-
        DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis- LTR+ OBFF Disabled,
             AtomicOpsCtl: ReqEn-
        LnkCap2: Supported Link Speeds: 2.5-8GT/s, Crosslink- Retimer- 2Retimers- DRS-
        LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis-
             Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
             Compliance De-emphasis: -6dB
        LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete+ EqualizationPhase1+
             EqualizationPhase2+ EqualizationPhase3+ LinkEqualizationRequest-
             Retimer- 2Retimers- CrosslinkRes: unsupported
    Capabilities: [b0] MSI-X: Enable+ Count=33 Masked-
        Vector table: BAR=0 offset=00003000
        PBA: BAR=0 offset=00002000
    Capabilities: [100 v2] Advanced Error Reporting
        UESta:    DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
        UEMsk:    DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
        UESvrt:    DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
        CESta:    RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr-
        CEMsk:    RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+
        AERCap:    First Error Pointer: 00, ECRCGenCap+ ECRCGenEn- ECRCChkCap+ ECRCChkEn-
            MultHdrRecCap+ MultHdrRecEn- TLPPfxPres- HdrLogCap-
        HeaderLog: 00000000 00000000 00000000 00000000
    Capabilities: [148 v1] Device Serial Number 00-00-00-00-00-00-00-00
    Capabilities: [158 v1] Power Budgeting <?>
    Capabilities: [168 v1] Secondary PCI Express
        LnkCtl3: LnkEquIntrruptEn- PerformEqu-
        LaneErrStat: 0
    Capabilities: [188 v1] Latency Tolerance Reporting
        Max snoop latency: 1048576ns
        Max no snoop latency: 1048576ns
    Capabilities: [190 v1] L1 PM Substates
        L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+ L1_PM_Substates+
              PortCommonModeRestoreTime=10us PortTPowerOnTime=10us
        L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2- ASPM_L1.1-
               T_CommonMode=0us LTR1.2_Threshold=32768ns
        L1SubCtl2: T_PwrOn=10us
    Kernel driver in use: nvme
    Kernel modules: nvme
 
Update: The Crucial SSD is installed in a PCIe-to-M.2 adapter in the board's second PCIe slot (the one intended for GPUs) and delivers roughly the expected performance; the Samsung in the M.2 port still doesn't. So I'm assuming degradation of the (too) full Samsung SSD. For the test the two SSDs will swap positions anyway, which will let me rule out a problem with the M.2 port (although that port causes no trouble and has its own lanes to the CPU, so it's rather unlikely):

A few more details on the current setup, with the GPU in the first PCIe 3.0 slot, the Crucial in the second, and the Samsung in the M.2 slot:

lspci -vv (Samsung 970 Evo+ 1TB):
Code:
01:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller SM981/PM981/PM983 (prog-if 02 [NVM Express])
    Subsystem: Samsung Electronics Co Ltd NVMe SSD Controller SM981/PM981/PM983
    Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
    Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
    Latency: 0, Cache Line Size: 64 bytes
    Interrupt: pin A routed to IRQ 49
    NUMA node: 0
    IOMMU group: 17
    Region 0: Memory at fcf00000 (64-bit, non-prefetchable) [size=16K]
    Capabilities: [40] Power Management version 3
        Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
        Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
    Capabilities: [50] MSI: Enable- Count=1/1 Maskable- 64bit+
        Address: 0000000000000000  Data: 0000
    Capabilities: [70] Express (v2) Endpoint, MSI 00
        DevCap:    MaxPayload 256 bytes, PhantFunc 0, Latency L0s unlimited, L1 unlimited
            ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset+ SlotPowerLimit 0.000W
        DevCtl:    CorrErr+ NonFatalErr+ FatalErr+ UnsupReq+
            RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+ FLReset-
            MaxPayload 256 bytes, MaxReadReq 512 bytes
        DevSta:    CorrErr- NonFatalErr- FatalErr- UnsupReq- AuxPwr- TransPend-
        LnkCap:    Port #0, Speed 8GT/s, Width x4, ASPM L1, Exit Latency L1 <64us
            ClockPM+ Surprise- LLActRep- BwNot- ASPMOptComp+
        LnkCtl:    ASPM Disabled; RCB 64 bytes, Disabled- CommClk+
            ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
        LnkSta:    Speed 8GT/s (ok), Width x4 (ok)
            TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
        DevCap2: Completion Timeout: Range ABCD, TimeoutDis+ NROPrPrP- LTR+
             10BitTagComp- 10BitTagReq- OBFF Not Supported, ExtFmt- EETLPPrefix-
             EmergencyPowerReduction Not Supported, EmergencyPowerReductionInit-
             FRS- TPHComp- ExtTPHComp-
             AtomicOpsCap: 32bit- 64bit- 128bitCAS-
        DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis- LTR+ OBFF Disabled,
             AtomicOpsCtl: ReqEn-
        LnkCap2: Supported Link Speeds: 2.5-8GT/s, Crosslink- Retimer- 2Retimers- DRS-
        LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis-
             Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
             Compliance De-emphasis: -6dB
        LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete+ EqualizationPhase1+
             EqualizationPhase2+ EqualizationPhase3+ LinkEqualizationRequest-
             Retimer- 2Retimers- CrosslinkRes: unsupported
    Capabilities: [b0] MSI-X: Enable+ Count=33 Masked-
        Vector table: BAR=0 offset=00003000
        PBA: BAR=0 offset=00002000
    Capabilities: [100 v2] Advanced Error Reporting
        UESta:    DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
        UEMsk:    DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
        UESvrt:    DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
        CESta:    RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr-
        CEMsk:    RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+
        AERCap:    First Error Pointer: 00, ECRCGenCap+ ECRCGenEn- ECRCChkCap+ ECRCChkEn-
            MultHdrRecCap+ MultHdrRecEn- TLPPfxPres- HdrLogCap-
        HeaderLog: 00000000 00000000 00000000 00000000
    Capabilities: [148 v1] Device Serial Number 00-00-00-00-00-00-00-00
    Capabilities: [158 v1] Power Budgeting <?>
    Capabilities: [168 v1] Secondary PCI Express
        LnkCtl3: LnkEquIntrruptEn- PerformEqu-
        LaneErrStat: 0
    Capabilities: [188 v1] Latency Tolerance Reporting
        Max snoop latency: 1048576ns
        Max no snoop latency: 1048576ns
    Capabilities: [190 v1] L1 PM Substates
        L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+ L1_PM_Substates+
              PortCommonModeRestoreTime=10us PortTPowerOnTime=10us
        L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2- ASPM_L1.1-
               T_CommonMode=0us LTR1.2_Threshold=32768ns
        L1SubCtl2: T_PwrOn=10us
    Kernel driver in use: nvme
    Kernel modules: nvme

lspci -vv Crucial P5 2TB:
Code:
0e:00.0 Non-Volatile memory controller: Micron Technology Inc Device 5405 (prog-if 02 [NVM Express])
    Subsystem: Micron Technology Inc Device 0100
    Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
    Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
    Latency: 0, Cache Line Size: 64 bytes
    Interrupt: pin A routed to IRQ 51
    NUMA node: 0
    IOMMU group: 25
    Region 0: Memory at fce00000 (64-bit, non-prefetchable) [size=16K]
    Capabilities: [40] Power Management version 3
        Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
        Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
    Capabilities: [50] MSI-X: Enable+ Count=32 Masked-
        Vector table: BAR=0 offset=00003000
        PBA: BAR=0 offset=00002000
    Capabilities: [60] Express (v2) Endpoint, MSI 00
        DevCap:    MaxPayload 256 bytes, PhantFunc 0, Latency L0s unlimited, L1 unlimited
            ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset+ SlotPowerLimit 0.000W
        DevCtl:    CorrErr+ NonFatalErr+ FatalErr+ UnsupReq+
            RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+ FLReset-
            MaxPayload 256 bytes, MaxReadReq 512 bytes
        DevSta:    CorrErr- NonFatalErr- FatalErr- UnsupReq- AuxPwr- TransPend-
        LnkCap:    Port #0, Speed 8GT/s, Width x4, ASPM L1, Exit Latency L1 <8us
            ClockPM+ Surprise- LLActRep- BwNot- ASPMOptComp+
        LnkCtl:    ASPM Disabled; RCB 64 bytes, Disabled- CommClk+
            ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
        LnkSta:    Speed 8GT/s (ok), Width x4 (ok)
            TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
        DevCap2: Completion Timeout: Range BCD, TimeoutDis+ NROPrPrP- LTR+
             10BitTagComp- 10BitTagReq- OBFF Via message, ExtFmt- EETLPPrefix-
             EmergencyPowerReduction Not Supported, EmergencyPowerReductionInit-
             FRS- TPHComp- ExtTPHComp-
             AtomicOpsCap: 32bit- 64bit- 128bitCAS-
        DevCtl2: Completion Timeout: 17s to 64s, TimeoutDis- LTR+ OBFF Disabled,
             AtomicOpsCtl: ReqEn-
        LnkCap2: Supported Link Speeds: 2.5-8GT/s, Crosslink- Retimer+ 2Retimers+ DRS-
        LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis-
             Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
             Compliance De-emphasis: -6dB
        LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete+ EqualizationPhase1+
             EqualizationPhase2+ EqualizationPhase3+ LinkEqualizationRequest-
             Retimer- 2Retimers- CrosslinkRes: unsupported
    Capabilities: [100 v1] Advanced Error Reporting
        UESta:    DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
        UEMsk:    DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
        UESvrt:    DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
        CESta:    RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr-
        CEMsk:    RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+
        AERCap:    First Error Pointer: 00, ECRCGenCap+ ECRCGenEn- ECRCChkCap+ ECRCChkEn-
            MultHdrRecCap- MultHdrRecEn- TLPPfxPres- HdrLogCap-
        HeaderLog: 00000000 00000000 00000000 00000000
    Capabilities: [190 v1] Alternative Routing-ID Interpretation (ARI)
        ARICap:    MFVC- ACS-, Next Function: 0
        ARICtl:    MFVC- ACS-, Function Group: 0
    Capabilities: [2a0 v1] Secondary PCI Express
        LnkCtl3: LnkEquIntrruptEn- PerformEqu-
        LaneErrStat: 0
    Capabilities: [2d0 v1] Latency Tolerance Reporting
        Max snoop latency: 1048576ns
        Max no snoop latency: 1048576ns
    Capabilities: [700 v1] L1 PM Substates
        L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1+ ASPM_L1.2+ ASPM_L1.1+ L1_PM_Substates+
              PortCommonModeRestoreTime=32us PortTPowerOnTime=20us
        L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2- ASPM_L1.1-
               T_CommonMode=0us LTR1.2_Threshold=32768ns
        L1SubCtl2: T_PwrOn=20us
    Kernel driver in use: nvme
    Kernel modules: nvme

Performance Samsung:
Code:
root@ubuntu:/home/ubuntu# dd if=/dev/nvme0n1 of=/dev/zero bs=1M count=10000
10000+0 Datensätze ein
10000+0 Datensätze aus
10485760000 Bytes (10 GB, 9,8 GiB) kopiert, 5,99142 s, 1,8 GB/s

Performance Crucial:
Code:
root@ubuntu:/home/ubuntu# dd if=/dev/nvme1n1 of=/dev/zero bs=1M count=10000
10000+0 Datensätze ein
10000+0 Datensätze aus
10485760000 Bytes (10 GB, 9,8 GiB) kopiert, 3,70358 s, 2,8 GB/s
root@ubuntu:/home/ubuntu# dd if=/dev/nvme1n1 of=/dev/zero bs=1M count=10000
10000+0 Datensätze ein
10000+0 Datensätze aus
10485760000 Bytes (10 GB, 9,8 GiB) kopiert, 3,7072 s, 2,8 GB/s
root@ubuntu:/home/ubuntu# dd if=/dev/nvme1n1 of=/dev/zero bs=1M count=100000
100000+0 Datensätze ein
100000+0 Datensätze aus
104857600000 Bytes (105 GB, 98 GiB) kopiert, 37,0181 s, 2,8 GB/s
root@ubuntu:/home/ubuntu# dd if=/dev/nvme1n1 of=/dev/zero bs=128K count=100000
100000+0 Datensätze ein
100000+0 Datensätze aus
13107200000 Bytes (13 GB, 12 GiB) kopiert, 5,68677 s, 2,3 GB/s

There will be more updates in the review - let's see what else turns up :hammer_alt:.
 
OK, both drives are attached to the bus with 4 lanes at maximum speed.

Good case: LnkSta: Speed 8GT/s (ok), Width x4 (ok) / LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis-
Bad case: LnkSta: Speed 8GT/s (ok), Width x4 (ok) / LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis-

The CPU load roughly correlates with the transfer rate; could it be that one of the two drives isn't attached directly to the CPU?
Show the bus topology with "lspci -vt".
And is there a diagram of your board's bus topology?
Have you already swapped the two drives between slots?

This is how you can see where the time is being spent: http://wiki.dreamrunner.org/public_html/Low_Latency_Programming/blktrace.html
But first swap the drives and show the bus topology.
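The blktrace workflow would be roughly (a sketch; 10-second capture while dd is running, file names arbitrary):

Code:
# record 10 s of block-layer events for the device
sudo blktrace -d /dev/nvme0n1 -o nvmetrace -w 10
# merge the per-CPU traces into one binary stream for btt
blkparse -i nvmetrace -d nvmetrace.bin > /dev/null
# summarize where requests spend their time (queueing, driver, device)
btt -i nvmetrace.bin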
 
foofoobar wrote:
OK, both drives are attached to the bus with 4 lanes at maximum speed.
Yes, I saw that too; the BIOS and Windows show exactly the same.
foofoobar wrote:
The CPU load roughly correlates with the transfer rate; could it be that one of the two drives isn't attached directly to the CPU?
According to the diagram, both are attached to the CPU.
foofoobar wrote:
And is there a diagram of your board's bus topology?
https://images.bit-tech.net/content...i-hero-review/crosshair-diagram-1280x1024.jpg
foofoobar wrote:
Have you already swapped the two drives between slots?
That's coming in the review anyway. I'm curious what comes out of it; one of my test goals is, after all, to measure whether a nearly full SSD has noticeable disadvantages compared to a larger one with (a lot of) free space.
 
I'm unknown wrote:
According to the diagram, both are attached to the CPU.

Yes and no; those values mean something different than you'd think at first glance:

Code:
              %user  Percentage of CPU utilization that occurred while executing at the user level (application). Note that this field includes time  spent  running
                     virtual processors.

              %usr   Percentage  of  CPU utilization that occurred while executing at the user level (application). Note that this field does NOT include time spent
                     running virtual processors.

              %nice  Percentage of CPU utilization that occurred while executing at the user level with nice priority.

              %system
                     Percentage of CPU utilization that occurred while executing at the system level (kernel). Note that this field includes  time  spent  servicing
                     hardware and software interrupts.

              %sys   Percentage of CPU utilization that occurred while executing at the system level (kernel). Note that this field does NOT include time spent ser‐
                     vicing hardware or software interrupts.

              %iowait
                     Percentage of time that the CPU or CPUs were idle during which the system had an outstanding disk I/O request.

              %steal Percentage of time spent in involuntary wait by the virtual CPU or CPUs while the hypervisor was servicing another virtual processor.

              %irq   Percentage of time spent by the CPU or CPUs to service hardware interrupts.

              %soft  Percentage of time spent by the CPU or CPUs to service software interrupts.

              %guest Percentage of time spent by the CPU or CPUs to run a virtual processor.

              %gnice Percentage of time spent by the CPU or CPUs to run a niced guest.

              %idle  Percentage of time that the CPU or CPUs were idle and the system did not have an outstanding disk I/O request.
Addendum:

I'm unknown wrote:
That's coming in the review anyway. I'm curious what comes out of it; one of my test goals is, after all, to measure whether a nearly full SSD has noticeable disadvantages compared to a larger one with (a lot of) free space.

OK, I've set this thread to watched; hopefully what I expect will actually happen :-)
 
@foofoobar The problem migrated along with the slot change, so I can say: in its current state, my 970 Evo+ just doesn't want to read more MB/s with dd...

The Crucial, on the other hand, manages its ~3 GB/s in both slots. The link to the CPU is fine - anything else would have surprised me, since the X370 chipset, just like the x4x0 chipsets, can only provide a few PCIe 2.0 lanes.

One thing that would still be important for the evaluation (I logged in kB/s): does iostat internally count kilobytes or kibibytes per second? Some tools have switched their display and show base-1000 values - unfortunately I couldn't find any information on iostat.
 
To get further to the bottom of the Samsung's behavior, fio and blktrace would be needed, but that gets elaborate. Possibly it just doesn't get up to speed when addressed serially.
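If someone wants to try the queue-depth angle before a full fio/blktrace session, a minimal sketch (parameters are just a starting point):

Code:
# dd-like serial access: one outstanding request
sudo fio --name=qd1 --filename=/dev/nvme0n1 --rw=read --bs=1M \
         --direct=1 --ioengine=libaio --iodepth=1 --runtime=30 --time_based
# the same read with 32 outstanding requests
sudo fio --name=qd32 --filename=/dev/nvme0n1 --rw=read --bs=1M \
         --direct=1 --ioengine=libaio --iodepth=32 --runtime=30 --time_based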

For iostat, rtfm applies:

Code:
IOSTAT(1)                    Linux User's Manual                   IOSTAT(1)

NAME
       iostat - Report Central Processing Unit (CPU) statistics and
       input/output statistics for devices and partitions.

[...]

REPORTS
       Device Utilization Report
              [...] Transfer rates are shown in 1K blocks by default,
              unless the environment variable POSIXLY_CORRECT is set,
              in which case 512-byte blocks are used.

              Blk_read/s (kB_read/s, MB_read/s)
                     Indicate the amount of data read from the device
                     expressed in a number of blocks (kilobytes,
                     megabytes) per second. Blocks are equivalent to
                     sectors and therefore have a size of 512 bytes.

              Blk_wrtn/s (kB_wrtn/s, MB_wrtn/s)
                     Indicate the amount of data written to the device
                     expressed in a number of blocks (kilobytes,
                     megabytes) per second.

[...]

ENVIRONMENT
       POSIXLY_CORRECT
              When this variable is set, transfer rates are shown in
              512-byte blocks instead of the default 1K blocks.

[...]

BUGS
       /proc filesystem must be mounted for iostat to work.

       Kernels older than 2.6.x are no longer supported.

       Although iostat speaks of kilobytes (kB), megabytes (MB)..., it
       actually uses kibibytes (kiB), mebibytes (MiB)... A kibibyte is
       equal to 1024 bytes, and a mebibyte is equal to 1024 kibibytes.

[...]

Linux                           OCTOBER 2020                       IOSTAT(1)
 
foofoobar wrote:
To get further to the bottom of the Samsung's behavior
It's getting more interesting: the Crucial also drops off on regions with content, but not on the empty regions.
Code:
root@ubuntu:/home/ubuntu# dd if=/dev/nvme0n1 of=/dev/zero bs=1M count=10000
10000+0 Datensätze ein
10000+0 Datensätze aus
10485760000 Bytes (10 GB, 9,8 GiB) kopiert, 6,34366 s, 1,7 GB/s
root@ubuntu:/home/ubuntu# dd if=/dev/nvme1n1 of=/dev/zero bs=1M count=10000
10000+0 Datensätze ein
10000+0 Datensätze aus
10485760000 Bytes (10 GB, 9,8 GiB) kopiert, 5,99599 s, 1,7 GB/s
root@ubuntu:/home/ubuntu# dd if=/dev/nvme0n1p6 of=/dev/zero bs=1M count=10000
10000+0 Datensätze ein
10000+0 Datensätze aus
10485760000 Bytes (10 GB, 9,8 GiB) kopiert, 3,77585 s, 2,8 GB/s
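The empty-area effect could be narrowed down further by reading a region known to hold data against one that was never written; the offsets here are made up and would have to be taken from the real partition layout:

Code:
sudo dd if=/dev/nvme0n1 of=/dev/null bs=1M skip=0 count=4096        # region with data
sudo dd if=/dev/nvme0n1 of=/dev/null bs=1M skip=600000 count=4096   # assumed unwritten region
# deallocated (trimmed) LBAs are typically answered by the controller without touching the NAND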


It shouldn't be the kernel either, since the Samsung showed the same behavior with a Manjaro image running a 5.10 kernel.

foofoobar wrote:
For iostat, rtfm applies:
Thanks for posting the entire manpage; I had skipped the Bugs section. It could still be posted more nicely, though ;).
Edit: The online version I had read didn't mention kibibytes and mebibytes at all...
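Assuming the BUGS note applies, one of the earlier iostat samples converts like this:

Code:
echo $(( 1747968 * 1024 ))   # 1789919232 bytes/s, i.e. ~1.79 GB/s (decimal)
# which matches the ~1.7-1.8 GB/s that dd reported for the same run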
 
I'm unknown wrote:
It's getting more interesting: the Crucial also drops off on regions with content, but not on the empty regions. [...]

SSDs only really get going with several simultaneous accesses, and there used to be firmware "bugs" where "old" data was read more slowly than "new" data.
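A crude way to see the parallelism effect without fio: several dd readers at different offsets (offsets and counts are arbitrary):

Code:
for off in 0 100000 200000 300000; do
    sudo dd if=/dev/nvme0n1 of=/dev/null bs=1M skip=$off count=5000 &
done
wait
# sum the four throughputs and compare against the single-reader run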
 