those two disks should become one md0 RAID1 mirroring array…

the hardware: what drives to use?

this was NEVER a problem in the past… but things have become increasingly complicated as vendors push the limits of the technology: in the quest for more capacity, RAID-ability may have been lost.

  • rule of thumb: exactly two identical drives are ideal (exact same model, same vendor)
  • larger disks (2TB+X) are not automatically better, because when one disk fails and is replaced, it can take many hours or even days to “sync” all data to the new device (especially if the server is in production use, which slows down the sync)
  • do regular backups of the RAID, because even RAIDs ain’t perfect (when the second harddisk fails while the 16TB harddisk is still syncing… argh) (a backup sketch follows below)
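
a minimal backup sketch (the paths are assumptions, adjust them): mirror the raid mountpoint to a separate, non-raid backup disk with rsync, e.g. nightly via cron:

rsync -aHv --delete /media/user/ext4RAID1/ /mnt/backupdisk/ext4RAID1/
# or automated: crontab -e, then add:
# 0 3 * * * rsync -aH --delete /media/user/ext4RAID1/ /mnt/backupdisk/ext4RAID1/ >> /var/log/raid-backup.log 2>&1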

new factors:

  • Do not use post-2019 WD Red drives
  • Do not use post-2020 desktop drives in an array
    • Unfortunately, it now seems that if you want to run an array, you can NOT use cheap 2020-or-later desktop drives.
    • As of the current state of affairs (mid 2020), WD has said that the whole WD Red line (and also “Green”) is now Shingled Magnetic Recording (SMR) (bad).
    • WD: to get RAID-suitable CMR, buy “WD Red Plus” (check prices on ebay) or “WD Red Pro” (more expensive, also check ebay).
    • never used Seagate Barracudas anyway, but these have now pretty much all moved over to SMR (and been renamed BarraCuda).
    • Seagate: has said that the “IronWolf” and “IronWolf Pro” lines will remain CMR, and the FireCuda line seems all CMR at the moment (guess these will be a bit like the Red Pros, the CMR equivalent of the BarraCuda).
  • check out this vendor (usually only recommends high quality hardware)
  • why/what’s wrong?:
    • Pretty much all these drives are shingled and completely unsuitable for RAID; be warned that array creation and normal use may still appear to work fine.
    • The problem is that replacing a drive may be impossible when one fails!
    • Shingled Magnetic Recording (SMR) is a method of storing data on magnetic disks.
    • It is a further development of Perpendicular Magnetic Recording (PMR) and, in contrast to conventional methods, writes the platters not with strictly separated tracks but with overlapping tracks.
    • The method increases the storage density of the hard drives (but makes the drive unsuitable for RAID).
  • src: raid.wiki.kernel.org -> Timeout Mismatch
  • how to test for RAID-ability:
    • if smartctl -l scterc,70,70 /dev/sdX > /dev/null ; then echo " is good"; else echo "no good, sue vendor"; fi;
      

      scterc[,READTIME,WRITETIME] – [ATA only] prints values and descriptions of the SCT Error Recovery Control settings.

    • These are equivalent to TLER (as used by Western Digital), CCTL (as used by Samsung and Hitachi/HGST) and ERC (as used by Seagate).
    • READTIME and WRITETIME arguments (deciseconds) set the specified values.
    • Values of 0 disable the feature, other values less than 65 are probably not supported.
    • For RAID configurations, this is typically set to 70,70 deciseconds. (src: manpage of smartctl.man.txt)

script that should be started on every boot to compensate for that problem (timeout mismatch):

vim /scripts/raid_correct_smr_drives.sh

#!/bin/bash
# for every sd? disk: try to set SCT Error Recovery Control to 7 seconds
# (70 deciseconds) for read and write; drives that refuse (desktop drives
# without TLER/ERC) get the kernel's SCSI command timeout raised to 180s
# instead (src: raid.wiki.kernel.org -> Timeout Mismatch)
for i in /dev/sd? ; do
    if smartctl -l scterc,70,70 $i > /dev/null ; then
        echo -n $i " is good "
    else
        # no scterc support: give the kernel more patience with the drive
        echo 180 > /sys/block/${i/\/dev\/}/device/timeout
        echo -n $i " is  bad "
    fi;
    # print the drive model next to the verdict
    smartctl -i $i | egrep "(Device Model|Product:)"
    # set readahead to 1024 sectors (512 KiB)
    blockdev --setra 1024 $i
done
# when done, make the script executable and run it at every boot
chmod +x /scripts/*.sh
echo "/scripts/raid_correct_smr_drives.sh" >> /etc/rc.local
# tested on
hostnamectl 
Operating System: CentOS Linux 8
CPE OS Name: cpe:/o:centos:centos:8
Kernel: Linux 4.18.0-240.10.1.el8_3.x86_64
Architecture: x86-64

# tested on
hostnamectl 
   Static hostname: storagepc
         Icon name: computer-desktop
           Chassis: desktop
  Operating System: Debian GNU/Linux 9 (stretch)
            Kernel: Linux 4.9.0-11-amd64
      Architecture: x86-64

# *.deb
apt install smartmontools
# *.rpm
yum install smartmontools
# enable smart monitor service
systemctl start smartd ; systemctl enable smartd
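# optional sketch: have smartd mail warnings to root (assumes a working
# local MTA; the config file is /etc/smartd.conf on Debian and
# /etc/smartmontools/smartd.conf on CentOS/RHEL)
echo "DEVICESCAN -a -m root" >> /etc/smartd.conf
systemctl restart smartd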
# this generates pretty nice output
# define new alias command
alias harddisks='lsblk -o '\''NAME,MAJ:MIN,RM,SIZE,RO,FSTYPE,MOUNTPOINT,UUID'\'''
# run alias
harddisks
NAME      MAJ:MIN RM   SIZE RO FSTYPE            MOUNTPOINT
fd0         2:0    1     4K  0                                             
sda         8:0    0 238.5G  0                                             
├─sda1      8:1    0 230.8G  0 ext4              /         
├─sda2      8:2    0     1K  0                                             
└─sda5      8:5    0   7.7G  0 swap              [SWAP]    
sdb         8:16   0   3.7T  0                                             
└─sdb1      8:17   0   3.7T  0 linux_raid_member           
  └─md127   9:127  0   3.7T  0 ext4              /media/user/ext4RAID1
sdc         8:32   0   3.7T  0                                        
└─sdc1      8:33   0   3.7T  0 linux_raid_member                      
  └─md127   9:127  0   3.7T  0 ext4              /media/user/ext4RAID1
sdd         8:48   0   1.8T  0                                        
└─sdd1      8:49   0   1.8T  0 linux_raid_member                      
  └─md126   9:126  0   1.8T  0 ext3              /media/user/ext3RAID1
sde         8:64   0   1.8T  0                                        
└─sde1      8:65   0   1.8T  0 linux_raid_member                      
  └─md126   9:126  0   1.8T  0 ext3              /media/user/ext3RAID1

# raid1 will be recreated on those disks
# alternatively: list installed block devices
lsblk
NAME   MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sda      8:0    0 931,5G  0 disk
└─sda1   8:1    0 931,5G  0 part
sdd      8:16   0   1,8T  0 disk
sdc      8:32   0  29,8G  0 disk
├─sdc1   8:33   0     1G  0 part [SWAP]
└─sdc2   8:34   0  28,8G  0 part /
sde      8:48   0   1,8T  0 disk
sr0     11:0    1  1024M  0 rom
sr1     11:1    1  1024M  0 rom
zram0  251:0    0   2,6G  0 disk [SWAP]

# create new partitions
# (fdisk and gparted are per default interactive programs;
# a scripted/non-interactive parted sketch follows after the fdisk steps)

# to erase all partition/filesystem meta information (and also all data!) from a harddisk
# WARNING! PICKING THE WRONG DRIVE LETTER WILL VERY EFFECTIVELY DESTROY ALL DATA ON THAT HARDDISK!
dd if=/dev/zero of=/dev/sdX
# let it run for a while, then Ctrl+C (not long, 1-5min should be enough to kill the metadata)

# alternative method
wipefs -a /dev/sdX
mdadm --zero-superblock /dev/sdX

fdisk /dev/sdb
# create new primary partition
n (new partition)
p (primary partition, hit enter, enter, enter to use the full disk as one big partition)
t (hit t then enter: fd followed by enter to label the partition "linux raid autodetect")
(hit L inside fdisk for a list of all partition types)
w (hit w to write changes to disk)

# ... do the same with all your disks.
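
the same partitioning can also be scripted with parted in non-interactive mode (a sketch; /dev/sdX is a placeholder, and this overwrites the partition table, so triple check the device!):

parted -s -a optimal /dev/sdX mklabel msdos
parted -s -a optimal /dev/sdX mkpart primary 2048s 100%
parted -s /dev/sdX set 1 raid on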

# create new partitions with parted
parted -a optimal /dev/sde

GNU Parted 3.2
Using /dev/sde

Welcome to GNU Parted! Type 'help' to view a list of commands.

(parted) mklabel msdos

(parted) mkpart

Partition type?  primary/extended? p

File system type?  [ext2]? ext4

Start? 2048s                                           

End? -1                                                              

(parted) set 1 raid on

(parted) align-check                                                    

alignment type(min/opt)  [optimal]/minimal? optimal                       

Partition number? 1
1 aligned
(parted) print                                                       
Model: ATA WDC WD20EARS-00M (scsi)
Disk /dev/sde: 2000GB
Sector size (logical/physical): 512B/512B
Partition Table: msdos
Disk Flags: 

Number  Start   End     Size    Type     File system  Flags
 1      1049kB  2000GB  2000GB  primary  ext4         raid, lba

(parted) unit chs                                                         
(parted) print                                                            
Model: ATA WDC WD20EARS-00M (scsi)
Disk /dev/sde: 7660837,92,0
Sector size (logical/physical): 512B/512B
BIOS cylinder,head,sector geometry: 7660837,255,2.  Each cylinder is 261kB.
Partition Table: msdos
Disk Flags: 

Number  Start  End           Type     File system  Flags
 1      4,4,0  7660833,32,1  primary               raid

(parted) q                                                                
Information: You may need to update /etc/fstab.

# same with the other disk

(parted) print                                                            
Model: ATA ST2000NM0033-9ZM (scsi)
Disk /dev/sdd: 2000GB
Sector size (logical/physical): 512B/512B
Partition Table: msdos
Disk Flags: 

Number  Start   End     Size    Type     File system  Flags
 1      1049kB  2000GB  2000GB  primary  ext4         raid, lba

(parted) unit chs
(parted) print                                                            
Model: ATA ST2000NM0033-9ZM (scsi)
Disk /dev/sdd: 7660841,128,1
Sector size (logical/physical): 512B/512B
BIOS cylinder,head,sector geometry: 7660841,255,2.  Each cylinder is 261kB.
Partition Table: msdos
Disk Flags: 

Number  Start  End           Type     File system  Flags
 1      4,4,0  7660837,36,1  primary  ext4         raid, lba

# it is possible to use harddisks of different sizes
# (a 2TB Seagate has a different amount of blocks than a 2TB WD)
# start with the smaller device (fewer blocks) plus the "missing" parameter
# then add the larger device to the raid1

# show model of harddisk
hdparm -I /dev/sdd|grep Model
	Model Number:       ST2000NM0033-9ZM175
# show blocksize
blockdev --report /dev/sdd
RO    RA   SSZ   BSZ   StartSec            Size   Device
rw  8192   512  4096          0   2000398934016   /dev/sdd <- this harddisk has 1081856 more bytes

# harddisk: get some health info
# harddisk: overall health status (usually PASSED)
smartctl -H /dev/sdd
# harddisk: show error log
smartctl -l error /dev/sdd
# harddisk: detailed smart info
smartctl -i /dev/sdd
smartctl -a -d ata /dev/sdd
# check the SCT Error Recovery Control (TLER/ERC) settings
smartctl -l scterc /dev/sdd
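# a quick health overview of all sd? disks at once (a sketch):
for d in /dev/sd? ; do
    echo "== $d =="
    smartctl -H $d | grep -i "overall-health"
done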

# show model of harddisk
hdparm -I /dev/sde|grep Model
	Model Number:       WDC WD20EARS-00MVWB0 <- this is a pretty old harddisk "Green" but it also seems to have the               
# show blocksize
blockdev --report /dev/sde
RO    RA   SSZ   BSZ   StartSec            Size   Device
rw  8192   512   512          0   2000397852160   /dev/sde <- this harddisk has 1081856 fewer bytes, will use this (smaller) value for new partitions
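# the size difference can be verified with shell arithmetic
# (values taken from the two blockdev --report outputs above):
echo $(( 2000398934016 - 2000397852160 ))          # -> 1081856 bytes
echo $(( (2000398934016 - 2000397852160) / 512 ))  # -> 2113 sectors of 512B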

# check what was before on those harddisks:
# print contents of the metadata stored on the named device(s)
mdadm --examine /dev/sdd /dev/sde
mdadm: No md superblock detected on /dev/sdd.
mdadm: No md superblock detected on /dev/sde.


# what raid devices are already existing?
ll /dev/md*
brw-rw----. 1 root disk 9, 127 2021-01-23 /dev/md127

/dev/md:
total 0
drwxr-xr-x.  2 root root   60 2021-01-23 .
drwxr-xr-x. 21 root root 3.5K 2021-01-23 ..
lrwxrwxrwx.  1 root root    8 2021-01-23 storagepc:2 -> ../md127

# create new one (if disks are equal model/size)
mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sdd1 /dev/sde1

# create raid on smaller harddisk with "missing" parameter first
mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sde1 missing
mdadm: Note: this array has metadata at the start and
may not be suitable as a boot device. If you plan to
store '/boot' on this device please ensure that
your boot-loader understands md/v1.x metadata, or use
--metadata=0.90
Continue creating array? (y/n) y
mdadm: Defaulting to version 1.2 metadata
mdadm: array /dev/md0 started.
# create filesystem on the newly created RAID1
mkfs.ext4 -L "ext4RAID2" /dev/md0
# mount it
mkdir /mnt/md0
mount /dev/md0 /mnt/md0
ll /mnt/md0
total 20K
drwxr-xr-x. 3 root root 4.0K 2021-01-23 .
drwxr-xr-x. 3 root root 17 2021-01-23 ..
drwx------. 2 root root 16K 2021-01-23 lost+found

#  two options:
# 1) build/sync the raid1 disks now and restore the backed-up data later (every bit gets written 1:1 from sde1 to sdd1)
# or
# 2) restore the backed-up data to md0 now and build/sync the raid1 later
# both work
# both take time
mdadm /dev/md0 --add /dev/sdd1
mdadm: added /dev/sdd1

# watch progress of the raid1 rebuild/sync
# (the system predicts that at the current speed the rebuild/sync takes ~6.5 hours)
watch -n 2 cat /proc/mdstat;

Personalities : [raid1]
md0 : active raid1 sdd1[2] sde1[0]
      1953379328 blocks super 1.2 [2/1] [U_]
      [>....................]  recovery =  1.3% (25814720/1953379328) finish=395.1min speed=81292K/sec
      bitmap: 15/15 pages [60KB], 65536KB chunk

md127 : active (auto-read-only) raid1 sdb1[0] sdc1[1]
      3906886464 blocks super 1.2 [2/2] [UU]
      bitmap: 0/30 pages [0KB], 65536KB chunk
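
# besides /proc/mdstat, mdadm itself reports per-array details
# such as state, rebuild progress and member devices:
mdadm --detail /dev/md0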

# once the user is happy with the raid config:
# DO NOT FORGET TO SAVE THE RAID CONFIG
# TO A PERMANENT CONFIG FILE (so it survives reboot)
mdadm --examine --scan > /etc/mdadm.conf
ARRAY /dev/md/2 metadata=1.2 UUID=xxxxxxxx:xxxxxxxx:xxxxxxxx:xxxxxxxx name=hostname:2
ARRAY /dev/md/0 metadata=1.2 UUID=xxxxxxxx:xxxxxxxx:xxxxxxxx:xxxxxxxx name=hostname.domain:0

# if user wants to create RAID10 (reliability of RAID1 + speed of RAID0)

# /dev/md0 must not be already in use
# maybe this device-name parameter should not be given at all,
# because (as discovered) Debian 9 automatically renumbers the raid devices
# on the next boot, so even attaching a usb device could change the md12... number
# meaning: the next boot could fail into the emergency "Control+D" prompt
# (no ssh login, actual monitor access needed)
# use the UUID in fstab instead of /dev/md0 or /dev/md126,
# then reboot and everything should be fine
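# sketch: print only the filesystem UUID for fstab, independent of the
# md device number (the md name below is an example):
blkid -s UUID -o value /dev/md127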

# list all disks
ls -l /dev | grep sd;
brw-rw----. 1 root disk 8, 16 Jan 15 00:02 sdb
brw-rw----. 1 root disk 8, 17 Jan 15 00:02 sdb1
brw-rw----. 1 root disk 8, 32 Jan 14 23:50 sdc
brw-rw----. 1 root disk 8, 33 Jan 14 23:50 sdc1
brw-rw----. 1 root disk 8, 48 Jan 14 23:51 sdd
brw-rw----. 1 root disk 8, 49 Jan 14 23:51 sdd1
brw-rw----. 1 root disk 8, 64 Jan 14 23:51 sde
brw-rw----. 1 root disk 8, 65 Jan 14 23:51 sde1
brw-rw----. 1 root disk 8, 80 Jan 14 23:51 sdf
brw-rw----. 1 root disk 8, 81 Jan 14 23:51 sdf1
brw-rw----. 1 root disk 8, 96 Jan 14 23:51 sdg
brw-rw----. 1 root disk 8, 97 Jan 14 23:51 sdg1
brw-rw----. 1 root disk 8, 112 Jan 15 00:02 sdh
brw-rw----. 1 root disk 8, 113 Jan 15 00:02 sdh1 # hot spare
# if you want to create RAID10
mdadm --create /dev/md0 --level=10 --raid-devices=6 /dev/sd[b-g]1

mdadm: Note: this array has metadata at the start and
    may not be suitable as a boot device.  If you plan to
    store '/boot' on this device please ensure that
    your boot-loader understands md/v1.x metadata, or use
    --metadata=0.90
Continue creating array? Y
mdadm: Defaulting to version 1.2 metadata

mdadm: array /dev/md0 started.

mdadm --add /dev/md0 /dev/sdh1; # add hot spare

watch -n 2 cat /proc/mdstat; # now you will see it build the array

Every 2.0s: cat /proc/mdstat                                                                                                         Mon Jan 15 01:38:56 2018

Personalities : [raid10]
md0 : active raid10 sdh1[6](S) sdg1[5] sdf1[4] sde1[3] sdd1[2] sdc1[1] sdb1[0]
      5860147200 blocks super 1.2 512K chunks 2 near-copies [6/6] [UUUUUU]
      [>....................]  resync =  1.7% (102443584/5860147200) finish=532.3min speed=180266K/sec
      bitmap: 44/44 pages [176KB], 65536KB chunk

# now you have a new device md0 that represents your raid10 array

# lets format this new array and label it RAID10
mkfs.ext4 -L RAID10 /dev/md0

# it's a good idea to give it a meaningful label (it can also be changed later with e2label)
e2label /dev/md0 "RAID1DATA"

Add entry to /etc/fstab:

# get uuid of raid
blkid
# the matching line in /etc/fstab then looks like this:
UUID=c308208c-46f7-4c46-b95a-dda3fac36bdb /run/media/username/RAID10 ext4 noatime,rw 0 0
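# sketch: test the new fstab entry without rebooting (the mountpoint is an example)
mkdir -p /run/media/username/RAID10
mount -a
findmnt /run/media/username/RAID10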

if you use this mount point, the CentOS MATE Desktop CAJA (File Manager) will show the array, and you can mount it from the GUI by a click on RAID10. NEAT!

Create mdadm config file:

# test config output
root@Microknoppix:/home/knoppix# mdadm --examine --scan
ARRAY /dev/md/0 metadata=1.2 UUID=53ca5f4e:c16a6706:a216c510:5122e016 name=Microknoppix:0

# save config output to config file
root@Microknoppix:/home/knoppix# mdadm --examine --scan >> /etc/mdadm/mdadm.conf
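# on Debian it is a good idea to also refresh the initramfs afterwards,
# so the array gets assembled early during boot (a sketch):
update-initramfs -u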

Note that the array is actually started by /etc/rc0.d/K13mdadm-raid (via mdadm -A -s) or by the mdrun command

# test what would happen on reboot
/etc/rc0.d/K13mdadm-raid start

# sample output
[ ok ] Assembling MD array md0...done (started [2/2]).
[ ok ] Generating udev events for MD arrays...done.

# for auto mount of raid at boot
# create mountpoint so it shows up in filemanager (debian 9 and 10)
mkdir /media/user/ext4RAID1

# create mountpoint so it shows up in filemanager (Fedora/Redhat/CentOS)
mkdir /run/media/user/ext4RAID1

# fstab should look something like this
# ALWAYS USE UUIDs in fstab!
# md126 MIGHT CHANGE DURING NEXT BOOT to md127!
cat /etc/fstab
# <file system>                                 <mount point>         <type>  <options>         <dump>  <pass>
UUID=c308208c-46f7-4c46-b95a-dda3fac36bdb	/media/user/ext3RAID1 ext3    defaults		0	2
UUID=1fb6ee8e-c472-4095-bbf7-32d7e9f6d303	/media/user/ext4RAID1 ext4    errors=remount-ro	0       2
# dump = flag for the dump(8) backup utility (rarely used nowadays)
# pass = the order in which filesystem checks are carried out at boot
# /dev/sda (assumed root device) should have 1
# all others 2 (0 = skip the check)

# regular filesystem checks are important
# even when they take some time (ext4 fast, ext3 slow)
tune2fs -c 3 -i 7 /dev/md127
tune2fs -c 3 -i 7 /dev/md126
tune2fs 1.43.4 (31-Jan-2017)
Setting maximal mount count to 3
Setting interval between checks to 604800 seconds (7 days)
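# verify the new settings (the md device number is an example):
tune2fs -l /dev/md127 | egrep "Maximum mount count|Check interval"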
# to watch your raids at work
watch -n 2 cat /proc/mdstat

source: https://wiki.debian.org/SoftwareRAID

how to delete a raid:

su - root;
sync;
umount /media/user/mountpoint
mdadm --stop /dev/mdX
mdadm --remove /dev/mdX
# zero out the md superblock on every former member device/partition
mdadm --zero-superblock /dev/sdX
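# sketch: loop over all former member partitions
# (device names are examples, triple check before running!)
for p in /dev/sdd1 /dev/sde1 ; do
    mdadm --zero-superblock $p
done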

manpages:

smartctl.man.txt

parted.man.txt

mdadm.man.txt

Links:

https://raid.wiki.kernel.org/index.php/Linux_Raid

https://www.hostakers.com/kb/how-to-install-smartctl-utility-on-centos-7/

https://computingforgeeks.com/configuring-raid-storage-on-centos-linux/

https://arstechnica.com/gadgets/2020/05/western-digital-gets-sued-for-sneaking-smr-disks-into-its-nas-channel/

WD Red SMR vs CMR Tested Avoid Red SMR
