smartctl, hdparm, lshw, fdisk, badblocks
Software RAID
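mount /dev/md0 /opt
[root@localhost root]# cp /usr/share/doc/raidtools-1.00.3/raid*.conf.* /etc
[root@localhost root]# ls -l /etc/ |grep raid
[root@localhost root]# vi /etc/raid0.conf.sample
mkraid /dev/md0
mkfs.ext3 /dev/md0
lsraid -A -a /dev/md0
[root@localhost root]# more /proc/mdstat
When the array is no longer in use, simply delete the /etc/raidtab file.
# rm /etc/raidtab

Sometimes you want to know how many disks a server has. Without RAID, fdisk -l shows them directly. Once RAID is configured, that no longer works. So how do you inspect the RAID setup on a server?

Windows: RAID card vendors all provide a RAID management program and drivers. After configuring the RAID, boot into Windows, then download and install the matching program. For the LSI 1064E, for example, it can be downloaded from the vendor's website. Alternatively, HD Tune can show basic RAID information.

Linux: there are two cases, software RAID and hardware RAID.

Software RAID: it can only be inspected through Linux itself. cat /proc/mdstat shows the RAID level, state, and other information.

Hardware RAID: the best approach is the RAID vendor's management tool, if it is installed. These tools come in both command-line and graphical forms. For example, an Adaptec hardware card can be queried with the following command, which prints very detailed information:
# /usr/dpt/raidutil -L all

More often, though, no management tool is installed and you can only rely on Linux itself. There are generally two ways:
# dmesg |grep -i raid
# cat /proc/scsi/scsi
Both show much the same information: the RAID vendor, model, and level. Neither shows details of the individual disks.
[root@coreserv log]# cat /proc/scsi/scsi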
Attached devices:
Host: scsi6 Channel: 02 Id: 00 Lun: 00
  Vendor: IBM Model: ServeRAID M1015 Rev: 2.13
  Type: Direct-Access ANSI SCSI revision: 05
Host: scsi7 Channel: 00 Id: 00 Lun: 00
  Vendor: IBM SATA Model: DEVICE 81Y3672 Rev: SA81
  Type: CD-ROM ANSI SCSI revision: 00

# fdisk -l

Disk /dev/sda: 145.9 GB, 145999527936 bytes
255 heads, 63 sectors/track, 17750 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *           1          13      104391   83  Linux
/dev/sda2              14       17750   142472452+  8e  Linux LVM

# cat /proc/scsi/scsi
Attached devices:
Host: scsi0 Channel: 00 Id: 00 Lun: 00
  Vendor: SEAGATE Model: ST3146356SS Rev: HS09
  Type: Direct-Access ANSI SCSI revision: 05
Host: scsi0 Channel: 00 Id: 01 Lun: 00
  Vendor: SEAGATE Model: ST3146356SS Rev: HS09
  Type: Direct-Access ANSI SCSI revision: 05
Host: scsi0 Channel: 01 Id: 00 Lun: 00
  Vendor: Dell Model: VIRTUAL DISK Rev: 1028
  Type: Direct-Access ANSI SCSI revision: 05

From this output we can see that the server has two disks. They are Seagate drives, model ST3146356SS. If you are familiar with Seagate's model naming rules, you can easily tell that each disk is 146 GB. Combined with the fdisk result (a single 145.9 GB /dev/sda), this shows that the server uses two 146 GB disks in RAID 1.

Different file systems (xfs, reiserfs, ext3) each have their own check and repair tools. Before checking, use dmesg to see whether the log contains hardware I/O errors. If it does, first run fsck to find out whether the file system is at fault. If it is not, use the disk checking and tuning methods described below.
grep "error" /var/log/messages*
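The reasoning applied to the /proc/scsi/scsi listing above (count the entries, then separate the physical drives from the logical volume the controller exposes) can be sketched in Python. This is only an illustration: the sample text is the Dell listing from these notes, the function names are made up for this example, and treating the model string "VIRTUAL DISK" as the marker for a RAID volume is an assumption that matches this one controller and may not hold for others.

```python
import re

# Sample taken from the listing in these notes (whitespace normalised).
SCSI_SAMPLE = """\
Attached devices:
Host: scsi0 Channel: 00 Id: 00 Lun: 00
  Vendor: SEAGATE  Model: ST3146356SS      Rev: HS09
  Type:   Direct-Access                    ANSI  SCSI revision: 05
Host: scsi0 Channel: 00 Id: 01 Lun: 00
  Vendor: SEAGATE  Model: ST3146356SS      Rev: HS09
  Type:   Direct-Access                    ANSI  SCSI revision: 05
Host: scsi0 Channel: 01 Id: 00 Lun: 00
  Vendor: Dell     Model: VIRTUAL DISK     Rev: 1028
  Type:   Direct-Access                    ANSI  SCSI revision: 05
"""

HOST_RE = re.compile(r"Host:\s*(\S+)\s+Channel:\s*(\d+)\s+Id:\s*(\d+)\s+Lun:\s*(\d+)")
VENDOR_RE = re.compile(r"Vendor:\s*(.*?)\s+Model:\s*(.*?)\s+Rev:\s*(\S+)")
TYPE_RE = re.compile(r"Type:\s*(.*?)\s+ANSI")


def parse_scsi(text):
    """Return one dict per 'Host:' entry found in /proc/scsi/scsi text."""
    devices = []
    for line in text.splitlines():
        match = HOST_RE.search(line)
        if match:
            host, channel, dev_id, lun = match.groups()
            devices.append({"host": host, "channel": channel, "id": dev_id, "lun": lun})
            continue
        if not devices:
            continue  # skip the "Attached devices:" header
        match = VENDOR_RE.search(line)
        if match:
            vendor, model, rev = match.groups()
            devices[-1].update(vendor=vendor, model=model, rev=rev)
            continue
        match = TYPE_RE.search(line)
        if match:
            devices[-1]["type"] = match.group(1)
    return devices


def physical_disks(devices):
    """Assumption: the controller names its logical volume 'VIRTUAL DISK'."""
    return [d for d in devices
            if d.get("type") == "Direct-Access" and d.get("model") != "VIRTUAL DISK"]
```

On a live system the text would come from reading /proc/scsi/scsi instead of the embedded sample.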
--------------------------------------------------------------------------------------------------------------
Checking disks with SMART
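SMART is a self-monitoring and analysis technology for disks that was already in widespread use by the late 1990s. While running, every hard disk (IDE and SCSI alike) records a number of its own parameters, including model, capacity, temperature, density, sectors, seek time, transfer rate, and error rate. After a few thousand hours of operation, many of the disk's internal physical parameters will have changed. When a parameter crosses its alarm threshold, the disk is close to failure. The disk still works at that point, but if the user ignores the warning and keeps using it, the disk becomes very unreliable and may fail at any time.

Enabling SMART
SMART works together with the corresponding feature in the motherboard BIOS: to use SMART, you must first enable the relevant setting in the BIOS setup. Motherboards from roughly the Pentium II generation onward generally support SMART. Once it is enabled in the BIOS, the rest is up to the operating system (Windows has no built-in SMART tool, so third-party software is needed). Fortunately, Linux has supported SMART for a long time. If you install Linux in a virtual machine such as VMware, you may see one service report an error at boot: smartd. This service is the SMART daemon (it fails because VMware virtual disks do not support SMART).

smartd is a daemon (a helper program) that monitors disks equipped with Self-Monitoring, Analysis, and Reporting Technology (SMART). SMART lets a disk monitor and report its own condition. One important feature is its ability to predict failure, which lets system administrators avoid data loss.
[root@coreserv log]# rpm -qf /usr/sbin/smartctl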
smartmontools-5.42-2.el6.x86_64
[root@coreserv log]# rpm -ql smartmontools
/etc/rc.d/init.d/smartd
/etc/smartd.conf
/etc/sysconfig/smartmontools
/usr/sbin/smartctl
/usr/sbin/smartd
/usr/sbin/update-smart-drivedb
[root@localhost ~]# smartctl --scan
/dev/sda -d scsi # /dev/sda, SCSI device
/dev/sdb -d scsi # /dev/sdb, SCSI device

This one is a solid-state drive:
[root@localhost ~]# smartctl -i /dev/sda
smartctl 5.43 2016-09-28 r4347 [x86_64-linux-2.6.32-431.el6.x86_64] (local build)
Copyright (C) 2002-12 by Bruce Allen, http://smartmontools.sourceforge.net

=== START OF INFORMATION SECTION ===
Device Model: Kingstek 120GB
Serial Number: AA000000000000001053
LU WWN Device Id: 0 000000 000000000
Firmware Version: 20150818
User Capacity: 120,034,123,776 bytes [120 GB]
Sector Size: 512 bytes logical/physical
Device is: Not in smartctl database [for details use: -P showall]
ATA Version is: 8
ATA Standard is: ACS-2 (revision not indicated)
Local Time is: Tue Jan 8 09:26:49 2019 CST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
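When checking many machines, it can be handy to read the "key: value" lines of the smartctl -i output above with a script. The Python sketch below works on an excerpt of that output. The function names are invented for this example, and the rule used to decide that SMART is usable (one "Available" line plus one "Enabled" line) is an assumption based on the two "SMART support is" lines shown here.

```python
# Excerpt of the smartctl -i output captured in these notes.
SMART_INFO_SAMPLE = """\
=== START OF INFORMATION SECTION ===
Device Model:     Kingstek 120GB
Serial Number:    AA000000000000001053
Firmware Version: 20150818
User Capacity:    120,034,123,776 bytes [120 GB]
Local Time is:    Tue Jan  8 09:26:49 2019 CST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
"""


def parse_smart_info(text):
    """Map each key to the list of values seen (keys can repeat)."""
    info = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")  # split on the first colon only
        if not sep:
            continue  # banner lines such as "=== START OF ... ===" have no colon
        info.setdefault(key.strip(), []).append(value.strip())
    return info


def smart_usable(info):
    """Assumed rule: the device reports SMART as both available and enabled."""
    values = info.get("SMART support is", [])
    return any(v.startswith("Available") for v in values) and "Enabled" in values


def capacity_bytes(info):
    """Extract the byte count from a value like '120,034,123,776 bytes [120 GB]'."""
    return int(info["User Capacity"][0].split()[0].replace(",", ""))
```

Splitting on the first colon keeps timestamps such as "09:26:49" intact in the value.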
----------------------------------------------------------------------------------------------------------------------------------
Checking for bad blocks with badblocks
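The badblocks command checks a disk device for damaged blocks. Specify the device to check when running it; the number of blocks on the device can be given as well.

Read-only test (-s shows progress, -v prints detailed output):
badblocks -s -v /dev/sda1
# badblocks -s -v /dev/sda
Checking blocks 0 to 244198583
Checking for bad blocks (read-only test): ^C0.10% done, 0:04 elapsed
Interrupted at block 272896

Write-mode test (-s shows progress, -w tests by writing, -v prints detailed output):
$ badblocks -s -w -v /dev/sda2
# badblocks -w -s -v /dev/sda1
Checking for bad blocks in read-write mode
From block 0 to 25607577
Testing with pattern 0xaa: ^C0.73% done, 0:03 elapsed

Note: never run the write-mode test on a mounted disk. The -w test overwrites the data on the device.
----------------------------------------------------------------------------------------------------------------------------
Testing with hdparm
yum install hdparm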
Test the disk read speed (cached and buffered reads)
# hdparm -Tt /dev/sda
Show details such as the rotation speed and model
[root@kvm2 ~]# hdparm -I /dev/sda
/dev/sda:
ATA device, with non-removable media
Model Number: ST1000DM003-1ER162
Serial Number: Z4YBD720
Firmware Revision: CC45
Transport: Serial, SATA 1.0a, SATA II Extensions, SATA Rev 2.5, SATA Rev 2.6, SATA Rev 3.0
[root@kvm2 ~]# hdparm -i /dev/sda
/dev/sda:
Model=ST1000DM003-1ER162, FwRev=CC45, SerialNo=Z4YBD720
---------------------------------------------------------------------------------------------------------------------
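As a small aside on the hdparm -i output shown above: its identity line is a comma-separated list of name=value pairs, so it is easy to pick apart in a script, for example to build a disk inventory. The Python sketch below uses only the excerpt from these notes, and the function name is made up for illustration. Real hdparm -i output contains more lines, which this simple pattern has not been checked against.

```python
import re

# Excerpt of the hdparm -i output captured in these notes.
HDPARM_SAMPLE = """\
/dev/sda:

 Model=ST1000DM003-1ER162, FwRev=CC45, SerialNo=Z4YBD720
"""


def parse_hdparm_identity(text):
    """Collect name=value pairs; a value runs up to the next comma or newline."""
    pairs = re.findall(r"(\w+)=([^,\n]+)", text)
    return {name: value.strip() for name, value in pairs}
```

For the sample, the result holds the three keys Model, FwRev, and SerialNo.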
Download and install MegaCli
Download URL: ftp://download2.boulder.ibm.com/ecc/sar/CMA/XSA/ibm_utl_sraidmr_megacli-8.00.48_linux_32-64.zip
or https://docs.broadcom.com/docs-and-downloads/raid-controllers/raid-controllers-common-files/8-07-06_MegaCLI.zip
Download directly on the server:
wget ftp://download2.boulder.ibm.com/ecc/sar/CMA/XSA/ibm_utl_sraidmr_megacli-8.00.48_linux_32-64.zip

How do you check and monitor disk health once the disks are in a RAID array?
https://blog.csdn.net/enweitech/article/details/82893085
https://blog.csdn.net/xinqidian_xiao/article/details/80940306
MegaCli user manual

wget https://docs.broadcom.com/docs-and-downloads/raid-controllers/raid-controllers-common-files/8-07-06_MegaCLI.zip
unzip -d me 8-07-06_MegaCLI.zip
cd linux
rpm -ivh MegaCli-8.07.06-1.noarch.rpm
cd /opt/MegaRAID/MegaCli/
./MegaCli64 -adpcount
./MegaCli64 -AdpAllInfo -aALL
[root@kvm1 MegaCli]# ./MegaCli64 -adpcount
[root@kvm1 MegaCli]# ./MegaCli64 -AdpAllInfo -aALL
[root@kvm1 MegaCli]# ./MegaCli64 -LdPdInfo -aALL
[root@kvm1 MegaCli]# ./MegaCli64 -LDInfo -Lall -aALL
[root@kvm1 MegaCli]# ./MegaCli64 -AdpBbuCmd -aALL
Detailed command-line usage
[root@kvm1 MegaCli]# ./MegaCli64 -AdpAllInfo -aALL

Adapter #0
==============================================================================
Versions
================
Product Name : ServeRAID M5210
Serial No : SV61224052
FW Package Build: 24.9.0-0029

Mfg. Data
================
Mfg. Date : 03/18/16
Rework Date : 00/00/00
Revision No : 04E
Battery FRU : N/A

Image Versions in Flash:
================
BIOS Version : 6.25.03.3_4.17.08.00_0x060E0301
FW Version : 4.290.00-4923
NVDATA Version : 3.1507.00-0011
Ctrl-R Version : 5.10-0710
Preboot CLI Version: 01.07-05:#%0000
Boot Block Version : 3.07.00.00-0002

Pending Images in Flash
================
None

PCI Info
================
Controller Id : 0000
Vendor Id : 1000
Device Id : 005d
SubVendorId : 1014
SubDeviceId : 0454
Host Interface : PCIE
ChipRevision : C0
Link Speed : 0
Number of Frontend Port: 0
Device Interface : PCIE
Number of Backend Port: 8
Port : Address
0 50000397081bdd32
1 50000397081b3932
2 5000c50096e01591
3 50000397a8430476
4 50000397a8430306
5 0000000000000000
6 0000000000000000
7 0000000000000000

HW Configuration
================
SAS Address : 500605b00ba2c280
BBU : Absent
Alarm : Absent
NVRAM : Present
Serial Debugger : Present
Memory : Present
Flash : Present
Memory Size : 1024MB
TPM : Absent
On board Expander: Absent
Upgrade Key : Present
Temperature sensor for ROC : Present
Temperature sensor for controller : Absent
ROC temperature : 58 degree Celsius

Settings
================
Current Time : 8:40:57 1/7, 2019
Predictive Fail Poll Interval : 300sec
Interrupt Throttle Active Count : 16
Interrupt Throttle Completion : 50us
Rebuild Rate : 30%
PR Rate : 30%
BGI Rate : 30%
Check Consistency Rate : 30%
Reconstruction Rate : 30%
Cache Flush Interval : 4s
Max Drives to Spinup at One Time : 2
Delay Among Spinup Groups : 12s
Physical Drive Coercion Mode : 1GB
Cluster Mode : Disabled
Alarm : Disabled
Auto Rebuild : Enabled
Battery Warning : Disabled
Ecc Bucket Size : 15
Ecc Bucket Leak Rate : 1440 Minutes
Restore HotSpare on Insertion : Disabled
Expose Enclosure Devices : Enabled
Maintain PD Fail History : Enabled
Host Request Reordering : Enabled
Auto Detect BackPlane Enabled : SGPIO/i2c SEP
Load Balance Mode : Auto
Use FDE Only : Yes
Security Key Assigned : No
Security Key Failed : No
Security Key Not Backedup : No
Default LD PowerSave Policy : Controller Defined
Maximum number of direct attached drives to spin up in 1 min : 10
Auto Enhanced Import : Yes
Any Offline VD Cache Preserved : No
Allow Boot with Preserved Cache : No
Disable Online Controller Reset : No
PFK in NVRAM : No
Use disk activity for locate : No
POST delay : 90 seconds
BIOS Error Handling : Stop On Errors
Current Boot Mode :Normal

Capabilities
================
RAID Level Supported : RAID0, RAID1, RAID5, RAID00, RAID10, RAID50, PRL 11, PRL 11 with spanning, SRL 3 supported, PRL11-RLQ0 DDF layout with no span, PRL11-RLQ0 DDF layout with span
Supported Drives : SAS, SATA
Allowed Mixing:
Mix in Enclosure Allowed

Status
================
ECC Bucket Count : 0

Limitations
================
Max Arms Per VD : 32
Max Spans Per VD : 8
Max Arrays : 128
Max Number of VDs : 64
Max Parallel Commands : 928
Max SGE Count : 60
Max Data Transfer Size : 8192 sectors
Max Strips PerIO : 42
Max LD per array : 64
Min Strip Size : 64 KB
Max Strip Size : 1.0 MB
Max Configurable CacheCade Size: 0 GB
Current Size of CacheCade : 0 GB
Current Size of FW Cache : 831 MB

Device Present
================
Virtual Drives : 3
  Degraded : 0
  Offline : 0
Physical Devices : 6
  Disks : 5
  Critical Disks : 0
  Failed Disks : 0

Supported Adapter Operations
================
Rebuild Rate : Yes
CC Rate : Yes
BGI Rate : Yes
Reconstruct Rate : Yes
Patrol Read Rate : Yes
Alarm Control : No
Cluster Support : No
BBU : Yes
Spanning : Yes
Dedicated Hot Spare : Yes
Revertible Hot Spares : Yes
Foreign Config Import : Yes
Self Diagnostic : Yes
Allow Mixed Redundancy on Array : No
Global Hot Spares : Yes
Deny SCSI Passthrough : No
Deny SMP Passthrough : No
Deny STP Passthrough : No
Support Security : Yes
Snapshot Enabled : No
Support the OCE without adding drives : Yes
Support PFK : Yes
Support PI : Yes
Support Boot Time PFK Change : Yes
Disable Online PFK Change : Yes
Support LDPI Type1 : No
Support LDPI Type2 : No
Support LDPI Type3 : No
PFK TrailTime Remaining : 0 days 0 hours
Support Shield State : Yes
Block SSD Write Disk Cache Change: Yes
Support Online FW Update : Yes

Supported VD Operations
================
Read Policy : Yes
Write Policy : Yes
IO Policy : Yes
Access Policy : Yes
Disk Cache Policy : Yes
Reconstruction : Yes
Deny Locate : No
Deny CC : No
Allow Ctrl Encryption: No
Enable LDBBM : No
Support Breakmirror : Yes
Power Savings : No

Supported PD Operations
================
Force Online : Yes
Force Offline : Yes
Force Rebuild : Yes
Deny Force Failed : No
Deny Force Good/Bad : No
Deny Missing Replace : No
Deny Clear : No
Deny Locate : No
Support Temperature : Yes
Disable Copyback : No
Enable JBOD : No
Enable Copyback on SMART : Yes
Enable Copyback to SSD on SMART Error : Yes
Enable SSD Patrol Read : No
PR Correct Unconfigured Areas : Yes

Error Counters
================
Memory Correctable Errors : 0
Memory Uncorrectable Errors : 0

Cluster Information
================
Cluster Permitted : No
Cluster Active : No

Default Settings
================
Phy Polarity : 0
Phy PolaritySplit : 0
Background Rate : 30
Strip Size : 256kB
Flush Time : 4 seconds
Write Policy : WB
Read Policy : Adaptive
Cache When BBU Bad : Disabled
Cached IO : No
SMART Mode : Mode 6
Alarm Disable : No
Coercion Mode : 1GB
ZCR Config : Unknown
Dirty LED Shows Drive Activity : No
BIOS Continue on Error : 0
Spin Down Mode : None
Allowed Device Type : SAS/SATA Mix
Allow Mix in Enclosure : Yes
Allow HDD SAS/SATA Mix in VD : No
Allow SSD SAS/SATA Mix in VD : No
Allow HDD/SSD Mix in VD : No
Allow SATA in Cluster : No
Max Chained Enclosures : 16
Disable Ctrl-R : Yes
Enable Web BIOS : No
Direct PD Mapping : No
BIOS Enumerate VDs : Yes
Restore Hot Spare on Insertion : No
Expose Enclosure Devices : Yes
Maintain PD Fail History : Yes
Disable Puncturing : Yes
Zero Based Enclosure Enumeration : No
PreBoot CLI Enabled : No
LED Show Drive Activity : Yes
Cluster Disable : Yes
SAS Disable : No
Auto Detect BackPlane Enable : SGPIO/i2c SEP
Use FDE Only : Yes
Enable Led Header : No
Delay during POST : 0
EnableCrashDump : Yes
Disable Online Controller Reset : No
EnableLDBBM : No
Un-Certified Hard Disk Drives : Allow
Treat Single span R1E as R10 : No
Max LD per array : 64
Power Saving option : All power saving options are disabled
Default spin down time in minutes: 30
Enable JBOD : No
TTY Log In Flash : No
Auto Enhanced Import : Yes
BreakMirror RAID Support : Yes
Disable Join Mirror : No
Enable Shield State : Yes
Time taken to detect CME : 60s

Exit Code: 0x00
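For routine monitoring, the part of this report that matters most is the "Device Present" block, which counts degraded or offline virtual drives and failed disks. The Python sketch below groups the "key : value" lines of an -AdpAllInfo report by section and flags any non-zero problem counter. It is only a sketch under stated assumptions: the sample is two sections copied from the output above, the function names are invented for this example, and the parser assumes every section title is immediately followed by a line of "=" characters, as in this report.

```python
# Two sections copied from the -AdpAllInfo report in these notes.
MEGACLI_SAMPLE = """\
                Device Present
                ================
Virtual Drives    : 3
  Degraded        : 0
  Offline         : 0
Physical Devices  : 6
  Disks           : 5
  Critical Disks  : 0
  Failed Disks    : 0

                Error Counters
                ================
Memory Correctable Errors   : 0
Memory Uncorrectable Errors : 0
"""


def parse_sections(text):
    """Group 'key : value' lines under the title that precedes each '====' rule."""
    sections = {}
    current = None
    previous = ""
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if set(line) == {"="}:
            # The line just before a rule is the section title.
            current = sections.setdefault(previous, {})
        elif current is not None and ":" in line:
            # Limitation: a title that itself contains a colon is also stored
            # as a key of the section before it.
            key, _, value = line.partition(":")
            current[key.strip()] = value.strip()
        previous = line
    return sections


def array_problems(sections):
    """Return the non-zero problem counters from the 'Device Present' section."""
    present = sections.get("Device Present", {})
    watched = ("Degraded", "Offline", "Critical Disks", "Failed Disks")
    return {k: int(present[k]) for k in watched
            if k in present and int(present[k]) != 0}
```

An empty result from array_problems means none of the watched counters is above zero. In practice the text would be the captured output of ./MegaCli64 -AdpAllInfo -aALL.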