[clug] S.M.A.R.T message for hd failure
sjenkin at canb.auug.org.au
Thu Jan 31 22:59:02 GMT 2008
Joshua Worth wrote on 31/1/08 9:28 PM:
> It doesn't look good, but I saw on a forum that it might be lying to me
> but I can't be sure. This message was appearing when I had an extra 80
> gigabyte drive in my computer, but after taking that out and doing some
> tests, it turned out to be fine. I am using OpenSuSE 10.3 X86_64
> Is there a way I could fix this without destroying any data?
> Here is the forum I looked at:
'S.M.A.R.T.' stands for "Self-Monitoring, Analysis, and Reporting
Technology".
In the last year there have been two major studies published on the
failure rates of newish technology disk drives.
[Why 'newish' drives? You have to run drives for 5 years to collect the
Notably, SMART addresses mechanical faults and cannot warn of or report on
electronics failures, which are always sudden and catastrophic.
"The Google team found that 36% of the failed drives did not exhibit a
single SMART-monitored failure."
* Keep an eye on the SMART output. It *will* tell you about some failing
  drives.
* Don't forget that for about one third of failures, SMART gives no
  warning at all.
* Old drives fail much more...
* System/Controller/Software faults that scramble data are as likely as,
  or more likely than, drive failure.
Two blogposts that pull those studies together:
"Everything You Know About Disks Is Wrong" and "Google’s Disk Failure
CMU paper: "Disk failures in the real world: What does an MTTF of
1,000,000 hours mean to you?"
[They looked at 100,000 drives]
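
As a rough illustration of the question in that paper's title, a quoted
MTTF can be turned into a nominal annualised failure rate. This is only a
back-of-envelope sketch: it assumes a constant failure rate and drives
powered on around the clock, and the function name is my own, not from
the paper.

```python
HOURS_PER_YEAR = 24 * 365  # 8760, assuming the drive runs continuously


def nominal_afr(mttf_hours):
    """Nominal annualised failure rate implied by an MTTF figure.

    Assumes a constant failure rate, so AFR is roughly
    (hours per year) / MTTF.
    """
    return HOURS_PER_YEAR / mttf_hours


afr = nominal_afr(1_000_000)
print(f"Nominal AFR: {afr:.2%}")  # prints "Nominal AFR: 0.88%"
print(f"Expected failures per 1,000 drives per year: {afr * 1000:.1f}")
```

Read this way, a 1,000,000 hour MTTF is a statement about a population of
drives (under 1% failing per year on paper), not a promise that any one
drive will last 114 years.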
Google: "Failure Trends in a Large Disk Drive Population"
As another post advised - run the Linux SMART utility to check the drive.
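
A minimal sketch of that check, assuming the smartmontools package is
installed (it provides `smartctl`) and that the drive is `/dev/sda`, which
is only a placeholder for your own device. `smartctl -H` prints the drive's
overall health self-assessment. The helper names below are hypothetical,
and the parser only understands the ATA-style result line.

```python
import subprocess

# Line that `smartctl -H` prints for ATA drives, followed by the verdict.
RESULT_PREFIX = "SMART overall-health self-assessment test result:"


def parse_health(smartctl_output):
    """Return the verdict (e.g. 'PASSED') from `smartctl -H` text, or None."""
    for line in smartctl_output.splitlines():
        if line.startswith(RESULT_PREFIX):
            return line[len(RESULT_PREFIX):].strip()
    return None


def check_drive(device="/dev/sda"):
    """Run `smartctl -H` on a device (usually needs root) and parse the result."""
    # smartctl reports status through non-zero exit codes, so the exit
    # status is deliberately not treated as an error here.
    proc = subprocess.run(["smartctl", "-H", device],
                          capture_output=True, text=True)
    return parse_health(proc.stdout)


# Example, run as root:
#   print(check_drive("/dev/sda"))
```

`smartctl -a /dev/sda` shows the full attribute table as well. Given the
Google figure quoted above, treat a PASSED result as "no warning yet", not
as proof the drive is healthy.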
Make sure you have second, safe copies of important data.
Steve Jenkin, Info Tech, Systems and Design Specialist.
0412 786 915 (+61 412 786 915)
PO Box 48, Kippax ACT 2615, AUSTRALIA
sjenkin at canb.auug.org.au http://members.tip.net.au/~sjenkin