DPS Computing Uncovers Solution to OS X ‘Superbug’
Following numerous hours of investigation by DPS Computing into the OS X ‘superbug’, which causes repeated kernel panics with no apparent common cause, we have uncovered a solution that has worked in our testing and corrects a critical bug in OS X that Apple have been attempting to deal with for the past 18 months.
Firstly, there’s a point that we need to make: it’s doubtful that this is the only solution, or necessarily the best. However, our tests have shown that it should work in a lot of circumstances, if not the majority.
As some of you will remember, we reported two days ago on the plight of MacBook Pro users experiencing persistent kernel panics out of the blue. Following in-depth research into this issue, we discovered that for the first six months Apple denied that the problem was caused by Apple hardware or software, instead indicating that it must be third-party software and hardware that users were adding to their machines. Around 12 months ago, Apple appear to have finally conceded that the aptly named ‘Black Screen of Death’ (silent kernel panics) was due to a low-level hardware and OS X bug.
It was initially indicated that a fix for the bug would be available in the 10.7.2 update to OS X. Users eagerly anticipated this update to free their once mighty machines of this catastrophic ailment. However, disappointment was to come: 10.7.2 was released, everybody updated and… nothing changed.
Many users vented further frustration on Apple’s community support forums, and Apple made another announcement there: the problem was likely being caused by a firmware issue, and a firmware update would follow in the near future. Firmware update 2.6 was this time supposed to be the saviour of MacBook Pros all over the country experiencing this weird and disabling bug in Apple’s flagship operating system. But… you guessed it. Firmware update 2.6 came, and it went. And still, things were no better.
As mentioned in our previous article, one solution that seemed to work, albeit temporarily, was the freeware program gfxCardStatus, used to stop the dynamic switching of the graphics card. The main issue with this fix was that, to have a chance of temporarily fixing the issue, you had to restrict the Mac to using only the integrated Intel graphics, which is the less powerful of the two graphics processors shipped with the MacBook Pro (the other being the discrete NVIDIA card). Many users were, however, happy to take a performance hit as they were absolutely desperate to stop the excessive kernel panics, which rendered their £2,500 Macs of little more use than a £2 paperweight.
Of the many temporary solutions shared in Apple’s community support forums, this is without doubt the one with the highest success rate.
We tested out the gfxCardStatus fix on one of our machines and, lo and behold, it worked. OK, so it wasn’t perfect, but we knew that we’d all be able to wait a little while for Apple to sort the problem out.
However, when we continued testing this solution over the following two days, a flaw appeared which would render it useless, at least in some cases. It is not clear whether what we experienced would happen in all other cases, but it is a safe bet that it is not an isolated phenomenon.
After 24 hours of uptime with the gfxCardStatus solution implemented, we rebooted the MacBook Pro. OK, so nothing drastic about that. The Mac booted up fine and we logged in as per usual. gfxCardStatus had reverted to dynamic switching, so we changed it back to the ‘Integrated only’ option, as it had been set prior to the reboot. And then the display went ‘crazy’. Remember the slider puzzles you used to play with as a kid? Well, that’s exactly what OS X did with our screen. It was cut up into tiny pieces, mixed up, and put back together. And as if that wasn’t enough, the OS decided to throw in some ‘interference’ for good measure – the kind that you used to see on analogue TV when you weren’t quite tuned into the channel correctly.
So a couple of reboots and lots of tinkering later, we finally managed to get back on to the ‘Discrete only’ option using the NVIDIA card and the display returned to normal. Great, except we were back where we started, and the problems were now much worse. The integrated graphics card played havoc with the display, rendering it useless. And the discrete graphics card worked perfectly for 5 minutes at a time before promptly causing a kernel panic and the Black Screen of Death (similar to the Blue Screen of Death in Windows, except this one doesn’t have any text!).
Hmmm, so we were a bit unhappy at this stage, as you can imagine. Rather than settling for this temporary solution, which although good was by no means good enough for any machine used in a production environment, we decided to set about finding a more permanent solution, seeing as Apple are showing few signs that they are any closer to solving the problem than they were 18 months ago.
One recommendation from Apple was to run the AHT – Apple Hardware Test. This would indeed be useful as it would allow us to identify whether the panics are being caused by hardware or software – and in this case, software referring to OS X. Great, we started to follow the recommendation. So, firstly we are informed that we need to shut down (and it must be a shut down) followed by a power up (not, I repeat not a reboot!). Before the Apple logo appears we were told to hold down D. This would start the AHT for us.
So, I followed these instructions. While waiting for the magic to happen, I did question what holding the ‘D’ key would do. Normally there is another key used to create an interrupt signal or similar to the OS to tell it we don’t want a normal boot. So I was sat there holding the ‘D’ key with one finger while using my other hand to read more on the issue on an iPad.
5 minutes passed and we’ve gone from a black screen to an off white screen. I’m still holding the ‘D’ key. And I’m still reading about the problem on the iPad. A couple more minutes passed, by which stage my finger had turned a shade of bright red and my arm was crying out for a rest.
Another couple of minutes, and we’ve got a log in screen. The same old login screen as normal. Not the special 16bit lookalike computer icon that was indicated in the instructions. So we tried a couple more times, same result. Changing how and when the key was pressed each time.
Convinced that this advice seemed a little bit woeful, we tried Cmd-D upon boot. Same effect, nothing. We do a little more (soul) searching on the iPad and I discover an excellent piece of information, which works for any users of Lion (10.7.x). Excellent, so this isn’t the AHT test via your own hard drive or DVD, this time you hold Option (Alt) + D which boots the online AHT from Apple.
We follow through the same steps using the new key combination and ta-da!! It starts to load the AHT. Just before I can crack a smile…… we gather around the MacBook and inspect the stop error that has just been displayed saying ‘AHT cannot run on this system’.
Great, normal reboot is initiated and I plan to use my 5 minutes of uptime to read a bit more advice from Apple. The reasons for this error are a) out of date version of OS X – i.e. you need to upgrade to 10.7.4 and b) out of date firmware – i.e. you need to upgrade to 2.6. The only problem with this advice and reasoning for the error is that the machine was running both OS X 10.7.4 and the 2.6 firmware update. Two pieces of advice from Apple, and two dead ends.
So with a bit more digging on AHT we discover its convoluted and apparently unnecessary complexity. AHT has been included in the past few major versions of OS X, and there is also a version on all new MacBooks from the past few years. However, if at any time during your ownership of your MacBook Pro you have upgraded to a new major version of OS X, then AHT won’t be there any more. But if you’ve got your copy of OS X on DVDs, AHT will be present on Disc 1, unless you have Snow Leopard, in which case it’ll be on Disc 2.
Keeping up so far? We’re not done yet. If, say, you upgraded online (via the App Store) to Lion and you’re using your original Snow Leopard discs, it’ll probably still work. However, some Snow Leopard files will be copied across during the AHT process, which is completely unnecessary and may make Lion go wonky. The logical solution to that problem would be to use the AHT included on the Lion disc. Well, no: as you can see from this entire situation, absolutely nothing is following logic. If you have the Lion DVD, great, but you don’t have AHT on it. And if you downloaded Lion from the App Store, you definitely don’t have AHT on it. One suggested solution, to ‘burn the image to a disc’, was evidently never going to work, as the App Store version of OS X isn’t provided as an image; it’s an application.
Next, the reason why Lion doesn’t include AHT is that, from Lion onwards, it has all changed to exclusively the online AHT. So if you haven’t got an Internet connection, you would probably start crying at this point. But don’t worry, many more of us will be crying along with you as we discover that the online AHT only works if your MacBook Pro originally shipped with Lion; upgrades don’t count.
So after a couple of hours of a fruitless AHT search we give up on that. It might not even necessarily give us any answers, so we move on.
Now we start investigating tools…… no, not that kind of tools (although you could be forgiven for thinking that with the runaround Apple is giving the community), software tools.
As experienced Mac users will know, there isn’t the plethora of free maintenance and utility software that is available for Windows. The apparent reasoning behind this is that Macs don’t break, well, at least not until now. There are a few paid-for tools, some of which tease you with a ‘free download’, perform a scan, declare your MacBook a write-off and offer to fix all your problems in 10 minutes at the click of a button – after you pay them £50-£75.
So there’s ToolKit, ToolTip, Tool… something, anyway. I can’t quite remember the exact name off the top of my head now, but it’s the $99 Pro version we were looking at (the cheaper sister product, the Deluxe version, is given to AppleCare customers free of charge, apparently because it’s ‘that good’). One of the Pro version’s selling points is that it goes ‘much further than Apple’s Hardware Test’. Great, no need to worry about the AHT saga now! But the price tag… well, we’ll keep this in reserve in case we get even more desperate than we already are.
As for free system utilities, well, there aren’t many. The only one that we found to be of any potential use in this situation was OnyX, which has many positive reviews and endorsements across the Internet.
In my complete and utter desperation at this point, I concede that I’ll just have to go through the 5 minute kernel panics for the next 25 years (or at least until we replace the affected MacBook Pro) and decide the system could do with a general major cleanup anyway. After all, what more harm can it do?
And now…..
The Solution!
I thought I’d make that stand out for the non-techie users who just want it fixed and don’t care why, how or about the amazing adventure we embarked upon to get to this point.
So, reboot, hold down the shift key on the keyboard, and we get it into ‘Safe Boot’ (similar to Windows Safe Mode). And yes, most of your things will be disabled, including all non Apple start up items. But on the plus side, the system runs a lot faster.
So, we then start OnyX. You’ll be greeted by a dialog asking you to check S.M.A.R.T. status, which it highly recommends. And so do we. It should only take a few seconds, and then we’re onto the next stage, assuming there are no problems. If you do encounter problems at this point, follow the repair instructions given by OnyX.
All being well, you’ll be greeted by a second dialog box. This time it’s to verify the integrity of the start up volume. Again, this is highly recommended by OnyX and by us. This is the same process that ‘Disk Utility’ follows to check the start up volume integrity. If any errors are shown, it’s best to reboot into the recovery area, run Disk Utility and repair the start up volume and permissions. Not doing so when errors are shown, and then using the tools in OnyX, can, for want of a better term, ‘brick’ your system, and believe me, that is not good. Assuming everything is fine, we can then carry on.
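For readers who prefer Terminal, the pre-flight checks described so far (Safe Boot, S.M.A.R.T. status, start up volume verification) can be sketched in a few lines of Python. This is only a rough sketch and not what OnyX itself does: it assumes the boot disk is `disk0`, that `sysctl -n kern.safeboot` prints 1 when in Safe Boot, and that `diskutil info` prints a ‘SMART Status’ line. The helper names are our own for illustration.

```python
import subprocess
import sys


def parse_smart_status(diskutil_info_output):
    """Return the value of the 'SMART Status' line from `diskutil info` output, or None."""
    for line in diskutil_info_output.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "SMART Status":
            return value.strip()
    return None


def is_safe_boot(sysctl_output):
    """Interpret the output of `sysctl -n kern.safeboot` (assumed: 1 means Safe Boot)."""
    return sysctl_output.strip() == "1"


def run(cmd):
    """Run a command and return its standard output as text."""
    return subprocess.run(cmd, capture_output=True, text=True, check=False).stdout


# Only attempt the real commands on a Mac; elsewhere the helpers can still be imported.
if __name__ == "__main__" and sys.platform == "darwin":
    print("Safe Boot:", is_safe_boot(run(["sysctl", "-n", "kern.safeboot"])))
    # Assumption: the internal boot disk is disk0.
    print("SMART status:", parse_smart_status(run(["diskutil", "info", "disk0"])))
    # Verification only reports problems; it does not repair anything.
    print(run(["diskutil", "verifyVolume", "/"]))
```

If the verification step reports errors, the advice above still applies: repair from the recovery area before going any further.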
We’re then greeted by an Administration authorisation dialog box. This allows us to give OnyX permission to do its magic. Enter your username and password.
Then we arrive at the main menu:
Click on the cleaning option, at which point you should be presented with a screen similar to the following:
On this first tab, the system tab (shown above) ensure that all the options are ticked in this ‘Delete the cache’ section. By default some are left unticked. Tick them all. Don’t worry, a cache is just temporary files that can speed up common tasks and applications after you’ve used them a few times. Their removal will not harm your system. Click execute and wait for the function to complete.
Then move onto the User tab:
Select everything again. As above, nothing bad is going to happen (let’s be honest, things can’t really get any worse, can they?). Click execute and wait for the function to complete.
Move onto the Internet tab:
Now some people are attached to their cookies, browser history etc. If you are, then don’t worry, you don’t have to do anything on this tab. But equally, you could also take this opportunity to sort out the thousands of temporary files for your different browsers, a good many of which won’t be used any more. Either way, if you decided to delete them, your system is going to be fine. Click execute and wait for the function to complete or move onto the next tab.
Move onto the Fonts tab:
Again, tick all the options and delete all the font caches. Ignore the doomsday warning about apps taking unusually long to load the first time after clearing all these caches. Yes, apps will take a bit longer to load the first few times… however, if we don’t follow these steps, the chances of us ever using any of our apps productively again are minimal. Click execute and wait for the function to complete.
Close OnyX and empty the trash. Don’t worry about the number of files that are getting junked now. We had over 100,000 – this is normal if you’ve been using your MacBook a long time and / or you haven’t cleared your caches recently (or ever).
Reboot out of ‘Safe Boot’.
The boot and log on will take a little longer than usual, as will loading your different apps the first few times. But… the good old persistent kernel panics should now be resolved. No need to change the gfxCardStatus settings; feel free to have dynamic switching on. You shouldn’t have any more problems, and this setting does give the best performance and power consumption for your system.
Conclusion.
Basically, corruption can occur in caches (especially system and kernel caches, although it can be others), which can cause persistent kernel panics. Cleaning your caches every once in a while is a good idea to maintain performance anyway. Delete the cache, and you delete the corruption. Our test machine has stopped having 5 minute kernel panics and, touch wood, there haven’t been any since.
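If you are curious how much has built up before you clear anything, a read-only tally of the cache folders is easy to sketch. The Python below deletes nothing; it only counts files and adds up sizes. The two folder locations are an assumption on our part (the usual per-user and system-wide cache folders), and the article does not establish exactly which locations OnyX cleans.

```python
import os
from pathlib import Path

# Assumed locations: the usual per-user and system-wide cache folders on OS X.
# Exactly which folders OnyX clears is not something we have confirmed.
DEFAULT_CACHE_DIRS = [Path.home() / "Library" / "Caches", Path("/Library/Caches")]


def cache_summary(cache_dir):
    """Count the files under cache_dir and add up their sizes, deleting nothing."""
    file_count = 0
    total_bytes = 0
    for root, _dirs, files in os.walk(cache_dir):
        for name in files:
            path = os.path.join(root, name)
            try:
                total_bytes += os.path.getsize(path)
                file_count += 1
            except OSError:
                # Unreadable or vanished file: skip it rather than abort.
                continue
    return file_count, total_bytes


if __name__ == "__main__":
    for directory in DEFAULT_CACHE_DIRS:
        count, size = cache_summary(directory)
        print(f"{directory}: {count} files, {size / 1_000_000:.1f} MB")
```

Running it before and after the OnyX steps gives a rough idea of how much was removed.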
This problem seems to particularly affect mid 2010 MacBook Pro 15″ models (manufactured between April 2010 and February 2011) running OS X Lion (10.7.x). It isn’t clear why this very specific model (MacBookPro6,2) would be affected in particular; maybe it is just coincidence.
Either way, this fix should work if you are experiencing the same problem, regardless of model.
The only downside, which is very minor and temporary, is that because the caches have been cleared, the first boot and the first few launches of each application are going to be a bit slower. But it’s a price worth paying for turning your MacBook Pro back from a paperweight into the excellent productive machine that it should be! This is by far the best and most permanent solution to the problem out there currently.
At one point, David, our Managing Director, was so disheartened he was contemplating replacing all the office Macs with PCs (perish the thought!). But crisis averted, with next to no help from Apple ;). Reading the community forums you can see that Apple support and the ‘Genius’ bar weren’t much help to users who accessed their services.
Faith in Macs restored, let’s continue with the productivity! Hopefully it’ll be at least another 20 years before another critical bug occurs in Mac OS!
Very good article David! Very good.
I too am a little disheartened with Apple’s failures recently. I’ve gone through many products that have had problems with Apple – in fact, I’ve noticed my iPad has bleeding issues but is only visible when watching a movie for example, because of the border often due to the resolution of the movie.
I’m getting pretty tired of, what I consider, poor quality from Apple for such expensive products. I expect master perfection from any expensive product, and I would absolutely contemplate sacking anyone who is responsible for releasing products that are absolute crap. While Apple products are great, if they don’t work as they should, then they’re nothing more than crap.
If Apple continues and I keep hearing reports of faulty Apple hardware (heck, even users of the new MacBook Pro have had some issues I’ve read), my new computer purchase will unfortunately be on a Windows 8 computer. I love Macs because of how great the design, hardware and build-quality is and the operating system, but if I can’t buy an Apple computer without having hardware issue even though I absolutely love Macs (they’re great machines), then I have no choice but to opt for something else.
Companies need to prioritise the quality of their products and to make sure THEY’RE ABSOLUTELY PERFECT, and to have GOOD COMMUNICATION with EVERYONE IN THE COMPANY and with CUSTOMERS for when there are problems. I tell you David, if I owned the company I work for, the biggest dismissal would be for either bad customer service or product / service issues. It’s the most important thing. Apple gets a lot of things right, and I share their vision on things like end user experience, design, amazing customer service and caring about what you’re doing, but it seems they can do better on some things – one of which is public image. They don’t do well by just hiding in Cupertino without any responses when there are quality control issues with new products. No, that’s how you f*ck up your PR and customer loyalty and satisfaction. Steve Jobs was wrong on some things when he was running Apple, and this is one of them.
I know I’m right, because time and again I know what customers want and I’ve seen how easy it is to both resolve a customer issue by understanding the customer and people more broadly, and create a customer issue because of incompetence to do things properly. If I ran the company I worked for, I’d be employing customer relations experts and psychologists just to train technical support and customer relations staff how to deal with customers as best as they can.
Thanks Ben :-).
Yes, you make some very valid points. I mean obviously you can’t blame Apple for there initially being an issue – after all no piece of hardware / software is infallible.
However, the two absolutely monumental mistakes I believe Apple made in this instance were a) denying the problem was anything to do with Apple hardware or OS X, citing third party hardware or software, when it was clear from the substantial number of user reports in Apple’s support community that all the evidence was pointing towards an Apple hardware and / or OS X issue. And then, if that wasn’t bad enough, b) 12 months after finally admitting it was an Apple hardware and / or OS X problem there is still no solution provided by them and there isn’t even any progress in the right direction.
In fact, most of the ‘steps’ provided by Apple, even the diagnostic ones either don’t work or are so convoluted you can’t expect your average user to be able to complete the steps (you only have to look in the AHT part of the article to understand what I mean).
I agree, Mac users expect higher quality products than this, much higher. Because of the price tag they also expect fewer issues and faster responses to issues, as Macs can be 4x, 5x, 6x, even 10x more expensive than a PC/laptop.
The fact that this issue even happened surprised me, but that wasn’t what annoyed me – it was the fact that, 18 months after this was brought to the company’s attention, they a) outright denied it for 6 months (note, they didn’t say it was ‘likely’ third party issues, they said it was third party issues – which now appears to be completely false) and b) 12 months on, with all the resources available to them, there isn’t any solution from Apple.
I mean, they’ve tried with the OS X and firmware updates, promising a couple of times that update blah will fix the issue, but none of the updates have made the slightest bit of difference for most users. And it honestly does surprise me how little progress Apple have managed to make on these issues!
By the way, if anyone from Apple is reading, I’d be happy for you to engage my services as a consultant ;).
It is likely someone at Apple does read what people say online to understand what people are saying about Apple and their products. I would assume they have an entire department to update the executive team as to PR related aspects and issues – for example, what is said online ;).
Yeah, you would hope so. Haha, probably ;-). And if not, they need to, to stay in touch with their customers!
[…] reported by our friends at DPS Computing, it appears the 2010 MacBook Pros that have a hybrid graphics solution, comprising of both […]
You bloody Legend! I’ve had the exact same experience over the last few months. And dealing with Apple has been a pain. So I was going into Apple tomorrow to get my logic board replaced, which would have cost me $700! I just did what you said and now dynamic switching works. Thanks heaps mate. Made my day!
Thank you!! Glad that we managed to help out :). Yeah, we’ve tried dealing with Apple but they are less than eager to help – well, without it costing lots of money for new parts we probably don’t need (judging by the reports from Apple’s support forums). Excellent, glad it worked for you :). Hopefully now you will have stress free Mac use!!
If you have any more issues then feel free to come back, we’ve covered the problem and solutions quite extensively over a number of articles :).
This solution worked long enough for me to create a time machine backup!
Then it started again ;(
The kernel panics I am getting are:
Kernel Extensions in backtrace:
com.apple.NVDAResman(8.0)[6A699209-FB98-316B-A3C0-
com.apple.nvidia.nv50hal(8.0)[9CD95A4A-FD94-349E-A4B6-
com.apple.GeForce(8.0)[91C40470-82BA-329A-A9D7-
It’s pretty clear that something about Mountain Lion doesn’t agree with the NVIDIA graphics card. There is no point in talking to either company about it – they don’t have a clue! When I was on the phone with Apple, they were going to charge me $50. Even when I linked them to this article : http://support.apple.com/kb/TS4088
I am not sure what to do at this point, I have tried just about everything.
Hi Taylor,
Glad that the solution worked for you so that you could at least backup your system but sorry to hear that it didn’t solve the problem completely.
Just wondering, have you seen one of the follow up articles to this one? – http://www.dpscomputing.com/blog/2012/07/09/update-os-x-bsod-superbug-gfxcardstatus-2-0/ .
On our office computers we too still experienced problems (albeit less frequently) when *only* implementing this solution. However, using gfxCardStatus – in the way described in the article that I’ve linked to above – has solved the problem for our mid-2010 MacBook Pro.
Yeah, the kexts that are causing you a problem definitely indicate that the nVidia drivers are the source. Following the instructions in the follow-up article (above) will hopefully ‘solve’ the problem completely for you. However, it obviously means effectively ‘disabling’ the nVidia graphics card – but we thought this was a very small price to pay considering the regular kernel panics which rendered the machines useless.
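To show how we read a backtrace like the one you posted, here is a rough Python sketch that pulls the kext bundle IDs and versions out of the ‘Kernel Extensions in backtrace’ section and flags the ones that look NVIDIA-related. The function names and the list of name hints are our own assumptions for illustration, not any Apple tool, and the sample lines are the truncated ones from your comment, left exactly as posted.

```python
import re

# Matches lines such as "com.apple.GeForce(8.0)[91C40470-...": bundle ID, then version.
KEXT_LINE = re.compile(r"^\s*([A-Za-z0-9_.]+)\(([^)]+)\)")

# Assumption: these substrings are enough to spot the NVIDIA-related kexts.
NVIDIA_HINTS = ("nvda", "nvidia", "geforce")


def kexts_in_backtrace(panic_text):
    """Return (bundle_id, version) pairs listed after 'Kernel Extensions in backtrace'."""
    kexts = []
    in_section = False
    for line in panic_text.splitlines():
        if "Kernel Extensions in backtrace" in line:
            in_section = True
            continue
        if in_section:
            match = KEXT_LINE.match(line)
            if match:
                kexts.append((match.group(1), match.group(2)))
    return kexts


def nvidia_kexts(kexts):
    """Keep only the kexts whose bundle ID looks NVIDIA-related."""
    return [k for k in kexts if any(h in k[0].lower() for h in NVIDIA_HINTS)]


if __name__ == "__main__":
    # Lines copied from the comment above; the UUIDs are truncated in the original.
    sample = (
        "Kernel Extensions in backtrace:\n"
        "com.apple.NVDAResman(8.0)[6A699209-FB98-316B-A3C0-\n"
        "com.apple.nvidia.nv50hal(8.0)[9CD95A4A-FD94-349E-A4B6-\n"
        "com.apple.GeForce(8.0)[91C40470-82BA-329A-A9D7-\n"
    )
    for bundle_id, version in nvidia_kexts(kexts_in_backtrace(sample)):
        print(bundle_id, version)
```

When every kext in the backtrace comes from the graphics stack, as here, that points at the drivers rather than at anything the cache cleanup alone would address.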
The problem appears to be due to the fact that nVidia didn’t write the nVidia drivers provided in Macs – Apple did – and they don’t appear to have done a very good job of it!
Yeah, there’s been many stories of Apple charging, sometimes nearly a thousand dollars for hardware (usually logic board) replacements that do not have an effect on the problem in the vast majority of cases.
Even though Apple have known about this problem for around 18 months, and acknowledged it for at least the past 12 months, they don’t seem to be any closer to a solution unfortunately.
We’ve yet to test Mountain Lion to see whether new nVidia drivers or some other bug fix has been applied – but we’re not realistically expecting that it has been solved – I think there would have been more coverage on the Internet if it had. But, depending on how bad it gets – anything is worth a try I guess!
If you’ve not started using gfxCardStatus yet (http://www.dpscomputing.com/blog/2012/07/09/update-os-x-bsod-superbug-gfxcardstatus-2-0/) that’d be my next recommended step. It’s worked on all our Mac machines for over a month with no kernel panics :).
Let us know if it works for you :). Feel free to comment again if the above doesn’t work and we’ll see if we can work something out :).
Update: I have followed the method on this site:
http://osx86.wikidot.com/known-issues
It had me delete the kext files for the graphics card. This fixed all my kernel panic issues. However, now the graphics are terribly glitchy. No video, and things like Launchpad look like someone is trying to draw them with a crayon.
Think I may restore and try to remove those folders one at a time.
Thanks for the response! I hadn’t seen that post yet. It will be my next step.
I should probably clarify: these problems didn’t exist for me until I upgraded from Snow Leopard to Mountain Lion.
No problem :). Hopefully gfxCardStatus will sort it out for you – don’t forget to use v2.0 or below – for some reason versions above v2.0 don’t solve the problem – Cody (gfxCardStatus developer) said that some changes in the later versions had meant that they don’t currently work for solving this problem. And when we tried with versions above v2.0 it would only work once and then it was back to the same old problem!
Ah right, that’s very interesting to know Taylor – thanks for clarifying – so Mountain Lion doesn’t solve the problem then. I don’t know what Apple are playing at – this problem seems fairly widespread but they just don’t appear to want to rewrite the drivers causing the problem – which is poor customer / technical service to say the least.
One of our machines came with Snow Leopard and there were no problems. However, the problems began when we updated to Lion – due to the new Apple built nVidia drivers.
Obviously, Apple are continuing to provide the same broken nVidia drivers in Mountain Lion, which is very disappointing to say the least. There is absolutely no reason why a company with the resources Apple has couldn’t have come up with a solution in over 18 months. Personally I think it’s completely disgraceful.
Fingers crossed that the gfxCardStatus extension to the solution works for you as well :). You’ll have to let us know – it’d be good to confirm whether the same solution works on Mountain Lion as well – we’ve only verified it on Lion thus far :).
Just seen your other comment:
Haven’t come across that way of dealing with it before. I can obviously see how it would ‘work’ as it removes the kexts causing the panic. But, as you’ve seen, that can possibly cause other issues. Using gfxCardStatus makes the nVidia drivers dormant as opposed to removing them and that’s been a stable solution for us :).
I restored my SSD from the Time Machine backup that finished this morning. It took a really long time – almost 6 hours! I haven’t had any problems since it restored from that backup.
I have been using it for around 4 hours since then without any kernel panics. I will let you know if it stays free of issues. If the issue is indeed fixed, I will go in and look which files I deleted yesterday that weren’t restored. Or try to find what is different.
Hi Taylor,
Glad to hear you’ve managed to stop the kernel panics. We’ll keep our fingers crossed that it stays free of issues for you. I can understand how frustrating it is to experience this problem – on one of our machines we were getting kernel panics every 5-10 minutes after reboot! :(.
Thanks for letting us know :).
Hey there,
I’m using Mountain Lion since I hoped it would stop the superbug which I had with Lion… it was still present.
So after quite some time researching with Google, I found your little solution here, and I tried it out right away (since I’d tried EVERYTHING else!)…
So, my hopes were not very high… but bam… it’s been running for 3 days straight now without even the slightest BSOD or anything similar… no graphics glitches, nothing… it all runs smooth and fast like it should!
Thanks so much for uncovering this, and this solution definitely should be made public, so everyone who has the problem can do it… since Apple doesn’t seem to care anymore.
thanks!
Hi Jamey,
Really glad to hear that our solution has worked for you :). It’s still working on our mid 2010 MacBook Pros running Lion so it’s looking good for at least temporarily sorting the mess that Apple have created!
Originally the solution was intended to be a temporary solution until Apple got its backside in gear and released an official solution, but unfortunately it seems like they aren’t going to do this, so I guess this will have to be in place of an ‘official’ resolution to the issue.
No problem, glad the solution worked for you and glad that we could help.
If you have any further issues, questions etc then please feel free to come back and we’ll try and help out :).