LSI Raid Firmware Bios Flashing

From Hack Sphere Labs Wiki

Most Recent

sas2flsh -o -e 6

DO NOT REBOOT

sas2flsh -o -f 2118it.bin -b mptsas2.rom

Easy [1]

EFI

ERROR: Failed to initialize PAL. Exiting program.

Copy and paste:


    Formatting the USB as freedos (using Rufus) - https://www.all4os.com/windows/create-a-bootable-ms-dos-or-freedos-usb-drive.html
    Downloading the Shell_full.efi, renaming to shellx64.efi, and putting it on the root of the drive - https://svn.code.sf.net/p/edk2/code/trunk/edk2/EdkShellBinPkg/FullShell/X64/Shell_Full.efi
    Downloading the sas2flash.efi - http://www.lsi.com/products/host-bus-adapters/pages/lsi-sas-9211-8i.aspx#tab/tab4
    Booting to the ASUS BIOS and loading the EFI shell (it's the last option on the last page of the Advanced screen, at the bottom of the page, the same page as "Save and Reset")

    Follow the rest of the instructions on this page (http://digitalcardboard.com/blog/2014/07/09/flashing-it-firmware-to-the-lsi-sas-9211-8i-hba-2014-efi-recipe/)

    ; to show the controller and verify the current version.
    sas2flash.efi -listall 

    ; to erase the BIOS, do not reboot after this command.
    sas2flash.efi -o -e 6

    ; to write the new firmware and BIOS.
    sas2flash.efi -o -f 2118it.bin -b mptsas2.rom
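The order of these three commands matters, so here is a minimal sketch that just prints them in sequence. It only uses the flags and file names already shown above; the function name `build_flash_sequence` is made up for illustration and nothing here talks to the card.

```python
def build_flash_sequence(firmware, bios, tool="sas2flash.efi"):
    """Return the commands from the recipe above, in the order they must run."""
    return [
        f"{tool} -listall",                    # show the controller and current version
        f"{tool} -o -e 6",                     # erase step; do NOT reboot after this
        f"{tool} -o -f {firmware} -b {bios}",  # write the new firmware and BIOS
    ]


if __name__ == "__main__":
    for command in build_flash_sequence("2118it.bin", "mptsas2.rom"):
        print(command)
```

Swap in `2118ir.bin` if you want IR firmware instead; the sequence stays the same.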

More instructions here to compare: http://brycv.com/blog/2012/flashing-it-firmware-to-lsi-sas9211-8i/


Apparently there is a web bios: https://forums.freenas.org/index.php?threads/confused-about-that-lsi-card-join-the-crowd.11901/

First Time

The best way is to create a UEFI boot disk. Without one you will get command errors in most situations, at least under DOS.

Get the firmware and flash it with sas2flash.efi.

Create a USB key formatted as FAT16 and then create the following directory on it:
/EFI/boot
Get the shell.efi (it sounds like you need the older shell.efi because it supports older versions of the EFI spec):
yaourt uefi-shell-svn
Copy shell.efi to /EFI/boot/bootia32.efi for 32-bit or /EFI/boot/bootx64.efi for 64-bit.

Use the shellx64.efi.
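The USB layout can be staged with a few lines of Python. This is only a sketch: `stage_efi_shell` is a hypothetical helper, the mount point and shell path are whatever you pass in, and the 32-bit file name `bootia32.efi` is my assumption based on the UEFI default removable-media boot path.

```python
import shutil
from pathlib import Path


def stage_efi_shell(usb_root, shell_binary, arch="x64"):
    """Copy an EFI shell binary to the default removable-media boot path."""
    # Assumption: firmware looks for EFI/boot/bootx64.efi (64-bit)
    # or EFI/boot/bootia32.efi (32-bit) on a FAT-formatted stick.
    names = {"x64": "bootx64.efi", "ia32": "bootia32.efi"}
    boot_dir = Path(usb_root) / "EFI" / "boot"
    boot_dir.mkdir(parents=True, exist_ok=True)
    target = boot_dir / names[arch]
    shutil.copyfile(shell_binary, target)
    return target
```

Copy sas2flash.efi, the firmware .bin and mptsas2.rom to the root of the same stick so they are reachable from the shell.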

Boot into the EFI shell...

Run the proper command:

sas2flash.efi -o -f 2118ir.bin -b mptsas2.rom
sas2flash.efi -o -sasadd 500605bxxxxxxxxx (x= numbers for SAS address)

I switched from IR to IT bios with this command... I did not write a sas address (actually I may have)

If I add another card I have to. (http://lime-technology.com/forum/index.php?topic=12767.210)

6. Program SAS address in IT-mode:
sas2flsh -o -sasadd 500605bxxxxxxxxx 
where "500605bxxxxxxxxx" is the SAS address from the small green sticker on your card, without "-"

Despite me missing this, it is humming away nicely in my now converted production box on 5.0.12a.

BUT - I now have a second card that I'm intending to integrate.

I did the reflash including step 5 on the only machine I have been able to. Then I figured I would reboot to do step 6 - also to read the sticker for the number.

Unfortunately, that machine now can not boot, as it is hanging at the BIOS of the adapter. So I have no option at my disposal to do step 6 on any of the cards.  :-[

Will I have any issues in an unRAID scenario by having two cards that have both not been through step 6 to get individual addresses?

Notes

Second Time

I was having issues with getting the server to boot from a USB stick. I ended up booting the system rescue live cd and used the USB stick as a source for the flashing util and bios + firmware file.

I could not get this to work:

sas2flsh -o -e 6

But I wanted to stay in IR mode so I just attempted to overwrite existing firmware.

sas2flsh -o -f 2118ir.bin -b mptsas2.rom

IR over IR firmware worked great.

Third Time

I do not have an EFI BIOS, so I could only get FreeDOS to work. I had to use FreeDOS to erase the flash and flash the card.

NVDATA Image does not match Controller Device ID (Crossflashing)

You need to use an older flash utility because the device ID belongs to an OEM card. You are supposed to use a version that is 14 or below.

Or you could have the wrong bios image and firmware image like I did. 4i does not equal 8i, duh.

It has also been said that the -o option needs to be used.[2]

Others have used the megarec (dos)[3] software from supermicro. It looks like a util used to recover bad flashes.

I tried all of these and they failed. Then I realized that I really did have the wrong firmware. The NVDATA had a bad DEVICE ID rather than a bad VENDOR ID; I think that is the difference.

IT to IR firmware or Vice Versa

You have to erase the firmware of the card first:

sas2flsh -o -e 6

Then you run the normal flash command. If you need to reprogram your SAS address (read below), do it after programming the card with the IT firmware.

I have read sites[4] saying to reboot after the erase, but here is a quote from another site[5]:

DO NOT REBOOT. If you do reboot, or if you attempt to flash the firmware and/or BIOS image and it does not flash correctly, you will have to RMA the controller.

The site states that it is from an LSI article (that I cannot find), so I did not reboot. I doubt you can reboot, but who knows; I think the instructions to reboot are there for a different reason.

Here is the guide to the -e parameter (what each number erases):

1 NVSRAM
2 Backup firmware
3 Persistent pages
4 Manufacturing area
5 Boot services
6 Clean flash (erase everything except manufacturing area)
7 Erase complete flash
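The same table as a lookup, in case you want a sanity check before typing an erase command. This is a sketch, not anything from LSI: the descriptions are copied from the list above, and the idea that only the values touching the manufacturing area (4 and 7) lose the SAS address is my inference from the notes further down this page (only 7 is actually confirmed there).

```python
ERASE_REGIONS = {
    1: "NVSRAM",
    2: "Backup firmware",
    3: "Persistent pages",
    4: "Manufacturing area",
    5: "Boot services",
    6: "Clean flash (erase everything except manufacturing area)",
    7: "Erase complete flash",
}

# Assumption: the SAS address (WWN) lives in the manufacturing area, so only
# values that erase that area wipe it. Only -e 7 is confirmed on this page.
WIPES_SAS_ADDRESS = {4, 7}


def describe_erase(code):
    """Return (region description, whether the SAS address is expected to be lost)."""
    if code not in ERASE_REGIONS:
        raise ValueError(f"unknown -e value: {code}")
    return ERASE_REGIONS[code], code in WIPES_SAS_ADDRESS


if __name__ == "__main__":
    for code in (6, 7):
        region, loses_address = describe_erase(code)
        print(f"-e {code}: {region}; SAS address lost: {loses_address}")
```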

It looks like if you erase the complete flash (7) you have to program the SAS address back in. You can check the current address with:

sas2flsh -o -listsasadd

Here is some info[6]: PS: Oh, important! After erasing the card with the MegaRAID HWR Contoller (sic!) Recovery tool, you have to reset its original SAS address! Otherwise it will be 0000000-0-0000-0000. Not nice if you use multiple completely wiped and then crossflashed former M1015s in one system. I'm not sure if the command sas2flash.efi -o -e 6 which can be seen in many guides also wipes the SAS address, you better check with sas2flash.efi -o -listsasadd. Anyway use sas2flash.efi -o -c <controller number, in case you have more than one LSI HBA!!!> -sasadd 500605b0xxxxxxxx. You can find the address on a sticker on your card, omit the whitespace and dashes.

I am thinking that -e 7 will erase everything, plus the address.

It does erase the address.

I too had the issue where the flash util refused to flash over the existing firmware.

I used the sas2flash -o -e 7 line posted in the comments and it worked. BUT it erased the WWN number from the card giving me an error every boot about the SAS address not being programmed.

SO DO NOT USE sas2flash -o -e 7. Instead use sas2flash -o -e 6. That will erase the flash but it won’t erase the manufacturing area which contains the WWN.

If you do erase the WWN, you’ll need to re-enter it using sas2flash -o -sasadd (wwn number). Usually this number is on a sticker somewhere on the card or motherboard.

If you can’t find it, you can make up an 8 byte hex number. A WWN is like an Ethernet MAC address. Every card in the system needs to have a unique one.


sas2flash -o -c <controller number, in case you have more than one LSI HBA!!!> -sasadd 500605b0xxxxxxxx
sas2flash -o -sasadd 500605B0046B20B0
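Since the sticker prints the address with dashes or spaces, a small sketch for cleaning it up and building the command line may save a typo. The helper names are hypothetical; the flags (-o, -c, -sasadd) are the ones quoted above, and the 16 hex digit check follows from the address being an 8 byte value.

```python
import re


def normalize_sas_address(sticker_text):
    """Strip whitespace and dashes from a sticker value and validate it."""
    address = re.sub(r"[\s-]", "", sticker_text).upper()
    if not re.fullmatch(r"[0-9A-F]{16}", address):
        raise ValueError(f"expected 16 hex digits, got {address!r}")
    return address


def sasadd_command(sticker_text, controller=None, tool="sas2flash"):
    """Build the -sasadd command line; -c is only added when a controller is given."""
    parts = [tool, "-o"]
    if controller is not None:
        parts += ["-c", str(controller)]
    parts += ["-sasadd", normalize_sas_address(sticker_text)]
    return " ".join(parts)


if __name__ == "__main__":
    print(sasadd_command("500605B0-046B-20B0"))
    print(sasadd_command("500605b0 046b20b0", controller=1))
```

Pass `controller` whenever more than one LSI HBA is installed, so the address goes to the right card.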

Does the SAS address need to match the card?

It seems to work with some weird address I had in my card that does not match: 50000000:80000000

A SAS Domain is the SAS version of a SCSI domain—it consists of a set of SAS devices that communicate with one another by means of a service delivery subsystem. Each SAS port in a SAS domain has a SCSI port identifier that identifies the port uniquely within the SAS domain. It is assigned by the device manufacturer, like an Ethernet device's MAC address, and is typically world-wide unique as well. SAS devices use these port identifiers to address communications to each other.

In addition, every SAS device has a SCSI device name, which identifies the SAS device uniquely in the world. One doesn't often see these device names because the port identifiers tend to identify the device sufficiently.

For comparison, in parallel SCSI, the SCSI ID is the port identifier and device name. In Fibre Channel, the port identifier is a WWPN and the device name is a WWNN.

In SAS, both SCSI port identifiers and SCSI device names take the form of a SAS address, which is a 64 bit value, normally in the NAA IEEE Registered format. People sometimes refer to a SCSI port identifier as the SAS address of a device, out of confusion. People sometimes call a SAS address a World Wide Name or WWN, because it is essentially the same thing as a WWN in Fibre Channel. For a SAS expander device, the SCSI port identifier and SCSI device name are the same SAS address.
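To make the "64 bit value in NAA IEEE Registered format" part concrete, here is a rough sketch that splits an address into its fields. It assumes the usual NAA 5 layout (4-bit NAA, 24-bit IEEE company ID, 36-bit vendor-specific part); `decode_naa5` is a made-up name, and the sample value is the address from the command above.

```python
def decode_naa5(address_hex):
    """Split a 64-bit NAA IEEE Registered (NAA=5) address into its fields."""
    value = int(address_hex, 16)
    if not 0 <= value < (1 << 64):
        raise ValueError("a SAS address is a 64-bit value")
    naa = value >> 60                          # top 4 bits: format identifier
    if naa != 5:
        raise ValueError(f"not NAA IEEE Registered format (NAA={naa})")
    company_id = (value >> 36) & 0xFFFFFF      # next 24 bits: IEEE company ID
    vendor_specific = value & ((1 << 36) - 1)  # low 36 bits: chosen by the vendor
    return naa, company_id, vendor_specific


if __name__ == "__main__":
    naa, company_id, vendor_specific = decode_naa5("500605B0046B20B0")
    print(f"NAA={naa} company_id={company_id:06X} vendor_specific={vendor_specific:09X}")
```

By this split the odd 50000000:80000000 address still has NAA 5, just with a company ID of zero, which may be why the card accepts it.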

More info

Each SAS port is identified with a unique SAS address, which is shared by all phys on that port.

For example, a SAS disk drive might have two narrow ports. Each port has one unique SAS address. The single phy in each port uses its port’s SAS address.

In another example, a SAS device might have one 4-wide port. That port has one SAS address, which is shared by all four phys in the port.

Unlike SCSI devices and SCSI IDs, SAS devices self-configure their SAS addresses. User intervention is not required to set SAS addresses, and SAS addresses cannot be modified.

UEFI BIOS

Error: Failed to initialize PAL

It has been stated that this error means that you have a UEFI bios that is blocking the flash. You need to boot into the UEFI bios shell and execute sas2flash.efi or if your bios does not have this, boot from a UEFI boot disk.[7]

Another notable thing from the referenced[8] site: (NOTE: That screenshot shows some commands failing…that was me trying the v2 Shell….apparently it doesn’t have support for some commands we need! Use v1!!!)

Also: http://brycv.com/blog/2012/flashing-it-firmware-to-lsi-sas9211-8i/

and

PAL is a set of code in typical BIOSes for accessing hardware directly. It's rarely used except when doing direct access to a device's hardware (for firmware updates most commonly).

99.9% of UEFIs out there have the PAL removed (along with a crapload of other units from BIOSes) to make room for the UEFI shell. An alternative to PAL is to use the UEFI shell and an application that you run in the UEFI shell to ensure direct connection to the hardware without anything going wrong.

In essence, newer stuff has no PAL command codes so you have to use the UEFI shell.

Yes, in theory you could code a DOS application to do the same job as the PAL and then you could reflash it from the DOS shell. The problem is that you can't really stop the OS from saying "hey SAS card.. let's talk" and midway through your firmware update the card receives some bytes that the card writes blindly to its internal flash memory because it thinks all of the bytes are for the firmware update. Next thing you know your SAS card is borked forever because the firmware actually finished, but the data written was garbage.

What version of firmware to use

For the 9211-8i use P19

23:55 < skoef> yes, 7.39 is the most recent I believe
23:55 < skoef> we use firmware 19.0.0.0-it
23:55 < skoef> 20. wasn't stable
23:56 < webdawg> That was the last thing that was bothering me, the bios comes up as sas2008 which I thought was a 
                 different controller.
23:56 < webdawg> Really, you had problems with 20?  Can you tell me what problems you have?
23:56 < webdawg> Once I get into the bios, the controller shows up and works just fine.
23:56 < skoef> can't fully recall, but random crashes with OI
23:57 < skoef> lsi confirmed issues I believe
23:57 < webdawg> Wow, thanks for letting me know, I was just about to flash P20
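If you script your flashing, a tiny guard against the phase reported as unstable here might look like this. It is only a sketch: it assumes the first number of the version string is the phase (19.0.0.0 being P19), and the "bad" set reflects nothing more than the reports on this page.

```python
KNOWN_BAD_PHASES = {20}  # per the reports quoted on this page, not an official list


def firmware_phase(version):
    """Extract the phase (first number) from a string like '19.0.0.0-it'."""
    return int(version.split(".", 1)[0])


def is_reported_unstable(version):
    """True if the version's phase is one that people here reported trouble with."""
    return firmware_phase(version) in KNOWN_BAD_PHASES


if __name__ == "__main__":
    for version in ("19.0.0.0-it", "20.00.00.00"):
        print(version, "reported unstable:", is_reported_unstable(version))
```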

RE: Which version of LSI 9211-8i IT-mode firmware should I flash? Should I flash the latest P20?

I just put together a system with three LSI 9211-8i cards and flashed them to P20, and ran into problems on the latest OmniOS during benchmarking with bonnie++. A single SSD (or multiples) connected via the LSI 9211-8i cards were throwing errors in the console and /var/adm/messages.

This was via the SSD drive on a non-expander backplane. To rule out the backplane, I hooked a single Intel S3500 480GB SSD up to a breakout cable but saw no improvement - the same errors show up in the root console and log.

OmniOS 5.11 omnios-8c08411 2014.04.28
LSI 9211-8i cards - Flashed to P20 firmware.
Nov 4 09:39:30 napp-it-14b Log info 0x31080000 received for target 9.
Nov 4 09:39:30 napp-it-14b scsi_status=0x0, ioc_status=0x804b, scsi_state=0x0
Nov 4 09:39:30 napp-it-14b scsi: [ID 107833 kern.notice] /pci@0,0/pci15ad,7a0@16/pci1000,3020@0 (mpt_sas1):
Nov 4 09:39:30 napp-it-14b Aborted_command!
Nov 4 09:39:30 napp-it-14b scsi: [ID 243001 kern.warning] WARNING: /scsi_vhci (scsi_vhci0):
Nov 4 09:39:30 napp-it-14b /scsi_vhci/disk@g55cd2e404b58b0a7 (sd2): Parity Error on path mpt_sas14 /disk@w55cd2e404b58b0a7,0


Arg. Transport errors and hard errors looked something like this:

iostat -En
-------------------
c8t55CD2E404B58E103d0 Soft Errors: 0 Hard Errors: 1028 Transport Errors: 870
Vendor: ATA Product: INTEL SSDSC2BB48 Revision: 0370 Serial No: BTWL404501HQ480
Size: 480.10GB <480103981056 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0
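For checking a whole box after a benchmark run, here is a rough sketch that pulls the counters out of `iostat -En` text like the sample above and lists devices with non-zero hard or transport errors. The regular expression assumes exactly the line layout shown here, and `failing_devices` is a hypothetical name.

```python
import re

COUNTERS = re.compile(
    r"^(?P<device>\S+)\s+Soft Errors:\s*(?P<soft>\d+)\s+"
    r"Hard Errors:\s*(?P<hard>\d+)\s+Transport Errors:\s*(?P<transport>\d+)"
)


def failing_devices(iostat_output):
    """Return {device: (hard, transport)} for devices with non-zero counters."""
    failing = {}
    for line in iostat_output.splitlines():
        match = COUNTERS.match(line.strip())
        if match:
            hard, transport = int(match["hard"]), int(match["transport"])
            if hard or transport:
                failing[match["device"]] = (hard, transport)
    return failing


if __name__ == "__main__":
    sample = "c8t55CD2E404B58E103d0 Soft Errors: 0 Hard Errors: 1028 Transport Errors: 870"
    print(failing_devices(sample))
```

Feed it the captured output of `iostat -En` before and after a firmware change to compare.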


So I took to IRC.. and found some advice on #illumos. Someone there (patdk-wk_) mentioned he runs P19.

Well, let's give it a try - so I downgraded by re-flashing all 3 HBAs to P19 and rebooted.

No more errors during benchmarking!!!!

So if you have an LSI 9211-8i HBA I would stick with P19. There seems to be some error with P20 and these cards.

(Also, someone on IRC mentioned the 9201-16e having problems with P20.)

Hope this saves someone else some time.

Note: I also had a striped mirror of 2TB WD Re drives that also displayed some hard + transport errors with P20, but much less than the SSDs. (only 2 or 4 per WDRe drive after a benchmark run) So it seems the faster speed of the SSD was triggering the problem more frequently on P20.

I actually got this figured out a few hours after posting it and I wanted to make sure that the information was out there. There is a fairly serious firmware issue with the entire LSI 92XX HBA product line. The currently shipping firmware version for the 92XX HBA line is P20, which was released on September 22. The P20 firmware contains a bug that will show up as IO timeouts, disk errors and corruption on nearly every storage drive on LSI’s compatibility list. I became aware of the issue after observing a catastrophic ZFS storage failure and tracking it back to the HBAs. After isolating the issue I contacted LSI support, who informed me of the bug and the fact that they have not publicly documented the issue. LSI support also informed me that the only fix for this issue is to return any affected HBA to the P19 firmware.

  1. https://nguvu.org/freenas/Convert-LSI-HBA-card-to-IT-mode/
  2. http://hardforum.com/showthread.php?p=1038602393
  3. https://forums.servethehome.com/index.php?threads/dell-h200-lsi-9211-8i-brick-revival.361/
  4. http://mycusthelp.info/LSI/_cs/AnswerPreview.aspx?sSessionID=&inc=7954
  5. http://brycv.com/blog/2012/flashing-it-firmware-to-lsi-sas9211-8i/
  6. https://forums.freenas.org/index.php?threads/ibm-serveraid-m1015-and-no-lsi-sas-adapters-found.27445/
  7. https://scriptthe.net/2013/11/16/flashing-different-lsi-firmware-on-the-m1015on-a-uefi-bios/
  8. https://scriptthe.net/2013/11/16/flashing-different-lsi-firmware-on-the-m1015on-a-uefi-bios/