Where performance, reliability and cost-effectiveness
govern the choice of storage technology, IT managers put their money on a
combination of SCSI, Fiber Channel and SATA based storage. However, each of
these technologies has its own limitations. For instance, parallel SCSI does
not scale well, as it can have only 15 drives per bus, and its bulky cabling
makes it unsuited to dense-computing environments. SATA drives were never meant
for enterprise-class storage, having been developed purely as a high-performance
alternative to PATA drives for the desktop. Further, it has long been known
that serial interfaces outperform parallel ones at high speeds. The table on
this page lists the inherent problems of each of these three technologies
before we go on to see how to solve them.
[Sidebar: Serial vs. Parallel]
What is it?
Serial Attached SCSI or SAS is the serial evolution of SCSI, which has so far
been a parallel interface. This new breed of drives is
meant for mainline servers and can serve a variety of needs from high
performance OLTP applications to back office and backup tasks. As a technology,
SAS does away with the traditional parallel SCSI need for multiple initiators
to ramp up performance and reliability. SAS lets you plug in
numerous drives and, from any of the end points, extend the
interface to add even more storage as the need for it grows. This is
analogous to what we commonly do with USB (where you can plug a USB
hub into any USB connector and add more ports as needed). Every SAS
'extender' (the SAS standard calls them expanders) can be connected either to
drive units or to more extenders. The connection back to its own parent is a wide
interface, letting it communicate without a performance hit as more devices are
added to the tree. See the figure for an idea of what this would look like in a
real life scenario.
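The tree described above can be pictured with a small sketch. This is only an illustrative model, not how SAS hardware or any real library represents a topology; the class names `Drive` and `Extender`, the `attach` method and the device names are all made up for the example.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Drive:
    """A leaf of the tree: one disk drive (hypothetical model)."""
    name: str


@dataclass
class Extender:
    """An inner node: may hold drives, further extenders, or both."""
    name: str
    children: List[Union["Extender", Drive]] = field(default_factory=list)

    def attach(self, node):
        """Plug a drive or another extender into this extender."""
        self.children.append(node)
        return node


def count_drives(node) -> int:
    """Walk the tree and count every drive reachable from this node."""
    if isinstance(node, Drive):
        return 1
    return sum(count_drives(child) for child in node.children)


# Two drives on the first extender, plus a second extender holding four more.
root = Extender("ext0")
root.attach(Drive("d0"))
root.attach(Drive("d1"))
shelf = root.attach(Extender("ext1"))
for i in range(4):
    shelf.attach(Drive(f"d{i + 2}"))

total_drives = count_drives(root)
print(total_drives)
```

Growing the pool is then a matter of attaching another extender at any free end point, much like adding a USB hub.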
One problem extendable interfaces have always faced is the
reliability of routing information to the right attached device. For instance,
how would the correct hard disk be found and data routed to it? While parallel
SCSI similarly allows IT managers to extend more interfaces, it shares a common
backplane bus. This reduces performance when more drives are added since
computational overheads are used to arbitrate which drive gets access to the
bus. Add to this the transactional nature of HDD to host communications and the
problem becomes more complex as the drive that had initiated a request may have
closed the transaction and even gone offline. SAS cuts down this problem by
having dedicated point-to-point connectivity between each initiator and target.
Also, to find the right drive at all times, each SAS extender uses routing
tables. Similar to network cards having MAC addresses, disk drives have unique
IDs that help their controllers identify and route commands and data to them.
While SCSI requires each drive to be assigned this ID manually (a Herculean task
when you have a thousand drives), SAS drives are stamped with a universally unique
ID at the time of manufacture, relieving administrators of the cumbersome task
of having to manage so many IDs.
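As a rough sketch of the routing idea, an extender can be thought of as keeping a lookup table from each drive's factory-assigned ID to the local port that leads towards it. The class, method names and the two 64-bit IDs below are assumptions invented for illustration; they do not reflect the actual table format used by SAS hardware.

```python
class Extender:
    """Toy model of an extender's routing table (illustrative only)."""

    def __init__(self):
        # Maps a drive's unique 64-bit ID to the local port leading to it.
        self.routes = {}

    def learn(self, drive_id, port):
        """Record which port a given drive ID is reachable through."""
        self.routes[drive_id] = port

    def route(self, drive_id):
        """Return the port for a drive ID, or fail if it is unknown."""
        try:
            return self.routes[drive_id]
        except KeyError:
            raise LookupError(f"no route to {drive_id:#018x}") from None


# Hypothetical IDs; real drives carry their own manufacturer-assigned values.
DRIVE_A = 0x5000C50000000001
DRIVE_B = 0x5000C50000000002

ext = Extender()
ext.learn(DRIVE_A, port=0)
ext.learn(DRIVE_B, port=3)

print(ext.route(DRIVE_B))
```

Because the IDs are unique from the factory, nothing in this table has to be assigned or reconciled by hand.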
Benefits
SAS drives have serial SCSI interfaces that resemble SATA interfaces. In turn,
this allows a SAS host to use both SAS drives and SATA drives
interchangeably, without needing hardware modifications or adapters. How is this
important? Consider the common scenario of having different sets of data in your
storage pool which separately need performance, redundancy and reliable
handling. The traditional solution is to deploy separate
parallel SCSI and SATA based storage and incur larger costs, not to mention
the cost of moving the information from one to the other as its need
changes. SAS, instead, can host drives of both kinds together on a common
backbone. IT managers can then selectively add more serial SCSI or more SATA
disk drives as needed. While the initial costs may be on the higher side because
of the cost of the host hardware, the TCO would be lower than deploying parallel
SCSI and SATA separately, since you would only incur the cost of the extenders
and disk drives.
Limitations of SCSI, Fiber Channel and SATA
Limitation | SCSI | FC | SATA | SAS |
Scalability | 15 drives | 127 devices | 128 devices | 16,384 drives |
Data rate | 320 MB/s | 2.0 Gb/s | 1.5 Gb/s | 3.0 Gb/s |
Throughput | Half duplex | Full duplex | Half duplex | Full duplex |
Bus | Shared | Dedicated | Shared | Dedicated |
Connectivity | SCSI only | FC only | SATA only | SATA and SAS |
Performance wise, transfer rates for SAS drives are at 3 Gb/s
(envisaged target of 12 Gb/s) with full duplex transmission. Also, each drive
has two ports in a redundant configuration, so connectivity is ensured
even if one port fails. SAS also supports a configuration called a “wide port”,
in which two or more separate links are established between the same pair of
devices. This increases both bandwidth and availability. A SAS controller with
four 3 Gb/s links configured as a wide port carries 12 Gb/s in aggregate, or
roughly 1.2 GB/s of data, a load that would easily saturate a parallel
Ultra320 SCSI bus or a 2 Gb/s Fiber Channel link.
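The wide-port arithmetic can be checked with a short calculation. The sketch assumes the 8b/10b line encoding used by 3 Gb/s SAS links, where every 10 bits on the wire carry 8 bits of data; the function and constant names are made up for the example.

```python
ULTRA320_MB_PER_S = 320  # peak rate of a parallel Ultra320 SCSI bus


def wide_port_mb_per_s(links, link_rate_mbps=3000):
    """Usable data rate of a wide port in MB/s.

    links          -- number of links ganged together in the wide port
    link_rate_mbps -- raw line rate of each link in Mb/s (3000 = 3 Gb/s)
    Assumes 8b/10b encoding: 8 data bits for every 10 bits on the wire.
    """
    raw_mbps = links * link_rate_mbps   # aggregate line rate in Mb/s
    data_mbps = raw_mbps * 8 // 10      # strip the 8b/10b overhead
    return data_mbps // 8               # convert bits to bytes


four_link_rate = wide_port_mb_per_s(4)
print(four_link_rate, "MB/s for a four-link wide port")
print(four_link_rate / ULTRA320_MB_PER_S, "times an Ultra320 bus")
```

Under these assumptions a single 3 Gb/s link delivers about 300 MB/s, so four links together come to about 1,200 MB/s, several times what one Ultra320 bus can move.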
Figure: SAS allows extenders and hard drives to be attached in any combination to increase the available storage pool
Now, the magic of SAS lies in the fact that it can maintain
this data rate with all 16,384 drives (theoretical maximum) connected and
operational. On the physical front, SAS drives are also smaller in form
factor (2.5” compared to 3.5” Parallel SCSI or SATA disks) and use SATA type
connectors that are narrower. This makes it easy to deploy SAS devices in dense
and highly dense environments. Also, unlike Parallel SCSI (12 meters total)
and SATA (1 meter), SAS cables can be as long as 8 meters each, with the total
cabled length across a data center running to thousands of meters. SAS drives
are also hot-pluggable, and with RAID you can maintain storage availability
if individual drive units fail.
In operation
Serial Attached SCSI has its own light-weight communication protocols:
STP (Serial ATA Tunneling Protocol) and SSP (Serial SCSI Protocol). SSP is not
meant to be used in a SAN scenario, but purely as a cabinet-class protocol. This
“cabinet” can be internal drives within servers or DAS/NAS boxes. Outside of
this cabinet, one would still need to use traditional storage protocols like
TCP/IP, iSCSI or FC, which would connect the SAS cabinet to the SAN through a
RAID controller. STP is used for communication between the host and SATA devices,
while SSP is used between the host and SAS disks. SAS devices still use the
SCSI command set, ensuring full compatibility with that platform.
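The split between the two protocols amounts to a simple rule: the host picks the protocol from the kind of drive at the other end. A minimal sketch follows; the `DriveKind` enum and `protocol_for` function are hypothetical names for illustration, not part of any real SAS driver or API.

```python
from enum import Enum


class DriveKind(Enum):
    """The two kinds of drive a SAS host can talk to."""
    SAS = "sas"
    SATA = "sata"


def protocol_for(kind):
    """Pick the protocol the host uses for a given kind of drive."""
    if kind is DriveKind.SAS:
        return "SSP"  # Serial SCSI Protocol, used with SAS disks
    if kind is DriveKind.SATA:
        return "STP"  # Serial ATA Tunneling Protocol, used with SATA devices
    raise ValueError(f"unknown drive kind: {kind!r}")


# A mixed shelf: both kinds of drive on the same SAS backbone.
shelf = [DriveKind.SAS, DriveKind.SATA, DriveKind.SAS]
protocols = [protocol_for(kind) for kind in shelf]
print(protocols)
```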
Finally
Hosting SAS and SATA drives together would also reduce the cost of information
lifecycle management (ILM). When all your tiers of storage share the same backbone
hardware, your cost of ownership and management of that infrastructure is
dramatically reduced. This in turn lowers the cost of managing the
information that is stored on that infrastructure. We expect SAS to be, in a
couple of years, where SCSI was expected to be today, with higher adoption
figures and broader platform support.