IBM Backup and Archiving Technologies
Simon Podepřel, Storage Sales, IBM
IBM FORUM 2010
© 2010 IBM Corporation
Agenda
Recommended approaches to data backup and retention
- assessment of the individual technologies and advice on their use in model scenarios
IBM product portfolio for backup and archiving
- Tape Systems
- Tape virtualization (IBM ProtecTIER VTL and deduplication)
- IBM Information Archive
Unstoppable data growth
More than 50% of companies actively manage between 1 TB and 99 TB of data. The reasons:
1) Business requirements – keeping detailed information about products, customers and services in order to shape strategy, grow sales and raise service levels (business applications, data warehousing)
2) User (unstructured) data – photos, video, audio
3) Internal and external rules that force companies to keep data for longer
Causes of data loss
- Minimize the growth of storage investment (limited budget)
- Ensure data protection in line with requirements, and protect data against loss
Backup and recovery infrastructure – tape or disk?
Best practices for data backup and archiving
1) Back up efficiently
Data Lifecycle Management
Consider the value of the data and how that value changes over time
- choice of media (high-speed disks or tape)
- why should data "fly first class when the train would do"?
Best practices for data backup and archiving
2) Tiered Storage Architecture
Determining the storage tier for each type of data:
- RTO (Recovery Time Objective) – how quickly you need the data restored
- RPO (Recovery Point Objective) – how current the data must be for the impact on your business to be minimal
- Budget – the available funds limit the choice of technology
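The three criteria above (RTO, RPO, budget) can be read as a simple filter over candidate storage tiers. The sketch below is only an illustration: the tier names, restore times, data-loss windows and relative costs are hypothetical values invented for the example, not IBM figures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    restore_hours: float        # typical time to bring data back (compared with RTO)
    max_data_loss_hours: float  # typical gap between protection points (compared with RPO)
    cost_per_tb: float          # relative cost, arbitrary units


# Hypothetical tiers and figures, chosen only to illustrate the selection logic.
TIERS = [
    Tier("synchronous replication", 0.1, 0.0, 100.0),
    Tier("disk backup", 2.0, 4.0, 30.0),
    Tier("tape backup", 24.0, 24.0, 5.0),
]


def choose_tier(rto_hours, rpo_hours, budget_per_tb):
    """Return the cheapest tier that meets RTO and RPO within budget, or None."""
    candidates = [
        t for t in TIERS
        if t.restore_hours <= rto_hours
        and t.max_data_loss_hours <= rpo_hours
        and t.cost_per_tb <= budget_per_tb
    ]
    return min(candidates, key=lambda t: t.cost_per_tb, default=None)
```

With these assumed figures, data that tolerates a two-day restore and a one-day loss window lands on tape, while a four-hour RTO pushes it to disk. A request that no tier can satisfy within budget returns `None`, which is the case where either the objectives or the budget has to be revisited.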
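[Figure: RPO/RTO timeline around an outage. Recovery Point Objective (RPO) extends into the past (days, hours, minutes, seconds before the present) and is labeled with Tape Backup, Journaling and Sync Replication. Recovery Time Objective (RTO) extends into the future (minutes, hours, days, weeks) and is labeled with Clusters, Journal Apply, Tape Restore and Manual Transaction Recovery.]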
Best practices for data backup and archiving
2) Tiered Storage Architecture
Archiving
- Unchanging data, which can be moved to a lower storage tier (e.g.)
- Technology chosen by RTO, cost of the lower storage tier, scalability, floor space, media lifetime, security
Data Retention
- Data kept in long-term archives to comply with external rules (laws, regulations) and for internal controlling
Backup and recovery
- RTO, RPO, data security
Business Continuity and Disaster Recovery
- RTO, RPO
- The right technology mix for the remote site can significantly reduce total costs
Best practices for data backup and archiving
3) Do not put your data at risk
Multiple levels of data protection
- 3 copies of the data in different locations, including a remote site in case of disaster
- Different types of media (a mix of disk and tape) – guards against system failure
- At least one copy offline (isolation) – prevents replication of corrupted (compromised) data
- Data encryption – now readily available as standalone software or as part of the hardware
- Technology mix
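The protection rules above lend themselves to an automated check of a backup plan. The following is a minimal sketch under stated assumptions: the `Copy` record and the `check_plan` function are hypothetical names, and "different locations" is read here as "not all copies in one place".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Copy:
    location: str  # site name
    media: str     # e.g. "disk" or "tape"
    offline: bool  # True if isolated from the production systems
    remote: bool   # True if held at a remote (disaster recovery) site


def check_plan(copies):
    """Return a list of rule violations; an empty list means the plan passes."""
    problems = []
    if len(copies) < 3:
        problems.append("fewer than 3 copies")
    if len({c.location for c in copies}) < 2:
        problems.append("all copies in one location")
    if not any(c.remote for c in copies):
        problems.append("no copy at a remote site")
    if len({c.media for c in copies}) < 2:
        problems.append("only one media type")
    if not any(c.offline for c in copies):
        problems.append("no offline copy")
    return problems


good_plan = [
    Copy("primary", "disk", offline=False, remote=False),
    Copy("primary", "tape", offline=False, remote=False),
    Copy("dr-site", "tape", offline=True, remote=True),
]
bad_plan = [Copy("primary", "disk", offline=False, remote=False)]
```

`check_plan(good_plan)` returns an empty list, while `check_plan(bad_plan)` reports every rule the single online disk copy breaks. Encryption is left out of the check because the slide does not state a rule for it.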
Best practices for data backup and archiving
4) TCO
Beyond the cost of acquiring the technology, also:
- Electrical power
- Floor space
- Management
- Scalability
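As a rough illustration of the TCO items above, acquisition cost can be combined with the recurring costs over the planned lifetime. The function name and every figure below are hypothetical, and scalability is not modeled because the slide gives no way to quantify it.

```python
def total_cost_of_ownership(acquisition, power_per_year, floor_space_per_year,
                            management_per_year, years):
    """Acquisition cost plus recurring costs over the given number of years."""
    recurring = power_per_year + floor_space_per_year + management_per_year
    return acquisition + recurring * years


# Hypothetical figures in arbitrary currency units, for illustration only.
disk_tco = total_cost_of_ownership(100_000, 12_000, 4_000, 10_000, years=5)
tape_tco = total_cost_of_ownership(120_000, 2_000, 3_000, 8_000, years=5)
```

With these invented inputs the option with the higher purchase price ends up cheaper over five years (185,000 against 230,000), which is the point of looking at TCO rather than acquisition cost alone.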
Best practices for data backup and archiving
5) Verify that data can be restored
- Regular verification
- Technology lifetime
- Efficient restore
A balanced choice of technology
IBM Tape Systems Portfolio
A comprehensive tape portfolio, from Entry through Midrange to Enterprise:
- Tape drives: LTO 3 & 4 HH, LTO 3 & 4, TS1130 (3592)
- Tape automation: TS2900 (3572), TS3100 (3573), TS3200 (3573), TS3310 (3576), TS3400, TS3500 (3584)
- Tape virtualization: ProtecTIER appliance, ProtecTIER gateway, TS7700 (mainframe)
- Media: LTO Media, 3592 Media
TS3000 System Console (TSSC) (1) Summary
TSSC functions include:
– Call Home in the event of certain outages
– Remote services including remote monitoring support
– Local service console which includes view-only access to TS3500 web UI
TSSC offers the following advantages:
– Reduced telephone line charges
– Faster data offload
– Reduced repair costs
– Improved serviceability
– Better proactive maintenance
TSSC can support up to 43 tape systems
– Rack mount version available for open systems
(1) FC2720 desktop, FC2730 rack mount
August 2008, R8B Announcement (High Density frames & active frames restrictions)
TS3500 High Density (HD) Tape Frames

                    TS3500 w/S24   TS3500 w/S54        Sun SL8500
Media               3592           LTO                 T10K/LTO
Max. slots          >15,000        >20,000 (Preview)   10,000
Max. slots/sq.ft    95             125                 72
Total PB (1)        15             16                  10
TB/sq.ft (1)        95             100                 72

Current library capacity support: up to 6887 slots
Active Expansion Frames are reduced when Sxx frames are added
(1) Uncompressed data
September 22 2009, R9A Announcement. Planned eGA (Web Release): October 9, 2009
TS3500 High Density (HD) Tape Frames

Library             TS3500 w/S24   TS3500 w/S54   Sun SL8500
Media               3592           LTO            T10K/LTO
Max. slots          >15,000        >20,000        10,000
Max. slots/sq.ft    95             125            72
Total PB (1)        15             16             10
TB/sq.ft (1)        95             100            72

Figures shown against the August '08 announcement High Density frames restriction (current library capacity support: up to 6887 slots): 6,887 slots and 6.9 PB for the TS3500 w/S24, 6,887 slots and 5.5 PB for the TS3500 w/S54.

Score: 4-0. Customer WINS!
Plus:
- 6x better scalability (2)
- 33% faster performance tape drives (3)
- 2x faster average moving time (4)

(1) Uncompressed data
(2) 300x vs 50x scalability based on published specs
(3) IBM TS1130 160 MBps vs T10KB 120 MBps native performance, published specs
(4) 4.7 sec vs 11 sec published average moving time
How ProtecTIER works
[Figure: Backup Servers send a New Data Stream to the ProtecTIER™ Server, where HyperFactor™ checks it against a Memory Resident Index; only the "filtered" data is written to the Repository.]
- Only 4 GB needed to map 1 PB of physical disk
- Inline deduplication at up to 500 MB/sec per server, or 1000 MB/sec with a 2-node cluster
Overview of ProtecTIER Operations
1) Backup application writes data to ProtecTIER as it would to tape
2) Data goes through HyperFactor deduplication engine
3) Only unique data is stored; existing duplicate data is referenced
4) When a data object expires or is overwritten, references are removed
5) Free space is reclaimed and reused
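The store, reference, expire and reclaim cycle described above can be sketched with a toy reference-counted chunk store. This is not HyperFactor: the deck describes ProtecTIER as not hash based, whereas this sketch deliberately uses fixed-size chunks and SHA-256 digests because they are the simplest way to show the bookkeeping. The class and method names are hypothetical.

```python
import hashlib


class DedupStore:
    """Toy reference-counted chunk store (hash-based, unlike HyperFactor)."""

    def __init__(self, chunk_size=4):
        self.chunk_size = chunk_size
        self.chunks = {}    # digest -> chunk bytes (the unique data)
        self.refcount = {}  # digest -> number of references to that chunk
        self.objects = {}   # object name -> ordered list of digests

    def write(self, name, data):
        if name in self.objects:  # overwrite: drop the old references first
            self.expire(name)
        digests = []
        for i in range(0, len(data), self.chunk_size):
            chunk = data[i:i + self.chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            if digest not in self.chunks:  # only unique data is stored
                self.chunks[digest] = chunk
                self.refcount[digest] = 0
            self.refcount[digest] += 1     # duplicates are only referenced
            digests.append(digest)
        self.objects[name] = digests

    def read(self, name):
        return b"".join(self.chunks[d] for d in self.objects[name])

    def expire(self, name):
        for digest in self.objects.pop(name):
            self.refcount[digest] -= 1
            if self.refcount[digest] == 0:  # no references left: reclaim space
                del self.chunks[digest]
                del self.refcount[digest]

    def stored_bytes(self):
        return sum(len(c) for c in self.chunks.values())
```

Writing two 12-byte backups that share their first 8 bytes stores 16 bytes rather than 24. Expiring the first backup frees only the chunk that nothing else references, and the second backup still reads back intact.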
Production Customers Deployment Results
Significantly Reduces Replication Bandwidth
Deduplication enables large amounts of data to be replicated with significantly less bandwidth.
[Figure: at the Primary Site, Backup Servers write to a ProtecTIER Gateway, which presents a represented capacity much larger than its physical capacity. The gateway replicates over an IP-based WAN link to a ProtecTIER Gateway with its own physical capacity and Backup Server at the Secondary Site.]
Virtual cartridges can be cloned to tape at the DR site (tape library).
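The bandwidth effect can be illustrated with simple arithmetic: if only deduplicated data crosses the WAN, transfer time shrinks by the deduplication ratio. The function below is a back-of-the-envelope sketch; the function name, the 20:1 ratio and the 100 Mbit/s link are assumptions for the example, not figures from the deck, and protocol overhead is ignored.

```python
def replication_hours(represented_tb, dedup_ratio, link_mbit_per_s):
    """Hours needed to send the deduplicated form of the data over the link."""
    physical_bytes = represented_tb * 1e12 / dedup_ratio
    seconds = physical_bytes * 8 / (link_mbit_per_s * 1e6)
    return seconds / 3600


# Hypothetical example: 10 TB of backup data over a 100 Mbit/s WAN link.
without_dedup = replication_hours(10, dedup_ratio=1, link_mbit_per_s=100)
with_dedup = replication_hours(10, dedup_ratio=20, link_mbit_per_s=100)
```

Under these assumptions the undeduplicated transfer takes roughly 222 hours, while the 20:1 case takes about 11 hours on the same link.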
IBM TS7650 ProtecTIER® Deduplication Family
TS7650G Gateways
- Active-Active Cluster: up to 1000 MB/sec, 1 PB useable
- Single Node: up to 500 MB/sec, 1 PB useable
TS7650 Appliance
- Active-Active Cluster: up to 500 MB/sec, 36 TB useable
- Up to 500 MB/sec, 36 TB useable
- Up to 250 MB/sec, 18 TB useable
- Up to 100 MB/sec, 7 TB useable
[Figure: the models are arranged along an axis of scalable capacity and performance, with positioning labels running from "Good Performance, Highly Scalable, Low cost" through "Better Performance, Larger Capacity, Scalable" and "High Performance, High Capacity, Flexible Storage" to "Highest Performance, Largest Capacity, High Availability".]
Deduplication Market at a Glance
[Table: comparison of ProtecTIER with HyperFactor, DD880, DXi7500, DeltaStor, S2100-ES2 and VTL 700 by dedupe technology, deduplication performance and resource utilization for a 100TB repository. Entries for the competing products are listed without per-product attribution. The slide refers to Notes (1) to (11).]
ProtecTIER with HyperFactor
- DEDUPE TECHNOLOGY: not hash based; inline deduplication
- Deduplication PERFORMANCE: single node performance 500 MB/s; dual node cluster performance 1000 MB/s
- RESOURCE UTILIZATION for a 100TB repository: only 4GB RAM needed; no disk staging area required
Entries shown for the other products
- DEDUPE TECHNOLOGY: RockSoft hash-based, SIR hash-based or byte-level diff comparison; block level or file level; potential hash collision (flagged); inline or post process (flagged)
- Deduplication PERFORMANCE: 130 MB/s, 160 MB/s, 188 MB/s and 300 MB/s (flagged); clustering not available, or clustering with global dedupe not available (flagged)
- RESOURCE UTILIZATION for a 100TB repository: staging area larger than the size of the largest full backup, or larger than twice the size of the largest full backup (flagged); no disk staging area required; over 300 GB of RAM (flagged); 24GB of RAM
Deduplication Market at a Glance
[Table: comparison of ProtecTIER with HyperFactor, DD880, VTL 700, S2100-ES2, DXi7500 and DeltaStor by product stability and capacity-scalability. Entries for the competing products are listed without per-product attribution. The slide refers to Notes (12) to (15).]
ProtecTIER with HyperFactor
- PRODUCT STABILITY: IBM in business for nearly 100 years; ProtecTIER in production since 2006; over 25 PB in production
- CAPACITY-SCALABILITY: single system can scale to 1PB capacity; up to 16 virtual tape libraries; up to 512 virtual tape drives; up to 512,000 virtual tape cartridges
- MEETS ENTERPRISE REQUIREMENTS? YES
Entries shown for the other products
- PRODUCT STABILITY: acquired by EMC; in production since 2006; many small systems in production; over $400 million in debt (flagged); small struggling company (flagged); acquisition or failure imminent (flagged); GA October 2008 and GA May 2008 (flagged); very few small customers (flagged); almost no deduplication in production (flagged); post process (flagged)
- CAPACITY-SCALABILITY: 58TB maximum useable capacity (flagged); limits not published (flagged); limited by rapid hash table growth (flagged); limited by huge storage requirements (flagged)
- Virtual tape libraries: up to 64, up to 128, up to 192
- Virtual tape drives: up to 160, up to 192, up to 1024
- Virtual cartridges: up to 64,000, up to 130,000, up to 5.3 million
- MEETS ENTERPRISE REQUIREMENTS? NO (for each of the other products shown)
Introducing IBM Information Archive Next Generation Information Retention Solution
- Universal, scalable, and secure storage repository for structured and unstructured information, compliant or non-compliant
- Integrated Archive Appliance – combines the best of IBM Software, Hardware & Services
- Protects data by enforcing the industry’s most stringent information retention laws
- Highly versatile, highly scalable information retention solution for mid-size and enterprise organizations
Multiple Archive Collections, Multiple Protection Levels
- A single Information Archive can be partitioned into 3 archive collections
- Collections are accessible via Network File System (NFS) or System Storage Archive Manager (SSAM) protocols, or a combination of both
- Each collection can be customized to support different protection levels
- Supports multiple ingest and input models including custom applications
[Figure: an ECM Archive Repository, Users and Applications, and Custom Applications connect over the LAN through NFS and SSAM interfaces (One Namespace) to three disk-backed collections (Collection 1 to 3, NAS or SSAM, clustered) inside the IBM Information Archive, with tape behind them.]
Customizable Protection Levels
Most Flexible and Most Comprehensive Data Retention Policies
- Basic Protection enables the greatest flexibility for managing an organization’s information retention needs
- Intermediate Protection allows IT administrators to increase and decrease retention periods as needed, but information deletion is only allowed after the retention period has expired
- Maximum Protection helps IT administrators manage information with strict business, legal or regulatory retention needs
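The three protection levels can be read as rules on two operations: deleting a document and changing its retention period. The sketch below encodes only the Intermediate rule exactly as stated above. The behavior shown for Basic (deletion at any time) and Maximum (retention may only be extended) is an assumption made for illustration, and all names are hypothetical rather than part of any IBM Information Archive interface.

```python
from dataclasses import dataclass
from enum import Enum


class Level(Enum):
    BASIC = "basic"
    INTERMEDIATE = "intermediate"
    MAXIMUM = "maximum"


@dataclass
class Document:
    name: str
    retain_until_day: int  # day number on which the retention period expires


def can_delete(level, doc, today):
    """Intermediate: deletion only after the retention period has expired."""
    if level is Level.BASIC:
        return True  # assumption: Basic permits deletion at any time
    return today >= doc.retain_until_day  # Maximum assumed to follow the same rule


def set_retention(level, doc, new_day):
    """Intermediate: the retention period may be increased or decreased."""
    if level is Level.MAXIMUM and new_day < doc.retain_until_day:
        # assumption: Maximum only lets the retention period grow
        raise PermissionError("retention cannot be shortened at Maximum level")
    doc.retain_until_day = new_day
```

Under these assumptions an administrator at Intermediate level cannot delete a document on day 50 if it is retained until day 100, but can first shorten the retention period to day 40 and then delete it. At Maximum level the shortening step is refused.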
Advanced Data Protection Features
Multiple Protection Levels
Basic to maximum protection levels address all possible data retention requirements
Enhanced Disaster Recovery
Advance Copy Services increase the availability of archived documents and prevent data loss in the event of a disaster
Redundancy
Clustered nodes & Redundant Array of Independent Disks (RAID) 6 to maintain data integrity even in the event of two disk failures.
Advanced Data Protection Features
Encryption
Encryption can provide added security for data storage and remote data transmission
Enhanced Tamper Protection
Patent-pending feature eliminates root access
Shredding
The destruction of deleted data to make it difficult to discover or reconstruct that data later.
Data Optimization Features
Deduplication
Single instance storage of data
Compression
Compression can, on average, reduce the data’s size by 60 - 90%
Hierarchical Storage Management
Automatically distributes and manages information on disk, tape, or both
?
Thank you!