Architecture
Comprehensive guide to HBase architecture including client-server model, regions, WAL, compaction, and advanced features.
Resources
- More information about the design and implementation can be found in the JIRA issue HBASE-10070.
- The HBaseCon 2014 talk "HBase Read High Availability Using Timeline-Consistent Region Replicas" also contains some details and slides.
In this section:
Overview
Introduction to HBase as a NoSQL distributed database, key features, scalability characteristics, and when to use HBase.
Catalog Tables
Understanding hbase:meta catalog table structure, location tracking, and how HBase maintains region metadata.
Client
HBase client architecture, connection management, metadata caching, and client-side configuration for optimal performance.
Client Request Filters
Using filters with Get and Scan operations to efficiently query HBase data, including comparison, column, row, and utility filters.
Master
HBase Master server responsibilities including RegionServer monitoring, metadata operations, load balancing, and failover behavior.
RegionServer
HBase RegionServer implementation, interfaces, read/write paths, block cache, memstore management, and performance tuning.
Regions
Understanding HBase regions, stores, memstore, write-ahead log (WAL), compaction, splits, and region management strategies.
Bulk Loading
Efficient methods for loading large datasets into HBase using MapReduce to generate HFiles and directly load them into the cluster.
HDFS
How HBase leverages HDFS for distributed storage, including NameNode and DataNode architecture and file replication.
Timeline-consistent High Available Reads
Using region replicas to achieve high availability for reads with timeline consistency, reducing read unavailability during failures.
Storing Medium-sized Objects (MOB)
Optimized storage and handling of medium-sized objects (100 KB to 10 MB) in HBase using the MOB feature for improved performance.
Scan Over Snapshot
Using TableSnapshotScanner to scan HBase snapshots directly from HDFS, bypassing RegionServers for better performance.