Apache HBase

FAQ

Frequently asked questions about Apache HBase.

General

When should I use HBase?

See arch.overview in the Architecture chapter.

Does HBase support SQL?

Not really. SQL-ish support for HBase via Hive is in development; however, Hive is based on MapReduce, which is not generally suitable for low-latency requests. See the datamodel section for examples of the HBase client.

How can I find examples of NoSQL/HBase?

See the link to the BigTable paper in other.info, as well as the other papers.

What is the history of HBase?

See hbase.history.

Why are large cells not recommended for HBase?

Large cells don't fit well into HBase's approach to buffering data. First, large cells bypass the MemStoreLAB when they are written. Second, they cannot be cached in the L2 block cache during read operations, so HBase has to allocate on-heap memory for them each time. This can put significant pressure on the garbage collector within the RegionServer process.
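
One way to act on this is to detect oversized values on the client before writing them, and optionally split them into smaller pieces that can be stored as separate cells. The following is a minimal sketch, not an HBase API: the class name LargeCellChunker and the 256 KB threshold are assumptions chosen for illustration, so pick a limit that suits your own cluster.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical client-side helper; not part of the HBase API.
final class LargeCellChunker {
    // Assumed threshold, for illustration only.
    static final int MAX_CELL_BYTES = 256 * 1024;

    // Returns true when a value exceeds the assumed threshold.
    static boolean isLarge(byte[] value) {
        return value.length > MAX_CELL_BYTES;
    }

    // Splits a value into consecutive pieces of at most chunkSize bytes.
    static List<byte[]> chunk(byte[] value, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<byte[]> chunks = new ArrayList<>();
        for (int start = 0; start < value.length; start += chunkSize) {
            int end = Math.min(value.length, start + chunkSize);
            chunks.add(Arrays.copyOfRange(value, start, end));
        }
        return chunks;
    }

    public static void main(String[] args) {
        byte[] value = new byte[600 * 1024];
        System.out.println(isLarge(value));
        System.out.println(chunk(value, MAX_CELL_BYTES).size());
    }
}
```

Each piece could then be written under its own column qualifier or row. How the pieces are named and reassembled is left to the application.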

Upgrading

How do I upgrade Maven-managed projects from HBase 0.94 to HBase 0.96+?

In HBase 0.96, the project moved to a modular structure. Adjust your project's dependencies to rely upon the hbase-client module or another module as appropriate, rather than a single JAR. You can model your Maven dependency after one of the following, depending on your targeted version of HBase. See Section 3.5, "Upgrading from 0.94.x to 0.96.x" or Section 3.3, "Upgrading from 0.96.x to 0.98.x" for more information.

Maven Dependency for HBase 0.98

<dependency>
  <groupId>org.apache.hbase</groupId>
  <artifactId>hbase-client</artifactId>
  <version>0.98.5-hadoop2</version>
</dependency>

Maven Dependency for HBase 0.96

<dependency>
  <groupId>org.apache.hbase</groupId>
  <artifactId>hbase-client</artifactId>
  <version>0.96.2-hadoop2</version>
</dependency>

Maven Dependency for HBase 0.94

<dependency>
  <groupId>org.apache.hbase</groupId>
  <artifactId>hbase</artifactId>
  <version>0.94.3</version>
</dependency>

Architecture

How does HBase handle Region-RegionServer assignment and locality?

See regions.arch.

Configuration

How can I get started with my first cluster?

See quickstart.

Where can I learn about the rest of the configuration options?

See configuration.

Schema Design / Data Access

How should I design my schema in HBase?

See datamodel and schema.

How can I store (fill in the blank) in HBase?

See supported.datatypes.

How can I handle secondary indexes in HBase?

See secondary.indexes.

Can I change a table's rowkeys?

This is a very common question. You can't. See changing.rowkeys.
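
Because a rowkey cannot be modified in place, the usual workaround is to write the row under the new key and then delete the old one. The sketch below models a table as a sorted in-memory map from rowkey to row contents. That model, the class name RowkeyMove, and the move method are all assumptions for illustration; the sketch does not use the HBase client API.

```java
import java.util.Map;
import java.util.TreeMap;

// Toy model only: a "table" is a sorted map from rowkey to row contents.
final class RowkeyMove {
    // "Changes" a rowkey by re-inserting the row under newKey and removing oldKey.
    // Returns false if oldKey is absent or newKey is already in use.
    static boolean move(TreeMap<String, Map<String, String>> table,
                        String oldKey, String newKey) {
        if (!table.containsKey(oldKey) || table.containsKey(newKey)) {
            return false;
        }
        table.put(newKey, table.remove(oldKey));
        return true;
    }
}
```

In a real cluster the copy and the delete are separate operations, so readers may briefly see both rows or neither, depending on the order chosen.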

What APIs does HBase support?

See datamodel, architecture.client, and external_apis.

MapReduce

How can I use MapReduce with HBase?

See mapreduce.

Performance and Troubleshooting

How can I improve HBase cluster performance?

See performance.

How can I troubleshoot my HBase cluster?

See trouble.

Amazon EC2

I am running HBase on Amazon EC2 and...

EC2 issues are a special case. See trouble.ec2 and perf.ec2.

Operations

How do I manage my HBase cluster?

See ops_mgt.

How do I back up my HBase cluster?

See ops.backup.

HBase in Action

Where can I find interesting videos and presentations on HBase?

See other.info.
