1. Start HBase
cd <hbase_home>/bin
./start-hbase.sh

2. Start the HBase shell
# list HBase's files in HDFS
hadoop fs -ls /hbase

# start the shell
hbase shell

# Run a command to verify that the cluster is actually running
list

3. Log configuration
Change the default log location by editing <hbase_home>/conf/hbase-env.sh:
export HBASE_LOG_DIR=/new/location/logs
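After a restart, the logs should appear under the new directory. A quick check (a sketch; /new/location/logs is the placeholder path from above, and the file name pattern is the usual hbase-<user>-<role>-<host>.log):
ls /new/location/logs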
4. Web management
HBase comes with web-based management:
- http://localhost:60010
5. Ports
Both the Master and Region servers run a web server:
- Browsing the Master UI will lead you to the region servers
- Region servers run on port 60030
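A quick command-line check that both UIs are responding (a sketch; it only fetches the landing pages and prints the HTTP status codes):
curl -s -o /dev/null -w "%{http_code}\n" http://localhost:60010/
curl -s -o /dev/null -w "%{http_code}\n" http://localhost:60030/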
6. Basic commands
Quote all names:
- Table and column names must be quoted
- Single quotes for text:
  hbase> get 't1', 'myRowId'
- Double quotes for binary; use the hexadecimal representation of the binary value:
  hbase> get 't1', "key\x03\x3f\xcd"

Display the cluster's status via the status command:
- hbase> status
- hbase> status 'detailed'
Similar information can be found on the HBase web management console:
- http://localhost:60010
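The status command also takes a third format, sitting between the default summary and 'detailed' (a sketch; 'simple' is a standard option of the shell's status command):
hbase> status 'simple'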
7. Create a table
Create a table called 'Blog' with the following schema:
- 2 column families
- 'info' with 3 columns: 'title', 'author', and 'date'
- 'content' with 1 column: 'post'

First create the table with its column families:
create 'Blog', {NAME=>'info'}, {NAME=>'content'}

Then add data. Note that HBase is a column-oriented database keyed by row key: data is written cell by cell, and every put must specify the row key. Use the put command:
hbase> put 'table', 'row_id', 'family:column', 'value'

Example:
put 'Blog', 'Michelle-001', 'info:title', 'Michelle'
put 'Blog', 'Matt-001', 'info:author', 'Matt123'
put 'Blog', 'Matt-001', 'info:date', '2009.05.01'
put 'Blog', 'Matt-001', 'content:post', 'here is content'

Columns can be added on the fly, for example:
put 'Blog', 'Matt-001', 'content:news', 'news is new column'
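To verify the schema that create (and any later changes) produced, the shell's describe command prints each column family and its settings:
describe 'Blog'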
8. Query data by row id
# count the rows in the table
count 'Blog'
count 'Blog', {INTERVAL=>2}
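INTERVAL controls how often the running count is printed (every 2 rows above). For large tables, count also accepts a CACHE option so each round trip fetches more rows; a sketch:
count 'Blog', CACHE => 1000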
# get row data
get 'table', 'row_id'
get 'Blog', 'Matt-001'
get 'Blog', 'Matt-001', {COLUMN=>['info:author','content:post']}

# at a specific timestamp
get 'Blog', 'Michelle-004', {COLUMN=>['info:author','content:post'], TIMESTAMP=>1326061625690}

# specific versions
get 'Blog', 'Matt-001', {VERSIONS=>1}
get 'Blog', 'Matt-001', {COLUMN=>'info:date', VERSIONS=>1}
get 'Blog', 'Matt-001', {COLUMN=>'info:date', VERSIONS=>2}
get 'Blog', 'Matt-001', {COLUMN=>'info:date'}

9. Query data over a range with scan. Note that rows are stored and returned sorted by row key, and scan ranges are expressed in row-key order.
Limit what columns are retrieved:
- hbase> scan 'table', {COLUMNS=>['col1', 'col2']}
Scan a time range:
- hbase> scan 'table', {TIMERANGE => [1303, 13036]}
Limit results with a filter (more about filters later):
- hbase> scan 'Blog', {FILTER => org.apache.hadoop.hbase.filter.ColumnPaginationFilter.new(1, 0)}

scan 'Blog', {COLUMNS=>'info:title'}
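A scan can also be capped at a fixed number of rows with LIMIT, a standard shell scan option:
scan 'Blog', {COLUMNS=>'info:title', LIMIT=>2}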
# start at 'John' and stop before 'Matt' (STOPROW is exclusive)
scan 'Blog', {COLUMNS=>'info:title', STARTROW=>'John', STOPROW=>'Matt'}
scan 'Blog', {COLUMNS=>'info:title', STOPROW=>'Matt'}

10. Versions
put 'Blog', 'Michelle-004', 'info:date', '1990.07.06'
put 'Blog', 'Michelle-004', 'info:date', '1990.07.07'
put 'Blog', 'Michelle-004', 'info:date', '1990.07.08'
put 'Blog', 'Michelle-004', 'info:date', '1990.07.09'
get 'Blog', 'Michelle-004', {COLUMN=>'info:date', VERSIONS=>3}
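How many versions a get can return is capped by the VERSIONS setting on the column family. A sketch of raising it on 'info' (assumption: older shells require the table to be disabled for schema changes, as shown; newer releases may allow it online):
disable 'Blog'
alter 'Blog', {NAME=>'info', VERSIONS=>5}
enable 'Blog'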
11. Delete records
delete 'Blog', 'Bob-003', 'info:date'

12. Drop table
- Must disable the table before dropping it
- Disabling puts the table "offline" so schema-based operations can be performed
- hbase> disable 'table_name'
- hbase> drop 'table_name'
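Two related cleanup commands, shown as a sketch against the example table: deleteall removes an entire row, and truncate empties a table while keeping its schema (internally it disables, drops, and recreates the table):
deleteall 'Blog', 'Bob-003'
truncate 'Blog'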