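HBase query commands: filtering by condition

To make testing easier, create a table first:
hbase(main):001:0> create 'student','c1'
If no namespace is given, the table is created in the default namespace.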

List the existing namespaces:
hbase(main):001:0> list_namespace
View all the data in the table:
hbase(main):002:0> scan 'default:student'
Insert some test data:
put 'student','1001','c1:id','1001'
put 'student','1002','c1:id','1002'
put 'student','1003','c1:id','1003'
put 'student','1004','c1:id','1004'
put 'student','1005','c1:id','1005'
Return only one row:
hbase(main):025:0> scan 'student',LIMIT=>1
ROW      COLUMN+CELL
 1001    column=c1:id, timestamp=1658911986336, value=1001
Count the total number of rows in the table:
count 'student'
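Query data by write timestamp (the bounds are epoch milliseconds, the same unit as the timestamp field in the scan output):
scan 'student', {COLUMN => 'c1', TIMERANGE => [1658827317000,1658913717000]}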
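Working out those millisecond bounds by hand is error-prone, so here is a small Python sketch for producing them. The helper names (to_epoch_ms, timerange) are made up for illustration and are not part of HBase; the sketch also assumes you want the bounds in UTC. As far as I know HBase treats TIMERANGE as start-inclusive and end-exclusive, so a cell written exactly at the upper bound is not returned.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_ms(dt: datetime) -> int:
    """Convert a timezone-aware datetime to epoch milliseconds (integer math only)."""
    if dt.tzinfo is None:
        raise ValueError("pass a timezone-aware datetime to avoid local-time surprises")
    return (dt - EPOCH) // timedelta(milliseconds=1)


def timerange(start: datetime, end: datetime) -> tuple:
    """Return (lower, upper) bounds for TIMERANGE; hypothetical helper, not an HBase API."""
    lower, upper = to_epoch_ms(start), to_epoch_ms(end)
    if lower >= upper:
        raise ValueError("start must be earlier than end")
    return lower, upper


# The 24-hour window used in the scan above, expressed in UTC.
lo, hi = timerange(
    datetime(2022, 7, 26, 9, 21, 57, tzinfo=timezone.utc),
    datetime(2022, 7, 27, 9, 21, 57, tzinfo=timezone.utc),
)
print(f"scan 'student', {{COLUMN => 'c1', TIMERANGE => [{lo}, {hi}]}}")
```

The printed line can be pasted straight into the HBase shell.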
Find records whose value is 1002:
hbase(main):004:0> scan 'student',FILTER=>"ValueFilter(=,'binary:1002')"
ROW      COLUMN+CELL
 1002    column=c1:id, timestamp=1658911989184, value=1002
1 row(s) in 0.1060 seconds
Find cells in column c1:id whose value is 1002:
hbase(main):006:0> scan 'student',COLUMNS => 'c1:id',FILTER=>"ValueFilter(=,'binary:1002')"
ROW      COLUMN+CELL
 1002    column=c1:id, timestamp=1658911989184, value=1002
1 row(s) in 0.0340 seconds
Find records whose value contains 100, much like a fuzzy match in SQL:
hbase(main):007:0> scan 'student',FILTER=>"ValueFilter(=,'substring:100')"
ROW      COLUMN+CELL
 1001    column=c1:id, timestamp=1658911986336, value=1001
 1002    column=c1:id, timestamp=1658911989184, value=1002
 1003    column=c1:id, timestamp=1658911989217, value=1003
 1004    column=c1:id, timestamp=1658911989243, value=1004
 1005    column=c1:id, timestamp=1658911989788, value=1005
5 row(s) in 0.0470 seconds
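The difference between the binary: and substring: comparators is easy to mix up, so here is a rough in-memory model in Python. It is only a sketch of the matching rules, not HBase code: CELLS, binary_eq, substring and value_filter are names invented for this example, and the case-insensitive behaviour of the substring comparator is my understanding of HBase's SubstringComparator.

```python
# In-memory stand-in for the 'student' table at this point:
# one (rowkey, column, value) tuple per cell, all as bytes like in HBase.
CELLS = [
    (rk.encode(), b"c1:id", rk.encode())
    for rk in ("1001", "1002", "1003", "1004", "1005")
]


def binary_eq(operand: bytes):
    """Model of ValueFilter(=,'binary:<operand>'): exact byte-for-byte equality."""
    return lambda value: value == operand


def substring(operand: str):
    """Model of ValueFilter(=,'substring:<operand>'): case-insensitive containment."""
    needle = operand.lower()
    return lambda value: needle in value.decode("utf-8", errors="replace").lower()


def value_filter(cells, predicate):
    """Keep only the cells whose value satisfies the predicate."""
    return [cell for cell in cells if predicate(cell[2])]


exact = value_filter(CELLS, binary_eq(b"1002"))
fuzzy = value_filter(CELLS, substring("100"))
print([row.decode() for row, _, _ in exact])  # ['1002']
print([row.decode() for row, _, _ in fuzzy])  # ['1001', '1002', '1003', '1004', '1005']
```

The two printed lists line up with the 1 row and 5 rows returned by the two scans above.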
To make further column queries easier, add one more column:
put 'student','1001','c1:sex','1'
put 'student','1002','c1:sex','2'
put 'student','1003','c1:sex','1'
put 'student','1004','c1:sex','2'
put 'student','1005','c1:sex','1'
hbase(main):015:0* scan 'student'
ROW      COLUMN+CELL
 1001    column=c1:id, timestamp=1658911986336, value=1001
 1001    column=c1:sex, timestamp=1658914149713, value=1
 1002    column=c1:id, timestamp=1658911989184, value=1002
 1002    column=c1:sex, timestamp=1658914152500, value=2
 1003    column=c1:id, timestamp=1658911989217, value=1003
 1003    column=c1:sex, timestamp=1658914152535, value=1
 1004    column=c1:id, timestamp=1658911989243, value=1004
 1004    column=c1:sex, timestamp=1658914152563, value=2
 1005    column=c1:id, timestamp=1658911989788, value=1005
 1005    column=c1:sex, timestamp=1658914153242, value=1
5 row(s) in 0.0390 seconds
Find cells whose column name starts with id:
hbase(main):019:0> scan 'student',FILTER=>"ColumnPrefixFilter('id')"
ROW      COLUMN+CELL
 1001    column=c1:id, timestamp=1658911986336, value=1001
 1002    column=c1:id, timestamp=1658911989184, value=1002
 1003    column=c1:id, timestamp=1658911989217, value=1003
 1004    column=c1:id, timestamp=1658911989243, value=1004
 1005    column=c1:id, timestamp=1658911989788, value=1005
5 row(s) in 0.0270 seconds
Query conditions can be combined. For example:

Find cells whose column name starts with id and whose value is 1003:
hbase(main):020:0> scan 'student',FILTER=>"ColumnPrefixFilter('id') AND ValueFilter(=,'binary:1003')"
ROW      COLUMN+CELL
 1003    column=c1:id, timestamp=1658911989217, value=1003
1 row(s) in 0.0550 seconds
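To see why combining filters with AND narrows the result cell by cell, the following Python sketch models the two filters as plain predicates over the test data. None of these names (CELLS, column_prefix, value_equals, both, scan) come from HBase; they are stand-ins for illustration, and the model ignores timestamps and versions.

```python
# In-memory copy of the test data: (rowkey, column, value) per cell.
CELLS = []
for rowkey, sex in zip(("1001", "1002", "1003", "1004", "1005"), "12121"):
    CELLS.append((rowkey, "c1:id", rowkey))
    CELLS.append((rowkey, "c1:sex", sex))


def column_prefix(prefix):
    """Model of ColumnPrefixFilter: tests the qualifier, the part after 'family:'."""
    return lambda cell: cell[1].split(":", 1)[1].startswith(prefix)


def value_equals(expected):
    """Model of ValueFilter(=,'binary:...'): tests the value of every cell it sees."""
    return lambda cell: cell[2] == expected


def both(*filters):
    """Model of AND: a cell survives only if every filter accepts it."""
    return lambda cell: all(f(cell) for f in filters)


def scan(cells, keep):
    """Return the cells accepted by the predicate, in table order."""
    return [cell for cell in cells if keep(cell)]


combined = scan(CELLS, both(column_prefix("id"), value_equals("1003")))
print(combined)  # [('1003', 'c1:id', '1003')]

# A value filter on its own looks at every column, so '1' matches the sex cells.
value_only = scan(CELLS, value_equals("1"))
print([(row, col) for row, col, _ in value_only])
```

The second print shows why pairing a value filter with a column restriction matters once a table has more than one column: on its own, a value test for '1' picks up the c1:sex cells of rows 1001, 1003 and 1005.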
Find rows whose rowkey starts with 100:
hbase(main):021:0> scan 'student',FILTER=>"PrefixFilter('100')"
ROW      COLUMN+CELL
 1001    column=c1:id, timestamp=1658911986336, value=1001
 1001    column=c1:sex, timestamp=1658914149713, value=1
 1002    column=c1:id, timestamp=1658911989184, value=1002
 1002    column=c1:sex, timestamp=1658914152500, value=2
 1003    column=c1:id, timestamp=1658911989217, value=1003
 1003    column=c1:sex, timestamp=1658914152535, value=1
 1004    column=c1:id, timestamp=1658911989243, value=1004
 1004    column=c1:sex, timestamp=1658914152563, value=2
 1005    column=c1:id, timestamp=1658911989788, value=1005
 1005    column=c1:sex, timestamp=1658914153242, value=1
Find rows whose rowkey starts with 100 without returning the cell values (KeyOnlyFilter keeps the keys and leaves each value empty):
hbase(main):022:0> scan 'student',FILTER=>"PrefixFilter('100') AND KeyOnlyFilter()"
ROW      COLUMN+CELL
 1001    column=c1:id, timestamp=1658911986336, value=
 1001    column=c1:sex, timestamp=1658914149713, value=
 1002    column=c1:id, timestamp=1658911989184, value=
 1002    column=c1:sex, timestamp=1658914152500, value=
 1003    column=c1:id, timestamp=1658911989217, value=
 1003    column=c1:sex, timestamp=1658914152535, value=
 1004    column=c1:id, timestamp=1658911989243, value=
 1004    column=c1:sex, timestamp=1658914152563, value=
 1005    column=c1:id, timestamp=1658911989788, value=
 1005    column=c1:sex, timestamp=1658914153242, value=
5 row(s) in 0.0670 seconds
Scan three rows starting from a specific row:
hbase(main):006:0> scan 'student',{STARTROW=>'1002',LIMIT=>3}
ROW      COLUMN+CELL
 1002    column=c1:id, timestamp=1658911989184, value=1002
 1002    column=c1:sex, timestamp=1658914152500, value=2
 1003    column=c1:id, timestamp=1658911989217, value=1003
 1003    column=c1:sex, timestamp=1658914152535, value=1
 1004    column=c1:id, timestamp=1658911989243, value=1004
 1004    column=c1:sex, timestamp=1658914152563, value=2
3 row(s) in 0.0300 seconds
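Two details of that scan are worth spelling out: LIMIT counts rows rather than cells (3 rows came back as 6 cells), and rowkeys are ordered as bytes, not as numbers. The Python sketch below is a toy model of that behaviour under those assumptions; CELLS and scan are invented names, and the extra rowkeys '10' and '2' are hypothetical rows that are not in the student table.

```python
# In-memory copy of the test data: (rowkey, column, value) per cell.
CELLS = []
for rowkey, sex in zip(("1001", "1002", "1003", "1004", "1005"), "12121"):
    CELLS.append((rowkey, "c1:id", rowkey))
    CELLS.append((rowkey, "c1:sex", sex))


def scan(cells, start_row=None, limit=None):
    """Toy model of scan with STARTROW and LIMIT; returns [(rowkey, [cells])]."""
    rows = {}
    for rowkey, column, value in cells:
        rows.setdefault(rowkey, []).append((column, value))
    # Rowkeys sort by their raw bytes, which is what makes STARTROW meaningful.
    keys = sorted(rows, key=lambda k: k.encode("utf-8"))
    if start_row is not None:
        keys = [k for k in keys if k.encode("utf-8") >= start_row.encode("utf-8")]
    if limit is not None:
        keys = keys[:limit]  # LIMIT is applied to rows, not to cells
    return [(k, sorted(rows[k])) for k in keys]


result = scan(CELLS, start_row="1002", limit=3)
print([k for k, _ in result])                  # ['1002', '1003', '1004']
print(sum(len(cells) for _, cells in result))  # 6

# Hypothetical extra rowkeys to show byte ordering: '10' < '1001' and '1005' < '2'.
extra = CELLS + [("2", "c1:id", "2"), ("10", "c1:id", "10")]
print([k for k, _ in scan(extra)])
```

The last line prints the keys with '10' first and '2' last, which is why numeric rowkeys are usually zero-padded to a fixed width.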
Get a specific row:
hbase(main):007:0> get 'student','1001'
COLUMN    CELL
 c1:id    timestamp=1658911986336, value=1001
 c1:sex   timestamp=1658914149713, value=1
2 row(s) in 0.0170 seconds
Scans return rows in ascending rowkey order by default; for descending order use REVERSED => TRUE:
scan 'student',{REVERSED => TRUE,LIMIT=>1}
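As a quick sanity check on what a reversed scan should return, here is a toy Python model of the ordering. The function scan_rowkeys and its parameters are invented for this sketch. It also assumes that in a reversed scan a start row acts as the upper bound the scan walks down from, which is my understanding of HBase's behaviour rather than something shown in the commands above.

```python
ROWKEYS = ["1001", "1002", "1003", "1004", "1005"]


def scan_rowkeys(rowkeys, reversed_scan=False, start_row=None, limit=None):
    """Toy model of scan ordering; returns the rowkeys a scan would visit."""
    keys = sorted(rowkeys, key=lambda k: k.encode("utf-8"), reverse=reversed_scan)
    if start_row is not None:
        start = start_row.encode("utf-8")
        if reversed_scan:
            # Assumed: a reversed scan starts at start_row and moves to smaller keys.
            keys = [k for k in keys if k.encode("utf-8") <= start]
        else:
            keys = [k for k in keys if k.encode("utf-8") >= start]
    return keys if limit is None else keys[:limit]


print(scan_rowkeys(ROWKEYS, reversed_scan=True, limit=1))  # ['1005']
print(scan_rowkeys(ROWKEYS, reversed_scan=True, start_row="1003", limit=2))
```

Under this model the command above returns row 1005, the largest rowkey in the table.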
The commands above cover most everyday query needs.
