I was wondering if HBase supports wildcards on RowKey scans. Something similar to:
select * from TABLE where KEY like '%SEARCH_KEY%';
I understand we can use a partial key scan if we know the prefix of the rowkey (and HBase is very efficient with that scan). However, if we don't know the prefix (meaning the search key could be anywhere in the RowKey), then HBase has to run a full table scan, correct?
Also, how can I form such a query in HBase (either in code or through the shell)?
asked Apr 25 '13 at 02:04
You can only do a prefix-based row-key scan.
Say you have data like:
aaa_001 aaa_002 aab_001 aac_001 baa_001 ... zzz_001
Using happybase, you can write code like this to get aaa*:
for key, data in table.scan(row_prefix=b"aaa"):
    print(key)
whereas this code:
for key, data in table.scan(row_prefix=b"aa"):
    print(key)
will get you this:
aaa_001 aaa_002 aab_001 aac_001
So you can do prefix-based matching, but not suffix-based. Hope this is useful.
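To illustrate why the prefix case is cheap and the "anywhere in the key" case is not, here is a minimal sketch in plain Python. It does not talk to HBase at all: it only mimics the fact that row keys are stored in sorted order, using the sample keys above. The names `ROW_KEYS`, `prefix_scan` and `substring_scan` are made up for this example.

```python
from bisect import bisect_left

# Sample row keys from above, kept in sorted order like HBase row keys.
ROW_KEYS = sorted(
    ["aaa_001", "aaa_002", "aab_001", "aac_001", "baa_001", "zzz_001"]
)


def prefix_scan(sorted_keys, prefix):
    """Return keys starting with `prefix`, reading only the matching range."""
    # Binary search jumps straight to the first key >= prefix.
    index = bisect_left(sorted_keys, prefix)
    matches = []
    # Matching keys are contiguous, so stop at the first non-match.
    while index < len(sorted_keys) and sorted_keys[index].startswith(prefix):
        matches.append(sorted_keys[index])
        index += 1
    return matches


def substring_scan(sorted_keys, needle):
    """Return keys containing `needle`; every key has to be inspected."""
    return [key for key in sorted_keys if needle in key]


if __name__ == "__main__":
    print(prefix_scan(ROW_KEYS, "aa"))
    print(substring_scan(ROW_KEYS, "_002"))
```

Keys sharing a prefix sit next to each other, so the scan can seek to the start and stop early. Keys sharing a suffix or an inner substring can be anywhere in the sort order, which is why that kind of match ends up reading the whole table.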
You can use RegexStringComparator along with RowFilter and specify the regex; however, it will end up in a full table scan.
RegexStringComparator comp = new RegexStringComparator("my."); // matches any value containing "my" followed by one more character
SingleColumnValueFilter filter = new SingleColumnValueFilter(cf, column, CompareOp.EQUAL, comp);
scan.setFilter(filter);
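Instead of SingleColumnValueFilter, use a RowFilter so the comparator is applied to the row key rather than to a column value, e.g. new RowFilter(CompareOp.EQUAL, comp).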
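For the happybase side, a rough sketch of the same RowFilter idea is below. It assumes `table` is a happybase `Table` (or anything with a compatible `scan(filter=...)` method) and passes a filter string in the HBase Thrift filter language. The helper names `build_row_key_filter` and `scan_rows_containing` are hypothetical, and quote escaping is deliberately left out. This is still a full table scan; the filter only reduces what is sent back to the client.

```python
def build_row_key_filter(search_key):
    """Build a filter string matching row keys that contain `search_key`."""
    # Assumption: this sketch does not escape quotes, so reject them outright.
    if "'" in search_key:
        raise ValueError("single quotes are not handled in this sketch")
    return "RowFilter(=, 'substring:{}')".format(search_key)


def scan_rows_containing(table, search_key):
    """Return the row keys whose key contains `search_key`.

    `table` is expected to behave like a happybase Table: scan() yields
    (row_key, data) tuples and accepts a `filter` keyword argument.
    """
    row_filter = build_row_key_filter(search_key)
    return [row_key for row_key, _data in table.scan(filter=row_filter)]
```

Usage would look like `scan_rows_containing(table, "SEARCH_KEY")`. The same filter string should also work from the shell: scan 'TABLE', {FILTER => "RowFilter(=, 'substring:SEARCH_KEY')"}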