What is the common way to run an HBase job?

I'm using the HBase client API to connect to a remote cluster and perform some operations. This project will certainly require the hbase and hadoop-core jars. My question is whether I should use the 'java' command and handle all the dependencies myself (using the Maven Shade plugin, or by setting the classpath environment variable), or whether there is a utility command that handles all of this for me.

Take a map-reduce job as an example. Typically the main class extends Configured and implements Tool. The job is executed with the 'hadoop jar' command, so the environment and the hadoop-core dependency are already at hand. This approach also handles the common command-line parsing for me, and I can easily get a Configuration instance via 'this.getConf()'.
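For reference, the pattern described above looks roughly like this (a minimal sketch; the class name MyJob and the job name are placeholders, and the mapper/reducer setup is elided):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class MyJob extends Configured implements Tool {
    @Override
    public int run(String[] args) throws Exception {
        // 'hadoop jar' plus ToolRunner have already populated this
        // Configuration and consumed the generic options (-D, -conf, ...).
        Configuration conf = this.getConf();
        Job job = Job.getInstance(conf, "my-job");
        job.setJarByClass(MyJob.class);
        // ... set mapper/reducer classes and input/output paths from args ...
        return job.waitForCompletion(true) ? 0 : 1;
    }

    public static void main(String[] args) throws Exception {
        System.exit(ToolRunner.run(new MyJob(), args));
    }
}
```

It is then launched as `hadoop jar myjob.jar MyJob <args>`, with the Hadoop classpath supplied by the wrapper script.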

I'm wondering whether HBase provides the same kind of utility command?

asked Feb 12 '14 at 05:02

1 Answer

You can use HBase in two modes. One is as a source/target in a map/reduce job, in which case you invoke it as you would any other map/reduce job. The second is more like a regular database, in which case you use the HBase client API and invoke your program like any other regular Java program.
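For the second mode, a minimal sketch of the client-API route might look as follows (the table, row, and column names are placeholders; this assumes the newer Connection/Table API rather than the older HTable one, and that an hbase-site.xml pointing at the cluster is on the classpath):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class HBaseClientExample {
    public static void main(String[] args) throws Exception {
        // Reads hbase-site.xml from the classpath for the cluster address.
        Configuration conf = HBaseConfiguration.create();
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Table table = conn.getTable(TableName.valueOf("my_table"))) {
            Get get = new Get(Bytes.toBytes("row1"));
            Result result = table.get(get);
            byte[] value = result.getValue(Bytes.toBytes("cf"),
                                           Bytes.toBytes("qualifier"));
            System.out.println(value == null ? "(no value)"
                                             : Bytes.toStringBinary(value));
        }
    }
}
```

This is run with a plain `java -cp myapp-shaded.jar HBaseClientExample`, i.e. assembling the classpath (for example via a shaded jar) is your responsibility, which is exactly the trade-off the question describes.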

answered Feb 12 '14 at 08:02

Exactly. Why would HBase provide duplicate functionality? Use your build tool to make a shaded jar that contains all of your dependencies, and use the hadoop command for your M/R jobs. - David

This is not duplicate functionality; these are different access patterns. One is random read/write and the other is batch processing, and HBase accommodates both. - Arnon Rotem-Gal-Oz
