MySQL permissions
    CREATE USER 'hive'@'%' IDENTIFIED BY 'hive';
    GRANT ALL ON *.* TO 'hive'@'%' IDENTIFIED BY 'hive';
    FLUSH PRIVILEGES;
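To confirm the grant took effect, you can try connecting as the new user. A minimal check (the host and password here match the statements above; adjust for your setup):

    # Connect as the hive user and list its privileges (assumes MySQL on localhost).
    mysql -h localhost -u hive -phive -e "SHOW GRANTS FOR 'hive'@'%';"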
MySQL driver
You can find it in a Maven repository.
Copy the MySQL JDBC driver jar into Hive's lib directory.
    <dependency>
        <groupId>mysql</groupId>
        <artifactId>mysql-connector-java</artifactId>
        <version>5.1.44</version>
    </dependency>
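If you would rather fetch the jar from the command line than through Maven, a sketch like the following works, assuming Maven Central's standard repository layout and that HIVE_HOME points at your Hive installation:

    # Download the connector jar from Maven Central and drop it into Hive's lib directory.
    wget https://repo1.maven.org/maven2/mysql/mysql-connector-java/5.1.44/mysql-connector-java-5.1.44.jar
    cp mysql-connector-java-5.1.44.jar "$HIVE_HOME/lib/"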
Hadoop preparation
    hdfs dfs -mkdir -p /user/hive/warehouse
    hdfs dfs -mkdir -p /tmp/hive
    hdfs dfs -chmod 777 /user/hive/warehouse
    hdfs dfs -chmod 777 /tmp/hive
    hdfs dfs -chmod 777 /tmp
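A quick way to confirm the directories exist with the expected permissions:

    # -d lists the directories themselves rather than their contents.
    hdfs dfs -ls -d /user/hive/warehouse /tmp/hive /tmp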
MySQL-related configuration in hive-site.xml. Note that the '&' characters in the JDBC URL must be escaped as &amp; because hive-site.xml is an XML file.
    <configuration>
        <property>
            <name>javax.jdo.option.ConnectionURL</name>
            <value>jdbc:mysql://localhost/metastore?createDatabaseIfNotExist=true&amp;useUnicode=true&amp;characterEncoding=UTF-8</value>
            <description>metadata is stored in a MySQL server</description>
        </property>
        <property>
            <name>javax.jdo.option.ConnectionDriverName</name>
            <value>com.mysql.jdbc.Driver</value>
            <description>MySQL JDBC driver class</description>
        </property>
        <property>
            <name>javax.jdo.option.ConnectionUserName</name>
            <value>hive</value>
            <description>user name for connecting to mysql server</description>
        </property>
        <property>
            <name>javax.jdo.option.ConnectionPassword</name>
            <value>hive</value>
            <description>password for connecting to mysql server</description>
        </property>
    </configuration>
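Because an unescaped '&' makes the file invalid XML, it is worth validating the file before starting Hive. One option, assuming xmllint is installed:

    # Prints nothing if the file is well-formed; reports a parse error otherwise.
    xmllint --noout "$HIVE_HOME/conf/hive-site.xml"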
Hadoop-related configuration in hive-site.xml. The warehouse and scratch directories must match the HDFS directories created in the preparation step above.
    <property>
        <name>hive.metastore.warehouse.dir</name>
        <value>/user/hive/warehouse</value>
    </property>
    <property>
        <name>hive.exec.scratchdir</name>
        <value>/tmp/hive</value>
    </property>
    <property>
        <name>hive.querylog.location</name>
        <value>/usr/hive/log</value>
    </property>
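Once Hive starts, you can confirm it picked up these values; running set with a property name prints its effective value:

    # Print the effective warehouse and scratch directory settings.
    hive -e "set hive.metastore.warehouse.dir; set hive.exec.scratchdir;"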
Initializing Hive with MySQL as the metadata store
Hive stores its metadata in Derby by default. If you have already initialized it once with Derby, you must delete that metadata before switching to MySQL.
    schematool -dbType mysql -initSchema
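schematool's -info option reports the schema version recorded in the metastore, which is a quick way to confirm the initialization succeeded:

    # Shows the metastore connection URL and schema version.
    schematool -dbType mysql -info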
Initialization only needs to be run once; running it a second time fails with an error:
    Error: Duplicate key name 'PCS_STATS_IDX' (state=42000,code=1061)
    org.apache.hadoop.hive.metastore.HiveMetaException: Schema initialization FAILED! Metastore state would be inconsistent !!
    *** schemaTool failed ***
If this happens, log in to MySQL, drop the metastore database, recreate it, and run the initialization again.
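In shell form, the reset amounts to the following, assuming you have MySQL root access:

    # Drop and recreate the metastore database, then re-run the initialization.
    mysql -u root -p -e "DROP DATABASE metastore; CREATE DATABASE metastore;"
    schematool -dbType mysql -initSchema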
Common errors
ls: cannot access '/data/spark/spark-2.2.0-bin-hadoop2.6/lib/spark-assembly-*.jar': No such file or directory

The root cause: the hive startup script contains the command below, which loads Spark's jars onto Hive's classpath when Spark is present. Since Spark 2.0.0, the single large spark-assembly jar under lib/ has been replaced by many smaller jars under jars/, so the spark-assembly jar can no longer be found. That is the source of the error.
    if [[ -n "$SPARK_HOME" ]]
    then
      sparkAssemblyPath=`ls ${SPARK_HOME}/lib/spark-assembly-*.jar`
      CLASSPATH="${CLASSPATH}:${sparkAssemblyPath}"
    fi
Replace "spark-assembly-*.jar" in that line with "jars/*.jar", as shown below, and the error no longer occurs.
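With that fix applied, the relevant lines of the script become:

    # Load Spark 2.x's jars/ directory instead of the removed assembly jar.
    if [[ -n "$SPARK_HOME" ]]
    then
      sparkAssemblyPath=`ls ${SPARK_HOME}/jars/*.jar`
      CLASSPATH="${CLASSPATH}:${sparkAssemblyPath}"
    fi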