公司网站建设泉州,网站开发工作方向,wordpress tag 数字,discuzq文章目录 数据库和数据仓库的区别Hive安装配置Hive使用方式Hive日志配置 数据库和数据仓库的区别
数据库#xff1a;传统的关系型数据库主要应用在基本的事务处理#xff0c;比如交易#xff0c;支持增删改查数据仓库#xff1a;主要做一些复杂的分析操作#xff0c;侧重… 文章目录 数据库和数据仓库的区别Hive安装配置Hive使用方式Hive日志配置 数据库和数据仓库的区别
数据库传统的关系型数据库主要应用在基本的事务处理比如交易支持增删改查数据仓库主要做一些复杂的分析操作侧重决策支持相对于数据库而言数据仓库分析的数据规模要大的多只支持查询本质区别是OLTP(On-Line-Transaction Processing)和OLAP(On-Line-Analytical Processing)的区别OLTP称为联机事务处理也是面向交易的处理系统它是针对具体的业务在数据库联机的日常操作通常对少数记录进行查询、修改用户关心的是响应时间数据的安全性完整性等问题;OLAP是分析性处理称为联机分析处理一般针对某些主题历史数据进行分析支持管理决策 Hive安装配置
# 解压完之后
[roothadoop04 conf]# mv hive-env.sh.template hive-env.sh
[roothadoop04 conf]# mv hive-default.xml.template hive-site.xml#修改配置
[roothadoop04 conf]# vim hive-env.sh
export JAVA_HOME/home/soft/jdk1.8
export HIVE_HOME/home/soft/apache-hive-3.1.2
export HADOOP_HOME/home/soft/hadoop-3.2.0# 根据name修改对应配置
[roothadoop04 conf]# vim hive-site.xml /propertypropertynamehive.exec.local.scratchdir/namevalue/home/hive_repo/scratchdir/valuedescriptionLocal scratch space for Hive jobs/description/propertypropertynamehive.downloaded.resources.dir/namevalue/home/hive_repo/resources/valuedescriptionTemporary local directory for added resources in the remote file system./description/propertypropertynamejavax.jdo.option.ConnectionURL/namevaluejdbc:mysql://ip:port/hive?serverTimezoneAsia/Shanghai/valuedescriptionJDBC connect string for a JDBC metastore.To use SSL to encrypt/authenticate the connection, provide database-specific SSL flag in the connection URL.For example, jdbc:postgresql://myhost/db?ssltrue for postgres database./description/propertypropertynamejavax.jdo.option.ConnectionUserName/namevalueroot/valuedescriptionUsername to use against metastore database/description/propertypropertynamejavax.jdo.option.ConnectionPassword/namevalue123456/valuedescriptionpassword to use against metastore database/description/propertypropertynamejavax.jdo.option.ConnectionDriverName/namevaluecom.mysql.cj.jdbc.Driver/valuedescriptionDriver class name for a JDBC metastore/description/property# 初始化数据仓库[roothadoop04 apache-hive-3.1.2]# bin/schematool -dbType mysql -initSchema# 看到有下面那些表就算完成啦Hive使用方式 命令行方式 # 连接hive
[roothadoop04 apache-hive-3.1.2]# bin/hive
which: no hbase in (/home/soft/jdk1.8/bin:/home/soft/hadoop-3.2.0/bin:/home/soft/hadoop-3.2.0/sbin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/root/bin)
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/soft/apache-hive-3.1.2/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/soft/hadoop-3.2.0/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Hive Session ID 505baa88-4bd1-4f00-9345-448ae17ab151Logging initialized using configuration in jar:file:/home/soft/apache-hive-3.1.2/lib/hive-common-3.1.2.jar!/hive-log4j2.properties Async: true
Hive-on-MR is deprecated in Hive 2 and may not be available in the future versions. Consider using a different execution engine (i.e. spark, tez) or using Hive 1.X releases.
Hive Session ID dfffb77e-23d3-4c56-9457-32f30b5f4e3c# 查询
hive show tables;
OK
Time taken: 1.019 seconds
# 建表
hive create table t1(id int,name string);
OK
Time taken: 1.875 seconds
hive show tables;
OK
t1
Time taken: 0.388 seconds, Fetched: 1 row(s)
# 插入数据 会进行mapreduce
hive insert into t1(id,name)values(1,test);
Query ID root_20240311140339_1e1450d1-2227-4b3d-bb10-e21f0016903b
Total jobs 3
Launching Job 1 out of 3
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):set hive.exec.reducers.bytes.per.reducernumber
In order to limit the maximum number of reducers:set hive.exec.reducers.maxnumber
In order to set a constant number of reducers:set mapreduce.job.reducesnumber
Starting Job job_1710135432246_0001, Tracking URL http://hadoop01:8088/proxy/application_1710135432246_0001/
Kill Command /home/soft/hadoop-3.2.0/bin/mapred job -kill job_1710135432246_0001
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1
2024-03-11 14:04:00,036 Stage-1 map 0%, reduce 0%
2024-03-11 14:04:08,605 Stage-1 map 100%, reduce 0%, Cumulative CPU 2.66 sec
2024-03-11 14:04:16,949 Stage-1 map 100%, reduce 100%, Cumulative CPU 4.69 sec
MapReduce Total cumulative CPU time: 4 seconds 690 msec
Ended Job job_1710135432246_0001
Stage-4 is selected by condition resolver.
Stage-3 is filtered out by condition resolver.
Stage-5 is filtered out by condition resolver.
Moving data to directory hdfs://hadoop01:9000/user/hive/warehouse/t1/.hive-staging_hive_2024-03-11_14-03-39_724_266361142260875320-1/-ext-10000
Loading data to table default.t1
MapReduce Jobs Launched:
Stage-Stage-1: Map: 1 Reduce: 1 Cumulative CPU: 4.69 sec HDFS Read: 15158 HDFS Write: 236 SUCCESS
Total MapReduce CPU Time Spent: 4 seconds 690 msec
OK
Time taken: 42.328 seconds
hive select * from t1;
OK
1 test
Time taken: 0.726 seconds, Fetched: 1 row(s)
hive drop table t1;
OK
Time taken: 1.368 seconds
# 退出
hive quit;
Hive日志配置 运行时日志 [roothadoop04 conf]# mv hive-log4j2.properties.template hive-log4j2.properties
[roothadoop04 conf]# vim hive-log4j2.properties
# list of properties
property.hive.log.level INFO
property.hive.root.logger DRFA
property.hive.log.dir /home/hive_repo/log
property.hive.log.file hive.log
property.hive.perflogger.log.level INFO 任务执行日志 [roothadoop04 conf]# mv hive-exec-log4j2.properties.template hive-exec-log4j2.properties
[roothadoop04 conf]# vim hive-exec-log4j2.properties status INFO
name HiveExecLog4j2
packages org.apache.hadoop.hive.ql.log# list of properties
property.hive.log.level INFO
property.hive.root.logger FA
property.hive.query.id hadoop
property.hive.log.dir /home/hive_repo/log
property.hive.log.file ${sys:hive.query.id}.log
level INFO property.hive.root.logger FA property.hive.query.id hadoop property.hive.log.dir /home/hive_repo/log property.hive.log.file ${sys:hive.query.id}.log