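Hive (4): E-commerce Transaction Project Case Study

E-commerce transaction project case study

Sdate defines the date dimension. It assigns each day its month, weekday, quarter, and similar attributes.
Its fields are: date, year-month, year, month, day, day of week, week number, quarter, ten-day period, and half-month.
Stock defines the order header. Its fields are: order number, transaction location, and transaction date.
StockDetail defines the order details. It is joined to Stock on the order number.
Its fields are: order number, row number, item, quantity, price, and amount.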


CREATE TABLE sdate(
dateID string, --date
theyearmonth string,--year-month
theyear string,--year
themonth string,--month
thedate string,--day
theweek string,--day of week
theweeks string,--week number
thequot string,--quarter
thetenday string,--ten-day period
thehalfmonth string --half-month
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n' ;


CREATE TABLE stock(
ordernumber string,--order number
locationid string,--transaction location
dateID string --transaction date
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n' ;


CREATE TABLE stockdetail(
ordernumber string,--order number
rownum int,--row number
itemid string, --item
qty int, --quantity
price int, --price
amount int --amount
) ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n' ;


After creating the tables, load the data:
load data local inpath '/home/zkpk/tbdata/sdate.txt'
overwrite into table sdate;


load data local inpath '/home/zkpk/tbdata/stock.txt'
overwrite into table stock;


load data local inpath '/home/zkpk/tbdata/stockdetail.txt'
overwrite into table stockdetail;
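
Before running the analysis queries, it may be worth checking that the loads worked. The following is a minimal sketch, assuming the three files were comma-delimited as declared above. In a delimited text table, a value that cannot be parsed as int is read back as NULL, so counting NULL amounts is a rough way to spot malformed rows.

```sql
-- Row counts for each table after loading
SELECT count(*) FROM sdate;
SELECT count(*) FROM stock;
SELECT count(*) FROM stockdetail;

-- Peek at a few detail rows to confirm that the columns line up
SELECT * FROM stockdetail LIMIT 5;

-- Rows whose amount could not be parsed as int show up as NULL
SELECT count(*) FROM stockdetail WHERE amount IS NULL;
```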


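1. Compute the total amount of all orders for each year
Analysis:
To compute the yearly total, first get the order number, order date, and order amount of every order,
then join this information with the date table to obtain the year.
Finally, group by year and sum the amounts to get the total for each year.
Tables involved: stock a, stockdetail b, sdate c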
select c.theyear,sum(b.amount) 
from stock a,stockdetail b,sdate c
where a.ordernumber=b.ordernumber and a.dateid=c.dateid
group by c.theyear order by c.theyear;


Result:
2004 3265696
2005 13247234
2006 13670416
2007 16711974
2008 14670698
2009 6322137
2010 210924
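
The query above joins the tables with a comma-separated FROM list and WHERE conditions. As a sketch, the same query can be written with explicit JOIN ... ON clauses, which keeps the join conditions next to the tables they connect. It should return the same rows; the alias name total_amount is only illustrative.

```sql
-- Yearly order totals, written with explicit joins
SELECT c.theyear, sum(b.amount) AS total_amount
FROM stock a
JOIN stockdetail b ON a.ordernumber = b.ordernumber
JOIN sdate c ON a.dateid = c.dateid
GROUP BY c.theyear
ORDER BY c.theyear;
```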


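2. Compute the sales amount of the largest order in each year
Analysis:

The algorithm has two steps:
1. Group by date and order number
 to obtain the total amount of each order together with its date.
Tables involved: stock a, stockdetail b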


select a.dateid, a.ordernumber,sum(b.amount) as sumofamount
from stock a,stockdetail b
where a.ordernumber=b.ordernumber
group by a.dateid,a.ordernumber;


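2. Join the result of step 1 with the date table to obtain the year,
then group by year and apply the max function to get the sales amount of the largest order in each year.
Tables involved: sdate c, and the result of step 1 as d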
select c.theyear,max(d.sumofamount) from sdate c,
(select a.dateid, a.ordernumber,sum(b.amount) as sumofamount
from stock a,stockdetail b
where a.ordernumber=b.ordernumber
group by a.dateid,a.ordernumber)d
where c.dateid=d.dateid
group by c.theyear order by c.theyear;


Result:
2004 23612
2005 38180
2006 36124
2007 159126
2008 55828
2009 25810
2010 13063
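
The result above gives only the amount, not which order produced it. One possible way to return the order number as well is a window function, assuming a Hive version that supports windowing (0.11 or later). This is a sketch rather than part of the original solution. The names rn and ranked are illustrative, and row_number() keeps a single arbitrary order if two orders tie for the yearly maximum.

```sql
-- Rank orders within each year by total amount and keep the top one
SELECT theyear, ordernumber, sumofamount
FROM (
  SELECT c.theyear,
         d.ordernumber,
         d.sumofamount,
         row_number() OVER (PARTITION BY c.theyear
                            ORDER BY d.sumofamount DESC) AS rn
  FROM sdate c
  JOIN (
    -- Step 1: total amount per order, with its date
    SELECT a.dateid, a.ordernumber, sum(b.amount) AS sumofamount
    FROM stock a
    JOIN stockdetail b ON a.ordernumber = b.ordernumber
    GROUP BY a.dateid, a.ordernumber
  ) d ON c.dateid = d.dateid
) ranked
WHERE rn = 1
ORDER BY theyear;
```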


3. Find the top 10 quarters by sales amount across all orders
Tables involved: stock a, stockdetail b, sdate c


select c.theyear,c.thequot,sum(b.amount) as sumofamount
from stock a,stockdetail b,sdate c
where a.ordernumber=b.ordernumber and a.dateid=c.dateid
group by c.theyear,c.thequot
order by sumofamount desc limit 10;


Result:
2008 1 5252819
2007 4 4613093
2007 1 4446088
2006 1 3916638
2008 2 3886470
2007 3 3870558
2007 2 3782235
2006 4 3691314
2005 1 3592007
2005 3 3304243


4. List the orders (by order number) whose sales amount exceeds 100000
Tables involved: stock a, stockdetail b


select a.ordernumber,sum(b.amount) as sumofamount
from stock a,stockdetail b
where a.ordernumber=b.ordernumber
group by a.ordernumber
having sumofamount>100000;


Result:
HMJSL00009024 119058
HMJSL00009958 159126


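5. Find the best-selling item of each year across all orders
Step 1:
Compute the total sales amount of each item in each year.
Tables involved: stock a, stockdetail b, sdate c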
===================================
select c.theyear,b.itemid,sum(b.amount) as sumofamount
from stock a,stockdetail b,sdate c
where a.ordernumber=b.ordernumber and a.dateid=c.dateid
group by c.theyear,b.itemid;


Result:
.........


2010 ZX219365210101 299
2010 ZX219373110101 269
2010 ZX219373810101 -269
2010 ZX219373812201 657
2010 ZX219381020101 1196
2010 ZX219392110101 299
2010 ZX219392112201 598
2010 ZX219392212201 598
2010 yl427465200101 398


Step 2:
From the step 1 data, compute the largest per-item sales total in each year.
The step 1 result set is aliased as d.
select d.theyear,max(sumofamount) as maxofamount from 
(select c.theyear,b.itemid,sum(b.amount) as sumofamount
from stock a,stockdetail b,sdate c
where a.ordernumber=b.ordernumber and a.dateid=c.dateid
group by c.theyear,b.itemid) d
group by d.theyear;


Result:
2004 53374
2005 56569
2006 113684
2007 70226
2008 97981
2009 30029
2010 4494


Step 3: Find the best-selling item of each year
e: total sales amount of each item in each year
f: largest per-item sales total in each year
select distinct e.theyear,e.itemid,f.maxofamount from 
(select c.theyear,b.itemid,
sum(b.amount) as sumofamount from stock a,stockdetail b,sdate c
where a.ordernumber=b.ordernumber and a.dateid=c.dateid 
group by c.theyear,b.itemid) e, 
(select d.theyear,max(d.sumofamount) as maxofamount from
(select c.theyear,b.itemid,sum(b.amount) as sumofamount 
from stock a,stockdetail b,sdate c 
where a.ordernumber=b.ordernumber and a.dateid=c.dateid 
group by c.theyear,b.itemid) d 
group by d.theyear) f 
where e.theyear=f.theyear and e.sumofamount=f.maxofamount 
order by e.theyear;


Result:
2004 JY424420810101 53374
2005 24124118880102 56569
2006 JY425468460101 113684
2007 JY425468460101 70226
2008 E2628204040101 97981
2009 YL327439080102 30029
2010 SQ429425090101 4494
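
The three-step query computes the per-year, per-item aggregate twice. As an alternative sketch, assuming a Hive version with windowing support (0.11 or later), rank() can select the top item of each year from a single aggregate. The names t, rk, and ranked are illustrative. Like the join on maxofamount above, rank() keeps every item that ties for first place in a year.

```sql
-- Best-selling item per year, using one aggregate plus a window function
SELECT theyear, itemid, sumofamount
FROM (
  SELECT t.theyear,
         t.itemid,
         t.sumofamount,
         rank() OVER (PARTITION BY t.theyear
                      ORDER BY t.sumofamount DESC) AS rk
  FROM (
    -- Total sales amount of each item in each year
    SELECT c.theyear, b.itemid, sum(b.amount) AS sumofamount
    FROM stock a
    JOIN stockdetail b ON a.ordernumber = b.ordernumber
    JOIN sdate c ON a.dateid = c.dateid
    GROUP BY c.theyear, b.itemid
  ) t
) ranked
WHERE rk = 1
ORDER BY theyear;
```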