《Oracle查询优化改写》笔记(基础语句)

《Oracle查询优化改写》笔记(高级查询)

《Oracle查询优化改写》笔记(应用案例)

单表查询

空值转换为实际值

相对于nvl来说,coalesce支持多个参数,能很方便的返回第一个不为空的值。

1
2
3
4
SELECT coalesce(comm,0) FROM emp;
SELECT COALESCE(C1,C2,C3,C4,C5,C6) AS c FROM V;
# 若改用nvl,要嵌套很多层
SELECT nvl(nvl(nvl(nvl(nvl(c1,c2),c3),c4),c5),c6) AS c FROM V;

随机返回n条记录

正确写法:先随机排序,再取数据

1
2
3
SELECT empno,ename 
FROM (SELECT empno,ename FROM emp ORDER BY dbms_random.value())
WHERE rownum <= 3;

错误写法:先取数据,再随机排序

1
SELECT empno,ename FROM emp WHERE rownum <=3 ORDER BY dbms_random.value;

模糊查询(转义)

在LIKE子句中有两个通配符:“%”(替代一个或多个字符)、“_”(替代一个或多个字符)。“_”被当作通配符了怎么办,可以用转义字符:

1
SELECT * FROM v WHERE vname LIKE '\_BCE%' ESCAPE '\';
1
2
3
4
SELECT * FROM v WHERE vname LIKE '_\BCE%' ESCAPE '\';
ORA-01424:转义符之后字符缺失或非法
# 双写转义字符即可,对于字符串中包含“%”同理
SELECT * FROM v WHERE vname LIKE '_\\BCE%' ESCAPE '\';

查询结果排序

以数字指定排序列

1
2
3
4
5
# 按第三列排序
SELECT empno,ename,hiredate FROM emp WHERE deptno = 10 ORDER BY 3 ASC;
# 当取值不定时很方便,比如有时取sal,有时取comm来显示
SELECT empno,ename,sal FROM emp WHERE deptno = 10 ORDER BY 3 ASC;
SELECT empno,ename,comm FROM emp WHERE deptno = 10 ORDER BY 3 ASC;

translate

语法格式:TRANSLATE(expr, from_string, to_string)

from_string 与 to_string 以字符为单位,对应字符一一替换。

1
2
SELECT TRANSLATE('ab 你好 bcadefg', 'abcdefg', '1234567') AS NEW_STR FROM DUAL;
# 12 你好 2314567

如果 to_string 为空,则返回空值。

1
2
SELECT TRANSLATE('ab 你好 bcadefg', 'abcdefg', '') AS NEW_STR FROM DUAL;
# NULL

如果 to_string 对应的位置没有字符,删除 from_string 中列出的字符将会被消掉。

1
2
SELECT TRANSLATE('ab 你好 bcadefg', '1abcdefg', '1') AS NEW_STR FROM DUAL;
# 你好

处理排序空值

Oracle 默认排序空值在后面,如果想把空值(如 emp.comm)显示在前面:

1
2
SELECT ename,sal,comm,nvl(comm, -1) order_col FROM emp ORDER BY 4;
# 这种方法需要对列类型及其中保存的数据有所了解才行

其实可以用关键字 NULLS FIRST 和 NULLS LAST:

1
2
3
4
# 空值在前
SELECT ename,sal,comm FROM emp ORDER BY 3 NULLS FIRST;
# 空值在后
SELECT ename,sal,comm FROM emp ORDER BY 3 NULLS LAST;

操作多个表

rowid

除了用唯一列、主键列外,还可以使用 rowid

1
SELECT ROWID,deptno FROM emp WHERE mgr = 7698;

如果数据不是取自表,而是取自VIEW

1
2
select rowid from v;
# ORA-01445:无法从不带保留关键字的表的连接视图中选择 ROWID 或采样

查询期间创建临时视图

仅在查询期间存在,查询结束消失

1
2
WITH e AS (SELECT ROWNUM AS sn,deptno,mgr,job FROM v)
SELECT sn,deptno FROM e WHERE mgr = 7698;

explain分析查询语句

1
2
EXPLAIN PLAN FOR SELECT empno,ename FROM emp;
select * from table(dbms_xplan.display());

join查询不同写法

INNER JOIN

1
2
3
4
SELECT 1.str AS left_str, r.str AS right_str
FROM 1, r
WHERE l.v = r.v
ORDER BY 1, 2;

LEFT JOIN

1
2
3
4
SELECT l.str AS left_str, r.str AS right_str
FROM l, r
WHERE l.v = r.v(+)
ORDER BY 1, 2;

RIGHT JOIN

1
2
3
4
SELECT l.str AS left_str, r.str AS right_str
FROM l, r
WHERE l.v(+) = r.v
ORDER BY 1, 2;

FULL JOIN 无(+)的写法。

not in (空值) 返回空

在Oracle中子查询()中包含空值,NOT IN (空值) 返回为空。

1
2
SELECT COUNT(*) FROM dept WHERE deptno NOT IN (SELECT emp.deptno FROM emp);# 返回0行
SELECT COUNT(*) FROM dept WHERE deptno NOT IN (SELECT emp.deptno FROM emp WHERE emp.deptno IS NOT NULL);# 正常返回

插入、更新与删除

使用默认值default

1
2
3
4
5
6
7
8
9
10
CREATE TABLE TEST (
c1 VARCHAR2(10) DEFAULT '默认1',
c2 VARCHAR2(10) DEFAULT '默认2',
c3 VARCHAR2(10) DEFAULT '默认3',
c4 DATE DEFAULT SYSDATE
);
INSERT INTO TEST(c1,c2,c3) VALUES (DEFAULT,NULL,'手输值');
# SELECT * FROM TEST;
# c1 c2 c3 c4
# 默认1 手输值 2013-11-22

组织对某几列插入

c4列默认值为 SYSDATE,如何控制不允许手动录入:

1
2
3
4
5
6
7
# 建一个不包括 c4 列的VIEW,通过VIEW新增数据
CREATE OR REPLACE VIEW v_test AS SELECT c1,c2,c3 FROM TEST;
INSERT INTO v_test(c1,c2,c3) VALUES ('手输c1',NULL,'不能改c4');
# SELECT * FROM TEST;
# c1 c2 c3 c4
# 默认1 手输值 2013-11-22
# 手输c1 不能改c4 2013-11-22

注意:通过VIEW新增数据,不能再使用关键字DEFAULT。

1
2
INSERT INTO v_test(c1,c2,c3) VALUES (default,NULL,'不能改c4');
# ORA-32575:对于正在修改的视图,不支持显示列默认设置

复制表的定义与数据

1
2
3
4
5
6
7
# 方法一
CREATE TABLE test2 AS SELECT * FROM TEST;
# 方法二
//注意:该方式复制的表不包含默认值等约束信息,需重建默认值及索引和约束等信息。
//DESC test2;
CREATE TABLE test2 AS SELECT * FROM TEST WHERE 1=2;
INSERT INTO test2 SELECT * FROM TEST;

限制数据录入

约束条件比较简单时,可以直接加在表中,如工资必须大于0:

1
alter table emp add constraints ch_sal check(sal>0);

但有些复杂或特殊的约束条件是不能放在表里的,如雇佣日期大于当前日期:

1
2
alter table emp add constraints ch_hiredate check(hiredate>=sysdate);
# ORA-02436:日期或系统变量在 CHECK 约束条件中指定错误

我们可以使用加了 WITH CHECK OPTION 关键字的 VIEW 来达到目的。如下我们限制了不符合内联视图条件的数据(SYSDATE+1):

1
2
3
4
5
6
INSERT INTO 
(SELECT empno,ename,hiredate FROM emp
WHERE hiredate<=SYSDATE WITH CHECK OPTION)
VALUES
(9999,'test',SYSDATE+1);
# ORA-01402:视图 WITH CHECK OPTION where 子句违规

多表插入语句

建两个测试用表:

1
2
CREATE TABLE emp1 AS SELECT empno,ename,job FROM emp WHERE 1=2;
CREATE TABLE emp2 AS SELECT empno,ename,deptno FROM emp WHERE 1=2;

无条件 INSERT:

1
2
3
4
5
INSERT ALL
INTO emp1(empno,ename,job) VALUES (empno,ename,job)
INTO emp2(empno,ename,deptno) VALUES (empno,ename,deptno)
SELECT empno,ename,job,deptno FROM emp WHERE deptno IN (10,20);
# 因为没有加条件,所以会同时向两个表中插入数据,且插入条数一样。

有条件 INSERT ALL:

1
2
3
4
5
6
7
8
9
delete emp1;
delete emp2;
INSERT ALL
WHEN job IN ('SALESMAN','MANAGER') THEN
INTO emp1(empno,ename,job) VALUES (empno,ename,job)
WHEN deptno IN ('20','30') THEN
INTO emp2(empno,ename,deptno) VALUES (empno,ename,deptno)
SELECT empno,ename,job,deptno FROM emp;
# 增加条件后会按条件插入,但如 empno=7654 等数据会在两个表中都有。

INSERT FIRST 就不一样:

1
2
3
4
5
6
7
8
9
10
delete emp1;
delete emp2;
INSERT FIRST
WHEN job IN ('SALESMAN','MANAGER') THEN
INTO emp1(empno,ename,job) VALUES (empno,ename,job)
WHEN deptno IN ('20','30') THEN
INTO emp2(empno,ename,deptno) VALUES (empno,ename,deptno)
SELECT empno,ename,job,deptno FROM emp;
# 当第一个表符合条件后,第二个表将不再插入对应行
# 表emp2中不再有与表emp1相同的数据 ”empno=7654“

转置 INSERT 与其说是一个分类,不如算 INSERT ALL 的一个用法:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
DROP TABLE T1;
DROP TABLE T2;
CREATE TABLE T2 (d VARCHAR2(10),des VARCHAR2(50));
CREATE TABLE T1 AS
SELECT '熊样' AS d1,
'猫样' AS d2,
'狗样' AS d3,
'鸟样' AS d4,
'花样' AS d5
FROM dual;
# 转置 INSERT
INSERT ALL
INTO T2(d,desc) VALUES('周一',d1)
INTO T2(d,desc) VALUES('周二',d2)
INTO T2(d,desc) VALUES('周三',d3)
INTO T2(d,desc) VALUES('周四',d4)
INTO T2(d,desc) VALUES('周五',d5)
SELECT d1,d2,d3,d4,d5 FROM T1;
# SELECT * FROM T2;
# D DESC
# 周一 熊样
# 周二 猫样
# 周三 狗样
# 周四 鸟样
# 周五 花样
//转置 INSERT 的等价语句如下:
INSERT INTO T2(d,desc)
SELECT '周一',d1 FROM T1 UNION ALL
SELECT '周二',d2 FROM T1 UNION ALL
SELECT '周三',d3 FROM T1 UNION ALL
SELECT '周四',d4 FROM T1 UNION ALL
SELECT '周五',d5 FROM T1;

用其他表中值更新

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
# 方法一
UPDATE emp SET emp.dname =
(SELECT dept.dname FROM dept
WHERE dept.deptno = emp.deptno
AND dept.dname IN ('ACCOUNTING','RESEARCH'))
WHERE EXISTS (SELECT dept.dname FROM dept
WHERE dept.deptno = emp.deptno
AND dept.dname IN ('ACCOUNTING','RESEARCH'));
# 方法二,使用可更新VIEW
UPDATE (SELECT emp.dname,dept.dname AS new_dname FROM emp
INNER JOIN dept ON dept.deptno = emp.deptno
WHERE dept.dname IN ('ACCOUNTING','RESEARCH'))
SET dname = new_dname;
//ORA-01779:无法修改与非键值对保存表对应的列
//需在表dept中增加唯一索引或主键
//alter table dept add constraints pk_dept primary key (deptno);
# 方法三(推荐),MERGE改写,MERGE INTO 语句只访问一次 DEPT
MERGE INTO emp
USING (SELECT dname,deptno FROM dept WHERE dept.dname IN ('ACCOUNTING','RESEARCH')) dept
ON (dept.deptno = emp.deptno)
WHEN MATCHED THEN
UPDATE SET emp.dname = dept.dname;

合并记录

1
2
3
4
5
6
7
8
9
10
11
12
MERGE INTO bonuses d
USING (SELECT employee_id,salary,department_id FROM hr.employees WHERE department_id = 80) s
ON (d.employee_id = s.employee_id)
WHEN MATCH THEN
UPDATE SET d.bonus = d.bonus + s.salary * 0.01
/*WHERE只能出现一次,如果这里加了WHERE,DELETE后面的WHERE就无效*/
DELETE WHERE (s.salary > 8000)
/*删除时,只更新s.salary > 8000时的数据*/
WHEN NOT MATCHED THEN
INSERT (d.employee_id,d.bonus) VALUES (s.employee_id,s.salary * 0.01)
WHERE (s.salary <= 8000)
/*新增时,只更新s.salary <= 8000时的数据*/
  • 更新、删除、插入这三个操作同时进行,不分先后。
  • 不能更新 JOIN 列。
  • UPDATE的范围是:匹配到的数据减去删除的数据。(d.employee_id = s.employee_id)减去范围(s.salary > 8000)。

删除名称重复数据

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
# 方法一:通过name相同、id不同的方式来判断
//这种方式删除数据时需要建组合索引:create index idx_name_id on dupes(name,id);
DELETE FROM dupes a WHERE EXISTS
(SELECT NULL FROM dupes b WHERE b.name=a.name AND b.id>a.id);
# 方法二:用ROWID来代替其中的id
//因为不需要关联id列,只需要建立单列索引:create index idx_name on dupes(name);
DELETE FROM dupes a WHERE EXISTS (
SELECT NULL FROM dupes b WHERE b.name=a.name AND b.rowid > a.rowid
);
# 方法三:通过分析函数根据name分组生成序号,然后删除序号大于1的数据
DELETE FROM dupes WHERE ROWID IN (
SELECT rid FROM (
SELECT ROWID AS rid,row_number() over(PARTITION BY NAME ORDER BY id) AS seq FROM dupes
) WHERE seq > 1
);

使用字符串

遍历字符串

1
2
3
4
5
6
SELECT LEVEL FROM DUAL CONNECT BY LEVEL<=4;

1
2
3
4
1
2
3
4
5
6
7
WITH V AS (SELECT '天天向上' AS 汉字,'TTXS' AS 首拼 FROM DUAL)
SELECT V.汉字,V.首拼,LEVEL FROM V CONNECT BY LEVEL<=LENGTH(V.汉字);

天天向上 TTXS 1
天天向上 TTXS 2
天天向上 TTXS 3
天天向上 TTXS 4
1
2
3
4
5
6
7
8
9
10
WITH V AS (SELECT '天天向上' AS 汉字,'TTXS' AS 首拼 FROM DUAL)
SELECT V.汉字,V.首拼,LEVEL,
SUBSTR(V.汉字,LEVEL,1) AS 汉字拆分,
SUBSTR(V.首拼,LEVEL,1) AS 首拼拆分
FROM V CONNECT BY LEVEL<=LENGTH(V.汉字);

天天向上 TTXS 1 天 T
天天向上 TTXS 2 天 T
天天向上 TTXS 3 向 X
天天向上 TTXS 4 上 S

字符串中包含引号

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
# 方法一:一个单引号换成两个单引号表示
select 'g''day mate' qmarks from dual union all
select 'beavers'' teeth' from dual union all
select '''' from dual;

QMARKS
------------
g'day mate
beavers' teeth
'

# 方法二:使用q-quote特性,Q或q开头,字符串前后使用界定符“''”
1)q-quote界定符可以是除了TAB、空格、回车外的任何单字节或多字节字符
2)界定符可以是[]、{}、<>、(),而且必须成对出现
select q'[g'day mate]' qmarks from dual union all
select q'[beavers' teeth]' from dual union all
select q'[']' from dual;

QMARKS
------------
g'day mate
beavers' teeth
'

计算字符串出现次数

若分割符只有一个:

1
2
3
4
5
6
7
8
9
10
11
12
13
# 方法一:regexp_count
with v as (select 'CLARK,KING,MILLER' as str from dual)
select regexp_count(str,',')+1 as cnt from v;
# 方法二:
with v as (select 'CLARK,KING,MILLER' as str from dual)
select length(regexp_replace(str,'[^,]'))+1 as cnt from v;
# 方法三:(若分割符有多个,计算出的长度须再除以分割符长度)
with v as (select 'CLARK,KING,MILLER' as str from dual)
select length(translate(str,','||str,','))+1 as cnt from v;

CNT
------------
3

若分割符有多个:

1
2
3
4
5
6
7
8
9
10
11
12
13
# 错误写法:
with v as (select '10$#CLARK$#MANAGER' as str from dual)
select length(translate(str,'$#'||str,'$#'))+1 as cnt from v;
# 正确写法一:
with v as (select '10$#CLARK$#MANAGER' as str from dual)
select length(translate(str,'$#'||str,'$#'))/length('$#')+1 as cnt from v;
# 正确写法二:(regexp_count不用考虑分割符长度,因为“$”是通配符需用“\”转义)
with v as (select '10$#CLARK$#MANAGER' as str from dual)
select regexp_count(str,'\$#')+1 as cnt from v;

CNT
------------
3

(regexp_)replace字符串替换

1
2
3
# 除了用translate
select ename,regexp_replace(ename,'[AEIOU]') AS strippedl FROM emp;
# 相当于 replace(replace(replace(replace(replace(ename,'A'),'E'),'I'),'O'),'U')

字符和数字数据分离

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
select data from dept2;

DATA
------------
ACCOUNTING10
RESEARCH20
SALES30
OPERATIONS40

# 方法一:(“^”在方括号内表否定,否则如'^hell'表示字符串的开始位置)
select regexp_replace(data,'[0,9]','') dname,
regexp_replace(data,'[^0,9]','') deptno
from dept2;
# 方法二:
select translate(data,'a0123456789','a') dname,
translate(data,'0123456789'||data,'0123456789') deptno
from dept2;

DNAME DEPTNO
-------------------
ACCOUNTING 10
RESEARCH 20
SALES 30
OPERATIONS 40

wm_concat替代品

wm_concat是一个非公开函数,具有不稳定性(返回值原来是varchar,Oracle11.2下成了clob),Oracle11.2开始有了分析函数listagg

1
2
3
4
5
# 同sum一样,listagg起汇总作用,把字符串连在一起
select deptno,sum(sal) as total_sal,
listagg(ename,',') within group(order by ename) as total_ename
from emp
group by deptno;

regexp_substr分解IP

1
2
3
4
5
6
7
8
9
10
11
12
13
# regexp_substr(参数1,参数2,参数3,参数4)
# 参数2'[^.]+'表示不包含点号的多个字符
# 参数3表示从第n个字符开始
# 参数4表示第n个能匹配的字符串
select regexp_substr(v.ip,'[^.]+',1,1) a,
regexp_substr(v.ip,'[^.]+',1,2) b,
regexp_substr(v.ip,'[^.]+',1,3) c,
regexp_substr(v.ip,'[^.]+',1,4) d
from (select '192.168.1.118' as ip from dual) v;

A B C D
------------------------------
192 168 1 118

按字母重排序字符串

ADAMS->AADMS

  1. 把字符串拆分为单个字母显示。
  2. 把多行数据合并为一行显示。
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
# 1.拆分
select ename,substr(ename,level,1) as c
from dual
connect by level<=length(ename);

ENAME C
------------
ADAMS A
ADAMS D
ADAMS A
ADAMS M
ADAMS S

# 2.用listagg合并
select ename,
listagg(substr(ename,level,1)) within group(order by substr(ename,level,1)) as new_name
from dual
connect by level<=length(ename)
group by ename;

ENAME NEW_NAME
--------------------
ADAMS AADMS

# 3.处理全表数据
select ename,
(select listagg(substr(ename,level,1)) within group(order by substr(ename,level,1))
from dual
connect by level<=length(ename)) as new_name
from emp;

# 4.字母去重(把“substr(ename,level,1)”当作整体)
select ename,
(select listagg(MIN(substr(ename,level,1))) within group(order by MIN(substr(ename,level,1)))
from dual
connect by level<=length(ename)
group by substr(ename,level,1)) as new_name
from emp;

rpad()字符串填充函数

rpad(string,padded_length,[pad_string]),从右边对字符串使用指定的字符进行填充:

  • string 表示:被填充的字符串
  • padded_length 表示:字符的长度,是返回的字符串的数量,如果这个数量比原字符串的长度要短,rpad函数将会把字符串截取成从左到右的n个字符;
  • pad_string 是个可选参数,这个字符串是要粘贴到string的右边,如果这个参数未写,lpad函数将会在string的右边粘贴空格。
1
2
3
4
5
6
# 例:
rpad('tech', 7); 将返回'tech '   
rpad('tech', 2); 将返回'te'   
rpad('tech', 8, '0'); 将返回'tech0000'   
rpad('tech on the net', 15, 'z'); 将返回 'tech on the net'   
rpad('tech on the net', 16, 'z'); 将返回 'tech on the netz'

使用数字

生成累计和

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
select empno as 编号,
ename as 姓名,
sal as 人工成本,
sum(sal) over(order by empno) as 成本累计
from emp
where deptno=30
order by empno;
# 最后也可以按ename排序,与分析函数的结果无关,一致是为了方便观察

编号 姓名 人工成本 成本累计
---------------------------------
7499 ALLEN 1600 1600
7521 WARD 1250 2850
7654 MARTIN 1250 4100
7698 BLAKE 2850 6950

分析函数 “sum(sal) over(order by empno)” 的结果(4100)是排序 “over(order by empno)” 后第一行到当前行的所有工资(1600+1250+1250)之和。

计算累计差

  1. 对数据排序
  2. 待减数变为负数
  3. 用累加计算累计差
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
select * from detail;

编号 项目 金额
----------------------
1000 预交费用 30000
7782 支出1 3450
7839 支出2 6000
7934 支出3 2300

with x as (
select rownum as seq,a.* from (
select 编号,项目,金额 from detail order by 编号
) a
)
select 编号,项目,金额
sum(case when seq=1 then 金额 else -金额 end) over(order by seq) as 余额
from x;

编号 项目 金额 余额
------------------------------
1000 预交费用 30000 30000
7782 支出1 3450 26550
7839 支出2 6000 20550
7934 支出3 2300 18250

三种方式生成序号排名

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
select deptno,empno,sal,
row_number() over(partition by deptno order by sal desc) as row_number,
rank() over(partition by deptno order by sal desc) as rank,
dense_rank over(partition by deptno order by sal desc) as dense_rank
from emp
where deptno in (20,30)
order by 1,3 desc;

DEPTNO EMPNO SAL ROW_NUMBER RANK DENSE_RANK
----------------------------------------------------------
20 7788 3000 1 1 1
20 7902 3000 2 1 1
20 7566 2975 3 3 2
20 7876 1100 4 4 3
20 7369 800 5 5 4
30 7698 2850 1 1 1
30 7499 1600 2 2 2
30 7844 1500 3 3 3
30 7521 1250 4 4 4
30 7654 1250 5 4 4
30 7900 950 6 6 5

生成序号排名取值

  1. 计算不同工资出现次数。
  2. 按次数排序生成序号。
  3. 根据序号过滤取值。
1
2
3
4
5
6
7
8
9
10
select sal from (
select sal,dense_rank() over(order by 出现次数 desc) as 次数排序 from (
select sal,count(*) as 出现次数 from emp where deptno=20 group by sal
) x
) y
where 次数排序=1;

SAL
--------
3000.00

利用 partition by 子句分别查询各部门哪个工资等级的员工最多。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
select deptno,sal from (
select deptno,sal,dense_rank() over(partition by deptno order by 出现次数 desc) as 次数排序 from (
select deptno,sal,count(*) as 出现次数 from emp group by deptno,sal
) x
) y
where 次数排序=1;

DEPTNO SAL
-----------------
10 5000.00
10 1300.00
10 2450.00
20 3000.00
30 1250.00

返回最值所在行数据

标量:先取出最大值,然后与最大值关联

1
2
3
4
5
6
7
8
select deptno,empno,ename,sal,
(select max(b.ename) from emp b where b.sal=a.max_sal) as 工资最高的人
from (
select deptno,empno,ename,sal,
max(sal) over(partition by deptno) as max_sal
from emp where deptno=10
) a
order by 1,5 desc;

分析函数:可以直接满足上述需求,并同时取最大及最小值

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
select deptno,empno,ename,sal,
max(ename) keep(dense_rank first order by sal) over(partition by deptno) as工资最低的人,
max(ename) keep(dense_rank last order by sal) over(partition by deptno) as工资最高的人
from emp
where deptno=10
order by 1,6 desc;

DEPTNO EMPNO 工资最低的人 工资最高的人 ENAME SAL
-------------------------------------------------------------
10 7839 MILLER KING KING 5000
10 7782 MILLER KING CLARK 2450
10 7934 MILLER KING MILLER 1300

select deptno,
min(sal) as min_sal,
max(ename) keep(dense_rank first order by sal) as 工资最低的人,
max(sal) as max_sal,
max(ename) keep(dense_rank last order by sal) as 工资最高的人
from emp
where deptno=10
group by deptno;

DEPTNO MIN_SAL 工资最低的人 MAX_SAL 工资最高的人
-----------------------------------------------------------
10 1300 MILLER 5000 KING

# 当最值有重复数据时,keep(...)部分得到的是一个数据集(SCOTT,FORD)
# 这时前面的聚合函数max()/min()就会起作用

first_value

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
select deptno,empno,ename,sal,
first_value(ename) over(partition by deptno order by sal desc) as 工资最高的人
from emp
where deptno=10
group by 1,5 desc;

DEPTNO EMONO 工资最高的人 ENAME SAL
--------------------------------------------
10 7839 KING KING 5000
10 7782 KING CLARK 2450
10 7934 KING MILLER 1300

# last_value取不到相反数据
# select id,min(id) over(order by id asc) from tableName order by id;
# select id,max(id) over(order by id desc) from tableName order by id;

求占总和的比例

部门 工资合计 总合计 工资比例
10 8750 29025 30.15
20 10875 29025 37.47
30 9400 29025 32.39
  1. 分组汇总
  2. 通过分析函数获取总合计
  3. 得到总合计后计算比例
1
2
3
4
5
6
7
8
9
10
# over()后不加任何内容时,表示对所有数据汇总
select deptno as 部门,工资合计,总合计,
round((工资合计/总合计)*100,2) as 工资比例
from (
select deptno,工资合计,sum(工资合计) over() as 总合计
from (
select deptno,sum(sal) 工资合计 from emp group by deptno
) x
) y
order by 1;

用专用比例函数 “ratio_to_report” 直接计算:

1
2
3
4
5
select deptno,round(ratio_to_report(工资合计) over() * 100, 2) as 工资比例
from (
select deptno,sum(sal) 工资合计 from emp group by deptno
)
order by 1;

同其他分析函数可以使用 partition by 分组计算,查询各员工占本部门的工资比例:

1
2
3
4
select deptno,empno,ename,sal,
round(ratio_to_report(sal) over(partition by deptno) * 100, 2) as 工资比例
from emp
order by 1,2;

日期运算

加减日、月、年

1
2
3
4
5
6
7
8
9
select hiredate as 聘用日期,
hiredate - 5 as5天,
hiredate + 5 as5天,
add_month(hiredate,-5) as5个月,
add_month(hiredate,5) as5个月,
add_month(hiredate,-5*12) as5年,
add_month(hiredate,5*12) as5
from emp
where rownum<=1;

加减时、分、秒

1
2
3
4
5
6
7
8
9
select hiredate as 聘用日期,
hiredate - 5/24/60/60 as5秒,
hiredate + 5/24/60/60 as5秒,
hiredate - 5/24/60 as5分钟,
hiredate + 5/24/60 as5分钟,
hiredate - 5/24 as5小时,
hiredate + 5/24 as5小时
from emp
where rownum<=1;

日期间隔(时、分、秒)

两个date相减,得到的就是天数:

1
2
3
4
5
6
7
8
9
10
11
SELECT
间隔天数,
间隔天数 * 24 AS 间隔小时,
间隔天数 * 24 * 60 AS 间隔分,
间隔天数 * 24 * 60 * 60 AS 间隔秒
FROM (
SELECT
(TO_DATE ('2021-05-08', 'yyyy-MM-dd') -
TO_DATE ('2021-05-05', 'yyyy-MM-dd')) AS 间隔天数
FROM DUAL
)

日期间隔(日、月、年)

1
2
3
4
5
6
7
8
9
10
SELECT
(max_hd - min_hd) AS 间隔天数,
MONTHS_BETWEEN(max_hd,min_hd) 间隔月,
MONTHS_BETWEEN(max_hd,min_hd)/12 间隔年
FROM (
SELECT
TO_DATE ('2021-05-08', 'yyyy-MM-dd') AS max_hd,
TO_DATE ('2020-05-08', 'yyyy-MM-dd') AS min_hd
FROM DUAL
)

取上/下一条记录日期

1
2
3
4
5
6
# lag上一条,lead下一条
SELECT deptno,ename,
hiredate,
lead(hiredate) over(order by hiredate) as lead_hd,
lag(hiredate) over(order by hiredate) as lag_hd
FROM emp WHERE deptno = 10;

日期操作

常用取值

一些例子:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
SELECT 
SYSDATE AS 当前日期,
TO_DATE(TO_CHAR(SYSDATE,'yyyy-MM')||'-1','yyyy-MM-dd') AS 月初
FROM DUAL;
或:
SELECT
SYSDATE AS 当前日期,
trunc(SYSDATE,'MM') AS 月初
FROM DUAL;

SELECT
sysdate,
to_number(to_char(sysdate,'hh24')) as 时,
to_number(to_char(sysdate,'mi')) as 分,
to_number(to_char(sysdate,'ss')) as 秒,
to_number(to_char(sysdate,'dd')) as 日,
to_number(to_char(sysdate,'mm')) as 月,
to_number(to_char(sysdate,'yyyy')) as 年,
to_number(to_char(sysdate,'ddd')) as 年内第几天,
trunc(sysdate,'dd') as 一天之始,
trunc(sysdate,'day') as 周初,
trunc(sysdate,'mm') as 月初,
last_day(sysdate) as 月末,
add_months(trunc(sysdate,'mm'),1) as 下月初,
trunc(sysdate,'yy') as 年初,
to_char(sysdate,'day') as 周几,
to_char(sysdate,'month') as 月份
FROM DUAL;

注:last_day函数返回的时分秒与参数时间中的一样,用作区间条件时需注意。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
# 错误(未选定行):
WITH T AS (
SELECT
TO_DATE('1980-12-31 15:20:30','yyyy-mm-dd hh24:mi:ss') as d1,
TO_DATE('1980-12-31 05:20:30','yyyy-mm-dd hh24:mi:ss') as d2
FROM DUAL
)
SELECT d1,d2 FROM T WHERE d1 BETWEEN trunc(d2,'mm') AND last_day(d2);

# 正确改写:
WITH T AS (
SELECT
TO_DATE('1980-12-31 15:20:30','yyyy-mm-dd hh24:mi:ss') as d1,
TO_DATE('1980-12-31 05:20:30','yyyy-mm-dd hh24:mi:ss') as d2
FROM DUAL
)
SELECT d1,d2 FROM T
WHERE d1 >= trunc(d2,'mm') AND d1 < add_months(trunc(d2,'mm'),1);

EXTRACT

TO_CHAR 一样,EXTRACT 可以提取时间字段中的年月日时分秒,但不能提取 DATE 中的时分秒,返回值为 NUMBER 类型。

1
2
3
4
5
6
7
8
9
10
11
SELECT
EXTRACT(YEAR from systimestamp) as "YEAR",
EXTRACT(MONTH from systimestamp) as "MONTH",
EXTRACT(DAY from systimestamp) as "DAY",
EXTRACT(HOUR from systimestamp) as "HOUR",//不能为DATE类型
EXTRACT(MINUTE from systimestamp) as "MINUTE",//不能为DATE类型
EXTRACT(SECOND from systimestamp) as "SECOND"//不能为DATE类型
FROM DUAL;

若解析DATE中的时分秒:
[Err] ORA-30076: invalid extract field for extract source

判断是否闰年

只要判断二月的月末是哪一天

1
2
3
4
5
6
7
SELECT trunc(sysdate,'y') AS 年初 FROM DUAL;

SELECT add_months(trunc(sysdate,'y'),1) AS 二月初 FROM DUAL;

SELECT last_day(add_months(trunc(sysdate,'y'),1)) AS 二月底 FROM DUAL;

SELECT to_char(last_day(add_months(trunc(sysdate,'y'),1)),'DD') AS 二月底 FROM DUAL;

周的计算

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
WITH X AS (
SELECT trunc(sysdate,'YY')+(LEVEL-1) AS 日期
FROM DUAL CONNECT BY LEVEL <= 8
)
SELECT
日期,
/*返回值1代表周日,2代表周一...*/
to_char(日期,'d') as d,
to_char(日期,'day') as day,
/*参数2中1代表周日,2代表周一...*/
next_day(日期,1) as 下个周日,
/*ww的算法为每年1月1日为第一周开始,date+6为每一周结束*/
to_char(日期,'ww') as ww,
/*iw的算法为周一至周日算一周,且每年的第一个周一为第一周*/
to_char(日期,'iw') as iw
FROM X;

日期 D DAY 下个周日 WW IW
-----------------------------------------------------------------------------------
2021-01-01 00:00:00 6 friday 2021-01-03 00:00:00 01 53
2021-01-02 00:00:00 7 saturday 2021-01-03 00:00:00 01 53
2021-01-03 00:00:00 1 sunday 2021-01-10 00:00:00 01 53
2021-01-04 00:00:00 2 monday 2021-01-10 00:00:00 01 01
2021-01-05 00:00:00 3 tuesday 2021-01-10 00:00:00 01 01
2021-01-06 00:00:00 4 wednesday 2021-01-10 00:00:00 01 01
2021-01-07 00:00:00 5 thursday 2021-01-10 00:00:00 01 01
2021-01-08 00:00:00 6 friday 2021-01-10 00:00:00 02 01

年内所有周内某天

返回指定年份内的所有周五:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
WITH X AS (
SELECT trunc(sysdate,'y')+(level-1) as dy
FROM DUAL CONNECT BY
LEVEL <= add_months(trunc(sysdate,'y'),12)-trunc(sysdate,'y')
)
SELECT
dy,
to_char(dy,'day') as 周五
FROM X WHERE TO_CHAR(dy,'d')=6;

DY 周五
-------------------------------
2021-01-01 00:00:00 friday
2021-01-08 00:00:00 friday
2021-01-15 00:00:00 friday
2021-01-22 00:00:00 friday
2021-01-29 00:00:00 friday
......

# 当使用参数“day”时,不同字符集返回的结果不一样
# 但“d”不受影响,所以用来判断周几

月内首/末个周内某天

返回当月内第一个周一与最后一个周一,分别找上月末及当月末之前七天的下一周周一即可:

1
2
3
4
SELECT
next_day(trunc(sysdate,'mm')-1,2) as 第一个周一,
next_day(last_day(trunc(sysdate,'mm'))-7,2) as 最后一个周一
FROM DUAL;

创建本月日历

  1. 枚举指定月份所有的日期
  2. 转换为对应的周信息
  3. 按所在周做一次“行转列”
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
WITH
x1 AS
/*1. 给定一个日期*/
(SELECT TO_DATE('2021-05-24','yyyy-mm-dd') AS cur_date FROM dual),
x2 AS
/*2. 取月初*/
(SELECT TRUNC(cur_date,'mm') AS 月初,add_months(TRUNC(cur_date,'mm'),1) AS 下月初 FROM x1),
x3 AS
/*3. 枚举当月所有的天*/
(SELECT 月初 + (LEVEL - 1) ASFROM x2 CONNECT BY LEVEL <= (下月初 - 月初)),
x4 AS
/*4. 提取周信息*/
(SELECT TO_CHAR(日,'iw') AS 所在周,
TO_CHAR(日,'dd') AS 日期,
TO_NUMBER(TO_CHAR(日,'d')) AS 周几 FROM x3)

SELECT MAX(CASE 周几 WHEN 2 THEN 日期 END) 周一,
MAX(CASE 周几 WHEN 3 THEN 日期 END) 周二,
MAX(CASE 周几 WHEN 4 THEN 日期 END) 周三,
MAX(CASE 周几 WHEN 5 THEN 日期 END) 周四,
MAX(CASE 周几 WHEN 6 THEN 日期 END) 周五,
MAX(CASE 周几 WHEN 7 THEN 日期 END) 周六,
MAX(CASE 周几 WHEN 1 THEN 日期 END) 周日
FROM x4
GROUP BY 所在周
ORDER BY 所在周;

全年日历

注意:第一周的数据 to_char(日期,’iw’) 返回值为去年第53周,需转换。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
WITH
x0 AS
(SELECT 2021 AS 年份 FROM dual),
x1 AS
(SELECT trunc(to_date(年份,'yyyy'),'YYYY') AS 本年初,
add_months(trunc(to_date(年份,'yyyy'),'YYYY'),12) AS 下年初 FROM x0),
x2 AS
/*枚举日期*/
(SELECT 本年初 + (LEVEL - 1) AS 日期 FROM x1 CONNECT BY LEVEL <= 下年初 - 本年初),
x3 AS
/*取月份,及周信息*/
(SELECT to_char(日期,'dd') AS 日期,
to_char(日期,'mm') AS 所在月份,
to_char(日期,'iw') AS 所在周,
to_number(to_char(日期,'d')) AS 周几 FROM x2),
x4 AS
/*修正周,去年12月的“第五十三周”改为“第0周”*/
(SELECT 日期,
所在月份,
CASE WHEN 所在月份 = '01' AND 所在周 = '53' THEN '00' ELSE 所在周 END AS 所在周,
周几 FROM x3)

SELECT CASE WHEN lag(所在月份) over(order by 所在周) = 所在月份 THEN NULL ELSE 所在月份 END AS 月份,
所在周,
MAX(CASE 周几 WHEN 2 THEN 日期 END) 周一,
MAX(CASE 周几 WHEN 3 THEN 日期 END) 周二,
MAX(CASE 周几 WHEN 4 THEN 日期 END) 周三,
MAX(CASE 周几 WHEN 5 THEN 日期 END) 周四,
MAX(CASE 周几 WHEN 6 THEN 日期 END) 周五,
MAX(CASE 周几 WHEN 7 THEN 日期 END) 周六,
MAX(CASE 周几 WHEN 1 THEN 日期 END) 周日
FROM x4
GROUP BY 所在月份,所在周
ORDER BY 2;

季度的开始结束日期

使用场景:生成汇总报表时,按季度分类汇总。

1
2
3
4
5
SELECT sn AS 季度,
(sn-1) * 3 + 1 AS 开始月份,
add_months(to_date(年,'yyyy-mm'),(sn-1)*3) AS 开始日期,
add_months(to_date(年,'yyyy-mm'),sn*3) - 1 AS 结束日期
FROM (SELECT '2021-01' AS 年, LEVEL AS sn FROM dual CONNECT BY LEVEL<=4);

补充范围内缺少的值

使用场景:业务数据不连续,为了统计显示美观,补充并给予默认值。

1
2
3
4
5
6
7
8
WITH x AS
(SELECT 开始年份 + (LEVEL - 1) AS 年份 FROM (
SELECT extract(YEAR FROM MIN(hiredate)) AS 开始年份,
extract(YEAR FROM MAX(hiredate)) AS 结束年份
FROM scott.emp WHERE hiredate IS NOT NULL)
CONNECT BY LEVEL <= 结束年份-开始年份 + 1)

SELECT * FROM x;

使用日期特殊部分比较

使用场景:统计相同月份与周内日期聘用的员工。

1
2
3
4
5
6
7
8
9
10
11
# 使用分析函数 count() over() 只访问一次emp,效率高
SELECT ename AS 姓名,
hiredate AS 聘用日期,
to_char(hiredate,'MON day') AS 月周
FROM (
SELECT ename,
hiredate,
COUNT(*) OVER(PARTITION BY to_char(hiredate,'MON day')) AS ct
FROM emp
)
WHERE ct > 1;

识别重叠的日期范围

使用场景:很多员工在旧工程结束之前就开始了新的工程,现要求返回这些工程时间重复的数据。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
# 使用分析函数 LAG 和 LEAD,分别访问结果集的前一行和后一行
# lpad 和 rpad 分别为向左、向右字符填充函数
SELECT a.员工编码,
a.姓名,
a.工程号,
a.开始日期,
a.结束日期,
CASE WHEN 上一工程结束日期 >= 开始日期 /*筛选时间重复的数据*/
THEN '(工程 ' || lpad(a.工程号,2,'0') || ') 与工程 ( ' || lpad(a.上一工程号,2,'0') || ')重复'
END AS 描述
FROM (
SELECT empno AS 员工编码,
ename AS 姓名,
proj_id AS 工程号,
proj_start AS 开始日期,
proj_end AS 结束日期,
lag(proj_end) over(partition by empno order by proj_start) AS 上一工程结束日期,
lag(proj_id) over(partition by empno order by proj_start) AS 上一工程号
FROM emp_project
) a

按指定间隔汇总数据

使用场景:按指定时间间隔(10分钟)汇总数据。

  1. 截取到分钟,提取分钟信息。
  2. 对10取余。
  3. 时间减去余数。
1
2
3
4
5
6
7
8
9
# MOD为取余函数
SELECT gp,COUNT(*) AS cnt
FROM (
SELECT TIMESTAMP,
trunc(TIMESTAMP,'mi')-MOD(to_char(TIMESTAMP,'mi'),10)/24/60 AS gp,
action,
action_name
FROM t_trail
) GROUP BY gp;
0%