Oracle 关联更新 update

Hehuyi_In

17695人浏览 · 2022-04-17 10:35:33

Hehuyi_In · 2022-04-17 10:35:33 发布

以下案例及写法来自 triger liu《专题培训-SQL写法与改写》，有部分个人测试及删改，原文请参考原课程。

一、创建测试表

--数据源表
create table testa as select * from dba_objects where rownum<=5000;

--被update的目标表
create table testb as select * from dba_objects where rownum<=30000;

--数据源表关联字段上要有索引, 可以先不建, 看看效率
create unique index UIDX_TESTA_OBJECT_ID on testa(object_id);

二、各种关联update写法

1. 通常可能错误的写法

update testb b set object_name=
(select a.object_name from testa a where a.object_id=b.object_id) ;

错在哪里？

主表testb没有where条件，所以一定会被全表更新，符合a.object_id=b.object_id的被更新为a.object_name，不符合的则被更新为null。通常这不是我们想要的。

2. 普通正确写法

where条件加in或者exist，确保只更新符合a.object_id=b.object_id的，其余不更新。
数据量小时性能ok，数据量大时效率低，因为本质是nest loop join。
testa表的关联字段object_id要有，否则外层有多少行内层就执行多少次全表扫描。

--exists
update testb b set object_name=(select a.object_name from testa a where
a.object_id=b.object_id)
where exists (select 1 from testa a where a.object_id=b.object_id);

--in
update testb b set object_name=(select a.object_name from testa a where
a.object_id=b.object_id)
where b.object_id in (select object_id from testa a)

但是这个写法有什么问题？

即使 testb表的object_name的值已经跟testa是一样的，还是会照样更新。尤其当反复执行该update时，会发现明明已经没有要更新的，每次还是更新5000行（补充：MySQL不会有这个问题）。

3. 改进写法

只更新需要更新的匹配记录。如果b表与a表object_name原本就相等，或者均为空，则也不重复更新。注意是用lnnvl不是<>，否则会漏掉空值（类似or的改写）。

update testb b set
object_name=(select a.object_name from testa a where a.object_id=b.object_id)
where exists (select 1 from testa a where a.object_id=b.object_id
and lnnvl(a.object_name=b.object_name)
);

4. 推荐用法 merge

简洁、高效，在数据量多和少的时候都是高效的
可以不需要创建索引（如果是大表处理少量, 也建议使用索引）
还能同时实现 insert匹配不到的记录，插入到testb

merge into testb b
using testa a
on (b.object_id=a.object_id)
when matched then
update set b.object_name = a.object_name
where lnnvl(a.object_name=b.object_name); -- 只更新不同值的行

关于merge的原理和之前遇到的一个问题，参考 Oracle merge into 为何只能update不能insert ？_Hehuyi_In的博客-CSDN博客

三、超大表的关联更新

直接将更新后的数据拼成新表，然后重命名。

因为update是dml操作，它要对每一行更新记录新值和旧值，对于大表会产生非常大的日志信息，效率很低。

1. 创建新表

create table testb_new as
select
b.OWNER , decode(a.object_id, null, b.object_name, a.object_name)
b.SUBOBJECT_NAME , b.OBJECT_ID , b.DATA_OBJECT_ID , b.OBJECT_TYPE ,
b.CREATED , b.LAST_DDL_TIME , b.TIMESTAMP , b.STATUS ,
b.TEMPORARY , b.GENERATED , b.SECONDARY , b.NAMESPACE ,
b.EDITION_NAME
from testb b left join testa a
on b.object_id=a.object_id;

语句分析：

因为只更新object_name字段，所以只有那个字段需要处理，其余照抄。如果还要更新别的字段，对着object_name的写法改即可。

decode(条件,值1,返回值1,值2,返回值2,…值n,返回值n,缺省值)

decode(a.object_id, null, b.object_name, a.object_name)

如果 a.object_id is null，即匹配不到 b.object_id，则返回 b.object_name（不需要更新）。否则如果匹配到了 a.object_id = b.object_id，则返回a.object_name（将对应值更新为a.object_name）

为什么是left join，因为testb是被更新表，是主表，关联完后所有数据要保留

2. 改名

rename testb to testb_old;
rename testb_new to testb;

四、更新关联字段的merge写法

常规merge语法会报错:

merge into testb b
using testa a
on (b.object_id=a.object_id)
when matched then
update set b.object_id = a.data_object_id;

可以改为用rowid 关联

merge into testb b
using (select b.rowid as rid,a.data_object_id from testa a,testb b where b.object_id =a.object_id) x
on (b.rowid=x.rid)
when matched then
update set b.object_id = x.data_object_id;