bat365在线登录-直播365体育-365BET是不是上市了

重建索引要多久

bat365在线登录 📅 2025-08-26 08:49:43 👤 admin 👁️ 5786 ❤️ 10

重建索引要多久

前言最近有套XXX系统的数据库要割接，我们团队已经模拟割接几遍了，各种时间算下来基本上要4小时。

突然QQ群滴滴声一响，我一看是XXX系统的开发妹子头像在跳动，我是真滴怕了她啊。

她娇滴滴的说道：“DBA小哥哥，我们的割接计划定下来了，割接时间2小时，能搞定不？”

我们几个一盘算：“我去，这不是不可能完成的任务吗？”

然后一番讨价还价，各砍50大板，定到3个小时。

为了提高速度，我们首先对表进行了分类。一部分表的数据是静态的，比如那些历史月表，因为数据不会变化，这些表可以提前迁移过去。剩下的动态数据表就只能当晚停了库迁。

备端的表事先都创建好了，再迁移数据之前，我们会把表上的索引删掉，等迁完之后再重建。而有的表数据量很大，重建索引的速度会很慢。我们今天就要从这块开刀，压缩一下索引重建的时间。

并行首先我们应该想到的就是利用并行来增加速度。当前我们的版本是PostgreSQL 12。默认创建索引会根据参数max_parallel_maintenance_workers来决定和并行度。这个参数的含义是控制可用于创建btree索引的辅助进程的最大数量。该进程默认值为2，当前在我们的服务器上设置是4.

e2e=> show max_parallel_maintenance_workers;

max_parallel_maintenance_workers

----------------------------------

4

(1 row)

e2e=> select pg_size_pretty(pg_relation_size('e2e_busi_accept'));

pg_size_pretty

----------------

70 GB

(1 row)

e2e=> create index idx1 on e2e_busi_accept(cust_id,staff_id,create_date);

CREATE INDEX

Time: 217000.335 ms (03:37.000)可以从另外一个进程观察pg_stat_activity视图看到开启了4个进程并行创建

postgres=# select query from pg_stat_activity WHERE backend_type = 'parallel worker';

query

---------------------------------------------------------------------

create index idx1 on e2e_busi_accept(cust_id,staff_id,create_date);

create index idx1 on e2e_busi_accept(cust_id,staff_id,create_date);

create index idx1 on e2e_busi_accept(cust_id,staff_id,create_date);

create index idx1 on e2e_busi_accept(cust_id,staff_id,create_date);

(4 rows)可以看到在4个并行进程下创建时间是3分37秒。这速度还行，我们把并行进程在加一下试试，将参数改到64在测试。

e2e=> SET max_parallel_maintenance_workers TO 64;

SET

Time: 1.102 ms

e2e=> create index idx1 on e2e_busi_accept(cust_id,staff_id,create_date);

CREATE INDEX

Time: 156543.775 ms (02:36.544)可见此次速度提示到了2分36秒，虽然参数max_parallel_maintenance_workers开了64个，确没有达到64。只使用了9个。

postgres=# select query from pg_stat_activity WHERE backend_type = 'parallel worker';

query

---------------------------------------------------------------------

create index idx1 on e2e_busi_accept(cust_id,staff_id,create_date);

create index idx1 on e2e_busi_accept(cust_id,staff_id,create_date);

create index idx1 on e2e_busi_accept(cust_id,staff_id,create_date);

create index idx1 on e2e_busi_accept(cust_id,staff_id,create_date);

create index idx1 on e2e_busi_accept(cust_id,staff_id,create_date);

create index idx1 on e2e_busi_accept(cust_id,staff_id,create_date);

create index idx1 on e2e_busi_accept(cust_id,staff_id,create_date);

create index idx1 on e2e_busi_accept(cust_id,staff_id,create_date);

create index idx1 on e2e_busi_accept(cust_id,staff_id,create_date);

(9 rows)这有点奇怪，通过研究发现不仅仅系统参数max_parallel_maintenance_workers要设置，还需要设置表上的并行度。通过设置表上的并行度也为64，就可以彻底的打开64个并行。

e2e=> alter table e2e_busi_accept set (parallel_workers=64);

ALTER TABLE

Time: 4.798 ms

e2e=> create index idx1 on e2e_busi_accept(cust_id,staff_id,create_date);

CREATE INDEX

Time: 144274.618 ms (02:24.275)在另外一个窗口查看，只能开出47个并行进程。这主要是受限于max_parallel_workers参数的限制。

postgres=# select count(1) from pg_stat_activity WHERE backend_type = 'parallel worker';

count

-------

47

(1 row)

postgres=# show max_parallel_workers;

max_parallel_workers

----------------------

48

(1 row)我们从8个调整到48个进程，而创建时间只从2分36秒提升到2分24秒，缩减了12秒，上升空间不是特别大。并行这条路暂时无法得到线性的提升。

修改参数默认情况下，创建索引需要排序，而排序我们更希望它能够多在内存中执行。所以我们需要调整参数maintenance_work_mem。它指定每次索引构建操作整体可用的最大内存量。我们当前数据库设置的是2GB，我们把它先设置成4GB。

postgres=# show maintenance_work_mem;

maintenance_work_mem

----------------------

2GB

(1 row)

e2e=> SET maintenance_work_mem TO '4 GB';

SET

e2e=> create index idx1 on e2e_busi_accept(cust_id,staff_id,create_date);

CREATE INDEX

Time: 134956.833 ms (02:14.957)设置成4GB又能提升10s。提升也是比较小。这是因为我们服务器的maintenance_work_mem参数设置的本身不小。如果你是默认值，那提升到4GB的影响还是很显著的。

缩写键最后说到的一个算法叫缩写键abbreviated keys，这个算法加快了字符/字符串列（文本和varchar（n））的排序和索引创建速度。

首先我们需要将数据在外部进行预先排序，导入之后创建索引，这样索引的创建速度非常快。但是这对我们迁移的要求就高起来了。

以下用个小案例进行演示。

e2e=> set max_parallel_workers_per_gather=48;

SET

e2e=> select cust_id,staff_id,create_date into indexing_e2e_busi_accept_sorted from e2e_busi_accept order by cust_id,staff_id,create_date;

SELECT 73101417

Time: 160949.947 ms (02:40.950)首先我们造一个表，是预排序插入数据的。

hbe2e=> create index idx1 on indexing_e2e_busi_accept_sorted(cust_id,staff_id,create_date);

CREATE INDEX

Time: 74960.184 ms (01:14.960)接下类在预排序的表上创建索引可以发现，速度是1分14秒。

如果在C语言环境和排序规则下，据称速度还会得到提升。

create database testdb lc_collate "C" lc_ctype "C" template template0;

通过pg_dump导入indexing_e2e_busi_accept_sorted的数据

testdb=# select cust_id,staff_id,create_date into indexing_e2e_busi_accept_sorted2 from indexing_e2e_busi_accept_sorted order by cust_id,staff_id,create_date;

SELECT 73101417

Time: 116103.450 ms (01:56.103)

testdb=# create index idx1 on indexing_e2e_busi_accept_sorted2(cust_id,staff_id,create_date);

CREATE INDEX

Time: 64462.081 ms (01:02.462)wiki上表示能提升20倍的速度，但是通过测试速度大概只能提升15-17%%，

需要注意的是，使用非C排序规则可能会产生Bug，出现缺少索引扫描的行。具体Bug参考：https://wiki.postgresql.org/wiki/Abbreviated_keys_glibc_issue总结我们在创建索引的过程中，可以通过并行和调整内存参数来加快创建速度。同时我们还可以使用预排序技术。

最后在割接的过程中我们通过并行+参数的方式，将创建索引的速度大幅提升。满足了开发妹子的要求。

← 苹果手机拍不了视频怎么回事？购房指南：在美国买一套房子需要多少钱？ →

相关推荐

bat365在线登录

湟鱼洄游：一条鱼的生命传奇

2022年6月29日，在青海省海北藏族自治州刚察县泉吉河，湟鱼正在洄游。（张海东摄）它的家，在青海湖蓝蓝的湖水，高寒，又有盐碱虽是一个

📅 07-17 ⭐ 5625

从零起步：学习数据结构的完整路径

直播365体育

从零起步：学习数据结构的完整路径

文章目录 1. 基础概念和前置知识2. 线性数据结构3. 栈和队列4. 树结构5. 图结构6. 散列表和哈希表7. 高级数据结构8. 复杂性分析和算法设计9. 实践

📅 07-08 ⭐ 155

美国站亚马逊会员日折扣规则是什么？会员日持续多久？

bat365在线登录

美国站亚马逊会员日折扣规则是什么？会员日持续多久？

亚马逊Prime会员日是亚马逊一年一度的重要购物节，专门为Prime会员提供大量折扣和优惠。这一活动在全球范围内吸引了大量消费者参与，成为购

📅 07-29 ⭐ 506