黄石企业网站建设,上海软件网站建设,网页制作自学教程,wordpress慢谷歌字体目录结构 注#xff1a;提前言明 本文借鉴了以下博主、书籍或网站的内容#xff0c;其列表如下#xff1a; 1、参考书籍#xff1a;《PostgreSQL数据库内核分析》 2、参考书籍#xff1a;《数据库事务处理的艺术#xff1a;事务管理与并发控制》 3、PostgreSQL数据库仓库… 目录结构 注提前言明 本文借鉴了以下博主、书籍或网站的内容其列表如下 1、参考书籍《PostgreSQL数据库内核分析》 2、参考书籍《数据库事务处理的艺术事务管理与并发控制》 3、PostgreSQL数据库仓库链接点击前往 4、日本著名PostgreSQL数据库专家 铃木启修 网站主页点击前往 5、参考书籍《PostgreSQL中文手册》 6、参考书籍《PostgreSQL指南内幕探索》点击前往 7、参考书籍《事务处理 概念与技术》 8、本人 pg_dirtyread git仓库点击前往 1、本文内容全部来源于开源社区 GitHub和以上博主的贡献本文也免费开源可能会存在问题评论区等待大佬们的指正 2、本文目的开源共享 抛砖引玉 一起学习 3、本文不提供任何资源 不存在任何交易 与任何组织和机构无关 4、大家可以根据需要自行 复制粘贴以及作为其他个人用途但是不允许转载 不允许商用 写作不易还请见谅 5、本文内容基于PostgreSQL15.1源码开发而成 PostgreSQL数据库开源扩展pg_dirtyread的使用场景和实现原理文章快速说明索引问题描述背景说明功能实现源码解析功能实现细节说明文章快速说明索引
学习目标
做数据库内核开发久了就会有一种 少年得志年少轻狂 的错觉然鹅细细一品觉得自己其实不算特别优秀 远远没有达到自己想要的。也许光鲜的表面掩盖了空洞的内在每每想到于此皆有夜半临渊如履薄冰之感。为了睡上几个踏实觉即日起 暂缓其他基于PostgreSQL数据库的兼容功能开发近段时间 将着重于学习分享Postgres的基础知识和实践内幕。 学习内容详见目录
1、PostgreSQL数据库开源扩展pg_dirtyread的使用场景和实现原理 学习时间
2023-02-16 10:18:54 学习产出
1、PostgreSQL数据库基础知识回顾 1个 2、CSDN 技术博客 1篇 3、PostgreSQL数据库内核深入学习 注下面我们所有的学习环境是Centos7PostgreSQL15.1(pg_backtrace1.0)Oracle19CMySQL8.0
postgres# select version();version
-----------------------------------------------------------------------------PostgreSQL 14.4 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 7.1.0, 64-bit
(1 row)postgres# \dxList of installed extensionsName | Version | Schema | Description
-----------------------------------------------------------------pg_backtrace | 1.0 | public | Dump backtrace i errorsplpgsql | 1.0 | pg_catalog | PL/pgSQL procedural language
(2 rows)postgres##-----------------------------------------------------------------------------#SQL select * from v$version; BANNER Oracle Database 19c EE Extreme Perf Release 19.0.0.0.0 - Production
BANNER_FULL Oracle Database 19c EE Extreme Perf Release 19.0.0.0.0 - Production Version 19.17.0.0.0
BANNER_LEGACY Oracle Database 19c EE Extreme Perf Release 19.0.0.0.0 - Production
CON_ID 0#-----------------------------------------------------------------------------#mysql select version();
-----------
| version() |
-----------
| 8.0.27 |
-----------
1 row in set (0.06 sec)mysql问题描述背景说明
pg_dirtyread这个插件之前我们已经多次提过一直没有详细介绍过其使用场景和实现原理。接下来我们快速过一下这个短小精悍功能强大的插件
[postgressong197:~/postgres/contrib → REL_15_1]$ git clone gitgithub.com:TsinghuaLucky912/pg_dirtyread.git
Cloning into pg_dirtyread...
remote: Enumerating objects: 422, done.
remote: Counting objects: 100% (27/27), done.
remote: Compressing objects: 100% (23/23), done.
remote: Total 422 (delta 10), reused 15 (delta 4), pack-reused 395
Receiving objects: 100% (422/422), 88.79 KiB | 0 bytes/s, done.
Resolving deltas: 100% (238/238), done.
[postgressong197:~/postgres/contrib → REL_15_1]$ [postgressong197:~/postgres/contrib → REL_15_1]$ cd pg_dirtyread/
[postgressong197:~/postgres/contrib/pg_dirtyread → master]$ ls
contrib dirtyread_tupconvert.c expected Makefile pg_dirtyread--1.0.sql pg_dirtyread.c README.md tupconvert.c.upstream
debian dirtyread_tupconvert.h LICENSE pg_dirtyread--1.0--2.sql pg_dirtyread--2.sql pg_dirtyread.control sql tupconvert.h.upstream
[postgressong197:~/postgres/contrib/pg_dirtyread → master]$
[postgressong197:~/postgres/contrib/pg_dirtyread → master]$
[postgressong197:~/postgres/contrib/pg_dirtyread → master]$
[postgressong197:~/postgres/contrib/pg_dirtyread → master]$ make
gcc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werrorvla -Wendif-labels -Wmissing-format-attribute -Wimplicit-fallthrough3 -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precisionstandard -Wno-format-truncation -g -O0 -fPIC -I. -I./ -I/home/postgres/test/include/server -I/home/postgres/test/include/internal -D_GNU_SOURCE -c -o pg_dirtyread.o pg_dirtyread.c
gcc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werrorvla -Wendif-labels -Wmissing-format-attribute -Wimplicit-fallthrough3 -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precisionstandard -Wno-format-truncation -g -O0 -fPIC -I. -I./ -I/home/postgres/test/include/server -I/home/postgres/test/include/internal -D_GNU_SOURCE -c -o dirtyread_tupconvert.o dirtyread_tupconvert.c
gcc -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Werrorvla -Wendif-labels -Wmissing-format-attribute -Wimplicit-fallthrough3 -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precisionstandard -Wno-format-truncation -g -O0 -fPIC -shared -o pg_dirtyread.so pg_dirtyread.o dirtyread_tupconvert.o -L/home/postgres/test/lib -Wl,--as-needed -Wl,-rpath,/home/postgres/test/lib,--enable-new-dtags
[postgressong197:~/postgres/contrib/pg_dirtyread → master]$ ls
contrib dirtyread_tupconvert.c dirtyread_tupconvert.o LICENSE pg_dirtyread--1.0--2.sql pg_dirtyread--2.sql pg_dirtyread.control pg_dirtyread.so sql tupconvert.h.upstream
debian dirtyread_tupconvert.h expected Makefile pg_dirtyread--1.0.sql pg_dirtyread.c pg_dirtyread.o README.md tupconvert.c.upstream
[postgressong197:~/postgres/contrib/pg_dirtyread → master]$
[postgressong197:~/postgres/contrib/pg_dirtyread → master]$
[postgressong197:~/postgres/contrib/pg_dirtyread → master]$ make install -j8
/usr/bin/mkdir -p /home/postgres/test/lib
/usr/bin/mkdir -p /home/postgres/test/share/extension
/usr/bin/install -c -m 755 pg_dirtyread.so /home/postgres/test/lib/pg_dirtyread.so
/usr/bin/mkdir -p /home/postgres/test/share/extension
/usr/bin/install -c -m 644 .//pg_dirtyread.control /home/postgres/test/share/extension/
/usr/bin/install -c -m 644 .//pg_dirtyread--1.0.sql .//pg_dirtyread--1.0--2.sql .//pg_dirtyread--2.sql /home/postgres/test/share/extension/
[postgressong197:~/postgres/contrib/pg_dirtyread → master]$ pg_dirtyread 扩展提供了从表中读取死的但未清理的行的能力。支持 PostgreSQL 9.2 及更高版本。在 9.2 上至少需要 9.2.9。一旦构建并安装了 pg_dirtyread您就可以将它添加到数据库中。加载 pg_dirtyread 就像以超级用户身份连接到数据库并运行一样简单 CREATE EXTENSION pg_dirtyread;SELECT * FROM pg_dirtyread(tablename) AS t(col1 type1, col2 type2, ...);需要注意pg_dirtyread() 函数返回 RECORD因此有必要附加一个描述表模式的表别名子句。列按名称匹配因此可以在别名中省略某些列或重新排列列。如下
postgres# \dxList of installed extensionsName | Version | Schema | Description
------------------------------------------------------------plpgsql | 1.0 | pg_catalog | PL/pgSQL procedural language
(1 row)postgres# CREATE EXTENSION pg_dirtyread;
CREATE EXTENSION
postgres# \dxList of installed extensionsName | Version | Schema | Description
-----------------------------------------------------------------------------pg_dirtyread | 2 | public | Read dead but unvacuumed rows from tableplpgsql | 1.0 | pg_catalog | PL/pgSQL procedural language
(2 rows)postgres# CREATE TABLE foo (bar bigint, baz text);
CREATE TABLE
postgres# ALTER TABLE foo SET (
postgres(# autovacuum_enabled false, toast.autovacuum_enabled false
postgres(# );
ALTER TABLE
postgres# select * from pg_type where typname like %foo%;oid | typname | typnamespace | typowner | typlen | typbyval | typtype | typcategory | typispreferred | typisdefined | typdelim | typrelid | typsubscript | typelem | typarray | typinput | typoutput | typreceive | typsend | typmodin | typmodout | typanalyze | typalign | typstorage | typnotnull | typbasetype | typtypmod | typndims | typcollation | typdefaultbin | typdefault | typacl
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------16388 | foo | 2200 | 10 | -1 | f | c | C | f | t | , | 16386 | - | 0 | 16387 | record_in | record_out | record_recv | record_send | - | - | - | d | x | f | 0 | -1 | 0 | 0 | | | 16387 | _foo | 2200 | 10 | -1 | f | b | A | f | t | , | 0 | array_subscript_handler | 16388 | 0 | array_in | array_out | array_recv | array_send | - | - | array_typanalyze | d | x | f | 0 | -1 | 0 | 0 | | |
(2 rows)postgres#
postgres# INSERT INTO foo VALUES (1, Test), (2, New Test);
INSERT 0 2
postgres#postgres# create extension pageinspect ;
CREATE EXTENSION
postgres# \dxList of installed extensionsName | Version | Schema | Description
------------------------------------------------------------------------------------------pageinspect | 1.10 | public | inspect the contents of database pages at a low levelpg_dirtyread | 2 | public | Read dead but unvacuumed rows from tableplpgsql | 1.0 | pg_catalog | PL/pgSQL procedural language
(3 rows)postgres# select * from heap_page_items(get_raw_page(foo,0)) ;lp | lp_off | lp_flags | lp_len | t_xmin | t_xmax | t_field3 | t_ctid | t_infomask2 | t_infomask | t_hoff | t_bits | t_oid | t_data
------------------------------------------------------------------------------------------------------------------------------------------------------1 | 8152 | 1 | 37 | 736 | 0 | 0 | (0,1) | 2 | 2050 | 24 | | | \x01000000000000000b546573742 | 8104 | 1 | 41 | 736 | 0 | 0 | (0,2) | 2 | 2050 | 24 | | | \x0200000000000000134e65772054657374
(2 rows)postgres# DELETE FROM foo WHERE bar 1;
DELETE 1
postgres# select * from heap_page_items(get_raw_page(foo,0)) ;lp | lp_off | lp_flags | lp_len | t_xmin | t_xmax | t_field3 | t_ctid | t_infomask2 | t_infomask | t_hoff | t_bits | t_oid | t_data
------------------------------------------------------------------------------------------------------------------------------------------------------1 | 8152 | 1 | 37 | 736 | 738 | 0 | (0,1) | 8194 | 258 | 24 | | | \x01000000000000000b546573742 | 8104 | 1 | 41 | 736 | 0 | 0 | (0,2) | 2 | 2306 | 24 | | | \x0200000000000000134e65772054657374
(2 rows)postgres# select * from foo;bar | baz
---------------2 | New Test
(1 row)postgres# select * from heap_page_items(get_raw_page(foo,0)) ;lp | lp_off | lp_flags | lp_len | t_xmin | t_xmax | t_field3 | t_ctid | t_infomask2 | t_infomask | t_hoff | t_bits | t_oid | t_data
------------------------------------------------------------------------------------------------------------------------------------------------------1 | 8152 | 1 | 37 | 736 | 738 | 0 | (0,1) | 8194 | 1282 | 24 | | | \x01000000000000000b546573742 | 8104 | 1 | 41 | 736 | 0 | 0 | (0,2) | 2 | 2306 | 24 | | | \x0200000000000000134e65772054657374
(2 rows)postgres#如上foo表并没有vacuum所以使用pageinspect也是可以看到 死元组 的存在如下使用pg_dirtyread再看一下(并看一下清理之后的)
postgres# select * from foo;bar | baz
---------------2 | New Test
(1 row)postgres# SELECT * FROM pg_dirtyread(foo) as t(bar bigint, baz text);bar | baz
---------------1 | Test2 | New Test
(2 rows)postgres# vacuum foo;
VACUUM
postgres# SELECT * FROM pg_dirtyread(foo) as t(bar bigint, baz text);bar | baz
---------------2 | New Test
(1 row)postgres# select * from heap_page_items(get_raw_page(foo,0)) ;lp | lp_off | lp_flags | lp_len | t_xmin | t_xmax | t_field3 | t_ctid | t_infomask2 | t_infomask | t_hoff | t_bits | t_oid | t_data
------------------------------------------------------------------------------------------------------------------------------------------------------1 | 0 | 0 | 0 | | | | | | | | | | 2 | 8144 | 1 | 41 | 736 | 0 | 0 | (0,2) | 2 | 2306 | 24 | | | \x0200000000000000134e65772054657374
(2 rows)postgres#在pg_dirtyread的新版本中又新增了一些功能(Dropped Columns和System Columns) 如下 Dropped Columns只要表没有被重写例如通过 VACUUM FULL 或 CLUSTER就可以检索删除列的内容。使用 dropped_N 访问第 N 列从 1 开始计数(下面有演示)。PostgreSQL 删除了原始列的类型信息因此如果在表别名中指定了正确的类型则只能进行少量完整性检查检查的是类型长度、类型对齐、类型修饰符和按值传递。
postgres# \d
Did not find any relations.
postgres# CREATE TABLE ab(a text, b text);
CREATE TABLE
postgres# INSERT INTO ab VALUES (Hello, World);
INSERT 0 1
postgres# ALTER TABLE ab DROP COLUMN b;
ALTER TABLE
postgres# DELETE FROM ab;
DELETE 1
postgres# select * from ab;a
---
(0 rows)postgres# SELECT * FROM pg_dirtyread(ab) ab(a text, dropped_2 text);a | dropped_2
------------------Hello | World
(1 row)postgres# SELECT * FROM pg_dirtyread(ab) ab(a text, b text);
2023-02-16 11:24:17.799 CST [4660] ERROR: Error converting tuple descriptors!
2023-02-16 11:24:17.799 CST [4660] DETAIL: Attribute b does not exist in type ab.
2023-02-16 11:24:17.799 CST [4660] STATEMENT: SELECT * FROM pg_dirtyread(ab) ab(a text, b text);
ERROR: Error converting tuple descriptors!
DETAIL: Attribute b does not exist in type ab.
postgres#System Columns可以通过将它们包含在附加到 pg_dirtyread() 调用的表别名中来检索系统列例如 xmax 和 ctid。一个布尔类型的特殊列 dead 可用于报告死行如 HeapTupleIsSurelyDead。死列在恢复期间不可用即最明显的是在备用服务器上不可用。oid 列仅在 PostgreSQL 版本 11 及更早版本中可用。
postgres# CREATE TABLE foo2 (bar bigint, baz text);
CREATE TABLE
postgres# ALTER TABLE foo2 SET (
postgres(# autovacuum_enabled false, toast.autovacuum_enabled false
postgres(# );
ALTER TABLE
postgres# INSERT INTO foo2 VALUES (1, Test), (2, New Test);
INSERT 0 2
postgres# select * from foo2 ;bar | baz
---------------1 | Test2 | New Test
(2 rows)postgres# DELETE FROM foo2 WHERE bar 1;
DELETE 1
postgres# SELECT * FROM pg_dirtyread(foo2) AS t(tableoid oid, ctid tid, xmin xid, xmax xid, cmin cid, cmax cid, dead boolean,
postgres(# bar bigint, baz text);tableoid | ctid | xmin | xmax | cmin | cmax | dead | bar | baz
--------------------------------------------------------------16442 | (0,1) | 747 | 748 | 0 | 0 | f | 1 | Test16442 | (0,2) | 747 | 0 | 0 | 0 | f | 2 | New Test
(2 rows)postgres#当然这两个也可以结合在一起进行使用如下
postgres# SELECT * FROM pg_dirtyread(foo2) AS t(tableoid oid, ctid tid, xmin xid, xmax xid, cmin cid, cmax cid, dead boolean,bar bigint, baz text);tableoid | ctid | xmin | xmax | cmin | cmax | dead | bar | baz
--------------------------------------------------------------16442 | (0,1) | 747 | 748 | 0 | 0 | f | 1 | Test16442 | (0,2) | 747 | 0 | 0 | 0 | f | 2 | New Test
(2 rows)postgres# ALTER TABLE foo2 DROP COLUMN baz;
ALTER TABLE
postgres# SELECT * FROM pg_dirtyread(foo2) AS t(tableoid oid, ctid tid, xmin xid, xmax xid, cmin cid, cmax cid, dead boolean,bar bigint, dropped_2 text);tableoid | ctid | xmin | xmax | cmin | cmax | dead | bar | dropped_2
---------------------------------------------------------------16442 | (0,1) | 747 | 748 | 0 | 0 | f | 1 | Test16442 | (0,2) | 747 | 0 | 0 | 0 | f | 2 | New Test
(2 rows)postgres# vacuum foo2 ;
VACUUM
postgres# SELECT * FROM pg_dirtyread(foo2) AS t(tableoid oid, ctid tid, xmin xid, xmax xid, cmin cid, cmax cid, dead boolean,bar bigint, dropped_2 text);tableoid | ctid | xmin | xmax | cmin | cmax | dead | bar | dropped_2
---------------------------------------------------------------16442 | (0,2) | 747 | 0 | 0 | 0 | f | 2 | New Test
(1 row)postgres#功能实现源码解析
postgres# \dfList of functionsSchema | Name | Result data type | Argument data types | Type
-------------------------------------------------------------------public | pg_dirtyread | SETOF record | regclass | func
(1 row)postgres# \df pg_dirtyread List of functionsSchema | Name | Result data type | Argument data types | Type | Volatility | Parallel | Owner | Security | Access privileges | Language | Source code | Description
---------------------------------------------------------------------------------------------------------------------------------------------------------------------public | pg_dirtyread | SETOF record | regclass | func | volatile | unsafe | postgres | invoker | | c | pg_dirtyread |
(1 row)postgres#接下来以下面SQL为例详细介绍一下这个插件的作用原理
postgres# SELECT * FROM foo ;bar | baz
---------------2 | New Test
(1 row)postgres# ALTER TABLE foo DROP COLUMN baz;
ALTER TABLE
postgres# SELECT * FROM pg_dirtyread(foo) AS t(tableoid oid, ctid tid, xmin xid, xmax xid, cmin cid, cmax cid, dead boolean,bar bigint, dropped_2 text);tableoid | ctid | xmin | xmax | cmin | cmax | dead | bar | dropped_2
---------------------------------------------------------------16447 | (0,1) | 754 | 755 | 0 | 0 | t | 1 | Test16447 | (0,2) | 754 | 0 | 0 | 0 | f | 2 | New Test
(2 rows)postgres#-- foo 表第一行数据被delete第二列 baz被drop此时表并未被清理此时函数调用堆栈如下
pg_dirtyread.so!pg_dirtyread(FunctionCallInfo fcinfo)
ExecMakeTableFunctionResult(SetExprState * setexpr, ExprContext * econtext, MemoryContext argContext, TupleDesc expectedDesc, _Bool randomAccess)
FunctionNext(FunctionScanState * node)
ExecScanFetch(ScanState * node, ExecScanAccessMtd accessMtd, ExecScanRecheckMtd recheckMtd)
ExecScan(ScanState * node, ExecScanAccessMtd accessMtd, ExecScanRecheckMtd recheckMtd)
ExecFunctionScan(PlanState * pstate)
ExecProcNodeFirst(PlanState * node)
ExecProcNode(PlanState * node)
ExecutePlan(EState * estate, PlanState * planstate, _Bool use_parallel_mode, CmdType operation, _Bool sendTuples, uint64 numberTuples, ScanDirection direction, DestReceiver * dest, _Bool execute_once)
standard_ExecutorRun(QueryDesc * queryDesc, ScanDirection direction, uint64 count, _Bool execute_once)
ExecutorRun(QueryDesc * queryDesc, ScanDirection direction, uint64 count, _Bool execute_once)
PortalRunSelect(Portal portal, _Bool forward, long count, DestReceiver * dest)
PortalRun(Portal portal, long count, _Bool isTopLevel, _Bool run_once, DestReceiver * dest, DestReceiver * altdest, QueryCompletion * qc)
exec_simple_query(const char * query_string)
PostgresMain(const char * dbname, const char * username)
BackendRun(Port * port)
BackendStartup(Port * port)
ServerLoop()
PostmasterMain(int argc, char ** argv)
main(int argc, char ** argv)如上的SQL 返回值是record oid 2249此刻的tupdesc 如下 接下来看一下今天的第一个重点dirtyread_convert_tuples_by_name函数
// contrib/pg_dirtyread/dirtyread_tupconvert.c/** The conversion setup routines have the following common API:* 转换设置例程具有以下通用 API** The setup routine checks whether the given source and destination tuple* descriptors are logically compatible. If not, it throws an error.* If so, it returns NULL if they are physically compatible (ie, no conversion* is needed), else a TupleConversionMap that can be used by do_convert_tuple* to perform the conversion.* 设置例程检查给定的源和目标元组描述符在逻辑上是否兼容* 如果不是它会抛出一个错误* 如果是这样如果它们在物理上兼容即不需要转换则返回 NULL否则返回 do_convert_tuple 可以用来执行转换的 TupleConversionMap** The TupleConversionMap, if needed, is pallocd in the callers memory* context. Also, the given tuple descriptors are referenced by the map,* so they must survive as long as the map is needed.* 如果需要TupleConversionMap 会在调用者的内存上下文中被分配* 此外给定的元组描述符由map引用因此只要需要map它们就必须存在** The caller must supply a suitable primary error message to be used if* a compatibility error is thrown. Recommended coding practice is to use* gettext_noop() on this string, so that it is translatable but wont* actually be translated unless the error gets thrown.* 如果抛出兼容性错误调用者必须提供要使用的合适的主要错误消息* 推荐的编码做法是在此字符串上使用 gettext_noop() 这样它是可翻译的但除非抛出错误否则实际上不会被翻译*** Implementation notes:* 实施说明** The key component of a TupleConversionMap is an attrMap[] array with* one entry per output column. This entry contains the 1-based index of* the corresponding input column, or zero to force a NULL value (for* a dropped output column). The TupleConversionMap also contains workspace* arrays.* TupleConversionMap 的关键组件是一个 attrMap[] 数组每个输出列有一个条目* 此条目包含相应输入列的从 1 开始的索引或零以强制 NULL 值对于删除的输出列* TupleConversionMap 还包含工作区数组*//** Set up for tuple conversion, matching input and output columns by name.* (Dropped columns are ignored in both input and output.) This is intended* for use when the rowtypes are related by inheritance, so we expect an exact* match of both type and typmod. The error messages will be a bit unhelpful* unless both rowtypes are named composite types.* 设置元组转换按名称匹配输入和输出列* 丢弃的列在输入和输出中都被忽略。* 这是为了在行类型通过继承相关时使用所以我们期望类型和 typmod 完全匹配* 除非两个行类型都命名为复合类型否则错误消息将有点无用*/
TupleConversionMap *
dirtyread_convert_tuples_by_name(TupleDesc indesc,TupleDesc outdesc,const char *msg)
{.../* Verify compatibility and prepare attribute-number map *//* 验证兼容性并准备属性编号映射 */attrMap dirtyread_convert_tuples_by_name_map(indesc, outdesc, msg);...
}此时说明一下
/*
indesc: 该表的列 在这里 为2列(即使第二列被删掉了注意 attisdropped 属性)outdesc: 我们这里要输入的 record 对应 (这里自然就是 7个系统列 2个用户的列)msg: Error converting tuple descriptors!
*/接下来先看一下dirtyread_convert_tuples_by_name_map函数
// contrib/pg_dirtyread/dirtyread_tupconvert.cstatic const struct system_columns_t {char *attname;Oid atttypid;int32 atttypmod;int attnum;
} system_columns[] {{ ctid, TIDOID, -1, SelfItemPointerAttributeNumber },
#if PG_VERSION_NUM 120000{ oid, OIDOID, -1, ObjectIdAttributeNumber },
#endif{ xmin, XIDOID, -1, MinTransactionIdAttributeNumber },{ cmin, CIDOID, -1, MinCommandIdAttributeNumber },{ xmax, XIDOID, -1, MaxTransactionIdAttributeNumber },{ cmax, CIDOID, -1, MaxCommandIdAttributeNumber },{ tableoid, OIDOID, -1, TableOidAttributeNumber },{ dead, BOOLOID, -1, DeadFakeAttributeNumber }, /* fake column to return HeapTupleIsSurelyDead */{ 0 },
};/** Return a pallocd bare attribute map for tuple conversion, matching input* and output columns by name. (Dropped columns are ignored in both input and* output.) This is normally a subroutine for convert_tuples_by_name, but can* be used standalone.* 返回用于元组转换的 palloc 裸属性映射按名称匹配输入和输出列* 丢弃的列在输入和输出中都被忽略。* 这通常是 convert_tuples_by_name 的子例程但可以单独使用** This version from dirtyread_tupconvert.c adds the ability to retrieve dropped* columns by requesting dropped_N as output column, where N is the attnum.* 这个来自 dirtyread_tupconvert.c 的版本增加了通过请求“dropped_N”作为输出列来检索删除的列的能力其中 N 是 attnum*/
AttrNumber *
dirtyread_convert_tuples_by_name_map(TupleDesc indesc,TupleDesc outdesc,const char *msg)
{...// 这个遍历 outdescfor (i 0; i n; i){...// 这个遍历 indescfor (j 0; j indesc-natts; j){...}.../* Check dropped columns */if (attrMap[i] 0)if (strncmp(attname, dropped_, sizeof(dropped_) - 1) 0){...}.../* Check system columns */if (attrMap[i] 0)for (j 0; system_columns[j].attname; j)if (strcmp(attname, system_columns[j].attname) 0){...}...}...
}这里我们把indesc和outdesc的关键属性(重要的)列举如下
attrelidattnameatttypidattlenattnumattisdropped16447bar2081f同上空0-12tattrelidattnameatttypidattlenattnumattisdropped0tableoid2641f0ctid2762f0xmin2843f0xmax2844f0cmin2945f0cmax2946f0dead1617f0bar2088f0dropped_225-19f解释一下上面这9个输出列其匹配如下
// contrib/pg_dirtyread/dirtyread_tupconvert.h
// FirstLowInvalidHeapAttributeNumber -7
#define DeadFakeAttributeNumber FirstLowInvalidHeapAttributeNumber/*检查系统列 attrMap[0] system_columns[5].attnum; -6 就是 TableOidAttributeNumber
检查系统列 attrMap[1] system_columns[0].attnum; -1 就是 SelfItemPointerAttributeNumber
检查系统列 attrMap[2] system_columns[1].attnum; -2 就是 MinTransactionIdAttributeNumber
检查系统列 attrMap[3] system_columns[3].attnum; -4 就是 MaxTransactionIdAttributeNumber
检查系统列 attrMap[4] system_columns[2].attnum; -3 就是 MinCommandIdAttributeNumber
检查系统列 attrMap[5] system_columns[4].attnum; -5 就是 MaxCommandIdAttributeNumber
检查系统列 attrMap[6] system_columns[6].attnum; -7 就是 DeadFakeAttributeNumber检查表的列 attrMap[7] (AttrNumber) (0 1); 1 就是 bar 列检查删除列 attrMap[8] (AttrNumber) 2; 2就是 原 baz 列
*/经过函数dirtyread_convert_tuples_by_name_map之后的attrMap就是上面的内容之后函数dirtyread_convert_tuples_by_name也就结束了 继续 如下 如上图所示接下来就是遍历这个表 heap_beginscan以ForwardScanDirection的方式获取全部元组如下
// contrib/pg_dirtyread/pg_dirtyread.c...if ((tuplein heap_getnext(usr_ctx-scan, ForwardScanDirection)) ! NULL){if (usr_ctx-map ! NULL){tuplein dirtyread_do_convert_tuple(tuplein, usr_ctx-map, usr_ctx-oldest_xmin);SRF_RETURN_NEXT(funcctx, HeapTupleGetDatum(tuplein));}elseSRF_RETURN_NEXT(funcctx, heap_copy_tuple_as_datum(tuplein, usr_ctx-reltupdesc));}else{heap_endscan(usr_ctx-scan);
#if PG_VERSION_NUM 120000table_close(usr_ctx-rel, AccessShareLock);
#elseheap_close(usr_ctx-rel, AccessShareLock);
#endifSRF_RETURN_DONE(funcctx);}
}接下来 我们再介绍一下 第二个重点 dirtyread_do_convert_tuple函数如下
// contrib/pg_dirtyread/dirtyread_tupconvert.c/** Perform conversion of a tuple according to the map.* 根据映射执行元组的转换*/
HeapTuple
dirtyread_do_convert_tuple(HeapTuple tuple, TupleConversionMap *map, OldestXminType oldest_xmin)
{AttrNumber *attrMap
#if PG_VERSION_NUM 130000map-attrMap-attnums;
#elsemap-attrMap;
#endifDatum *invalues map-invalues;bool *inisnull map-inisnull;Datum *outvalues map-outvalues;bool *outisnull map-outisnull;int outnatts map-outdesc-natts;int i;/** Extract all the values of the old tuple, offsetting the arrays so that* invalues[0] is left NULL and invalues[1] is the first source attribute;* this exactly matches the numbering convention in attrMap.* * 提取旧元组的所有值偏移数组使 invalues[0] 为 NULLinvalues[1] 为第一个源属性* 这完全符合 attrMap 中的编号约定*/heap_deform_tuple(tuple, map-indesc, invalues 1, inisnull 1);/** Transpose into proper fields of the new tuple.* 转置到新元组的适当字段中*/for (i 0; i outnatts; i){int j attrMap[i];if (j DeadFakeAttributeNumber){outvalues[i] HeapTupleIsSurelyDead(tuple
#if PG_VERSION_NUM 90400-t_data
#endif, oldest_xmin);outisnull[i] false;}else if (j 0)outvalues[i] heap_getsysattr(tuple, j, map-indesc, outisnull[i]);else{outvalues[i] invalues[j];outisnull[i] inisnull[j];}}/** Now form the new tuple.*/return heap_form_tuple(map-outdesc, outvalues, outisnull);
}其他的不再详解我们这里主要看一下 最下面的那个 for 循环如下
如果是 dead 则其值为 HeapTupleIsSurelyDead 函数的返回值为系统列 其值获取使用 heap_getsysattr表的普通列 使用invalues[j]方式
最后通过函数heap_form_tuple来构造这个(要返回的)元组 /** Now form the new tuple.*/return heap_form_tuple(map-outdesc, outvalues, outisnull);如上是函数第一次调用(第一行的构造逻辑)轮到第二次 此时的函数堆栈调用如下(同上) 第二行元组 构造 省略表遍历结束再无元组 pg_dirtyread函数执行完成
postgres# SELECT * FROM pg_dirtyread(foo) AS t(tableoid oid, ctid tid, xmin xid, xmax xid, cmin cid, cmax cid, dead boolean,bar bigint, dropped_2 text);tableoid | ctid | xmin | xmax | cmin | cmax | dead | bar | dropped_2
---------------------------------------------------------------16447 | (0,1) | 754 | 755 | 0 | 0 | t | 1 | Test16447 | (0,2) | 754 | 0 | 0 | 0 | f | 2 | New Test
(2 rows)postgres#功能实现细节说明
接下来我们详细看一下几个细节如下
细节一判断dead返回值的函数HeapTupleIsSurelyDead如下
// src/backend/access/heap/heapam_visibility.c/** HeapTupleIsSurelyDead** Cheaply determine whether a tuple is surely dead to all onlookers.* We sometimes use this in lieu of HeapTupleSatisfiesVacuum when the* tuple has just been tested by another visibility routine (usually* HeapTupleSatisfiesMVCC) and, therefore, any hint bits that can be set* should already be set. We assume that if no hint bits are set, the xmin* or xmax transaction is still running. This is therefore faster than* HeapTupleSatisfiesVacuum, because we consult neither procarray nor CLOG.* Its okay to return false when in doubt, but we must return true only* if the tuple is removable.* * 廉价地确定一个元组是否对所有旁观者来说肯定是死的* 当元组刚刚被另一个可见性例程通常是 HeapTupleSatisfiesMVCC测试时* 我们有时会使用它代替 HeapTupleSatisfiesVacuum因此应该已经设置了可以设置的任何提示位* * 我们假设如果没有设置提示位则 xmin 或 xmax 事务仍在运行* 因此这比 HeapTupleSatisfiesVacuum 更快因为我们既不查询 procarray 也不查询 CLOG* 有疑问时返回 false 是可以的但只有当元组可移动时我们才必须返回 true*/
bool
HeapTupleIsSurelyDead(HeapTuple htup, GlobalVisState *vistest);两个参数
/*
htup 当前判断的 元组vistest如下
*/其值的获取来源如下 // src/backend/storage/ipc/procarray.c/** If rel ! NULL, return test state appropriate for relation, otherwise* return state usable for all relations. The latter may consider XIDs as* not-yet-visible-to-everyone that a state for a specific relation would* already consider visible-to-everyone.* 如果 rel ! NULL则返回适用于关系的测试状态否则返回可用于所有关系的状态* 后者可能将 XID 视为尚未对所有人可见而特定关系的状态已将其视为对所有人可见** This needs to be called while a snapshot is active or registered, otherwise* there are wraparound and other dangers.* 这需要在快照处于活动状态或已注册时调用否则会有环绕和其他危险** See comment for GlobalVisState for details.*/
GlobalVisState *
GlobalVisTestFor(Relation rel)
{GlobalVisState *state NULL;/* XXX: we should assert that a snapshot is pushed or registered */Assert(RecentXmin);switch (GlobalVisHorizonKindForRel(rel)){case VISHORIZON_SHARED:state GlobalVisSharedRels;break;case VISHORIZON_CATALOG:state GlobalVisCatalogRels;break;case VISHORIZON_DATA:state GlobalVisDataRels;break;case VISHORIZON_TEMP:state GlobalVisTempRels;break;}Assert(FullTransactionIdIsValid(state-definitely_needed) FullTransactionIdIsValid(state-maybe_needed));return state;
}我们继续看一下
postgres# SELECT * FROM pg_dirtyread(foo) AS t(tableoid oid, ctid tid, xmin xid, xmax xid, cmin cid, cmax cid, dead boolean,bar bigint, dropped_2 text);tableoid | ctid | xmin | xmax | cmin | cmax | dead | bar | dropped_2
---------------------------------------------------------------16447 | (0,1) | 754 | 755 | 0 | 0 | t | 1 | Test16447 | (0,2) | 754 | 0 | 0 | 0 | f | 2 | New Test
(2 rows)postgres# select txid_status(754);txid_status
-------------committed
(1 row)postgres# select txid_status(755);txid_status
-------------committed
(1 row)postgres# SELECT txid_current ();txid_current
--------------757
(1 row)postgres#事务754插入这两行数据事务755删除第一行(这两个事务都已提交)。关于元组可见性的内容 不在详解有兴趣的小伙伴可以看一下本人之前的博客
PostgreSQL的学习心得和知识总结四十九|深入理解PostgreSQL数据库行可见性判断机制基础点击前往