参考文献: Regina Obe Leo Hsu 著, 丁奇鹏 译《PostgreSQL即学即用》
pgScript 是 pgAdmin 内置的一种脚本机制, 非常适合于重复执行SQL任务的场景. 相比PostgreSQL的函数机制, pgScript对内存的使用更合理因而执行效率更高.
pgScript 机制之所以能达到这种效果, 是因为 PostgreSQL 函数机制会将所有工作在最后一次性批量提交, 此前未提交的工作成果都保存至内存中.
相比之下, pgScript 在运行脚本时没执行一条 SQL 语句就提交一次, 这样就使得pgScript 特别适合执行会消耗大量内存而又不需要作为一个完整事务来提交的任务. 一旦某个事务被提交, 则该事务占用的内存会立即被释放, 这部分内存即可用于下一个事务.
pgScript 脚本语言是一种弱类型语言(即定义变量时无需明确指定其类型), 支持条件判断、循环、数据生成器、基本的打印函数以及记录型变量, 其语法与微软 SQL Server 数据库所使用的 Transact-SQL 语法类似. 前面加
可以在SQL查询执行窗口中执行 pgScript. 在窗口中输入脚本后, 点击 pgScript 图标来执行.
pgAdmin3 支持 pgScript, 但是 pgAdmin4 找不到运行 pgScript 的图标.
下面的示例演示了如何使用 pgScript 记录型变量和循环语法来构建一个交叉表, 使用的基础表是
postgres@localhost:5432 postgresql_book# \dt census.lu_fact_types
List of relations
Schema | Name | Type | Owner
--------+---------------+-------+----------
census | lu_fact_types | table | postgres
(1 row)
以下 pgScript 脚本中创建了一个名为 census.hisp_pop 的空表, 该表有以下数字型列:
DECLARE @I, @labels, @tdef; SET @I=0; -- 变量labels 将用于存放记录, 这里得到一个结果集, 模式为 R(col_name, fact_types_d) SET @labels= SELECT quote_ident( replace( replace(lower(COALESCE(fact_subcats[4], fact_subcats[3])), ' ','_'), ':','' ) ) As col_name, fact_types_id FROM census.lu_fact_types WHERE category = 'Population' AND fact_subcats[3] ILIKE 'Hispanic or Latino%' ORDER BY short_name; -- 创建名为 census.hisp_pop 的空表, 这里 @tdef 中的 t 和 def 分别指 table 和 define. SET @tdef = 'census.hisp_pop(tract_id varchar(11) PRIMARY KEY '; -- 使用LINES函数来循环遍历每一条记录. WHILE @I < LINES(@labels) BEGIN SET @tdef = @tdef + ', ' +@labels[@I][0]+' numeric(12,3)'; SET @I = @I+1; END SET @tdef = @tdef + ')'; -- 打印表def. PRINT @tdef;
将上述代码保存为
首先我们看一下 census 架构下的关系 lu_fact_types 的结构.
postgres@localhost:5432 postgresql_book# \d census.lu_fact_types
数据表 "census.lu_fact_types"
栏位 | 类型 | 校对规则 | 可空的 | 预设
--------------+--------------------------+----------+----------+------------------------------------------------------------
fact_type_id | integer | | not null | nextval('census.lu_fact_types_fact_type_id_seq'::regclass)
category | character varying(100) | | |
fact_subcats | character varying(255)[] | | |
short_name | character varying(50) | | |
索引:
"pk_lu_fact_types" PRIMARY KEY, btree (fact_type_id)
可以看到
在 pgAdmin III 中打开
[QUERY ] SELECT
quote_ident(
replace(
replace(lower(COALESCE(fact_subcats[4], fact_subcats[3])), ' ','_'),
':',''
)
) As col_name,
fact_type_id
FROM census.lu_fact_types
WHERE category = 'Population' AND fact_subcats[3] ILIKE 'Hispanic or Latino%'
ORDER BY short_name
[PGSCRIPT ] census.hisp_pop(tract_id varchar(11) PRIMARY KEY , hispanic_or_latino numeric(12,3), white_alone numeric(12,3), black_or_african_american_alone numeric(12,3), american_indian_and_alaska_native_alone numeric(12,3), asian_alone numeric(12,3), native_hawaiian_and_other_pacific_islander_alone numeric(12,3), some_other_race_alone numeric(12,3), two_or_more_races numeric(12,3))
于是在
CREATE TABLE census.hisp_pop(tract_id varchar(11) PRIMARY KEY , hispanic_or_latino numeric(12,3), white_alone numeric(12,3), black_or_african_american_alone numeric(12,3), american_indian_and_alaska_native_alone numeric(12,3), asian_alone numeric(12,3), native_hawaiian_and_other_pacific_islander_alone numeric(12,3), some_other_race_alone numeric(12,3), two_or_more_races numeric(12,3));
查看数据库中是否多了表
postgres@localhost:5436 postgresql_book# \d census.hisp_pop
数据表 "census.hisp_pop"
栏位 | 类型 | 校\xB6怨嬖?| \xBF煽盏 | 预设
--------------------------------------------------+-----------------------+----------+----------+----
tract_id | character varying(11) | | not null |
hispanic_or_latino | numeric(12,3) | | |
white_alone | numeric(12,3) | | |
black_or_african_american_alone | numeric(12,3) | | |
american_indian_and_alaska_native_alone | numeric(12,3) | | |
asian_alone | numeric(12,3) | | |
native_hawaiian_and_other_pacific_islander_alone | numeric(12,3) | | |
some_other_race_alone | numeric(12,3) | | |
two_or_more_races | numeric(12,3) | | |
索引:
"hisp_pop_pkey" PRIMARY KEY, btree (tract_id)
SELECT * FROM census.lu_fact_types WHERE category = 'Population' ORDER BY short_name;
postgres@localhost:5436 postgresql_book# SELECT *
postgresql_book-# FROM census.lu_fact_types
postgresql_book-# WHERE category = 'Population'
postgresql_book-# ORDER BY short_name;
fact_type_id | category | fact_subcats | short_name
--------------+------------+--------------------------------------------------------------------------------------------+------------
52 | Population | {D001,Total:} | d001
53 | Population | {D002,Total:,"Not Hispanic or Latino:"} | d002
54 | Population | {D003,Total:,"Not Hispanic or Latino:","White alone"} | d003
55 | Population | {D004,Total:,"Not Hispanic or Latino:","Black or African American alone"} | d004
56 | Population | {D005,Total:,"Not Hispanic or Latino:","American Indian and Alaska Native alone"} | d005
57 | Population | {D006,Total:,"Not Hispanic or Latino:","Asian alone"} | d006
58 | Population | {D007,Total:,"Not Hispanic or Latino:","Native Hawaiian and Other Pacific Islander alone"} | d007
59 | Population | {D008,Total:,"Not Hispanic or Latino:","Some Other Race alone"} | d008
60 | Population | {D009,Total:,"Not Hispanic or Latino:","Two or More Races"} | d009
61 | Population | {D010,Total:,"Hispanic or Latino:"} | d010
62 | Population | {D011,Total:,"Hispanic or Latino:","White alone"} | d011
63 | Population | {D012,Total:,"Hispanic or Latino:","Black or African American alone"} | d012
64 | Population | {D013,Total:,"Hispanic or Latino:","American Indian and Alaska Native alone"} | d013
65 | Population | {D014,Total:,"Hispanic or Latino:","Asian alone"} | d014
66 | Population | {D015,Total:,"Hispanic or Latino:","Native Hawaiian and Other Pacific Islander alone"} | d015
67 | Population | {D016,Total:,"Hispanic or Latino:","Some Other Race alone"} | d016
68 | Population | {D017,Total:,"Hispanic or Latino:","Two or More Races"} | d017
(17 行记录)
postgres@localhost:5436 postgresql_book# SELECT * FROM census.lu_fact_types WHERE fact_subcats[3] ILIKE 'Hispanic or Latino%';
fact_type_id | category | fact_subcats | short_name
--------------+------------+----------------------------------------------------------------------------------------+------------
61 | Population | {D010,Total:,"Hispanic or Latino:"} | d010
62 | Population | {D011,Total:,"Hispanic or Latino:","White alone"} | d011
63 | Population | {D012,Total:,"Hispanic or Latino:","Black or African American alone"} | d012
64 | Population | {D013,Total:,"Hispanic or Latino:","American Indian and Alaska Native alone"} | d013
65 | Population | {D014,Total:,"Hispanic or Latino:","Asian alone"} | d014
66 | Population | {D015,Total:,"Hispanic or Latino:","Native Hawaiian and Other Pacific Islander alone"} | d015
67 | Population | {D016,Total:,"Hispanic or Latino:","Some Other Race alone"} | d016
68 | Population | {D017,Total:,"Hispanic or Latino:","Two or More Races"} | d017
(8 行记录)
postgres@localhost:5436 postgresql_book# SELECT * FROM census.lu_fact_types WHERE category ='Population' AND fact_subcats[3] ILIKE 'Hispanic or Latino%';
fact_type_id | category | fact_subcats | short_name
--------------+------------+----------------------------------------------------------------------------------------+------------
61 | Population | {D010,Total:,"Hispanic or Latino:"} | d010
62 | Population | {D011,Total:,"Hispanic or Latino:","White alone"} | d011
63 | Population | {D012,Total:,"Hispanic or Latino:","Black or African American alone"} | d012
64 | Population | {D013,Total:,"Hispanic or Latino:","American Indian and Alaska Native alone"} | d013
65 | Population | {D014,Total:,"Hispanic or Latino:","Asian alone"} | d014
66 | Population | {D015,Total:,"Hispanic or Latino:","Native Hawaiian and Other Pacific Islander alone"} | d015
67 | Population | {D016,Total:,"Hispanic or Latino:","Some Other Race alone"} | d016
68 | Population | {D017,Total:,"Hispanic or Latino:","Two or More Races"} | d017
(8 行记录)
SELECT quote_ident(replace(replace(lower(COALESCE(fact_subcats[4], fact_subcats[3])), ' ','_'),':','')) As col_name, fact_type_id FROM census.lu_fact_types WHERE category = 'Population' AND fact_subcats[3] ILIKE 'Hispanic or Latino%' ORDER BY short_name;
postgres@localhost:5436 postgresql_book# SELECT quote_ident(replace(replace(lower(COALESCE(fact_subcats[4], fact_subcats[3])), ' ','_'),':','')) As col_name, fact_type_id
postgresql_book-# FROM census.lu_fact_types
postgresql_book-# WHERE category = 'Population' AND fact_subcats[3] ILIKE 'Hispanic or Latino%'
postgresql_book-# ORDER BY short_name;
col_name | fact_type_id
--------------------------------------------------+--------------
hispanic_or_latino | 61
white_alone | 62
black_or_african_american_alone | 63
american_indian_and_alaska_native_alone | 64
asian_alone | 65
native_hawaiian_and_other_pacific_islander_alone | 66
some_other_race_alone | 67
two_or_more_races | 68
(8 行记录)
这里用到了 SQL 中的一些函数, 比如
SQL 中的
其语法为
COALESCE(value [, ...])
postgres@localhost:5436 tutorial# SELECT COALESCE(NULL, NULL, 'atzjg.net', 'Sowya') AS Result; result ----------- atzjg.net (1 行记录)
lower 函数的功能是将其字符串转换为小写.
postgres@localhost:5432 postgresql_book# select lower('HelloWorlD');
lower
------------
helloworld
(1 行记录)
因此,
PostgreSQL 的
注意
quote_ident(string)
postgres@localhost:5432 postgresql_book# select quote_ident('hello_:world : p');
quote_ident
--------------------
"hello_:world : p"
(1 行记录)
postgres@localhost:5436 tutorial# SELECT QUOTE_IDENT('select * from employees');
quote_ident
---------------------------
"select * from employees"
(1 行记录)
下面的例子是对前面刚创建好的表
DECLARE @I, @labels, @tload, @tcols, @fact_types; SET @I=0; SET @labels= SELECT quote_ident( replace( replace( lower(COALESCE(fact_subcats[4], fact_subcats[3])), ' ','_'), ':','') ) As col_name, fact_type_id FROM census.lu_fact_types WHERE category='Population' AND fact_subcats[3] ILIKE 'Hispanic or Latino%' ORDER BY short_name; SET @tload='tract_id'; SET @tcols='tract_id'; SET @fact_types='-1'; WHILE @I < LINES(@labels) BEGIN SET @tcols=@tcols+', '+@labels[@I][0]; SET @tload=@tload+ ', MAX(CASE WHEN fact_type_id=' + CAST(@labels[@I][1] AS STRING) + ' THEN val ELSE NULL END)'; SET @fact_types=@fact_types+', '+CAST(@labels[@I][1] As STRING); SET @I=@I+1; END INSERT INTO census.hisp_pop(@tcols) SELECT @tload FROM census.facts WHERE fact_type_id IN(@fact_types) AND yr=2011 GROUP BY tract_id;
执行上面的 pgScript 后, 消息窗口输出
[QUERY ] SELECT
quote_ident(
replace(
replace(
lower(COALESCE(fact_subcats[4], fact_subcats[3])), ' ','_'),
':','')
) As col_name,
fact_type_id
FROM census.lu_fact_types
WHERE category='Population' AND fact_subcats[3] ILIKE 'Hispanic or Latino%'
ORDER BY short_name
[QUERY ] INSERT INTO census.hisp_pop(tract_id, hispanic_or_latino, white_alone, black_or_african_american_alone, american_indian_and_alaska_native_alone, asian_alone, native_hawaiian_and_other_pacific_islander_alone, some_other_race_alone, two_or_more_races)
SELECT tract_id, MAX(CASE WHEN fact_type_id=61 THEN val ELSE NULL END), MAX(CASE WHEN fact_type_id=62 THEN val ELSE NULL END), MAX(CASE WHEN fact_type_id=63 THEN val ELSE NULL END), MAX(CASE WHEN fact_type_id=64 THEN val ELSE NULL END), MAX(CASE WHEN fact_type_id=65 THEN val ELSE NULL END), MAX(CASE WHEN fact_type_id=66 THEN val ELSE NULL END), MAX(CASE WHEN fact_type_id=67 THEN val ELSE NULL END), MAX(CASE WHEN fact_type_id=68 THEN val ELSE NULL END) FROM census.facts
WHERE fact_type_id IN(-1, 61, 62, 63, 64, 65, 66, 67, 68) AND yr=2011
GROUP BY tract_id
执行完pgScript, 这些数据就被插入到 census.hisp_pop 表中.
将上面的代码整理一下
INSERT INTO census.hisp_pop (tract_id, hispanic_or_latino, white_alone, black_or_african_american_alone, american_indian_and_alaska_native_alone, asian_alone, native_hawaiian_and_other_pacific_islander_alone, some_other_race_alone, two_or_more_races) SELECT tract_id, MAX(CASE WHEN fact_type_id=61 THEN val ELSE NULL END), MAX(CASE WHEN fact_type_id=62 THEN val ELSE NULL END), MAX(CASE WHEN fact_type_id=63 THEN val ELSE NULL END), MAX(CASE WHEN fact_type_id=64 THEN val ELSE NULL END), MAX(CASE WHEN fact_type_id=65 THEN val ELSE NULL END), MAX(CASE WHEN fact_type_id=66 THEN val ELSE NULL END), MAX(CASE WHEN fact_type_id=67 THEN val ELSE NULL END), MAX(CASE WHEN fact_type_id=68 THEN val ELSE NULL END) FROM census.facts WHERE fact_type_id IN(-1, 61, 62, 63, 64, 65, 66, 67, 68) AND yr=2011 GROUP BY tract_id
将其中的
SELECT tract_id,
MAX(CASE WHEN fact_type_id=61 THEN val ELSE NULL END),
MAX(CASE WHEN fact_type_id=62 THEN val ELSE NULL END),
MAX(CASE WHEN fact_type_id=63 THEN val ELSE NULL END),
MAX(CASE WHEN fact_type_id=64 THEN val ELSE NULL END),
MAX(CASE WHEN fact_type_id=65 THEN val ELSE NULL END),
MAX(CASE WHEN fact_type_id=66 THEN val ELSE NULL END),
MAX(CASE WHEN fact_type_id=67 THEN val ELSE NULL END),
MAX(CASE WHEN fact_type_id=68 THEN val ELSE NULL END)
FROM census.facts
WHERE fact_type_id IN(-1, 61, 62, 63, 64, 65, 66, 67, 68) AND yr=2011
GROUP BY tract_id;
SQL 中的
postgres@localhost:5436 postgresql_book# SELECT tract_id, postgresql_book-# MAX(CASE WHEN fact_type_id=61 THEN val ELSE NULL END), postgresql_book-# MAX(CASE WHEN fact_type_id=62 THEN val ELSE NULL END), postgresql_book-# MAX(CASE WHEN fact_type_id=63 THEN val ELSE NULL END), postgresql_book-# MAX(CASE WHEN fact_type_id=64 THEN val ELSE NULL END), postgresql_book-# MAX(CASE WHEN fact_type_id=65 THEN val ELSE NULL END), postgresql_book-# MAX(CASE WHEN fact_type_id=66 THEN val ELSE NULL END), postgresql_book-# MAX(CASE WHEN fact_type_id=67 THEN val ELSE NULL END), postgresql_book-# MAX(CASE WHEN fact_type_id=68 THEN val ELSE NULL END) postgresql_book-# FROM census.facts postgresql_book-# WHERE fact_type_id IN(-1, 61, 62, 63, 64, 65, 66, 67, 68) AND yr=2011 postgresql_book-# GROUP BY tract_id limit 10; tract_id | max | max | max | max | max | max | max | max -------------+---------+---------+---------+--------+-------+-------+---------+-------- 25021417602 | 140.000 | 77.000 | 16.000 | 1.000 | 3.000 | 0.000 | 34.000 | 9.000 25017350900 | 127.000 | 87.000 | 11.000 | 1.000 | 1.000 | 0.000 | 16.000 | 11.000 25021414200 | 172.000 | 105.000 | 15.000 | 5.000 | 1.000 | 0.000 | 29.000 | 17.000 25023502200 | 86.000 | 58.000 | 4.000 | 0.000 | 0.000 | 0.000 | 17.000 | 7.000 25025100300 | 352.000 | 60.000 | 107.000 | 1.000 | 0.000 | 0.000 | 138.000 | 46.000 25017351300 | 313.000 | 159.000 | 12.000 | 16.000 | 1.000 | 1.000 | 97.000 | 27.000 25009208200 | 159.000 | 75.000 | 4.000 | 4.000 | 0.000 | 0.000 | 54.000 | 22.000 25017374700 | 168.000 | 118.000 | 6.000 | 1.000 | 0.000 | 0.000 | 40.000 | 3.000 25017311900 | 590.000 | 242.000 | 34.000 | 3.000 | 6.000 | 0.000 | 270.000 | 35.000 25001011002 | 69.000 | 31.000 | 0.000 | 5.000 | 3.000 | 0.000 | 27.000 | 3.000 (10 行记录)
注意
postgres@localhost:5436 postgresql_book# SELECT count(*) from census.facts; count -------- 100504 (1 行记录)
若按
postgres@localhost:5436 postgresql_book# SELECT count(*) from census.facts GROUP BY tract_id;
count
-------
68
68
68
68
68
68
68
68
...
事实上, 有1478个组, 即1478个不同的
postgres@localhost:5436 postgresql_book# SELECT COUNT(*) FROM (SELECT count(*) from census.facts GROUP BY tract_id) AS T; count ------- 1478 (1 行记录)
postgres@localhost:5436 postgresql_book# select 1478*68; ?column? ---------- 100504 (1 行记录)
postgres@localhost:5436 postgresql_book# select count(tract_id) from census.facts where fact_type_id=61; count ------- 1478 (1 行记录)
pgAdmin 能够以图形化方式展示语句执行计划.
打开 SQL 语句执行窗口, 编写一个 SQL 语句, 然后点击“解释查询”图标就可以看到此语句的执行计划图示.
假设执行下面的查询:
SELECT left(tract_id, 5) As country_code, SUM(hispanic_or_latino) As tot, SUM(white_alone) As tot_white, SUM(COALESCE(hispanic_or_latino,0)-COALESCE(white_alone,0)) As non_white FROM census.hisp_pop GROUP BY country_code ORDER BY country_code;
执行的查询语句为
-- 执行查询: EXPLAIN (ANALYZE off, VERBOSE off, COSTS on, BUFFERS off, TIMING off )SELECT left(tract_id, 5) As country_code, SUM(hispanic_or_latino) As tot, SUM(white_alone) As tot_white, SUM(COALESCE(hispanic_or_latino,0)-COALESCE(white_alone,0)) As non_white FROM census.hisp_pop GROUP BY country_code ORDER BY country_code; 5 行查询结果即将返回。
文本解释计划:
"Sort (cost=159.32..163.02 rows=1478 width=20)"
" Sort Key: ("left"((tract_id)::text, 5))"
" -> HashAggregate (cost=51.95..81.51 rows=1478 width=20)"
" Group Key: "left"((tract_id)::text, 5)"
" -> Seq Scan on hisp_pop (cost=0.00..33.48 rows=1478 width=20)"
因此在 Linux/Unix 系统中它可以取代 crontab; 在 Windows 中, 它可取代定时任务规划器.
事实上, pgAgent 的定时任务功能远比这里描述的更强大: 任何一台机器, 不管操作系统是什么, 只要它上面能安装 pgAgent, 那么我们就可以在此机器上执行定时任务.
Windows 上使用 StackBuilder for PostgreSQL 安装 pgAgent. 首先启动 StackBuilder. 在 Windows 任务栏的搜索框中输入 Stack Builder 就可以找到 Application Stack Builder. 点击启动.
如果是 Local Notes 下的 PostgreSQL 9, 则运行相应的 stackbuilder.exe, 一般位于