Thông báo


Chia sẻ
Tùy chọn
Xem bài viết cuối
Offline admin  
#1 Đã gửi : 23/11/2016 lúc 03:30:22(UTC)

Danh hiệu: Administration

Chức danh:

Nhóm: Administrators
Gia nhập: 23-07-2013(UTC)
Bài viết: 6,117
Viet Nam
Đến từ: Vietnam

Cảm ơn: 10 lần
Được cảm ơn: 2 lần trong 2 bài viết

The fuzzystrmatch module provides several functions to determine similarities and distance between strings.


At present, fuzzystrmatch does not work well with multi-byte encodings (such as UTF-8).

F.12.1. Soundex

The Soundex system is a method of matching similar-sounding names by converting them to the same code. It was initially used by the United States Census in 1880, 1900, and 1910. Note that Soundex is not very useful for non-English names.

The fuzzystrmatch module provides two functions for working with Soundex codes:

   soundex(text) returns text

difference(text, text) returns int

The soundex function converts a string to its Soundex code. The difference function converts two strings to their Soundex codes and then reports the number of matching code positions. Since Soundex codes have four characters, the result ranges from zero to four, with zero being no match and four being an exact match. (Thus, the function is misnamed —similarity would have been a better name.)

Here are some usage examples:

SELECT soundex('hello world!');

SELECT soundex('Anne'), soundex('Ann'), difference('Anne', 'Ann');

SELECT soundex('Anne'), soundex('Andrew'), difference('Anne', 'Andrew');

SELECT soundex('Anne'), soundex('Margaret'), difference('Anne', 'Margaret');

CREATE TABLE s (nm text);



INSERT INTO s VALUES ('wobbly');


SELECT * FROM s WHERE soundex(nm) = soundex('john');

SELECT * FROM s WHERE difference(s.nm, 'john') > 2;

F.12.2. Levenshtein

This function calculates the Levenshtein distance between two strings:

   levenshtein(text source, text target, int ins_cost, int del_cost, int sub_cost) returns int

levenshtein(text source, text target) returns int

Both source and target can be any non-null string, with a maximum of 255 bytes. The cost parameters specify how much to charge for a character insertion, deletion, or substitution, respectively. You can omit the cost parameters, as in the second version of the function; in that case they all default to 1.


test=# SELECT levenshtein('GUMBO', 'GAMBOL');



(1 row)

test=# SELECT levenshtein('GUMBO', 'GAMBOL', 2,1,1);



(1 row)

F.12.3. Metaphone

Metaphone, like Soundex, is based on the idea of constructing a representative code for an input string. Two strings are then deemed similar if they have the same codes.

This function calculates the metaphone code of an input string:

   metaphone(text source, int max_output_length) returns text

source has to be a non-null string with a maximum of 255 characters. max_output_length sets the maximum length of the output metaphone code; if longer, the output is truncated to this length.


test=# SELECT metaphone('GUMBO', 4);



(1 row)

F.12.4. Double Metaphone

The Double Metaphone system computes two "sounds like" strings for a given input string — a "primary" and an "alternate". In most cases they are the same, but for non-English names especially they can be a bit different, depending on pronunciation. These functions compute the primary and alternate codes:

   dmetaphone(text source) returns text

dmetaphone_alt(text source) returns text

There is no length limit on the input strings.


test=# select dmetaphone('gumbo');



(1 row)

F.13. hstore

This module implements a data type hstore for storing sets of (key,value) pairs within a single PostgreSQL data field. This can be useful in various scenarios, such as rows with many attributes that are rarely examined, or semi-structured data.

In the current implementation, neither the key nor the value string can exceed 65535 bytes in length; an error will be thrown if this limit is exceeded. These maximum lengths may change in future releases.

F.13.1. hstore External Representation

The text representation of an hstore value includes zero or more key => value items, separated by commas. For example:

    k => v

foo => bar, baz => whatever

"1-a" => "anything at all"

The order of the items is not considered significant (and may not be reproduced on output). Whitespace between items or around the => sign is ignored. Use double quotes if a key or value includes whitespace, comma, = or >. To include a double quote or a backslash in a key or value, precede it with another backslash. (Keep in mind that depending on the setting of standard_conforming_strings, you may need to double backslashes in SQL literal strings.)

A value (but not a key) can be a SQL NULL. This is represented as

    key => NULL

The NULL keyword is not case-sensitive. Again, use double quotes if you want the string null to be treated as an ordinary data value.

Currently, double quotes are always used to surround key and value strings on output, even when this is not strictly necessary.

F.13.2. hstore Operators and Functions

Table F-5. hstore Operators

hstore -> textget value for key (null if not present)'a=>x, b=>y'::hstore -> 'a'x
text => textmake single-item hstore'a' => 'b'"a"=>"b"
hstore || hstoreconcatenation'a=>b, c=>d'::hstore || 'c=>x, d=>q'::hstore"a"=>"b", "c"=>"x", "d"=>"q"
hstore ? textdoes hstore contain key?'a=>1'::hstore ? 'a't
hstore @> hstoredoes left operand contain right?'a=>b, b=>1, c=>NULL'::hstore @> 'b=>1't
hstore <@ hstoreis left operand contained in right?'a=>c'::hstore <@ 'a=>b, b=>1, c=>NULL'f

(Before PostgreSQL 8.2, the containment operators @> and <@ were respectively called @ and ~. These names are still available, but are deprecated and will eventually be retired. Notice that the old names are reversed from the convention formerly followed by the core geometric datatypes!)

Note: The => operator is deprecated and may be removed in a future release. Use the hstore(text, text) function instead.

Table F-6. hstore Functions

FunctionReturn TypeDescriptionExampleResult
hstore(text, text)hstoremake single-item hstorehstore('a', 'b')"a"=>"b"
akeys(hstore)text[]get hstore's keys as arrayakeys('a=>1,b=>2'){a,b}
skeys(hstore)setof textget hstore's keys as setskeys('a=>1,b=>2')


avals(hstore)text[]get hstore's values as arrayavals('a=>1,b=>2'){1,2}
svals(hstore)setof textget hstore's values as setsvals('a=>1,b=>2')


each(hstore)setof (key text, value text)get hstore's keys and values as setselect * from each('a=>1,b=>2')
 key | value


a | 1

b | 2

exist(hstore,text)booleandoes hstore contain key?exist('a=>1','a')t
defined(hstore,text)booleandoes hstore contain non-null value for key?defined('a=>NULL','a')f
delete(hstore,text)hstoredelete any item matching keydelete('a=>1,b=>2','b')"a"=>"1"

F.13.3. Indexes

hstore has index support for @> and ? operators. You can use either GiST or GIN index types. For example:

CREATE INDEX hidx ON testhstore USING GIST(h);

CREATE INDEX hidx ON testhstore USING GIN(h);

F.13.4. Examples

Add a key, or update an existing key with a new value:

UPDATE tab SET h = h || ('c' => '3');

Delete a key:

UPDATE tab SET h = delete(h, 'k1');

F.13.5. Statistics

The hstore type, because of its intrinsic liberality, could contain a lot of different keys. Checking for valid keys is the task of the application. Examples below demonstrate several techniques for checking keys and obtaining statistics.

Simple example:

SELECT * FROM each('aaa=>bq, b=>NULL, ""=>1');

Using a table:

SELECT (each(h)).key, (each(h)).value INTO stat FROM testhstore;

Online statistics:

SELECT key, count(*) FROM

(SELECT (each(h)).key FROM testhstore) AS stat


ORDER BY count DESC, key;

key | count


line | 883

query | 207

pos | 203

node | 202

space | 197

status | 195

public | 194

title | 190

org | 189


Ai đang xem chủ đề này?
OceanSpiders 2.0
Di chuyển  
Bạn không thể tạo chủ đề mới trong diễn đàn này.
Bạn không thể trả lời chủ đề trong diễn đàn này.
Bạn không thể xóa bài của bạn trong diễn đàn này.
Bạn không thể sửa bài của bạn trong diễn đàn này.
Bạn không thể tạo bình chọn trong diễn đàn này.
Bạn không thể bỏ phiếu bình chọn trong diễn đàn này.

| Cung cấp bởi YAF.NET | YAF.NET © 2003-2021, Yet Another Forum.NET
Thời gian xử lý trang này hết 1.061 giây.