×
Community Blog PostgreSQL 10.0 Preview: Full-Text Search of JSON-Formatted Content

PostgreSQL 10.0 Preview: Full-Text Search of JSON-Formatted Content

This blog discusses the PostgreSQL community and the patch where PostgreSQL supports full-text searches of JSON-formatted content with tsvector.

By Digoal

PostgreSQL has a very vibrant community that is constantly helping to release new features and make sure that these features are stable. This is one reason for the great stability and reliability of PostgreSQL.

In version 10.0 of PostgreSQL, the full-text search of JSON-formatted content and ts rank and phrase indexes for searching are supported. This means specifically that you can query natural-language documents to location those words or expressions that best match your query. Ts rank and phrase indexes involve both tsvector and tsquery. Tsvector, specifically, is used for a document that is optimized for text search.

Email Patch

Below is a patch that discusses in detail support for ts rank and phrase indexes, which was detailed in the mail group. This was produced by the ever so enthusiastic community. The link for this email patch is given in the references of this blog.

Hi all, I would like to propose patch with a set of new small functions for fts in case of jsonb data type:

  • to_tsvector(config, jsonb): makes a tsvector from all string values and elements of jsonb object. To prevent the situation, when tsquery can find a phrase consisting of lexemes from two different values or elements. This function will add an increment to position of each lexeme from every new value or element.
  • ts_headline(config, jsonb, tsquery, options): generates a headline directly from a jsonb object

Here are examples of how they work:

=# select to_tsvector('{"a": "aaa bbb", "b": ["ccc ddd"], "c": {"d": "eee  
fff"}}'::jsonb);  
                   to_tsvector  
-------------------------------------------------  
 'aaa':1 'bbb':2 'ccc':4 'ddd':5 'eee':7 'fff':8  
(1 row)  
  
  
=# select ts_headline('english', '{"a": "aaa bbb", "b": {"c": "ccc  
ddd"}}'::jsonb, tsquery('bbb & ddd & hhh'), 'StartSel = <, StopSel = >');  
     ts_headline  
----------------------  
 aaa <bbb> ccc <ddd>  
(1 row)  

Any comments or suggestions?

There are many great resourses found at PostgreSQL. There happens to be a very active and enthusiastic community. A patch like the one above may be discussed in the mail group for several months or even a few years. After repeated revisions according to the opinions of everyone, the patch to be merged into the master is already quite mature. This is one reason why PostgreSQL is known for its stability.

Useful Links

FTS functions for jsonb

[[PATCH] few fts functions for jsonb](https://www.postgresql.org/message-id/flat/CA+q6zcWm_1Ygg5QOq0gYbnB_=zq7G51uexQt3QEgDJa0qQnPKw@mail.gmail.com#CA+q6zcWm_1Ygg5QOq0gYbnB_=zq7G51uexQt3QEgDJa0qQnPKw@mail.gmail.com)

0 0 0
Share on

digoal

282 posts | 25 followers

You may also like

Comments