Transforming Column Based Search To Document Based Search In Postgresql

1 minute read

Published:

Combining where clause and order clause is one thing.

But what about when the search result requires you to have a degree of relevance, and at the same time you have to maintain the performance…(?)

Postgre offers Text-Search Types

Just take a look at its documentation:

` PostgreSQL provides two data types that are designed to support full-text search, which is the activity of searching through a collection of natural-language documents to locate those that best match a query. The tsvector type represents a document in a form optimized for text search; the tsquery type similarly represents a text query. `

Just go and find out.

At least I have two references to start with:

this guy.

this guy.

this official documentation

From those references, they give you each the option to make the where clause for searching;

This approach:

SELECT
  workid, act, scene, description, body, ts_rank(tsv, q) as rank
FROM
  scenes, plainto_tsquery('heaven earth') q
WHERE
  tsv @@ q
ORDER BY
  rank DESC
LIMIT
  10;

Or this approach:

select file_id, content from document_text where to_tsvector(content) @@ to_tsquery('(!bear & predict)| instruments');

Or you want to use ranking approach.

WHERE (ts_rank_cd(your_ts_vector_column, to_tsquery('``:*&term1:*&term2:*|term3:*')) > 0 )

So as you can see they support regex/wildcards, and the | or & are self-explanatory.

if you want to have multi-word search assuming separated by spaces, can add <-> every word like this

say the term we are looking for is “string builder”.

(ts_rank_cd(full_text_search, to_tsquery('``:*|stri:* <-> buil:*&otherterm:*')) > 0 )

go and find out yourself which one is better for you…

just make sure that your query should ignore case and common cases to handle non-word letters (what is it called? There is a term for it in Language Science and in Natural Language Processing).