postgresql – Composite multicolumn index for geopoint range and numeric range query

I am building an app where the server needs to select rows based on some criteria/filters. One of them is the location of the user and the radius within which the user wants to see posts; other filters include a date range and the value of another column. This is going to be for an ad-hoc event discovery app.

I have read about PostGIS and its geometry and geography types, and I know there is a native point datatype. Based on this answer I understood that it is better to order index columns from equality to range, even though I feel like the geo point column should come first.

Suppose the following few rows of a simplified events table (disregard the validity of the position data):

id  event_title                  event_position   event_type  is_public  start_date
    (varchar)                    (point lat/lon)  (smallint)  (boolean)  (timestamptz)
--  ---------------------------  ---------------  ----------  ---------  -------------
 1  "John's Party"               (122,35)         0           0          2020-07-05
 2  "Revolution then Starbucks"  (123,30)         1           1          2020-07-06
 3  "Study for math exam"        (120,36)         2           1          2020-07-07
 4  "Party after exam"           (120,36)         1           1          2020-07-08
 5  "Hiking next to the city"    (95,40)          3           1          2020-07-09
 6  "Football match"             (-42,31)         4           1          2020-07-10

Imagine the table contains several thousand records at least, obviously not only 6.

So in this table a user would be able to query public events within 100 km of (122,34) (suppose the first three rows fall into this area), of event types 0, 1 or 2, falling between the dates 2020-07-05 and 2020-07-07. The user would get the rows with IDs 2 and 3.

This is the query I want to optimize with an appropriate index. My question is: how is it possible to create such an index? I thought about a GiST or GIN index but I am not sure how these could help. Thanks!
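
One possible shape for this (a sketch, not a tested recommendation: it assumes PostGIS is installed, that event_position is stored as geography(Point, 4326) rather than the native point type, and that the btree_gist extension is available so scalar columns can share a GiST index) is a single multicolumn GiST index over the spatial column and the range columns:

CREATE EXTENSION IF NOT EXISTS postgis;
CREATE EXTENSION IF NOT EXISTS btree_gist;

-- Hypothetical multicolumn GiST index covering the spatial and range filters
CREATE INDEX events_geo_type_date_idx
    ON events USING gist (event_position, event_type, start_date);

-- The query shape it is meant to support; ST_DWithin on geography takes metres
SELECT id
FROM events
WHERE is_public
  AND event_type IN (0, 1, 2)
  AND start_date BETWEEN '2020-07-05' AND '2020-07-07'
  AND ST_DWithin(event_position,
                 ST_SetSRID(ST_MakePoint(34, 122), 4326)::geography,  -- (lon, lat)
                 100000);                                             -- 100 km

Whether the extra GiST columns actually beat a plain GiST index on event_position plus a separate B-tree on (event_type, start_date) depends on the data distribution, so both variants are worth comparing with EXPLAIN ANALYZE.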

restore – PostgreSQL COPY – Generated columns cannot be used in COPY

I’m trying to import a SQL dump file for a table (a COPY-command dump file). The table has a generated column. While restoring the dump I’m getting this error.

ERROR:  column "call_status" is a generated column
DETAIL:  Generated columns cannot be used in COPY.

How do I bypass this error?

PostgreSQL version: 12
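
One workaround that is sometimes used (a sketch; apart from call_status, the table and column names below are hypothetical) is to edit the dump so the COPY statement lists only the non-generated columns and the matching tab-separated field is removed from every data row, letting the server compute call_status itself:

-- Failing form found in the dump (roughly):
--   COPY calls (id, caller, callee, call_status) FROM stdin;
-- Edited form, with call_status dropped from the column list; the corresponding
-- field must also be dropped from each data line before the terminating \.
COPY calls (id, caller, callee) FROM stdin;
\.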

postgresql – wal_level set to replica at database level but I don’t see that in the configuration file

wal_level is commented out in postgresql.conf and I don’t see any entry in postgresql.auto.conf either, but at the database level I see it’s set to replica. Is there any other place where it could have been set?

postgres@postgresqlmaster:/etc/postgresql/12/data$ psql -p 5433
psql (12.3 (Ubuntu 12.3-1.pgdg18.04+1))
Type "help" for help.

postgres=# select name, setting, sourcefile, sourceline from pg_settings where name = 'wal_level';
   name    | setting | sourcefile | sourceline
-----------+---------+------------+------------
 wal_level | replica |            |
(1 row)

postgres@postgresqlmaster:/etc/postgresql/12/data$ cat /var/lib/postgresql/12/data/postgresql.auto.conf
# Do not edit this file manually!
# It will be overwritten by the ALTER SYSTEM command.
listen_addresses = '*'
shared_buffers = '200MB'
synchronous_standby_names = 'pgstandby_synch1'
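
For what it’s worth, replica has simply been the compiled-in default for wal_level since PostgreSQL 10, which would explain why no config file mentions it. A quick way to confirm where a value comes from (a sketch):

-- The "source" column distinguishes default, configuration file, ALTER SYSTEM,
-- command line, and so on:
SELECT name, setting, source, sourcefile, sourceline
FROM pg_settings
WHERE name = 'wal_level';

-- pg_file_settings lists every uncommented entry actually found in the config files:
SELECT * FROM pg_file_settings WHERE name = 'wal_level';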

postgresql – Best field type for cryptocurrency big numbers?

I’m using PostgreSQL for my cryptocurrency exchange database. The question is: for saving currency amounts (numbers) with their precision (like 323232323232323.45454545, i.e. 23 digits plus a decimal point: 15 digits before the point and 8 after it), should I use varchar(24), double precision, or numeric(15,8)?

Note: it seems that the double precision type can’t properly store big numbers like the example above; the value gets rounded to 323232323232323!

Which one has better performance (speed) and uses fewer resources?
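
A small illustration of the difference (a sketch; the balances table is made up): double precision keeps only about 15–17 significant digits, so a 23-digit amount cannot survive it, while numeric stores the value exactly. Note that in numeric(p,s) the first number is the total digit count, so 15 digits before the point plus 8 after needs numeric(23,8):

SELECT 323232323232323.45454545::double precision AS as_double,   -- rounded
       323232323232323.45454545::numeric(23,8)    AS as_numeric;  -- exact

CREATE TABLE balances (
    account_id bigint PRIMARY KEY,
    amount     numeric(23,8) NOT NULL  -- exact decimal arithmetic for money-like values
);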

postgresql – Caveats of OUTER JOIN on nested JSON value

I’m writing a query which is supposed to find the elements from a list which DO NOT exist in the DB. My first attempt was a nested query where the inner query fetches the ids and then I right join on that query to get what I need, and this works well:

select v.id from (
    select distinct json_data ->> 'elementId' as elementId
    from content
    where json_data ->> 'elementId' in ('id1', 'id2', 'id3')
) as a
right join (values ('id1'), ('id2'), ('id3')) as v(id)
on v.id = a.elementId
where a.elementId is null

The above query works perfectly, except that I feel I should be able to reduce the nested query to a regular select if I do the comparison on json_data ->> 'elementId' directly.

My attempt:

select v.id
from content a
right join (values('id1'), ('id2'), ('id3')) as v(id)
on json_data ->> 'elementId' = v.id

After some debugging I realized that this will never work, because the content table will always contain a row even if json_data ->> 'elementId' is null.

My question is: is there a way to avoid using a nested query when wanting to do a left join or right join on JSON data?
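
One pattern that sometimes avoids the nested query entirely (a sketch against the same content table and json_data column) is to drive the query from the list of candidate ids and use NOT EXISTS instead of an outer join:

select v.id
from (values ('id1'), ('id2'), ('id3')) as v(id)
where not exists (
    select 1
    from content c
    where c.json_data ->> 'elementId' = v.id
);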

postgresql – backup using gzip slow

We are currently backing up some schemas in Postgres using this command:

pg_dump -h localhost -n test_schema mydb | gzip > /data1/backup/test_chema.dmp.gz

And we are getting a rate of 50 megabytes per minute, which is pretty slow; we think this could be improved.

The disk seems to be OK, and so does the CPU.

Any thoughts on how this can get improved?
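
Two things that are often suggested (sketches; the first assumes pigz is installed): gzip is single-threaded and can be the bottleneck even when overall CPU looks fine, so either swap it for a parallel compressor or let pg_dump compress using the directory format with parallel jobs:

# 1. Parallel gzip-compatible compression with pigz:
pg_dump -h localhost -n test_schema mydb | pigz > /data1/backup/test_schema.dmp.gz

# 2. Directory format with parallel dump jobs and built-in compression:
pg_dump -h localhost -n test_schema -F d -j 4 -Z 5 -f /data1/backup/test_schema_dir mydb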

pg_hba.conf – postgresql only allows connections via unix socket even though my pg_hba.conf lists 127.0.0.1/32?

I have the following pg_hba.conf:

# TYPE  DATABASE        USER            ADDRESS                 METHOD

# "local" is for Unix domain socket connections only
local   all             all                                     trust
# IPv4 local connections:
host    all             all             127.0.0.1/32            trust
# IPv6 local connections:
host    all             all             ::1/128                 trust
# Allow replication connections from localhost, by a user with the
# replication privilege.
local   replication     all                                     trust
host    replication     all             127.0.0.1/32            trust
host    replication     all             ::1/128                 trust

With the above I assume host all all 127.0.0.1/32 trust would allow TCP connections from localhost. However, this does not seem to be the case:

(root@XenonKiloCranberry:~)# psql -U postgres
psql (11.7)
Type "help" for help.

postgres=# \q

(root@XenonKiloCranberry:~)# psql -U postgres -h 127.0.0.1
psql: FATAL:  no pg_hba.conf entry for host "127.0.0.1", user "postgres", database "postgres", SSL off

Where am I going wrong?
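
A couple of checks that often explain this (a sketch): make sure the file being edited is the one the server actually reads, and that the server has reloaded it since the edit. On PostgreSQL 10 and later the loaded rules can be inspected directly:

SHOW hba_file;

SELECT line_number, type, database, user_name, address, auth_method, error
FROM pg_hba_file_rules;

-- If the file on disk has changed but was never re-read:
SELECT pg_reload_conf();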

partitioning – Integrating large PostgreSQL table as partition for a new table

I currently have a relatively large table containing timeseries data (628+ million live rows). The table definition (some names changed slightly) is at the bottom of the question.

I want to attach the existing table as a partition of a new (partitioned) table. However, the existing table has a single-column id primary key (mainly there because Django requires it). Attaching the table would require me to change the primary key constraint to (id, timestamp) on the old table.

Because id is unique this isn’t an issue, but given the size of the table I wonder whether this constraint is checked upon creation (leading to quite a long operation) or whether only newly added/updated rows are checked. Stopping reads/writes to the table for a few minutes is possible, but I can’t wait for multiple hours.


Intended new table

As in the old table, the id column is sadly mainly required by our ORM. A similar (prop_id, "timestamp", value) index would be used on the partitioned table as well.

CREATE TABLE "newtable" (
    "id"            bigserial NOT NULL,
    "timestamp"     timestamp with time zone NOT NULL,
    "prop_id"       integer NOT NULL,
    "value"         double precision NOT NULL,
    PRIMARY KEY ("id", "timestamp")
) PARTITION BY RANGE ("timestamp")
;

Old table definition

The "id" primary key is an artifact of our ORM (Django) and is inconsequential for any queries we do. We use the (prop_id, "timestamp", value) index 99.9% of the time for index-only scans.

  Column   |           Type           | Collation | Nullable |                Default                 | Storage | Stats target | Description
-----------+--------------------------+-----------+----------+----------------------------------------+---------+--------------+-------------
 id        | bigint                   |           | not null | nextval('tablename_id_seq'::regclass)  | plain   |              |
 timestamp | timestamp with time zone |           | not null |                                        | plain   |              |
 value     | double precision         |           | not null |                                        | plain   |              |
 prop_id   | integer                  |           | not null |                                        | plain   |              |
Indexes:
    "tablename_pkey" PRIMARY KEY, btree (id)
    "tablename_prop_id_timestamp_value_b9bc8326_idx" btree (prop_id, "timestamp", value)
Foreign-key constraints:
    "tablename_prop_id_67f339b0_fk_othertable" FOREIGN KEY (prop_id) REFERENCES othertable(id) DEFERRABLE INITIALLY DEFERRED

postgresql – How to prevent different connections treating bytea in same query differently?

I find that identical PostgreSQL queries issued by two different clients are handled differently. Specifically, bytea values are inserted differently.

A query that demonstrated the behaviour is this:

INSERT INTO "TestTable" ("Input") VALUES (decode('74657374', 'hex'))

74657374 is hexadecimal for ‘test’. In one client, ‘test’ is inserted into the “Input” field, whether that field is text/varchar or bytea. That is the behaviour I desire. In the other client, ‘\x74657374’ is inserted into the “Input” field, whether it is text/varchar or bytea. This string is the PostgreSQL literal representation of the bytea bytes of ASCII ‘test’, but the SQL literal syntax itself is what gets inserted. I do not desire this behaviour.

I use a single piece of hand-written SQL, the bytea value only occurs “within” the query (if the “Input” column has type text then neither the literals nor the receiving column have bytea type), and it seems unlikely to me that either client is parsing and then rebuilding the query. Therefore it seems that the difference must be happening on the server where the query is executed, which means there must be some connection-specific configuration setting that is altering the server’s behaviour.

Can anyone please tell me what connection-specific settings could be altering this behaviour?

I know the queries really are behaving differently and that it is not a display issue etc., because I can see rows in the same table having different values (‘test’ and ‘\x74657374’). I have tried various alternative bytea handling methods, but they are all affected by this problem. For those who are interested, the “good” client is pgAdmin III and the “bad” client is the Ruby PG gem. Though for the reason I gave above, I believe there must be some built-in feature of PostgreSQL supporting this behaviour.
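
For comparing the two sessions, these are the connection-level settings most often involved when string or bytea literals are interpreted or rendered differently (a sketch):

SHOW standard_conforming_strings;
SHOW bytea_output;

-- Or list everything a given session has set away from its default:
SELECT name, setting, source
FROM pg_settings
WHERE source <> 'default';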

Best approach for a UNIQUE (not UUID) primary key for a multi-tenant setup with schemas on PostgreSQL

I’m working on some projects and I always have this doubt, and I don’t know if I’m doing it the wrong way or the right way.

I have 2 new applications: one is a marketplace and the other is SaaS software. For both I will need to build a global search feature in the application, so I set up multi-tenancy with schemas on PostgreSQL, where each of my clients has a schema like:

master
public
tenant1
tenant2
tenant3

and goes on…

I have a table called product in which each of my clients will store products, and I want to make a global QUERY across the product tables from all my tenants.

I use Inheritance from PostgreSQL following this topic:

https://stackoverflow.com/questions/20575610/select-retrieve-all-records-from-multiple-schemas-using-postgres

And now everything is working fine except for one problem: I created a simple sequence in each of my schemas, and I need to make it GLOBAL. These are the approaches I am considering to make this work:

APPROACH #1

I store the tableoid in each tenant table and make a JOIN on the tableoid and the primary key:

SELECT  tableoid, tableoid::regclass, * FROM master.bra_product LIMIT 10

APPROACH #2

I create a GLOBAL SEQUENCE in the public schema and use it for every new table in the database:

CREATE TABLE tenant3.bra_product (
    product_id integer DEFAULT nextval('public.product_global_seq'),
    description varchar
);

APPROACH #3

Create a composite primary key with SEQUENCE + UUID (though I have had a bad experience with UUIDs; my queries take too long), so that if I need a tenant-level select I use the SEQUENCE, and if I need a global one I make a JOIN on the UUID.

Please shed some light so I can move on.
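
A sketch combining approaches #1 and #2 (master, bra_product and the sequence name come from the question; the tenant4 schema and the search predicate are made up): a single global sequence keeps product_id unique across schemas, and tableoid::regclass identifies the tenant when querying through the inheritance parent:

CREATE SEQUENCE IF NOT EXISTS public.product_global_seq;

CREATE TABLE tenant4.bra_product (
    product_id  integer PRIMARY KEY DEFAULT nextval('public.product_global_seq'),
    description varchar
) INHERITS (master.bra_product);

-- Global search across every tenant via the parent table:
SELECT tableoid::regclass AS tenant_table, product_id, description
FROM master.bra_product
WHERE description ILIKE '%shoes%';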