Introduction
Nowadays, the most common way to transfer data is to use JSON (JavaScript Object Notation). But, in any case, we need to store the data on the server side. And here we have a choice to use either the NoSQL solution, for example, MongoDB, or the old trustee object-relational database.
Sometimes, we don’t even have this choice because we already use the ORB.
In Postgres 9.5, a new JSONB type column was introduced so that you can not only save and get the JSON but also perform operations based on the JSON structure.
In this blog post, I want to investigate these functions and implement a simple use case.
Use case
I have a simple application that registers users. I want to improve my application and add the functionality to configure the user interface (theme, icons, text size). Each user can have multiple configurations and switch between them.
The JSON I will work with will be the following:
{ "userid":"artur@gmail.com", "configurations":[ { "name":"myconf", "theme":"light", "icons":"small", "textsize":"large" }, { "name":"myconf2", "theme":"dark" } ] }
Creating a table
Let’s create a table to store the JSON configurations:
CREATE TABLE USER_CONFIGURATIONS ( ID BIGSERIAL PRIMARY KEY, DATA JSONB );
After running this script, we will have an empty table and start adding the data.
CRUD. Create
When the user registers in my application I want to create the record in the USER_CONFIGURATIONS table with the empty JSON array for configurations:
INSERT INTO USER_CONFIGURATIONS (DATA) VALUES('{"userid":"artur@gmail.com", "configurations":[]}'::jsonb);
Now, we can check the database and ensure that our entity was created successfully:
SELECT * FROM USER_CONFIGURATIONS;
id | data |
1 | {“userid”: “artur@gmail.com”, “configurations”: []} |
Let’s add one more entity to our table with configurations:
INSERT INTO USER_CONFIGURATIONS (DATA) VALUES('{"userid":"ihor@gmail.com", "configurations":[{ "name":"myconf", "theme":"light", "icons":"small", "textsize":"large" }, { "name":"myconf2", "theme":"dark" }]}'::jsonb);
Now if we check our table, we will find two entities:
id | data |
1 | {“userid”: “artur@gmail.com”, “configurations”: []} |
2 | {“userid”: “ihor@gmail.com”, “configurations”: [ {“name”: “myconf”, “icons”: “small”, “theme”: “light”, “textsize”: “large”}, {“name”: “myconf2”, “theme”: “dark”} ]} |
CRUD Read
Based on the use case, I need to be able to get all configurations by user ID. Let’s try to do this. I want to get all configurations related to ihor@gmail.com
SELECT DATA -> 'configurations' AS configs FROM USER_CONFIGURATIONS WHERE (DATA ->> 'userid') = 'ihor@gmail.com'
After executing this query, we will get:
configs |
[{“name”: “myconf”, “icons”: “small”, “theme”: “light”, “textsize”: “large”}, {“name”: “myconf2”, “theme”: “dark”}] |
Let’s analyze the query. When we work with JSONB column type, we can use additional functions such as ‘->’ and ‘->>’. Both of them will return the content of JSON Object by the name specified on the right side (in our case ‘configurations’). The difference between these functions is the returning type. ‘->’ returns the text and ‘->>’ the JSONB.
Besides this, I also need to be able to find the configuration for the user by name. Currently, you can do this in the following way:
SELECT config as congifuration FROM USER_CONFIGURATIONS CROSS JOIN jsonb_array_elements(DATA -> 'configurations') config WHERE (DATA ->> 'userid') = 'ihor@gmail.com' AND (config ->> 'name') = 'myconf';
After executing, we will get the expected result:
configuration |
{“name”: “myconf”, “icons”: “small”, “theme”: “light”, “textsize”: “large”} |
These are all read operations I currently need, so let’s switch to update.
CRUD. Update
I want to be able to add the new configuration to the already existing entity when I know the user ID. Let’s add a new configuration to artur@gmail.com
UPDATE USER_CONFIGURATIONS SET DATA = jsonb_set(DATA, '{configurations}'::text[], DATA ->'configurations' || '{"name":"firstconf", "theme":"dark", "textsize":"large"}'::jsonb) WHERE (DATA ->> 'userid') = 'artur@gmail.com';
We can check that we can successfully add the configuration by using the select SQL query with a username.
id | data |
1 | {“userid”: “artur@gmail.com”, “configurations”: [{“name”: “firstconf”, “theme”: “dark”, “textsize”: “large”}]} |
{"userid": "artur@gmail.com", "configurations": [{"name": "firstconf", "theme": "dark", "textsize": "large"}]}
Here we are using the jsonb_set(target jsonb, path text[], new_value jsonb, create_missing boolean) function. We pass the DATA as the target JSONB object, specify the ‘configurations’ JSON object as pass in which we want to replace information and define the new value for ‘configurations’ JSON object. Here is one trick. We don’t update the information. We replace it. The function ‘||’ is the concatenate function. So, here we take the previous ‘configurations’ value and concatenate it with the new one. After that, we replace the ‘configurations’ object in the initial DATA JSON and replace the DATA JSON with the updated one.
Let’s use this SQL query one more time to be sure that we are adding the configuration and are not replacing the already existing one. Now I will add the configuration with name = ‘second’ and only theme = “light”.
id | data |
1 | {“userid”: “artur@gmail.com”, “configurations”: [ {“name”: “firstconf”, “theme”: “dark”, “textsize”: “large”}, {“name”: “second”, “theme”: “light”} ]} |
Also, I want to update the configuration. For example, I want to change the name of my configuration. How can I do this? I think this is the hardest operation. The only way to do that is the following:
UPDATE USER_CONFIGURATIONS SET DATA = JSONB_SET( DATA, '{configurations}'::text[], (((DATA -> 'configurations') - (SELECT i FROM generate_series(0, jsonb_array_length(data->'configurations') - 1) AS i WHERE (data->'configurations'->i->>'name' = 'firstconf')))::jsonb || '{"name":"first", "theme":"light"}'::jsonb)) WHERE (DATA ->> 'userid') = 'artur@gmail.com';
After executing this query:
id | data |
1 | {“userid”: “artur@gmail.com”, “configurations”: [ {“name”: “second”, “theme”: “light”}, {“name”: “first”, “theme”: “light”} ]} |
But we did not update. We performed the complex operation when we were getting the new JSON object. Firstly, we got the current ‘configuration’ array. Then, we performed the SQL query to get the index of the configuration with name = ‘firstconf’ and deleted this element from the ‘configuration’ array. Then, we appended the new value and passed it to jsonb_set function. This works fine until we can be sure of the result of SELECT i query. In case, for some reason, we cannot get the index of the configuration element. For example, if we set the wrong configuration name, it would cause data loss.
id | data |
1 | <null> |
Also, we have the case when the user wants to clean all configurations he currently has.
UPDATE USER_CONFIGURATIONS SET DATA = jsonb_set(DATA, '{configurations}'::text[], '[]'::jsonb) WHERE (DATA ->> 'userid') = 'artur@gmail.com';
This is the simple update operation in which we replace all configurations on the empty JSON array. The result of this operation will be the following:
id | data |
1 | {“userid”: “artur@gmail.com”, “configurations”: []} |
The last update operation I want to introduce is deleting configuration by name. Currently, we have a user ‘ihor@gmail.com‘ who has two configurations.
id | data |
1 | {“userid”: “ihor@gmail.com”, “configurations”: [ {“name”: “myconf”, “icons”: “small”, “theme”: “light”, “textsize”: “large”}, {“name”: “myconf2”, “theme”: “dark”} ]} |
I want to delete the one with the name ”myconf2″.
UPDATE USER_CONFIGURATIONS SET DATA = DATA #- ('{configurations,' || ( SELECT i FROM generate_series(0, jsonb_array_length(DATA->'configurations') - 1) AS i WHERE (DATA->'configurations'->i->>'name' = 'myconf2') ) || '}')::text[] WHERE (DATA ->> 'userid') = 'ihor@gmail.com';
Here is our result:
id | data |
1 | {“userid”: “ihor@gmail.com”, “configurations”: [ {“name”: “myconf”, “icons”: “small”, “theme”: “light”, “textsize”: “large”} ]} |
In this SQL query, we use the remove function ‘#-‘ that takes the text[] variable. So, to delete the second array element, we can use:
DATA #- '{configurations, 1}'::text[]
The main problem here is to get the array index of the element we want to delete based on its name. For this, we use the additional query. But, in this case, we have the same problem as with updating by name. In case, we cannot find the array index we will set null instead of JSON object.
id | data |
2 | <null> |
CRUD. Delete
The simplest case for delete operation. The user doesn’t want to use my service and he is deleting his account from the system.
DELETE FROM USER_CONFIGURATIONS WHERE (DATA ->> 'userid') = 'ihor@gmail.com'
Conclusion. NoSQL VS PostgreSQL JSONB
I think this choice depends on the situation. If you are already using PostgresSQL in your application, the JSONB type could become a really good feature to simplify data storage. The problematic part with PostgresSQL JSONB is modifying a JSONB column content. In any case, we totally replace the old content with a new one. So, the update operations turn to complex queries that can lose the content. You can avoid this by performing the full JSON reorganization before the persistent operation.
Useful links:
Habrahabr JSONB operations (russian)
by Artur Meshcheriakov