In this article, we will talk about fine-grained access control in DB2 – hiding data that is actually present in the database, but should not be accessible to certain users.
Fine-grained access control is usually performed in the database itself as the data is accessed, but can also be performed between the database server and the database client using a programmable proxy.
Each of these approaches has its advantages and disadvantages. Let’s look at each of them and compare them.
Fine-grained access control capabilities of DB2
DB2 has excellent fine-grained access control capabilities – among the best in the RDBMS world.
They come in two flavors:
There is some overlap between these two approaches. The biggest difference is that RCAC based on the code (involves executing SQL code to determine if a piece of data is available and how) while LBAC based on data (relies on declarations and tags, without any code).
Row and Column Access Control (RCAC)
RCAC, as its name suggests, consists of two parts:
- Row permissionswhich allow you to define who has access to which rows
- Column maskswhich allow you to define how certain columns are displayed to the database client
By combining them, you can achieve the most common access control requirements.
Row permissions
Line permissions are expressed as sections of code that define which lines are visible. For example:
CREATE PERMISSION CustomersEuropeOnly ON DEMO.CUSTOMERS
FOR ROWS WHERE
VERIFY_GROUP_FOR_USER(SESSION_USER, 'SALES_EUROPE') AND
COUNTRY IN ('BE', 'DK', 'ES', 'FR', 'GR', 'HN', 'IT')
ENFORCED FOR ALL ACCESS
ENABLE;
This specifies the database users that belong to the group SALES_EUROPE
can see the rows in the table DEMO.CUSTOMERS
where is COUNTRY
column has one of the specified values. These values can be looked up or calculated from the database, and permissions can call SQL functions including safe user-defined functions, so there is quite a bit of flexibility.
Additional permissions can be added to the same table, and DB2 will automatically apply them all and display only the rows that satisfy at least one permission.
This type of permission covers all table access: select, insert, update and delete, directly or via a view. For example, if the user tries to insert a row that does not satisfy at least one permission, the insertion will fail. Same, obviously, for update and delete.
Column masks
Column masks are also expressed as pieces of code that specify how a column should be presented to the client. For example:
CREATE MASK CUSTOMERS_LAST_NAME_MASK ON DEMO.CUSTOMERS
FOR COLUMN LAST_NAME RETURN
CASE WHEN (VERIFY_ROLE_FOR_USER(SESSION_USER, 'ENGINEER') AND COUNTRY IN ('BE', 'ES'))
THEN SUBSTR(LAST_NAME, 1, 1) || 'XXXX'
ELSE LAST_NAME
END
ENABLE;
This specifies that for database users who have that role ENGINEER
column LAST_NAME
in the table DEMO.CUSTOMERS
it should be masked, showing only the first character, followed by it XXXX
but only for customers in certain countries.
You can only have one mask per column, so things can get painful if you have a lot of logic.
Label-Based Access Control (LBAC)
Unlike permissions and masks in RCAC, which are defined by code, tag-based access control is declarative.
It works by creating a system of sticker components, which can be organized as disorderly gatheringsfor example:
CREATE SECURITY LABEL COMPONENT COMPARTMENT
SET 'RESEARCH', 'MANAGEMENT', 'FINANCE';
As ordered stringswhere higher levels include lower levels, eg:
CREATE SECURITY LABEL COMPONENT CLASSIFICATION_LEVEL
ARRAY [ 'TOP SECRET', 'SECRET', 'CONFIDENTIAL', 'PUBLIC' ];
Or inside treeswhere again higher levels include lower levels:
CREATE SECURITY LABEL COMPONENT REGION
TREE (
'WORLD' ROOT,
'AMERICAS' UNDER 'WORLD',
'CANADA' UNDER 'AMERICAS',
'USA' UNDER 'AMERICAS',
'CALIFORNIA' UNDER 'USA',
'TEXAS' UNDER 'USA',
etc...
This is quite powerful, although managing them all can quickly become a challenge.
Label components can then be assembled into labels, e.g.
CREATE SECURITY LABEL POLICY2024.RESEARCH_TOPSECRET_USA
COMPONENT COMPARTMENT 'RESEARCH',
COMPONENT CLASSIFICATION_LEVEL 'TOP SECRET',
COMPONENT REGION 'USA';
These tags can then be used to protect individual columns:
CREATE TABLE DEMO.CUSTOMERS (
ID INT NOT NULL,
FIRST_NAME NVARCHAR(100) NOT NULL,
LAST_NAME NVARCHAR(100) NOT NULL,
COUNTRY CHAR(2) NOT NULL,
BALANCE NUMERIC(12,2) NOT NULL SECURED WITH FINANCE_CONFIDENTIAL_WORLD,
SECLABEL DB2SECURITYLABEL NOT NULL,
PRIMARY KEY (ID)
)
SECURITY POLICY POLICY2024;
As well as individual lines:
INSERT INTO DEMO.CUSTOMERS(ID, FIRST_NAME, LAST_NAME, COUNTRY, SECLABEL) VALUES
(21, 'Wernher', 'von Braun', 'US', SECLABEL_BY_NAME('POLICY2024', 'RESEARCH_TOPSECRET_USA'));
Finally, users, roles, and groups can be given read, write, or both access to tags:
GRANT SECURITY LABEL POLICY2024.RESEARCH_TOPSECRET_USA
TO USER TJONES FOR READ ACCESS;
There’s even more to it than LBAC, such as the opt-out mechanism.
This is an elaborate system that has complex rules governing things like security upgrades, default behavior, backups and data movement, and everything else that security auditors like to tinker with. It is powerful enough to implement the vast majority of precision access control needs, although scaling this system to cover large, complex, multidimensional requirements can be a significant design, implementation, and management challenge.
Pros and cons of DB2 fine-grained access control
Avg
- Built into the database: No need for any other software, except maybe something to help manage all that security metadata
- Robust and Certified: This is used by many security conscious organizations around the world.
- It automatically handles all relevant queries and updates
Against
- Adds additional load to the database: most of these mechanisms are actually implemented by concatenating them into a query, which can result in some very complex (and expensive) queries if you have many masks, permissions, and tags.
- It does not cover certain data types, such as XML and LOBs
- If a permission becomes invalid (which can easily happen with complex permissions), it will block all access to the table for which it is defined. This can obviously be destructive, and debugging is not trivial.
- All these labels and permissions must be managed by a database user with SECADM authority.
Many of the advantages and disadvantages of this solution stem from the fact that centralized solutionwhich is great for some scenarios and less so for others.
Fine-grained proxy-based access control
A very different approach consists of restricting access to data between database server and database client, using a programmable proxy.
This is more decentralized access: some trusted clients can access the database directly, while some less trusted clients may need to go through a proxy. And of course, you can have different proxy servers for different types of clients.
Since this kind of access control is being done outside the databaseit is in some ways a less powerful approach, but it has certain advantages.
Limiting or rewriting queries
A proxy can intercept any SQL statement on its way to the database and potentially alter or reject it entirely.
For example, a proxy can simply reject certain queries based on the user of the application, the data being accessed, the time of day, etc.
Furthermore, the proxy can rewrite queries to meet our data hiding requirements. This override can of course vary depending on who the user is, where the call is coming from, or any other useful conditions.
For example, a proxy can trivially intercept a query like:
SELECT ID, FIRST_NAME, LAST_NAME, BALANCE
FROM DEMO.CUSTOMERS
WHERE COUNTRY = 'FR'
And copy it into:
SELECT ID, FIRST_NAME, SUBSTR(LAST_NAME, 1, 1) || 'XXXX' AS LAST_NAME, 0 AS BALANCE
FROM DEMO.CUSTOMERS
WHERE COUNTRY = 'FR' AND REGION <> 'Z'
All this happens before it is sent to the database. This approach works best for known queries, although it is often possible to dynamically analyze the query and modify it as needed.
Modifying result sets
When the database responds to a query, the result sets or result values are passed through the proxy, which can manipulate them as needed.
For example, using Gallium Data as a proxy, a trivial result set filter might look like this:
if (row.COUNTRY === 'FR')
row.LAST_NAME = row.LAST_NAME.substring(0, 1) + "XXXX";
row.BALANCE = 0;
This extra processing will add a small amount of latency, but that work needs to be done somewhere, and by doing it in the proxy, you save the database from having to do it.
The proxy has full control over result sets and result values: it can modify rows, remove entire rows, or even insert new rows into the result set.
Advantages and disadvantages of proxy-based data hiding
Avg
- It does not require any changes to the database, database server or clients. This is clearly the most attractive feature of this approach: nothing has to be changed in your existing system.
- It does not load the database additionally
- It can cover the execution of a stored procedure, including parameter values and returned result sets
- It can be modulated based on database user, application user, application behavior or any other factor
- Different proxies may impose different sets of restrictions.
- Constraints are managed outside of DB2 and do not require special permissions on the database.
Against
- Small performance cost (typically on the order of 5%-10% increased latency)
- Limited in the case of free-form queries, since the proxy will never understand the queries as deeply as the database can
- It cannot completely hide some data (eg aggregated data may reveal the existence and value of some hidden data).
Conclusion
The easiest way to look at this is to ask yourself what will best meet your requirements: a centralized solution or a decentralized solution?
- If you’re fine with managing all your access controls in DB2, then you should go that route: it’s extremely robust and flexible, and it’s all built-in.
- If you would rather not change your database, or DB2 access control does not meet your requirements, then using a proxy may be the easiest route – sometimes only route.