PL/SQL 101 : Cursors and SQL Projection
This is not a question, it's a forum article, in reponse to the number of questions we get regarding a "dynamic number of columns" or "rows to columns"
There are two integral parts to an SQL Select statement that relate to what data is selected. One is Projection and the other is Selection:-
Selection is the one that we always recognise and use as it forms the WHERE clause of the select statement, and hence selects which rows of data are queried.
The other, SQL Projection is the one that is less understood, and the one that this article will help to explain.
-----
In short, SQL Projection is the collective name for the columns that are Selected and returned from a query.
So what? Big deal eh? Why do we need to know this?
The reason for knowing this is that many people are not aware of when SQL projection comes into play when you issue a select statement. So let's take a basic query...
First create some test data...
create table proj_test as
select 1 as id, 1 as rn, 'Fred' as nm from dual union all
select 1,2,'Bloggs' from dual union all
select 2,1,'Scott' from dual union all
select 2,2,'Smith' from dual union all
select 3,1,'Jim' from dual union all
select 3,2,'Jones' from dual
/
... and now query that data...
SQL> select * from proj_test;
ID RN NM
---------- ---------- ------
1 1 Fred
1 2 Bloggs
2 1 Scott
2 2 Smith
3 1 Jim
3 2 Jones
6 rows selected.
OK, so what is that query actually doing?
To know that we need to consider that all queries are cursors and all cursors are processed in a set manner, roughly speaking...
1. The cursor is opened
2. The query is parsed
3. The query is described to know the projection (what columns are going to be returned, names, datatypes etc.)
4. Bind variables are bound in
5. The query is executed to apply the selection and identify the data to be retrieved
6. A row of data is fetched
7. The data values from the columns within that row are extracted into the known projection
8. Step 6 and 7 are repeated until there is no more data or another condition ceases the fetching
9. The cursor is closed
The purpose of the projection being determined is so that the internal processing of the cursor can allocate memory etc. ready to fetch the data into. We won't get to see that memory allocation happening easily, but we can see the same query being executed in these steps if we do it programatically using the dbms_sql package...
CREATE OR REPLACE PROCEDURE process_cursor (p_query in varchar2) IS
v_sql varchar2(32767) := p_query;
v_cursor number; -- A cursor is a handle (numeric identifier) to the query
--
col_cnt integer;
v_n_val number; -- numeric type to fetch data into
v_v_val varchar2(20); -- varchar type to fetch data into
v_d_val date; -- date type to fetch data into
--
rec_tab dbms_sql.desc_tab; -- table structure to hold sql projection info
dummy number;
v_ret number; -- number of rows returned
v_finaltxt varchar2(100);
col_num number;
BEGIN
--
-- 1. Open the cursor
--
dbms_output.put_line('1 - Opening Cursor');
v_cursor := dbms_sql.open_cursor;
--
-- 2. Parse the cursor
--
dbms_output.put_line('2 - Parsing the query');
dbms_sql.parse(v_cursor, v_sql, dbms_sql.NATIVE);
--
-- 3. Describe the query
--
-- Note: The query has been described internally when it was parsed, but we can look at
-- that description...
--
-- Fetch the description into a structure we can read, returning the count of columns that has been projected
dbms_output.put_line('3 - Describing the query');
dbms_sql.describe_columns(v_cursor, col_cnt, rec_tab);
--
-- Use that description to define local datatypes into which we want to fetch our values
-- Note: This only defines the types, it doesn't fetch any data and whilst we can also
-- determine the size of the columns we'll just use some fixed sizes for this example
dbms_output.put_line(chr(10)||'3a - SQL Projection:-');
for j in 1..col_cnt
loop
v_finaltxt := 'Column Name: '||rpad(upper(rec_tab(j).col_name),30,' ');
case rec_tab(j).col_type
-- if the type of column is varchar2, bind that to our varchar2 variable
when 1 then
dbms_sql.define_column(v_cursor,j,v_v_val,20);
v_finaltxt := v_finaltxt||' Datatype: Varchar2';
-- if the type of the column is number, bind that to our number variable
when 2 then
dbms_sql.define_column(v_cursor,j,v_n_val);
v_finaltxt := v_finaltxt||' Datatype: Number';
-- if the type of the column is date, bind that to our date variable
when 12 then
dbms_sql.define_column(v_cursor,j,v_d_val);
v_finaltxt := v_finaltxt||' Datatype: Date';
-- ...Other types can be added as necessary...
else
-- All other types we'll assume are varchar2 compatible (implicitly converted)
dbms_sql.DEFINE_COLUMN(v_cursor,j,v_v_val,2000);
v_finaltxt := v_finaltxt||' Datatype: Varchar2 (implicit)';
end case;
dbms_output.put_line(v_finaltxt);
end loop;
--
-- 4. Bind variables
--
dbms_output.put_line(chr(10)||'4 - Binding in values');
null; -- we have no values to bind in for our test
--
-- 5. Execute the query to make it identify the data on the database (Selection)
-- Note: This doesn't fetch any data, it just identifies what data is required.
--
dbms_output.put_line('5 - Executing the query');
dummy := dbms_sql.execute(v_cursor);
--
-- 6.,7.,8. Fetch the rows of data...
--
dbms_output.put_line(chr(10)||'6,7 and 8 Fetching Data:-');
loop
-- 6. Fetch next row of data
v_ret := dbms_sql.fetch_rows(v_cursor);
-- If the fetch returned no row then exit the loop
exit when v_ret = 0;
--
-- 7. Extract the values from the row
v_finaltxt := null;
-- loop through each of the Projected columns
for j in 1..col_cnt
loop
case rec_tab(j).col_type
-- if it's a varchar2 column
when 1 then
-- read the value into our varchar2 variable
dbms_sql.column_value(v_cursor,j,v_v_val);
v_finaltxt := ltrim(v_finaltxt||','||rpad(v_v_val,20,' '),',');
-- if it's a number column
when 2 then
-- read the value into our number variable
dbms_sql.column_value(v_cursor,j,v_n_val);
v_finaltxt := ltrim(v_finaltxt||','||to_char(v_n_val,'fm999999'),',');
-- if it's a date column
when 12 then
-- read the value into our date variable
dbms_sql.column_value(v_cursor,j,v_d_val);
v_finaltxt := ltrim(v_finaltxt||','||to_char(v_d_val,'DD/MM/YYYY HH24:MI:SS'),',');
else
-- read the value into our varchar2 variable (assumes it can be implicitly converted)
dbms_sql.column_value(v_cursor,j,v_v_val);
v_finaltxt := ltrim(v_finaltxt||',"'||rpad(v_v_val,20,' ')||'"',',');
end case;
end loop;
dbms_output.put_line(v_finaltxt);
-- 8. Loop to fetch next row
end loop;
-- 9. Close the cursor
dbms_output.put_line(chr(10)||'9 - Closing the cursor');
dbms_sql.close_cursor(v_cursor);
END;
/
SQL> exec process_cursor('select * from proj_test');
1 - Opening Cursor
2 - Parsing the query
3 - Describing the query
3a - SQL Projection:-
Column Name: ID Datatype: Number
Column Name: RN Datatype: Number
Column Name: NM Datatype: Varchar2
4 - Binding in values
5 - Executing the query
6,7 and 8 Fetching Data:-
1 ,1 ,Fred
1 ,2 ,Bloggs
2 ,1 ,Scott
2 ,2 ,Smith
3 ,1 ,Jim
3 ,2 ,Jones
1 ,3 ,Freddy
1 ,4 ,Fud
9 - Closing the cursor
PL/SQL procedure successfully completed.
So, what's really the point in knowing when SQL Projection occurs in a query?
Well, we get many questions asking "How do I convert rows to columns?" (otherwise known as a pivot) or questions like "How can I get the data back from a dynamic query with different columns?"
Let's look at a regular pivot. We would normally do something like...
SQL> select id
2 ,max(decode(rn,1,nm)) as nm_1
3 ,max(decode(rn,2,nm)) as nm_2
4 from proj_test
5 group by id
6 /
ID NM_1 NM_2
---------- ------ ------
1 Fred Bloggs
2 Scott Smith
3 Jim Jones
(or, in 11g, use the new PIVOT statement)
but many of these questioners don't understand it when they say their issue is that, they have an unknown number of rows and don't know how many columns it will have, and they are told that you can't do that in a single SQL statement. e.g.
SQL> insert into proj_test (id, rn, nm) values (1,3,'Freddy');
1 row created.
SQL> select id
2 ,max(decode(rn,1,nm)) as nm_1
3 ,max(decode(rn,2,nm)) as nm_2
4 from proj_test
5 group by id
6 /
ID NM_1 NM_2
---------- ------ ------
1 Fred Bloggs
2 Scott Smith
3 Jim Jones
... it's not giving us this 3rd entry as a new column and we can only get that by writing the expected columns into the query, but then what if more columns are added after that etc.
If we look back at the steps of a cursor we see again that the description and projection of what columns are returned by a query happens before any data is fetched back.
Because of this, it's not possible to have the query return back a number of columns that are based on the data itself, as no data has been fetched at the point the projection is required.
So, what is the answer to getting an unknown number of columns in the output?
1) The most obvious answer is, don't use SQL to try and pivot your data. Pivoting of data is more of a reporting requirement and most reporting tools include the ability to pivot data either as part of the initial report generation or on-the-fly at the users request. The main point about using the reporting tools is that they query the data first and then the pivoting is simply a case of manipulating the display of those results, which can be dynamically determined by the reporting tool based on what data there is.
2) The other answer is to write dynamic SQL. Because you're not going to know the number of columns, this isn't just a simple case of building up a SQL query as a string and passing it to the EXECUTE IMMEDIATE command within PL/SQL, because you won't have a suitable structure to read the results back into as those structures must have a known number of variables for each of the columns at design time, before the data is know. As such, inside PL/SQL code, you would have to use the DBMS_SQL package, just like in the code above that showed the workings of a cursor, as the columns there are referenced by position rather than name, and you have to deal with each column seperately. What you do with each column is up to you... store them in an array/collection, process them as you get them, or whatever. They key thing though with doing this is that, just like the reporting tools, you would need to process the data first to determine what your SQL projection is, before you execute the query to fetch the data in the format you want e.g.
create or replace procedure dyn_pivot is
v_sql varchar2(32767);
--
-- cursor to find out the maximum number of projected columns required
-- by looking at the data
--
cursor cur_proj_test is
select distinct rn
from proj_test
order by rn;
begin
v_sql := 'select id';
for i in cur_proj_test
loop
-- dynamically add to the projection for the query
v_sql := v_sql||',max(decode(rn,'||i.rn||',nm)) as nm_'||i.rn;
end loop;
v_sql := v_sql||' from proj_test group by id order by id';
dbms_output.put_line('Dynamic SQL Statement:-'||chr(10)||v_sql||chr(10)||chr(10));
-- call our DBMS_SQL procedure to process the query with it's dynamic projection
process_cursor(v_sql);
end;
/
SQL> exec dyn_pivot;
Dynamic SQL Statement:-
select id,max(decode(rn,1,nm)) as nm_1,max(decode(rn,2,nm)) as nm_2,max(decode(rn,3,nm)) as nm_3 from proj_test group by id order by id
1 - Opening Cursor
2 - Parsing the query
3 - Describing the query
3a - SQL Projection:-
Column Name: ID Datatype: Number
Column Name: NM_1 Datatype: Varchar2
Column Name: NM_2 Datatype: Varchar2
Column Name: NM_3 Datatype: Varchar2
4 - Binding in values
5 - Executing the query
6,7 and 8 Fetching Data:-
1 ,Fred ,Bloggs ,Freddy
2 ,Scott ,Smith ,
3 ,Jim ,Jones ,
9 - Closing the cursor
PL/SQL procedure successfully completed.
... and if more data is added ...
SQL> insert into proj_test (id, rn, nm) values (1,4,'Fud');
1 row created.
SQL> exec dyn_pivot;
Dynamic SQL Statement:-
select id,max(decode(rn,1,nm)) as nm_1,max(decode(rn,2,nm)) as nm_2,max(decode(rn,3,nm)) as nm_3,max(decode(rn,4,nm)) as nm_4 from proj_test group by id order by id
1 - Opening Cursor
2 - Parsing the query
3 - Describing the query
3a - SQL Projection:-
Column Name: ID Datatype: Number
Column Name: NM_1 Datatype: Varchar2
Column Name: NM_2 Datatype: Varchar2
Column Name: NM_3 Datatype: Varchar2
Column Name: NM_4 Datatype: Varchar2
4 - Binding in values
5 - Executing the query
6,7 and 8 Fetching Data:-
1 ,Fred ,Bloggs ,Freddy ,Fud
2 ,Scott ,Smith , ,
3 ,Jim ,Jones , ,
9 - Closing the cursor
PL/SQL procedure successfully completed.
Of course there are other methods, using dynamically generated scripts etc. (see ), but the above simply demonstrates that:-
a) having a dynamic projection requires two passes of the data; one to dynamically generate the query and another to actually query the data,
b) it is not a good idea in most cases as it requires code to handle the results dynamically rather than being able to simply query directly into a known structure or variables, and
c) a simple SQL statement cannot have a dynamic projection.
Most importantly, dynamic queries prevent validation of your queries at the time your code is compiled, so the compiler can't check that the column names are correct or the tables names, or that the actual syntax of the generated query is correct. This only happens at run-time, and depending upon the complexity of your dynamic query, some problems may only be experienced under certain conditions. In effect you are writing queries that are harder to validate and could potentially have bugs in them that would are not apparent until they get to a run time environment. Dynamic queries can also introduce the possibility of SQL injection (a potential security risk), especially if a user is supplying a string value into the query from an interface.
-----
To summarise:-
The projection of an SQL statement must be known by the SQL engine before any data is fetched, so don't expect SQL to magically create columns on-the-fly based on the data it's retrieving back; and, if you find yourself thinking of using dynamic SQL to get around it, just take a step back and see if what you are trying to achieve may be better done elsewhere, such as in a reporting tool or the user interface.
-----
Other articles in the PL/SQL 101 series:-
PL/SQL 101 : Understanding Ref Cursors
PL/SQL 101 : Exception Handling