SQL & PL/SQL

Announcement

For appeals, questions and feedback about Oracle Forums, please email oracle-forums-moderators_us@oracle.com. Technical questions should be asked in the appropriate category. Thank you!

Defining Ethnicity to group one rec per person based on multiple conditions

708388Nov 15 2010 — edited Nov 18 2010

Hi,

The end results will be used in a report and chart. A new category for ethnicity has been added to the database.
Because of this, the field that we were using to report on is no longer available. I put together a query to
identify one record per person for ethnicity codes. The rules are if the person has a record identifying as
Hispanic use that regardless of whether they have more than one record. If the person doesn't have Hispanic,
then identify what category they are. If they only have one record (or multiple records within the same category),
code as the main category. If they have more than one record across categories (excluding Hispanic), then count
them as Multi-Ethnic.

Because the table is large and is being joined with other tables for various reporting purposes, would like to find out
if there is a more efficient/alternative way to write the sql statement. Also, for the future, they
plan to request a report that will break down the multi-ethnic into the various categories to show on a chart.
I was thinking that I would need to pivot "horizontally", or could this be done "vertically" the way the
underlying table is.

In the sample data:
1st Person has two records Hispanic and European; count under 'Hispanic'
2nd Person has two records White and Western European; count under 'White'
3rd Person has two records Indian and African; count under 'Multi-Ethnic'
4th Person has one record Sioux; count under 'American Indian or Alaskan Native'

Here is a table of sample data:

CREATE TABLE ETHTEK
(
  ID             VARCHAR2(5 CHAR),
  ETHCODE        VARCHAR2(1 CHAR),
  ETHGROUP_CODE  VARCHAR2(3 CHAR),
  ETHGROUP       VARCHAR2(30 CHAR)
);

INSERT INTO ETHTEK
(ID, ETHCODE, ETHGROUP_CODE, ETHGROUP)
Values
('11111', '5' ,'SEU', 'Southern European');
INSERT INTO ETHTEK
(ID, ETHCODE, ETHGROUP_CODE, ETHGROUP)
Values
('11111', 'H', 'SAM', 'South American');
INSERT INTO ETHTEK
(ID, ETHCODE, ETHGROUP_CODE, ETHGROUP)
Values
('11121', '5', 'WEU', 'Western European');
INSERT INTO ETHTEK
(ID, ETHCODE, ETHGROUP_CODE, ETHGROUP)
Values
('11121', '5', '5', 'White');
INSERT INTO ETHTEK
(ID, ETHCODE, ETHGROUP_CODE, ETHGROUP)
Values
('11122', '3', 'AF', 'African');
INSERT INTO ETHTEK
(ID, ETHCODE, ETHGROUP_CODE, ETHGROUP)
Values
('11122', '2', 'IND', 'Indian');
INSERT INTO ETHTEK
(ID, ETHCODE, ETHGROUP_CODE, ETHGROUP)
Values
('11159', '1', 'SIO', 'SIOUX');

Note: so as not to need to create another table, I have hard coded the description translation into case statements
in the query below. The actual query is using a lookup table.

Here is the sql: (For some reason, in my preview, the greater/less than symbols aren't showing, so put a comment on the lines that should be not equal in case they don't show up when posted)

WITH e AS                                                       --Ethnicity
        (SELECT DISTINCT id,
                         ethcode, 
                         ethgroup_code
                         ethgroup
                    FROM ethtek
        ),
        h AS                                                        --Hispanic
        (SELECT id, ethcode
           FROM e
          WHERE ethcode = 'H'),
        nhm AS                                  --Non Hispanic or Multi Ethnic
        (SELECT DISTINCT id, ethcode
                    FROM e
                   WHERE ethcode <> 'H'),             -- Not Equal 'H'
        nh AS                                                   --Non Hispanic
        (SELECT id,
                CASE
                   WHEN COUNT (*) OVER (PARTITION BY id) >
                                                             1
                      THEN 'M'
                   ELSE ethcode
                END AS ethcode
           FROM nhm
          WHERE ethcode <> 'H')                     -- Not Equal 'H'
   SELECT id, ethcode,
          CASE
             WHEN ethcode = 'H'
                THEN 'Hispanic or Latino'
             WHEN ethcode = '5'
                THEN 'White'
             WHEN ethcode = 'M'
                THEN 'Multi-Ethnic'
             WHEN ethcode = '1'
                THEN 'American Indian or Alaskan Native'
             ELSE ethcode
          END AS ethnic_desc,
          CASE
             WHEN ethcode = 'H'
                THEN '4'
             WHEN ethcode = '4'
                THEN '5'
             WHEN ethcode = '5'
                THEN '6'
             ELSE ethcode
          END AS old_ethnicity_code
     FROM (SELECT DISTINCT h.*
                      FROM h
           UNION
           SELECT DISTINCT nh.*
                      FROM nh LEFT OUTER JOIN h ON nh.id =
                                                                  h.id
                     WHERE h.id IS NULL) eth;

This is what the end result of the query is:

Row#	ID	ETHCODE	ETHNIC_DESC	OLD_ETHNICITY_CODE
1	11111	H	Hispanic or Latino			4
2	11121	5	White					6
3	11122	M	Multi-Ethnic				M
4	11159	1	American Indian or Alaskan Native	1

Thus, out of the 7 records, we are only counting 4 individuals one time.

This is how the Summary query (not included) result looks:

	Group			Total
American Indian/Alaskan Native	1
Hispanic			1
White				1
Multi-Ethnic			1
Total				4

This post has been answered by Frank Kulash on Nov 16 2010

Jump to Answer

Locked Post

New comments cannot be posted to this locked post.

Locked on Dec 16 2010

Added on Nov 15 2010

6 comments

573 views