Hi,
The end results will be used in a report and chart. A new category for ethnicity has been added to the database.
Because of this, the field that we were using to report on is no longer available. I put together a query to
identify one record per person for ethnicity codes. The rules are if the person has a record identifying as
Hispanic use that regardless of whether they have more than one record. If the person doesn't have Hispanic,
then identify what category they are. If they only have one record (or multiple records within the same category),
code as the main category. If they have more than one record across categories (excluding Hispanic), then count
them as Multi-Ethnic.
Because the table is large and is being joined with other tables for various reporting purposes, would like to find out
if there is a more efficient/alternative way to write the sql statement. Also, for the future, they
plan to request a report that will break down the multi-ethnic into the various categories to show on a chart.
I was thinking that I would need to pivot "horizontally", or could this be done "vertically" the way the
underlying table is.
In the sample data:
1st Person has two records Hispanic and European; count under 'Hispanic'
2nd Person has two records White and Western European; count under 'White'
3rd Person has two records Indian and African; count under 'Multi-Ethnic'
4th Person has one record Sioux; count under 'American Indian or Alaskan Native'
Here is a table of sample data:
CREATE TABLE ETHTEK
(
ID VARCHAR2(5 CHAR),
ETHCODE VARCHAR2(1 CHAR),
ETHGROUP_CODE VARCHAR2(3 CHAR),
ETHGROUP VARCHAR2(30 CHAR)
);
INSERT INTO ETHTEK
(ID, ETHCODE, ETHGROUP_CODE, ETHGROUP)
Values
('11111', '5' ,'SEU', 'Southern European');
INSERT INTO ETHTEK
(ID, ETHCODE, ETHGROUP_CODE, ETHGROUP)
Values
('11111', 'H', 'SAM', 'South American');
INSERT INTO ETHTEK
(ID, ETHCODE, ETHGROUP_CODE, ETHGROUP)
Values
('11121', '5', 'WEU', 'Western European');
INSERT INTO ETHTEK
(ID, ETHCODE, ETHGROUP_CODE, ETHGROUP)
Values
('11121', '5', '5', 'White');
INSERT INTO ETHTEK
(ID, ETHCODE, ETHGROUP_CODE, ETHGROUP)
Values
('11122', '3', 'AF', 'African');
INSERT INTO ETHTEK
(ID, ETHCODE, ETHGROUP_CODE, ETHGROUP)
Values
('11122', '2', 'IND', 'Indian');
INSERT INTO ETHTEK
(ID, ETHCODE, ETHGROUP_CODE, ETHGROUP)
Values
('11159', '1', 'SIO', 'SIOUX');
Note: so as not to need to create another table, I have hard coded the description translation into case statements
in the query below. The actual query is using a lookup table.
Here is the sql: (For some reason, in my preview, the greater/less than symbols aren't showing, so put a comment on the lines that should be not equal in case they don't show up when posted)
WITH e AS --Ethnicity
(SELECT DISTINCT id,
ethcode,
ethgroup_code
ethgroup
FROM ethtek
),
h AS --Hispanic
(SELECT id, ethcode
FROM e
WHERE ethcode = 'H'),
nhm AS --Non Hispanic or Multi Ethnic
(SELECT DISTINCT id, ethcode
FROM e
WHERE ethcode <> 'H'), -- Not Equal 'H'
nh AS --Non Hispanic
(SELECT id,
CASE
WHEN COUNT (*) OVER (PARTITION BY id) >
1
THEN 'M'
ELSE ethcode
END AS ethcode
FROM nhm
WHERE ethcode <> 'H') -- Not Equal 'H'
SELECT id, ethcode,
CASE
WHEN ethcode = 'H'
THEN 'Hispanic or Latino'
WHEN ethcode = '5'
THEN 'White'
WHEN ethcode = 'M'
THEN 'Multi-Ethnic'
WHEN ethcode = '1'
THEN 'American Indian or Alaskan Native'
ELSE ethcode
END AS ethnic_desc,
CASE
WHEN ethcode = 'H'
THEN '4'
WHEN ethcode = '4'
THEN '5'
WHEN ethcode = '5'
THEN '6'
ELSE ethcode
END AS old_ethnicity_code
FROM (SELECT DISTINCT h.*
FROM h
UNION
SELECT DISTINCT nh.*
FROM nh LEFT OUTER JOIN h ON nh.id =
h.id
WHERE h.id IS NULL) eth;
This is what the end result of the query is:
Row# ID ETHCODE ETHNIC_DESC OLD_ETHNICITY_CODE
1 11111 H Hispanic or Latino 4
2 11121 5 White 6
3 11122 M Multi-Ethnic M
4 11159 1 American Indian or Alaskan Native 1
Thus, out of the 7 records, we are only counting 4 individuals one time.
This is how the Summary query (not included) result looks:
Group Total
American Indian/Alaskan Native 1
Hispanic 1
White 1
Multi-Ethnic 1
Total 4