SQL & PL/SQL

Announcement

For appeals, questions and feedback about Oracle Forums, please email oracle-forums-moderators_us@oracle.com. Technical questions should be asked in the appropriate category. Thank you!

Interested in getting your voice heard by members of the Developer Marketing team at Oracle? Check out this post for AppDev or this post for AI focus group information.

find a Value that exists the most

WestDraytonDec 14 2010 — edited Dec 17 2010

Let's say i have a column with various Values, now i want to find a Value that exists the most, how is the better way to make such query?
Is the best solution to do this way:

WITH t AS
 (SELECT 'a' AS VALUE  FROM Dual  UNION ALL
  SELECT 'b'  FROM Dual  UNION ALL
  SELECT 'a' FROM Dual)
SELECT VALUE
FROM (SELECT t.Value, COUNT(*) AS Cnt
           FROM t
           GROUP BY t.Value
           ORDER BY Cnt DESC
          )
WHERE Rownum = 1

As you see that Value 'a' exists the most times in table T, so the query must output the 'a' and nothing more. If the number of occurrences of the 'a' and 'b' would be same, then query should output one of them, doesn't matter which one.
As you see query now doesn't seem to be well optimized, i think.
(Oracle 10g)

This post has been answered by MichaelS on Dec 14 2010

Jump to Answer

BluShadow

CharlesRoos wrote:
As you see that Value 'a' exists the most times in table T, so the query must output the 'a' and nothing more. If the number of occurrences of the 'a' and 'b' would be same, then query should output one of them, doesn't matter which one.
As you see query now doesn't seem to be well optimized, i think.

Why do you think it's not optimised? It does what you're asking doesn't it?
There would also be a way using dense_rank keep method, but not sure if that would be any quicker.

1 person found this helpful

818663

hi,
instead of using order by clause if we use having it will help a little bit, i hope...

WITH t AS
 (SELECT 'a' AS VALUE  FROM Dual  UNION ALL
  SELECT 'b'  FROM Dual  UNION ALL
  SELECT 'b' FROM Dual)
SELECT VALUE
FROM (SELECT t.Value, COUNT(*) AS Cnt
           FROM t
           GROUP BY t.Value
           having count(*) > 1
          )

1 person found this helpful

WestDrayton

Hi, your query is not correct because we may have all Values existing only 1 time as below example, and in that case we must return/output still one Value, any of Values.

WITH t AS
 (SELECT 'a' AS VALUE  FROM Dual  UNION ALL
  SELECT 'b' FROM Dual)
SELECT VALUE
FROM (SELECT t.Value, COUNT(*) AS Cnt
           FROM t
           GROUP BY t.Value
           having count(*) > 1
          )

The query must find Value that exists the most times, and return that Value.
Maybe then my initial query seems ok. I haven't tested the performance of it yet, i just thought maybe theoretically looking we can see better way to rewrite the query.

MichaelS

Answer

SQL> with t
     as (
  select 'a' as value from dual union all
  select 'b' from dual union all
  select 'b' from dual
)
--
--
select max (value) keep (dense_rank last order by count (*)) value,
       count(*) cnt
  from t
group by value
/
VALUE        CNT
----- ----------
b              2
1 row selected.

Edited by: MichaelS on Dec 14, 2010 10:39 AM

added cnt.

Marked as Answer by WestDrayton · Sep 27 2020

818663

hi,
by looking at the sample data i was thingking thatwe need to find the data if it exists more than once.....

635471

You don't have to include Count(*) in the select clause of the inline view, by the way. you could just:

WITH t AS
 (SELECT 'a' AS VALUE  FROM Dual  UNION ALL
  SELECT 'b'  FROM Dual  UNION ALL
  SELECT 'a' FROM Dual)
SELECT VALUE
FROM (SELECT t.Value
           FROM t
           GROUP BY t.Value
           ORDER BY COUNT(*) DESC
          )
WHERE Rownum = 1

Probably no impact on performance.

1 person found this helpful

WestDrayton

Seems the "keep dense_rank" solution is the best, Internet said it is slightly faster solution than others.
But the "cnt" column shows wrong value - it shows the count of all records- but i don't need the count, so the query suits my need.

with t
     as (
  select 'a' as value from dual union all
  select 'b' from dual
)
select min(value) keep (dense_rank last order by count (*) desc) value,
       count(*) cnt
  from t
group by value;
/*
a	2
*/

635471

CharlesRoos wrote:
... Internet said it is slightly faster solution than others.

Internet also says that Barack Obama was born in Kenya and is a Martian. Check it for yourself on your own data set.

1 person found this helpful

Aketi Jyuuzou

sometimes,Please remember stats_mode function ;-)

col modeVal for a10

with t as (
select 'a' as value from dual union all
select 'b' from dual union all
select 'b' from dual)
select stats_mode(VALUE) as modeVal from t;

modeVal
-------
b

1 person found this helpful

Aketi Jyuuzou

I like this nested AggFunction usage of "Frank Kulash" B-)
2135816

with t as (
select 'a' as value from dual union all
select 'b' from dual union all
select 'b' from dual)
select
max(value) Keep(Dense_Rank Last order by count(*)) as modeVal,
max(count(*)) as cnt
 from t
group by value;

modeVal  CNT
-------  ---
b          2

WestDrayton

I didn't know that function at all, thx.

>
STATS_MODE takes as its argument a set of values and returns the value that occurs with the greatest frequency. If more than one mode exists, Oracle Database chooses one and returns only that one value.

John Spencer

It really depends on how you feel about ties, anf if you care about accurate counts.

Your original version gets a single "random" row in case of a tie since there is no guarantee which value will be ordered where within the same count group

SQL> with t as (
  2    select 'a' as value from dual union all
  3    select 'b' from dual union all
  4    select 'b' from dual union all
  5    select 'c' from dual union all
  6    select 'c' from dual)
  7  SELECT VALUE
  8  FROM (SELECT t.Value, COUNT(*) AS Cnt
  9             FROM t
 10             GROUP BY t.Value
 11             ORDER BY Cnt DESC
 12            )
 13  WHERE Rownum = 1;
 
V
-
b

Michael's will give the "largest" value of value in case of a tie, but the count will be incorrrect:

SQL> with t as (
  2    select 'a' as value from dual union all
  3    select 'b' from dual union all
  4    select 'b' from dual union all
  5    select 'c' from dual union all
  6    select 'c' from dual)
  7  select max (value) keep (dense_rank last order by count (*)) value,
  8         count(*) cnt
  9    from t
 10  group by value;
 
V        CNT
- ----------
c          3

Also, the more ties there are, the more the count varies:

SQL> with t as (
  2    select 'a' as value from dual union all
  3    select 'b' from dual union all
  4    select 'b' from dual union all
  5    select 'c' from dual union all
  6    select 'c' from dual union all
  7    select 'd' from dual union all
  8    select 'd' from dual)
  9  select max (value) keep (dense_rank last order by count (*)) value,
 10         count(*) cnt
 11  from t
 12  group by value;
 
V        CNT
- ----------
d          4

Aketi's mode_value solution will apparently (based on very limited tests) give the "smallest" value in case of ties:

SQL> with t as (
  2    select 'a' as value from dual union all
  3    select 'b' from dual union all
  4    select 'b' from dual union all
  5    select 'c' from dual union all
  6    select 'c' from dual)
  7  select stats_mode(VALUE) as modeVal from t;
 
M
-
b

If you want to see all of the ties, then I like the bog standard:

SQL> with t as (
  2    select 'a' as value from dual union all
  3    select 'b' from dual union all
  4    select 'b' from dual union all
  5    select 'c' from dual union all
  6    select 'c' from dual)
  7  select value, count(*)
  8  from t
  9  group by value
 10  having count(*) >= ALL (select count(*) from t
 11                          group by value);
 
V   COUNT(*)
- ----------
b          2
c          2

or, in one pass through the table:

SQL> with t as (
  2    select 'a' as value from dual union all
  3    select 'b' from dual union all
  4    select 'b' from dual union all
  5    select 'c' from dual union all
  6    select 'c' from dual)
  7  select value, cnt
  8  from (select value, cnt,
  9               rank() over(order by cnt desc) rnk
 10        from (select value, count(*) cnt
 11              from t
 12              group by value))
 13  where rnk = 1;
 
V        CNT
- ----------
b          2
c          2

The last has the advantage that it can be tweaked to find any N or set of N's (by using dense_rank), or to just pick one value (by using row_number).

John

WestDrayton

Thanks,
today this forum has given lot of new knowledges to me.

Frank Kulash

Hi,

You only need one qub-query to do this. John's last suggestion can be written like this:

  select value, cnt
  from (select value, 
       	       count (*)				AS cnt,
               rank() over(order by COUNT (*) desc) 	AS rnk
        from t
        group by value)
  where rnk = 1;

Analytic functions are computed after the GROUP BY clause is applied.

John Spencer

I knew that :-)

Not enough coffee yet I gues :-(

795356

If your bothered abt tie ups thn

with t as
(
select 1 no from dual
union all
select 1 no from dual
union all
select 2 from dual
union all
select 2 from dual
union all
select 2 from dual
union all
select 3 from dual
union all
select 3 from dual
union all
select 3 from dual
)
 select no,cnt,rw,rnk
 from
     (
        Select no,cnt,row_number() over(order by cnt desc) rw,rank() over(order by cnt desc) rnk
        from
        (
        select no,count(*) cnt
        from t
        group by no
        order by no
        )
     )
     where rnk = 1

else

with t as
(
select 1 no from dual
union all
select 1 no from dual
union all
select 2 from dual
union all
select 2 from dual
union all
select 2 from dual
union all
select 3 from dual
union all
select 3 from dual
union all
select 3 from dual
)
 select no,cnt,rw,rnk
 from
     (
        Select no,cnt,row_number() over(order by cnt desc) rw,rank() over(order by cnt desc) rnk
        from
        (
        select no,count(*) cnt
        from t
        group by no
        order by no
        )
     )
     where rw = 1

795356

Wow here I find much optimized solution..............

with t as
(
select 1 no from dual
union all
select 1 no from dual
union all
select 2 from dual
union all
select 2 from dual
union all
select 2 from dual
union all
select 3 from dual
union all
select 3 from dual
union all
select 3 from dual
)select count(*),no from t
group by no
having count(*) >= all(select count(*) from t group by no)

635471

If you were going to go that route then it might be more efficient to:

with t as
(
select 1 no from dual
union all
select 1 no from dual
union all
select 2 from dual
union all
select 2 from dual
union all
select 2 from dual
union all
select 3 from dual
union all
select 3 from dual
union all
select 3 from dual
),
agg_t as (select no, count(*) c from t group by no)
select c,no from agg_t
where c >= all(select c from agg_t)
/

with t as
(
select 1 no from dual
union all
select 1 no from dual
union all
select 2 from dual
union all
select 2 from dual
union all
select 2 from dual
union all
select 3 from dual
union all
select 3 from dual
union all
select 3 from dual
),
agg_t as (select no, count(*) c from t group by no)
select c,no from agg_t
where c = (select max(c) from agg_t)
/

795356

But I found 'ALL' very interesting .And this is for the first time I'm making use of it.

1 - 19

Locked Post

New comments cannot be posted to this locked post.

Locked on Jan 14 2011

Added on Dec 14 2010

19 comments

8,494 views

SQL & PL/SQL

find a Value that exists the most

Comments

Post Details