Skip to Main Content

SQL & PL/SQL

Announcement

For appeals, questions and feedback about Oracle Forums, please email oracle-forums-moderators_us@oracle.com. Technical questions should be asked in the appropriate category. Thank you!

Random sampling of records in a data table

633461Apr 11 2008 — edited Apr 11 2008
I had a doubt regarding random sampling of data .....it goes as such.

i have a table of multiple records and i want to frame a query which picks random 10% samples/records from the table and displays it.

there are 3 diff queries related to that...whic one is efficient and the right one .please guide.


1)
SQL> select * from EMP sample(10);



EMPNO ENAME JOB MGR HIREDATE SAL COMM DEPTNO

---------- ---------- --------- ---------- --------- ---------- ---------- ----------

7876 ADAMS CLERK 7788 23-MAY-87 1100 20



SQL> select * from EMP sample(50);



EMPNO ENAME JOB MGR HIREDATE SAL COMM DEPTNO

---------- ---------- --------- ---------- --------- ---------- ---------- ----------

7369 SMITH CLERK 7902 17-DEC-80 800 20

7499 ALLEN SALESMAN 7698 20-FEB-81 1600 300 30

7566 JONES MANAGER 7839 02-APR-81 2975 20

7782 CLARK MANAGER 7839 09-JUN-81 2450 10

7839 KING PRESIDENT 17-NOV-81 5000 10

7876 ADAMS CLERK 7788 23-MAY-87 1100 20

7934 MILLER CLERK 7782 23-JAN-82 1300 10



7 rows selected.



In the above execution of the query, we use the sample clause to randomly sample some percentage of data. SAMPLE(n) – n is the number denoting the percentage.



One thing which I read about this clause is it works only for single table queries on local tables. If we want to use this clause in a multi table or remote query then we have to create inline view on the driving table of the query with sample clause.



SELECT T1.DEPT,T2.EMP

FROM (SELECT * FROM DEPT SAMPLE(5)) T1,EMP T2

WHERE T1.DEPT_ID=T2.DEPT_ID;





2)
With reference to the last query mentioned in the 1st query,it worked (but only for the 1st run) !
Consecutive runs - no rows are selected

SQL> SELECT d.deptno, e.ename
2 FROM (SELECT * FROM dept SAMPLE(5)) d, emp e
3 WHERE d.deptno=e.deptno
4 ;

DEPTNO ENAME
---------- ----------
10 CLARK
10 KING
10 MILLER


3)
Is there any other way to do it without using the sample function?
This is another query;

select * from emp where rownum<=(select trunc(.20*count(*),0) from emp)



but it samples only the first 20% records.



Can we put some condition on rownum and make the output random.
Comments
Locked Post
New comments cannot be posted to this locked post.
Post Details
Locked on May 9 2008
Added on Apr 11 2008
3 comments
8,389 views