BookmarkSubscribeRSS Feed
Midi
Obsidian | Level 7

Hello , I have a tabe that contains 7 million rows and i want to sample it , by keeping only (700.000) rows :

 

 

OBSId_CustCus_Id_DupNaT_IdGen_IdRegistr_Dat_IdSign_Dat_NL_IdDat_Firs_Aqui_Id

DaT_Last_acqu_Id

N_Ordaaau_Ida_id

GR_id

LKLK1ss_idCCD_IDD_N
1238517238517601200300006523902.0000N/A401FEB1997:10:00:00
2238918235518701200400006904002.0000N/A21.
3235519235519451200406545654506905202.0000N/A21.
423852023892061200100006905202.0000N/A520GEN1993:11:00:00
5238521238521901200000006524302.0000N/A5.
6238922238922111200000006905402.0000N/A5 

So This is What I tried , without any Success :

 

proc sql;
	create table clients as 
	select Id_Cus, ranuni(66564) as random_num
	(select distinct ID_Cus
	from dbo.Eu_Ord)
	where calculated random_num<=0.05
	;
quit;

Any Help Would be Much Appreciated , thank you all.

2 REPLIES 2
subhroster
Fluorite | Level 6

You can sample number of rows from existing data set using the Proc Survey Select .

 

proc surveyselect data=dbo.Eu_Ord
   method=srs n=700 out=clients;
run;
Subhro Kar
www.9to5sas.com
FreelanceReinh
Jade | Level 19

Hello @Midi,

 

First of all, you should start with a clear picture of what type of sample you need:

  1. What is the sampling unit? If it is specified by Id_Cust, is this a unique key in the original table?
  2. Do you need exactly 700,000 rows or exactly 10 % (i.e. round(N/10) where N is the number of rows in the original table) or would just "about 10 %" be acceptable?
  3. Do you need a sample with replacement or without replacement? (see explanations in last week's thread https://communities.sas.com/t5/SAS-Programming/How-to-select-10-random-rows-in-a-data-set/m-p/589518...)
  4. Do you need all variables from the original table in the sample or only Id_Cust and (really?) a random number?
  5. Would you prefer the sample to be in random order or in the same order as in the original table?

The next question is: Do you have SAS/STAT licensed so that PROC SURVEYSELECT (as suggested by @subhroster) is available?

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 2 replies
  • 937 views
  • 0 likes
  • 3 in conversation