BookmarkSubscribeRSS Feed
Midi
Obsidian | Level 7

Hello , I have a tabe that contains 7 million rows and i want to sample it , by keeping only (700.000) rows :

 

 

OBSId_CustCus_Id_DupNaT_IdGen_IdRegistr_Dat_IdSign_Dat_NL_IdDat_Firs_Aqui_Id

DaT_Last_acqu_Id

N_Ordaaau_Ida_id

GR_id

LKLK1ss_idCCD_IDD_N
1238517238517601200300006523902.0000N/A401FEB1997:10:00:00
2238918235518701200400006904002.0000N/A21.
3235519235519451200406545654506905202.0000N/A21.
423852023892061200100006905202.0000N/A520GEN1993:11:00:00
5238521238521901200000006524302.0000N/A5.
6238922238922111200000006905402.0000N/A5 

So This is What I tried , without any Success :

 

proc sql;
	create table clients as 
	select Id_Cus, ranuni(66564) as random_num
	(select distinct ID_Cus
	from dbo.Eu_Ord)
	where calculated random_num<=0.05
	;
quit;

Any Help Would be Much Appreciated , thank you all.

2 REPLIES 2
subhroster
Fluorite | Level 6

You can sample number of rows from existing data set using the Proc Survey Select .

 

proc surveyselect data=dbo.Eu_Ord
   method=srs n=700 out=clients;
run;
Subhro Kar
www.9to5sas.com
FreelanceReinh
Jade | Level 19

Hello @Midi,

 

First of all, you should start with a clear picture of what type of sample you need:

  1. What is the sampling unit? If it is specified by Id_Cust, is this a unique key in the original table?
  2. Do you need exactly 700,000 rows or exactly 10 % (i.e. round(N/10) where N is the number of rows in the original table) or would just "about 10 %" be acceptable?
  3. Do you need a sample with replacement or without replacement? (see explanations in last week's thread https://communities.sas.com/t5/SAS-Programming/How-to-select-10-random-rows-in-a-data-set/m-p/589518...)
  4. Do you need all variables from the original table in the sample or only Id_Cust and (really?) a random number?
  5. Would you prefer the sample to be in random order or in the same order as in the original table?

The next question is: Do you have SAS/STAT licensed so that PROC SURVEYSELECT (as suggested by @subhroster) is available?

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 2 replies
  • 963 views
  • 0 likes
  • 3 in conversation