Need Logic for picking random observations ?

Reply
Contributor
Posts: 46

Need Logic for picking random observations ?

Hi SAS Community,

I am in need of smart logic for my problem. I have an input table with the following columns and its values

Customer_ID     Bill_Date     Bill_Amount    

AA001             10/112013          250

AA050              08/10/2013        450    

AA001              23/11/2013        150

AA003               13/11/2013       890

AA078              16/09/2013        234

Ak123               02/10/2013       678

AK147               10/12/2013      198

PM001             09/10/2013         245

BH104              31/10/2013        675

The table has 5 yrs historical data and so it is pretty huge. I would like to pick random number of Customer_Id's and pass those list Customer_Id's as a filter when i need them for further analysis on them. Until now, I am doing plain guess work by listing certain Customer_Id's just by typing in a where condition like for example Where Customer_ID in ('AA040', 'AH345', 'PL251'......etc). However I want this list to be generated randomly using a logic first and then I can apply the random generated list as filter in the query. Random generation would help my analysis by a long shot.

It would be great to have anywhere between 12-20 Customer_id's generated. Can anybody help?

Super User
Posts: 19,789

Re: Need Logic for picking random observations ?

Posted in reply to Allaluiah

Have you looked at proc surveyselect?

SAS/STAT(R) 9.2 User's Guide, Second Edition

Contributor
Posts: 46

Re: Need Logic for picking random observations ?

Thanks. Certainly a very helpful procedure. However, I am trying to work on a Datastep based solution and eventually parameterizing it using a macro. Also, the reason is that I want greater control over the process and be in a position to dynamically change and manipulate as and when the requirements change. Nevertheless, Thanks for the very helpful response.

Respected Advisor
Posts: 3,799

Re: Need Logic for picking random observations ?

Posted in reply to Allaluiah

is right surveyselect looks like the best choice.  You can tell it how many customer ids you want and it will "generate the list" and take the sample.

data huge;
   input Customer_ID $ Bill_Date:ddmmyy.  Bill_Amount;
   format bill_date ddmmyy10.;
  
do x = 1 to 5;
     
output;
     
end;
  
cards;
AA001             10/11/2013          250
AA050              08/10/2013        450    
AA001              23/11/2013        150
AA003               13/11/2013       890
AA078              16/09/2013        234
Ak123               02/10/2013       678
AK147               10/12/2013      198
PM001             09/10/2013         245
BH104              31/10/2013        675
AA003               13/11/2013       890
;;;;
   run;

proc surveyselect sampsize=3 out=sample seed=354624589;
  
samplingunit customer_id;
   run;
proc print;
  
run;
Contributor
Posts: 46

Re: Need Logic for picking random observations ?

Posted in reply to data_null__

Thanks DN and Reeza. Appreciate it!

Ask a Question
Discussion stats
  • 4 replies
  • 231 views
  • 2 likes
  • 3 in conversation