How to split a dataset 80 - 20 percent

How can I split a dataset 80/20 percent by a common id? I.e., I want to split the data by id (the 80/20 split should be made on the basis of id).

id     score     forum
12     89        98
12     87        67
13     56        87
13     45        98
14     78        98
15     23        87
16     54        23

Re: How to split a dataset 80 - 20 percent

Posted in reply to venkatard

Hi,

Given the example you provided, what do you want the end result to look like?

Thanks.

Re: How to split a dataset 80 - 20 percent

Posted in reply to venkatard

Presuming that you want all of the records associated with 80% of unique IDs to be identified:

data have;
input id score forum;
cards;
12     89          98
12     87          67
13     56          87
13     45          98
14     78          98
15     23          87
16     54          23
;

proc sql;
/* One row per distinct id, with a placeholder for its random value */
create table ids as select distinct id, 0 as id_rand_val from work.have order by id;

/* Draw one uniform(0,1) value per id */
update ids set id_rand_val=rand('uniform');

/* Every record inherits the group of its id, so an id is never split across groups */
create table want as
select
    t1.*,
    case when t2.id_rand_val <= .8 then 'Group1' else 'Group2' end as ID_Group
from
    work.have t1
    inner join work.ids t2
        on t1.id=t2.id;
quit;
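
Note that comparing rand('uniform') to .8 puts each id in Group1 with probability 0.8, so the split is 80/20 only on average; with just five ids a single run can easily land elsewhere. If a fixed proportion of ids is wanted, one possible variation is to sort the ids by their random value and cut at 80% of the count. This is only a sketch: the names ids_ranked, want_exact and n_ids and the seed 12345 are made up for illustration and are not part of the reply above.

```sas
/* Sketch only: ids_ranked, want_exact, n_ids and the seed 12345 are illustrative. */
proc sql;
    create table ids_ranked as
    select distinct id from work.have;
quit;

data ids_ranked;
    set ids_ranked;
    if _n_ = 1 then call streaminit(12345);  /* fixed seed so the split is repeatable */
    id_rand_val = rand('uniform');
run;

/* Random order of ids */
proc sort data=ids_ranked;
    by id_rand_val;
run;

/* The first 80% of ids (rounded) go to Group1, the rest to Group2 */
data ids_ranked;
    set ids_ranked nobs=n_ids;
    length ID_Group $6;
    if _n_ <= round(0.8 * n_ids) then ID_Group = 'Group1';
    else ID_Group = 'Group2';
run;

/* Attach the group of each id to all of its records */
proc sql;
    create table want_exact as
    select t1.*, t2.ID_Group
    from work.have t1
    inner join work.ids_ranked t2
        on t1.id = t2.id;
quit;
```

With the five distinct ids in the example, round(0.8 * 5) is 4, so four ids fall in Group1 and one in Group2. Because ids have different numbers of records, the share of records in each group will still not be exactly 80/20.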

Re: How to split a dataset 80 - 20 percent

Posted in reply to venkatard

With OUTALL, SELECTED=1 marks the sample drawn at RATE=, in this case the 20%. SELECTED=0 therefore marks the remaining 1-rate part, i.e. the 80%.

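data score;
   input id $     score    forum;
   cards;
12     89          98
12     87          67
13     56          87
13     45          98
14     78          98
15     23          87
16     54          23
;;;;
   run;
/* Sample whole ids (not single records) at a 20% rate; OUTALL keeps every record and flags it with SELECTED */
proc surveyselect seed=2 rate=.2 outall;
   SAMPLINGUNIT id;
   run;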
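
If two separate datasets are wanted rather than a flag, the SELECTED variable can drive a split in a follow-up DATA step. A minimal sketch, assuming the PROC SURVEYSELECT output is written to a named dataset; the names sampled, part20 and part80 are made up here, and DATA= and OUT= are spelled out only so the second step knows what to read:

```sas
/* Sketch only: sampled, part20 and part80 are illustrative dataset names. */
proc surveyselect data=score out=sampled seed=2 rate=.2 outall;
   samplingunit id;
run;

/* Route each record by the flag that OUTALL adds */
data part20 part80;
   set sampled;
   if Selected = 1 then output part20;  /* records of the sampled ids */
   else output part80;                  /* records of all other ids */
run;
```

All records of a given id end up in the same dataset, since the ids, not the individual records, are the sampling units.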