BookmarkSubscribeRSS Feed
CathyVI
Pyrite | Level 9

Hello,

I am trying to create columns from a dataset. I tried to use the if then/else statement but I'm not getting the desired results.

 

data have;
input case $ id $ flag30  flag60;
datalines;	
PAA	     001199905      0      0
PAA	     001199905      0      0	
PAA	     001199905      1      1	
PostDDA  001199905      1      1	
PAA	     001587306      1      1	
PostDDA  001587306      0      0	
PostDDA  001546292	    1	   1
;
run;

 

From the data. I have ID where case PAA and PostDDA are both 1.
I want to category any ID where PAA and PostDDA is equal to 1 as multi_form
ID with PAA, where flag30=1 should be categorized as PAA_only
ID with PostDDA, where flag30=1 should be categorized as PostDDA_only.

data need;
*do until(last.id);
   set have;
   by id;
   if case='PAA' and flag30=1 then PAA_only=1;
   if case='PostDDA' and flag30=1 then PostDDA_only=1;
   if case='PAA' and case='PostDDA' and flag30=1 then multi_form=1;
   end;
run;

Here is my desired output:

id

flag30

flag60

PAA_only

PostDDA_only

multi_form

00119990

1

1

.

.

1

00154629

1

1

.

1

.

00158730

1

1

1

.

.

 

 I think this might require multiple data steps. Am help will be highly appreciated. Thanks

 

4 REPLIES 4
A_Kh
Lapis Lazuli | Level 10

We can proc transpose data by ID where flag30=1, this will pick up both PAA and POSTDDA. The only concern is flag60, if this is always equal to flag30, then we can hardcode it 1. 
eg:

proc sort data=have; by id case; run; 
proc transpose data=have out=wide(drop=_name_);
	where flag30=1;
	by id;
	var flag30;
	id case;
run; 
data want;
	set wide;
	if paa=1 and PostDDA=1 then do;
		multi_form=1; 
		call missing(paa, postdda);
	end; 

flag30=1; flag60=1; /*if always 1 when FLAG30=1*/; proc print;run;

variable order slightly changes, but it could be fixed  in data step or in the proc step where it needed. Or it could me merged back from the HAVE dataset.

Obs id PAA PostDDA multi_form flag30 flag60
1 00119990 . . 1 1 1
2 00154629 . 1 . 1 1
3 00158730 1 . . 1 1
 
Reeza
Super User

Ugly but works.

data have;
input case $ id $ flag30  flag60;
datalines;  
PAA      001199905      0      0
PAA      001199905      0      0    
PAA      001199905      1      1    
PostDDA  001199905      1      1    
PAA      001587306      1      1    
PostDDA  001587306      0      0    
PostDDA  001546292      1      1
;
run;

proc sql;
create table want(drop = PAA_flag POSTDDA_FLAG) as
select id, max(flag30) as flag30, max(flag60) as flag60, 
sum(case when caseID = 'PAA' then 1*flag30 else 0 end)  as PAA_flag,
sum(case when caseID = 'PostDDA' then 1*flag30 else 0 end)  as PostDDA_flag,
case when calculated PAA_flag=calculated PostDDA_flag=1 then 1 else . end as multi_form, 
case when calculated PAA_flag=1 and calculated PostDDA_flag=0 then 1 else . end as PAA_only, 
case when calculated PAA_flag=0 and calculated PostDDA_flag=1 then 1 else . end as PostDDA_only
from have(rename=case=caseID)
group by id;
quit;
Reeza
Super User

If you're already sorted, a data step will do this in one step:

 

proc sort data=have;
    by id;
run;

data want;
    set have;
    by id;
    retain _flag30 _flag60 paa postdda;

    if first.id then
        call missing(_flag30, _flag60, paa, postdda, paa_only, postdda_only, 
            multi_form);
            
    _flag30=max(flag30, _flag30);
    _flag60=max(flag60, _flag60);

    if case='PAA' and flag30=1 then
        paa=1;
    else if case='PostDDA' and flag30=1 then
        postDDA=1;

    if last.id then
        do;
            flag30=_flag30;
            flag60=_flag60;

            if paa=1 and postdda=1 then
                multi_form=1;
            else if paa=1 then
                paa_only=1;
            else if postdda=1 then
                postdda_only=1;
            OUTPUT;
        end;
    keep id flag30 flag60 paa_only postdda_only multi_form;
run;

@CathyVI wrote:

Hello,

I am trying to create columns from a dataset. I tried to use the if then/else statement but I'm not getting the desired results.

 

data have;
input case $ id $ flag30  flag60;
datalines;	
PAA	     001199905      0      0
PAA	     001199905      0      0	
PAA	     001199905      1      1	
PostDDA  001199905      1      1	
PAA	     001587306      1      1	
PostDDA  001587306      0      0	
PostDDA  001546292	    1	   1
;
run;

 

From the data. I have ID where case PAA and PostDDA are both 1.
I want to category any ID where PAA and PostDDA is equal to 1 as multi_form
ID with PAA, where flag30=1 should be categorized as PAA_only
ID with PostDDA, where flag30=1 should be categorized as PostDDA_only.

data need;
*do until(last.id);
   set have;
   by id;
   if case='PAA' and flag30=1 then PAA_only=1;
   if case='PostDDA' and flag30=1 then PostDDA_only=1;
   if case='PAA' and case='PostDDA' and flag30=1 then multi_form=1;
   end;
run;

Here is my desired output:

id

flag30

flag60

PAA_only

PostDDA_only

multi_form

00119990

1

1

.

.

1

00154629

1

1

.

1

.

00158730

1

1

1

.

.

 

 I think this might require multiple data steps. Am help will be highly appreciated. Thanks

 


 

Tom
Super User Tom
Super User

Not sure why you want the missing values, but here is a simple data step to check all of the observations for an ID and keep track of whether any of the two specific cases happen or not.  Then just use normal boolean logic to make your new flags.

data want;
  do until(last.id);
    set have;
    by id;
    if flag30 then do;
      if case='PAA' then PAA=1;
      else if case='PostDDA' then PostDDA=1;
    end;
    _flag30=max(_flag30,flag30);
    _flag60=max(_flag60,flag60);
  end;
  flag30=_flag30=1;
  flag60=_flag60=1;
  PAA_only= PAA and not PostDDA;
  PostDDA_only = PostDDA and not PAA;
  Multi_form = PostDDA and PAA ;
  keep id flag30 flag60 PAA_only PostDDA_only Multi_form ;
run;

Let's try it.

data have;
  length id $9 case $8 ;
input case id flag30  flag60;
datalines; 
PAA      001199905  0 0
PAA      001199905  0 0 
PAA      001199905  1 1 
PostDDA  001199905  1 1 
PAA      001587306  1 1 
PostDDA  001587306  0 0 
PostDDA  001546292  1 1
xxx      YYYY       1 1
PAA      ZZZZ       0 0      
;

Result

                                                    PostDDA_    Multi_
Obs    id           flag30    flag60    PAA_only      only       form

 1     001199905       1         1          0           0          1
 2     001546292       1         1          0           1          0
 3     001587306       1         1          1           0          0
 4     YYYY            1         1          0           0          0
 5     ZZZZ            0         0          0           0          0

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 643 views
  • 2 likes
  • 4 in conversation