BookmarkSubscribeRSS Feed
SAS1918
Calcite | Level 5

I have a data that has activities each person performs during a season -

NameSeasonActivity
A1x
A1y
A1z
A1t
A2y
A2u
A3x
B1u
B1v
B2r
B2x
B2y
B3k
B3j

 

If a person has activity  x & y  in season 1 then I want all his rows for the first season to have a flag yes-

NameSeasonDiseaseFlag
A1xYes
A1yYes
A1zYes
A1tYes
A2yNo
A2uNo
A3xNo

 

similarly I have different disease combination for different seasons and activity. 

 

I am unable to write an efficient code for this.

The code I have come up with is --

Create table activity_x as					
	Select a.*
	From table1 as a join (Select distinct person, season from table1 where activity= 'x') as b
	on a.person  = b.person and a.season= b.season
	order by person, season;
Quit;

This would give me a data-set with all information for person and seasons where atleast one activity in the season is x.

Similarly I would repeat for activity y and get a dataset.

Then find inner join of the two.

 

But issue is that the activity code combinations are very complex 

 

Any one of this list and Any one of this list
a r
b+c m
d+c n
s+c o

Is there any better way to do this?

4 REPLIES 4
gamotte
Rhodochrosite | Level 12

Hello,

 

 

data have;
    input name $ season $ activity $;
    cards;
A	1	x
A	1	y
A	1	z
A	1	t
A	2	y
A	2	u
A	3	x
B	1	u
B	1	v
B	2	r
B	2	x
B	2	y
B	3	k
B	3	j
;
run;

proc sql noprint;
    CREATE TABLE want AS
    SELECT *, CASE WHEN (sum(activity="x") AND sum(activity="y")) THEN "YES" ELSE "NO" END AS Flag format=$3.
    FROM have
    GROUP BY name, season
    ;
quit;
SAS1918
Calcite | Level 5

Hi Could you please explain how sum function is working for categorical variables 

gamotte
Rhodochrosite | Level 12

(activity="x") implicitely defines a variable which is 1 when the condition is met and 0 otherwise.

The sum function will compute the sum for this implicit variable for all observations belonging

to the group specified by the GROUP BY clause, that is the sum per person/season.

 

So if sum(activity="x")>0, it means that at least one observation within the person/season group has

the variable activity equals to x.

 

SAS1918
Calcite | Level 5

Thanks for the explanation.

But the issue is that I have multiple activity columns ( activity 1- activity 5) and within that I have about 30 combinations of activities for which I have to make 8-10 flag variables. Is there any other way to do this so that I dont have to write an extremely long code?

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 1398 views
  • 1 like
  • 2 in conversation