BookmarkSubscribeRSS Feed
samraz
Calcite | Level 5

Hello all, 

I am a complete newbie to SAS and mostly survived through this forum and UCLA thus far. I have rows of data on same subject (1 to 20+ occurrences). I have combined them using PROC means by class SubjectID, which worked well for continuous variables.

 

However, I have a few categorical variables (e.g. WhichHand?, WhatTime?) for which I need one conditional observation per subject. For example, if a subject used only R hand=1 then newvar=1, only L=2 then newvar=2, but if they used R and L=2 OR both=3  then newvar=3. 

 

I tried to create a new variable using data step with BY statement, used first.subject last.subjectID tutorials, but nothing has worked. Instead of new variable doing a loop of all observation for subject, it basically copies the old variable. 

 

I am posting my most recent attempt, but I've basically given up and may just end up using mode or max in proc summary/means with By SubjectID at this point. 

 

 

PROC FORMAT;
	VALUE TIMED
	0 = 'morning' 
	1 = 'afternoon'
	2 = 'evening' 		
	3 = 'overnight';
	VALUE HAND
		1 = 'Left'
		2 = 'Right'
		3 = 'Both';
run;
DATA attemptWhichHand;
	SET have;
	BY subject_id;
	IF ((WhichHand=3)) THEN HandUsed = 3;
	else IF ((WhichHand=1) AND (WhichHand=2) AND (WhichHand=3)) THEN HandUsed = 3;	
	else IF ((WhichHand=1) AND (WhichHand=2)) THEN HandUsed = 3;
	else IF ((WhichHand=1) AND (WhichHand=3)) THEN HandUsed = 3;
	else IF ((WhichHand=2) AND (WhichHand=3)) THEN HandUsed = 3;
	else IF (WhichHand=1) THEN HandUsed = 1;
	else IF ((WhichHand=2)) THEN HandUsed = 2;  
	FORMAT HandUsed HAND.;
run;
/* This is what I did with other variables*/
proc means noprint data=HM_Meta.glasssheet NWAY ;
	class SUBJECT_ID ;
	id DOB;
	output out=hand (drop=_type_ rename=(_freq_=TotalCollect))
	Min (Age) = MinAgeWeek
	Max (Age) = MaxAgeWeek
	Range (Age) =  RAgeWeek
	Min (DATE) = MinDate
	Max (DATE) = MaxDate
	Max (Time WhichHand) = TimeMx WhichHandMx
/*Find a better way to summarize these variables - create new varibale only left, only right, both*/
run;/*925*/
proc print data=attempt (obs=50);    
run;

 

 

 

SubjectID date      WhichHand? Time   AgeMilk DOB .....
1      	24Jul2020       1         0
1       20Jul2020       2         2
1	29Jul2020 	1	  3
2	28Jul2020       3	  0 
3       30Jul2020       1         4

Sorry about the poor quality of post and if this has been answered and thank you for your time 🙂 

2 REPLIES 2
PaigeMiller
Diamond | Level 26

I tried to create a new variable using data step with BY statement, used first.subject last.subjectID tutorials, but nothing has worked. 

 

What doesn't work? Explain this to us. What is the evidence that something isn't working properly? Show us the evidence.

 

It might also help if you provide a portion of the data set named HAVE, following these instructions (and not via any other method).

--
Paige Miller
FreelanceReinh
Jade | Level 19

Hello @samraz and welcome to the SAS Support Communities!

 

The issue with your code is that the IF conditions are checked for each record separately. But for a single observation a condition like WhichHand=1 AND WhichHand=2 is always FALSE because variable WhichHand cannot have different values at the same time. What you want to check is whether WhichHand=1 for one observation and WhichHand=2 for another observation of the same subject. So, it will help to RETAIN the information across observations of a subject_id BY group and accumulate it until the last observation of the subject is reached.

 

One of several ways to achieve this is to use a temporary array:

/* Create sample data for demonstration */

data have;
input subject_id date :date9. WhichHand;
format date date9.;
cards;
1 24Jul2020 1
1 20Jul2020 2
1 29Jul2020 1
2 27Jul2020 2
2 28Jul2020 3
3 30Jul2020 2
3 31Jul2020 2
4 31Jul2020 1
;

/* Summarize WhichHand data */

data want(keep=subject_id HandUsed);
array h[3] _temporary_;
set have;
by subject_id;
if first.subject_id then call missing(of h[*]); /* initialize array for accumulated WhichHand data */
if WhichHand in (1:3) then h[WhichHand]=1;      /* tick "Left", "Right" or "Both" according to WhichHand */
if last.subject_id;                             /* continue once all obs. of a subject have been evaluated */
if h[3] or h[1] and h[2] then HandUsed=3;       /* collect information from the array to determine */
else if h[1] then HandUsed=1;                   /* the value of HandUsed                           */
else if h[2] then HandUsed=2;
else HandUsed=0;
format HandUsed HAND.;
run;

Note that the three array elements h[1], h[2], h[3] are automatically retained and that their values, 1 or missing, are evaluated as TRUE or FALSE in IF statements.

 

Personally, I would feel tempted to use the (rarely used) BOR function:

data want(keep=subject_id HandUsed);
set have;
by subject_id;
retain HandUsed;
if first.subject_id then HandUsed=0;
if WhichHand in (1:3) then HandUsed=bor(HandUsed, WhichHand);
if last.subject_id;
format HandUsed HAND.;
run;

But the array technique will generalize more easily to other categorical variables.

 

hackathon24-white-horiz.png

The 2025 SAS Hackathon Kicks Off on June 11!

Watch the live Hackathon Kickoff to get all the essential information about the SAS Hackathon—including how to join, how to participate, and expert tips for success.

YouTube LinkedIn

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 2 replies
  • 632 views
  • 1 like
  • 3 in conversation