BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
aalluru
Obsidian | Level 7

I have a dataset as follows:

data have;
  input ID ID1 Mark1;
datalines;
1   1       .
1   1       76
1   1       67
2   2       .
2   2       32
2 2 45 run;

I would like to group by ID and ID1 and extract the first and last non-missing values of mark for each group so that the resultant dataset looks like this:

ID   ID1   Mark1First   Mark1Last

1    1       76                  67

2    2       32                  45

1 ACCEPTED SOLUTION

Accepted Solutions
r_behata
Barite | Level 11
data have;
input ID ID1 Mark1 Mark2 $ GetSum;
datalines;
1 1 .  N 1 
1 1 76 N 2
1 1 67 Y 3
2 2 .  N 4
2 2 32 N 5
2 2 45 Y 6
;
run;


data want;
length Mark2First Mark2Last $1.;
call missing(Mark1First,Mark1Last,Mark2First,Mark2Last,Sum_GetSum);
do until(last.id);
	set have;
	by id ;

	Sum_GetSum=sum(GetSum,Sum_GetSum);
	if  missing(Mark1First) then do;
			Mark1First=Mark1;
			Mark2First=Mark2;
	end;
end;
Mark1Last=Mark1;
Mark2Last=Mark2
;

drop Mark1 Mark2 GetSum; 

run;

View solution in original post

13 REPLIES 13
Jerrya00
Fluorite | Level 6

Please try this:

 

proc sort data=have; by ID ID2;
run;

data have_1;
set have;
by ID ID1;
retain Mark1First Mark1Last;
retain first last;
if first.ID1 then do;
   first=.;
   last=.;

   Mark1First=.;
   Mark1Last=.;
end;
if first=. and Mark1^=. then do; 
   first=1; 
   Mark1First=Mark1;
end;
if Mark1^=. then Mark1Last=Mark1;
if last.ID1 then output;

keep ID ID1 Mark1First Mark1Last;
run;

 

 

aalluru
Obsidian | Level 7

what if there's also a Mark2 and I have to do this for both Mark1 and Mark2?

i.e. Mark1First, Mark2First, Mark1Last, Mark2Last?

mkeintz
PROC Star

Use the WHERE statement:



data have; input ID ID1 Mark1; datalines; 1 1 . 1 1 76 1 1 67 2 2 . 2 2 32 2 2 45 run; data want (drop=mark1); set have (where=(mark1^=.)); by id id1; retain mark1first; if first.id1 then mark1first=mark1; if last.id1; mark1last=mark1; run;
--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
aalluru
Obsidian | Level 7

what if there is also a Mark2 and I have to do this for both Mark1 and Mark2?

i.e. Mark1First, Mark2First, Mark1Last, Mark2Last?

r_behata
Barite | Level 11
data want;

Mark1First=.;
Mark1Last=.;
do until(last.id);
	set have;
	by id ;
	if  missing(Mark1First) then Mark1First=Mark1;
end;
Mark1Last=Mark1;
drop Mark1;

run;
aalluru
Obsidian | Level 7

what if there's also a Mark2 and I need to do this for both Mark1 and Mark2?

i.e. Mark1First, Mark2First, Mark1Last, Mark2Last?

r_behata
Barite | Level 11

Try this :

 

data have;
  input ID ID1 Mark1 Mark2;
datalines;
1 1 .  .
1 1 76 33
1 1 67 58
2 2 .  .
2 2 32 89
2 2 45 11
;
run;

data want;

call missing(Mark1First,Mark1Last,Mark2First,Mark2Last);
do until(last.id);
	set have;
	by id ;
	if  missing(Mark1First) then Mark1First=Mark1;
	if  missing(Mark2First) then Mark2First=Mark2;

end;
Mark1Last=Mark1;
Mark2Last=Mark2
;

drop Mark1 Mark2;

run;



proc print data=want;
run;
aalluru
Obsidian | Level 7

Actually my data set looks a little like this:

data have;
  input ID ID1 Mark1 Mark2;
datalines;
1 1 .  N
1 1 76 N
1 1 67 Y
2 2 .  N
2 2 32 N
2 2 45 Y
;
run;

So I need to make sure that Mark2First for ID 1 is the 'N' coressponding to the row with Mark1 as 76 and not the very first 'N' that you see in the column

r_behata
Barite | Level 11

Try this code. I have changed the First N to X to show the distinction.

 

data have;
  input ID ID1 Mark1 Mark2 $;
datalines;
1 1 .  X
1 1 76 N
1 1 67 Y
2 2 .  X
2 2 32 N
2 2 45 Y
;
run;


data want;
length Mark2First Mark2Last $1.;
call missing(Mark1First,Mark1Last,Mark2First,Mark2Last);
do until(last.id);
	set have;
	by id ;
	if  missing(Mark1First) then do;
			Mark1First=Mark1;
			Mark2First=Mark2;
	end;
end;
Mark1Last=Mark1;
Mark2Last=Mark2
;

drop Mark1 Mark2;

run;

aalluru
Obsidian | Level 7

Oh that works! Here's one last update (I promise!)

data have;
  input ID ID1 Mark1 Mark2 GetSum;
datalines;
1 1 .  N 1 
1 1 76 N 2
1 1 67 Y 3
2 2 .  N 4
2 2 32 N 5
2 2 45 Y 6
;
run;

I also want to get the sum of ALL values from the getsum column by the group so the output should be 6 and 15 for the 1st and 2nd row and not 5 and 11

mkeintz
PROC Star

There is too much mission creep in this topic.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
r_behata
Barite | Level 11
data have;
input ID ID1 Mark1 Mark2 $ GetSum;
datalines;
1 1 .  N 1 
1 1 76 N 2
1 1 67 Y 3
2 2 .  N 4
2 2 32 N 5
2 2 45 Y 6
;
run;


data want;
length Mark2First Mark2Last $1.;
call missing(Mark1First,Mark1Last,Mark2First,Mark2Last,Sum_GetSum);
do until(last.id);
	set have;
	by id ;

	Sum_GetSum=sum(GetSum,Sum_GetSum);
	if  missing(Mark1First) then do;
			Mark1First=Mark1;
			Mark2First=Mark2;
	end;
end;
Mark1Last=Mark1;
Mark2Last=Mark2
;

drop Mark1 Mark2 GetSum; 

run;

sas-innovate-wordmark-2025-midnight.png

Register Today!

Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.


Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 13 replies
  • 2729 views
  • 1 like
  • 4 in conversation