BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
aalluru
Obsidian | Level 7

I have a dataset as follows:

data have;
  input ID ID1 Mark1;
datalines;
1   1       .
1   1       76
1   1       67
2   2       .
2   2       32
2 2 45 run;

I would like to group by ID and ID1 and extract the first and last non-missing values of mark for each group so that the resultant dataset looks like this:

ID   ID1   Mark1First   Mark1Last

1    1       76                  67

2    2       32                  45

1 ACCEPTED SOLUTION

Accepted Solutions
r_behata
Barite | Level 11
data have;
input ID ID1 Mark1 Mark2 $ GetSum;
datalines;
1 1 .  N 1 
1 1 76 N 2
1 1 67 Y 3
2 2 .  N 4
2 2 32 N 5
2 2 45 Y 6
;
run;


data want;
length Mark2First Mark2Last $1.;
call missing(Mark1First,Mark1Last,Mark2First,Mark2Last,Sum_GetSum);
do until(last.id);
	set have;
	by id ;

	Sum_GetSum=sum(GetSum,Sum_GetSum);
	if  missing(Mark1First) then do;
			Mark1First=Mark1;
			Mark2First=Mark2;
	end;
end;
Mark1Last=Mark1;
Mark2Last=Mark2
;

drop Mark1 Mark2 GetSum; 

run;

View solution in original post

13 REPLIES 13
Jerrya00
Fluorite | Level 6

Please try this:

 

proc sort data=have; by ID ID2;
run;

data have_1;
set have;
by ID ID1;
retain Mark1First Mark1Last;
retain first last;
if first.ID1 then do;
   first=.;
   last=.;

   Mark1First=.;
   Mark1Last=.;
end;
if first=. and Mark1^=. then do; 
   first=1; 
   Mark1First=Mark1;
end;
if Mark1^=. then Mark1Last=Mark1;
if last.ID1 then output;

keep ID ID1 Mark1First Mark1Last;
run;

 

 

aalluru
Obsidian | Level 7

what if there's also a Mark2 and I have to do this for both Mark1 and Mark2?

i.e. Mark1First, Mark2First, Mark1Last, Mark2Last?

mkeintz
PROC Star

Use the WHERE statement:



data have; input ID ID1 Mark1; datalines; 1 1 . 1 1 76 1 1 67 2 2 . 2 2 32 2 2 45 run; data want (drop=mark1); set have (where=(mark1^=.)); by id id1; retain mark1first; if first.id1 then mark1first=mark1; if last.id1; mark1last=mark1; run;
--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
aalluru
Obsidian | Level 7

what if there is also a Mark2 and I have to do this for both Mark1 and Mark2?

i.e. Mark1First, Mark2First, Mark1Last, Mark2Last?

r_behata
Barite | Level 11
data want;

Mark1First=.;
Mark1Last=.;
do until(last.id);
	set have;
	by id ;
	if  missing(Mark1First) then Mark1First=Mark1;
end;
Mark1Last=Mark1;
drop Mark1;

run;
aalluru
Obsidian | Level 7

what if there's also a Mark2 and I need to do this for both Mark1 and Mark2?

i.e. Mark1First, Mark2First, Mark1Last, Mark2Last?

r_behata
Barite | Level 11

Try this :

 

data have;
  input ID ID1 Mark1 Mark2;
datalines;
1 1 .  .
1 1 76 33
1 1 67 58
2 2 .  .
2 2 32 89
2 2 45 11
;
run;

data want;

call missing(Mark1First,Mark1Last,Mark2First,Mark2Last);
do until(last.id);
	set have;
	by id ;
	if  missing(Mark1First) then Mark1First=Mark1;
	if  missing(Mark2First) then Mark2First=Mark2;

end;
Mark1Last=Mark1;
Mark2Last=Mark2
;

drop Mark1 Mark2;

run;



proc print data=want;
run;
aalluru
Obsidian | Level 7

Actually my data set looks a little like this:

data have;
  input ID ID1 Mark1 Mark2;
datalines;
1 1 .  N
1 1 76 N
1 1 67 Y
2 2 .  N
2 2 32 N
2 2 45 Y
;
run;

So I need to make sure that Mark2First for ID 1 is the 'N' coressponding to the row with Mark1 as 76 and not the very first 'N' that you see in the column

r_behata
Barite | Level 11

Try this code. I have changed the First N to X to show the distinction.

 

data have;
  input ID ID1 Mark1 Mark2 $;
datalines;
1 1 .  X
1 1 76 N
1 1 67 Y
2 2 .  X
2 2 32 N
2 2 45 Y
;
run;


data want;
length Mark2First Mark2Last $1.;
call missing(Mark1First,Mark1Last,Mark2First,Mark2Last);
do until(last.id);
	set have;
	by id ;
	if  missing(Mark1First) then do;
			Mark1First=Mark1;
			Mark2First=Mark2;
	end;
end;
Mark1Last=Mark1;
Mark2Last=Mark2
;

drop Mark1 Mark2;

run;

aalluru
Obsidian | Level 7

Oh that works! Here's one last update (I promise!)

data have;
  input ID ID1 Mark1 Mark2 GetSum;
datalines;
1 1 .  N 1 
1 1 76 N 2
1 1 67 Y 3
2 2 .  N 4
2 2 32 N 5
2 2 45 Y 6
;
run;

I also want to get the sum of ALL values from the getsum column by the group so the output should be 6 and 15 for the 1st and 2nd row and not 5 and 11

mkeintz
PROC Star

There is too much mission creep in this topic.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
r_behata
Barite | Level 11
data have;
input ID ID1 Mark1 Mark2 $ GetSum;
datalines;
1 1 .  N 1 
1 1 76 N 2
1 1 67 Y 3
2 2 .  N 4
2 2 32 N 5
2 2 45 Y 6
;
run;


data want;
length Mark2First Mark2Last $1.;
call missing(Mark1First,Mark1Last,Mark2First,Mark2Last,Sum_GetSum);
do until(last.id);
	set have;
	by id ;

	Sum_GetSum=sum(GetSum,Sum_GetSum);
	if  missing(Mark1First) then do;
			Mark1First=Mark1;
			Mark2First=Mark2;
	end;
end;
Mark1Last=Mark1;
Mark2Last=Mark2
;

drop Mark1 Mark2 GetSum; 

run;

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 13 replies
  • 1495 views
  • 1 like
  • 4 in conversation