BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
aalluru
Obsidian | Level 7

I have a data set that I have grouped by 'ID'

ID Mark1 Mark2

1   24       76

1   24       76

1   45       67

2   23       56

2   87       32

 

I want this to be transformed into

ID   Mark1First Mark1Last Mark2First Mark2Last

1    24               45             76               67

2    23               87             56               32

1 ACCEPTED SOLUTION

Accepted Solutions
mkeintz
PROC Star

With variable renaming and a strategic SET ... POINT= statement, you can have a relatively compact data step:

 

data have;
  input ID Mark1 Mark2;
datalines;
1   24       76
1   24       76
1   45       67
2   23       56
2   87       32
run;
data want;
  set have (rename=(mark1=mark1last mark2=mark2last));
  by id;
  if first.id then set have (rename=(mark1=mark1first mark2=mark2first)) point=_n_;
  if last.id;
run;

 

Editted addition.  An alternative that can be instructive about the use of 2 SET statements each with sequential access:

 

data want;
  set have (rename=(mark1=mark1first mark2=mark2first));
  by id;
  if first.id;
  do until (last.id);
    set have (rename=(mark1=mark1last mark2=mark2last));
    by id;
  end;
run;
--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

View solution in original post

3 REPLIES 3
ballardw
Super User

One way:

data want;
   set have;
   by id;
   retain Mark1First Mark1last Mark2first Mark2last;
   if first.id then do;
      Mark1First=mark1;
      Mark2first=mark2; 
   end;
   if last.id then do;
      Mark1last=mark1;
      Mark2last=mark2; 
      output;
   end;
   keep id Mark1First Mark1last Mark2first Mark2last; 
run;

When you use a BY group in a data step SAS creates automatic variables that indicate whether the record is the first or last for a group. You reference these variables with the First. and Last. notation. These are numeric 1/0 for which can be used as 1=True and 0=False for use in if statements.

The RETAIN statement sets up variables to keep values across iterations of the data step boundary.

The explicit OUTPUT statement limits when records are written to the output data set.

mkeintz
PROC Star

With variable renaming and a strategic SET ... POINT= statement, you can have a relatively compact data step:

 

data have;
  input ID Mark1 Mark2;
datalines;
1   24       76
1   24       76
1   45       67
2   23       56
2   87       32
run;
data want;
  set have (rename=(mark1=mark1last mark2=mark2last));
  by id;
  if first.id then set have (rename=(mark1=mark1first mark2=mark2first)) point=_n_;
  if last.id;
run;

 

Editted addition.  An alternative that can be instructive about the use of 2 SET statements each with sequential access:

 

data want;
  set have (rename=(mark1=mark1first mark2=mark2first));
  by id;
  if first.id;
  do until (last.id);
    set have (rename=(mark1=mark1last mark2=mark2last));
    by id;
  end;
run;
--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
aalluru
Obsidian | Level 7

hey! the 2nd code worked! I was wondering if I could modify this to have the first populated row of mark1 i.e. if the mark1 column has missing values, I'd like to include the first row where mark1 is not missing in mark1first and also include a count variable that has the number of rows where mark1 is not missing

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 3 replies
  • 1729 views
  • 0 likes
  • 3 in conversation