DATA Step, Macro, Functions and more

How to - Merge Desired Rows

Accepted Solution Solved
Reply
Super Contributor
Posts: 381
Accepted Solution

How to - Merge Desired Rows

Hello everyone,

I have a dataset similar to below. I need to take first three values all of the “Column” variables. I added the picture which I have and I also added the desired output. I tried to use Proc Sort then I use last.ID option but It didn't work. Does anybody have any idea ?

 

data have;
Length ID $ 10 Column1 8 Column2 8 Column3 8;
Infile datalines missover dlm=",";
Input ID Column1 Column2 Column3 ;
datalines;
1,5,.,.,
2,10,.,.,
3,15,.,.,
1,15,10,.,
2,15,20,.,
3,15,30,.,
1,15,30,15,
2,15,30,30,
3,15,30,45,
;
run;
/*It didn't work*/
Proc Sort data=have;
by Id;
run;
/*It didn't work*/
data want;
set have;
BY Id;
If last.Id THEN OUTPUT want;
run;

Have

Have.png

 

 

 

Want

 

Want.png

 

 

Thank  you.

 


Accepted Solutions
Solution
‎11-29-2015 05:48 PM
Respected Advisor
Posts: 4,651

Re: How to - Merge Desired Rows

 

data want;
set have(keep=id column1 firstobs=1);
set have(keep=id column2 firstobs=4);
set have(keep=id column3 firstobs=7);
run;
PG

View solution in original post


All Replies
Solution
‎11-29-2015 05:48 PM
Respected Advisor
Posts: 4,651

Re: How to - Merge Desired Rows

 

data want;
set have(keep=id column1 firstobs=1);
set have(keep=id column2 firstobs=4);
set have(keep=id column3 firstobs=7);
run;
PG
Super User
Super User
Posts: 6,502

Re: How to - Merge Desired Rows

To get what you want you need another variable. You pictures show that for COLUMN1 you want the values from the first group of rows, for column2 the sets of records from the second group of rows, etc. But there is no group number variable in your dataset. If you really don't have such a variable but your data is sorted such that each group always starts with ID=1 then you might create one by using a data step like this.

data grouped;
   set have;
   group+(id=1);
run;

Then you want to merge the values by the groups.

data want ;
   merge grouped(where=(group=1) drop=column2 column3)
          grouped(where=(group=2) drop=column1 column3)
          grouped(where=(group=3) drop=column1 column2)
  ;
  by id;
run;

 

Now if you just want to take the first three non-missing values for each column you could do something like this:

 

data want ;
  do i=1 to 3 ;
     if not eof1 then set have (keep=column1 where=(column1 ne .)) end=eof1;
     if not eof2 then set  have (keep=column2 where=(column2 ne .)) end=eof2;
     if not eof3 then set  have (keep=column3 where=(column3 ne .)) end=eof3;
    output;
    call missing(of column1-column3);
  end;
  stop;
run;

But that would really ignore the ID column, but it is not clear what the ID column means in this case anyway.

Super Contributor
Posts: 381

Re: How to - Merge Desired Rows

[ Edited ]

Hello Tom,

 

Thank you for the detailed information. After I did some calculation on my data the table was created like that(my sample(have)) dataset.Then I needed to merge every columns for the first three rows. ID column is indispensable to me. Actually, on my real dataset there are 40 columns.Your first method(groups) seems to me more dynamic but I don't know how can I do this method for 40 columns. I also have to say that there are many columns on my dataset independently of Column variables and ID variable.

 

By the way, Thank you @PGStats.

Super Contributor
Posts: 381

Re: How to - Merge Desired Rows

Hello again @Tom and @PGStats,

 

I would like to ask you If I have another variables except from "Colum" and "ID" how can I get the other variables on desired dataset. For example, I addded a sample dataset, I put two variables more on the dataset. I need to add all other the variables on my desired dataset Which you helped me to create previous post. I added my last sample dataset below and also last desired output. The significant point is it may be many "Diff" variables on my dataset and the other variables may not start with the same name.

 

data have;
Length ID $ 10 Column1 8 Column2 8 Column3 8 Diff1 $ 10 Diff2 8;
Infile datalines missover dlm=",";
Input ID Column1 Column2 Column3 Diff1 Diff2;
datalines;
2,5,.,.,C,100
1,10,.,.,A,50
3,15,.,.,Y,150
2,15,10,.,C,100
1,15,20,.,A,50
3,15,30,.,Y,150
2,15,30,15,C,100
1,15,30,30,A,50
3,15,30,45,Y,150
;
run;

Desired.png

Super Contributor
Posts: 381

Re: How to - Merge Desired Rows

Hello again,

 

I would like to learn how can I get other variables different from "ID" and "Column" variables. If I use @PGStats method;

 

set have(keep=id column1 firstobs=1);
set have(keep=id column2 firstobs=4);
set have(keep=id column3 firstobs=7);

 

I can get the "ID" and "Column" variables but If I want to get the other variables may not start with the same name on dataset how can I exceute it. Here is my last sample dataset.

 

data have;
Length ID $ 10 Column1 8 Column2 8 Column3 8 Diff1 $ 10 Diff2 8 Alternate 8 OtherWise 8;
Infile datalines missover dlm=",";
Input ID Column1 Column2 Column3 Diff1 Diff2 Alternate OtherWise;
datalines;
2,5,.,.,C,100,11,13
1,10,.,.,A,50,12,14
3,15,.,.,Y,150,13,15
2,15,10,.,C,100,11,15
1,15,20,.,A,50,12,14
3,15,30,.,Y,150,13,15
2,15,30,15,C,100,11,13
1,15,30,30,A,50,12,14
3,15,30,45,Y,150,13,15
;
run;

Want.png

 

 

Thank you.

 

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 5 replies
  • 289 views
  • 2 likes
  • 3 in conversation