DATA Step, Macro, Functions and more

How to combine rows into one

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 11
Accepted Solution

How to combine rows into one

Dear All,

 

I would like to combine two rows of data with missing information in different rows.

 

For example:

 

Jack / 500 / head /     / 23

Jack /       / head / patil / 23

Tim / 100 / shoulder/     / 24

Tim / 100 /              / Dufwenberg /   

 

The end result would be:

Jack / 500 / head / patil / 23

Tim / 100 / shoulder/ Dufwenberg / 24

 

Any suggestions of how to perform this in SAS?

 

Thank you,

Jack


Accepted Solutions
Solution
‎02-10-2017 10:16 AM
Respected Advisor
Posts: 3,777

Re: How to combine rows into one


jackmelbourne wrote:

Yes, exactly.


Updating to last non-missing is standard UPDATE statement.  The conditional min/max makes for a bit of extra code for AGE.

 

data have;
   input Name :$upcase4. Age Net Reg Loss;
   cards;
home 300 200 50     .
home 100 80   .     5
TDR  80  75   .     3
TDR  70   .   5     .
;;;;
   run;
proc print;
   run;
data want;
   if 0 then set have;
   update have(keep=name obs=0) have(drop=age);
   by name;
   if first.name then age=.;
   set have(keep=age rename=(age=_age));
   select(name);
      when('HOME') age = min(age,_age);
      when('TDR')  age = max(age,_age);
      end;
   drop _age;
   run;
proc print;
   run;

Capture.PNG

View solution in original post


All Replies
Super User
Posts: 5,260

Re: How to combine rows into one

Is this the only case you have, duplicates on two rows, and all non missing values are the same?

I would try to use SQL with max() for this specific scenario.

Data never sleeps
Super User
Super User
Posts: 7,424

Re: How to combine rows into one

Well, not typing that in, so this is just examples (note: post test data in the form of a datastep for better answers):

Sort and merge each variable with column1

Use retain statements in one datatep and only output at last.column1

Transpose the data and sort nodupkey.

Occasional Contributor
Posts: 11

Re: How to combine rows into one

Sorry, maybe I could make my question a bit clearer with this dataset?

 

For example:

 

dataset: housing

 

Name       Age      Net     Reg     Loss

home        300        200     50        . 

home        100        80        .          5

TDR          80          75        .         3

TDR          70           .          5        .

 

I would like to transform the housing data set to look like this

 

Name       Age      Net     Reg     Loss

home        100        80      50         5

TDR          80          75       5         3

 

What code would you recommend?

 

 

Respected Advisor
Posts: 3,777

Re: How to combine rows into one


jackmelbourne wrote:

Sorry, maybe I could make my question a bit clearer with this dataset?

 

For example:

 

dataset: housing

 

Name       Age      Net     Reg     Loss

home        300        200     50        . 

home        100        80        .          5

TDR          80          75        .         3

TDR          70           .          5        .

 

I would like to transform the housing data set to look like this

 

Name       Age      Net     Reg     Loss

home        100        80      50         5

TDR          80          75       5         3

 

What code would you recommend?

 

 


Why does age 100 update 300 for NAME=HOME while age 70 does not update 80 for NAME=TDR?

Occasional Contributor
Posts: 11

Re: How to combine rows into one

Hi,

 

Sorry, should have made that clearer.

 

I need the lower value for home and the higher value for TDR when there are values for both observations.

 

Thanks,

Jack

Respected Advisor
Posts: 3,777

Re: How to combine rows into one


jackmelbourne wrote:

Hi,

 

Sorry, should have made that clearer.

 

I need the lower value for home and the higher value for TDR when there are values for both observations.

 

Thanks,

Jack


And the other variables update to last non-missing?

Occasional Contributor
Posts: 11

Re: How to combine rows into one

Yes, exactly.

Solution
‎02-10-2017 10:16 AM
Respected Advisor
Posts: 3,777

Re: How to combine rows into one


jackmelbourne wrote:

Yes, exactly.


Updating to last non-missing is standard UPDATE statement.  The conditional min/max makes for a bit of extra code for AGE.

 

data have;
   input Name :$upcase4. Age Net Reg Loss;
   cards;
home 300 200 50     .
home 100 80   .     5
TDR  80  75   .     3
TDR  70   .   5     .
;;;;
   run;
proc print;
   run;
data want;
   if 0 then set have;
   update have(keep=name obs=0) have(drop=age);
   by name;
   if first.name then age=.;
   set have(keep=age rename=(age=_age));
   select(name);
      when('HOME') age = min(age,_age);
      when('TDR')  age = max(age,_age);
      end;
   drop _age;
   run;
proc print;
   run;

Capture.PNG

Occasional Contributor
Posts: 11

Re: How to combine rows into one

Perfect, thank you so much! Seems to work Smiley Happy

Super Contributor
Posts: 336

Re: How to combine rows into one

(simplistic :-) ) solution:

Data A;
  Format Name $8. Value 8. BodyPart $8. UnknownVar $12. Value2 8.;
  Input Name $ Value BodyPart $ UnknownVar $ Value2; 
  Datalines;
Jack 500 head . 23
Jack . head patil 23
Tim 100 shoulder . 24
Tim 100 . Dufwenberg .
;
* ^^^ check if this is similar to your data here ..;

Data B (Keep=Dummy:
		Rename=(DummyName=Name 
				DummyValue=Value
				DummyBodyPart=BodyPart
				DummyUnknownVar=UnknownVar
				DummyValue2=Value2)); 
  Set A;
  Format DummyName $8. DummyValue 8. DummyBodyPart $8. DummyUnknownVar $12. DummyValue2 8.;
  Retain DummyName DummyValue DummyBodyPart DummyUnknownVar DummyValue2;
  By /*Notsorted*/ Name;
  If not Missing (Name) Then DummyName=Name; * hm, different var types are a bit awkward ..;
  If not Missing (Value) Then DummyValue=Value; * .. could try array otherwise ..;
  If not Missing (BodyPart) Then DummyBodyPart=BodyPart; * .. or maybe call execute;
  If not Missing (UnknownVar) Then DummyUnknownVar=UnknownVar;
  If not Missing (Value2) Then DummyValue2=Value2;
  If Last.Name Then Output;
Run;
Valued Guide
Posts: 505

Re: How to combine rows into one

Keep the last non-missing value for each variable

I think there is an error in you 'want' table
TDR(AGE)=70 not 80


HAVE
====

Up to 40 obs WORK.HAVE total obs=4

Obs    NAME   AGE  NET  REG  LOSS

 1     home   300  200   50     .
 2     home   100   80    .     5
 3     TDR     80   75    .     3
 4     TDR     70    .    5     .

WANT
====

Obs    NAME   AGE  NET  REG  LOSS

 1     home   100   80   50     5
 2     TDR     70   75    5     3

SOLUTION

* I set this up to work with an arbitrary number of numeric columns;
* Do not need disubl if you hardcode variable names;
* I wish SAS would make the number of coluns available at compilation time;
data have;
input Name $ Age Net Reg Loss;
cards4;
home 300 200 50 .
home 100 80 . 5
TDR 80 75 . 3
TDR 70 . 5 .
;;;;
run;quit;

data _null_;
    set have(obs=1);
    array nums _numeric_;
    call symputx('nums',put(dim(nums),5.));
    rc=dosubl('
      data want;
        set have;
        by name notsorted;
        array nums _numeric_;
        array savs[&nums.] _temporary_;
        do i=1 to dim(nums);
          if nums[i] ne . then savs[i]=nums[i];
        end;
        do i=1 to dim(nums);
          nums[i]=savs[i];
        end;
        if last.name then output;
      run;quit;
    ');
run;quit;

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 11 replies
  • 356 views
  • 1 like
  • 6 in conversation