BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Roger
Calcite | Level 5

Hello,

I have a table that has multiple rows, and some of the values are missing. I just need to combine them into a single row by year.

I have done it with two data sets. In this case, I only have one data set. What is best way to combine them to a single row?

Thanks team in advance!

Input table:

ID Year Value1 Value2 Value3

1  1999              270

1  1999                          350 

1  1999   

1  2000    20

1  2000              300

1  2000                           320

1  2001              122

1  2001                

1  2001                           500

I need the output table as below:

ID Year Value1 Value2 Value3

1  1999           350        270

1  2000   20     300       320

1  2001           122        500

1 ACCEPTED SOLUTION

Accepted Solutions
kmw
SAS Employee kmw
SAS Employee

Editor's note: this topic is very popular.  Thanks to @peterz@Hima, and @AskoLötjönen and others who contributed useful replies.  We have consolidated some of these into this single reply so that future readers may benefit.

 

 

/* input data */

data input_table;

infile cards missover dsd;

input ID Year Value1 Value2 Value3;

datalines;

1,1999,,270

1,1999,,,350

1,1999,,,

1,2000,20

1,2000,,300

1,2000,,,320

1,2001,,122

1,2001,,,

1,2001,,,500

;

run; 

 

/*   Method #1 using PROC MEANS  */

proc means data=input_table noprint nway;

class id year;

var value1 value2 value3;

output

out=output_table(drop=_type_ _freq_) max=;

quit;

 

/* Method #2 using the UPDATE statement */

data want;

update input_table (obs=0) input_table;

by id year;

run;

 

/* Method 3 using PROC SQL */

proc sql;

create table test2 as

select id,

year,

max(value1) as value1,

max(value2) as value2,

max(value3) as value3

from input_table

group by id, year;

quit;

 

/* Resulting data set */

 

Obs   ID   Year  value1 value2 value3

1         1    1999     .         270     350

2         1     2000  20        300     320

3         1     2001    .         122     500

 

View solution in original post

16 REPLIES 16
peterz
Calcite | Level 5

I think you are after:

data

input_table;

infile

cards missover dsd;

input

ID Year Value1 Value2 Value3;

cards

;

1,1999,,270

1,1999,,,350

1,1999,,,

1,2000,20

1,2000,,300

1,2000,,,320

1,2001,,122

1,2001,,,

1,2001,,,500

;

 

 

proc means data=input_table noprint nway;

class id year;

var value1 value2 value3;

output

out=output_table(drop=_type_ _freq_) max

=;

quit

;

peterz
Calcite | Level 5

Hmmmm, not used to this editor thing. It seems pretty stupid. It has moved all the semi-colons and added carriage returns in random places but the basic syntax you need is there.

It also doesn't like me Ctrl+Ving stuff into it.

Haikuo
Onyx | Level 15

What happened? This thread has been sufficiently discussed and multiple answers have been offered yesterday, now they are all GONE???

Hima
Obsidian | Level 7

Yes Hai.Kuo, something happened but they are all gone. Here is the code posted  by PGStats yesterday. I saved it.

data have;
input ID Year Value1 Value2 Value3;
datalines;
1  1999      .      270 .
1  1999      . .                 350 
1  1999   . . .
1  2000    20 . .
1  2000    .         300 .
1  2000    .          .           320
1  2001     .        122 .
1  2001     .         . .    
1  2001     .         .           500
;


data want;
update have(obs=0) have;
by id year;
run;

art297
Opal | Level 21

Hima:  Glad you found it as I was going to repost it myself, and add in the original statement crediting Tom Abernathy for originally suggesting it.

Apparently some people still don't realize that, if they delete a post, they also delete all posts that have replied to their post.

Hima
Obsidian | Level 7

Yes Art. I am totally agreeing with you. PGStats said that the saved the code from Tom. I forgot to add that in my previous post.

hpk
Calcite | Level 5 hpk
Calcite | Level 5

Hello Arthur,

How can we get this kind of a result?

O/P

1 1999  (22,30,12) (270,40,14) (32,50,14) - Brackets means a single cell.

2 2000 (18,20,15) (22,33,22) (25,36,320)

input ID Year Value1 Value2 Value3;

datalines;

1  1999      22  270 32

1  1999      30 40 50 

1  1999   . 12 14

1  2000    18 22 25

1  2000    20 33 36

1  2000    15 22  320

1  2001     .        122 .

1  2001     .         . .    

1  2001     .         .           500

;


data want;
update have(obs=0) have;
by id year;
run;

Thank you!

Haikuo
Onyx | Level 15

aha, the 'wicked' update statement application. Just FYI, after I have studied help document on 'update', it turns out in this case, both (obs=0) and (obs=1) would work.

Regards,

Haikuo

art297
Opal | Level 21

Yes but, if you are dealing with a file containing millions of IDs, why read all of those records if you don't have to?

AskoLötjönen
Quartz | Level 8

Hi Roger,

I think that using of proc Sql is the easiest solution:

data test;

input ID @1 year 4-7 value1 11-17 value2 22-28 value3 33-39;

datalines;

1  1999

;

run;

proc sql;

create table test2 as

select id,

          year,

         max(value1) as value1,

         max(value2) as value2,

         max(value3) as value3

from test

group by id, year;

quit;

UrvishShah
Fluorite | Level 6

Here is my try...

proc sql;

  create table want as

  select id,year,

         sum(value1) as value1,

             sum(value2) as value2,

             sum(value3) as value3

  from have

  group by 1,2;

quit;

-Urvish

kmw
SAS Employee kmw
SAS Employee

Editor's note: this topic is very popular.  Thanks to @peterz@Hima, and @AskoLötjönen and others who contributed useful replies.  We have consolidated some of these into this single reply so that future readers may benefit.

 

 

/* input data */

data input_table;

infile cards missover dsd;

input ID Year Value1 Value2 Value3;

datalines;

1,1999,,270

1,1999,,,350

1,1999,,,

1,2000,20

1,2000,,300

1,2000,,,320

1,2001,,122

1,2001,,,

1,2001,,,500

;

run; 

 

/*   Method #1 using PROC MEANS  */

proc means data=input_table noprint nway;

class id year;

var value1 value2 value3;

output

out=output_table(drop=_type_ _freq_) max=;

quit;

 

/* Method #2 using the UPDATE statement */

data want;

update input_table (obs=0) input_table;

by id year;

run;

 

/* Method 3 using PROC SQL */

proc sql;

create table test2 as

select id,

year,

max(value1) as value1,

max(value2) as value2,

max(value3) as value3

from input_table

group by id, year;

quit;

 

/* Resulting data set */

 

Obs   ID   Year  value1 value2 value3

1         1    1999     .         270     350

2         1     2000  20        300     320

3         1     2001    .         122     500

 

El_Aura
Calcite | Level 5

This is awesome! Thank you!

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 16 replies
  • 70739 views
  • 3 likes
  • 13 in conversation