Solved: Re: Convert to wide format

vdfdd · Posted 07-30-2018 12:43 PM

I have a dataset in the following format:

ID	Counts	State
2997	18	FL
2997	4	TX
2997	31	NM
2997	17	NY
2997	11	GA
2998	32	FL
2998	48	NY
2998	36	CA
2998	4	GA
2998	4	NM
2999	12	FL
2999	42	CA
2999	67	GA
2999	23	NM
2999	9	TX

How do I manipulate it using proc SQL (etc.) if I want it like this?

	FL	TX	NM	NY	GA	NY	CA
2997
2998
2999

Thank you all in advance.

Kurt_Bremser · Posted 07-30-2018 12:45 PM

proc transpose

by id

var counts

id state

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

View solution in original post

Kurt_Bremser · Posted 07-30-2018 12:45 PM

proc transpose

by id

var counts

id state

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

VDD · Posted 07-30-2018 12:45 PM

use proc transpose

Tom · Posted 07-30-2018 12:46 PM

Don't do it with PROC SQL. Use PROC TRANSPOSE.

novinosrin · Posted 07-30-2018 12:46 PM

data have;
input ID	Counts	State $;
cards;
2997	18	FL
2997	4	TX
2997	31	NM
2997	17	NY
2997	11	GA
2998	32	FL
2998	48	NY
2998	36	CA
2998	4	GA
2998	4	NM
2999	12	FL
2999	42	CA
2999	67	GA
2999	23	NM
2999	9	TX
;

proc transpose data=have out=want;
by id;
id state;
var counts;
run;

PeterClemmensen · Posted 07-30-2018 12:48 PM

I think most users in here think you should use Proc Transpose 🙂

But why on earth would you want to use Proc SQL anyway?

The DATA to DATA Step Macro
Blog: SASnrd

novinosrin · Posted 07-30-2018 01:06 PM

proc sql;
select distinct state into :state separated by ' '
from have;
quit;

%put &state;


data want1;
do _n_=1 by 1 until(last.id);
set have;
by id;
array st &state;
if _n_=1 then call missing(of st(*));
do _i_=1 to dim(st);
if vname(st(_i_))=state then st(_i_)=counts;
end;
end;
run;

kiranv_ · Posted 07-30-2018 01:53 PM

As everyone suggested, this is easy and clean using PROC transpose. But if you want to do in PROC sql. you can do it if you have one distinct state per id by lot of hard coding as shown below

proc sql;
create table want as 
select id,
       max( case when state = "FL" then counts else 0 end) as FL,
	   max( case when state = "TX" then counts else 0 end) as TX,
	   max( case when state = "NM" then counts else 0 end) as NM,
	   max( case when state = "NY" then counts else 0 end) as NY,
	   max( case when state = "GA" then counts else 0 end) as GA,
	     max( case when state = "CA" then counts else 0 end) as CA
from have
group by 1;

vdfdd · Posted 08-02-2018 04:16 PM

Thank you for all your quick and helpful responses. The proc transpose threw up some error messages saying ' The ID value "FL" occurs twice in the same BY group'.

The PROC SQL worked fine for this purpose. I didn't try the second suggested SQL procedure since the first worked for my data.

proc sql;
select distinct state into :state separated by ' '
from have;
quit;

%put &state;

data want1;
do _n_=1 by 1 until(last.id);
set have;
by id;
array st &state;
if _n_=1 then call missing(of st(*));
do _i_=1 to dim(st);
if vname(st(_i_))=state then st(_i_)=counts;
end;
end;
run;

I appreciate your efforts. Thank you all once again.

art297 · Posted 08-02-2018 04:41 PM

@vdfdd: Check your input dataset! Sounds like you have multiple records for one or more IDs for at least one of your by groups.

Either that's in error, or you have to sum the same-state records before transposing the dataset.

Art, CEO, AnalystFinder.com

Kurt_Bremser · Posted 08-03-2018 01:32 AM

This error message from proc transpose also means that your results from the other approaches will not be correct, as values will be overwritten and lost. Another reason why proc transpose is THE tool for this, as it will alert you.

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

vdfdd · Posted 08-06-2018 09:31 AM

When I added the Let option to the proc transpose, I got the following:

WARNING: 4 observations omitted due to missing ID values.
NOTE: There were 99187 observations read from the data set WORK.CLEANY.
NOTE: The data set WORK.WANTY has 26568 observations and 38 variables.

The same number of observations as the proc sql. Any thought on this?

Kurt_Bremser · Posted 08-06-2018 09:43 AM

So you have missing ID values. Decide how you want to treat those.

Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
The macro for direct download as ZIP
How to post code
Please vote for Provide Sequential Search Capability for Hash Objects
How to deal with locked files on UNIX

Catch up on SAS Innovate 2026

SAS Training: Just a Click Away