Hi all,
I have following data, where the state name is embedded in the variable name. I need to create a new variable called state from this.
testdata:
period | var1.ID | var1.OR | var1.NV | var1.WA | var1.CA | var2.ID | var2.OR | var2.NV | var2.WA | var2.CA |
1995Q1 | 10 | 20 | 30 | 40 | 50 | 10 | 20 | 30 | 40 | 50 |
1995Q2 | 10 | 20 | 30 | 40 | 50 | 10 | 20 | 30 | 40 | 50 |
1995Q3 | 10 | 20 | 30 | 40 | 50 | 10 | 20 | 30 | 40 | 50 |
1995Q4 | 10 | 20 | 30 | 40 | 50 | 10 | 20 | 30 | 40 | 50 |
1996Q1 | 10 | 20 | 30 | 40 | 50 | 10 | 20 | 30 | 40 | 50 |
1996Q2 | 10 | 20 | 30 | 40 | 50 | 10 | 20 | 30 | 40 | 50 |
1996Q3 | 10 | 20 | 30 | 40 | 50 | 10 | 20 | 30 | 40 | 50 |
1996Q4 | 10 | 20 | 30 | 40 | 50 | 10 | 20 | 30 | 40 | 50 |
expected data:
period | state | var1 | var2 |
1995Q1 | ID | 10 | 10 |
1995Q1 | OR | 20 | 20 |
1995Q1 | NV | 30 | 30 |
1995Q1 | WA | 40 | 40 |
1995Q1 | CA | 50 | 50 |
1995Q2 | ID | 10 | 10 |
1995Q2 | OR | 20 | 20 |
1995Q2 | NV | 30 | 30 |
1995Q2 | WA | 40 | 40 |
1995Q2 | CA | 50 | 50 |
1995Q3 | ID | 10 | 10 |
1995Q3 | OR | 20 | 20 |
1995Q3 | NV | 30 | 30 |
1995Q3 | WA | 40 | 40 |
1995Q3 | CA | 50 | 50 |
1995Q4 | ID | 10 | 10 |
1995Q4 | OR | 20 | 20 |
1995Q4 | NV | 30 | 30 |
1995Q4 | WA | 40 | 40 |
1995Q4 | CA | 50 | 50 |
Please help me with this.
Thanks
This is done with proc transpose:
data have;
infile cards dlm='09'x;
input period $ var1_ID var1_OR var1_NV var1_WA var1_CA var2_ID var2_OR var2_NV var2_WA var2_CA;
cards;
1995Q1 10 20 30 40 50 10 20 30 40 50
1995Q2 10 20 30 40 50 10 20 30 40 50
1995Q3 10 20 30 40 50 10 20 30 40 50
1995Q4 10 20 30 40 50 10 20 30 40 50
1996Q1 10 20 30 40 50 10 20 30 40 50
1996Q2 10 20 30 40 50 10 20 30 40 50
1996Q3 10 20 30 40 50 10 20 30 40 50
1996Q4 10 20 30 40 50 10 20 30 40 50
;
run;
proc transpose data=have out=trans1;
by period;
var var:;
run;
data trans2;
set trans1;
length state $2;
state = scan(_name_,2,'_');
_name_ = scan(_name_,1,'_');
run;
proc sort data=trans2;
by period state;
run;
proc transpose data=trans2 out=want (drop=_name_);
by period state;
var col1;
id _name_;
run;
Note that the only assumption is the prefix and the underline as separator in the original variable name.
Note 2: dots are not valid in SAS variable names, so I changed to underlines.
How is this data stored? Do you have periods in your variable names?
yes,
we have a variable named "period" which contains all the quarters in character format
@PeterClemmensen noted that dots are not valid in SAS names, so you can't have var1.OR as a variable.
This is done with proc transpose:
data have;
infile cards dlm='09'x;
input period $ var1_ID var1_OR var1_NV var1_WA var1_CA var2_ID var2_OR var2_NV var2_WA var2_CA;
cards;
1995Q1 10 20 30 40 50 10 20 30 40 50
1995Q2 10 20 30 40 50 10 20 30 40 50
1995Q3 10 20 30 40 50 10 20 30 40 50
1995Q4 10 20 30 40 50 10 20 30 40 50
1996Q1 10 20 30 40 50 10 20 30 40 50
1996Q2 10 20 30 40 50 10 20 30 40 50
1996Q3 10 20 30 40 50 10 20 30 40 50
1996Q4 10 20 30 40 50 10 20 30 40 50
;
run;
proc transpose data=have out=trans1;
by period;
var var:;
run;
data trans2;
set trans1;
length state $2;
state = scan(_name_,2,'_');
_name_ = scan(_name_,1,'_');
run;
proc sort data=trans2;
by period state;
run;
proc transpose data=trans2 out=want (drop=_name_);
by period state;
var col1;
id _name_;
run;
Note that the only assumption is the prefix and the underline as separator in the original variable name.
Note 2: dots are not valid in SAS variable names, so I changed to underlines.
hi,..
one question. what is var: in this code mean?? do I need to specify all variable names?
proc transpose data=have out=trans1;
by period;
var var:;
run;
The colon is a wildcard character, similar to the asterisk in WIndows or UNIX. So
var:
simply means "all variables where the name starts with 'var'".
Thanks !!!! The code works perfectly !!!
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.