Hi All,
I have a dataset from my client that I need to reformat so that I can do some analysis and create some reports with it. It is in Excel in the following structure:
Chicago | DC | ||
Building | 25156 | 54885 | 2665 |
Phase 1 | 5/27/2011 | 7/5/2012 | 3/30/2012 |
Phase 2 | 6/30/2011 | 8/10/2012 | 4/4/2012 |
Phase 3 | 7/15/2011 | 8/1/2012 | 4/11/2012 |
Time from Phase 1 to Phase 3 | 18 | 25 | 12 |
I want it in this format (I will calculate the time in days myself)
Building | Phase 1 | Phase 2 | Phase 3 | City |
25156 | 5/27/2011 | 6/30/2011 | 7/15/2011 | Chicago |
54885 | 7/5/2012 | 8/10/2012 | 8/1/2012 | Chicago |
2665 | 3/30/2012 | 4/4/2012 | 4/11/2012 | DC |
Thanks!
Hi.
Assuming that you can 'unmerge' the cell named 'Chicago' to 'Chicago' 'Chicago', or 'Chicago' 'Chicago1' then this may do what you need:
proc import out = your_data
datafile= "path\data.xls"
dbms = excel replace;
MIXED=YES; *this is important because in the city variable(s) you have dates and building codes;
run;
*now we transpose and use the ID to label the columns;
proc transpose data = your_data out = my_data;id var1;var _all_ ;run;
*clean up the data;
data want(rename = (_name_ = city));
set my_data(where = (_name_ ne "var1") drop = _label_ );
label _name_ = "City";
run;
Let me know what you think.
Anca.
Hi.
Assuming that you can 'unmerge' the cell named 'Chicago' to 'Chicago' 'Chicago', or 'Chicago' 'Chicago1' then this may do what you need:
proc import out = your_data
datafile= "path\data.xls"
dbms = excel replace;
MIXED=YES; *this is important because in the city variable(s) you have dates and building codes;
run;
*now we transpose and use the ID to label the columns;
proc transpose data = your_data out = my_data;id var1;var _all_ ;run;
*clean up the data;
data want(rename = (_name_ = city));
set my_data(where = (_name_ ne "var1") drop = _label_ );
label _name_ = "City";
run;
Let me know what you think.
Anca.
Thanks Anca, this worked well. However, the City column shows the variables as Chicago, Chicago1 Chicago2, DC, DC1, DC2, etc. Is there a way to get rid of the _# after the city?
Hi.
One way to replace the digits is to use regular expressions:
data want;
set have;
new_city = put(prxchange('s/[0-9]//', -1, city), $10.); |
run;
Good luck!
Anca.
Thanks Anca.
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.