DATA Step, Macro, Functions and more

Reformat datasets

Accepted Solution Solved
Reply
Frequent Contributor
Posts: 106
Accepted Solution

Reformat datasets

Hi All,

I have a dataset from my client that I need to reformat so that I can do some analysis and create some reports with it. It is in Excel in the following structure:

ChicagoDC
Building25156548852665
Phase 15/27/20117/5/20123/30/2012
Phase 26/30/20118/10/20124/4/2012
Phase 37/15/20118/1/20124/11/2012
Time from Phase 1 to Phase 3182512

I want it in this format (I will calculate the time in days myself)

BuildingPhase 1Phase 2Phase 3City
251565/27/20116/30/20117/15/2011Chicago
548857/5/20128/10/20128/1/2012Chicago
26653/30/20124/4/20124/11/2012DC

Thanks!


Accepted Solutions
Solution
‎03-22-2013 02:24 PM
Super Contributor
Posts: 543

Re: Reformat datasets

Posted in reply to sarahsasuser

Hi.

Assuming that you can 'unmerge' the cell named 'Chicago' to 'Chicago' 'Chicago', or 'Chicago' 'Chicago1' then this may do what you need:

proc import out = your_data

            datafile= "path\data.xls"

            dbms = excel replace;

     MIXED=YES; *this is important because in the city variable(s) you have dates and building codes;

run;

*now we transpose and use the ID to label the columns;

proc transpose data = your_data out = my_data;id var1;var _all_ ;run;

*clean up the data;

data want(rename = (_name_ = city));

    set my_data(where = (_name_ ne "var1") drop = _label_ );

label _name_ = "City";

run;

Let me know what you think.

Anca.

View solution in original post


All Replies
Solution
‎03-22-2013 02:24 PM
Super Contributor
Posts: 543

Re: Reformat datasets

Posted in reply to sarahsasuser

Hi.

Assuming that you can 'unmerge' the cell named 'Chicago' to 'Chicago' 'Chicago', or 'Chicago' 'Chicago1' then this may do what you need:

proc import out = your_data

            datafile= "path\data.xls"

            dbms = excel replace;

     MIXED=YES; *this is important because in the city variable(s) you have dates and building codes;

run;

*now we transpose and use the ID to label the columns;

proc transpose data = your_data out = my_data;id var1;var _all_ ;run;

*clean up the data;

data want(rename = (_name_ = city));

    set my_data(where = (_name_ ne "var1") drop = _label_ );

label _name_ = "City";

run;

Let me know what you think.

Anca.

Frequent Contributor
Posts: 106

Re: Reformat datasets

Posted in reply to AncaTilea

Thanks Anca, this worked well. However, the City column shows the variables as Chicago, Chicago1 Chicago2, DC, DC1, DC2, etc. Is there a way to get rid of the _# after the city?

Super Contributor
Posts: 543

Re: Reformat datasets

Posted in reply to sarahsasuser

Hi.

One way to replace the digits is to use regular expressions:

data want;

     set have;

new_city = put(prxchange('s/[0-9]//', -1, city), $10.);

run;

Good luck!

Anca.

Frequent Contributor
Posts: 106

Re: Reformat datasets

Posted in reply to AncaTilea

Thanks Anca.

🔒 This topic is solved and locked.

Need further help from the community? Please ask a new question.

Discussion stats
  • 4 replies
  • 235 views
  • 1 like
  • 2 in conversation