BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
emikea
Calcite | Level 5

Hello. I'm stumped by the following situation and would be grateful for any help.

I have a dataset (D1) similar to this.

Metric     Open   BD1     BD2

MetricA      10     20         30

Metric2      40     50         60

and I want to transform it into another (D2) which contains one row like this.

MetricA_Open      MetricA_BD1     MetricA_BD2     Metric2_Open     Metric2_BD1     Metric2_BD2

10                         20                    30                    40                         50                    60

My actual dataset is much larger with many rows and columns, so I can't hard code it. I'm currently creating a dataset for each metric and then joining each in SQL, but this doesn't scale well.

I feel it is probably possible to do this in one pass through the data and without creating tons of intermediate datasets, but I haven't been able to find a solution. Any help would be appreciated.

Thanks,

Mike

1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

Here is one way.  Convert it to vertical format and use PROC TRANSPOSE to generate the resulting variable names.

data have;

  input Metric $ Open BD1 BD2 @@;

cards;

MetricA 10 20 30 Metric2 40 50 60

run;

data temp / view=temp;

  set have;

  array _n _numeric_;

  do _n_=1 to dim(_n);

     _name_ = vname(_n(_n_));

     col1 = _n(_n_);

     output;

  end;

  keep metric _name_ col1 ;

run;

proc transpose data=temp out=want delimiter=_;

  id metric _name_ ;

  var col1;

run;

View solution in original post

8 REPLIES 8
Reeza
Super User

How would you handle the remaining rows and columns? It seems strange to want to have all of the data in one column. Can you explain some more about what you're looking for?

emikea
Calcite | Level 5

Additional rows would lead to additional columns whose names would begin with the value in the Metric field.

If the next row was:

Metric3       70          80          90

then D2 would have the additional columns of

Metric3_Open     Metric3_BD1     Metric3_BD2

To clarify, I want the resulting dataset to have one row, but it will have as many columns as required by the number of observations in the first dataset.

The reason I'm doing this is because I have hundreds and hundreds of datasets like D1 that each correspond to a particular date. I want to flatten each one out and then append them to each other to foster time series analysis. The number of columns in D1 is actually always going to be twenty some, so my solution can assume a limited and fixed number of columns in D1. For the sake of this question, we can just assume that Open, BD1, and BD2 are the only columns that we have to transform.

slchen
Lapis Lazuli | Level 10

data have;

input Metric $   Open   BD1     BD2;

cards;

MetricA      10     20         30

Metric2      40     50         60

;

run;

proc sort data=have;

  by metric;

run;

proc transpose data=have out=have1 name=trans_name;

   by metric;

run;

data have2;

  set have1;

  new_name=catx('_',metric,trans_name);

  drop metric t_name;

run;

proc transpose data=have2 out=want(drop=_name_);

  id new_name;

  run;

proc print;

run;

emikea
Calcite | Level 5

Another great solution. Thank you.

Tom
Super User Tom
Super User

Here is one way.  Convert it to vertical format and use PROC TRANSPOSE to generate the resulting variable names.

data have;

  input Metric $ Open BD1 BD2 @@;

cards;

MetricA 10 20 30 Metric2 40 50 60

run;

data temp / view=temp;

  set have;

  array _n _numeric_;

  do _n_=1 to dim(_n);

     _name_ = vname(_n(_n_));

     col1 = _n(_n_);

     output;

  end;

  keep metric _name_ col1 ;

run;

proc transpose data=temp out=want delimiter=_;

  id metric _name_ ;

  var col1;

run;

emikea
Calcite | Level 5

This is amazing. Thank you so much for taking the time to help me out.

Mike Anas

Ksharp
Super User

or pure data step code.

data have;
  input Metric $ Open BD1 BD2 @@;
cards;
MetricA 10 20 30 Metric2 40 50 60
;
run;
proc sql noprint;
create table x as
select * from
 (select distinct metric from have),
 (select name from dictionary.columns where libname='WORK' and memname='HAVE');
 quit;
 data ;
  set x end=last;
  by metric ;
  if _n_ eq 1 then call execute('data want;merge ');
  if first.metric then call execute(cats('have(where=(metric="',metric,'") rename=('));
   else call execute(cats(name,'=',metric,'_',name));
  if last.metric then call execute('))');
  if last then call execute(';drop metric;run;');
run;

Ksharp

emikea
Calcite | Level 5

Thank you Ksharp.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 8 replies
  • 1730 views
  • 6 likes
  • 5 in conversation