BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
tmcwill
Fluorite | Level 6

If I have a dataset with X rows and Y columns, I'm trying to essentially make every Z number of sequential rows concatenated into one, so I end up with X/Z rows and Y*Z columns. 

 

For example, X=6, Y=3, Z=2. 

 

Have:

VAR1VAR2VAR3
123
456
789
101112
131415
161718

 

Want: 

VAR1VAR2VAR3VAR4VAR5VAR6
123456
789101112
131415161718

 

It seems simple enough, but I can't quite figure out a way that works. I should also note that the for the data I am using this on, X will always be divisible by Z. Thanks! 

 

Edit: 

The data provided above are just an example to show the type of thing I'm looking for. In reality, the dataset will have some multiple of 24 as the number of rows. Each row represents one hour out of one day, with each 24 rows coming from the same day. There will be 120 columns of numeric data for each row (no missing data). The goal is to get to the point of having each row be one day, with 2,880 (120x24) columns of data. What I'm going to end up doing is writing a macro to iterate the steps over multiple datasets with a different number of "days" (hence different number of total rows). Thanks to everyone who has responded! 

1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

How about a couple of DO loops?  Put the outer one around the SET statement.

data have;
  input VAR1	VAR2	VAR3 ;
cards;
1	2	3
4	5	6
7	8	9
10	11	12
13	14	15
16	17	18
;

%let rows=2;
%let cols=3;
%let new_cols=%eval(&rows*&cols);

data want;
do row=1 to &rows;
  set have;
  array var [&cols] ;
  array newvar [&new_cols] ;
  do col=1 to &cols;
    index=sum(index,1);
    newvar[index] = var[col];
  end;
end;
drop var: row col index;
run;

proc print;
run;
Obs    newvar1    newvar2    newvar3    newvar4    newvar5    newvar6

 1         1          2          3          4          5          6
 2         7          8          9         10         11         12
 3        13         14         15         16         17         18

View solution in original post

6 REPLIES 6
biopharma
Quartz | Level 8

Your problem is perhaps less complicated and could be solved by a procedure like TRANSPOSE. It is not clear from your description. However, if you really want a solution as described, here is one solution.

 

data have ;
   input var1 var2 var3 ;
   cards ;
1 2 3
4 5 6
7 8 9
10 11 12
13 14 15
16 17 18
;
run ;

data want (drop=_:) ;
   set have ;
   retain _v1 - _v3 ;
   array _v(3) ;
   array var(6) ;
   if mod(_n_,2)=0 then 
      do ;
         do _i=4 to dim(var) ;
            var(_i) = var(_i-3) ;
            var(_i-3) = _v(_i-3) ;
         end ;
         output ;
      end ;
   do _i = 1 to dim(_v) ;
      _v(_i) = var(_i) ;
   end ;
run ; 
      
ballardw
Super User

Is the example data your actual data or some sort of  example? If an example then you may need to tell us if any of the variables are character and what the lengths will need to be.

Will the data always have a structure such that the number of cells is exactly equal to the x times y?

 

One way is to write out the data and read it:

data have;
  input v1 v2 v3;
datalines;
1	2	3
4	5	6
7	8	9
10	11	12
13	14	15
16	17	18
;
proc export data=have
    outfile='c:\file.csv'
    replace
    dbms=dlm;
    delimiter=',';
run;

data want;
   infile  'c:\file.csv' dlm=',' flowover firstobs=2;
   input v1 - v6; /*<= the magic 6 is your y*z value*/
run;

But since you don't provide rules for if you have 30 cells and want 8 variables ...

 

 

Tom
Super User Tom
Super User

How about a couple of DO loops?  Put the outer one around the SET statement.

data have;
  input VAR1	VAR2	VAR3 ;
cards;
1	2	3
4	5	6
7	8	9
10	11	12
13	14	15
16	17	18
;

%let rows=2;
%let cols=3;
%let new_cols=%eval(&rows*&cols);

data want;
do row=1 to &rows;
  set have;
  array var [&cols] ;
  array newvar [&new_cols] ;
  do col=1 to &cols;
    index=sum(index,1);
    newvar[index] = var[col];
  end;
end;
drop var: row col index;
run;

proc print;
run;
Obs    newvar1    newvar2    newvar3    newvar4    newvar5    newvar6

 1         1          2          3          4          5          6
 2         7          8          9         10         11         12
 3        13         14         15         16         17         18
Reeza
Super User
SAS iterates over rows automatically and you manually need to loop over columns. In this case, it's usually easier to leave your data in a long format as its then easier to work with as it loops automatically. If this is time data and you're missing time points you can add that in using PROC TIMESERIES/EXPAND. There are options to impute missing as averages or last observation carried forward.
Ksharp
Super User

Do you have SAS/IML ?

 

data have;
  input VAR1	VAR2	VAR3 ;
cards;
1	2	3
4	5	6
7	8	9
10	11	12
13	14	15
16	17	18
;

%let z=3;
proc iml;
use have;
read all var _num_ into x;
close;
z=shape(x,&z,0,.);
create want from z;
append from z;
close;
quit;

RichardDeVen
Barite | Level 11

Presuming the original variable names are actually numerically suffixed 1..Y and the new columns use suffixes Y+1...Z*Y

If the original names are for example Alpha, Beta, Grom, Zed, the new columns would have to be something like Alpha_2...Zed_2, Alpha_3...Zed_3

 

One approach is to write a macro to create a data step that uses arrays to pivot multiple rows into a single row.  Let's call it 'florping'

 

Data generator for testing

Spoiler
%macro makedata(data,rows,cols,var);
  data &data;
    do r = 1 to &rows;
      do c = 1 to &cols;
        array &var(&cols);
        k + 1;
        &var(c) = k;
      end;
      output;
    end;
    drop r c k ;
  run;
%mend;

options mprint;

%makedata (one,10,3,zeta);
%makedata (two,30,24,zoink);

Florp macro

%macro florp(data,var,count,out,format);

  %local N TOP; 

  proc transpose data=&data(obs=0) out=cols;
  proc sql noprint;
    select count(*) into :N trimmed from cols 
    where upcase(_name_) like %upcase("&var.%");

  %let TOP = %eval (&N * &count);

  data &out (drop=&var.: rename=(florp1-florp&TOP=&var.1-&var.&TOP));
    array florp (&TOP);
    retain _ix 0 florp;
    set &data end=done;
    array &var(&N);

    do _i_ = 1 to &N;
      _ix + 1;
      florp(_ix) = &var(_i_);
    end;

    if _ix = &TOP or done then do;
      OUTPUT;
      call missing (of florp(*));
      _ix = 0;
    end;
    drop _:;
    %if %length(&format) %then %do;
    format florp: &format.;
    %end;
  run;
%mend;

%* count parameters is how many rows to combine into one;
%florp(data=one, var=zeta, count=3, out=one_florped, format=4.);
%florp(data=two, var=zoink, count=7, out=two_florped, format=4.);

 

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 16. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 2180 views
  • 1 like
  • 7 in conversation