BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Caymon
Calcite | Level 5

I currently have a dataset that has observations sorted by times going across the top.    The observations being at different time periods.

project   q1 2008  q2 2008 q32008 q42008 q12009 q2009 q32009 q42009  ......

1              4             5          8           8           9       12      14         16

2             0              0          3           6           5        7        10           8

3             4              3          5           4          3         5        0            3

4             0              0          0           0           5        4         7           10

5             0              0          0           0           0        0         3            2

I would like to rearrange the data so that every observation beings a the same time.  Ideally the dataset

would like this:

project       t1            t2            t3         t4        t5         t6         t7      t8

1               4             5             8          8          9         12       14      16

2               3             6             5          7         10        8           .        .

3               4             3             5         4          3          5         0         3

4               5             4             7        10          .           .         .           .

5               3             2              .           .          .          .          .         .

Any suggestions would be greatly appreciated. 

Thanks

1 ACCEPTED SOLUTION

Accepted Solutions
art297
Opal | Level 21

I'm not really sure what you want to do and, besides, I don't have a way tonight to test the following code. However, that said, I think that the following accomplishes what you showed in your example:

data have;

  input project   q12008  q22008 q32008 q42008 q12009 q2009 q32009 q42009;

  cards;

1              4             5          8           8           9       12      14         16

2             0              0          3           6           5        7        10           8

3             4              3          5           4          3         5        0            3

4             0              0          0           0           5        4         7           10

5             0              0          0           0           0        0         3            2

;

data want;

  set have;

  array projects(*) q12008--q42009;

  do i=1 to dim(projects);

    if projects(i) gt 0 then do;

      found=i;

      i=dim(projects)+1;

  end;

  j=0;

  do i=found to dim(projects);

    j+1;

    projects(j)=projects(i);

  end;

  do i=dim(projects)-found+2 to dim(projects);

    call missing(projects(i);

  end;

run;

View solution in original post

12 REPLIES 12
art297
Opal | Level 21

I'm not really sure what you want to do and, besides, I don't have a way tonight to test the following code. However, that said, I think that the following accomplishes what you showed in your example:

data have;

  input project   q12008  q22008 q32008 q42008 q12009 q2009 q32009 q42009;

  cards;

1              4             5          8           8           9       12      14         16

2             0              0          3           6           5        7        10           8

3             4              3          5           4          3         5        0            3

4             0              0          0           0           5        4         7           10

5             0              0          0           0           0        0         3            2

;

data want;

  set have;

  array projects(*) q12008--q42009;

  do i=1 to dim(projects);

    if projects(i) gt 0 then do;

      found=i;

      i=dim(projects)+1;

  end;

  j=0;

  do i=found to dim(projects);

    j+1;

    projects(j)=projects(i);

  end;

  do i=dim(projects)-found+2 to dim(projects);

    call missing(projects(i);

  end;

run;

Caymon
Calcite | Level 5

Thanks! this is exactly what I was looking for. This works perfectly.

I made only two tiny changes to your code:

art297 wrote:

data want;

  set have;

  array projects(*) q12008--q42009;

  do i=1 to dim(projects);

    if projects(i) gt 0 then do;

      found=i;

      i=dim(projects)+1;

  end;

  j=0;

  end;

  do i=found to dim(projects);

    j+1;

    projects(j)=projects(i);

  end;

  do i=dim(projects)-found+2 to dim(projects);

    call missing(projects(i));

  end;

run;

data_null__
Jade | Level 19

You can use PROC TRANSPOSE for a more concise solution.  If your data had extra numeric variables you would need to include a VAR Q:; statment in the first PROC TRANSPOSE.  

EDIT: Art brought it to my attention that my program did not meet the specs.  This corrected version seems to provide the correct result. Not so concise.  But still much simpler that shifting value in arrays.

data have;
  input project q12008 q22008 q32008 q42008 q12009 q2009 q32009 q42009;
  cards;
1 4 5 8 8 9 12 14 16
2 0 0 3 6 5  7 10  8
3 4 3 5 4 3  5  0  3
4 0 0 0 0 5  4  7 10
5 0 0 0 0 0  0  3  2
;;;;
proc transpose out=tall;
   by project;
   run;
data tallV / view=tallV;
   set tall;
   by project col1 notsorted;
   if first.project and col1 eq 0 then _iorc_=1;
   if _iorc_ and col1 eq 0 then do;
      if last.col1 then _iorc_ + -1;
      delete;
      end;
   run;
proc transpose prefix=T out=T(drop=_name_);
   by project;
   run;
proc print;
   run;

Obs    project    T1    T2    T3    T4    T5    T6    T7    T8

1        1        4     5     8     8     9    12    14    16

2        2        3     6     5     7    10     8     .     .

3        3        4     3     5     4     3     5     0     3

4        4        5     4     7    10     .     .     .     .

5        5        3     2     .     .     .     .     .     .


art297
Opal | Level 21

Datanull,  According to the OP's specs 0s that came after a non-zero value had to be kept.

data_null__
Jade | Level 19

Art,

Here is a method using maximum obfuscation. :smileymischief:  It is the most direct way I could come up with to "slide" variable values, in this case to the left.  As long as dim(q)*8 is less than 32767 this should work.

data have;

   input project q12008 q22008 q32008 q42008 q12009 q2009 q32009 q42009;

   array q

  • q:;
  •    do _n_ = 1 to dim(q) by 1 until(q[_n_] gt 0); end;

       if _n_ eq 1 then return;

       _iorc_   = (dim(q)-_n_+1)*8;

       _infile_ = peekClong(addrlong(q[_n_]),_iorc_);

       call missing(of q

  • );
  •    call pokelong(_infile_,addrlong(q[1]),_iorc_);

       cards;

    1 4 5 8 8 9 12 14 16

    2 0 0 3 6 5  7 10  8

    3 4 3 5 4 3  5  0  3

    4 0 0 0 0 5  4  7 10

    5 0 0 0 0 0  0  3  2

    ;;;;

    proc print;

       run;

    Message was edited by: data _null_ corrected spelling.

    art297
    Opal | Level 21

    Datanull,  Art C would be proud(?) of your efforts (see, e.g., http://www.caloxy.com/papers/63_TT05.pdf )

    Peter_C
    Rhodochrosite | Level 12

    that looks like one from PaulDorfman although better known as hashman, he is a great proponent of  memory functions for performance

    data_null__
    Jade | Level 19

    Yes, I was thinking that code was somewhat Dorfmanesk especially the use of _N_, _IORC_ and _INFILE_  for purposes for which they might not otherwise be used. Everything in that code was learned from Mr. Dorfman.

    Ksharp
    Super User

    Null. I am big surprised that the order of data will not be changed after proc transpose and deleting the first several zero.But still not suited for large dataset.

    Ksharp

    data_null__
    Jade | Level 19

    Ksharp wrote:

    Null. I am big surprised that the order of data will not be changed after proc transpose and deleting the first several zero.But still not suited for large dataset.

    Ksharp


    I don't know why you are surprised that the order of the data would not change.  Nothing is done to change it.  What did you think would happen?

    I expect it would be fast enough most of the time.

    Using PROC TRANSPOSE makes this what I like to call a "SAS Program".  A program that uses SAS features not general coding, arrays, subscripts, counters and the rest.  Like the difference between using PROC FREQ vs programming counts in a data step.  Both produce the same result PROC FREQ being easier to code.

    Ksharp
    Super User

    Null.

    It is apparent that I learned somthing new from you again. Thx.

    Ksharp

    Ksharp
    Super User

    I know art.t has already given an answer.

    But I just want to try it. Forgive my curiosity. Smiley Happy

     
    
    data have;
      input project   q12008  q22008 q32008 q42008 q12009 q2009 q32009 q42009;
      cards;
    1              4             5          8           8           9       12      14         16
    2             0              0          3           6           5        7        10           8
    3             4              3          5           4          3         5        0            3
    4             0              0          0           0           5        4         7           10
    5             0              0          0           0           0        0         3            2
    ;
    run;
    
     
    
    data want(drop=i j k );
      set have;
      array projects{*} q12008--q42009;
      i=1;
      do while (projects{i} eq 0 and i le dim(projects) );
      projects{i}=.;i+1;
      end;
      if i gt 1 and i lt dim(projects) then do;
        do j=1 to dim(projects)-1;
          do k=j+1 to dim(projects);
           if missing(projects{j}) then do;
                                        projects{j}=projects{k};
                                               projects{k}=.;
                                               end;
          end;
         end;
                                               end;
    run;
    
    
    
    

    Ksharp

    消息编辑者为:xia keshan Ops. Not Notice totally. It is correct code.

    sas-innovate-2024.png

    Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

    Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

     

    Register now!

    How to Concatenate Values

    Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

    Find more tutorials on the SAS Users YouTube channel.

    Click image to register for webinarClick image to register for webinar

    Classroom Training Available!

    Select SAS Training centers are offering in-person courses. View upcoming courses for:

    View all other training opportunities.

    Discussion stats
    • 12 replies
    • 1324 views
    • 0 likes
    • 5 in conversation