BookmarkSubscribeRSS Feed
newsas007
Quartz | Level 8

Hi,

My data contains information on 6 different categories. Each category has L, F, Start, Reg variables. The final variable close_diff is the variable I need to compare to get the required dataset.

For ID#1, the close_diff = -90 is equal to start_4=-90; therefore the L_before is the corresponding L_4=2, F_before=C is corresponding F_4, and Reg_before is corresponding R_4= d,e,r. The after variables are calculated using the next set. That is, L_after = L_5 which is 3, F_after = F_5, which is C and  Reg_after = Reg_5 which is c,e,f,t.

For ID#2, the close_diff = 2 and is a positive number, so therefore the before and after variables are swapped. The close_diff  = 2 is equal to Start = -5; now the L_after = L_5 = 3, F_after=F_5=C, and Reg_After = c,e,f,t. Now the L_before is from the information from before; i.e. L_before = L_4 =2 , F_before = F_4 = O, and Reg_Begore = f,d,d

When the close_diff = 0; it is considered a negative number and the same rules apply asID#1

There might be instances where there is no information and is left blank as shown in Want data

Data Have:

SubjectID L_1 F_1 Start_1 Reg_1 L_2 F_2 Start_2 Reg_2 L_3 F_3 Start_3 Reg_3 L_4 F_4 Start_4 Reg_4 L_5 F_5 Start_5 Reg_5 L_6 Fa_6 Start_6 Reg_6 Close_diff
1 -1       0       1 C -150 a,b,d,e 2 C -90 d,e,r 3 C 131 c,e,f,t 4 C 169 s,r -90
2 -1       0       1 O -116 s,g,y,u,u 2 O -36 f,d,d 3 C 2 f,e,w 4       2
3 -1       0       1 C:O -840 r,d,h,j,r 2 C -197 q,d,f,d 3       4       -197
4 1       0       1 O -1249 t,y,o,l 2 O -965 w,d,g 3 C:O -425 d,g,e,t 4 C -126 f -126
5 1       0       1 O -1245 q,d,f,d 2 O -1133 f 3 O -938 f,e,s,r 4 O -446 e,g,u -446
6 1       0       1 O -832 w,d,g 2 C 78 f 3 C 440 d,f,g,h 4       78
7 1       0       1 O -518 f 2 O -263 a,b,d,e 3 C:O -77 f,a 4 C 166 a,b,d,e -77
8 1       0       1 O -3670 f 2 O -810 s,g,y,u,u 3 O -717 f,t,h 4 C:O -292 s,g,y,u,u -292
9 1 O -1269 a,b 0 O -1181 a 1 O -700 a,b,d,e 2 O -362 f,a 3 C -183 w,e,s,f 4     f,a -183
10 1       0       1 C:O -657 s,g,y,u,u 2 C -113 f,t,h 3 C 0 s,w,r 4 C 101 f,t,h 0
11 1       0       1 O -1227 r,d,h,j,r 2 O -997 w,e,s,f 3 O -695 c,e,f,t 4 C -567 c,e,f,t -567
12 1       0 O -2525 a 1 C:O -298 t,y,o,l 2 C:O 83 s,w,r 3 C:O 477 f,e,w 4 C:O 618 f,e,w 83

 

Data Want:

SubjectID L_Before F_Before Reg_Before L_After F_After Reg_After
1 2 C d,e,r 3 C c,e,f,t
2 2 O f,d,d 3 C f,e,w
3 2 C q,d,f,d 3    
4 4 C f      
5 4 O e,g,u      
6 1 O w,d,g 2 C f
7 3 C:O f,a 4 C a,b,d,e
8 4 C:O s,g,y,u,u      
9 3 C w,e,s,f 4   f,a
10 3 C s,w,r 4 C f,t,h
11 4 C c,e,f,t      
12 1 C:O t,y,o,l 2 C:O s,w,r

 

 

6 REPLIES 6
ChrisNZ
Tourmaline | Level 20

Please provide the  data in usable format (data step or SQL insert statements).

 

newsas007
Quartz | Level 8

@ChrisNZ ,

I have attached the dataset. I had a hard time with inputting with datalines. Sorry, i have not used datalines that much for data input. Please let me know if this works

mkeintz
PROC Star

I am not going to go through the effort of converting your CSV file to a SAS dataset, so this program is not tested:

 


data want;
  set have;
  array L_ {6};
  array F_ {6};
  array REG_ $ {6};
  array START_ {6};

  b=whichn(close_diff,of start_{*});
  if b=0 then return;

  if close_diff>0 then b=b-1;

  if (1<=b<=6) then do;
    L_before=L_{b};   F_before=F_{b};    reg_before=reg_{b};
  end;
  if (0<=b<=5) then do;
    L_after=L_{b+1};  F_after=F_{b+1};   reg_after=reg_{b+1};
  end;
  drop b;
run;

 

However, I did actually look at your CSV.  The 4th close_diff value is -126, which matches start_6.  So the "before" variables get the 6th set of values, but there is no 7th set, which presumably would go to the "after" variables.

 

And you could have a similar set of problems if you had a positive close_diff matching START_1.  According to your rules the "after" would get the 1st collection, and the "before" would get what?

 

The program above avoids such "out of bounds" situations.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
newsas007
Quartz | Level 8

Hi @mkeintz :

Added the data step above. 

Also, since there is no 7th set, the observations is going to be empty. You will notice the in a few IDs where there is missing values in the 'Want' dataset.

For positive values, the before will have missing data. Hope this makes sense

newsas007
Quartz | Level 8

@ChrisNZ  

@mkeintz 

Here is the data step for inputting data

data have;
infile cards dsd dlm=",";
input ID L_1 F_1 $ Start_1 Reg_1 $ L_2 F_2 $ Start_2 Reg_2 $ L_3 F_3 $ Start_3 Reg_3 $ L_4 F_4 $ Start_4 Reg_4 $ L_5 F_5 $ Start_5 Reg_5 $ L_6 F_6 $ Start_6 Reg_6 $ Close_diff;
format Reg_2 $10. Reg_3 $10.;
cards;
1,-1,,,,0,,,,1,C,-150,"a,b,d,e",2,C,-90,"d,e,r",3,C,131,"c,e,f,t",4,C,169,"s,r",-90
2,-1,,,,0,,,,1,O,-116,"s,g,y,u,u",2,O,-36,"f,d,d",3,C,2,"f,e,w",4,,,,2
3,-1,,,,0,,,,1,C:O,-840,"r,d,h,j,r",2,C,-197,"q,d,f,d",3,,,,4,,,,-197
4,1,,,,0,,,,1,O,-1249,"t,y,o,l",2,O,-965,"w,d,g",3,C:O,-425,"d,g,e,t",4,C,-126,f,-126
5,1,,,,0,,,,1,O,-1245,"q,d,f,d",2,O,-1133,f,3,O,-938,"f,e,s,r",4,O,-446,"e,g,u",-446
6,1,,,,0,,,,1,O,-832,"w,d,g",2,C,78,f,3,C,440,"d,f,g,h",4,,,,78
7,1,,,,0,,,,1,O,-518,f,2,O,-263,"a,b,d,e",3,C:O,-77,"f,a",4,C,166,"a,b,d,e",-77
8,1,,,,0,,,,1,O,-3670,f,2,O,-810,"s,g,y,u,u",3,O,-717,"f,t,h",4,C:O,-292,"s,g,y,u,u",-292
9,1,O,-1269,"a,b",0,O,-1181,a,1,O,-700,"a,b,d,e",2,O,-362,"f,a",3,C,-183,"w,e,s,f",4,,,"f,a",-183
10,1,,,,0,,,,1,C:O,-657,"s,g,y,u,u",2,C,-113,"f,t,h",3,C,0,"s,w,r",4,C,101,"f,t,h",0
11,1,,,,0,,,,1,O,-1227,"r,d,h,j,r",2,O,-997,"w,e,s,f",3,O,-695,"c,e,f,t",4,C,-567,"c,e,f,t",-567
12,1,,,,0,O,-2525,a,1,C:O,-298,"t,y,o,l",2,C:O,83,"s,w,r",3,C:O,477,"f,e,w",4,C:O,618,"f,e,w",83
;
Run;

ChrisNZ
Tourmaline | Level 20

Please post you code using the appropriate icon.

 

The result you want is given by altering @mkeintz 's code slightly.

data WANT;
  set HAVE;
  array L_     {6}     ;
  array F_     {6}     ;
  array REG_   {6} $10 ;
  array START_ {6}     ;

  POS=whichn(CLOSE_DIFF, of START_{*});
  if POS;

  if CLOSE_DIFF > 0 then POS=POS-1;

  if 1 <= POS <= 6 then do;
    L_BEFORE=L_{POS} ;  F_BEFORE=F_{POS} ;  REG_BEFORE=reg_{POS} ;
  end;
  if 0 <= POS <= 5 then do;
    L_AFTER=L_{POS+1};  F_AFTER=F_{POS+1};  REG_AFTER=reg_{POS+1};
  end;
run;

 

 

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 6 replies
  • 1073 views
  • 0 likes
  • 3 in conversation