BookmarkSubscribeRSS Feed
newsas007
Quartz | Level 8

Hi,

My data contains information on 6 different categories. Each category has L, F, Start, Reg variables. The final variable close_diff is the variable I need to compare to get the required dataset.

For ID#1, the close_diff = -90 is equal to start_4=-90; therefore the L_before is the corresponding L_4=2, F_before=C is corresponding F_4, and Reg_before is corresponding R_4= d,e,r. The after variables are calculated using the next set. That is, L_after = L_5 which is 3, F_after = F_5, which is C and  Reg_after = Reg_5 which is c,e,f,t.

For ID#2, the close_diff = 2 and is a positive number, so therefore the before and after variables are swapped. The close_diff  = 2 is equal to Start = -5; now the L_after = L_5 = 3, F_after=F_5=C, and Reg_After = c,e,f,t. Now the L_before is from the information from before; i.e. L_before = L_4 =2 , F_before = F_4 = O, and Reg_Begore = f,d,d

When the close_diff = 0; it is considered a negative number and the same rules apply asID#1

There might be instances where there is no information and is left blank as shown in Want data

Data Have:

SubjectID L_1 F_1 Start_1 Reg_1 L_2 F_2 Start_2 Reg_2 L_3 F_3 Start_3 Reg_3 L_4 F_4 Start_4 Reg_4 L_5 F_5 Start_5 Reg_5 L_6 Fa_6 Start_6 Reg_6 Close_diff
1 -1       0       1 C -150 a,b,d,e 2 C -90 d,e,r 3 C 131 c,e,f,t 4 C 169 s,r -90
2 -1       0       1 O -116 s,g,y,u,u 2 O -36 f,d,d 3 C 2 f,e,w 4       2
3 -1       0       1 C:O -840 r,d,h,j,r 2 C -197 q,d,f,d 3       4       -197
4 1       0       1 O -1249 t,y,o,l 2 O -965 w,d,g 3 C:O -425 d,g,e,t 4 C -126 f -126
5 1       0       1 O -1245 q,d,f,d 2 O -1133 f 3 O -938 f,e,s,r 4 O -446 e,g,u -446
6 1       0       1 O -832 w,d,g 2 C 78 f 3 C 440 d,f,g,h 4       78
7 1       0       1 O -518 f 2 O -263 a,b,d,e 3 C:O -77 f,a 4 C 166 a,b,d,e -77
8 1       0       1 O -3670 f 2 O -810 s,g,y,u,u 3 O -717 f,t,h 4 C:O -292 s,g,y,u,u -292
9 1 O -1269 a,b 0 O -1181 a 1 O -700 a,b,d,e 2 O -362 f,a 3 C -183 w,e,s,f 4     f,a -183
10 1       0       1 C:O -657 s,g,y,u,u 2 C -113 f,t,h 3 C 0 s,w,r 4 C 101 f,t,h 0
11 1       0       1 O -1227 r,d,h,j,r 2 O -997 w,e,s,f 3 O -695 c,e,f,t 4 C -567 c,e,f,t -567
12 1       0 O -2525 a 1 C:O -298 t,y,o,l 2 C:O 83 s,w,r 3 C:O 477 f,e,w 4 C:O 618 f,e,w 83

 

Data Want:

SubjectID L_Before F_Before Reg_Before L_After F_After Reg_After
1 2 C d,e,r 3 C c,e,f,t
2 2 O f,d,d 3 C f,e,w
3 2 C q,d,f,d 3    
4 4 C f      
5 4 O e,g,u      
6 1 O w,d,g 2 C f
7 3 C:O f,a 4 C a,b,d,e
8 4 C:O s,g,y,u,u      
9 3 C w,e,s,f 4   f,a
10 3 C s,w,r 4 C f,t,h
11 4 C c,e,f,t      
12 1 C:O t,y,o,l 2 C:O s,w,r

 

 

6 REPLIES 6
ChrisNZ
Tourmaline | Level 20

Please provide the  data in usable format (data step or SQL insert statements).

 

newsas007
Quartz | Level 8

@ChrisNZ ,

I have attached the dataset. I had a hard time with inputting with datalines. Sorry, i have not used datalines that much for data input. Please let me know if this works

mkeintz
PROC Star

I am not going to go through the effort of converting your CSV file to a SAS dataset, so this program is not tested:

 


data want;
  set have;
  array L_ {6};
  array F_ {6};
  array REG_ $ {6};
  array START_ {6};

  b=whichn(close_diff,of start_{*});
  if b=0 then return;

  if close_diff>0 then b=b-1;

  if (1<=b<=6) then do;
    L_before=L_{b};   F_before=F_{b};    reg_before=reg_{b};
  end;
  if (0<=b<=5) then do;
    L_after=L_{b+1};  F_after=F_{b+1};   reg_after=reg_{b+1};
  end;
  drop b;
run;

 

However, I did actually look at your CSV.  The 4th close_diff value is -126, which matches start_6.  So the "before" variables get the 6th set of values, but there is no 7th set, which presumably would go to the "after" variables.

 

And you could have a similar set of problems if you had a positive close_diff matching START_1.  According to your rules the "after" would get the 1st collection, and the "before" would get what?

 

The program above avoids such "out of bounds" situations.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
newsas007
Quartz | Level 8

Hi @mkeintz :

Added the data step above. 

Also, since there is no 7th set, the observations is going to be empty. You will notice the in a few IDs where there is missing values in the 'Want' dataset.

For positive values, the before will have missing data. Hope this makes sense

newsas007
Quartz | Level 8

@ChrisNZ  

@mkeintz 

Here is the data step for inputting data

data have;
infile cards dsd dlm=",";
input ID L_1 F_1 $ Start_1 Reg_1 $ L_2 F_2 $ Start_2 Reg_2 $ L_3 F_3 $ Start_3 Reg_3 $ L_4 F_4 $ Start_4 Reg_4 $ L_5 F_5 $ Start_5 Reg_5 $ L_6 F_6 $ Start_6 Reg_6 $ Close_diff;
format Reg_2 $10. Reg_3 $10.;
cards;
1,-1,,,,0,,,,1,C,-150,"a,b,d,e",2,C,-90,"d,e,r",3,C,131,"c,e,f,t",4,C,169,"s,r",-90
2,-1,,,,0,,,,1,O,-116,"s,g,y,u,u",2,O,-36,"f,d,d",3,C,2,"f,e,w",4,,,,2
3,-1,,,,0,,,,1,C:O,-840,"r,d,h,j,r",2,C,-197,"q,d,f,d",3,,,,4,,,,-197
4,1,,,,0,,,,1,O,-1249,"t,y,o,l",2,O,-965,"w,d,g",3,C:O,-425,"d,g,e,t",4,C,-126,f,-126
5,1,,,,0,,,,1,O,-1245,"q,d,f,d",2,O,-1133,f,3,O,-938,"f,e,s,r",4,O,-446,"e,g,u",-446
6,1,,,,0,,,,1,O,-832,"w,d,g",2,C,78,f,3,C,440,"d,f,g,h",4,,,,78
7,1,,,,0,,,,1,O,-518,f,2,O,-263,"a,b,d,e",3,C:O,-77,"f,a",4,C,166,"a,b,d,e",-77
8,1,,,,0,,,,1,O,-3670,f,2,O,-810,"s,g,y,u,u",3,O,-717,"f,t,h",4,C:O,-292,"s,g,y,u,u",-292
9,1,O,-1269,"a,b",0,O,-1181,a,1,O,-700,"a,b,d,e",2,O,-362,"f,a",3,C,-183,"w,e,s,f",4,,,"f,a",-183
10,1,,,,0,,,,1,C:O,-657,"s,g,y,u,u",2,C,-113,"f,t,h",3,C,0,"s,w,r",4,C,101,"f,t,h",0
11,1,,,,0,,,,1,O,-1227,"r,d,h,j,r",2,O,-997,"w,e,s,f",3,O,-695,"c,e,f,t",4,C,-567,"c,e,f,t",-567
12,1,,,,0,O,-2525,a,1,C:O,-298,"t,y,o,l",2,C:O,83,"s,w,r",3,C:O,477,"f,e,w",4,C:O,618,"f,e,w",83
;
Run;

ChrisNZ
Tourmaline | Level 20

Please post you code using the appropriate icon.

 

The result you want is given by altering @mkeintz 's code slightly.

data WANT;
  set HAVE;
  array L_     {6}     ;
  array F_     {6}     ;
  array REG_   {6} $10 ;
  array START_ {6}     ;

  POS=whichn(CLOSE_DIFF, of START_{*});
  if POS;

  if CLOSE_DIFF > 0 then POS=POS-1;

  if 1 <= POS <= 6 then do;
    L_BEFORE=L_{POS} ;  F_BEFORE=F_{POS} ;  REG_BEFORE=reg_{POS} ;
  end;
  if 0 <= POS <= 5 then do;
    L_AFTER=L_{POS+1};  F_AFTER=F_{POS+1};  REG_AFTER=reg_{POS+1};
  end;
run;

 

 

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 1019 views
  • 0 likes
  • 3 in conversation