Hi ,
I have two Datasets with Groups . Data1 has DIX in group1 and Data2 has DIX in group2 so we are not missing data . How do i do sorting so that i can say both DIX and PTE are present in both Data1 and Data2 and they are not missing
| Data1 | ||
| Policy_num | Group1 | Group2 |
| 100 | DIX | PTE |
| Data2 | ||
| Policy_num | Group1 | Group2 |
| 100 | PTE | DIX |
Can anyone please help
I don't understand.
Can you post some actual data?
Something like the below, the kind we could actually run through some SAS code.
DATA Have;
input var1 var2 var3;
datalines;
4 2 3
3 2 3
3 2 4
4 2 4
1 3 4
2 3 2
4 4 2
1 4 4
4 4 3
3 4 3
4 4 2
2 4 2
1 4 2
2 4 5
5 1 1
2 4 4
4 3 4
5 2 3
4 2 5
5 4 3
;
RUN;
Jim
Hi ,
Here below i am validating the groups present in Data1 for each policy_num are present in Data2 irrespective of the order
For example for policy_num 100 we have DIX and PTE in Data1 and DIX and PTE in Data2 . How do i validate this scenario ?
| Data1 | ||
| Policy_num | Group1 | Group2 |
| 100 | DIX | PTE |
| Data2 | ||
| Policy_num | Group1 | Group2 |
| 100 | PTE | DIX |
Transpose the group variables, so that you have only one column containing all the values. Then it's easy to do a join.
I'm not completely sure if I understand all the possible permutations of your data, but the following code (which is a bit hastily assembled):
Data Data1;
DROP Var_Cnt;
INFILE DATALINES MISSOVER;
INPUT Policy_num $
Group1 $
Group2 $
;
Var_Cnt + 1;
CALL SYMPUTX(CATS('Var', PUT(Var_Cnt, 3.)), VVALUEX(CATS('Group', PUT(Var_Cnt, 3.))));
Var_Cnt + 1;
CALL SYMPUTX(CATS('Var', PUT(Var_Cnt, 3.)), VVALUEX(CATS('Group', PUT(Var_Cnt, 3.))));
DATALINES;
100 DIX PTE
200 DIX
300 PTE
400 PTE DIX
;
RUN;
%put _user_;
DATA Data2;
INFILE DATALINES MISSOVER;
INPUT Policy_num $
Group1 $
Group2 $
;
DATALINES;
100 PTE DIX
200 PTE
300 DIX
400 DIX PTE
;
RUN;
DATA Summary_Data (KEEP=Policy_Num &Var1 &Var2);
LENGTH Policy_Num $8;
ARRAY d1_arr [*] $ d1_GROUP1 - d1_GROUP2;
ARRAY d2_arr [*] $ d2_GROUP1 - d2_GROUP2;
MERGE Data1 (In=d1 rename=(group1=d1_group1 group2=d1_group2))
Data2 (In=d2 rename=(group1=d2_group1 group2=d2_group2))
;
BY Policy_Num;
IF D1 AND D2;
DO i = 1 TO DIM(d1_Arr);
IF i = 1 THEN
IF D1_ARR[i] IN d2_arr AND
NOT MISSING(D1_ARR[i]) THEN
&Var1 = 1;
ELSE
&Var1 = 0;
ELSE
IF i = 2 THEN
IF D1_ARR[i] IN d2_arr AND
NOT MISSING(D1_ARR[i]) THEN
&Var2 = 1;
ELSE
&Var2 = 0;
END;
RUN;
Will produce the following results:
1 means that it is present in both datasets. 0 means that it was not present in both.
Jim
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.