Hi ,
I have two Datasets with Groups . Data1 has DIX in group1 and Data2 has DIX in group2 so we are not missing data . How do i do sorting so that i can say both DIX and PTE are present in both Data1 and Data2 and they are not missing
Data1 | ||
Policy_num | Group1 | Group2 |
100 | DIX | PTE |
Data2 | ||
Policy_num | Group1 | Group2 |
100 | PTE | DIX |
Can anyone please help
I don't understand.
Can you post some actual data?
Something like the below, the kind we could actually run through some SAS code.
DATA Have;
input var1 var2 var3;
datalines;
4 2 3
3 2 3
3 2 4
4 2 4
1 3 4
2 3 2
4 4 2
1 4 4
4 4 3
3 4 3
4 4 2
2 4 2
1 4 2
2 4 5
5 1 1
2 4 4
4 3 4
5 2 3
4 2 5
5 4 3
;
RUN;
Jim
Hi ,
Here below i am validating the groups present in Data1 for each policy_num are present in Data2 irrespective of the order
For example for policy_num 100 we have DIX and PTE in Data1 and DIX and PTE in Data2 . How do i validate this scenario ?
Data1 | ||
Policy_num | Group1 | Group2 |
100 | DIX | PTE |
Data2 | ||
Policy_num | Group1 | Group2 |
100 | PTE | DIX |
Transpose the group variables, so that you have only one column containing all the values. Then it's easy to do a join.
I'm not completely sure if I understand all the possible permutations of your data, but the following code (which is a bit hastily assembled):
Data Data1;
DROP Var_Cnt;
INFILE DATALINES MISSOVER;
INPUT Policy_num $
Group1 $
Group2 $
;
Var_Cnt + 1;
CALL SYMPUTX(CATS('Var', PUT(Var_Cnt, 3.)), VVALUEX(CATS('Group', PUT(Var_Cnt, 3.))));
Var_Cnt + 1;
CALL SYMPUTX(CATS('Var', PUT(Var_Cnt, 3.)), VVALUEX(CATS('Group', PUT(Var_Cnt, 3.))));
DATALINES;
100 DIX PTE
200 DIX
300 PTE
400 PTE DIX
;
RUN;
%put _user_;
DATA Data2;
INFILE DATALINES MISSOVER;
INPUT Policy_num $
Group1 $
Group2 $
;
DATALINES;
100 PTE DIX
200 PTE
300 DIX
400 DIX PTE
;
RUN;
DATA Summary_Data (KEEP=Policy_Num &Var1 &Var2);
LENGTH Policy_Num $8;
ARRAY d1_arr [*] $ d1_GROUP1 - d1_GROUP2;
ARRAY d2_arr [*] $ d2_GROUP1 - d2_GROUP2;
MERGE Data1 (In=d1 rename=(group1=d1_group1 group2=d1_group2))
Data2 (In=d2 rename=(group1=d2_group1 group2=d2_group2))
;
BY Policy_Num;
IF D1 AND D2;
DO i = 1 TO DIM(d1_Arr);
IF i = 1 THEN
IF D1_ARR[i] IN d2_arr AND
NOT MISSING(D1_ARR[i]) THEN
&Var1 = 1;
ELSE
&Var1 = 0;
ELSE
IF i = 2 THEN
IF D1_ARR[i] IN d2_arr AND
NOT MISSING(D1_ARR[i]) THEN
&Var2 = 1;
ELSE
&Var2 = 0;
END;
RUN;
Will produce the following results:
1 means that it is present in both datasets. 0 means that it was not present in both.
Jim
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.