BookmarkSubscribeRSS Feed
hexx18
Quartz | Level 8

Hi ,

 

I have two Datasets with Groups . Data1 has DIX in group1 and Data2 has DIX in group2 so we are not missing data . How do i do sorting so that i can say both DIX and PTE are present in both Data1 and Data2 and they are not missing 

 

 Data1 
Policy_numGroup1Group2
100DIXPTE
   
 Data2 
Policy_numGroup1Group2
100PTEDIX

 

Can anyone please help 

4 REPLIES 4
jimbarbour
Meteorite | Level 14

I don't understand.

 

Can you post some actual data?

 

Something like the below, the kind we could actually run through some SAS code.

DATA	Have;
	input var1 var2 var3;

datalines;
4 2 3
3 2 3
3 2 4
4 2 4
1 3 4
2 3 2
4 4 2
1 4 4
4 4 3
3 4 3
4 4 2
2 4 2
1 4 2
2 4 5
5 1 1
2 4 4
4 3 4
5 2 3
4 2 5
5 4 3
;
RUN;

Jim

hexx18
Quartz | Level 8

Hi ,

Here below i am validating the groups present in Data1 for each policy_num are present in Data2 irrespective of the order 

For example  for policy_num 100  we have DIX and PTE in Data1 and DIX and PTE in Data2 . How do i validate this scenario ?

 Data1 
Policy_numGroup1Group2
100DIXPTE
   
 Data2 
Policy_numGroup1Group2
100PTEDIX
jimbarbour
Meteorite | Level 14

I'm not completely sure if I understand all the possible permutations of your data, but the following code (which is a bit hastily assembled):

Data	Data1;
	DROP	Var_Cnt;
	INFILE	DATALINES	MISSOVER;
	INPUT	Policy_num	$
			Group1		$
			Group2		$
			;
	Var_Cnt	+	1;
	CALL	SYMPUTX(CATS('Var', PUT(Var_Cnt, 3.)), VVALUEX(CATS('Group', PUT(Var_Cnt, 3.))));
	Var_Cnt	+	1;
	CALL	SYMPUTX(CATS('Var', PUT(Var_Cnt, 3.)), VVALUEX(CATS('Group', PUT(Var_Cnt, 3.))));
DATALINES;
100 DIX PTE
200 DIX
300 PTE
400 PTE DIX
;
RUN;
 	 	 
%put _user_;

DATA 	Data2;
	INFILE	DATALINES	MISSOVER;
	INPUT	Policy_num	$
			Group1		$
			Group2		$
			;
DATALINES;
100 PTE DIX
200 PTE
300 DIX
400 DIX PTE
;
RUN;

DATA	Summary_Data	(KEEP=Policy_Num &Var1 &Var2);
	LENGTH	Policy_Num	$8;
	ARRAY	d1_arr	[*]	$	d1_GROUP1	-	d1_GROUP2;
	ARRAY	d2_arr	[*]	$	d2_GROUP1	-	d2_GROUP2;

	MERGE	Data1	(In=d1	rename=(group1=d1_group1 group2=d1_group2))
			Data2	(In=d2	rename=(group1=d2_group1 group2=d2_group2))
			;
			BY	Policy_Num;
	IF	D1	AND	D2;

	DO	i = 1	TO	DIM(d1_Arr);
		IF	i	=	1					THEN
			IF	D1_ARR[i]	IN	d2_arr	AND
				NOT	MISSING(D1_ARR[i])	THEN
					&Var1	=	1;
				ELSE
					&Var1	=	0;
		ELSE
		IF	i	=	2					THEN
			IF	D1_ARR[i]	IN	d2_arr	AND
				NOT	MISSING(D1_ARR[i])	THEN
					&Var2	=	1;
				ELSE
					&Var2	=	0;
	END;
RUN;

Will produce the following results:

jimbarbour_0-1601407514761.png

 

1 means that it is present in both datasets.  0 means that it was not present in both.

 

Jim

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 1120 views
  • 4 likes
  • 3 in conversation