DATA Step, Macro, Functions and more

How to get a distinct list of words in a variable

Accepted Solution Solved
Reply
Regular Contributor
Posts: 247
Accepted Solution

How to get a distinct list of words in a variable

I have a dataset that is read in from a control file that lists visit IDs, names, and windows as well as anchor datasets if applicable.  I then merge this with eCRF data and external lab data to compare visit dates to make sure they are within windows.  My concern is that there may be more than one anchor dataset, and I want to be able to get a distinct list of anchor datasets where applicable, but I'm not sure how to go about this.  Below is my current thought process, but this is a specific example.  I need to generalize my code inside a macro to fit any number of anchor datasets.  I think my biggest concern is how to get the maximum number of "words" in the ANCHOR variable so that I can loop through it.  There may even be a more efficient way to do it than what I have below.

 

In short, below is the code to get a sample dataset.  I want to take the ANCHOR variable and get a list of distinct values, which in this case would be RD_DOSEL, RD_DOSEL2, RD_DOSEL3, RD_DOSEL4, RD_DOSEL5.  I can then use this to loop inside a macro to read each dataset.

 

 

data viswin;
	input VISITID 1-5 VISITMNEMONIC $ 9-24 MIN_WINDOW 25-27 MAX_WINDOW 29 ANCHOR $ 33-63;
	cards;
30101	Day 1 (Part 1)	0	0	n/a
30115	Day 15 (Pat 1)	-2	2	n/a
30199	SFUP (Part 1)	-7	7	RD_DOSEL, RD_DOSEL2, RD_DOSEL3
30201	Day 1 (Part 2)	0	0	n/a
30215	Day 15 (Part 2)	-2	2	n/a
30299	SFUP (Part 2)	-7	7	RD_DOSEL, RD_DOSEL4
30301	Day 1 (Part 3)	0	0	n/a
30315	Day 15 (Part 3)	-3	3	n/a
30399	SFUP (Part 3)	-7	7	RD_DOSEL, RD_DOSEL5
;
run;

data anchor;
set viswin (where=(anchor^='n/a'));
ndset=countw(anchor);
dset1=scan(anchor,1);
dset2=scan(anchor,2);
dset3=scan(anchor,3);
run;

data anchor2;
set anchor (keep=dset1 rename=(dset1=dset))
anchor (keep=dset2 rename=(dset2=dset))
anchor (keep=dset3 rename=(dset3=dset));
if dset='' then delete;
run;
proc sort data=anchor2 nodupkey;
by dset;
run;

 


Accepted Solutions
Solution
‎02-20-2018 11:27 AM
Contributor
Posts: 47

Re: How to get a distinct list of words in a variable

Posted in reply to djbateman
Hi This does the same as your code, but is a little less complicated. * split multiple anchors in one record per value; data anchor; set viswin (where=(anchor^='n/a')); do i = 1 to countw(anchor); dset=scan(anchor,i); output; end; run; proc sql noprint; * noprint to avoid output-display from second select; * get distinct list; select distinct dset into :anchor_list separated by ' ' from anchor; * count; select count(distinct(dset)) into: anchor_count from anchor; quit; %let anchor_count = &anchor_count; * to get rid of blanks; %put &=anchor_list; %put &=anchor_count;

View solution in original post


All Replies
Regular Contributor
Posts: 247

Re: How to get a distinct list of words in a variable

[ Edited ]
Posted in reply to djbateman

Okay.  I have come up with a piece of generalized code.  But can someone review this and let me know if there is a more efficient way to do this?  The code just seems a little lengthy for what I'm trying to do.

 

%macro anchor;
	data anchor;
		set viswin (where=(anchor^='n/a'));
		ndset=countw(anchor);
	run;
	proc sql noprint;
		select max(ndset) into :nanchor from anchor;
		%let nanchor=%sysfunc(compress(&nanchor.));
	quit;
	data anchor;
		set anchor;
		%do i=1 %to &nanchor.;
			dset&i.=scan(anchor,&i.);
		%end;
	run;

	data anchor2;
		set %do i=1 %to &nanchor.;
				anchor (keep=dset&i. rename=(dset&i.=dset))
			%end; ;
		if dset='' then delete;
	run;
	proc sort data=anchor2 nodupkey;
		by dset;
	run;

	proc sql noprint;
		select count(*) into :ndset from anchor2;
		%let ndset=%sysfunc(compress(&ndset.));
		select dset into :dset1-:dset&ndset. from anchor2;
	quit;
%mend anchor;

%anchor;

  

Solution
‎02-20-2018 11:27 AM
Contributor
Posts: 47

Re: How to get a distinct list of words in a variable

Posted in reply to djbateman
Hi This does the same as your code, but is a little less complicated. * split multiple anchors in one record per value; data anchor; set viswin (where=(anchor^='n/a')); do i = 1 to countw(anchor); dset=scan(anchor,i); output; end; run; proc sql noprint; * noprint to avoid output-display from second select; * get distinct list; select distinct dset into :anchor_list separated by ' ' from anchor; * count; select count(distinct(dset)) into: anchor_count from anchor; quit; %let anchor_count = &anchor_count; * to get rid of blanks; %put &=anchor_list; %put &=anchor_count;
Regular Contributor
Posts: 247

Re: How to get a distinct list of words in a variable

Posted in reply to ErikLund_Jensen
This is great. I typically have a hard time going back to some of the basics with loops and output statements. Thank you for your help.
Contributor
Posts: 47

Re: How to get a distinct list of words in a variable

Posted in reply to djbateman

Sorry the code seems to be unreadable.

 

* split multiple anchors in one record per value;
data anchor; set viswin (where=(anchor^='n/a')); 
	do i = 1 to countw(anchor);
		dset=scan(anchor,i); 
		output;
	end;
run;

proc sql noprint; * noprint to avoid output-display from second select;
	* get distinct list;
	select distinct dset into :anchor_list separated by ' '
		from anchor;

	* count;
	select count(distinct(dset)) into: anchor_count
		from anchor;
quit;
%let anchor_count = &anchor_count; * to get rid of blanks;
%put &=anchor_list;
%put &=anchor_count;
☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 4 replies
  • 109 views
  • 1 like
  • 2 in conversation