Hi,
I have a results output with study participant IDs and numeric values
data results;
input ID resultA resultB;
datalines;
1 2392 887
2 1587 902
3 2392 234
4 4035 2392
;
run;
These numeric values correspond to string values (names) found in a reference file
data reference;
input num first$ last$;
datalines;
2392 joe bloggs
4035 mary smith
1587 bloggs joe
234 sherlock holmes
902 john watson
887 benjamin button
;
run;
I want to match the string values in the reference file with the numeric values in the results output.
This is the output I want to achieve; the string values under resultA and resultB are based
on the numbers and corresponding string values in the reference file
data want;
input ID resultA :$200. resultB :$200. ;
datalines;
1 joe-bloggs benjamin-button
2 bloggs-joe john-watson
3 joe-bloggs sherlock-holmes
4 mary-smith joe-bloggs
;
run;
I have tried to search for code on the forum to solve this, with no success.
Any help would be greatly appreciated.
Please let me know if my problem is not clear from the above.
Hi @Epi_Stats,
I would probably create a custom format:
data fmt(drop=first last);
retain fmtname 'ref';
set reference(rename=(num=start));
length label $40;
label=catx('-',first,last);
run;
proc format cntlin=fmt;
run;
Then you may not even need dataset WANT because you can get the same information from the existing RESULTS dataset:
proc print data=results;
format res: ref.;
run;
Creating dataset WANT is not much work either:
data want(drop=_:);
set results(rename=(resultA=_A resultB=_B));
length resultA resultB $200;
resultA=put(_A, ref.);
resultB=put(_B, ref.);
run;
You have to join Results and Reference tables. I'm suggesting to use proc SQL for joining because there is two joins and tables doesn't have common variable.
Hello
Based upon your first two datasets results and reference the outcome does not appear to be what you have shown.
However using the sample code you should be able to get the desired result as shown
data results;
input ID resultA resultB;
datalines;
1 2392 887
2 1587 902
3 2392 234
4 4035 2392
;
run;
data reference;
input num first$ last$;
datalines;
2392 joe bloggs
4035 mary smith
1587 bloggs joe
234 sherlock holmes
902 john watson
887 benjamin button
;
run;
proc sql;
Select id, cat(Propcase(strip(first)),"-",propcase(Strip(last))) as Name from results a,reference b
where a.resultA=b.num;
quit;
The code gives result like this with the given datasets results and reference.
Hi @Epi_Stats,
I would probably create a custom format:
data fmt(drop=first last);
retain fmtname 'ref';
set reference(rename=(num=start));
length label $40;
label=catx('-',first,last);
run;
proc format cntlin=fmt;
run;
Then you may not even need dataset WANT because you can get the same information from the existing RESULTS dataset:
proc print data=results;
format res: ref.;
run;
Creating dataset WANT is not much work either:
data want(drop=_:);
set results(rename=(resultA=_A resultB=_B));
length resultA resultB $200;
resultA=put(_A, ref.);
resultB=put(_B, ref.);
run;
Thank you @FreelanceReinh ,
Very clever to treat the reference file as a format - I had not thought of this!
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.