Dataset that I have:
PtID Visit Type Target days
1 4M A 100 101
1 4M A 100 120
1 4M C 100 110
1 8M B 200 210
1 8M C 200 220
2 4M B 100 101
2 4M D 100 120
Expected output dataset:
PtID Visit Type Target days
1 4M A 100 101
1 4M . 100 120
1 4M . 100 110
1 8M B 200 210
1 8M . 200 220
2 4M B 100 101
2 4M . 100 120
I have a dataset with multiple visit data for each PtID. For each PtID, at each visit, only one observation is valid and the rest should be coded to missing (as in the expected output)
Rules:
1) For each PtID, for each visit, if there are multiple ‘A’ visits, then chose the visit where days is closest to the target date (101 is closer to 100 target days), and set the rest of the visit group to missing
2) For each PtID, for each visit, if there is no visit of type ‘A’, then we chose visit type ‘B’, and set the rest of the visit group (eg:4M) to missing
3) For each PtID, for each visit, if there is no visit of type ‘A’ or ‘B’, then we chose visit type ‘C’, and set the rest of visit group to missing
Hi,
here you go:
data have;
input PtID Visit $ Type $ Target days;
datalines;
1 4M A 100 101
1 4M A 100 120
1 4M C 100 110
1 8M B 200 210
1 8M C 200 220
2 4M B 100 101
2 4M D 100 120
;
run;
proc sort data=test sortseq=linguistic(numeric_collation=on);
by PtID Visit Type days;
run;
data want;
set have;
by PtID Visit Type days;
if not first.Visit then Type='';
run;
- Cheers -
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.