Hi there!
I have a table of patients (one line per patient) for whom i want to find a score in another table of treatments (several lines per patient, one score for each date). In my real search it is several thousands, but here it's boiled down to a simple example.
Patient-table:
Data patients;
Input PatientID StartDate;
cards;
1 50
2 70
3 20
4 10
run;
Treatments-table:
Data Treatments;
Input PatientID TreatmentDate Score;
format TreatmentDate date9.;
cards;
1 20 1
1 40 3
1 60 4
2 30 1
run;
Then my goal is then to search for the treatments by using a hash-object. For each treatment-score that fulfills the criteria, i want to populate/create a new variable:
Look-up code:
data lookup;
if _n_ = 1 then do;
if 0 then set Treatments;
dcl hash treat (dataset: 'Treatments', multidata: 'y');
treat.definekey ('PatientID');
treat.definedata ('TreatmentDate','Score');
treat.definedone ();
end;
set patients;
by patientid;
if first.patientid then count = .;
do _iorc_ = treat.find() by 0 while (_iorc_ = 0);
if TreatmentDate < Startdate then do;
Count + 1;
call symput('counter',count);
Number&Counter = Score;
end;
_iorc_ = treat.find_next();
end;
run;
So that i end up with the final table looking like:
PatientID | StartDate | Number1 | Number2 | Number3 ... Number X
1 | 50 | 1 | 3 | .
2 | 70 | . | . | .
Thanks for your time and help 🙂
> For each treatment-score that fulfills the criteria, i want to populate/create a new variable
You can't create variables on the fly. All variables are created when the data step compiles.
To fill these variables:
1. Create an array at the start of the data step
array Number [99] ;
2. Fill the array by changing your code from
Number&Counter = Score;
to
Number[Count] = Score;
Please double check that first data step and post one that runs.
Data for CARDS cannot be on the same line and I really suspect that you used something other than the SAS editor to make that and the result is something that is mashing the value for the first patient into the id of the second patient. Assuming that is the case, a join and transpose may be appropriate
Data patients; Input PatientID StartDate; cards; 1 50 2 70 3 20 4 10 run; Data Treatments; Input PatientID TreatmentDate Score; format TreatmentDate date9.; cards; 1 20 1 1 40 3 1 60 4 2 30 1 4 20 1 4 30 2 4 40 3 4 50 4 4 60 7 4 70 10 run; proc sql; create table temp as select a.*,b.score from patients as a left join treatments as b on a.PatientID=b.patientid ; quit; proc transpose data=temp out=want (drop=_name_) prefix=Number ; by patientid startdate; var score; run;
You don't mention what is supposed to happen with treatment dates but for your very limited example this works.
If you expect the score values in a specific order that may or may not be possible depending on the rule. An order by clause in the Proc SQL may work.
Set your Enterprise Guide to not use tabs in the code. Replace tabs on loading files or while you type with blanks. Different systems use different tab widths.
Next, in your second data step you assign a date format to values that are quite clearly not SAS date values.
Third, the TRANSPOSE method is to be preferred, because it automatically results in the minimal number of variables needed, completely data-driven. The array method must either use an array size with a substantial safety margin, resulting in lots of unneeded columns, wasting disk and screen space, or you run the risk of an array index overflowing, or dropping values (losing information) without noticing it.
And another important hint for handling the code windows here:
do not edit code by simply clicking on it and typing away; position the cursor in the window, hit the "little running man" (or </>) button, and then do your editing in the pop-up. When typing directly without calling up the pop-up, linefeeds do not work.
@Kurt_Bremser Totally agree about never using tabs.
About transposing, the downside is that you might end up with tables having different structures. Pre-defining the columns ensures that the table structure is consistent, and if the array happened to be undersized, SAS would certainly let you know with:
ERROR: Array subscript out of range at line 29 column 26.
> For each treatment-score that fulfills the criteria, i want to populate/create a new variable
You can't create variables on the fly. All variables are created when the data step compiles.
To fill these variables:
1. Create an array at the start of the data step
array Number [99] ;
2. Fill the array by changing your code from
Number&Counter = Score;
to
Number[Count] = Score;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.