Hi,
I have 2 datasets as below: 'insects' and 'test1'.
Data test1 has thousands of rows with many columns including binary 1/0 ins_ant, ins_bee. ins_beetle.
I want to calculate new columns 'total_weight' and 'total_id' based on the values in Data insects (sum of the corresponding "weight"s and "id"s.)
I'm using a proc format to do this, my code is below.
I used type = 'I' because my input is a string (insect name) and outputs are numbers, and that's the informat type to use to the documentation here - SAS Help Center: Results: PROC FORMAT
But my final output dataset test2 has null values for the total_weight and total_id columns.
What am I doing wrong?
Or please let me know if there's a simpler way to do this (without using proc format, but also without hardcoding anything like names of insects/weights/ids/number of insects etc.)
Thank you!!
code:
data insects;
length insect $100.;
insect='ant';
weight = 10;
id0 = 1;
output;
insect='beetle';
weight = 20;
id0 = 2;
output;
insect='bee';
weight = 30;
id0 = 3;
output;
run;
data fmt;
set insects;
length start $100;
start = upcase(strip(insect));
retain type 'I';
fmtname = 'weight';
label = weight;
output fmt;
fmtname = 'id';
label = id0;
output fmt;
keep start fmtname label type;
run;
proc format cntlin=fmt;
run;
data test1;
patient = 4772;
ins_ant = 1;
ins_bee = 1;
ins_beetle = 0;
output;
patient = 6792;
ins_ant = 0;
ins_bee = 1;
ins_beetle = 1;
output;
run;
data test2;
set test1;
total_weight = 0;
total_id0 = 0;
array ins_all {*} ins_: ;
do i_local = 1 to dim(ins_all);
total_weight = total_weight + ( ins_all[i_local] * input(upcase(strip(scan(vname(ins_all[i_local]),2,'_'))),weight.) );
total_id0 = total_id0 + ( ins_all[i_local] * input(upcase(strip(scan(vname(ins_all[i_local]),2,'_'))),id.) );
end;
drop i_local;
run;
If you have SAS/STAT licensed (why are you running SAS without it?) you can use PROC SCORE to do the work for you.
Take your INSECT dataset and transpose it and add a _TYPE_ variable with constant value of 'SCORE'.
data insects;
_type_='SCORE';
input insect :$32. weight id0 ;
cards;
ant 10 1
beetle 20 2
bee 30 3
;
proc transpose data=insects out=insect_score;
by _type_;
id insect;
var weight id0;
run;
Now you can use it to score your test data. Make sure the variable names match the values of _NAME_ in the scoring dataset.
data test1;
input patient ant bee beetle ;
cards;
4772 1 1 0
6792 0 1 1
;
proc score data=test1 score=insect_score out=want;
run;
Results:
Obs patient ant bee beetle weight id0 1 4772 1 1 0 40 4 2 6792 0 1 1 50 5
PS Notice how much easier it is to share data using in-line data an INPUT statement instead of multiple assignment statements and OUTPUT statement.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.