Data test;
Input Event_id Drug_use $10. Condom_use $10.;
Datalines;
123 Marijuana Always
456 Meth Sometimes
789 . Always
156 Heroin .
147 . .
269 Meth Never
run;
proc print data=test;
run;
I have this:
Event ID | Drug_use | Condom_Use |
123 | Marijuana | Always |
456 | Meth | Sometimes |
789 | . | Always |
156 | Heroin | . |
147 | . | . |
269 | Meth | Never |
I want this:
Event ID | Drug_use | Condom_Use |
123 | 1 | 1 |
456 | 1 | 1 |
789 | 0 | 1 |
156 | 1 | 0 |
147 | 0 | 0 |
269 | 1 | 1 |
Your guidance would be helpful.
Thank you
Missing values in character variables are blanks.
Also make correct use of formats in list input.
Data test;
infile cards dlm=' ' dsd;
Input Event_id _Drug_use :$10. _Condom_use :$10.;
Drug_use = _Drug_use > '';
Condom_use = _Condom_use > '';
drop _Drug_use _Condom_use;
Datalines;
123 Marijuana Always
456 Meth Sometimes
789 Always
156 Heroin
147
269 Meth Never
run;
proc print data=test;
run;
@Kurt_Bremser kindly please explain how drug_use and condom_use are assigned to 0 and 1
i didn't get it so asking your help
Your latest example is unreadable. Please use the {i} icon to insert text so that spaces used for alignment are preserved.
I imported an excel file into sas, that looks like this
event_id risks
123 drug use marijuana
456 drug use heroin
123 condom use (sometime)
974 drug use meth
456 condom use(always)
Then I transposed the dataset now it looks like this in sas
event_id drug use marijuana drug use heroin condom use (sometimes) drug use meth condom use(always)
123 drug marijuana condom use (sometimes)
456 drug use heroin condom use(always)
974 drug use meth
I want the table like this
event_id drug use marijuana drug use heroin condom use (sometimes) drug use meth condom use(always)
123 1 0 1 0 0
456 0 1 0 0 1
974 0 0 0 1 0
I am pretty sure it doesn't look like:
event_id risks 123 drug use marijuana 456 drug use heroin 123 condom use (sometime) 974 drug use meth 456 condom use(always)
Perhaps it looks like the dataset you would get if you ran this code:
data have ;
length event_id 8 risks $30 ;
infile cards dsd dlm='|' truncover ;
input event_id risks;
cards;
123|drug use marijuana
456|drug use heroin
123|condom use (sometime)
974|drug use meth
456|condom use(always)
;
You cannot really make a dataset with variable names like in your wanted output. Perhaps you want a report instead?
To get a report like you describe why not just use PROC FREQ?
proc freq data=have ;
tables event_id*risks ;
run;
It looks like what text is there doesn't matter, just that there's text. And then you convert any text to 1 and 0 otherwise.
Is that correct?
If so, declare two arrays, one that has all your original and one that has your new variables.
Then loop through and assign your 0/1.
Untested code below:
array old_vars(*) $50 use_marijuana use_heroin condom_sometimes condom_always;
array new_vars(*) drug_marijuana drug_heroin condom_1 condom_2;*variable names need to be different;
do i=1 to dim(old_vars);
if missing(old_vars(i)) then new_vars(i) = 0; else new_vars(i)=1;
end;
@soham_sas wrote:
@Kurt_Bremser kindly please explain how drug_use and condom_use are assigned to 0 and 1
i didn't get it so asking your help
The result of a comparison is a boolean value, 0 for false and 1 for true.
x = y = z;
would mean: if y equals z, assign 1 to x, otherwise assign 0.
Use the MISSING() function.
So given this test dataset
data test;
input Event_id Drug_use :$10. Condom_use :$10.;
datalines;
123 Marijuana Always
456 Meth Sometimes
789 . Always
156 Heroin .
147 . .
269 Meth Never
;
If you just do this:
data want ;
set test ;
drug_use = not missing(drug_use);
condom_use = not missing(comdom_use);
run;
You will have some trouble since those variables are already defined as character variables instead of numeric variables.
You can instead make new variables. If you want to use the same names then you will need use rename.
Here is one way.
data want ;
set test ;
drug_use_flag = not missing(drug_use);
condom_use_flag = not missing(comdom_use);
drop drug_use condom_use;
rename drug_use_flag = drug_use condom_use_flag=condom_use ;
run;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.