I have a gender variable in the old dataset from a week ago and a gender variable in the current dataset. I have created old and new versions of these variables and combined them into a table called joined.
My goal is to create a flag that will show if there are differences between the old and the new gender variable +- 100. I'd like to know if there is drops.
The gender variable is categorical (categories are: female, male, unknown).
Is there a way for me to create flags using proc sql -I will need to do this for the other variables including sex, race, age, etc. What is the easiest way to create a flag and also show the counts of the variables? The end goal would be three columns the counts of the two gender variables and then a flag saying if there was an increase or decrease of 100 from the two weeks.
This is the code I have so far:
proc sql;
create table joined as
select coalesce(a.person_key, b.person_key) as person_key,
a.gender_old, b.gender_new,
a.sexor_old, b.sexor_new,
a.race_cd_old, b.race_cd_new,
a.age_group_old, b.age_group_new,
a.sex_old, b.sex_new,
a.peh1_old, b.peh1_new,
a.peh2_old, b.peh2_new,
a.dose_total_old, b.dose_total_new
from oldperson a
full join newperson b
on a.person_key = b.person_key;
quit; /Goal: Flag significant increases and flag ANY drops./
Do you want to compare individual observations and see if the value has changed?
create table gender_check as
select coalesce(a.id,b.id) as id
, case when (a.sex ne b.sex) then cats(a.sex,'->',b.sex)
else ' '
end as sex_flag
from old a
full join new b
on a.id =b.id
;
Or do you want to compare the distribution of values between the two datasets?
select coalesce(a.sex,b.sex) as sex
, a.count as old_count
, b.count as new_count
, a.count - b.count as count_diff
from (select sex,count(*) as count from old group by sex) a
full join (select sex,count(*) as count from new group by sex) b
on a.sex = b.sex
;
I'd want to look at the distribution of values so the second code block may work.
How would I add a flag in the proc freq to say if the count_diff is more than 100?
@xxartpopxx wrote:
I'd want to look at the distribution of values so the second code block may work.
How would I add a flag in the proc freq to say if the count_diff is more than 100?
There is no PROC FREQ mentioned in this thread. Only PROC SQL. In PROC SQL, you perform a test to see if the difference is greater than 100 using the CASE WHEN statement in PROC SQL. If it is greater than 100, you assign the flag variable a value of 1, otherwise you assign the flag variable a value of zero.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.