Using Proc SQL, I want to create a table that displays new reporter numbers that exist in March but don't exist in February or January.
I already created the datasets for January and February and March, now I need to see which report numbers exist in March but don't exist in January or February.
I'm not sure how to compare across three or more datasets using proc sql. Any insight would be appreciated.
HI @StephenPM33 Can you please post a sample data of what you HAVE and what you WANT(expected output) explaining your requirement. I am sure that will help somebody to work and test their solution before they post.
Btw, noticing your 1st post, welcome to SAS communities.
/* Creation of Masterlist for All Months reports for this Group Number;*/------>I Have This Part
PROC SQL;
CREATE TABLE WORK.FILTER_FOR_REGISTRY_CURRENT AS
SELECT a.*
FROM MICRO.REGISTRY_CURRENT a
WHERE a.GROUP_NUMBER = '0000000';
proc sql;
create table work.census_in_microdata as
select a.state_code,a.county_code, a.rept_num, b.data_value, b.datatype, b.version, b.ref_mm, b.ref_yy
from work.filter_for_registry_current a, micro.microdata_current b
where a.state_code=b.state_code and a.rept_num=b.rept_num;
/* Creation of Masterlist for All Months reports for this Group Number;*/
quit;
/* Creation of List for January Ref MM reports for this Group Number;*/
proc sql;
create table January_Census_Output as
select state_code, rept_num, data_value, datatype, version, ref_mm, ref_yy
from work.census_in_microdata
where ref_mm='01'
group by state_code, rept_num
having version=max(version);
quit;
/* Creation of List for January Ref MM reports for this Group Number;*/
/* Creation of List for February Ref MM reports for this Group Number;*/
proc sql;
create table February_Census_Output as
select state_code, rept_num, data_value, datatype, version, ref_mm, ref_yy
from work.census_in_microdata
where ref_mm='02'
group by state_code, rept_num
having version=max(version);
quit;
/* Creation of List for February Ref MM reports for this Group Number;*/
/* Creation of List for March Ref MM reports for this Group Number;*/
proc sql;
create table March_Census_Output as
select state_code, rept_num, data_value, datatype, version, ref_mm, ref_yy
from work.census_in_microdata
where ref_mm='03'
group by state_code, rept_num
having version=max(version);
quit;
/* Creation of List for March Ref MM reports for this Group Number;*/--------->I Have This Part
/* Creation of List for All First Time Reporters for Ref MM=03 reports for this Group Number;*/--------->I Need This Part.
proc sql;
create table 'WORK.FILTER_FOR_REGISTRY_CURRENT';
select a.* from WORK.FILTER_FOR_REGISTRY_CURRENT a;
create table 'Table micro.microdata_current b';
select b.* from micro.microdata_current b;
quit;/* Creation of List for All First Time Reporters for Ref MM=03 reports for this Group Number;*/--------->I Need This Part.
You don't want three identical SQL extracts, one for each month, that have to be combined. You can have one extract containing all three months, and then you don't have to combine three data sets somehow. Then, as I said, PROC SUMMARY or the equivalent PROC SQL will show which months have data for a given variable, and which months do not.
This isn't difficult in SQL. Query your March data, then LEFT JOIN to your FEB and JAN data. If the report number is missing from the latter two than means they don't exist in those months. It should work something like this. If reportid_feb or reportid_jan return as missing values they don't exist in those tables.
proc sql;
create want as
select A.*
,B.reportid as reportid_feb
,C.reportid as reportid_jan
from Mar as A
left join Feb as B
on A.reportid = B.reportid
left join Jan as C
on A.reportid = C.reportid
;
quit;
Does "numbers exist" mean: NOT (all missing values)?
A simple solution, but not using PROC SQL, is to append all three data sets, then run PROC SUMMARY where month is a CLASS variable. If the computed N for each variable in each month is greater than zero, then "numbers exist".
Don't fall into the trap that this absolutely has to be done in PROC SQL.
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.