BookmarkSubscribeRSS Feed
StephenPM33
Calcite | Level 5

Using Proc SQL, I want to create a table that displays new reporter numbers that exist in March but don't exist in February or January.

 

I already created the datasets for January and February and March, now I need to see which report numbers exist in March but don't exist in January or February.

I'm not sure how to compare across three or more datasets using proc sql. Any insight would be appreciated.

5 REPLIES 5
novinosrin
Tourmaline | Level 20

HI @StephenPM33  Can you please post a sample data of what you HAVE and what you WANT(expected output) explaining your requirement. I am sure that will help somebody to work and test their solution before they post. 

 

Btw, noticing your 1st post, welcome to SAS communities. 

StephenPM33
Calcite | Level 5

/* Creation of Masterlist for All Months reports for this Group Number;*/------>I Have This Part
PROC SQL;
   CREATE TABLE WORK.FILTER_FOR_REGISTRY_CURRENT AS
   SELECT a.*
      FROM MICRO.REGISTRY_CURRENT a
      WHERE a.GROUP_NUMBER = '0000000';
proc sql;
 create table work.census_in_microdata as
 select a.state_code,a.county_code, a.rept_num, b.data_value, b.datatype, b.version, b.ref_mm, b.ref_yy
 from work.filter_for_registry_current a, micro.microdata_current b
 where a.state_code=b.state_code and a.rept_num=b.rept_num;
/* Creation of Masterlist for All Months reports for this Group Number;*/
quit;
/* Creation of List for January Ref MM reports for this Group Number;*/
proc sql;
   create table January_Census_Output as
   select state_code, rept_num, data_value, datatype, version, ref_mm, ref_yy
   from work.census_in_microdata
   where ref_mm='01'
   group by state_code, rept_num
   having version=max(version);
quit;
 /* Creation of List for January Ref MM reports for this Group Number;*/
 /* Creation of List for February Ref MM reports for this Group Number;*/
proc sql;
   create table February_Census_Output as
   select state_code, rept_num, data_value, datatype, version, ref_mm, ref_yy
   from work.census_in_microdata
   where ref_mm='02'
   group by state_code, rept_num
   having version=max(version);
quit;
/* Creation of List for February Ref MM reports for this Group Number;*/
/* Creation of List for March Ref MM reports for this Group Number;*/
proc sql;
   create table March_Census_Output as
   select state_code, rept_num, data_value, datatype, version, ref_mm, ref_yy
   from work.census_in_microdata
   where ref_mm='03'
   group by state_code, rept_num
   having version=max(version);
quit;
/* Creation of List for March Ref MM reports for this Group Number;*/--------->I Have This Part

/* Creation of List for All First Time Reporters for Ref MM=03 reports for this Group Number;*/--------->I Need This Part.

proc sql;
  create table 'WORK.FILTER_FOR_REGISTRY_CURRENT';
  select a.* from WORK.FILTER_FOR_REGISTRY_CURRENT a;
  create table 'Table micro.microdata_current b';
  select b.* from micro.microdata_current b;

quit;/* Creation of List for All First Time Reporters for Ref MM=03 reports for this Group Number;*/--------->I Need This Part.

PaigeMiller
Diamond | Level 26

You don't want three identical SQL extracts, one for each month, that have to be combined. You can have one extract containing all three months, and then you don't have to combine three data sets somehow. Then, as I said, PROC SUMMARY or the equivalent PROC SQL will show which months have data for a given variable, and which months do not.

--
Paige Miller
SASKiwi
PROC Star

This isn't difficult in SQL. Query your March data, then LEFT JOIN to your FEB and JAN data. If the report number is missing from the latter two than means they don't exist in those months. It should work something like this. If reportid_feb or reportid_jan return as missing values they don't exist in those tables.

proc sql;
  create want as 
  select  A.*
            ,B.reportid as reportid_feb
            ,C.reportid as reportid_jan
  from Mar as A
  left join Feb as B
  on A.reportid = B.reportid
  left join Jan as C
  on A.reportid = C.reportid
  ;
quit; 
PaigeMiller
Diamond | Level 26

Does "numbers exist" mean: NOT (all missing values)?

 

A simple solution, but not using PROC SQL, is to append all three data sets, then run PROC SUMMARY where month is a CLASS variable. If the computed N for each variable in each month is greater than zero, then "numbers exist". 

 

Don't fall into the trap that this absolutely has to be done in PROC SQL.

--
Paige Miller

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 5 replies
  • 1022 views
  • 0 likes
  • 4 in conversation