DATA Step, Macro, Functions and more

PROC Compare - Maybe?!

Reply
Occasional Contributor
Posts: 7

PROC Compare - Maybe?!

I need to compare 2 datasets (should be equal). They have same variables but I want to see matching and missing observations. I have attempted PROC Compare. I'm having a problem with the "value comparison results for variables" output listing a value in both datasets and calculating a percent difference.
Ex: obs Base Compare Diff. %Diff
17 17Sep2014 17Sep2014. 0.6250 0.003128
Super User
Posts: 3,254

Re: PROC Compare - Maybe?!

One explanation for what you are seeing is that the variable you are comparing is in fact a DATETIME but you are only seeing the DATE part of it. What SAS format is assigned to this variable? 

Occasional Contributor
Posts: 7

Re: PROC Compare - Maybe?!

I don't see an assigned format for the specimen collection date variable. It only lists the format of "common variables with differing attributes" under variable summary
Super User
Posts: 3,254

Re: PROC Compare - Maybe?!

Use PROC CONTENTS to check for assigned formats. PROC COMPARE is indicating there is an assigned format by displaying data in the form DDMONYYYY:

 

proc contents data = DatasetName;
run;
Occasional Contributor
Posts: 7

Re: PROC Compare - Maybe?!

Dataset Type Len Format Informat
Base Num 8 Date9. Date9.
Compare Num 8 Date9. Date9.
Super User
Posts: 11,343

Re: PROC Compare - Maybe?!

Please show the code you are attempting. Compare has a number of options on what to compare and how to display the results.

 

Also if you are interested in comparisons of observations you may need to sort the data sets by a number of variables to get them into the same order.

Occasional Contributor
Posts: 7

Re: PROC Compare - Maybe?!

PROC COMPARE BASE = Dataset COMPARE = Dataset;

I created the base dataset manually to match that of compare. More than one facility reports to the base and therefore had more data than needed. As opposed to the compare dataset which is data from one facility.

I'm looking to compare them for accuracy
Super User
Super User
Posts: 7,050

Re: PROC Compare - Maybe?!

You probably converted one of the files from Excel.  In Excel datetime values are stored using the interger part for the date and the fraction as the percent of 24 hours.  In SAS the DATE format will ignore the fractional part of the value and just display the date.

404   data _null_;
405    dt1='17Sep2014'd ;
406    dt2=dt1 + 0.6250;
407    put (dt1 dt2) (=) / (dt1 dt2) (date9. +1) ;
408   run;

dt1=19983 dt2=19983.625
17SEP2014 17SEP2014
Occasional Contributor
Posts: 7

Re: PROC Compare - Maybe?!

Both files are imported Excel files. I should do this with all (3) of date variables?
Super User
Super User
Posts: 7,050

Re: PROC Compare - Maybe?!

[ Edited ]

You will need to decide what to do based on what you need.

If you are just looking to see if the values have the date part and the very few that don't are just because of these fractional parts then you could perhaps just declare success and move on.

 

If you did want to convert you would use a data step to make a new version of the data with the date values converted.  If you have more than one date variable per dataset then you can use ARRAY processing to simplify the code.

 

For example to just remove the fractional date part using the INT() or ROUND(,1) function.

mydate = int(mydate);

If you wanted to convert the variable to DATETIME instead to preserve the time part then you could use the DHMS() format. It will automatically conver the fractional part for you.

mydate = dhms(mydate,0,0,0);
format mydate datetime20.;

 

 

Occasional Contributor
Posts: 7

Re: PROC Compare - Maybe?!

I don't understand why it's reading date and time. I know fate is the integer portion and time fractional? Both excel sheets are in the same date format XX/XX/XXXX
Super User
Super User
Posts: 7,050

Re: PROC Compare - Maybe?!

The import wizard noticed that most of the cells in that column have a date format attached, so it decided to import the value as a date. Since Excel stores dates as number and so does SAS it just copied over the number and adjusted the value to match the difference in base dates that the two systems use.  If the value had a fractional date then that was copied along with it.  In both SAS and Excel if you format the value with a format that ignores the fractional value then you do not normally see the fraction value, but it is still there in the data.

Occasional Contributor
Posts: 7

Re: PROC Compare - Maybe?!

I used the import wizard
Ask a Question
Discussion stats
  • 12 replies
  • 143 views
  • 0 likes
  • 4 in conversation