Hi all
Each month I run a running report from Jan 1. Each month new items are added to the sets of data. I would like to be able to write code that will automatically populate a column in the new table with "Y" or "N" which indicates whether or not a line is has already been recognized in the old table. I do this manually each month and it would be time consuming if I were able to do this. I am attaching examples for review (I cant seem to cut and paste w/o losing my format):
@wheddingsjr Ok That makes me feel bad. One last try please at least for the sake of my satisfaction:-
proc sql;
create table want as
select *, 'Y' as retract from
(select * from new
intersect
select * from old)
union
select *,'N' as retract from (select * from new
except
select * from old)
order by memno, begdate;
quit;
something like this?
data want;
if _n_=1 then do;
if 0 then set old;
dcl hash H (dataset:'old',multidata:'y') ;
h.definekey ('MEMNO','BEGDATE','COMBO');
h.definedone () ;
end;
set new;
if h.check()=0 then new_column='Y';
else new_column='N';
run;
It appears you could even add
COUNT | UNIQDOS |
to the key variable list in the hash object and use the direct check method. Play around and let us know
Hi Novinosrin and thanks for your help. Sorry for the delayed response but it was a long weekend here where I am. I brought in the old data and ran it against the new data using the code you provided. It creates and populates a column in my results but its putting a "N" in every row even though I know there are some that should have a "Y". I am not very versed with complex code as I basically just run a standard and saved script so some of the code you use I don't really understand.
I used this:
data UrineAnalysis1;
if _n_=1 then do;
if 0 then set Apr2018_TEST; (The old data I imported in)
dcl hash H (dataset:'Apr2018_TEST',multidata:'N') ;
h.definekey ('COMBO');
h.definedone () ;
end;
set UrineAnalysis; (the new data created)
if h.check()=0 then RETRACT='N';
else RETRACT='Y';
run;
Can you provide us with a more and better representative sample of your real plz?
Here you go.
Please test and let us know. Thank you
data old;
input MEMNO COUNT UNIQDOS BEGDATE :mmddyy10. COMBO ;
cards;
12345 1 1 01/03/18 1234543103
12345 . 0 01/03/18 1234543103
12345 2 1 01/05/18 1234543105
12345 3 1 01/07/18 1234543107
12345 4 1 01/09/18 1234543109
12345 5 1 01/10/18 1234543110
12345 . 0 01/10/18 1234543110
12345 6 1 01/12/18 1234543112
12345 7 1 01/13/18 1234543113
12345 8 1 01/14/18 1234543114
12345 9 1 01/16/18 1234543116
12345 10 1 01/17/18 1234543117
12345 11 1 01/19/18 1234543119
12345 . 0 01/19/18 1234543119
12345 12 1 01/21/18 1234543121
12345 13 1 01/23/18 1234543123
12345 14 1 01/24/18 1234543124
12345 15 1 01/26/18 1234543126
12345 16 1 01/27/18 1234543127
12345 17 1 01/28/18 1234543128
12345 18 1 01/30/18 1234543130
12345 19 1 01/31/18 1234543131
12345 20 1 02/02/18 1234543133
12345 . 0 02/02/18 1234543133
12345 21 1 02/04/18 1234543135
;
data new;
input MEMNO RETRACT $ COUNT UNIQDOS BEGDATE :mmddyy10. COMBO;
drop retract;
cards;
12345 Y 1 1 01/03/18 1234543103
12345 Y . 0 01/03/18 1234543103
12345 Y 2 1 01/05/18 1234543105
12345 Y 3 1 01/07/18 1234543107
12345 Y 4 1 01/09/18 1234543109
12345 Y 5 1 01/10/18 1234543110
12345 Y . 0 01/10/18 1234543110
12345 Y 6 1 01/12/18 1234543112
12345 Y 7 1 01/13/18 1234543113
12345 Y 8 1 01/14/18 1234543114
12345 Y 9 1 01/16/18 1234543116
12345 Y 10 1 01/17/18 1234543117
12345 Y 11 1 01/19/18 1234543119
12345 Y . 0 01/19/18 1234543119
12345 Y 12 1 01/21/18 1234543121
12345 Y 13 1 01/23/18 1234543123
12345 Y 14 1 01/24/18 1234543124
12345 Y 15 1 01/26/18 1234543126
12345 Y 16 1 01/27/18 1234543127
12345 Y 17 1 01/28/18 1234543128
12345 Y 18 1 01/30/18 1234543130
12345 Y 19 1 01/31/18 1234543131
12345 Y 20 1 02/02/18 1234543133
12345 Y . 0 02/02/18 1234543133
12345 Y 21 1 02/04/18 1234543135
12345 N 22 1 02/06/18 1234543137
12345 N 23 1 02/07/18 1234543138
12345 N 24 1 02/09/18 1234543140
12345 N . 0 02/09/18 1234543140
12345 N 25 1 02/11/18 1234543142
12345 N 26 1 02/13/18 1234543144
12345 N 27 1 02/16/18 1234543147
12345 N . 0 02/16/18 1234543147
12345 N 28 1 02/18/18 1234543149
12345 N 29 1 02/21/18 1234543152
12345 N . 0 02/21/18 1234543152
;
data want;
if _n_=1 then do;
if 0 then set old;
dcl hash H (dataset:'old') ;
h.definekey ('MEMNO','count','UNIQDOS','BEGDATE','COMBO');
h.definedone () ;
end;
set new;
if h.check()=0 then retract='Y';
else retract='N';
run;
Hi again
I tested this and it isnt working. Its ok though. I automated a pretty good amount of the process (took it from a 3-4 hour process down to 20 mins.) So manually going through the monthly reports for dupes wont be that bad, I have done it that way for 2 years now so its no big deal. But thanks for all of your help.
@wheddingsjr Ok That makes me feel bad. One last try please at least for the sake of my satisfaction:-
proc sql;
create table want as
select *, 'Y' as retract from
(select * from new
intersect
select * from old)
union
select *,'N' as retract from (select * from new
except
select * from old)
order by memno, begdate;
quit;
Novinosrin
It looked as if it wanted to work..lol...but I got a ton of either
ERROR: Column 20 from the first contributor of INTERSECT is not the same type as its counterpart from the second.
or
ERROR: Numeric expression requires a numeric format.
@wheddingsjr Ok, Now it's time for us to know the data and metadata(data about data) . Can you find out
1. do both the old and new tables have the same variables and types(char and numeric)
2. you could run proc contents to find that out
Novinosrins
It was mentioned to me that instead of importing the previous months excel report, that I just run that report in SAS so I dont have to import. I did that and now I have just a few lines that give the error previously mentioned and i think I know why. In the middle stages of my code I add a few columns to my final table. I think I need to do those steps AFTER I do the steps that you recommended. I am going to give that a shot and let you know how that works out.
Thanks again
@wheddingsjr Ok, some progress after all.
I reiterate the logic in both approaches:
1.Hash : takes in all vars as key i.e each record makes in unique
2. sql : set operators same as above
In essence we compare record to record
hash.check=0 finds it as the same as intersect when we choose all vars as key
Therefore,if one var is not the same (any additional), both approaches will fail.
@novinosrinAll I have to say is that you are a genius. Everything worked out absolutely perfect. Because I just wrote the previous months data into my code and moved your union, intersect, except code further up in the script, I was able to include my add-on columns later in the process. I manually did this at the beginning of the month and just compared it with the new automated process and they are identical. Thank you very very much for your patience and your desire to help me. It is much appreciated.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.