BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
wheddingsjr
Pyrite | Level 9

Hi all

 

Each month I run a running report from Jan 1. Each month new items are added to the sets of data. I would like to be able to write code that will automatically populate a column in the new table with "Y" or "N" which indicates whether or not a line is has already been recognized in the old table. I do this manually each month and it would be time consuming if I were able to do this. I am attaching examples for review (I cant seem to cut and paste w/o losing my format):

 



1 ACCEPTED SOLUTION

Accepted Solutions
novinosrin
Tourmaline | Level 20

@wheddingsjr Ok That makes me feel bad. One last try please at least for the sake of my satisfaction:-

 

 

proc sql;
 create table want as
 select *, 'Y' as retract from
   (select *  from new
         intersect
         select * from old)
   union
select *,'N' as retract from (select *  from new
         except
         select * from old)
order by memno, begdate;
quit;

 

View solution in original post

23 REPLIES 23
wheddingsjr
Pyrite | Level 9
BTW, the tab labeled "OLD" is the tab to look at. Ignore tab labeled 'NEW" as it was included in the "OLD" tab. Thx
novinosrin
Tourmaline | Level 20

something like this?

data want;
if _n_=1 then do;
if 0 then set old;
   dcl hash H (dataset:'old',multidata:'y') ;
   h.definekey  ('MEMNO','BEGDATE','COMBO');
   h.definedone () ;
end;
set new;
if h.check()=0 then new_column='Y';
else new_column='N';
run;

It appears you could even add 

COUNTUNIQDOS

to the key variable list in the hash object and use the direct check method. Play around and let us know  

wheddingsjr
Pyrite | Level 9

Hi Novinosrin and thanks for your help. Sorry for the delayed response but it was a long weekend here where I am. I brought in the old data and ran it against the new data using the code you provided. It creates and populates a column in my results but its putting a "N" in every row even though I know there are some that should have a "Y". I am not very versed with complex code as I basically just run a standard and saved script so some of the code you use I don't really understand.

 

I used this:

data UrineAnalysis1;
if _n_=1 then do;
if 0 then set Apr2018_TEST; (The old data I imported in)
dcl hash H (dataset:'Apr2018_TEST',multidata:'N') ;
h.definekey ('COMBO');
h.definedone () ;
end;
set UrineAnalysis; (the new data created)
if h.check()=0 then RETRACT='N';
else RETRACT='Y';
run;

novinosrin
Tourmaline | Level 20

Can you provide us with a more and better representative sample of your real plz?

wheddingsjr
Pyrite | Level 9

Here you go.

novinosrin
Tourmaline | Level 20

Please test and let us know. Thank you

data old;
input MEMNO 	COUNT	UNIQDOS	BEGDATE :mmddyy10.	COMBO ;
cards;
12345	1	1	01/03/18	1234543103
12345	.	0	01/03/18	1234543103
12345	2	1	01/05/18	1234543105
12345	3	1	01/07/18	1234543107
12345	4	1	01/09/18	1234543109
12345	5	1	01/10/18	1234543110
12345	.	0	01/10/18	1234543110
12345	6	1	01/12/18	1234543112
12345	7	1	01/13/18	1234543113
12345	8	1	01/14/18	1234543114
12345	9	1	01/16/18	1234543116
12345	10	1	01/17/18	1234543117
12345	11	1	01/19/18	1234543119
12345	.	0	01/19/18	1234543119
12345	12	1	01/21/18	1234543121
12345	13	1	01/23/18	1234543123
12345	14	1	01/24/18	1234543124
12345	15	1	01/26/18	1234543126
12345	16	1	01/27/18	1234543127
12345	17	1	01/28/18	1234543128
12345	18	1	01/30/18	1234543130
12345	19	1	01/31/18	1234543131
12345	20	1	02/02/18	1234543133
12345	.	0	02/02/18	1234543133
12345	21	1	02/04/18	1234543135
;


data new;
input MEMNO	RETRACT $	COUNT	UNIQDOS	BEGDATE  :mmddyy10.	COMBO;
drop retract;
cards;
12345	Y	1	1	01/03/18	1234543103
12345	Y	.	0	01/03/18	1234543103
12345	Y	2	1	01/05/18	1234543105
12345	Y	3	1	01/07/18	1234543107
12345	Y	4	1	01/09/18	1234543109
12345	Y	5	1	01/10/18	1234543110
12345	Y	.	0	01/10/18	1234543110
12345	Y	6	1	01/12/18	1234543112
12345	Y	7	1	01/13/18	1234543113
12345	Y	8	1	01/14/18	1234543114
12345	Y	9	1	01/16/18	1234543116
12345	Y	10	1	01/17/18	1234543117
12345	Y	11	1	01/19/18	1234543119
12345	Y	.	0	01/19/18	1234543119
12345	Y	12	1	01/21/18	1234543121
12345	Y	13	1	01/23/18	1234543123
12345	Y	14	1	01/24/18	1234543124
12345	Y	15	1	01/26/18	1234543126
12345	Y	16	1	01/27/18	1234543127
12345	Y	17	1	01/28/18	1234543128
12345	Y	18	1	01/30/18	1234543130
12345	Y	19	1	01/31/18	1234543131
12345	Y	20	1	02/02/18	1234543133
12345	Y	.	0	02/02/18	1234543133
12345	Y	21	1	02/04/18	1234543135
12345	N	22	1	02/06/18	1234543137
12345	N	23	1	02/07/18	1234543138
12345	N	24	1	02/09/18	1234543140
12345	N	.	0	02/09/18	1234543140
12345	N	25	1	02/11/18	1234543142
12345	N	26	1	02/13/18	1234543144
12345	N	27	1	02/16/18	1234543147
12345	N	.	0	02/16/18	1234543147
12345	N	28	1	02/18/18	1234543149
12345	N	29	1	02/21/18	1234543152
12345	N	.	0	02/21/18	1234543152
;

data want;
if _n_=1 then do;
if 0 then set old;
   dcl hash H (dataset:'old') ;
   h.definekey  ('MEMNO','count','UNIQDOS','BEGDATE','COMBO');
   h.definedone () ;
end;
set new;
if h.check()=0 then retract='Y';
else retract='N';
run;
wheddingsjr
Pyrite | Level 9

Hi again

 

I tested this and it isnt working. Its ok though. I automated a pretty good amount of the process (took it from a 3-4 hour process down to 20 mins.) So manually going through the monthly reports for dupes wont be that bad, I have done it that way for 2 years now so its no big deal. But thanks for all of your help.

novinosrin
Tourmaline | Level 20

@wheddingsjr Ok That makes me feel bad. One last try please at least for the sake of my satisfaction:-

 

 

proc sql;
 create table want as
 select *, 'Y' as retract from
   (select *  from new
         intersect
         select * from old)
   union
select *,'N' as retract from (select *  from new
         except
         select * from old)
order by memno, begdate;
quit;

 

wheddingsjr
Pyrite | Level 9

Novinosrin

 

It looked as if it wanted to work..lol...but I got a ton of either

 

         ERROR: Column 20 from the first contributor of INTERSECT is not the same type as its counterpart from the second.

or
        ERROR: Numeric expression requires a numeric format.

novinosrin
Tourmaline | Level 20

@wheddingsjr Ok, Now it's time for us to know the data and metadata(data about data) . Can you find out 

 

1. do both the old and new tables have the same variables and types(char and numeric)

2. you could run proc contents to find that out

 

wheddingsjr
Pyrite | Level 9

Novinosrins

 

It was mentioned to me that instead of importing the previous months excel report, that I just run that report in SAS so I dont have to import. I did that and now I have just a few lines that give the error previously mentioned and i think I know why. In the middle stages of my code I add a few columns to my final table. I think I need to do those steps AFTER I do the steps that you recommended. I am going to give that a shot and let you know how that works out.

 

Thanks again

novinosrin
Tourmaline | Level 20

@wheddingsjr Ok, some progress after all.

 

I reiterate the logic in both approaches:

 

1.Hash  : takes in all vars as key i.e each record makes in unique

2. sql : set operators  same as above

 

In essence we compare record to record

hash.check=0 finds it  as the same as intersect when we choose all vars as key

 

Therefore,if one var is not the same (any additional), both approaches will fail. 

wheddingsjr
Pyrite | Level 9

@novinosrinAll I have to say is that you are a genius. Everything worked out absolutely perfect. Because I just wrote the previous months data into my code and moved your union, intersect, except code further up in the script, I was able to include my add-on columns later in the process. I manually did this at the beginning of the month and just compared it with the new automated process and they are identical. Thank you very very much for your patience and your desire to help me. It is much appreciated.

wheddingsjr
Pyrite | Level 9
What took me 4 hours to do a few weeks ago was just reduced to 20 minutes. That's amazing!!!

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 23 replies
  • 2336 views
  • 4 likes
  • 3 in conversation