BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
justicetobias
Calcite | Level 5

Hi,

I am trying to write a program that is looking at a table with an id,  date, provider number, and procedure number.

I want to write a program that identifies when there are rows that have the same (or duplicate) id AND date AND provider number, then output them into a new table. 

 

Any idea how that may be possible?

1 ACCEPTED SOLUTION

Accepted Solutions
mklangley
Lapis Lazuli | Level 10

It looks like you're referring to the alias have in your join condition, though your join uses the aliases fake_partB and c.

 

proc sql;
 87             create table want as select fake_partB.*,c.count
 88                 from fake_partB left join _counts_ c
 89                 on have.bene=c.bene and have.srvc_dt=c.srvc_dt and have.prvdr_npi=c.prvdr_npi;
 ERROR: Unresolved reference to table/correlation name have.
 ERROR: Unresolved reference to table/correlation name have.
 ERROR: Unresolved reference to table/correlation name have.
 ERROR: Expression using equals (=) has components that are of different data types.
 ERROR: Expression using equals (=) has components that are of different data types.
 ERROR: Expression using equals (=) has components that are of different data types.
 NOTE: PROC SQL set option NOEXEC and will continue to check the syntax of statements.
 90         quit;

If you use have as the alias for fake_partb, that should work:

from fake_partB have left join _counts_ c

 

View solution in original post

13 REPLIES 13
PaigeMiller
Diamond | Level 26

@justicetobias wrote:

Hi,

I am trying to write a program that is looking at a table with an id,  date, provider number, and procedure number.

I want to write a program that identifies when there are rows that have the same id, date, and provider number and output them into a new table. 


It would definitely help if you provided us a portion of your data (as working SAS data step code).


What should we do if it is not a duplicate? What about triplicates?

 

Why do you need them in a separate data set? That's usually not necessary and causes extra work.

 

Here is code (untested) that identifies duplicates, triplicates, etc.

 

proc freq data=have;
    table id*date*provider_number/noprint list out=_counts_;
run;

proc sql;
    create table want as select have.*,c.count
        from have left join _counts_ c
        on have.id=c.id and have.date=c.date and have.provider_number=c.provider_number;
quit;

 

--
Paige Miller
justicetobias
Calcite | Level 5

Than you so much for your help.

A portion of the data would be:

ID       Procedure    Provider      Date

8451002444412-Dec-19
2611005444113-Dec-19
3261006444512-Dec-19
2321004444412-Dec-19
2521009444913-Dec-19
9481003444613-Dec-19
6751005444713-Dec-19
1541001444112-Dec-19
8521008444712-Dec-19
9541006444012-Dec-19
2321005444412-Dec-19
2321005444512-Dec-19
1541001444113-Dec-19
9481002444613-Dec-19
6751006444713-Dec-19
2611006444112-Dec-19

 

I don't need a new output table, but the only rows I want are ones that have and type of repeat (duplicate, triplicate, etc.) with id, provider, and date.

 

The code you provider was super helpful and identified the correct rows! But instead of an output that told me the rows, I would like to only have those rows left in the table. 

PaigeMiller
Diamond | Level 26

So when you go to use this table next, for example in PROC MEANS or whatever, you can use something like this

 

proc means data=want(where=(count>1));
--
Paige Miller
justicetobias
Calcite | Level 5

Thank you! do you know why these error messages would appear?

 

ERROR: Unresolved reference to table/correlation name have.
ERROR: Expression using equals (=) has components that are of different data types.
PaigeMiller
Diamond | Level 26

Show us the log for these steps, the code, the warning messages, the error messages (in other words, don't chop anything out) by copying the log as text (not a picture) and pasting it into the window that appears when you click on the </> icon.

--
Paige Miller
justicetobias
Calcite | Level 5
1          OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
72        
73         proc import datafile  =  '/home/u49170471/sasuser.v94/fake_partB.xlsx'
74          out  =  fake_partB
75          dbms  =  xlsx
76          replace
77          ;
78         run;
 
NOTE: The import data set has 16 observations and 4 variables.
NOTE: WORK.FAKE_PARTB data set was successfully created.
NOTE: PROCEDURE IMPORT used (Total process time):
       real time           0.00 seconds
       user cpu time       0.00 seconds
       system cpu time     0.01 seconds
       memory              2743.15k
       OS Memory           33704.00k
       Timestamp           06/17/2020 01:54:02 PM
       Step Count                        226  Switch Count  4
       Page Faults                       0
       Page Reclaims                     655
       Page Swaps                        0
       Voluntary Context Switches        31
       Involuntary Context Switches      0
       Block Input Operations            0
       Block Output Operations           264
      
 
79        
80        
81        
82         proc freq data=fake_partB;
83             table bene*srvc_dt*prvdr_npi/noprint list out=_counts_;
84         run;
 
NOTE: There were 16 observations read from the data set WORK.FAKE_PARTB.
NOTE: The data set WORK._COUNTS_ has 13 observations and 5 variables.
NOTE: PROCEDURE FREQ used (Total process time):
       real time           0.00 seconds
       user cpu time       0.00 seconds
       system cpu time     0.00 seconds
       memory              1853.75k
       OS Memory           33212.00k
       Timestamp           06/17/2020 01:54:02 PM
       Step Count                        227  Switch Count  10
       Page Faults                       0
       Page Reclaims                     405
       Page Swaps                        0
       Voluntary Context Switches        46
       Involuntary Context Switches      0
       Block Input Operations            0
       Block Output Operations           1320
      
 
85        
86         proc sql;
87             create table want as select fake_partB.*,c.count
88                 from duplicates left join _counts_ c
89                 on have.bene=c.bene and have.srvc_dt=c.srvc_dt and have.prvdr_npi=c.prvdr_npi;
ERROR: Unresolved reference to table/correlation name have.
ERROR: Unresolved reference to table/correlation name have.
ERROR: Unresolved reference to table/correlation name have.
ERROR: Could not expand fake_partB.*, correlation name not found.
ERROR: Expression using equals (=) has components that are of different data types.
ERROR: Expression using equals (=) has components that are of different data types.
ERROR: Expression using equals (=) has components that are of different data types.
ERROR: The following columns were not found in the contributing tables: fake_partB.
NOTE: PROC SQL set option NOEXEC and will continue to check the syntax of statements.
90         quit;
NOTE: The SAS System stopped processing this step because of errors.
NOTE: PROCEDURE SQL used (Total process time):
       real time           0.00 seconds
       user cpu time       0.00 seconds
       system cpu time     0.00 seconds
       memory              5970.12k
       OS Memory           37296.00k
       Timestamp           06/17/2020 01:54:02 PM
       Step Count                        228  Switch Count  0
       Page Faults                       0
       Page Reclaims                     86
       Page Swaps                        0
       Voluntary Context Switches        0
       Involuntary Context Switches      0
       Block Input Operations            0
       Block Output Operations           0
      
91        
92        
93        
94        
95         OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
107       
PaigeMiller
Diamond | Level 26

...  by copying the log as text (not a picture) and pasting it into the window that appears when you click on the </> icon.

 

You forgot that step.

--
Paige Miller
justicetobias
Calcite | Level 5
1          OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
 72         
 73         proc import datafile  =  '/home/u49170471/sasuser.v94/fake_partB.xlsx'
 74          out  =  fake_partB
 75          dbms  =  xlsx
 76          replace
 77          ;
 78         run;
 
 NOTE: The import data set has 16 observations and 4 variables.
 NOTE: WORK.FAKE_PARTB data set was successfully created.
 NOTE: PROCEDURE IMPORT used (Total process time):
       real time           0.00 seconds
       user cpu time       0.01 seconds
       system cpu time     0.00 seconds
       memory              2742.68k
       OS Memory           32168.00k
       Timestamp           06/17/2020 03:01:19 PM
       Step Count                        32  Switch Count  4
       Page Faults                       0
       Page Reclaims                     666
       Page Swaps                        0
       Voluntary Context Switches        30
       Involuntary Context Switches      0
       Block Input Operations            0
       Block Output Operations           264
       
 
 79         
 80         
 81         
 82         proc freq data=fake_partB;
 83             table bene*srvc_dt*prvdr_npi/noprint list out=_counts_;
 84         run;
 
 NOTE: There were 16 observations read from the data set WORK.FAKE_PARTB.
 NOTE: The data set WORK._COUNTS_ has 13 observations and 5 variables.
 NOTE: PROCEDURE FREQ used (Total process time):
       real time           0.00 seconds
       user cpu time       0.00 seconds
       system cpu time     0.00 seconds
       memory              1965.90k
       OS Memory           31676.00k
       Timestamp           06/17/2020 03:01:19 PM
       Step Count                        33  Switch Count  10
       Page Faults                       0
       Page Reclaims                     418
       Page Swaps                        0
       Voluntary Context Switches        45
       Involuntary Context Switches      0
       Block Input Operations            0
       Block Output Operations           1320
       
 
 85         
 86         proc sql;
 87             create table want as select fake_partB.*,c.count
 88                 from fake_partB left join _counts_ c
 89                 on have.bene=c.bene and have.srvc_dt=c.srvc_dt and have.prvdr_npi=c.prvdr_npi;
 ERROR: Unresolved reference to table/correlation name have.
 ERROR: Unresolved reference to table/correlation name have.
 ERROR: Unresolved reference to table/correlation name have.
 ERROR: Expression using equals (=) has components that are of different data types.
 ERROR: Expression using equals (=) has components that are of different data types.
 ERROR: Expression using equals (=) has components that are of different data types.
 NOTE: PROC SQL set option NOEXEC and will continue to check the syntax of statements.
 90         quit;
 NOTE: The SAS System stopped processing this step because of errors.
 NOTE: PROCEDURE SQL used (Total process time):
       real time           0.00 seconds
       user cpu time       0.00 seconds
       system cpu time     0.00 seconds
       memory              5862.12k
       OS Memory           35760.00k
       Timestamp           06/17/2020 03:01:19 PM
       Step Count                        34  Switch Count  0
       Page Faults                       0
       Page Reclaims                     89
       Page Swaps                        0
       Voluntary Context Switches        0
       Involuntary Context Switches      0
       Block Input Operations            0
       Block Output Operations           0
       
 91         
 92         
 93         
 94         
 95         OPTIONS NONOTES NOSTIMER NOSOURCE NOSYNTAXCHECK;
 107        
 
User: u49170471

My apologies. Super new. Still figuring things out. 

mklangley
Lapis Lazuli | Level 10

It looks like you're referring to the alias have in your join condition, though your join uses the aliases fake_partB and c.

 

proc sql;
 87             create table want as select fake_partB.*,c.count
 88                 from fake_partB left join _counts_ c
 89                 on have.bene=c.bene and have.srvc_dt=c.srvc_dt and have.prvdr_npi=c.prvdr_npi;
 ERROR: Unresolved reference to table/correlation name have.
 ERROR: Unresolved reference to table/correlation name have.
 ERROR: Unresolved reference to table/correlation name have.
 ERROR: Expression using equals (=) has components that are of different data types.
 ERROR: Expression using equals (=) has components that are of different data types.
 ERROR: Expression using equals (=) has components that are of different data types.
 NOTE: PROC SQL set option NOEXEC and will continue to check the syntax of statements.
 90         quit;

If you use have as the alias for fake_partb, that should work:

from fake_partB have left join _counts_ c

 

PaigeMiller
Diamond | Level 26

Yes, @mklangley hits the nail on the head. You have to change all occurrences of HAVE to FAKE_PARTB

--
Paige Miller
justicetobias
Calcite | Level 5
this is perfect! thank you and @PaigeMiller so very much!
smantha
Lapis Lazuli | Level 10
proc sort data = have; 
by id date provider_number;
data want;
set have;
by id date provider_number;
if first.id and last.id then do;
 end;
else do;
 output;
end;
run;
smantha
Lapis Lazuli | Level 10

My bad the solution will be like

proc sort data = have; 
by id date provider_number;
data want;
set have;
by id date provider_number;
if first.provider_number and last.provider_number then do;
 end;
else do;
 output;
end;
run;

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 13 replies
  • 2351 views
  • 1 like
  • 4 in conversation