DATA Step, Macro, Functions and more

1 to 1 & 1 to many w/2 variables (repeating)

Accepted Solution Solved
Reply
Contributor
Posts: 52
Accepted Solution

1 to 1 & 1 to many w/2 variables (repeating)

SAS Enterprise Guide

Version 4.3 (4.3.0.11182)Hot fix 14

Windows XP

Service Pack 3

SAS 9.2

/*

I NEED TO CREATE A DATA STEP THAT WILL PARSE TWO VARIABLES LOOKING

FOR A ONE TO ONE RELATIONSHIP EVEN IF THE DATA IS REPEATED IN DIFFERENT

COLUMNS(ROWS) EXAMPLE C 3 IS CONSIDERED ONE TO ONE. ANY TIME A DISTINCT

NPI NUMBER SHARES THE SAME TAX_ID - THIS IS CONSIDERED A ONE TO MANY

RELATIONSHIP AND SHOULD BE STORED IN THE TAX_ID_NPI_ONE_2_MANY TABLE.

ANYTIME AN NPI NUMBER IS REPEATED SHARING THE SAME TAX_ID NUMBER THIS IS

CONSIDER ONE TO ONE RELATIONSHIP AND SHOULD BE STORED IN TAX_ID_NPI_ONE_2_ONE.

I'M NOT SURE IF THE FIRST/LAST FUNCTION WITH AN IF WAS THE CORRECT ROUTE.

I AM NEW TO THIS SAS THING SO ANY HELP IS GREATLY APPRECIATED. :smileyconfused:

*/

data TEST_DATA;

     input TAX_ID $ NPI;

     datalines;

A 1

B 2

C 3

C 3

C 3

D 4

D 5

D 6

E 3

F 4

F 4

G 5

G 6

;

data TAX_ID_NPI_ONE_2_ONE TAX_ID_NPI_ONE_2_MANY;

     set WORK.TEST_DATA;

     keep     TAX_ID

           NPI;

     by TAX_ID NPI;

     if ((FIRST.TAX_ID) AND (FIRST.NPI OR LAST.NPI))

           OR ((LAST.TAX_ID) AND (FIRST.NPI OR LAST.NPI)) THEN

           OUTPUT     TAX_ID_NPI_ONE_2_ONE;

     ELSE OUTPUT TAX_ID_NPI_ONE_2_MANY;

run;

/*

RESULTS:

TAX_ID_NPI_ONE_2_ONE

A    1 - CORRECT

B    2 - CORRECT

C    3 - CORRECT

C    3 - CORRECT

       - INCORRECT MISSING THIRD C3

D    4 - INCORRECT

D    6 - INCORRECT

E    3 - CORRECT

F    4 - CORRECT

F    4 - CORRECT

G    5 - INCORRECT HAS ONE TO MANY RELATIONSHIP

G    6 - INCORRECT HAS ONE TO MANY RELATIONSHIP

TAX_ID_NPI_ONE_2_MANY

- MISSING D 4

- MISSING D 6

- MISSING G 5

- MISSING G 6

C    3 - INCORRECT HAS ONE TO ONE RELATIONSHIP

D    5 - CORRECT

*/

Attachment

Accepted Solutions
Solution
‎08-06-2013 09:17 AM
Super User
Super User
Posts: 6,502

Re: 1 to 1 & 1 to many w/2 variables (repeating)

You need some way to count the number of distinct NPI values for each TAX_ID.  You could do it with PROC SQL, but here is a method that takes advantage of the fact that your data is already sorted.

data test_data;

  input tax_id $ npi @@;

datalines;

A 1 B 2 C 3 C 3 C 3 D 4 D 5 D 6 E 3 F 4 F 4 G 5 G 6

;;;;

data tax_id_npi_one_2_one tax_id_npi_one_2_many ;

  do until (last.tax_id);

    set test_data;

    by tax_id npi;

    n_npi=sum(n_npi,first.npi);

  end;

  do until (last.tax_id);

    set test_data;

    by tax_id npi;

    if n_npi=1 then output tax_id_npi_one_2_one;

    else output tax_id_npi_one_2_many ;

  end;

run;

TAX_ID_NPI_ONE_2_ONE;

tax_id=A npi=1 n_npi=1

tax_id=B npi=2 n_npi=1

tax_id=C npi=3 n_npi=1

tax_id=C npi=3 n_npi=1

tax_id=C npi=3 n_npi=1

tax_id=E npi=3 n_npi=1

tax_id=F npi=4 n_npi=1

tax_id=F npi=4 n_npi=1

TAX_ID_NPI_ONE_2_MANY;

tax_id=D npi=4 n_npi=3

tax_id=D npi=5 n_npi=3

tax_id=D npi=6 n_npi=3

tax_id=G npi=5 n_npi=2

tax_id=G npi=6 n_npi=2

View solution in original post


All Replies
Solution
‎08-06-2013 09:17 AM
Super User
Super User
Posts: 6,502

Re: 1 to 1 & 1 to many w/2 variables (repeating)

You need some way to count the number of distinct NPI values for each TAX_ID.  You could do it with PROC SQL, but here is a method that takes advantage of the fact that your data is already sorted.

data test_data;

  input tax_id $ npi @@;

datalines;

A 1 B 2 C 3 C 3 C 3 D 4 D 5 D 6 E 3 F 4 F 4 G 5 G 6

;;;;

data tax_id_npi_one_2_one tax_id_npi_one_2_many ;

  do until (last.tax_id);

    set test_data;

    by tax_id npi;

    n_npi=sum(n_npi,first.npi);

  end;

  do until (last.tax_id);

    set test_data;

    by tax_id npi;

    if n_npi=1 then output tax_id_npi_one_2_one;

    else output tax_id_npi_one_2_many ;

  end;

run;

TAX_ID_NPI_ONE_2_ONE;

tax_id=A npi=1 n_npi=1

tax_id=B npi=2 n_npi=1

tax_id=C npi=3 n_npi=1

tax_id=C npi=3 n_npi=1

tax_id=C npi=3 n_npi=1

tax_id=E npi=3 n_npi=1

tax_id=F npi=4 n_npi=1

tax_id=F npi=4 n_npi=1

TAX_ID_NPI_ONE_2_MANY;

tax_id=D npi=4 n_npi=3

tax_id=D npi=5 n_npi=3

tax_id=D npi=6 n_npi=3

tax_id=G npi=5 n_npi=2

tax_id=G npi=6 n_npi=2

Super User
Super User
Posts: 6,502

Re: 1 to 1 & 1 to many w/2 variables (repeating)

Here is an SQL solution. Note that SQL cannot generate multiple output data sets in one step the way that a data step can.

proc sql ;

  create table TAX_ID_NPI_ONE_2_ONE as

    select * from test_data

    group by tax_id

    having count(distinct npi) = 1

  ;

  create table TAX_ID_NPI_ONE_2_MANY as

    select * from test_data

    group by tax_id

    having count(distinct npi) > 1

  ;

quit;

Contributor
Posts: 52

Re: 1 to 1 & 1 to many w/2 variables (repeating)

Very helpful and when I re-read and validated the code - your two examples made it easy to understand and re-code accordinlgy.  I actually learned something instead of copying and editing your code.  Smiley Happy

Contributor
Posts: 52

Re: 1 to 1 & 1 to many w/2 variables (repeating)

How would I include a third, fourth, etc variable if I was looking at a different Ex: Provider ID that shared the same TAX-ID & NPI.  C 3 121 and C 3 122 are the same excluding the third column.  Same tax_id, same npi, but he has a different provider id.  If I wanted to distinguish the variance would I use a CASE within the same data set?

A 1 999

B 2 111

C 3 121

C 3 121

C 3 122

D 4 555

D 5 567

D 6 568

E 3 678

F 4 679

F 4 679

G 5 680

G 6 681

Regular Contributor
Posts: 195

Re: 1 to 1 & 1 to many w/2 variables (repeating)

How about following one...

proc sql;

   create table want as

   select tax_id,max(npi) as max,min(npi) as min,

          case

                  when calculated max EQ calculated min then "one_2_one"

                    else "one_2_many"

              end as relation

   from have as a

   group by tax_id;

quit;

proc sql;

   create table one_2_one as

   select a.*

   from have as a, want as b

   where a.tax_id = b.tax_id and

         relation = "one_2_one";

   create table one_2_many as

   select a.*

   from have as a, want as b

   where a.tax_id = b.tax_id and

         relation = "one_2_many";

quit;

-Urvish

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 5 replies
  • 273 views
  • 10 likes
  • 3 in conversation