BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
imcbczm
Obsidian | Level 7

Hi,

 

I have a dataset but want to create a new one which will repeat two variables (here are 'ID', 'dx') for each unique value in the third variable (here is 'test'). I guess it would be simple but I could not get it done at this moment.

 

For expample:

------------------

data have;

input Id dx $ test;

datalines;

8210 no 1

8210 yes 20

8310 no 30

8321 yes 11

8321 no 2

;

run;

---------------

I want:

----------

8210 no 1

8210 no 20

8210 no 30

8210 yes 1

8210 yes 20

8210 yes 30

8210 no 1

8210 no 20

8210 no 30

8321 yes 11

8321 yes 2

8321 no 11

8321 no 2

-----------------

 

Thank you for help!

Z

 

1 ACCEPTED SOLUTION

Accepted Solutions
hbi
Quartz | Level 8 hbi
Quartz | Level 8

Below is a PROC SQL version of the same. I think "want_version1" is what you are seeking.

 

Comment: I included "want_version2", but it removes duplicates and returns only distinct observations. 

 

data have;
input ID dx $ test;
datalines;
8210 no   1
8210 yes 20
8210 no  30
8321 yes 11
8321 no   2
;
run;

/* want_version1 will give you exactly what you want */
PROC SQL;
  CREATE TABLE want_version1 AS 
  SELECT FirstQuery.ID
       , dx
       , test
  FROM (SELECT ID, dx
        FROM have) AS FirstQuery
  INNER JOIN (SELECT ID, test
              FROM have) AS SecondQuery
    ON FirstQuery.ID = SecondQuery.ID;
QUIT;

/* want_version2 is a different solution that removes duplicates */
PROC SQL;
  CREATE TABLE want_version2 AS 
  SELECT DISTINCT FirstQuery.ID
       , dx
       , test
  FROM (SELECT DISTINCT ID, dx
        FROM have) AS FirstQuery
  INNER JOIN (SELECT DISTINCT ID, test
              FROM have) AS SecondQuery
    ON FirstQuery.ID = SecondQuery.ID;
QUIT;

View solution in original post

9 REPLIES 9
billfish
Quartz | Level 8

for the results you want , you have a type in

 

8310 no 30

 

It should be

 

8210 no 30.

 

A solution amongst others (I assume the input dataset is sorted w/r/t the variable id):

 

data have;

input Id dx $ test;

datalines;

8210 no 1

8210 yes 20

8210 no 30

8321 yes 11

8321 no 2

;

run;

 

data want;

do until (last.id);

set have;

by id;

output;

end;

do until (last.id);

set have;

by id;

output;

end;

do until (last.id);

set have;

by id;

output;

end;

run;

 

proc print data=want; run;

imcbczm
Obsidian | Level 7

I have tried the similar code. It seems not right. BTW, I may have many unique values in 'test'.

 

thanks,

Z

 

hbi
Quartz | Level 8 hbi
Quartz | Level 8

Below is a PROC SQL version of the same. I think "want_version1" is what you are seeking.

 

Comment: I included "want_version2", but it removes duplicates and returns only distinct observations. 

 

data have;
input ID dx $ test;
datalines;
8210 no   1
8210 yes 20
8210 no  30
8321 yes 11
8321 no   2
;
run;

/* want_version1 will give you exactly what you want */
PROC SQL;
  CREATE TABLE want_version1 AS 
  SELECT FirstQuery.ID
       , dx
       , test
  FROM (SELECT ID, dx
        FROM have) AS FirstQuery
  INNER JOIN (SELECT ID, test
              FROM have) AS SecondQuery
    ON FirstQuery.ID = SecondQuery.ID;
QUIT;

/* want_version2 is a different solution that removes duplicates */
PROC SQL;
  CREATE TABLE want_version2 AS 
  SELECT DISTINCT FirstQuery.ID
       , dx
       , test
  FROM (SELECT DISTINCT ID, dx
        FROM have) AS FirstQuery
  INNER JOIN (SELECT DISTINCT ID, test
              FROM have) AS SecondQuery
    ON FirstQuery.ID = SecondQuery.ID;
QUIT;
imcbczm
Obsidian | Level 7
Great, thank you so much!
Astounding
PROC Star

In a way, this is an all combinations join so it's probably easier for SQL to handle:

 

proc sql;

create table want as select a.id, a.dx, b.test from have a, have b where a.id=b.id;

quit;

 

Good luck.

slchen
Lapis Lazuli | Level 10


data want;
set have end=last ;
output;
if last then do;
do i=1 to nobs;
set have point=i nobs=nobs;
if dx='no' then dx='yes';
else dx='no';
output;
end;
stop;
end;
run;

imcbczm
Obsidian | Level 7
sorry, not really what i want.
Steelers_In_DC
Barite | Level 11

Here is another solutions:

 

data have;
input Id dx $ test;
datalines;
8210 no 1
8210 yes 20
8210 no 30
8321 yes 11
8321 no 2
;

proc sql;
create table want as
select /*distinct?*/
a.id,
a.dx,
b.test
from have a full join
       have b on
a.id = b.id;

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 9 replies
  • 3418 views
  • 1 like
  • 6 in conversation