BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
imcbczm
Obsidian | Level 7

Hi,

 

I have a dataset but want to create a new one which will repeat two variables (here are 'ID', 'dx') for each unique value in the third variable (here is 'test'). I guess it would be simple but I could not get it done at this moment.

 

For expample:

------------------

data have;

input Id dx $ test;

datalines;

8210 no 1

8210 yes 20

8310 no 30

8321 yes 11

8321 no 2

;

run;

---------------

I want:

----------

8210 no 1

8210 no 20

8210 no 30

8210 yes 1

8210 yes 20

8210 yes 30

8210 no 1

8210 no 20

8210 no 30

8321 yes 11

8321 yes 2

8321 no 11

8321 no 2

-----------------

 

Thank you for help!

Z

 

1 ACCEPTED SOLUTION

Accepted Solutions
hbi
Quartz | Level 8 hbi
Quartz | Level 8

Below is a PROC SQL version of the same. I think "want_version1" is what you are seeking.

 

Comment: I included "want_version2", but it removes duplicates and returns only distinct observations. 

 

data have;
input ID dx $ test;
datalines;
8210 no   1
8210 yes 20
8210 no  30
8321 yes 11
8321 no   2
;
run;

/* want_version1 will give you exactly what you want */
PROC SQL;
  CREATE TABLE want_version1 AS 
  SELECT FirstQuery.ID
       , dx
       , test
  FROM (SELECT ID, dx
        FROM have) AS FirstQuery
  INNER JOIN (SELECT ID, test
              FROM have) AS SecondQuery
    ON FirstQuery.ID = SecondQuery.ID;
QUIT;

/* want_version2 is a different solution that removes duplicates */
PROC SQL;
  CREATE TABLE want_version2 AS 
  SELECT DISTINCT FirstQuery.ID
       , dx
       , test
  FROM (SELECT DISTINCT ID, dx
        FROM have) AS FirstQuery
  INNER JOIN (SELECT DISTINCT ID, test
              FROM have) AS SecondQuery
    ON FirstQuery.ID = SecondQuery.ID;
QUIT;

View solution in original post

9 REPLIES 9
billfish
Quartz | Level 8

for the results you want , you have a type in

 

8310 no 30

 

It should be

 

8210 no 30.

 

A solution amongst others (I assume the input dataset is sorted w/r/t the variable id):

 

data have;

input Id dx $ test;

datalines;

8210 no 1

8210 yes 20

8210 no 30

8321 yes 11

8321 no 2

;

run;

 

data want;

do until (last.id);

set have;

by id;

output;

end;

do until (last.id);

set have;

by id;

output;

end;

do until (last.id);

set have;

by id;

output;

end;

run;

 

proc print data=want; run;

imcbczm
Obsidian | Level 7

I have tried the similar code. It seems not right. BTW, I may have many unique values in 'test'.

 

thanks,

Z

 

hbi
Quartz | Level 8 hbi
Quartz | Level 8

Below is a PROC SQL version of the same. I think "want_version1" is what you are seeking.

 

Comment: I included "want_version2", but it removes duplicates and returns only distinct observations. 

 

data have;
input ID dx $ test;
datalines;
8210 no   1
8210 yes 20
8210 no  30
8321 yes 11
8321 no   2
;
run;

/* want_version1 will give you exactly what you want */
PROC SQL;
  CREATE TABLE want_version1 AS 
  SELECT FirstQuery.ID
       , dx
       , test
  FROM (SELECT ID, dx
        FROM have) AS FirstQuery
  INNER JOIN (SELECT ID, test
              FROM have) AS SecondQuery
    ON FirstQuery.ID = SecondQuery.ID;
QUIT;

/* want_version2 is a different solution that removes duplicates */
PROC SQL;
  CREATE TABLE want_version2 AS 
  SELECT DISTINCT FirstQuery.ID
       , dx
       , test
  FROM (SELECT DISTINCT ID, dx
        FROM have) AS FirstQuery
  INNER JOIN (SELECT DISTINCT ID, test
              FROM have) AS SecondQuery
    ON FirstQuery.ID = SecondQuery.ID;
QUIT;
imcbczm
Obsidian | Level 7
Great, thank you so much!
Astounding
PROC Star

In a way, this is an all combinations join so it's probably easier for SQL to handle:

 

proc sql;

create table want as select a.id, a.dx, b.test from have a, have b where a.id=b.id;

quit;

 

Good luck.

slchen
Lapis Lazuli | Level 10


data want;
set have end=last ;
output;
if last then do;
do i=1 to nobs;
set have point=i nobs=nobs;
if dx='no' then dx='yes';
else dx='no';
output;
end;
stop;
end;
run;

imcbczm
Obsidian | Level 7
sorry, not really what i want.
Steelers_In_DC
Barite | Level 11

Here is another solutions:

 

data have;
input Id dx $ test;
datalines;
8210 no 1
8210 yes 20
8210 no 30
8321 yes 11
8321 no 2
;

proc sql;
create table want as
select /*distinct?*/
a.id,
a.dx,
b.test
from have a full join
       have b on
a.id = b.id;

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 9 replies
  • 2048 views
  • 1 like
  • 6 in conversation