DATA Step, Macro, Functions and more

How to create repeated observations of two variables for each value within ID

Accepted Solution Solved
Reply
Contributor
Posts: 40
Accepted Solution

How to create repeated observations of two variables for each value within ID

Hi,

 

I have a dataset but want to create a new one which will repeat two variables (here are 'ID', 'dx') for each unique value in the third variable (here is 'test'). I guess it would be simple but I could not get it done at this moment.

 

For expample:

------------------

data have;

input Id dx $ test;

datalines;

8210 no 1

8210 yes 20

8310 no 30

8321 yes 11

8321 no 2

;

run;

---------------

I want:

----------

8210 no 1

8210 no 20

8210 no 30

8210 yes 1

8210 yes 20

8210 yes 30

8210 no 1

8210 no 20

8210 no 30

8321 yes 11

8321 yes 2

8321 no 11

8321 no 2

-----------------

 

Thank you for help!

Z

 


Accepted Solutions
Solution
‎11-10-2015 07:16 PM
Contributor hbi
Contributor
Posts: 66

Re: How to create repeated observations of two variables for each value within ID

Below is a PROC SQL version of the same. I think "want_version1" is what you are seeking.

 

Comment: I included "want_version2", but it removes duplicates and returns only distinct observations. 

 

data have;
input ID dx $ test;
datalines;
8210 no   1
8210 yes 20
8210 no  30
8321 yes 11
8321 no   2
;
run;

/* want_version1 will give you exactly what you want */
PROC SQL;
  CREATE TABLE want_version1 AS 
  SELECT FirstQuery.ID
       , dx
       , test
  FROM (SELECT ID, dx
        FROM have) AS FirstQuery
  INNER JOIN (SELECT ID, test
              FROM have) AS SecondQuery
    ON FirstQuery.ID = SecondQuery.ID;
QUIT;

/* want_version2 is a different solution that removes duplicates */
PROC SQL;
  CREATE TABLE want_version2 AS 
  SELECT DISTINCT FirstQuery.ID
       , dx
       , test
  FROM (SELECT DISTINCT ID, dx
        FROM have) AS FirstQuery
  INNER JOIN (SELECT DISTINCT ID, test
              FROM have) AS SecondQuery
    ON FirstQuery.ID = SecondQuery.ID;
QUIT;

View solution in original post


All Replies
Contributor
Posts: 52

Re: How to create repeated observations of two variables for each value within ID

for the results you want , you have a type in

 

8310 no 30

 

It should be

 

8210 no 30.

 

A solution amongst others (I assume the input dataset is sorted w/r/t the variable id):

 

data have;

input Id dx $ test;

datalines;

8210 no 1

8210 yes 20

8210 no 30

8321 yes 11

8321 no 2

;

run;

 

data want;

do until (last.id);

set have;

by id;

output;

end;

do until (last.id);

set have;

by id;

output;

end;

do until (last.id);

set have;

by id;

output;

end;

run;

 

proc print data=want; run;

Contributor
Posts: 40

Re: How to create repeated observations of two variables for each value within ID

I have tried the similar code. It seems not right. BTW, I may have many unique values in 'test'.

 

thanks,

Z

 

Solution
‎11-10-2015 07:16 PM
Contributor hbi
Contributor
Posts: 66

Re: How to create repeated observations of two variables for each value within ID

Below is a PROC SQL version of the same. I think "want_version1" is what you are seeking.

 

Comment: I included "want_version2", but it removes duplicates and returns only distinct observations. 

 

data have;
input ID dx $ test;
datalines;
8210 no   1
8210 yes 20
8210 no  30
8321 yes 11
8321 no   2
;
run;

/* want_version1 will give you exactly what you want */
PROC SQL;
  CREATE TABLE want_version1 AS 
  SELECT FirstQuery.ID
       , dx
       , test
  FROM (SELECT ID, dx
        FROM have) AS FirstQuery
  INNER JOIN (SELECT ID, test
              FROM have) AS SecondQuery
    ON FirstQuery.ID = SecondQuery.ID;
QUIT;

/* want_version2 is a different solution that removes duplicates */
PROC SQL;
  CREATE TABLE want_version2 AS 
  SELECT DISTINCT FirstQuery.ID
       , dx
       , test
  FROM (SELECT DISTINCT ID, dx
        FROM have) AS FirstQuery
  INNER JOIN (SELECT DISTINCT ID, test
              FROM have) AS SecondQuery
    ON FirstQuery.ID = SecondQuery.ID;
QUIT;
Contributor
Posts: 40

Re: How to create repeated observations of two variables for each value within ID

Great, thank you so much!
Super User
Posts: 5,503

Re: How to create repeated observations of two variables for each value within ID

In a way, this is an all combinations join so it's probably easier for SQL to handle:

 

proc sql;

create table want as select a.id, a.dx, b.test from have a, have b where a.id=b.id;

quit;

 

Good luck.

Contributor
Posts: 40

Re: How to create repeated observations of two variables for each value within ID

Posted in reply to Astounding
It works, thank you!
Super Contributor
Posts: 275

Re: How to create repeated observations of two variables for each value within ID


data want;
set have end=last ;
output;
if last then do;
do i=1 to nobs;
set have point=i nobs=nobs;
if dx='no' then dx='yes';
else dx='no';
output;
end;
stop;
end;
run;

Contributor
Posts: 40

Re: How to create repeated observations of two variables for each value within ID

sorry, not really what i want.
Valued Guide
Posts: 860

Re: How to create repeated observations of two variables for each value within ID

[ Edited ]

Here is another solutions:

 

data have;
input Id dx $ test;
datalines;
8210 no 1
8210 yes 20
8210 no 30
8321 yes 11
8321 no 2
;

proc sql;
create table want as
select /*distinct?*/
a.id,
a.dx,
b.test
from have a full join
       have b on
a.id = b.id;

🔒 This topic is solved and locked.

Need further help from the community? Please ask a new question.

Discussion stats
  • 9 replies
  • 292 views
  • 1 like
  • 6 in conversation