BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
mkt_apprentice
Obsidian | Level 7

Hello SAS communities,

 

I have a data set like this: 

id   day  drinks

1     2      1

1     3      3

1     5      4

2     3      1

3     2      1

3     4      2

 

The study lasts 5 days, and days a person doesn't have any drink (drink = 0) are not reported in the data. How could I create these observations (drink = 0 in the days the person doesn't drink) in this data set? The data I want looks like this:

id   day  drinks

1     1      0

1     2      1

1     3      3

1     4      0

1     5      4

2     1      0

2     2      0

2     3      1

2     4      0

2     5      0

3     1      0

3     2      1

3     3      0

3     4      2

3     5      0

 

Thank you!

1 ACCEPTED SOLUTION

Accepted Solutions
novinosrin
Tourmaline | Level 20

No, just ignore that option. Not needed. Will work without it

 

/*Added additional variables to the sample HAVE*/
data have;
input id   day  drinks ;
blah=rand('integer',1,100);
blahblah=rand('integer',1,100);
blahblahblah=rand('integer',1,100);
cards;
1     2      1
1     3      3
1     5      4
2     3      1
3     2      1
3     4      2
;

data want;
if _n_=1 then do;
if 0 then set have;
dcl hash H (dataset:'have') ;
   h.definekey  ("id",'day') ;
   h.definedata ("drinks") ;
   h.definedone () ;
end;
set have(keep=id blah blahblah blahblahblah);
by id;/*assuming it's sorted by id as your sample suggests*/
if first.id;
do day=1 to 5;
rc=h.find();
if rc ne 0 then drinks=0;
output;
end;
drop rc;
run;

View solution in original post

18 REPLIES 18
novinosrin
Tourmaline | Level 20
data have;
input id   day  drinks;
cards;
1     2      1
1     3      3
1     5      4
2     3      1
3     2      1
3     4      2
;

data want;
if _n_=1 then do;
dcl hash H (dataset:'have') ;
   h.definekey  ("id",'day') ;
   h.definedata ("drinks") ;
   h.definedone () ;
end;
set have(keep=id);;
by id;/*assuming it's sorted by id as your sample suggests*/
if first.id;
do day=1 to 5;
rc=h.find();
if rc ne 0 then drinks=0;
output;
end;
drop rc;
run;
mkt_apprentice
Obsidian | Level 7

Thanks a lot! I will try and update the results if it works.

mkt_apprentice
Obsidian | Level 7

My data actually has more variables in addition to these 3 variables. What if I want to keep all of the rest variables together with the three variables when creating these new observations?

mkt_apprentice
Obsidian | Level 7

The code you wrote works. However, it only keeps the 3 variables (as I posted a simplified version of my data). The other variables are just demographics and do not change in different days. How could I keep all of them when I create new observations = 0? 

 

Thank you

 

 

novinosrin
Tourmaline | Level 20

Hi @mkt_apprentice 

 

Include you other variables in definedata like

 

   h.definedata ("drinks","var1","var2"....and so on up to "varn") ;

 Also add this statement

do day=1 to 5;
rc=h.find();
if rc ne 0 then do; 
drinks=0;
call missing(var1,var2, .....upto varn);
output;
end;

Also please post a more representative sample so that we can avoid going back and forth. 

mkt_apprentice
Obsidian | Level 7

Thanks. Can you rewrite the whole code in one block? I added the part you suggested and the variables to keep the old ones. SAS returns that There was 1 unclosed DO block. 

 

Could you please help?

novinosrin
Tourmaline | Level 20

Hi @mkt_apprentice  Please review the revised sample below. You will get it

 

/*Added additional variables to the sample HAVE*/
data have;
input id   day  drinks ;
blah=rand('integer',1,100);
blahblah=rand('integer',1,100);
blahblahblah=rand('integer',1,100);
cards;
1     2      1
1     3      3
1     5      4
2     3      1
3     2      1
3     4      2
;

data want;
if _n_=1 then do;
dcl hash H (dataset:'have') ;
   h.definekey  ("id",'day') ;
   h.definedata ("drinks",'blah','blahblah','blahblahblah') ;
   h.definedone () ;
end;
set have(keep=id);;
by id;/*assuming it's sorted by id as your sample suggests*/
if first.id;
do day=1 to 5;
rc=h.find();
if rc ne 0 then do; 
drinks=0;
call missing(blah,blahblah,blahblahblah);
end;
output;
end;
drop rc;
run;
mkt_apprentice
Obsidian | Level 7

Thanks a lot for the code. Just one more thing that I need your help. The demographic variables that are associated each person (id) are not kept for the observation drinks = 0. They all become "." or missing data. Is there any way I can keep them in the newly created drinks=0 observations? I copied your code and added the demographic variables to the appropriate location in the code but the results include . rather than these demographic variables. Can you please help?

novinosrin
Tourmaline | Level 20

Comment out this part

/*call missing(blah,blahblah,blahblahblah);*/

 

Complete code

 

/*Added additional variables to the sample HAVE*/
data have;
input id   day  drinks ;
blah=rand('integer',1,100);
blahblah=rand('integer',1,100);
blahblahblah=rand('integer',1,100);
cards;
1     2      1
1     3      3
1     5      4
2     3      1
3     2      1
3     4      2
;

data want;
if _n_=1 then do;
if 0 then set have;
dcl hash H (dataset:'have') ;
   h.definekey  ("id",'day') ;
   h.definedata ("drinks",'blah','blahblah','blahblahblah') ;
   h.definedone () ;
end;
set have(keep=id);;
by id;/*assuming it's sorted by id as your sample suggests*/
if first.id;
do day=1 to 5;
rc=h.find();
if rc ne 0 then do; 
drinks=0;
/*call missing(blah,blahblah,blahblahblah);*/
end;
output;
end;
drop rc;
run;

 

mkt_apprentice
Obsidian | Level 7

What does call missing do? I want these demographic variables to be the same as the ones in observations drinks differ than 0. 

For example: 

id  day drinks age gender

1    1     2        24      1

1    2     0        24      1

 

Currently the resulting data is like this:

id  day drinks age gender

1    1     2        24      1

1    2     0        .         .

 

novinosrin
Tourmaline | Level 20

@mkt_apprentice  Edited the previous with complete code

mkt_apprentice
Obsidian | Level 7

It still produces . values 😞 

novinosrin
Tourmaline | Level 20

Hi @mkt_apprentice 

 

I owe an apology as I overlooked a minor issue. Really sorry

 

Please find the corrected code below

 

/*Added additional variables to the sample HAVE*/
data have;
input id   day  drinks ;
blah=rand('integer',1,100);
blahblah=rand('integer',1,100);
blahblahblah=rand('integer',1,100);
cards;
1     2      1
1     3      3
1     5      4
2     3      1
3     2      1
3     4      2
;

data want;
if _n_=1 then do;
if 0 then set have;
dcl hash H (dataset:'have',multidata:'y') ;
   h.definekey  ("id",'day') ;
   h.definedata ("drinks") ;
   h.definedone () ;
end;
set have(keep=id blah blahblah blahblahblah);
by id;/*assuming it's sorted by id as your sample suggests*/
if first.id;
do day=1 to 5;
rc=h.find();
if rc ne 0 then drinks=0;
output;
end;
drop rc;
run;
mkt_apprentice
Obsidian | Level 7

What do you mean in multidata: 'y'? Do I need to create a separate data set for the demographic variables?

 

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 18 replies
  • 3272 views
  • 4 likes
  • 3 in conversation