BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
mkt_apprentice
Obsidian | Level 7

Hello SAS communities,

 

I have a data set like this: 

id   day  drinks

1     2      1

1     3      3

1     5      4

2     3      1

3     2      1

3     4      2

 

The study lasts 5 days, and days a person doesn't have any drink (drink = 0) are not reported in the data. How could I create these observations (drink = 0 in the days the person doesn't drink) in this data set? The data I want looks like this:

id   day  drinks

1     1      0

1     2      1

1     3      3

1     4      0

1     5      4

2     1      0

2     2      0

2     3      1

2     4      0

2     5      0

3     1      0

3     2      1

3     3      0

3     4      2

3     5      0

 

Thank you!

1 ACCEPTED SOLUTION

Accepted Solutions
novinosrin
Tourmaline | Level 20

No, just ignore that option. Not needed. Will work without it

 

/*Added additional variables to the sample HAVE*/
data have;
input id   day  drinks ;
blah=rand('integer',1,100);
blahblah=rand('integer',1,100);
blahblahblah=rand('integer',1,100);
cards;
1     2      1
1     3      3
1     5      4
2     3      1
3     2      1
3     4      2
;

data want;
if _n_=1 then do;
if 0 then set have;
dcl hash H (dataset:'have') ;
   h.definekey  ("id",'day') ;
   h.definedata ("drinks") ;
   h.definedone () ;
end;
set have(keep=id blah blahblah blahblahblah);
by id;/*assuming it's sorted by id as your sample suggests*/
if first.id;
do day=1 to 5;
rc=h.find();
if rc ne 0 then drinks=0;
output;
end;
drop rc;
run;

View solution in original post

18 REPLIES 18
novinosrin
Tourmaline | Level 20
data have;
input id   day  drinks;
cards;
1     2      1
1     3      3
1     5      4
2     3      1
3     2      1
3     4      2
;

data want;
if _n_=1 then do;
dcl hash H (dataset:'have') ;
   h.definekey  ("id",'day') ;
   h.definedata ("drinks") ;
   h.definedone () ;
end;
set have(keep=id);;
by id;/*assuming it's sorted by id as your sample suggests*/
if first.id;
do day=1 to 5;
rc=h.find();
if rc ne 0 then drinks=0;
output;
end;
drop rc;
run;
mkt_apprentice
Obsidian | Level 7

Thanks a lot! I will try and update the results if it works.

mkt_apprentice
Obsidian | Level 7

My data actually has more variables in addition to these 3 variables. What if I want to keep all of the rest variables together with the three variables when creating these new observations?

mkt_apprentice
Obsidian | Level 7

The code you wrote works. However, it only keeps the 3 variables (as I posted a simplified version of my data). The other variables are just demographics and do not change in different days. How could I keep all of them when I create new observations = 0? 

 

Thank you

 

 

novinosrin
Tourmaline | Level 20

Hi @mkt_apprentice 

 

Include you other variables in definedata like

 

   h.definedata ("drinks","var1","var2"....and so on up to "varn") ;

 Also add this statement

do day=1 to 5;
rc=h.find();
if rc ne 0 then do; 
drinks=0;
call missing(var1,var2, .....upto varn);
output;
end;

Also please post a more representative sample so that we can avoid going back and forth. 

mkt_apprentice
Obsidian | Level 7

Thanks. Can you rewrite the whole code in one block? I added the part you suggested and the variables to keep the old ones. SAS returns that There was 1 unclosed DO block. 

 

Could you please help?

novinosrin
Tourmaline | Level 20

Hi @mkt_apprentice  Please review the revised sample below. You will get it

 

/*Added additional variables to the sample HAVE*/
data have;
input id   day  drinks ;
blah=rand('integer',1,100);
blahblah=rand('integer',1,100);
blahblahblah=rand('integer',1,100);
cards;
1     2      1
1     3      3
1     5      4
2     3      1
3     2      1
3     4      2
;

data want;
if _n_=1 then do;
dcl hash H (dataset:'have') ;
   h.definekey  ("id",'day') ;
   h.definedata ("drinks",'blah','blahblah','blahblahblah') ;
   h.definedone () ;
end;
set have(keep=id);;
by id;/*assuming it's sorted by id as your sample suggests*/
if first.id;
do day=1 to 5;
rc=h.find();
if rc ne 0 then do; 
drinks=0;
call missing(blah,blahblah,blahblahblah);
end;
output;
end;
drop rc;
run;
mkt_apprentice
Obsidian | Level 7

Thanks a lot for the code. Just one more thing that I need your help. The demographic variables that are associated each person (id) are not kept for the observation drinks = 0. They all become "." or missing data. Is there any way I can keep them in the newly created drinks=0 observations? I copied your code and added the demographic variables to the appropriate location in the code but the results include . rather than these demographic variables. Can you please help?

novinosrin
Tourmaline | Level 20

Comment out this part

/*call missing(blah,blahblah,blahblahblah);*/

 

Complete code

 

/*Added additional variables to the sample HAVE*/
data have;
input id   day  drinks ;
blah=rand('integer',1,100);
blahblah=rand('integer',1,100);
blahblahblah=rand('integer',1,100);
cards;
1     2      1
1     3      3
1     5      4
2     3      1
3     2      1
3     4      2
;

data want;
if _n_=1 then do;
if 0 then set have;
dcl hash H (dataset:'have') ;
   h.definekey  ("id",'day') ;
   h.definedata ("drinks",'blah','blahblah','blahblahblah') ;
   h.definedone () ;
end;
set have(keep=id);;
by id;/*assuming it's sorted by id as your sample suggests*/
if first.id;
do day=1 to 5;
rc=h.find();
if rc ne 0 then do; 
drinks=0;
/*call missing(blah,blahblah,blahblahblah);*/
end;
output;
end;
drop rc;
run;

 

mkt_apprentice
Obsidian | Level 7

What does call missing do? I want these demographic variables to be the same as the ones in observations drinks differ than 0. 

For example: 

id  day drinks age gender

1    1     2        24      1

1    2     0        24      1

 

Currently the resulting data is like this:

id  day drinks age gender

1    1     2        24      1

1    2     0        .         .

 

novinosrin
Tourmaline | Level 20

@mkt_apprentice  Edited the previous with complete code

mkt_apprentice
Obsidian | Level 7

It still produces . values 😞 

novinosrin
Tourmaline | Level 20

Hi @mkt_apprentice 

 

I owe an apology as I overlooked a minor issue. Really sorry

 

Please find the corrected code below

 

/*Added additional variables to the sample HAVE*/
data have;
input id   day  drinks ;
blah=rand('integer',1,100);
blahblah=rand('integer',1,100);
blahblahblah=rand('integer',1,100);
cards;
1     2      1
1     3      3
1     5      4
2     3      1
3     2      1
3     4      2
;

data want;
if _n_=1 then do;
if 0 then set have;
dcl hash H (dataset:'have',multidata:'y') ;
   h.definekey  ("id",'day') ;
   h.definedata ("drinks") ;
   h.definedone () ;
end;
set have(keep=id blah blahblah blahblahblah);
by id;/*assuming it's sorted by id as your sample suggests*/
if first.id;
do day=1 to 5;
rc=h.find();
if rc ne 0 then drinks=0;
output;
end;
drop rc;
run;
mkt_apprentice
Obsidian | Level 7

What do you mean in multidata: 'y'? Do I need to create a separate data set for the demographic variables?

 

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 18 replies
  • 1181 views
  • 4 likes
  • 3 in conversation