BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
ANKH1
Pyrite | Level 9

I have this dataset with. Some IDs are missing values for window=0

data ds1;
input ID$ window m1 m2 m3;
datalines;
1	3	157	56	111
1	6	145	35	145
1	9	178	78	200
2	0	132	27	201
2	6	124	95	211
2	12	127	37	532
2	15	178	35	465
3	6	148	74	231
3	12	138	49	532
;
run;

The dataset that should be used to complement data when window=0 is missing is this:

data ds2;
input ID$ window m1 m2 m3;
datalines;
1	0	145	54	201
2	0	143	45	143
3	0	132	47	165
4	0	131	48	253
9	0	124	52	511
;
run;

We need to replace missing window=0 from ds1 with values from ds2. Note that data should be added only when window=0 is missing from ds1. See the desired output below for ID 2, it keeps the data from window=0 from ds1. How can we get the below dataset?

IDwindowm1m2m3
1014554201
1315756111
1614535145
1917878200
2013227201
2612495211
21212737532
21517835465
3013247165
3614874231
31213849532
4013148253
9012452511
1 ACCEPTED SOLUTION

Accepted Solutions
Reeza
Super User

Lazy way:

  • data full_data;
    set ds1 ds2 indsname=src;
    source=src;
    run;
    
    proc sort data=full_data;
    by id window source;
    run;
    
    data want;
    set full_data;
    by id window source;
    if first.window;
    run;
  • Stack the data sets identifying the source datasets
  • Sort by ID, window and source datasets
  • Keep ones where in DS1 first 

@ANKH1 wrote:

I have this dataset with. Some IDs are missing values for window=0

data ds1;
input ID$ window m1 m2 m3;
datalines;
1	3	157	56	111
1	6	145	35	145
1	9	178	78	200
2	0	132	27	201
2	6	124	95	211
2	12	127	37	532
2	15	178	35	465
3	6	148	74	231
3	12	138	49	532
;
run;

The dataset that should be used to complement data when window=0 is missing is this:

data ds2;
input ID$ window m1 m2 m3;
datalines;
1	0	145	54	201
2	0	143	45	143
3	0	132	47	165
4	0	131	48	253
9	0	124	52	511
;
run;

We need to replace missing window=0 from ds1 with values from ds2. Note that data should be added only when window=0 is missing from ds1. See the desired output below for ID 2, it keeps the data from window=0 from ds1. How can we get the below dataset?

ID window m1 m2 m3
1 0 145 54 201
1 3 157 56 111
1 6 145 35 145
1 9 178 78 200
2 0 132 27 201
2 6 124 95 211
2 12 127 37 532
2 15 178 35 465
3 0 132 47 165
3 6 148 74 231
3 12 138 49 532
4 0 131 48 253
9 0 124 52 511

 

View solution in original post

8 REPLIES 8
Reeza
Super User

Lazy way:

  • data full_data;
    set ds1 ds2 indsname=src;
    source=src;
    run;
    
    proc sort data=full_data;
    by id window source;
    run;
    
    data want;
    set full_data;
    by id window source;
    if first.window;
    run;
  • Stack the data sets identifying the source datasets
  • Sort by ID, window and source datasets
  • Keep ones where in DS1 first 

@ANKH1 wrote:

I have this dataset with. Some IDs are missing values for window=0

data ds1;
input ID$ window m1 m2 m3;
datalines;
1	3	157	56	111
1	6	145	35	145
1	9	178	78	200
2	0	132	27	201
2	6	124	95	211
2	12	127	37	532
2	15	178	35	465
3	6	148	74	231
3	12	138	49	532
;
run;

The dataset that should be used to complement data when window=0 is missing is this:

data ds2;
input ID$ window m1 m2 m3;
datalines;
1	0	145	54	201
2	0	143	45	143
3	0	132	47	165
4	0	131	48	253
9	0	124	52	511
;
run;

We need to replace missing window=0 from ds1 with values from ds2. Note that data should be added only when window=0 is missing from ds1. See the desired output below for ID 2, it keeps the data from window=0 from ds1. How can we get the below dataset?

ID window m1 m2 m3
1 0 145 54 201
1 3 157 56 111
1 6 145 35 145
1 9 178 78 200
2 0 132 27 201
2 6 124 95 211
2 12 127 37 532
2 15 178 35 465
3 0 132 47 165
3 6 148 74 231
3 12 138 49 532
4 0 131 48 253
9 0 124 52 511

 

ANKH1
Pyrite | Level 9

Thank you! The output is exactly what we need. I have a question, how does

if first.window;

accounts for the first window for each value, i.e., first value of window=0 and does not delete the window=6, 12, etc. rows? I would've thought that first.window, filters the very first row of the variable window.  Thanks!

A_Kh
Lapis Lazuli | Level 10

first.variable refers to a distinct value (making a group). In your data 0, 6, 12 all makes a distinct subgroups within your ID groups.  So, the first.window will keep the first observations from each distinct subgroups. If your statement were first.ID then it would delete all other observations, except window=0 (where your window has this value).  

ANKH1
Pyrite | Level 9

Oh! Thank you so much for your explanation! I really appreciate it.

Tom
Super User Tom
Super User

Doesn't look lazy to me. Looks like a lot of extra work.

 

Just MERGE the two datasets.

Tom
Super User Tom
Super User

Just MERGE the two datasets.  Make sure to list the dataset you want to WIN last in the MERGE statement.

data want;
  merge ds2 ds1;
  by id window;
run;

In your data only id='2' has any observations with window=0 in DS1.  

So let's see what values SAS used for that observation:

Tom_0-1701126811538.png

 

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 8 replies
  • 1381 views
  • 5 likes
  • 5 in conversation