Can you please see above reply to kiranv?
@d0816 Where is the reply to Kiranv?
Anyway, if you say keep only those with "value as Level1 and consecutive missing values",. shouldn't 4 be deleted too?
4 | C | Level1 | Level1 | Level2 | Level2 | Level2 |
9 | D | Level1 | Level4 | Level4 | Level4 | Level4 | Level4 | Level4 |
23 | D | Level1 | Level1 | Level1 | Level1 | Level1 | Level1 | Level2 |
Please clarify
Month1 | Month2 | Month3 | Month4 | Month5 | Month6 | Month7 | Month8 |
Level1 | Level1 | Level1 | Level1 | Level1 | Level1 | Level1 | |
Level1 | Level1 | Level1 | Level1 | Level1 | Level1 | ||
Level1 | Level1 | Level1 | Level1 | Level1 | |||
Level1 | Level1 | Level1 | Level1 | ||||
Level1 | Level1 | Level1 | |||||
Level1 | Level1 | ||||||
Level1 |
HI, for the 2nd part I want dataset like this.
Ok got it. Thank you. Be right back in a bit
data want;
set have;
array t(*) month:;
_k=whichc(' ', of t(*));
if _k>1 then do;
do _n_=dim(t) by -1 to _k+1;
if not missing(t(_n_)) then do; _t=1;leave;end;
end;
end;
do _n_=dim(t) by -1 to 1;
if not missing(t(_n_)) and t(_n_) ne 'Level1' then do; __t=1;leave;end;
end;
if _t or __t then delete;
drop _:;
run;
I will try to see if I can make it really slick. Hope that helps
My log and result:
4365 data want;
4366 set have;
4367 array t(*) month:;
4368 _k=whichc(' ', of t(*));
4369 if _k>1 then do;
4370 do _n_=dim(t) by -1 to _k+1;
4371 if not missing(t(_n_)) then do; _t=1;leave;end;
4372 end;
4373 end;
4374 do _n_=dim(t) by -1 to 1;
4375 if not missing(t(_n_)) and t(_n_) ne 'Level1' then do; __t=1;leave;end;
4376 end;
4377 if _t or __t then delete;
4378 drop _:;
4379 run;
NOTE: There were 24 observations read from the data set WORK.HAVE.
NOTE: The data set WORK.WANT has 16 observations and 10 variables.
NOTE: DATA statement used (Total process time):
real time 0.01 seconds
cpu time 0.01 seconds
My output:
ID | Store | Month1 | Month2 | Month3 | Month4 | Month5 | Month6 | Month7 | Month8 |
1 | A | Level1 | Level1 | Level1 | Level1 | ||||
2 | A | Level1 | Level1 | ||||||
5 | B | Level1 | |||||||
7 | A | Level1 | Level1 | ||||||
8 | A | Level1 | Level1 | Level1 | Level1 | Level1 | |||
10 | D | Level1 | Level1 | Level1 | |||||
11 | E | Level1 | |||||||
12 | E | Level1 | Level1 | Level1 | Level1 | Level1 | Level1 | Level1 | |
15 | A | Level1 | Level1 | Level1 | Level1 | Level1 | |||
16 | D | Level1 | Level1 | ||||||
17 | B | Level1 | Level1 | Level1 | |||||
18 | E | Level1 | Level1 | Level1 | |||||
19 | F | Level1 | Level1 | Level1 | Level1 | Level1 | |||
20 | D | Level1 | Level1 | Level1 | Level1 | Level1 | |||
22 | A | Level1 | Level1 | Level1 | Level1 | Level1 | Level1 | Level1 | |
24 | E | Level1 | Level1 |
I tested again, use this HAVE and and check the WANT. If this works for your sample, replicate carefully for your original source
data have;
input ID Store$ (Month1-Month8)($);
infile datalines dsd missover;
datalines;
1,A,Level1,Level1,Level1,Level1,,,,
2,A,Level1,Level1,,,,,,
3,B,Level1,Level1,,,Level1,,Level1,Level1
4,C,Level1,Level1,Level2,Level2,Level2,,,
5,B,Level1,,,,,,,
6,B,Level1,,Level1,Level1,Level3,Level3,Level1,Level1
7,A,Level1,Level1,,,,,,
8,A,Level1,Level1,Level1,Level1,Level1,,,
9,D,Level1,Level4,Level4,Level4,Level4,Level4,Level4,
10,D,Level1,Level1,Level1,,,,,
11,E,Level1,,,,,,,
12,E,Level1,Level1,Level1,Level1,Level1,Level1,Level1,
13,D,Level1,,Level1,,,,,
14,B,Level1,Level1,Level1,,Level3,Level3,Level3,Level3
15,A,Level1,Level1,Level1,Level1,Level1,,,
16,D,Level1,Level1,,,,,,
17,B,Level1,Level1,Level1,,,,,
18,E,Level1,Level1,Level1,,,,,
19,F,Level1,Level1,Level1,Level1,Level1,,,
20,D,Level1,Level1,Level1,Level1,Level1,,,
21,A,Level1,Level1,,Level1,Level1,Level1,Level1,Level1
22,A,Level1,Level1,Level1,Level1,Level1,Level1,Level1,
23,D,Level1,Level1,Level1,Level1,Level1,Level1,Level2,
24,E,Level1,Level1,,,,,,
;
data want;
set have;
array t(*) month:;
_k=whichc(' ', of t(*));
if _k>1 then do;
do _n_=dim(t) by -1 to _k+1;
if not missing(t(_n_)) then do; _t=1;leave;end;
end;
end;
do _n_=dim(t) by -1 to 1;
if not missing(t(_n_)) and t(_n_) ne 'Level1' then do; __t=1;leave;end;
end;
if _t or __t then delete;
drop _:;
run;
I tried with the sample dataset like you said for both part 1 and part 2. This is definitely working for the sample dataset that I provided to you. But somehow is not working for the full dataset. In the sample dataset, No. of variables remain the same which is 10 for both Part1 and part 2 code. But for my full dataset, Part 1 code returns me 11 variables (11th variable is k) even after carefully making sure I am using the right dataset.
i use underscore(_k) as a prefix to flag variables to drop them later as you may have noticed in the code. Please make sure there are absolutely no typos at all.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.