Can you please see above reply to kiranv?
@d0816 Where is the reply to Kiranv?
Anyway, if you say keep only those with "value as Level1 and consecutive missing values",. shouldn't 4 be deleted too?
4 | C | Level1 | Level1 | Level2 | Level2 | Level2 |
9 | D | Level1 | Level4 | Level4 | Level4 | Level4 | Level4 | Level4 |
23 | D | Level1 | Level1 | Level1 | Level1 | Level1 | Level1 | Level2 |
Please clarify
Month1 | Month2 | Month3 | Month4 | Month5 | Month6 | Month7 | Month8 |
Level1 | Level1 | Level1 | Level1 | Level1 | Level1 | Level1 | |
Level1 | Level1 | Level1 | Level1 | Level1 | Level1 | ||
Level1 | Level1 | Level1 | Level1 | Level1 | |||
Level1 | Level1 | Level1 | Level1 | ||||
Level1 | Level1 | Level1 | |||||
Level1 | Level1 | ||||||
Level1 |
HI, for the 2nd part I want dataset like this.
Ok got it. Thank you. Be right back in a bit
data want;
set have;
array t(*) month:;
_k=whichc(' ', of t(*));
if _k>1 then do;
do _n_=dim(t) by -1 to _k+1;
if not missing(t(_n_)) then do; _t=1;leave;end;
end;
end;
do _n_=dim(t) by -1 to 1;
if not missing(t(_n_)) and t(_n_) ne 'Level1' then do; __t=1;leave;end;
end;
if _t or __t then delete;
drop _:;
run;
I will try to see if I can make it really slick. Hope that helps
My log and result:
4365 data want;
4366 set have;
4367 array t(*) month:;
4368 _k=whichc(' ', of t(*));
4369 if _k>1 then do;
4370 do _n_=dim(t) by -1 to _k+1;
4371 if not missing(t(_n_)) then do; _t=1;leave;end;
4372 end;
4373 end;
4374 do _n_=dim(t) by -1 to 1;
4375 if not missing(t(_n_)) and t(_n_) ne 'Level1' then do; __t=1;leave;end;
4376 end;
4377 if _t or __t then delete;
4378 drop _:;
4379 run;
NOTE: There were 24 observations read from the data set WORK.HAVE.
NOTE: The data set WORK.WANT has 16 observations and 10 variables.
NOTE: DATA statement used (Total process time):
real time 0.01 seconds
cpu time 0.01 seconds
My output:
ID | Store | Month1 | Month2 | Month3 | Month4 | Month5 | Month6 | Month7 | Month8 |
1 | A | Level1 | Level1 | Level1 | Level1 | ||||
2 | A | Level1 | Level1 | ||||||
5 | B | Level1 | |||||||
7 | A | Level1 | Level1 | ||||||
8 | A | Level1 | Level1 | Level1 | Level1 | Level1 | |||
10 | D | Level1 | Level1 | Level1 | |||||
11 | E | Level1 | |||||||
12 | E | Level1 | Level1 | Level1 | Level1 | Level1 | Level1 | Level1 | |
15 | A | Level1 | Level1 | Level1 | Level1 | Level1 | |||
16 | D | Level1 | Level1 | ||||||
17 | B | Level1 | Level1 | Level1 | |||||
18 | E | Level1 | Level1 | Level1 | |||||
19 | F | Level1 | Level1 | Level1 | Level1 | Level1 | |||
20 | D | Level1 | Level1 | Level1 | Level1 | Level1 | |||
22 | A | Level1 | Level1 | Level1 | Level1 | Level1 | Level1 | Level1 | |
24 | E | Level1 | Level1 |
I tested again, use this HAVE and and check the WANT. If this works for your sample, replicate carefully for your original source
data have;
input ID Store$ (Month1-Month8)($);
infile datalines dsd missover;
datalines;
1,A,Level1,Level1,Level1,Level1,,,,
2,A,Level1,Level1,,,,,,
3,B,Level1,Level1,,,Level1,,Level1,Level1
4,C,Level1,Level1,Level2,Level2,Level2,,,
5,B,Level1,,,,,,,
6,B,Level1,,Level1,Level1,Level3,Level3,Level1,Level1
7,A,Level1,Level1,,,,,,
8,A,Level1,Level1,Level1,Level1,Level1,,,
9,D,Level1,Level4,Level4,Level4,Level4,Level4,Level4,
10,D,Level1,Level1,Level1,,,,,
11,E,Level1,,,,,,,
12,E,Level1,Level1,Level1,Level1,Level1,Level1,Level1,
13,D,Level1,,Level1,,,,,
14,B,Level1,Level1,Level1,,Level3,Level3,Level3,Level3
15,A,Level1,Level1,Level1,Level1,Level1,,,
16,D,Level1,Level1,,,,,,
17,B,Level1,Level1,Level1,,,,,
18,E,Level1,Level1,Level1,,,,,
19,F,Level1,Level1,Level1,Level1,Level1,,,
20,D,Level1,Level1,Level1,Level1,Level1,,,
21,A,Level1,Level1,,Level1,Level1,Level1,Level1,Level1
22,A,Level1,Level1,Level1,Level1,Level1,Level1,Level1,
23,D,Level1,Level1,Level1,Level1,Level1,Level1,Level2,
24,E,Level1,Level1,,,,,,
;
data want;
set have;
array t(*) month:;
_k=whichc(' ', of t(*));
if _k>1 then do;
do _n_=dim(t) by -1 to _k+1;
if not missing(t(_n_)) then do; _t=1;leave;end;
end;
end;
do _n_=dim(t) by -1 to 1;
if not missing(t(_n_)) and t(_n_) ne 'Level1' then do; __t=1;leave;end;
end;
if _t or __t then delete;
drop _:;
run;
I tried with the sample dataset like you said for both part 1 and part 2. This is definitely working for the sample dataset that I provided to you. But somehow is not working for the full dataset. In the sample dataset, No. of variables remain the same which is 10 for both Part1 and part 2 code. But for my full dataset, Part 1 code returns me 11 variables (11th variable is k) even after carefully making sure I am using the right dataset.
i use underscore(_k) as a prefix to flag variables to drop them later as you may have noticed in the code. Please make sure there are absolutely no typos at all.
Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.