hello everyone,
i want to select rows only if the second row with 7 and add a number , for exsample:
data want;
set have;
array a(1);
x=zweck;
if x=7 and x-2=7 then a1=7;
if x-1=7 and x+1=7 then a1=1;
run;
ID | ZWECK |
4101000003 | 4 |
4101000003 | 7 |
4101000003 | 5 |
4101000003 | 7 |
4101000003 | 4 |
4101000003 | 7 |
4101000003 | 4 |
4101000003 | 7 |
4101000010 | 5 |
4101000010 | 7 |
4101000010 | 11 |
4101000010 | 7 |
4101000010 | 4 |
4101000010 | 7 |
4101000010 | 6 |
4101000010 | 7 |
4101000011 | 4 |
4101000011 | 7 |
4101000011 | 11 |
4101000011 | 4 |
4101000011 | 7 |
4101000011 | 6 |
4101000011 | 7 |
4101000011 | 9 |
4101000011 | 5 |
4101000011 | 7 |
4101000011 | 4 |
4101000011 | 7 |
4101000011 | 4 |
4101000011 | 7 |
4101000011 | 6 |
4101000011 | 7 |
4101000011 | 5 |
4101000011 | 7 |
4101000015 | 4 |
4101000015 | 7 |
4101000015 | 11 |
4101000015 | 7 |
and what i want is like the following:
ID | ZWECK | HZ |
4101000003 | 4 | 47 |
4101000003 | 7 | 47 |
4101000003 | 5 | 57 |
4101000003 | 7 | 57 |
4101000003 | 4 | 47 |
4101000003 | 7 | 47 |
4101000003 | 4 | 47 |
4101000003 | 7 | 47 |
4101000010 | 5 | 57 |
4101000010 | 7 | 57 |
4101000010 | 11 | 117 |
4101000010 | 7 | 117 |
4101000010 | 4 | 47 |
4101000010 | 7 | 47 |
4101000010 | 6 | 67 |
4101000010 | 7 | 67 |
4101000011 | 4 | 47 |
4101000011 | 7 | 47 |
4101000011 | 11 | 1 |
4101000011 | 4 | 1 |
4101000011 | 7 | 1 |
4101000011 | 6 | 67 |
4101000011 | 7 | 67 |
4101000011 | 9 | 1 |
4101000011 | 5 | 1 |
4101000011 | 7 | 1 |
4101000011 | 4 | 47 |
4101000011 | 7 | 47 |
4101000011 | 4 | 47 |
4101000011 | 7 | 47 |
4101000011 | 6 | 67 |
4101000011 | 7 | 67 |
4101000011 | 5 | 57 |
4101000011 | 7 | 57 |
4101000015 | 4 | 47 |
4101000015 | 7 | 47 |
4101000015 | 11 | 117 |
4101000015 | 7 | 117 |
if x=7 and x-2=7
If x is 7, it can't be 9 (so that x-2 equals 7), so both parts of the condition can never be true at the same time, which means that the whole condition can NEVER be true.
Show what you want to get out of your dataset.
This seems to produce what you want, not clear what the heck it MEANS however.
It wasn't clear what types of variable ZWECK or HZ were. So I made them character and included a delimiter between the two values since they seem to be variable length.
data want;
set have ;
by id;
set have(keep=ZWECK rename=(ZWECK=next) firstobs=2) have(obs=1 drop=_all_);
previous=lag(ZWECK);
if first.id then call missing(previous);
if last.id then call missing(next);
length hz $5 ;
retain hz;
if next='7' then do;
if previous='7' or first.id then hz=catx('-',ZWECK,next);
else hz='1';
end;
else if ZWECK ne '7' then hz='1';
run;
Results:
OBS ID ZWECK want next previous hz 1 4101000003 4 47 7 4-7 2 4101000003 7 47 5 4 4-7 3 4101000003 5 57 7 7 5-7 4 4101000003 7 57 4 5 5-7 5 4101000003 4 47 7 7 4-7 6 4101000003 7 47 4 4 4-7 7 4101000003 4 47 7 7 4-7 8 4101000003 7 47 4 4-7 9 4101000010 5 57 7 5-7 10 4101000010 7 57 11 5 5-7 11 4101000010 11 117 7 7 11-7 12 4101000010 7 117 4 11 11-7 13 4101000010 4 47 7 7 4-7 14 4101000010 7 47 6 4 4-7 15 4101000010 6 67 7 7 6-7 16 4101000010 7 67 6 6-7 17 4101000011 4 47 7 4-7 18 4101000011 7 47 11 4 4-7 19 4101000011 11 1 4 7 1 20 4101000011 4 1 7 11 1 21 4101000011 7 1 6 4 1 22 4101000011 6 67 7 7 6-7 23 4101000011 7 67 9 6 6-7 24 4101000011 9 1 5 7 1 25 4101000011 5 1 7 9 1 26 4101000011 7 1 4 5 1 27 4101000011 4 47 7 7 4-7 28 4101000011 7 47 4 4 4-7 29 4101000011 4 47 7 7 4-7 30 4101000011 7 47 6 4 4-7 31 4101000011 6 67 7 7 6-7 32 4101000011 7 67 5 6 6-7 33 4101000011 5 57 7 7 5-7 34 4101000011 7 57 5 5-7 35 4101000015 4 47 7 4-7 36 4101000015 7 47 11 4 4-7 37 4101000015 11 117 7 7 11-7 38 4101000015 7 117 11 11-7
Hi Tom,
Thanks for your solution. Firstly, to your Question: all of them are numeric. also I have a few questions about your code:
1. following your code should i rewrite 'length hz $5' as 'length hz 12.', considering they are numeric?
2. how should i define first.id and last.id? or it will work automatic?
@ZhihanZhou wrote:
Hi Tom,
Thanks for your solution. Firstly, to your Question: all of them are numeric. also I have a few questions about your code:
1. following your code should i rewrite 'length hz $5' as 'length hz 12.', considering they are numeric?
2. how should i define first.id and last.id? or it will work automatic?
If they are numeric what the heck does the new HZ variable represent? How can you combine two numbers into one? Since the second one seems to be always one digit then perhaps you could use:
hz = previous*10 + ZWECK;
You do not normally need to define a LENGTH for a numeric variable. You certainly CANNOT define a numeric variable as length of 12 since all numeric variables are 64 bit floating point numbers so the maximum number of bytes they can occupy is 8. Perhaps you are confusing the format attached to a number which will tell SAS how you want the number to PRINT. The LENGTH of a variable is how much space is required to store it in the dataset.
FIRST. and LAST. are defined by the use of the BY statement. SAS will set them to TRUE or FALSE as it reads through the input data.
Hello Tom,
thanks for your reply!
i have tried your code, a part of them succeed, but still come some erorr Warnings: 'invalid numeric data'(is it because 'if previous='7' or first.id then hz=catx('-',ZWECK,next);' this sentence makes those numeric data as character?) and some missing value, like following:
ID | ZWECK | next | previous | hz |
4101000003 | 4 | 7 | . | . |
4101000003 | 7 | 5 | 4 | 47 |
4101000003 | 5 | 7 | 7 | . |
4101000003 | 7 | 4 | 5 | 57 |
4101000003 | 4 | 7 | 7 | . |
4101000003 | 7 | 4 | 4 | 47 |
4101000003 | 4 | 7 | 7 | . |
4101000003 | 7 | . | 4 | 47 |
4101000010 | 5 | 7 | . | . |
4101000010 | 7 | 11 | 5 | 57 |
4101000010 | 11 | 7 | 7 | . |
4101000010 | 7 | 4 | 11 | 117 |
4101000010 | 4 | 7 | 7 | . |
4101000010 | 7 | 6 | 4 | 47 |
4101000010 | 6 | 7 | 7 | . |
4101000010 | 7 | . | 6 | 67 |
4101000011 | 4 | 7 | . | . |
4101000011 | 7 | 11 | 4 | 47 |
Why did you keep the quotes when the variable is numeric?
if previous=7 then ...
And what if you have a sequence such as
4
7
7
for zweck?
You can do this relatively simply using a self-merge with offset:
data want (drop=nxt_:);
merge have
have (firstobs=2 keep=id zweck rename=(id=nxt_id zweck=nxt_zweck));
retain hz ' ';
if nxt_zweck=7 and id=nxt_id then hz=catx('-',zweck,nxt_zweck);
output;
if nxt_id^=id or nxt_zweck^=7 then hz='';
run;
Available on demand!
Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Select SAS Training centers are offering in-person courses. View upcoming courses for: