BookmarkSubscribeRSS Feed
ZhihanZhou
Calcite | Level 5

hello everyone, 

i want to select rows only if the second row with 7 and add a number , for exsample:

 

data want;

set have;

array a(1);

x=zweck;

if x=7 and x-2=7 then a1=7;

if x-1=7 and x+1=7 then a1=1;

run;

 

IDZWECK
41010000034
41010000037
41010000035
41010000037
41010000034
41010000037
41010000034
41010000037
41010000105
41010000107
410100001011
41010000107
41010000104
41010000107
41010000106
41010000107
41010000114
41010000117
410100001111
41010000114
41010000117
41010000116
41010000117
41010000119
41010000115
41010000117
41010000114
41010000117
41010000114
41010000117
41010000116
41010000117
41010000115
41010000117
41010000154
41010000157
410100001511
41010000157

 

and what i want is like the following:

IDZWECKHZ
4101000003447
4101000003747
4101000003557
4101000003757
4101000003447
4101000003747
4101000003447
4101000003747
4101000010557
4101000010757
410100001011117
41010000107117
4101000010447
4101000010747
4101000010667
4101000010767
4101000011447
4101000011747
4101000011111
410100001141
410100001171
4101000011667
4101000011767
410100001191
410100001151
410100001171
4101000011447
4101000011747
4101000011447
4101000011747
4101000011667
4101000011767
4101000011557
4101000011757
4101000015447
4101000015747
410100001511117
41010000157117
11 REPLIES 11
Kurt_Bremser
Super User
if x=7 and x-2=7

If x is 7, it can't be 9 (so that x-2 equals 7), so both parts of the condition can never be true at the same time, which means that the whole condition can NEVER be true.

 

Show what you want to get out of your dataset.

ZhihanZhou
Calcite | Level 5
Hello Kurt Bremer,
Thanks for your reply.
I just want to move the line pointer to the last row or next row, that is why my code is like this.
What I want is like the second table. When there is only one row before the row with zweck 7, then add a number in form Zweck of last row+7, otherwise make it 1.
Tom
Super User Tom
Super User

This seems to produce what you want, not clear what the heck it MEANS however.

It wasn't clear what types of variable ZWECK or HZ were.  So I made them character and included a delimiter between the two values since they seem to be variable length.

data want;
  set have ;
  by id;
  set have(keep=ZWECK rename=(ZWECK=next) firstobs=2) have(obs=1 drop=_all_);
  previous=lag(ZWECK);
  if first.id then call missing(previous);
  if last.id then call missing(next);
  length hz $5 ;
  retain hz;
  if next='7' then do;
    if previous='7' or first.id then hz=catx('-',ZWECK,next);
    else hz='1';
  end;
  else if ZWECK ne '7' then hz='1';
run;

Results:

OBS        ID        ZWECK    want    next    previous    hz

  1    4101000003     4       47       7                  4-7
  2    4101000003     7       47       5         4        4-7
  3    4101000003     5       57       7         7        5-7
  4    4101000003     7       57       4         5        5-7
  5    4101000003     4       47       7         7        4-7
  6    4101000003     7       47       4         4        4-7
  7    4101000003     4       47       7         7        4-7
  8    4101000003     7       47                 4        4-7
  9    4101000010     5       57       7                  5-7
 10    4101000010     7       57       11        5        5-7
 11    4101000010     11      117      7         7        11-7
 12    4101000010     7       117      4         11       11-7
 13    4101000010     4       47       7         7        4-7
 14    4101000010     7       47       6         4        4-7
 15    4101000010     6       67       7         7        6-7
 16    4101000010     7       67                 6        6-7
 17    4101000011     4       47       7                  4-7
 18    4101000011     7       47       11        4        4-7
 19    4101000011     11      1        4         7        1
 20    4101000011     4       1        7         11       1
 21    4101000011     7       1        6         4        1
 22    4101000011     6       67       7         7        6-7
 23    4101000011     7       67       9         6        6-7
 24    4101000011     9       1        5         7        1
 25    4101000011     5       1        7         9        1
 26    4101000011     7       1        4         5        1
 27    4101000011     4       47       7         7        4-7
 28    4101000011     7       47       4         4        4-7
 29    4101000011     4       47       7         7        4-7
 30    4101000011     7       47       6         4        4-7
 31    4101000011     6       67       7         7        6-7
 32    4101000011     7       67       5         6        6-7
 33    4101000011     5       57       7         7        5-7
 34    4101000011     7       57                 5        5-7
 35    4101000015     4       47       7                  4-7
 36    4101000015     7       47       11        4        4-7
 37    4101000015     11      117      7         7        11-7
 38    4101000015     7       117                11       11-7
ZhihanZhou
Calcite | Level 5

Hi Tom,

Thanks for your solution. Firstly, to your Question: all of them are numeric. also I have a few questions about your code:

1. following your code should i rewrite 'length hz $5' as 'length hz 12.', considering they are numeric?

2. how should i define first.id and last.id? or it will work automatic?

 

 

Tom
Super User Tom
Super User

@ZhihanZhou wrote:

Hi Tom,

Thanks for your solution. Firstly, to your Question: all of them are numeric. also I have a few questions about your code:

1. following your code should i rewrite 'length hz $5' as 'length hz 12.', considering they are numeric?

2. how should i define first.id and last.id? or it will work automatic?

 

 


If they are numeric what the heck does the new HZ variable represent? How can you combine two numbers into one?  Since the second one seems to be always one digit then perhaps you could use:

hz = previous*10 + ZWECK;

You do not normally need to define a LENGTH for a numeric variable.   You certainly CANNOT define a numeric variable as length of 12 since all numeric variables are 64 bit floating point numbers so the maximum number of bytes they can occupy is 8.  Perhaps you are confusing the format attached to a number which will tell SAS how you want the number to PRINT.  The LENGTH of a variable is how much space is required to store it in the dataset.

 

FIRST. and LAST. are defined by the use of the BY statement.  SAS will set them to TRUE or FALSE as it reads through the input data.

ZhihanZhou
Calcite | Level 5

Hello Tom,

 

thanks for your reply!

i have tried your code, a part of them succeed, but still come some erorr Warnings: 'invalid numeric data'(is it because 'if previous='7' or first.id then hz=catx('-',ZWECK,next);' this sentence makes those numeric data as character?) and some missing value, like following:

IDZWECKnextprevioushz
410100000347..
410100000375447
4101000003577.
410100000374557
4101000003477.
410100000374447
4101000003477.
41010000037.447
410100001057..
4101000010711557
41010000101177.
41010000107411117
4101000010477.
410100001076447
4101000010677.
41010000107.667
410100001147..
4101000011711447

 

ZhihanZhou_1-1662233520084.png

 

 

Tom
Super User Tom
Super User

Why did you keep the quotes when the variable is numeric?

if previous=7 then ...
ZhihanZhou
Calcite | Level 5
Thanks!! It‘s succeed!
mkeintz
PROC Star

And what if you have a sequence such as

 

4

7

7

 

for zweck?

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
ZhihanZhou
Calcite | Level 5
Normally, it will not. Because those are mobility data about demand profiles of high ways and those ‚zweck‘ represent different demand. ‚7‘ means drive to home, logically there is at less one another activity between two zweck ‚7‘. People can not just back to home twice without going out.
mkeintz
PROC Star

You can do this relatively simply using a self-merge with offset:

 

 data want (drop=nxt_:);
   merge have  
         have (firstobs=2 keep=id zweck rename=(id=nxt_id zweck=nxt_zweck));

   retain hz '     ';
   if nxt_zweck=7 and id=nxt_id then hz=catx('-',zweck,nxt_zweck);
   output;
   if nxt_id^=id or nxt_zweck^=7 then hz='';
run;
--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 11 replies
  • 677 views
  • 2 likes
  • 4 in conversation