turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

Find a Community

- Home
- /
- Data Management
- /
- Forum
- /
- Referencing data elements in SAS

Topic Options

- Subscribe to RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

12-01-2011 02:35 PM

Hi All,

An hour and a half on Google Search didn't get me anywhere with this basic question

on referencing data elements:

I want to modify a variable in a sas dataset based on the values

of another variable.

Lets call the vars myA and myB which are both numeric and are 5000x1 cloumns.

The following program shows what i need but doesn't do the job.

Data new_data;

set old_data;

Do i=2 to 5000;

If MyB(i)=0 then MyA(i)=MyA(i-1);

end;

drop i;

run;

I appreciate any help!

Accepted Solutions

Solution

12-01-2011
06:03 PM

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

12-01-2011 06:03 PM

Thanks Lawrence for pointing out the lag function, LinLin for your timely response and art297 for the detailed procedure.

art297's answer would probably help but since the actual number of

records in my dataset was +72000 I wasn't sure if it was the best thing to do. I also needed something more

straightforward.

After a long search on the web, I figured out how

to create a matrix from the data set, do my operations and

transform it back into a data set which does the thing i needed to do.

Here's my code:

Proc IML;

use old_dataset var{myB, myA};

read all into my_matrix;

Do i=2 to 72676;

If my_matrix[i,1]=0 then my_matrix[i,2]=my_matrix[i-1,2];

end;

Create new_dataset from my_matrix[colname={'myB' 'myA'}];

append from my_matrix;

Quit;

the above code uses IML. To check if you have it as part of your

SAS package run the code below:

* determine what SAS compoentes are installed;

Proc Setinit;

Run;

All Replies

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

12-01-2011 02:53 PM

this only works if "5000x1 cloumns" means 5000 records.

Data new_data;

set old_data;

If MyB=0 then MyA=MyA-1;

run;

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

12-01-2011 03:11 PM

Thanks for your reply, but it doesn't seem to work. and yes i meant 5000 records.

The data looks like this:

MyA MyB

------- ------

4 5

2 5

. 0

. 0

. 0

1 3

1 2

1 2

The operation is supposed to replace the missing values with 2.

I also tried if MyA=. then MyA=MyA-1;

It didn't work either.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

12-01-2011 03:47 PM

Moein,

This is a classic problem for using the LAG function (available in the DATA step, but not in SQL). Search for

lag function

in Documentaion on support.sas.com.

It goes something like this (untested):

DATA new;

SET old_data;

prevA=lag(mya);

if myb=0 then mya=preva;

RUN;

There are a lot of nuances to the LAG statement, so be sure to read all the documention.

Doc Muhlbaier

Duke

EDIT: Opps. I missed one of the nuances. This will get the first missing, but not the subsequent ones. You will probably need to conditionally RETAIN some values to get it to work.

Message was edited by: Lawrence Muhlbaier

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

12-01-2011 04:08 PM

**data** old;

input a myb;

cards;

4 5

2 5

. 0

. 0

. 0

1 3

1 2

1 2

;

**run**;

**data** new(drop=b ;

set old;

b1=lag1(a);b2=lag2(a);b3=lag3(a) ;

if a eq **.** and b1 ne **.** then a=b1;

else if a eq **.** and b1 eq **.** and b2 ne **.** then a=b2;

else if a eq **.** and b1 eq **.** and b2 eq **.** and b3 ne **.** then a=b3;

**run**;

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

12-01-2011 04:16 PM

Linlin, I didn't use that approach as we don't know the maximum number of consecutive missing values in the file.

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

12-01-2011 05:38 PM

Thank you Art!

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

12-01-2011 04:03 PM

Since you only have 5,000 records, I would use a rather brute force approach:

/*use proc summary to transpose the records to one long record*/

proc summary nway data=old_data missing;

output out = transposed (drop=_type_ _freq_)

idgroup(out[5000](MyA MyB)=);

run;

data new_data (keep=MyA MyB);

set transposed;

/*put all of your data into 2 arrays*/

array aMyA(*) MyA:;

array aMyB(*) MyB:;

/*do all of your computations*/

do i=2 to dim(aMyA);

if aMyB(i)=0 then aMyA(i)=aMyA(i-1);

end;

/*output your new records*/

do i=1 to dim(aMyA);

MyA=aMyA(i);

MyB=aMyB(i);

output;

end;

run;

Solution

12-01-2011
06:03 PM

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

12-01-2011 06:03 PM

Thanks Lawrence for pointing out the lag function, LinLin for your timely response and art297 for the detailed procedure.

art297's answer would probably help but since the actual number of

records in my dataset was +72000 I wasn't sure if it was the best thing to do. I also needed something more

straightforward.

After a long search on the web, I figured out how

to create a matrix from the data set, do my operations and

transform it back into a data set which does the thing i needed to do.

Here's my code:

Proc IML;

use old_dataset var{myB, myA};

read all into my_matrix;

Do i=2 to 72676;

If my_matrix[i,1]=0 then my_matrix[i,2]=my_matrix[i-1,2];

end;

Create new_dataset from my_matrix[colname={'myB' 'myA'}];

append from my_matrix;

Quit;

the above code uses IML. To check if you have it as part of your

SAS package run the code below:

* determine what SAS compoentes are installed;

Proc Setinit;

Run;

- Mark as New
- Bookmark
- Subscribe
- Subscribe to RSS Feed
- Highlight
- Email to a Friend
- Report Inappropriate Content

12-01-2011 06:48 PM

Definitely go with proc IML. It will do the task in a fraction of a second. The method I proposed wouldn't work anyhow, as it can only handle 100 records. Proc transpose, in itself, would have taken over a second just to do one transposition, and two would be needed.

I have to review your code (since I'm not sufficiently familiar with IML), but it definitely appears to be a nice way of accomplishing the task.