BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
yanshuai
Quartz | Level 8

Hello all.

 

I hve data such as

Name              YEAR                Weight

Jack                2000                     200

Jack                2001                    205

Jack                2002                    206

Jack                2003                     203

Jack                2004                     209

Jack                2005                     210

John                2000                     200

John               2001                     205

John               2002                     206

John                2003                     203

John               2004                     209

John                2005                     210

 

How to capture the previous 3 years standard deviation of WEIGHT for each NAME? If previous years only has observation less than 3, it can return 0.

I know STD code, but I don't know how to capure the standard deviation for a certain range of time.

 

This is the data table I want

 

Name              YEAR                Weight         STD

Jack                2000                     200             0

Jack                2001                    205              0

Jack                2002                    206              0

Jack                2003                     203             3.5 (This is a number I make up, but it should be the STD of WEIGHT from 2000 to 2003)

Jack                2004                     209             3.4 (Again, this should be the STD of WEIGHT from 2001 to 2004)  

Jack                2005                     210             5.2 (Again, this should be the STD of WEIGHT from 2002 to 2005)

John                2000                     200

John               2001                     205

John               2002                     206

John                2003                     203

John               2004                     209

John                2005                     210

 

proc sql;
create table want as select *,
(select std(WEIGHT) from have where YEAR between a.YEAR-3 and a.YEAR and NAME=a.NAME) as std  
from have as a;
quit;

Above is the code I use.

But the result shows even for the beginning year, it also has a small standard deviation. That looks weird to me. Because, if it only has one value, and it should have no variation then no standard deviation.

 

and sub query is also confusing me. Why I cannot just put the condition from year-3 to year using the GROUP BY?

Thank you all!

 

1 ACCEPTED SOLUTION

Accepted Solutions
mkeintz
PROC Star

If the data are sorted, use the DATA step, with a BY statement:

 

data want;
  set have;
  by name;
  seq+1;
  if first.name then seq=1;
  wgt_std3=ifn(seq>=4
              ,std(lag3(weight),lag2(weight),lag1(weight),weight)
              ,0);
run;

 

The lag function takes the prior OBSERVATIONS, so the program assumes there are no gaps in the sequence of years.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

View solution in original post

5 REPLIES 5
novinosrin
Tourmaline | Level 20

Are you after something like this?

data have;
input Name    $          Age                Weight ;
datalines;
Jack                18                     200
Jack                19                     205
Jack                20                     206
Jack                21                     203
Jack                22                     209
Jack                23                     210
John                18                     200
John               19                     205
John               20                     206
John                21                     203
John               22                     209
John                23                     210
John               24                     209
John                25                     210
;


proc sql;
create table want1(drop=k) as
select *,floor(age/3) as k,ifn(count(calculated k)>=3,std(weight),0) as std
from have
group by name,k	;
quit;
novinosrin
Tourmaline | Level 20

or just name and std variables?

 

proc sql;
create table want(drop=k) as
select name,floor(age/3) as k,ifn(count(calculated k)>=3,std(weight),0) as std
from have
group by name,k	;
quit;
yanshuai
Quartz | Level 8

Hello.

I modify the problem a little.

What if the AGE is the YEAR?

And I only want the STD every 4 years.

mkeintz
PROC Star

If the data are sorted, use the DATA step, with a BY statement:

 

data want;
  set have;
  by name;
  seq+1;
  if first.name then seq=1;
  wgt_std3=ifn(seq>=4
              ,std(lag3(weight),lag2(weight),lag1(weight),weight)
              ,0);
run;

 

The lag function takes the prior OBSERVATIONS, so the program assumes there are no gaps in the sequence of years.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
yanshuai
Quartz | Level 8

Thank you very much.

This works.

 

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 888 views
  • 0 likes
  • 3 in conversation