BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Josie1
Obsidian | Level 7

I am trying to retain a value using 2 by variables, namely ID and month. In addition to these, it may be useful to know that I have the two digit month and the 4 digit year as separate variables on this data set as well. this dataset spans from January 2012 - May 2018 and the first zero value for each id could start anywhere SO thats why i have the average inches for the year because some zeros are legitimate. If the yearly avg inches are zero then thats the to let me know that the bad data started either that year or sometime during the prior year. and i want to fill the inches for ID 101 for January 2018 with the value for January 2017 if it isn't 0 or if it is zero only use the zero if any of the rest of the inches2017 values for ID 101 are non zero after January. If all the inches2017 in 2017 after January 2017  are zero then apparently the bad data started and id prefer to use the january of the first year where where the average inches are non zero. I want the data to replace inches with the last nonzero value of inches for that ID for that month. like for ID 101, in January 2018, give me the inches for January 2017 or if thats not populated with nonzero give me 2016 January value .

 

data have;

input ID date inches inches2016 inches2017 inches2018 avg_in2016 avg_in2017 avg_in2018 baddatastart;

datalines;

101 201601 75 75 . . 85 49 0 201707

101 201602 85 85 . .  85 49 0 201707

101 201603 95 95 . . 85 49 0 201707

101 201604 105 105 . . 85 49 0 201707

101 201701 55 . 55 . 85 49 0 201707

101 201702 58 . 58 . 85 49 0 201707

101 201703 42 . 42 . 85 49 0 201707

101 201704 43 . 43 . 85 49 0 201707

101 201801 0 . . 0 85 49 0 201707

101 201802 0 . . 0 85 49 0 201707

102 201601 777 777 . . 751 840 0 201704

102 201602 799 799 . . 751 840 0 201704

102 201603 729 729 . . 751 840 0 201704

102 201604 749 749 . . 751 840 0 201704

102 201701 888 . 888 . 751 840 0 201704

102 201702 828 . 828 . 751 840 0 201704

102 201703 724 . 828 . 751 840 0 201704

102 201704 0 . 0 . 751 840 0 201704

102 201801 0 . . 0 751 840 0 201704; run;

1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

Sounds like you don't need all of those other variables.

Just make a new variable and retain the value.

data want ;
  set have ;
  by id date ;
  retain new_inches ;
  if first.id then new_inches=inches ;
  if inches not in (0,.) then new_inches=inches ;
run;

View solution in original post

4 REPLIES 4
lrudolphi
Obsidian | Level 7

Have you considered working with a wide file instead of a long file? Using the sample data you provided and following along with your post as best as I could, I think the below code may work. It is very basic, using array processing for each year of data to look back on the prior year. Hope this helps!

/*Create a wide file with a single row per ID.*/
proc sort data=have;
	by id avg_in2016 avg_in2017 avg_in2018 baddatastart date;
run;

/*Since the average inches and bad data start are constant for each ID,*/
/*use those as by variables.*/
proc transpose data=have out=have_wide (drop=_name_) prefix=inches_;
	by id avg_in2016 avg_in2017 avg_in2018 baddatastart;
	var inches;
	id date;
run;

/*Swap out zeroes for prior year, if required.*/
data have_wide_swap (drop=i);
	set have_wide;

	/*These could be made more dynamic depending on the volume of years you are working with*/
	array yr1 {4} inches_201601-inches_201604;
	array yr2 {4} inches_201701-inches_201704;
	array yr3 {2} inches_201801-inches_201802;

	/*Look at the year prior first. If not zero, use it.*/
	/*Otherwise, use the year before that.*/
	do i=1 to 2;
		if yr2{i} ne 0 then yr3{i}=yr2{i};
		else if yr2{i} eq 0 then yr3{i}=yr1{i};
	end;
run;
Tom
Super User Tom
Super User

Sounds like you don't need all of those other variables.

Just make a new variable and retain the value.

data want ;
  set have ;
  by id date ;
  retain new_inches ;
  if first.id then new_inches=inches ;
  if inches not in (0,.) then new_inches=inches ;
run;
Josie1
Obsidian | Level 7
Looks like this works! Geez! thanks
AnnaBrown
Community Manager

Hi @Josie1,

 

I'm glad you found some useful info! If one of the replies was the exact solution to your problem, can you "Accept it as a solution"? Or if one was particularly helpful, feel free to "Like" it. This will help other community members who may run into the same issue know what worked.

Thanks!
Anna

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

How to connect to databases in SAS Viya

Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 4 replies
  • 1212 views
  • 1 like
  • 4 in conversation