Hi SAS Forum,
I have this data set which has 3 variables (Orig_date, Curr_QTR, Village), of them two variables are having missing values.
Orig_date Curr_QTR Village
03/01/1993 | 03/31/2008 | A |
. | 06/30/2008 |
|
. | 09/30/2008 |
|
. | 12/31/2008 |
|
06/01/1995 | 03/31/2008 |
|
. | 06/30/2008 |
|
12/01/1995 | 03/31/2008 | B |
. | 06/30/2008 |
|
. | 09/30/2008 |
|
. | 12/31/2008 |
|
03/01/1996 | 03/31/2008 |
|
. | 06/30/2008 |
|
. | 09/30/2008 |
|
. | 12/31/2008 |
|
. | 03/31/2009 |
|
. | 06/30/2009 |
|
. | 09/30/2009 |
|
. | 12/31/2009 |
|
Please see below SAS code which readily produces the above data set.
data have;
informat orig_date curr_QTR date9.;
input village $ 1-1 orig_date curr_QTR;
format orig_date curr_QTR MMDDYY10.;
cards;
A 01MAR1993 31MAR2008
. 30JUN2008
. 30SEP2008
. 31DEC2008
01JUN1995 31MAR2008
. 30JUN2008
B 01DEC1995 31MAR2008
. 30JUN2008
. 30SEP2008
. 31DEC2008
01MAR1996 31MAR2008
. 30JUN2008
. 30SEP2008
. 31DEC2008
. 31MAR2009
. 30JUN2009
. 30SEP2009
. 31DEC2009
; run;
Question:
I need to fill in the missing values of variables “orig_date” and “village” with the upper value that would be found at the top of a given missing block.
Below is the table that I want.
Orig_date Curr_QTR Village
03/01/1993 | 03/31/2008 | A |
03/01/1993 | 06/30/2008 | A |
03/01/1993 | 09/30/2008 | A |
03/01/1993 | 12/31/2008 | A |
06/01/1995 | 03/31/2008 | A |
06/01/1995 | 06/30/2008 | A |
12/01/1995 | 03/31/2008 | B |
12/01/1995 | 06/30/2008 | B |
12/01/1995 | 09/30/2008 | B |
12/01/1995 | 12/31/2008 | B |
03/01/1996 | 03/31/2008 | B |
03/01/1996 | 06/30/2008 | B |
03/01/1996 | 09/30/2008 | B |
03/01/1996 | 12/31/2008 | B |
03/01/1996 | 03/31/2009 | B |
03/01/1996 | 06/30/2009 | B |
03/01/1996 | 09/30/2009 | B |
03/01/1996 | 12/31/2009 | B |
I have tried this but no hope.
proc sort data=have;
by curr_qtr;
run;
data want;
set have;
retain oldorig_date;
by curr_qtr;
if first.curr_qtr then oldorig_date = orig_date;
else orig_date = oldorig_date;
drop oldorig_date;
run;
Could anyone help me.
Thanks
Mirisage
Try this:
data want (drop= ori vill);
retain ori 0 Vill ' ';
set have;
if orig_date = .
then orig_date=ori;
else ori=orig_date;
if village=''
then village=vill;
else vill=village;
run;
Regards,
Try this:
data want (drop= ori vill);
retain ori 0 Vill ' ';
set have;
if orig_date = .
then orig_date=ori;
else ori=orig_date;
if village=''
then village=vill;
else vill=village;
run;
Regards,
Do not sort; the values are carried over according to the original data order. One way to do this is :
data want;
length lastVillage $1;
retain lastVillage lastOrig;
set have;
village = coalescec(village, lastVillage);
orig_date = coalesce(orig_date, lastOrig);
lastVillage = village;
lastOrig = orig_date;
drop lastVillage lastOrig;
run;
PG
Or you can try a trick I learned from DN:
data have;
informat orig_date curr_QTR date9.;
input village $ 1-1 orig_date curr_QTR;
format orig_date curr_QTR MMDDYY10.;
retain dummy '1';
cards;
A 01MAR1993 31MAR2008
. 30JUN2008
. 30SEP2008
. 31DEC2008
01JUN1995 31MAR2008
. 30JUN2008
B 01DEC1995 31MAR2008
. 30JUN2008
. 30SEP2008
. 31DEC2008
01MAR1996 31MAR2008
. 30JUN2008
. 30SEP2008
. 31DEC2008
. 31MAR2009
. 30JUN2009
. 30SEP2009
. 31DEC2009
;
data want;
update have (obs=0) have;
by dummy;
output;
drop dummy;
run;
Haikuo
Hello,
For simpler approach you can try PROC EXPAND
http://www.ats.ucla.edu/stat/sas/faq/creating_tiimeseries_variables_proc_expand.htm
Try if this helps
@yeshwanth,
If you can afford SAS/ETS, Proc expand/Proc timeseries are very powerful tools. However, in this case, I doubt either will work. They generally required data to be sorted by a time/date variable in general or within groups; and if there are groups, group variable can NOT be missing.
I could be very wrong since I haven't used these tools very often. I am therefore hoping you or some expert like / @data null can provide some code to prove me wrong.
Thanks,
Haikuo
I posted this questions few minutes ago
I am not sure if I posted it correctly. could you please let me know I it is posted correctly.
with many thanks
Stoline
Hi sstoline,
I think you have not posted it correctly. You have posted your query as a response to my query.
Just post as a new separate posting.
Thanks
Mirisage
Dear Mirisage:
Did you see this question posted before?
I posted it few times, but no answers available yet. So I am not sure if others able to see it.
many thanks
abou
This question is Not Answered.(Mark as assumed answered)
Dear All:
I do need helps with how to use both PROC surveyfreq and PROC surveymeans to do the following analysis.
I really thank you all for your helps.
I have the following variables:
Stratum1 (1,2)
Months (0,1,2)
X1 (1=yes,2=No) ---- nominal categorical
X2 ------ interval continuous
Weight1 ---- weight variable
I need
(1) to test the equality of proportions of ones in months (0,1,2), and conduct a multiple comparison test for these proportions.
(2) to test the equality of the averages in months(0,1,2) and conduct a multiple comparison test for these means.
(3) test for trend: for example if there is increasing or decreasing in averages (X2), if there is increasing or decreasing in proportions of ones (X1).
Any helps will be appreciated.
Please email me copy of your answer.
With many thanks
Steve
Email: sstoline@gmail.com
data1--- as an example
=============================
X2 X1 stratum1 months weight1
29.4822 2 2 0 16.61921705
33.2849 2 2 0 16.61921705
16.9370 1 1 0 2.09612124
19.0055 1 1 0 2.09612124
21.1151 1 2 0 16.61921705
25.4055 2 2 0 16.61921705
24.4164 1 1 0 1.56167161
19.0767 1 1 0 1.56167161
23.1562 2 1 0 2.09612124
23.3479 1 1 0 2.09612124
21.7370 1 1 0 2.09612124
24.9726 2 2 0 16.61921705
20.3836 1 2 0 16.61921705
20.0575 1 2 0 20.88637405
17.6603 1 2 0 16.61921705
25.2274 2 2 0 16.61921705
20.9644 1 1 0 2.09612124
32.8055 2 2 0 14.51497509
39.5233 2 1 0 1.56167161
17.7288 1 1 1 2.09612124
18.3096 1 2 1 16.61921705
25.8055 2 2 1 16.61921705
30.0904 1 2 1 16.61921705
27.8082 2 1 1 2.09612124
37.3863 2 1 1 2.09612124
26.2548 2 1 1 2.09612124
17.8795 1 2 1 20.88637405
19.8000 1 2 1 16.61921705
28.6932 2 1 1 1.56167161
27.1041 2 2 1 16.61921705
36.1836 2 2 1 16.61921705
19.8192 1 1 1 2.09612124
31.9644 2 2 1 20.88637405
27.0740 1 1 1 2.09612124
23.5288 2 1 1 2.09612124
21.9068 1 2 1 14.51497509
20.3534 1 2 1 16.61921705
29.4767 2 2 1 14.51497509
22.8274 1 2 1 16.61921705
21.2685 1 2 1 20.88637405
31.8932 1 1 1 1.56167161
31.8795 2 1 1 2.09612124
21.2630 2 1 1 2.09612124
18.7562 2 1 1 2.09612124
16.8822 1 2 1 20.88637405
22.0164 2 2 2 20.88637405
27.4959 2 1 2 2.09612124
27.4904 2 1 2 1.56167161
2.7096 2 2 2 14.51497509
23.0027 1 2 2 16.61921705
19.2767 1 2 2 16.61921705
30.4466 1 2 2 16.61921705
36.5425 2 2 2 16.61921705
32.2521 2 1 2 1.56167161
36.9534 1 2 2 14.51497509
30.4164 2 2 2 14.51497509
20.2767 2 1 2 2.09612124
17.2356 1 2 2 20.88637405
20.8247 2 2 2 16.61921705
20.8795 1 2 2 16.61921705
27.5260 2 2 16.61921705
19.2493 1 2 2 16.61921705
17.8521 1 2 2 16.61921705
19.6822 1 2 2 16.61921705
This is again a good case for using the UPDATE trick that I have seen Data_null_ post many times.
In this case you need to create a variable with a constant value for all observations as UPDATE requires a BY variable and you do not have one.
data want;
set have;
dummy=1;
run;
data want ;
update want(obs=0) want;
by dummy;
output;
drop dummy;
run;
Hi Tom, can you explain why OUTPUT is required in that last datastep? Thanks. PG
By default, update statement will only output at the end of each by variable. In this case, it will only output one record if not adding "output".
Haikuo
So UPDATE alters the default behaviour of the datastep to have an implicit OUTPUT at the end if there isn't an explicit one. Are there other cases where that happens?
There is "always" an implicit OUTPUT unless you get MODIFY involved. Other exceptions I reckon but I don't know any off hand.
UPDATE "applies" all transaction records to the uniquely keyed master records before the implicit output is executed.
like IF LAST.key THEN OUTPUT;
In this atypical usage we are only interested in the transaction records and the special UPDATE feature "missing check".
Hi Torres, PGStats, Hai.kuo and Tom,
Many thanks to each one of you.
Each one of your codes generates idenitical resutls for my larger dataset as well.
Thanks again!
Hi yeshwanth and data_null_;
Thanks both of you for the contribution.
Regards
Mirisage
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.