Hi
I have a Data Set that looks like below:
It has an ID and then values bimonthly from 201802 onward, you can see that if the start date is 201802 then the values start from 201802, if its 201806 then they will start from 201806
ID | Start Date | 201802 | 201804 | 201806 | 201808 | 201810 | 201812 | 201902 | 201904 | 201906 | 201908 |
112 | 201802 | 500 | 210 | 525 | 555 | 565 | 555 | 598 | 600 | 521 | 123 |
231 | 201904 | 500 | 25 | 365 | |||||||
367 | 201806 | 100 | 256 | 586 | 265 | 254 | 2156 | 145 | 241 | ||
856 | 201908 | 255 | |||||||||
245 | 201812 | 258 | 125 | 582 | 254 | 153 |
I need the following as my output -
ID | Start Date | Refresh1 | Refresh2 | Refresh3 | Refresh4 | Refresh5 | Refresh6 | Refresh7 | Refresh8 | Refresh9 | Refresh10 |
112 | 201802 | 500 | 210 | 525 | 555 | 565 | 555 | 598 | 600 | 521 | 123 |
231 | 201904 | 500 | 25 | 365 | |||||||
367 | 201806 | 100 | 256 | 586 | 265 | 254 | 2156 | 145 | 241 | ||
856 | 201908 | 255 | |||||||||
245 | 201812 | 258 | 125 | 582 | 254 | 153 |
So basically all the values from the first column shift to the left and start from there....i am Just lost in how to accomplish this, any help is much appreciated
Thanks!
data have;
input ID StartDate :yymmn6. _201802 _201804 _201806 _201808 _201810 _201812 _201902 _201904 _201906 _201908;
infile datalines dlm=',' dsd;
datalines;
112,201802,500,210,525,555,565,555,598,600,521,123
231,201904,,,,,,,,500,25,365
367,201806,,,100,256,586,265,254,2156,145,241
856,201908,,,,,,,,,,255
245,201812,,,,,,258,125,582,254,153
;
data want (drop=_: j i);
set have;
array in _:;
array out Refresh1-Refresh10;
j=1;
do i=1 to 10;
if in[i] ne . then do;
out[j]=in[i];
j+1;
end;
end;
run;
Storing the data as a spreadsheet ("wide") is generally bad if you wish to transform it somehow.
First transpose it to long, with a month - value pair on each row.
Then you can in a succeeding data step just have a row counter going for each by group (ID startDate).
Then transpose it back (but not necessary as a data set, use a reporting PROC to give the desired output, like PROC REPORT.
data have;
infile cards expandtabs truncover ;
input ID StartDate _201802 _201804 _201806 _201808 _201810 _201812 _201902 _201904 _201906 _201908;
cards;
112 201802 500 210 525 555 565 555 598 600 521 123
231 201904 . . . . . . . 500 25 365
367 201806 . . 100 256 586 265 254 2156 145 241
856 201908 . . . . . . . . . 255
245 201812 . . . . . 258 125 582 254 153
;
run;
proc transpose data=have out=temp;
by id startdate notsorted;
var _201802 -- _201908;
run;
proc transpose data=temp(where=(col1 is not missing)) out=want prefix=refresh_;
by id startdate notsorted;
var col1;
run;
data test;
infile cards dsd firstobs=2;
input ID Start:B8601DA. S201802 S201804 S201806 S201808 S201810 S201812 S201902 S201904 S201906 S201908;
format start B8601DA10.;
cards;
ID, Start Date, 201802, 201804, 201806, 201808, 201810, 201812, 201902, 201904, 201906, 201908
112, 201802, 500, 210, 525, 555, 565, 555, 598, 600, 521, 123
231, 201904, ., ., ., ., ., ., ., 500, 25, 365
367, 201806, ., ., 100, 256, 586, 265, 254, 2156, 145, 241
856, 201908, ., ., ., ., ., ., ., ., ., 255
245, 201812, ., ., ., ., ., 258, 125, 582, 254, 153
;;;;
run;
proc print;
run;
proc transpose data=test out=test2(where=(not missing(col1)));
by id start notsorted;
run;
proc transpose data=test2 out=test2(drop=_name_) prefix=refresh ;
by id start notsorted;
run;
proc print;
run;
Second what all the others said, having data (dates) in structure (variable names) is always a bad idea and only useful for human consumption of final results. During processing and storage, always use a "long" format.
See Maxims 19 and 33.
Hello @SNG1
My share of fun. I'd rather not recommend unless you know this and it's safer to use linear loop method, but FWIW
data have;
infile cards expandtabs truncover firstobs=2;
input (ID StartDate) ($) _201802 _201804 _201806 _201808 _201810 _201812 _201902 _201904 _201906 _201908;
cards;
ID Start Date 201802 201804 201806 201808 201810 201812 201902 201904 201906 201908
112 201802 500 210 525 555 565 555 598 600 521 123
231 201904 . . . . . . . 500 25 365
367 201806 . . 100 256 586 265 254 2156 145 241
856 201908 . . . . . . . . . 255
245 201812 . . . . . 258 125 582 254 153
;
data want;
set have;
array t _201802--_201908;
__t1=whichn(coalesce(of t(*)),of t(*));
__k=dim(t)-__t1 +1;
__j=peekclong(addrlong(t(__t1)),8*__k);
call missing(of t(*));
call pokelong(__j,addrlong(t(1)),8*__k);
drop __:;
run;
Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.