Hi SAS Community!
I was hoping someone could help me with the following issue.
I have 2 data sets, Dataset 1 and Dataset 2. I need to use Dataset 2 to manipulate the data in Dataset 1 to create Dataset 3. Dataset 2 has the Start time of the data and I need all the data that is populated before the Start time for each variable to be changed to blank. Below is the visuals for this process.
| Dataset 1 | |||||
| Time | T1 | T2 | T3 | T4 | T5 |
| 1/1/2017 0:00 | 6.818272 | 4.030038 | 8.213272 | 0.060326 | 9.521269 |
| 1/1/2017 1:00 | 0.100979 | 3.009709 | 2.359012 | 1.749067 | 8.0581 |
| 1/1/2017 2:00 | 1.20293 | 0.524906 | 0.686842 | 2.561308 | 3.58268 |
| 1/1/2017 3:00 | 0.178333 | 8.815986 | 7.657224 | 5.581614 | 1.142414 |
| 1/1/2017 4:00 | 6.654463 | 2.398171 | 2.696395 | 2.181824 | 2.000096 |
| 1/1/2017 5:00 | 5.428405 | 7.665879 | 7.996566 | 9.112351 | 3.012801 |
| 1/1/2017 6:00 | 7.710257 | 1.296825 | 7.965649 | 1.278932 | 8.198926 |
| 1/1/2017 7:00 | 5.733929 | 8.763634 | 7.37632 | 4.684965 | 6.252313 |
| 1/1/2017 8:00 | 0.173659 | 8.191108 | 7.592309 | 0.764799 | 1.81264 |
| 1/1/2017 9:00 | 2.782826 | 3.713065 | 8.526002 | 1.06912 | 9.897541 |
| 1/1/2017 10:00 | 4.46292 | 4.154905 | 4.766856 | 2.918054 | 1.398898 |
| 1/1/2017 11:00 | 1.949195 | 1.306601 | 4.989425 | 6.146979 | 6.172791 |
| 1/1/2017 12:00 | 9.865355 | 5.047937 | 2.965091 | 8.109938 | 0.024781 |
| 1/1/2017 13:00 | 6.361114 | 0.850972 | 3.595049 | 1.18486 | 3.629694 |
| 1/1/2017 14:00 | 7.494559 | 7.221369 | 9.410831 | 1.467266 | 2.488785 |
| 1/1/2017 15:00 | 2.129789 | 5.699575 | 0.627782 | 5.550582 | 0.275467 |
| 1/1/2017 16:00 | 9.046102 | 5.785753 | 0.764385 | 5.695247 | 6.812834 |
| 1/1/2017 17:00 | 8.056089 | 2.70044 | 0.916423 | 8.568327 | 3.729957 |
| 1/1/2017 18:00 | 9.318779 | 7.345054 | 5.430452 | 0.23345 | 0.561244 |
| 1/1/2017 19:00 | 4.827045 | 9.602832 | 1.841213 | 9.62645 | 8.05601 |
| 1/1/2017 20:00 | 1.623731 | 9.719941 | 4.036305 | 2.812807 | 7.412819 |
| 1/1/2017 21:00 | 2.199275 | 7.198554 | 0.28249 | 0.405406 | 6.038437 |
| 1/1/2017 22:00 | 8.08961 | 6.457083 | 8.185543 | 7.05954 | 5.085864 |
| 1/1/2017 23:00 | 3.040053 | 7.585161 | 8.839786 | 2.42752 | 7.113683 |
| Dataset 2 | |
| Name | Start |
| T1 | 1/1/2017 7:00 |
| T2 | 1/1/2017 2:00 |
| T3 | 1/1/2017 15:00 |
| T4 | 1/1/2017 0:00 |
| T5 | 1/1/2017 20:00 |
This is the desired output:
| Final Dataset | |||||
| Time | T1 | T2 | T3 | T4 | T5 |
| 1/1/2017 0:00 | 0.060326 | ||||
| 1/1/2017 1:00 | 1.749067 | ||||
| 1/1/2017 2:00 | 0.524906 | 2.561308 | |||
| 1/1/2017 3:00 | 8.815986 | 5.581614 | |||
| 1/1/2017 4:00 | 2.398171 | 2.181824 | |||
| 1/1/2017 5:00 | 7.665879 | 9.112351 | |||
| 1/1/2017 6:00 | 1.296825 | 1.278932 | |||
| 1/1/2017 7:00 | 5.733929 | 8.763634 | 4.684965 | ||
| 1/1/2017 8:00 | 0.173659 | 8.191108 | 0.764799 | ||
| 1/1/2017 9:00 | 2.782826 | 3.713065 | 1.06912 | ||
| 1/1/2017 10:00 | 4.46292 | 4.154905 | 2.918054 | ||
| 1/1/2017 11:00 | 1.949195 | 1.306601 | 6.146979 | ||
| 1/1/2017 12:00 | 9.865355 | 5.047937 | 8.109938 | ||
| 1/1/2017 13:00 | 6.361114 | 0.850972 | 1.18486 | ||
| 1/1/2017 14:00 | 7.494559 | 7.221369 | 1.467266 | ||
| 1/1/2017 15:00 | 2.129789 | 5.699575 | 0.627782 | 5.550582 | |
| 1/1/2017 16:00 | 9.046102 | 5.785753 | 0.764385 | 5.695247 | |
| 1/1/2017 17:00 | 8.056089 | 2.70044 | 0.916423 | 8.568327 | |
| 1/1/2017 18:00 | 9.318779 | 7.345054 | 5.430452 | 0.23345 | |
| 1/1/2017 19:00 | 4.827045 | 9.602832 | 1.841213 | 9.62645 | |
| 1/1/2017 20:00 | 1.623731 | 9.719941 | 4.036305 | 2.812807 | 7.412819 |
| 1/1/2017 21:00 | 2.199275 | 7.198554 | 0.28249 | 0.405406 | 6.038437 |
| 1/1/2017 22:00 | 8.08961 | 6.457083 | 8.185543 | 7.05954 | 5.085864 |
| 1/1/2017 23:00 | 3.040053 | 7.585161 | 8.839786 | 2.42752 | 7.113683 |
Hopefully someone out there can help on this!
Thank you!
Here is one way. A nice problem to use the array lookup
data dataset1;
input Time:anydtdtm. T1-T5;
format Time datetime20.;
infile datalines dlm=',';
datalines;
1/1/2017 0:00,6.818272,4.030038,8.213272,0.060326,9.521269
1/1/2017 1:00,0.100979,3.009709,2.359012,1.749067,8.0581
1/1/2017 2:00,1.20293,0.524906,0.686842,2.561308,3.58268
1/1/2017 3:00,0.178333,8.815986,7.657224,5.581614,1.142414
1/1/2017 4:00,6.654463,2.398171,2.696395,2.181824,2.000096
1/1/2017 5:00,5.428405,7.665879,7.996566,9.112351,3.012801
1/1/2017 6:00,7.710257,1.296825,7.965649,1.278932,8.198926
1/1/2017 7:00,5.733929,8.763634,7.37632,4.684965,6.252313
1/1/2017 8:00,0.173659,8.191108,7.592309,0.764799,1.81264
1/1/2017 9:00,2.782826,3.713065,8.526002,1.06912,9.897541
1/1/2017 10:00,4.46292,4.154905,4.766856,2.918054,1.398898
1/1/2017 11:00,1.949195,1.306601,4.989425,6.146979,6.172791
1/1/2017 12:00,9.865355,5.047937,2.965091,8.109938,0.024781
1/1/2017 13:00,6.361114,0.850972,3.595049,1.18486,3.629694
1/1/2017 14:00,7.494559,7.221369,9.410831,1.467266,2.488785
1/1/2017 15:00,2.129789,5.699575,0.627782,5.550582,0.275467
1/1/2017 16:00,9.046102,5.785753,0.764385,5.695247,6.812834
1/1/2017 17:00,8.056089,2.70044,0.916423,8.568327,3.729957
1/1/2017 18:00,9.318779,7.345054,5.430452,0.23345,0.561244
1/1/2017 19:00,4.827045,9.602832,1.841213,9.62645,8.05601
1/1/2017 20:00,1.623731,9.719941,4.036305,2.812807,7.412819
1/1/2017 21:00,2.199275,7.198554,0.28249,0.405406,6.038437
1/1/2017 22:00,8.08961,6.457083,8.185543,7.05954,5.085864
1/1/2017 23:00,3.040053,7.585161,8.839786,2.42752,7.113683
;
data dataset2;
input Name $ Time:anydtdtm.;
infile datalines dlm=',';
format Time datetime20.;
datalines;
T1,1/1/2017 7:00
T2,1/1/2017 2:00
T3,1/1/2017 15:00
T4,1/1/2017 0:00
T5,1/1/2017 20:00
;
data want(drop= i j name);
array lookup{5} _temporary_;
do i=1 by 1 until(eof1);
set dataset2 end=eof1;
lookup[i]=Time;
end;
array T{5};
do until(eof2);
set dataset1 end=eof2;
do j=1 to dim(lookup);
if Time lt lookup[j] then T[j]=.;
end;
output;
end;
run;
Here is one way. A nice problem to use the array lookup
data dataset1;
input Time:anydtdtm. T1-T5;
format Time datetime20.;
infile datalines dlm=',';
datalines;
1/1/2017 0:00,6.818272,4.030038,8.213272,0.060326,9.521269
1/1/2017 1:00,0.100979,3.009709,2.359012,1.749067,8.0581
1/1/2017 2:00,1.20293,0.524906,0.686842,2.561308,3.58268
1/1/2017 3:00,0.178333,8.815986,7.657224,5.581614,1.142414
1/1/2017 4:00,6.654463,2.398171,2.696395,2.181824,2.000096
1/1/2017 5:00,5.428405,7.665879,7.996566,9.112351,3.012801
1/1/2017 6:00,7.710257,1.296825,7.965649,1.278932,8.198926
1/1/2017 7:00,5.733929,8.763634,7.37632,4.684965,6.252313
1/1/2017 8:00,0.173659,8.191108,7.592309,0.764799,1.81264
1/1/2017 9:00,2.782826,3.713065,8.526002,1.06912,9.897541
1/1/2017 10:00,4.46292,4.154905,4.766856,2.918054,1.398898
1/1/2017 11:00,1.949195,1.306601,4.989425,6.146979,6.172791
1/1/2017 12:00,9.865355,5.047937,2.965091,8.109938,0.024781
1/1/2017 13:00,6.361114,0.850972,3.595049,1.18486,3.629694
1/1/2017 14:00,7.494559,7.221369,9.410831,1.467266,2.488785
1/1/2017 15:00,2.129789,5.699575,0.627782,5.550582,0.275467
1/1/2017 16:00,9.046102,5.785753,0.764385,5.695247,6.812834
1/1/2017 17:00,8.056089,2.70044,0.916423,8.568327,3.729957
1/1/2017 18:00,9.318779,7.345054,5.430452,0.23345,0.561244
1/1/2017 19:00,4.827045,9.602832,1.841213,9.62645,8.05601
1/1/2017 20:00,1.623731,9.719941,4.036305,2.812807,7.412819
1/1/2017 21:00,2.199275,7.198554,0.28249,0.405406,6.038437
1/1/2017 22:00,8.08961,6.457083,8.185543,7.05954,5.085864
1/1/2017 23:00,3.040053,7.585161,8.839786,2.42752,7.113683
;
data dataset2;
input Name $ Time:anydtdtm.;
infile datalines dlm=',';
format Time datetime20.;
datalines;
T1,1/1/2017 7:00
T2,1/1/2017 2:00
T3,1/1/2017 15:00
T4,1/1/2017 0:00
T5,1/1/2017 20:00
;
data want(drop= i j name);
array lookup{5} _temporary_;
do i=1 by 1 until(eof1);
set dataset2 end=eof1;
lookup[i]=Time;
end;
array T{5};
do until(eof2);
set dataset1 end=eof2;
do j=1 to dim(lookup);
if Time lt lookup[j] then T[j]=.;
end;
output;
end;
run;
Thank you for all your help! Quick question. The dataset I showed only had 5 observations. My actual dataset has thousands each with unique names that is not as simple as T1, T2, and so on. What can I change the array to so it handles this?
@Singham20 can you post some data that represents this? Not with thousands of unique variables though.
So here is how the variable look. I am just showing 5, however there are 24,785 variables.
| Time | _48164_1 | _W468s_1 | _di4ek452_5 | _47816_1 | _zidle_45 |
| 1/1/2017 0:00 | 6.818272 | 4.030038 | 8.213272 | 0.060326 | 9.521269 |
1/1/2017 1:00 | 0.100979 | 3.009709 | 2.359012 | 1.749067 | 8.0581 |
| 1/1/2017 2:00 | 1.20293 | 0.524906 | 0.686842 | 2.561308 | 3.58268 |
| 1/1/2017 3:00 | 0.178333 | 8.815986 | 7.657224 | 5.581614 | 1.142414 |
| 1/1/2017 4:00 | 6.654463 | 2.398171 | 2.696395 | 2.181824 | 2.000096 |
| 1/1/2017 5:00 | 5.428405 | 7.665879 | 7.996566 | 9.112351 | 3.012801 |
| 1/1/2017 6:00 | 7.710257 | 1.296825 | 7.965649 | 1.278932 | 8.198926 |
| 1/1/2017 7:00 | 5.733929 | 8.763634 | 7.37632 | 4.684965 | 6.252313 |
| 1/1/2017 8:00 | 0.173659 | 8.191108 | 7.592309 | 0.764799 | 1.81264 |
| 1/1/2017 9:00 | 2.782826 | 3.713065 | 8.526002 | 1.06912 | 9.897541 |
| 1/1/2017 10:00 | 4.46292 | 4.154905 | 4.766856 | 2.918054 | 1.398898 |
| 1/1/2017 11:00 | 1.949195 | 1.306601 | 4.989425 | 6.146979 | 6.172791 |
| 1/1/2017 12:00 | 9.865355 | 5.047937 | 2.965091 | 8.109938 | 0.024781 |
| 1/1/2017 13:00 | 6.361114 | 0.850972 | 3.595049 | 1.18486 | 3.629694 |
| 1/1/2017 14:00 | 7.494559 | 7.221369 | 9.410831 | 1.467266 | 2.488785 |
| 1/1/2017 15:00 | 2.129789 | 5.699575 | 0.627782 | 5.550582 | 0.275467 |
| 1/1/2017 16:00 | 9.046102 | 5.785753 | 0.764385 | 5.695247 | 6.812834 |
| 1/1/2017 17:00 | 8.056089 | 2.70044 | 0.916423 | 8.568327 | 3.729957 |
| 1/1/2017 18:00 | 9.318779 | 7.345054 | 5.430452 | 0.23345 | 0.561244 |
| 1/1/2017 19:00 | 4.827045 | 9.602832 | 1.841213 | 9.62645 | 8.05601 |
| 1/1/2017 20:00 | 1.623731 | 9.719941 | 4.036305 | 2.812807 | 7.412819 |
| 1/1/2017 21:00 | 2.199275 | 7.198554 | 0.28249 | 0.405406 | 6.038437 |
| 1/1/2017 22:00 | 8.08961 | 6.457083 | 8.185543 | 7.05954 | 5.085864 |
| 1/1/2017 23:00 | 3.040053 | 7.585161 | 8.839786 | 2.42752 | 7.113683 |
| 1/1/2017 23:00 | 3.040053 | 7.585161 | 8.839786 | 2.42752 | 7.113683 |
If your data is representative and all of the variables of interest have prefix _, then simply do
data want(drop= i j name);
array lookup{5} _temporary_;
do i=1 by 1 until(eof1);
set dataset2 end=eof1;
lookup[i]=Time;
end;
do until(eof2);
set dataset1 end=eof2;
array T{*} _:;
do j=1 to dim(lookup);
if Time lt lookup[j] then T[j]=.;
end;
output;
end;
run;
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.