Hi, I have a question about how to deal with each column using macro. Here is what the sample looks like:
date | time | interval | weekday | fj_dummy | jump | surec1 | surec2 | surus1 | surus2 | surus3 | suruk1 | suruk2 | |||
11/1/2004 | 09:00.0 | 2 | 2 | 0 | 0 | 2 | 1 | ||||||||
11/1/2004 | 14:00.0 | 3 | 2 | 0 | 0 | 1 | 3 | ||||||||
11/1/2004 | 19:00.0 | 4 | 2 | 0 | 0 | 1 | 2 | ||||||||
11/1/2004 | 24:00.0 | 5 | 2 | 0 | 0 | 4 | 1 | ||||||||
11/1/2004 | 29:00.0 | 6 | 2 | 0 | 0 |
For each time, I just want to deal with 'date','time','interval','weekday','fj_dummy','jump', and one of the rest of the columns. I only want to keep the observations that are non-missing and delete the missing observations. I realized I need to use macro since my whole sample is much bigger than the sample displayed here. I don't know within each 'ec','us','uk' category how to select the non-missing observations for each subcategory, which is 'surec1', 'surec2', etc.
If possible, can I export the files with macro for each subcategory and also keep 'date','time','interval','weekday','fj_dummy','jump' ? That is to say, in each file, I have date','time','interval','weekday','fj_dummy','jump' and one of the rests of columns.
This is one of the csv files called EUR, I also have two other csv files with the same structure as this. So I'm wondering in this case, do I need more than one macro?
Here is the code I wrote, but I was stuck here for a long time. Any help would be appreciated.
%let path=E:\one_drive\OneDrive\RA\Jumps; /*Set path*/
libname RA "&path";
%Macro news(cur);
PROC IMPORT OUT= RA.&cur._jump
DATAFILE= "&path\Data\&cur._jump_news_abs1.csv"
DBMS=CSV REPLACE;
GETNAMES=yes;
DATAROW=2;
RUN;
DATA RA.&cur._jump;
set RA.&cur._jump;
IF fj_dummy=0 THEN DELETE;
RUN;
%Mend news;
%news (EUR);
%news (GBP);
%news (JPY);
%Macro currency(cur);
data RA.&cur._jump;
set RA.&cur._jump;
%macro surprise(cur,n_f,n_e);
%do i=&n_f %to &n_e;
IF sur&cur.&i.=. THEN sur&cur.&i.=0;
%end;
%mend surprise;
%surprise(ec,1,16);
%surprise(us,1,43);
%surprise(fr,1,3);
%surprise(ge,1,13);
%surprise(it,1,8);
%surprise(jp,1,3);
%surprise(po,1,2);
%surprise(sp,1,4);
%surprise(uk,1,13);
run;
%Mend currency;
%currency(EUR);
%currency(GBP);
%currency(JPY);
%Macro currency(cur);
%Macro country(con);
DATA RA.&cur._jump_sur&con.;
set RA.&cur._jump;
Keep Date time interval weekday fj_dummy jump sur&con.:;
RUN;
%Mend country;
%country(ec);
%country(us);
%country(fr);
%country(ge);
%country(it);
%country(jp);
%country(po);
%country(sp);
%country(uk);
%Mend currency;
%currency(EUR);
%currency(GBP);
%currency(JPY);
I'm afraid your explanation of the desired result isn't clear.
and one of the rest of the column
Which one? you didn't tell us
I only want to keep the observations that are non-missing and delete the missing observations.
What is an observation that is non-missing in your example? I don't see one.
I don't know within each 'ec','us','uk' category
There are no such categories that I can see
Show us, for this example data, what is the desired output. Type it in, show us.
Sure, 'one of the rest of column' means choose from 'surec1', 'surec2', 'surec3', 'surus1',' surus2', and so on.
where 'sur' means 'surprise', 'ec', 'us' are the code for different region or countries. 1, 2, 3 are the different categories within each country. I have 9 countries in total and within each country, I have different number of categories. The sample I attached was cut by the page so that's why the sample didn't help much in explanation.
non-missing observations are the numbers in each column. the missing one are the blank cell in each column. The missing observations only appear under the column 'surec1', 'surec2' and so on.
The desired output would be as following.
date | time | interval | weekday | fj_dummy | jump | surec1 | surec2 | surus1 | surus2 | surus3 | suruk1 | suruk2 |
11/1/2004 | 09:00.0 | 2 | 2 | 0 | 0 | 2 | 1 | |||||
11/1/2004 | 14:00.0 | 3 | 2 | 0 | 0 | 1 | 3 | |||||
11/1/2004 | 19:00.0 | 4 | 2 | 0 | 0 | 1 | 2 | |||||
11/1/2004 | 24:00.0 | 5 | 2 | 0 | 0 | 4 | 1 | |||||
11/1/2004 | 29:00.0 | 6 | 2 | 0 | 0 | |||||||
Count | 1 | 2 | 1 | 0 | 1 | 1 | 2 |
Well I hope someone smarter than I am can figure out how to go from the inputs to the outputs.
In your original data, 11/1/2004 09:00.0, surec2 is blank, but in the output dataset surec2 is 2.
My head hurts.
Sorry, I forgot to update my input file. Now it should make more sense.
I really have no idea what your question is, but it you want to produce that line labeled COUNT in the spreadsheet like thing you posted then you might look at PROC MEANS/SUMMARY.
proc summary data=have ;
var sur: ;
output out=counts n= ;
run;
Thank you, Tom. I will try that.
Hi Tom, the log erro shows 'surec1' in list does not match type prescribed. Do you know why does that happen?
If you are reading this data in from Excel (or even sometimes if you are not), sometimes the columns that appear to us humans to be numbers are treated as characters values.
If that's the problem, you can force the columns to be numeric.
@Xusheng wrote:
Hi Tom, the log erro shows 'surec1' in list does not match type prescribed. Do you know why does that happen?
Sounds like the variable is character instead of numeric.
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.