I am a beginner. Could you explain in plain English what this code does?
data monthly;
set test.monthly_exch_mv(where=(_type_=3));
by yyyymm exchange;
format numberofFirms comma8.0;
drop _type_;
run;
Within a DATA step, _TYPE_ is nothing special. It's just the name of a variable that exists within the incoming data set TEST.MONTHLY_EXCH_MV.
The WHERE clause subsets which observations should be read in from that incoming data set.
The harder part is what you don't see here. How did _TYPE_ get created within the data, what values does it take on, and why read in those observations where _TYPE_ is 3? For that, you will need to do a little bit of studying (perhaps more than a little), but I can point you in the right direction. Almost certainly, there is an earlier PROC MEANS or PROC SUMMARY that creates _TYPE_. Look at the documentation for either of those procedures (they perform the same calculations, so it doesn't matter which one you choose). In particular, look at the effects of adding a CLASS statement.
Experiment with a few PROC SUMMARY examples, to get a feel for the values of _TYPE_. It may not be the easiest thing in the world, but it is worthwhile to spend the time.
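Here is a small experiment along those lines. It is only a sketch: the data set WORK.FIRMS, the variable MV, and all the values are made up for illustration, and I am assuming (not claiming) that your TEST.MONTHLY_EXCH_MV was built in a similar way.

```sas
/* Hypothetical firm-level data; the values are invented for illustration. */
data work.firms;
  input yyyymm exchange $ mv;
  datalines;
202301 NYSE 100
202301 NYSE 150
202301 NASDAQ 80
202302 NYSE 120
202302 NASDAQ 90
202302 NASDAQ 60
;
run;

/* Two CLASS variables and no NWAY option, so the output data set */
/* holds every level of summarization, each tagged with _TYPE_.   */
proc summary data=work.firms;
  class yyyymm exchange;
  var mv;
  output out=work.exch_mv n(mv)=numberofFirms sum(mv)=total_mv;
run;

/* Look at _TYPE_ next to the CLASS variables to see the pattern. */
proc print data=work.exch_mv;
run;
```

With two CLASS variables, _TYPE_ takes the values 0 through 3: 0 is the overall total, 1 is summarized by EXCHANGE only, 2 is by YYYYMM only, and 3 is one row per YYYYMM and EXCHANGE combination. Keeping only _TYPE_=3 therefore keeps the most detailed level and drops the subtotals.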
Good luck.
Yes, several SAS procedures create a special variable named _type_. However, a variable named _type_ can also be created in a DATA step, just like any other variable.
For example:
data test;
input sales_id $
sales_jn
sales_fe
sales_mr;
datalines;
W6790 50 400 350
W7693 25 100 125
W1387 99 300 250
;
run;
data tot;
set test;
_type_ = sales_jn + sales_fe + sales_mr;
run;
data sel;
set tot(where=(_type_=250));
run;
OUTPUT: test
sales_id | sales_jn | sales_fe | sales_mr |
W6790 | 50 | 400 | 350 |
W7693 | 25 | 100 | 125 |
W1387 | 99 | 300 | 250 |
OUTPUT: tot
sales_id | sales_jn | sales_fe | sales_mr | _type_ |
W6790 | 50 | 400 | 350 | 800 |
W7693 | 25 | 100 | 125 | 250 |
W1387 | 99 | 300 | 250 | 649 |
OUTPUT: sel
sales_id | sales_jn | sales_fe | sales_mr | _type_ |
W7693 | 25 | 100 | 125 | 250 |
The code filters the data on the _TYPE_ variable, but since we don't know the source data we can't say what that filter means. There are several procs that add a _TYPE_ variable. Do you know which one was used to create the input data set, test.monthly_exch_mv?
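If you are not sure, one quick way to look is to inspect the input data set itself. This is only a sketch, and it assumes the TEST libref is already assigned in your session:

```sas
/* List the variables and attributes of the source data set. */
proc contents data=test.monthly_exch_mv;
run;

/* Count how many observations there are for each value of _TYPE_. */
proc freq data=test.monthly_exch_mv;
  tables _type_;
run;
```

If _TYPE_ shows only small integer values such as 0, 1, 2, and 3, that is consistent with (though not proof of) an earlier PROC MEANS or PROC SUMMARY step.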
data monthly; /* Creates an output data set named MONTHLY. By default a one-level name is stored in the temporary WORK library, which is deleted when you end your SAS session. */
set test.monthly_exch_mv(where=(_type_=3)); /*
#1 TEST.MONTHLY_EXCH_MV is your source data set. The WHERE= option reads in only the records where _type_ equals 3.
#2 TEST is a libref for a permanent library, assigned earlier in a LIBNAME statement.
A libref is an alias for a path or folder location.
After you end your SAS session, the data set MONTHLY_EXCH_MV will still exist physically in the location named in your LIBNAME TEST statement.
*/
by yyyymm exchange; /* Tells SAS the data are sorted by yyyymm and exchange (the step stops with an error if they are not). These act as grouping variables, but nothing in the statements below uses the grouping, so the BY statement does not change the output here. */
format numberofFirms comma8.0; /* Displays numberofFirms with commas and no decimals, in a total width of 8 characters. */
drop _type_; /* Leaves the _type_ variable out of the output data set. */
run; /* Ends the DATA step and executes it. */