can u help me trough this
I wrote code like this but its not the efficient code
data AU CANADA France Germany NZ UK US;
set myhome.product;
if country = 'Australia' then output AU;
if country = 'Canada' then output Canada;
if country = 'France' then output France;
if country = 'Germany' then output Germany;
if country = 'New Zealand' then output NZ;
if country = 'United Kingdom' then output UK;
if country = 'United States' then output US;
IF / THEN statements are very quick. It's surprising that you find the program slow. At any rate, adding ELSE would speed things up:
if country = 'Australia' then output AU;
ELSE if country = 'Canada' then output Canada;
ELSE if country = 'France' then output France;
ELSE if country = 'Germany' then output Germany;
ELSE if country = 'New Zealand' then output NZ;
ELSE if country = 'United Kingdom' then output UK;
ELSE if country = 'United States' then output US;
In addition, you could change the order of the statements if you know something about your data. Right now, the possibilities get evaluated in alphabetical order ("Australia" first, "United States" last). You could improve that by listing the possibilities in order from most frequently occurring to least frequently occurring.
can you help me to write more efficient code , what if number of countries are more than 100
here there are only few of them
so i can directly use the if else statement
It sounds like your definition of "efficient" means easy for you to write, not code that runs faster.
We can always find a way to make the programming easier, but the program will become more complex and might take longer to run. If that's your objective, start by creating a SAS data set that contains two variables: the country (such as "United States") and the data set name you want to use for that country (such as "US"). Then we can worry about how to automate creation of the IF/THEN statements.
actually u didnt get my point, what i m trying to tell u that in this case i know the name of countries in advance that my data set have this this countries. what if i dont know the countries and a that time i have to create separate data set for each country
@Karan_Dumbre wrote:
actually u didnt get my point, what i m trying to tell u that in this case i know the name of countries in advance that my data set have this this countries. what if i dont know the countries and a that time i have to create separate data set for each country
I agree with what others have said, and I repeat their advice much more loudly: USE ONE DATASET, PROCESS USING BY GROUPS
Do you know all the possible countries in your myhome.product dataset in advance?
yes i know
but what if i dont know the countries . in that case what would be the solution
I agree with @RW9 - having a seperate file for each country isn't a good strategy and in the long run will lead to more complications than it resolves. However if you don't know which countries are in your file you find out as follows
proc sql;
create table countries
as select distinct country
from product;
quit;
Once you have that file you COULD, if you absolutely need to, write a macro to generate one file for each country using that as a control file but I'd strongly advise against it.
@Karan_Dumbre wrote:
yes i know
but what if i dont know the countries . in that case what would be the solution
Do you imply that you want to do the same thing for every country in the data? The typical approach for that is to use BY group processing.
Something like:
Proc sort data=product;
by country;
run;
Proc Print data=product;
by country;
run;
Almost every report or analysis procedure supports BY groups and it is much easier (and "efficient") in general to use By groups for analysis in one data set and then split the results to different pages of a report or similar based on the by groups as the last step.
My first question is going to be why. Its rarely, if ever a good idea to split data into separate datasets - consider if you then want to print them, you have to write code to proc print each dataset - whereas with one dataset you can use by groups. Just one example of why one dataset is better than many.
The second quesiton is going to be, how do you know that Canada should go out to Canada and no CA, as some have two letter codes others do not - no consitency.
Third question, do you know all the combinations, what about countries you haven't though of - again this ties in with splitting data, by grouping you don't need to know this up front.
Finally for multiple ifs, select may be simpler:
data au canada...; set myhome.product; select (country); when ("Australia") output au; when ("Canada") output canada; ... otherwise; end; run;
I have to agree with lots of others that splitting a data set for perceived efficiency is usually not needed. But, sometimes we don't get to choose our tasks.
This blog post presents a few lines of code that uses PROC SQL to generate the DATA steps that will split your data. It's not the most efficient in performance, but it requires almost no additional planning or data prep. Other experts chimed in on the comments with their own ideas.
Code:
/* define which libname.member table, and by which column */
%let TABLE=sashelp.cars;
%let COLUMN=origin;
proc sql noprint;
/* build a mini program for each value */
/* create a table with valid chars from data value */
select distinct
cat("DATA out_",compress(&COLUMN.,,'kad'),
"; set &TABLE.(where=(&COLUMN.='", &COLUMN.,
"')); run;") length=500 into :allsteps separated by ';'
from &TABLE.;
quit;
/* macro that includes the program we just generated */
%macro runSteps;
&allsteps.;
%mend;
/* and...run the macro when ready */
%runSteps;
@Karan_Dumbre To your question- "what if i dont know the countries . in that case what would be the solution?".
Are you comfortable with Hashes? If yes, here you go-
/*This is super simple and fully dynamic*/
proc sort data=product;
by country;
run;
data _null_;
if _n_=1 then do;
if 0 then set product;
declare hash h(dataset:'product(obs=0)',multidata:'yes' );
h.defineKey('country');
h.defineData(all:'yes');
h.defineDone();
end;
set product;
by country;
if first.country then h.clear();
h.add();
if last.country then h.output(dataset:'country'||compress(country));
run;
warning: fragile code!
select cat(...) %*fragile!;
into :mvar
@ChrisHemedinger, remember that the cat* functions create text whose default length is $200
and when truncation occurs --- you stuffed tooooo much text into the function call ---
no warning is given
Ron Fehd been burned by this very trick maven
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.