Re: create a new Dataset for every unique Country and it should have d...

Karan_Dumbre · Posted 09-06-2017 05:32 AM

can u help me trough this

I wrote code like this but its not the efficient code

data AU CANADA France Germany NZ UK US;
set myhome.product;
if country = 'Australia' then output AU;
if country = 'Canada' then output Canada;
if country = 'France' then output France;
if country = 'Germany' then output Germany;
if country = 'New Zealand' then output NZ;
if country = 'United Kingdom' then output UK;
if country = 'United States' then output US;

Astounding · Posted 09-06-2017 05:36 AM

IF / THEN statements are very quick. It's surprising that you find the program slow. At any rate, adding ELSE would speed things up:

if country = 'Australia' then output AU;
ELSE if country = 'Canada' then output Canada;
ELSE if country = 'France' then output France;
ELSE if country = 'Germany' then output Germany;
ELSE if country = 'New Zealand' then output NZ;
ELSE if country = 'United Kingdom' then output UK;
ELSE if country = 'United States' then output US;

In addition, you could change the order of the statements if you know something about your data. Right now, the possibilities get evaluated in alphabetical order ("Australia" first, "United States" last). You could improve that by listing the possibilities in order from most frequently occurring to least frequently occurring.

Karan_Dumbre · Posted 09-06-2017 05:40 AM

can you help me to write more efficient code , what if number of countries are more than 100

here there are only few of them

so i can directly use the if else statement

Astounding · Posted 09-06-2017 05:46 AM

It sounds like your definition of "efficient" means easy for you to write, not code that runs faster.

We can always find a way to make the programming easier, but the program will become more complex and might take longer to run. If that's your objective, start by creating a SAS data set that contains two variables: the country (such as "United States") and the data set name you want to use for that country (such as "US"). Then we can worry about how to automate creation of the IF/THEN statements.

Karan_Dumbre · Posted 09-06-2017 05:49 AM

actually u didnt get my point, what i m trying to tell u that in this case i know the name of countries in advance that my data set have this this countries. what if i dont know the countries and a that time i have to create separate data set for each country

PaigeMiller · Posted 09-06-2017 10:56 AM

@Karan_Dumbre wrote:

actually u didnt get my point, what i m trying to tell u that in this case i know the name of countries in advance that my data set have this this countries. what if i dont know the countries and a that time i have to create separate data set for each country

I agree with what others have said, and I repeat their advice much more loudly: USE ONE DATASET, PROCESS USING BY GROUPS

--
Paige Miller

PeterClemmensen · Posted 09-06-2017 05:37 AM

Do you know all the possible countries in your myhome.product dataset in advance?

The DATA to DATA Step Macro
Blog: SASnrd

Karan_Dumbre · Posted 09-06-2017 05:41 AM

yes i know

but what if i dont know the countries . in that case what would be the solution

ChrisBrooks · Posted 09-06-2017 05:54 AM

I agree with @RW9 - having a seperate file for each country isn't a good strategy and in the long run will lead to more complications than it resolves. However if you don't know which countries are in your file you find out as follows

proc sql;
	create table countries
	as select distinct country
	from product;
quit;

Once you have that file you COULD, if you absolutely need to, write a macro to generate one file for each country using that as a control file but I'd strongly advise against it.

Ron_MacroMaven · Posted 09-06-2017 04:15 PM

Macro_CallText has code for this in section 3.3 Splitting_a_Data_Set

ballardw · Posted 09-06-2017 10:48 AM

@Karan_Dumbre wrote:

yes i know

but what if i dont know the countries . in that case what would be the solution

Do you imply that you want to do the same thing for every country in the data? The typical approach for that is to use BY group processing.

Something like:

Proc sort data=product;

by country;

run;

Proc Print data=product;

by country;

run;

Almost every report or analysis procedure supports BY groups and it is much easier (and "efficient") in general to use By groups for analysis in one data set and then split the results to different pages of a report or similar based on the by groups as the last step.

RW9 · Posted 09-06-2017 05:41 AM

My first question is going to be why. Its rarely, if ever a good idea to split data into separate datasets - consider if you then want to print them, you have to write code to proc print each dataset - whereas with one dataset you can use by groups. Just one example of why one dataset is better than many.

The second quesiton is going to be, how do you know that Canada should go out to Canada and no CA, as some have two letter codes others do not - no consitency.

Third question, do you know all the combinations, what about countries you haven't though of - again this ties in with splitting data, by grouping you don't need to know this up front.

Finally for multiple ifs, select may be simpler:

data au canada...;
  set myhome.product;
  select (country);
    when ("Australia") output au;
    when ("Canada") output canada;
    ...
    otherwise;
  end;
run;

ChrisHemedinger · Posted 09-06-2017 04:34 PM

I have to agree with lots of others that splitting a data set for perceived efficiency is usually not needed. But, sometimes we don't get to choose our tasks.

This blog post presents a few lines of code that uses PROC SQL to generate the DATA steps that will split your data. It's not the most efficient in performance, but it requires almost no additional planning or data prep. Other experts chimed in on the comments with their own ideas.

Code:

/* define which libname.member table, and by which column */
%let TABLE=sashelp.cars;
%let COLUMN=origin;
 
proc sql noprint;
/* build a mini program for each value */
/* create a table with valid chars from data value */
select distinct 
   cat("DATA out_",compress(&COLUMN.,,'kad'),
   "; set &TABLE.(where=(&COLUMN.='", &COLUMN.,
   "')); run;") length=500 into :allsteps separated by ';' 
  from &TABLE.;
quit;
 
/* macro that includes the program we just generated */
%macro runSteps;
 &allsteps.;
%mend;
 
/* and...run the macro when ready */
%runSteps;

SAS For Dummies 3rd Edition! Check out the new edition, covering SAS 9.4, SAS Viya, and all of the modern ways to use SAS!

novinosrin · Posted 09-06-2017 05:32 PM

@Karan_Dumbre To your question- "what if i dont know the countries . in that case what would be the solution?".

Are you comfortable with Hashes? If yes, here you go-

/*This is super simple and fully dynamic*/

proc sort data=product;

by country;

run;

data _null_;

if _n_=1 then do;

if 0 then set product;

declare hash h(dataset:'product(obs=0)',multidata:'yes' );

h.defineKey('country');

h.defineData(all:'yes');

h.defineDone();

end;

set product;

by country;

if first.country then h.clear();

h.add();

if last.country then h.output(dataset:'country'||compress(country));

run;

Ron_MacroMaven · Posted 09-08-2017 10:50 AM

warning: fragile code!

select cat(...)  %*fragile!;
into :mvar

@ChrisHemedinger, remember that the cat* functions create text whose default length is $200

and when truncation occurs --- you stuffed tooooo much text into the function call ---

no warning is given

Ron Fehd been burned by this very trick maven

create a new Dataset for every unique Country and it should have data of that specific country

Re: create a new Dataset for every unique Country and it should have data of that specific country

Re: create a new Dataset for every unique Country and it should have data of that specific country

Re: create a new Dataset for every unique Country and it should have data of that specific country

Re: create a new Dataset for every unique Country and it should have data of that specific country

Re: create a new Dataset for every unique Country and it should have data of that specific country

Re: create a new Dataset for every unique Country and it should have data of that specific country

Re: create a new Dataset for every unique Country and it should have data of that specific country

Re: create a new Dataset for every unique Country and it should have data of that specific country

Re: create a new Dataset for every unique Country and it should have data of that specific country

Re: create a new Dataset for every unique Country and it should have data of that specific country

Re: create a new Dataset for every unique Country and it should have data of that specific country

Re: create a new Dataset for every unique Country and it should have data of that specific country

Re: create a new Dataset for every unique Country and it should have data of that specific country

Re: create a new Dataset for every unique Country and it should have data of that specific country

Registration is open

SAS Training: Just a Click Away