Hello,
I have a database who looks like this:
Years | pf1 | pf2 | pf3 | pf4 | pf5 | pf6 | pf7 | pf8 | … | pf119 |
2006 | 0,99995 | 0,99984 | 0,99969 | 0,99954 | 0,99939 | 0,99924 | 0,99909 | 0,99894 | 0,99864 | |
2007 | 0,90419475 | 0,86805513 | 0,95407293 | 0,8676255 | 0,9106782 | 0,89055109 | 0,99534984 | 0,82628474 | 0,85245713 | |
2008 | 0,9547502 | 0,86854718 | 0,80312147 | 0,94014155 | 0,92665103 | 0,97910746 | 0,95293207 | 0,94262974 | 0,86068283 | |
2009 | 0,84331363 | 0,81522629 | 0,98906643 | 0,88725626 | 0,94812541 | 0,85975439 | 0,93789548 | 0,98424024 | 0,82137827 | |
2010 | 0,83663445 | 0,84507837 | 0,91888898 | 0,95447863 | 0,87323238 | 0,98027643 | 0,93558909 | 0,85553856 | 0,97474632 | |
2011 | 0,86543087 | 0,9058221 | 0,80951478 | 0,97289518 | 0,80704643 | 0,94382676 | 0,89678733 | 0,93423198 | 0,89965292 | |
2012 | 0,91703081 | 0,88559104 | 0,8006615 | 0,89746979 | 0,96224727 | 0,93283978 | 0,99704944 | 0,95622047 | 0,92032314 | |
… | ||||||||||
2056 | 0,91923551 | 0,98895088 | 0,94733737 | 0,88924355 | 0,84496988 | 0,88136723 | 0,86645268 | 0,88707045 | 0,9918978 |
I would like to create a single database for each line (without the first column, which is the name of the file), i.e.:
2006.sas7bdat:
pf1 | pf2 | pf3 | pf4 | pf5 | pf6 | pf7 | pf8 | … | pf119 |
0,99995 | 0,99984 | 0,99969 | 0,99954 | 0,99939 | 0,99924 | 0,99909 | 0,99894 | 0,99864 |
2007.sas7bdat:
pf1 | pf2 | pf3 | pf4 | pf5 | pf6 | pf7 | pf8 | … | pf119 |
0,90419475 | 0,86805513 | 0,95407293 | 0,8676255 | 0,9106782 | 0,89055109 | 0,99534984 | 0,82628474 | 0,85245713 |
etc.
Is there a way to do this with a macro or something. Otherwise, it's not very efficient to do it manually.
Thanks
I would have done it this way.
%Macro temp;
Data _null_;
Set Temp end = eof;
count+1;
if eof then call symput('c_yr', compress(count));
Run;
Proc sql;
select years into : yr1 - :yr&c_yr from temp;
quit;
%do i= 1 %to &c_yr;
Data Y&&yr&i.;
Set temp ;
where years eq &&yr&i.;
Run;
%end;
%Mend;
%temp;
data have;
input year p1;
cards;
2005 234.1
2006 456.4
2007 786.1
;
proc sql noprint;
select year into:key separated by ' '
from have;
quit;
%macro dist;
%do i=1 %to %sysfunc(countw(&key));
%let dsn=%scan(&key,&i);
data _&dsn;
set have;
where year=&dsn;
drop year;
run;
%end;
%mend;
%dist
I would have done it this way.
%Macro temp;
Data _null_;
Set Temp end = eof;
count+1;
if eof then call symput('c_yr', compress(count));
Run;
Proc sql;
select years into : yr1 - :yr&c_yr from temp;
quit;
%do i= 1 %to &c_yr;
Data Y&&yr&i.;
Set temp ;
where years eq &&yr&i.;
Run;
%end;
%Mend;
%temp;
Thanks all. This forum is wonderful. Hopefully I would be as good programer as you are in the next years.
I think if you move the WHERE statements to the DATA statement as a data set options for each data set name you generate you could create all the data sets with one pass of the data.
Yes, Data_null_ , you are right. that would make it fast.
Hi Bhpinder,
You can combine your two steps:
Data _null_;
Set Temp end = eof;
count+1;
if eof then call symput('c_yr', compress(count));
Run;
Proc sql;
select years into : yr1 - :yr&c_yr from temp;
quit;
into one:
Data _null_;
Set Temp end = eof;
call symputx(cats('yr',_n_),years);
if eof then call symputx('c_yr',_n_);
Run;
Thank you so much Linlin. I did not know about call symputx. It's such a great place to learn.
The macro solutions are great for such problems.
Incidentally, the hash-of-hashes (HOH) technique was first demonstrated by Richard DeVenezia for dataset splitting.
I have used a similar HOH approach, using dataset functions to read in data:
data have;
infile datalines dsd dlm=" ";
input Years $ (pf1-pf8) ($);
datalines;
2006 0,99995 0,99984 0,99969 0,99954 0,99939 0,99924 0,99909 0,99894 0,99864
2007 0,90419475 0,86805513 0,95407293 0,8676255 0,9106782 0,89055109 0,99534984 0,82628474 0,85245713
2008 0,9547502 0,86854718 0,80312147 0,94014155 0,92665103 0,97910746 0,95293207 0,94262974 0,86068283
2009 0,84331363 0,81522629 0,98906643 0,88725626 0,94812541 0,85975439 0,93789548 0,98424024 0,82137827
2010 0,83663445 0,84507837 0,91888898 0,95447863 0,87323238 0,98027643 0,93558909 0,85553856 0,97474632
2011 0,86543087 0,9058221 0,80951478 0,97289518 0,80704643 0,94382676 0,89678733 0,93423198 0,89965292
2012 0,91703081 0,88559104 0,8006615 0,89746979 0,96224727 0,93283978 0,99704944 0,95622047 0,92032314
;
run;
data _null_;
length pf $15;
if(1=2) then set have;
declare hash p (ordered:"a");
p.defineKey("years");
p.defineData("years","ObjPF");
p.defineDone();
declare hiter hip("p");
declare hash ObjPF;
*****Bring in data using data access functions**********************;
dsid=open("have");
nrows=attrn(dsid,"NOBS");
ncols=attrn(dsid,"NVAR");
do x=1 to nrows;
rc=fetchobs(dsid,x);
years=getvarc(dsid,1);
nametext="YR"||strip(left(years));
rc=fetchobs(dsid,x);
do y=2 to ncols;
pf=getvarc(dsid,y);
***** Create Hash-of-Hashes*********************************;
if p.find() ne 0 then do;
ObjPF=_new_ hash(ordered:"a",multidata:"y");
ObjPF.defineKey("y");
ObjPF.defineData("pf");
ObjPF.defineDone();
p.replace();
end; *for p.find;
ObjPF.replace();
end; *for y;
end; *for x;
****hash-of-hashes readout and dataset creation*********************;
do rc = hip.first() by 0 while (rc = 0);
ObjPF.output (dataset:'YR_'||years);
rc = hip.next();
end; *for hip.first;
stop;
run;
This might be an easier hash solution:
proc sql noprint;
select cats("'",name,"'")
into :vars
separated by ","
from dictionary.columns
where libname="WORK" and
memname="HAVE" and
upcase(name) ne "YEARS"
;
quit;
data _null_ ;
dcl hash hid (ordered: 'a') ;
hid.definekey ('years') ;
hid.definedata (&vars. ) ;
hid.definedone ( ) ;
do until ( last.years ) ;
set have ;
by years ;
hid.add() ;
end ;
hid.output (dataset: "_"||put (years, best.-l)) ;
run;
Another way, inspired by Linlin's :
data have;
input year p1;
cards;
2005 234.1
2006 456.4
2007 786.1
;
proc sql noprint;
select
cats("_",year),
cats("when(",year,") output _",year)
into
:dts separated by ' ',
:when separated by ';'
from have;
quit;
data &dts.;
set have;
select (year);
&when.;
end;
drop year;
run;
PG
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.