Hi Experts,
Hope you are doing well and safe in this Covid19 Pandamic.
I want to read following csv files and create dataset. Please help me.
file:
student_id,mark1,mark2,mark3
1A,58,52,
85,2B,78,
43,,1C,58,98,65,4A,
87,99,95,5D,45,,43,6S,
23,
45,,7F,32,56,45,
1D,
32,45,67,2C,45,67,33,6G,43
37,87
Required Output:
student_id | mark1 | mark2 | mark3 |
1A | 58 | 52 | 85 |
2B | 78 | 43 | 0 |
1C | 58 | 98 | 65 |
4A | 87 | 99 | 95 |
5D | 45 | 0 | 43 |
6S | 23 | 45 | 0 |
and so on.... for missing marks put 0.
please help to solve
Thanks in Advance
This one avoids looping thus helps understanding so much more easier
data want;
infile '\\corp\sites\RIB1001\HLSCreditRisk\NaveenS\file.csv' firstobs=2 dsd end=l;
length student_id $2;
array t(3) mark1-mark3;
retain student_id t;
input @;
if char(_infile_,lengthn(_infile_))=',' then
_infile_=substr(_infile_,1,findc(_infile_,',','b')-1);
input _temp $ @@;
if _n_>1 then n+1;
_temp2=mod(n,4);
if _temp2=0 then student_id=_temp;
else t(_temp2)=ifn(_temp>' ',input(_temp,32.),0);
if nmiss(of t(*))=0 ;
output;
call missing(of t(*));
keep student_id mark:;
run;
student_id | mark1 | mark2 | mark3 |
1A | 58 | 52 | 85 |
2B | 78 | 43 | 0 |
1C | 58 | 98 | 65 |
4A | 87 | 99 | 95 |
5D | 45 | 0 | 43 |
6S | 23 | 45 | 0 |
7F | 32 | 56 | 45 |
1D | 32 | 45 | 67 |
2C | 45 | 67 | 33 |
6G | 43 | 37 | 87 |
data want;
infile cards flowover;
input student_id $2. mark1 mark2 mark3 @@;
cards;
1A,58,52,
85,2B,78,
43,,1C,58,98,65,4A,
87,99,95,5D,45,,43,6S,
23,
45,,7F,32,56,45,
1D,
32,45,67,2C,45,67,33,6G,43
37,87
;;;;
@Navnath_Gajare wrote:
Hi Experts,
Hope you are doing well and safe in this Covid19 Pandamic.
I want to read following csv files and create dataset. Please help me.
file:
student_id,mark1,mark2,mark3
1A,58,52,
85,2B,78,
43,,1C,58,98,65,4A,
87,99,95,5D,45,,43,6S,
23,
45,,7F,32,56,45,
1D,
32,45,67,2C,45,67,33,6G,43
37,87
Required Output:
student_id mark1 mark2 mark3 1A 58 52 85 2B 78 43 0 1C 58 98 65 4A 87 99 95 5D 45 0 43 6S 23 45 0 and so on.... for missing marks put 0.
please help to solve
Thanks in Advance
Thanks for reply but not getting required output
data want;
infile cards flowover dsd;
length student_id $2.;
input student_id mark1 mark2 mark3 @@;
cards;
1A,58,52,
85,2B,78,
43,,1C,58,98,65,4A,
87,99,95,5D,45,,43,6S,
23,
45,,7F,32,56,45,
1D,
32,45,67,2C,45,67,33,6G,43
37,87
;;;;
;;;;
This gets you closer. What EOL characters do you have in your file?
Were these files ever processed by spreadsheet software?
Did people make manual entries or edits?
The real question is "why are there irregular placement of end of line characters?"
Very few systems do that unless people get involved and manually do stuff. If there have been manual edits it may be possible to get the data before such intervention took place and have something that is cleaner to read.
Hi @Navnath_Gajare It's a simple solution once @Reeza gave us the right trigger i.e. Flowover. The rest is a breeze making flowover flow through the loop. Have fun!
data want;
infile '\\corp\sites\RIB1001\HLSCreditRisk\NaveenS\file.csv' firstobs=2 dsd end=l;
length student_id $2;
array t(3) mark1-mark3;
do n=0 by 1 until(l and temp2=3);
input @;
if char(_infile_,lengthn(_infile_))=',' then
_infile_=substr(_infile_,1,findc(_infile_,',','b')-1);
input _temp $ @;
_temp2=mod(n,4);
if _temp2=0 then student_id=_temp;
else t(_temp2)=ifn(_temp>' ',input(_temp,32.),0);
if nmiss(of t(*))=0 then do;
output;
call missing(of t(*));
end;
end;
keep student_id mark:;
run;
student_id | mark1 | mark2 | mark3 |
1A | 58 | 52 | 85 |
2B | 78 | 43 | 0 |
1C | 58 | 98 | 65 |
4A | 87 | 99 | 95 |
5D | 45 | 0 | 43 |
6S | 23 | 45 | 0 |
7F | 32 | 56 | 45 |
1D | 32 | 45 | 67 |
2C | 45 | 67 | 33 |
6G | 43 | 37 | 87 |
This one avoids looping thus helps understanding so much more easier
data want;
infile '\\corp\sites\RIB1001\HLSCreditRisk\NaveenS\file.csv' firstobs=2 dsd end=l;
length student_id $2;
array t(3) mark1-mark3;
retain student_id t;
input @;
if char(_infile_,lengthn(_infile_))=',' then
_infile_=substr(_infile_,1,findc(_infile_,',','b')-1);
input _temp $ @@;
if _n_>1 then n+1;
_temp2=mod(n,4);
if _temp2=0 then student_id=_temp;
else t(_temp2)=ifn(_temp>' ',input(_temp,32.),0);
if nmiss(of t(*))=0 ;
output;
call missing(of t(*));
keep student_id mark:;
run;
student_id | mark1 | mark2 | mark3 |
1A | 58 | 52 | 85 |
2B | 78 | 43 | 0 |
1C | 58 | 98 | 65 |
4A | 87 | 99 | 95 |
5D | 45 | 0 | 43 |
6S | 23 | 45 | 0 |
7F | 32 | 56 | 45 |
1D | 32 | 45 | 67 |
2C | 45 | 67 | 33 |
6G | 43 | 37 | 87 |
Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.