Gather data from HUGE text file
(OP)
I am attempting to write a macro that would retrieve data from text file that could be large (1500+ pages of text). The text file is the output of an engineering software package. Currently we manually gather the data to perform calculations that the software package does not handle. What I would like to do is have some way of using Excel to retrieve the data with just a few inputs by a user.
Example of the data:
MEMBER END FORCES               STRUCTURE TYPE = PLANE
-----------------
ALL UNITS ARE -- KIP FEET

MEMBER  LOAD  JT     AXIAL   SHEAR-Y   SHEAR-Z   TORSION     MOM-Y     MOM-Z

     1     1   1     54.05     -2.00      0.00      0.00      0.00    -61.73
               3    -52.26      2.00      0.00      0.00      0.00     21.71
           3   1     40.71     18.99      0.00      0.00      0.00    247.88
               3    -39.36     -9.99      0.00      0.00      0.00     41.96
     2     1   3     33.81     -5.48      0.00      0.00      0.00    -21.71
               7    -33.07      5.48      0.00      0.00      0.00    -60.43
           3   3     28.90     -0.16      0.00      0.00      0.00    -41.96
               7    -28.35      6.91      0.00      0.00      0.00    -11.07
     3     1   2     58.79      0.00     -2.00      0.00      0.00      0.00
               6    -56.99      0.00      2.00      0.00     40.02      0.00
           3   2     55.17      0.00     -3.51      0.00      0.00      0.00
               6    -53.82      0.00      3.51      0.00     70.15      0.00
Currently we search the text file to find the "MEMBER" for which we seek data (not difficult, but tedious), and then we manually enter the data into a spreadsheet that handles the calculations the software package does not.
What I want to do is automate this last step. If a user knows the "MEMBER", "LOAD", and "JT", can a macro be written to retrieve the corresponding data from this HUGE file?
OR EVEN BETTER
Can a macro be written to import all of the data for one, two, or three particular members into Excel? Please note that the data might span a few pages, and each page will have a header similar to that shown in the sample.
Factors that complicate this search:
1. This table might begin well into the text file.
2. The data will almost always span several pages, and the headings shown in the sample data will appear at the top of each page. The information sought will most likely be far into the table.
Any thoughts as to the best approach would be greatly appreciated.
Thanks
RE: Gather data from HUGE text file
It looks like output from some sort of simulation program. The output format is usually pretty consistent.
I usually use a macro call from Excel into Word, since Word's search engine is marginally better than Excel's.
TTFN
RE: Gather data from HUGE text file
The PERL language has excellent text search capabilities and text file handling, so you could write a very simple script in PERL (http://www.activestate.com/Perl.plex?hdr=1) to extract the lines you need; I doubt it would be more than 15 lines of code.
RE: Gather data from HUGE text file
IRstuff
I would do that if the output were small enough to be imported into Excel. It far exceeds the maximum number of rows for a worksheet. I need to search the source file and import only a small portion of the data. Also, the format for this portion of the data is consistent, but this regular layout might not start until line 64,586 of the output file. The last file I ran had over 219,000 lines of output, but this varies with the size of the analysis. Recording a macro for one output file would yield useless data when applied to a different output file, or if I wished to gather data for a different "MEMBER" in the same output file.
zappedagain
The output font is 8 pt., so there are even more lines than the page count suggests (see above). Sorry, I am not familiar with PERL; is it similar to VBA? (I can handle some VBA.)
Any further thoughts on this using VBA?
Thanks
RE: Gather data from HUGE text file
Filter languages like awk operate on text files, a line at a time.
They don't work quite like VBA.
For your application, in awk, program flow might go something like this:
Scan through stdin a line at a time. <You would redirect your huge file to stdin, from the command line or a batch file or a script file.>
When the beginning of a line looks like this:
2 1 3 <for that member | load | jt >
then process that line in this way,
and process the next line in this way,
etc.
and send the result to stdout, and
keep doing it until
you see the beginning of the next data set
(or for some arbitrary number of lines,
or whatever you need to do)
Then scan through the rest of the file, etc.
<You would redirect stdout to a new file, e.g. a csv file that you could import to Excel, containing only the data you want.>
I.e., the awk file acts as a "filter". At no time does it attempt to store the entire contents of the huge text file. There's no funny business associated with finding, opening or closing the huge file; redirection just pumps the huge file into the filter, and the filter picks out the part you want and spits it out, and redirection captures it in a form you can use.
The possibility of a page header inserted in your data can be dealt with by writing a secondary filter to remove page headers from the csv file, or, probably better, by writing a pre-processor to remove all the page headers from the huge file, generating a headerless huge file to pass to the data extractor.
Filters are in some ways easier to write than other programs, but they require a shift in the way you perceive the problem to be solved. Once you get over that, they are rather fun.
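To make that concrete, here is a minimal sketch of such a filter in awk, run from a shell. Everything in it is an assumption for illustration: the file names, the member number, and the layout rule (taken from the sample above) that a row naming a member has 9 whitespace-separated fields (MEMBER LOAD JT plus six force values) while continuation rows have fewer. A short stand-in for the huge file is created first so the sketch is self-contained.

```shell
# Stand-in for a slice of the huge solver output (from the post above).
cat > huge_output.txt <<'EOF'
MEMBER END FORCES STRUCTURE TYPE = PLANE
ALL UNITS ARE -- KIP FEET
MEMBER LOAD JT AXIAL SHEAR-Y SHEAR-Z TORSION MOM-Y MOM-Z
1 1 1 54.05 -2.00 0.00 0.00 0.00 -61.73
3 -52.26 2.00 0.00 0.00 0.00 21.71
2 1 3 33.81 -5.48 0.00 0.00 0.00 -21.71
7 -33.07 5.48 0.00 0.00 0.00 -60.43
3 1 2 58.79 0.00 -2.00 0.00 0.00 0.00
6 -56.99 0.00 2.00 0.00 40.02 0.00
EOF

# Keep only the rows belonging to MEMBER 2. A row that names a member is
# assumed to have exactly 9 fields and to start with a number; header text
# like "MEMBER LOAD JT ..." starts with a non-numeric field and is skipped.
awk -v member=2 '
    NF == 9 && $1 + 0 == $1 {          # a row that names a member:
        capture = ($1 == member)       # start/stop capturing at each one
    }
    capture && $1 + 0 == $1 { print }  # print numeric rows only
' huge_output.txt > member2.txt
```

Because only numeric rows pass the filter, the page headings that repeat throughout the real file fall out automatically, and member2.txt is a small file that Excel can import directly.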
Mike Halloran
Pembroke Pines, FL, USA
RE: Gather data from HUGE text file
Moreover, I didn't expect you to use the exact recorded macro for all your jobs. I assumed that you would take the recorded macro and add the required hooks to get the data you want.
TTFN
RE: Gather data from HUGE text file
Have a look at VBA's Line Input # statement.
That'll allow you to search for the name in the text file, then read the succeeding lines into the spreadsheet.
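A rough sketch of that read-search-extract loop (shown in shell only to illustrate the flow; the VBA version would use Open, Line Input #, and Close in the same pattern, and the file contents and target keys here are made-up stand-ins): read one line at a time, print the line that begins with the requested MEMBER LOAD JT, then print its continuation line.

```shell
# Stand-in for a slice of the huge output file.
cat > forces.txt <<'EOF'
1 1 1 54.05 -2.00 0.00 0.00 0.00 -61.73
3 -52.26 2.00 0.00 0.00 0.00 21.71
2 1 3 33.81 -5.48 0.00 0.00 0.00 -21.71
7 -33.07 5.48 0.00 0.00 0.00 -60.43
EOF

target="2 1 3"   # MEMBER LOAD JT supplied by the user (assumed input)
print_next=0
while IFS= read -r line; do
    # If the previous line matched, this one is its continuation row.
    if [ "$print_next" -eq 1 ]; then
        printf '%s\n' "$line"
        print_next=0
    fi
    case "$line" in
        "$target "*)                 # line begins with the requested keys
            printf '%s\n' "$line"
            print_next=1
            ;;
    esac
done < forces.txt
```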
Cheers
Greg Locock
Please see FAQ731-376 for tips on how to make the best use of Eng-Tips.
RE: Gather data from HUGE text file
If Excel's row limitation is a problem, you might need to break the text file into sections.
You might need to strip out the page headers first, which Word can do for you nicely via find and replace.
Alternately, you might consider importing your data into Access. You could then link Excel and Access together with VBA and generate queries to return the data you need.
RE: Gather data from HUGE text file
I rarely post messages on this forum, so please excuse my "interruption"..... I had a similar need to read MANY large, poorly formatted text files into Excel.
After a few years of manually importing and tweaking these into Excel, I stumbled upon AWK a few years back.... it was probably my most productive discovery since PowerPoint came out. I use AWK to read the large files and sort the data I need into neat space-separated files that I then simply open in Excel.
If you are going to process lots of text files and reformat or operate on the data (basic math stuff), you REALLY should look at AWK (or the newer version called GAWK). The few days I spent learning the syntax paid off years ago.
I run AWK on Linux and Unix, but I know it is also available (FREE!) for MS Windows PCs.
daveleo
RE: Gather data from HUGE text file
I have heard GAWK is good too, but I've never used it. I haven't programmed in VBA, so I can't compare; if you like to do things from the command line you might prefer PERL (you can add an HTML GUI onto the front of a PERL program if you want to).
Here is an example PERL program to list lines of text that contain "5**-" (5, any char, any char, dash), "400-", "800-", or "801-". The filename is passed on the command line, and the first 80 characters of each matching line are printed to STDOUT (so you can redirect the output to a file if necessary). Text comparison is very powerful in PERL, so it only takes one line to search for all of these. I'm a rookie PERL programmer, so this can probably be done in even fewer lines.
#!/usr/bin/perl -w
# get the filename from the command line
$filename = shift(@ARGV);
# open the file or quit if there is an error
open(IN, $filename) or die "error - Can't open \"$filename\"; use fsp.pl <filename>\n";
while (<IN>)
{
    chomp($line = $_);                         # strip the trailing newline
    if ($line =~ m/(5..-|400-|801-|800-)/)     # check for 5**-, 400-, 800-, or 801-
    {
        printf("%s\n", substr($line, 0, 80));  # first 80 characters (substr is 0-based)
    }
}
close(IN);
# that's all
exit;
-------------------------------------
Enjoy!
RE: Gather data from HUGE text file
What software do you use? Ask customer service whether they provide access to the output database, or maybe there's a command to create a formatted, database-oriented text file. Many structural analysis packages (GTSTRUDL, STAAD, SAP, STRAND7) have these options. With large amounts of data you will benefit from learning to work with databases; linking with MS Access, per MintJulep, is the easiest way to manipulate the data in Excel. A few years ago I developed an Excel-Access post-processor for GTSTRUDL that saved me hours and hours of work = $$.