Hiya-
Well, to get more to the point (and I'm sorry, it's been too long since I did any Fortran).
But, as pointed out earlier, most languages (and Fortran is no exception) have "buffered" reads.
You have mentioned that you have 1 gigabyte files, so be it. Not a problem; however, we usually don't read in an entire file of that size in one go.
We do, however, make use of buffers of many megabytes while reading in data. Also, we sometimes skip the data formatting that some programming languages offer. When convenient, we just read in the "raw" binary data.
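To make the buffer idea concrete, here is a rough sketch in C. The 4 MiB chunk size and the names read_in_chunks, chunk_fn and count_newlines are made up for illustration; tune the size for your own machine.

```c
#include <stdio.h>
#include <stdlib.h>

/* Assumed chunk size: 4 MiB. This is a guess, not a measured optimum. */
#define CHUNK_BYTES (4u * 1024u * 1024u)

/* Hypothetical callback type: called once for every chunk of raw bytes. */
typedef void (*chunk_fn)(const unsigned char *data, size_t len, void *ctx);

/* Reads the file in CHUNK_BYTES pieces using raw binary reads.
 * Returns the total number of bytes read, or -1 on error. */
long long read_in_chunks(const char *path, chunk_fn fn, void *ctx)
{
    FILE *fp = fopen(path, "rb");   /* "rb": binary mode, no text translation */
    if (fp == NULL)
        return -1;

    unsigned char *buf = malloc(CHUNK_BYTES);
    if (buf == NULL) {
        fclose(fp);
        return -1;
    }

    long long total = 0;
    size_t got;
    while ((got = fread(buf, 1, CHUNK_BYTES, fp)) > 0) {
        fn(buf, got, ctx);          /* hand the raw bytes to the caller */
        total += (long long)got;
    }

    int failed = ferror(fp);        /* distinguish a read error from end of file */
    free(buf);
    fclose(fp);
    return failed ? -1 : total;
}

/* Example callback: counts newline characters in the raw data. */
void count_newlines(const unsigned char *data, size_t len, void *ctx)
{
    size_t *count = ctx;
    for (size_t i = 0; i < len; i++)
        if (data[i] == '\n')
            (*count)++;
}
```

You would call it as read_in_chunks("data.bin", count_newlines, &lines), with lines being a size_t set to zero beforehand. The file name is just a placeholder.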
One of the "tricks" of the business is that you can have more than one program doing the task. Using shared memory, which some operating systems provide, we can have one program (or task) read data into "common" or shared memory while another program reads and interprets that data. This gives the system some asynchronous behavior. The reader program can schedule several reads at a time, each into its own buffer. The reader program then blocks while the secondary storage is being accessed. The "calculator" task can check a "semaphore" from the reader program to see whether any of the buffers has been filled with data. It can then work on a filled buffer while the reader task is still blocked. It processes the data, then puts the results where they need to go. That destination could be a "writer" task that works much like the reader task.
When the "reader" task is unblocked (allowed to run, because the operating system has finished the read request), it can start another "chunk", pointing the read at a different buffer.
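Here is a rough sketch of that reader/calculator arrangement in C. To keep it short I am assuming two threads in one program (POSIX threads and semaphores) rather than two separate programs with shared memory, and the "calculation" is just adding up the bytes. The buffer count, buffer size and all the names are made up for illustration.

```c
#include <pthread.h>
#include <semaphore.h>
#include <stdio.h>
#include <stdlib.h>

#define NBUF 4                  /* assumed number of buffers */
#define BUF_BYTES (1u << 20)    /* assumed size of each buffer: 1 MiB */

struct pipeline {
    FILE *fp;
    unsigned char *buf[NBUF];
    size_t len[NBUF];           /* valid bytes per buffer; 0 marks end of data */
    sem_t empty;                /* counts buffers the reader may fill */
    sem_t filled;               /* counts buffers ready for the calculator */
};

/* Reader task: blocks inside fread() while the calculator works on
 * buffers that were filled earlier. A read error is treated as end of data. */
static void *reader_task(void *arg)
{
    struct pipeline *p = arg;
    for (int i = 0;; i = (i + 1) % NBUF) {
        sem_wait(&p->empty);                    /* wait for a free buffer */
        size_t got = fread(p->buf[i], 1, BUF_BYTES, p->fp);
        p->len[i] = got;
        sem_post(&p->filled);                   /* signal: buffer i is ready */
        if (got == 0)
            break;
    }
    return NULL;
}

/* Calculator task (runs on the calling thread): sums every byte in the file.
 * Returns the sum, or -1 if the file or the thread could not be set up. */
long long sum_bytes_pipelined(const char *path)
{
    struct pipeline p;
    p.fp = fopen(path, "rb");
    if (p.fp == NULL)
        return -1;

    unsigned char *block = malloc((size_t)NBUF * BUF_BYTES);
    if (block == NULL) {
        fclose(p.fp);
        return -1;
    }
    for (int i = 0; i < NBUF; i++)
        p.buf[i] = block + (size_t)i * BUF_BYTES;

    sem_init(&p.empty, 0, NBUF);    /* all buffers start out free */
    sem_init(&p.filled, 0, 0);      /* none are filled yet */

    long long sum = -1;
    pthread_t tid;
    if (pthread_create(&tid, NULL, reader_task, &p) == 0) {
        sum = 0;
        for (int j = 0;; j = (j + 1) % NBUF) {
            sem_wait(&p.filled);                /* wait for a filled buffer */
            if (p.len[j] == 0)
                break;                          /* reader reached end of data */
            for (size_t k = 0; k < p.len[j]; k++)
                sum += p.buf[j][k];
            sem_post(&p.empty);                 /* give the buffer back */
        }
        pthread_join(tid, NULL);
    }

    sem_destroy(&p.empty);
    sem_destroy(&p.filled);
    free(block);
    fclose(p.fp);
    return sum;
}
```

The two semaphores do the bookkeeping: "empty" tells the reader how many buffers it may fill, and "filled" tells the calculator how many are waiting. On most systems you build this with the compiler's thread option (for example -pthread with gcc).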
Whew! That is quite a task that I've set out for you! Probably too much. Sorry. BUT you were asking about speed.
You were also asking about the programming languages that are used. O.K. here's the second part of the lecture (again sorry!)
C is a very popular programming language these days. Fortran, although useful and well suited to doing formula translations, is not used too much in the computer geek area any more. The "structured" programming styles allowed by C, C++, Pascal, etc. make for programs that are easier to follow and hence to debug.
There are other programming languages that are available, of course. But they have their disadvantages.
Assembly language- The fastest, but nowadays, not by much. This is very close to the "raw" binary bits the computer understands. It is NOT portable. It relies on "tricks" the programmer can play. Tricks are not good things. They make the code hard to understand, and so it is harder to maintain. It also requires a thorough understanding of the underlying architecture of the system, the instruction set, and the associated hardware. Definitely not for the faint of heart. It's good for geeks like me that do embedded processing systems, not for general purpose computers.
Java and some "older" BASICs- These are "interpreter" languages. They incur an overhead when they are not running "native" compiled code. They either run the raw source code through an interpreter, or they run an intermediate code through one (Java does the latter: it is compiled to bytecode that runs on a virtual machine). Either way, there is a program running a program. Lots of overhead, although current Java virtual machines compile frequently used code to native instructions as it runs, which narrows the gap. I'm not sure how Visual Basic does it; the older BASICs worked this way. Nor am I sure about Matlab and the like.
C++ and other "object oriented" programming languages- I can only answer for C++, not ".Net" and the others. In most cases, however, there is additional overhead imposed on the language's operation to "protect" and "objectify" things. A notable example in C++ is the constructor and destructor code that the language enforces. This can slow things down, and it is one reason C++ is not used as much as C in operating system design and drivers.
Other programming languages? Who knows how many there are. I have only hit the "high" points of the stuff out there today. I haven't touched the "legacy" languages like ALGOL or Forth, but they fill many books on the subject.
Whew! More crap. O.K. if you really want to get into the design side of things, C is a good choice. Now, we haven't even talked about the operating system of choice. That depends on how your software will be delivered. Is it going to a big 'nix box or is it designed to be run on Windows? Is a client-server setup an option? I.e. there is a lot of data that has to be crunched, but does it yield a small amount of data out? If so, have a big machine do the number crunching, and have it serve a small "GUI" (graphical user interface) client machine running Windows.
If you are running on just a single Windows machine, then your best bet is to use large data buffers with binary reads, then do the formatting (interpreting the ASCII data) while it is in memory.
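As a rough sketch of that last idea in C: pull the file into memory with one binary read, then interpret the ASCII numbers from the buffer. I am assuming the data is whitespace-separated numbers and that the file fits in memory; the names slurp and parse_doubles are made up for illustration.

```c
#include <stdio.h>
#include <stdlib.h>

/* Reads a whole file with a single binary read.
 * Returns a malloc'd, NUL-terminated buffer (caller frees), or NULL on error.
 * Assumes the file size fits in a long and in available memory. */
char *slurp(const char *path, size_t *len_out)
{
    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return NULL;

    long size = -1;
    if (fseek(fp, 0, SEEK_END) == 0)
        size = ftell(fp);
    if (size < 0) {
        fclose(fp);
        return NULL;
    }
    rewind(fp);

    char *buf = malloc((size_t)size + 1);
    if (buf == NULL) {
        fclose(fp);
        return NULL;
    }
    size_t got = fread(buf, 1, (size_t)size, fp);
    fclose(fp);

    buf[got] = '\0';            /* terminate so strtod() knows where to stop */
    *len_out = got;
    return buf;
}

/* Interprets whitespace-separated ASCII numbers already sitting in memory.
 * Stores at most max_out values in out and returns how many were stored. */
size_t parse_doubles(const char *text, double *out, size_t max_out)
{
    size_t n = 0;
    const char *p = text;
    while (n < max_out) {
        char *end;
        double v = strtod(p, &end);
        if (end == p)
            break;              /* nothing more that looks like a number */
        out[n++] = v;
        p = end;                /* continue right after the parsed number */
    }
    return n;
}
```

For a 1 gigabyte file you would more likely combine this with chunked reads, and then you have to take care of a number that gets split across two chunks. That part is left out here.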
I've sure gone into a long-winded answer, but well, you asked.
Hope that this helps and doesn't confuse the issue more.
Cheers,
Rich S.