Ignorance about true performance can be costly. Know when optimization matters, and then optimize when it does! Both instructions are 2 bytes long, and in both cases it is the 8-cycle instruction fetch time, not the 3- or 4-cycle Execution Unit execution time, that limits performance.
The key is the concept of handling data in restartable blocks; that is, reading a chunk of data, operating on the data until it runs out, suspending the operation while more data is read in, and then continuing as though nothing had happened. At any rate, Listing 1. Always consider the alternatives; a bit of clever thinking and program redesign can go a long way.
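The restartable-block idea can be sketched in C as follows. This is a minimal illustration, not the listing referred to in the text: the names checksum_block and checksum_stream, the 16-bit additive checksum, and the 4,096-byte block size are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define BLOCK_SIZE 4096  /* assumed block size; tune it for the target system */

/* Adds every byte in buf to a running 16-bit checksum and returns the
   updated value, so the caller can resume with the next block as though
   nothing had happened. */
unsigned int checksum_block(unsigned int sum, const unsigned char *buf, size_t len)
{
    size_t i;

    for (i = 0; i < len; i++)
        sum = (sum + buf[i]) & 0xFFFFu;
    return sum;
}

/* Checksums a whole stream one block at a time: read a chunk, process it
   until it runs out, then read the next chunk and continue. */
unsigned int checksum_stream(FILE *fp)
{
    unsigned char buf[BLOCK_SIZE];
    unsigned int sum = 0;
    size_t got;

    assert(fp != NULL);
    while ((got = fread(buf, 1, sizeof buf, fp)) > 0)
        sum = checksum_block(sum, buf, got);
    return sum;
}
```

Because the running sum is passed in and handed back, splitting the data at any block boundary gives the same result as processing it in one piece.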
I have said time and again that optimization is pointless until the design is settled. When that time comes, however, optimization can indeed make a significant difference. These are considerable improvements, well worth pursuing—once the design has been maxed out. Note that in Table 1. By the way, the execution times even of Listings 1. If a disk cache is enabled and the file to be checksummed is already in the cache, the assembly version is three times as fast as the C version.
In other words, the inherent nature of this application limits the performance improvement that can be obtained via assembly. All this is basically a way of saying: What have we learned? Consider the ratios on the vertical axis of Table 1. Optimization is no panacea. This chapter has presented a quick step-by-step overview of the design process.
Create code however you want, but never forget that design matters more than detailed optimization. Certainly if you use assembly at all, make absolutely sure you use it right. The potential of assembly code to run slowly is poorly understood by a lot of people, but that potential is great, especially in the hands of the ignorant. Some time ago, I was asked to work over a critical assembly subroutine in order to make it run as fast as possible.
The task of the subroutine was to construct a nibble out of four bits read from different bytes, rotating and combining the bits so that they ultimately ended up neatly aligned in bits 3-0 of a single byte.
I examined the subroutine line by line, saving a cycle here and a cycle there, until the code truly seemed to be optimized. When I was done, the key part of the code looked something like this:
Still, something bothered me, so I spent a bit of time going over the code again. Suddenly, the answer struck me—the code was rotating each bit into place separately, so that a multibit rotation was being performed every time through the loop, for a total of four separate time-consuming multibit rotations!
While the instructions themselves were individually optimized, the overall approach did not make the best possible use of the instructions. This moved the costly multibit rotation out of the loop so that it was performed just once, rather than four times. While the code may not look much different from the original, and in fact still contains exactly the same number of instructions, the performance of the entire subroutine improved by about 10 percent from just this one change.
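The difference between the two approaches can be sketched in C. This is not the author's assembly code; it is an illustrative reconstruction under stated assumptions: the four source bytes sit in an array, the wanted bit is at the same position in each, and the function names and the rol8 helper are hypothetical.

```c
#include <assert.h>

/* Rotates the low 8 bits of x left by n positions (n is taken modulo 8). */
static unsigned int rol8(unsigned int x, unsigned int n)
{
    n &= 7u;
    x &= 0xFFu;
    return ((x << n) | (x >> (8u - n))) & 0xFFu;
}

/* First approach: every isolated bit is rotated into its final position
   separately, so a multibit rotation runs on each of the four passes. */
unsigned int nibble_rotate_each(const unsigned char src[4], unsigned int bit)
{
    unsigned int mask, result = 0, i;

    assert(bit < 8);
    mask = 1u << bit;
    for (i = 0; i < 4; i++)
        result |= rol8(src[i] & mask, 11u - i - bit);  /* to bit (3 - i) */
    return result;
}

/* Second approach: the bits are gathered with single-bit rotations of the
   accumulator, and one multibit rotation after the loop aligns the nibble. */
unsigned int nibble_rotate_once(const unsigned char src[4], unsigned int bit)
{
    unsigned int mask, acc = 0, i;

    assert(bit < 8);
    mask = 1u << bit;
    for (i = 0; i < 4; i++)
        acc = rol8(acc, 1) | (src[i] & mask);
    return rol8(acc, 8u - bit);  /* the only multibit rotation */
}
```

Both functions return the same nibble; only the number of multibit rotations differs. Whether the second form is actually faster on a given processor is something to measure rather than assume.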
The point is this: To write truly superior assembly programs, you need to know what the various instructions do and which instructions execute fastest…and more. You must also learn to look at your programming problems from a variety of perspectives so that you can put those fast instructions to work in the most effective ways.
Is it really so hard as all that to write good assembly code for the PC? Thanks to the decidedly quirky nature of the x86 family CPUs, assembly language differs fundamentally from other languages, and is undeniably harder to work with. On the other hand, the potential of assembly code is much greater than that of other languages, as well.
To understand why this is so, consider how a program gets written. A programmer examines the requirements of an application, designs a solution at some level of abstraction, and then makes that design come alive in a code implementation. If not handled properly, the transformation that takes place between conception and implementation can reduce performance tremendously; for example, a programmer who implements a routine to search a list of 100,000 sorted items with a linear rather than binary search will end up with a disappointingly slow program.
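The cost of that design choice is easy to see in a small C sketch that counts how many items each search examines. The function names and the probe counters are assumptions added for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Linear search: examines items one by one, up to n probes. */
long linear_search(const int *items, size_t n, int key, unsigned long *probes)
{
    size_t i;

    assert(items != NULL && probes != NULL);
    *probes = 0;
    for (i = 0; i < n; i++) {
        (*probes)++;
        if (items[i] == key)
            return (long)i;
    }
    return -1;
}

/* Binary search over sorted items: halves the window [lo, hi) on each
   probe, so roughly log2(n) probes are needed. */
long binary_search(const int *items, size_t n, int key, unsigned long *probes)
{
    size_t lo = 0, hi = n;

    assert(items != NULL && probes != NULL);
    *probes = 0;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        (*probes)++;
        if (items[mid] == key)
            return (long)mid;
        if (items[mid] < key)
            lo = mid + 1;
        else
            hi = mid;
    }
    return -1;
}
```

For a sorted list of 100,000 items, the linear search may examine all 100,000 entries, while the binary search needs at most 17 probes. No amount of instruction-level tuning of the linear loop closes a gap of that size.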
The process of turning a design into executable code by way of a high-level language involves two transformations: one performed by the programmer to generate source code, and another performed by the compiler to generate machine language code. Consequently, the machine language code generated by compilers is usually less than optimal given the requirements of the original design. High-level languages provide artificial environments that lend themselves relatively well to human programming skills, in order to ease the transition from design to implementation. The price for this ease of implementation is a considerable loss of efficiency in transforming source code into machine language.
This is particularly true given that the x86 family in real and 16-bit protected mode, with its specialized memory-addressing instructions and segmented memory architecture, does not lend itself particularly well to compiler design.
Even the 32-bit mode of the 386 and its successors, with its more powerful addressing modes, offers fewer registers than compilers would like. Assembly, on the other hand, is simply a human-oriented representation of machine language.
As a result, assembly provides a difficult programming environment (the bare hardware and systems software of the computer), but properly constructed assembly programs suffer no transformation loss, as shown in Figure 2. Assemblers perform no transformation from source code to machine language; instead, they merely map assembler instructions to machine language instructions on a one-to-one basis. The key, of course, is the programmer, since in assembly the programmer must essentially perform the transformation from the application specification to machine language entirely on his or her own.
The assembler merely handles the direct translation from assembly to machine language. The first part of assembly language optimization, then, is self-reliance. An assembler is nothing more than a tool to let you design machine-language programs without having to think in hexadecimal codes. So assembly language programmers, unlike all other programmers, must take full responsibility for the quality of their code. Since assemblers provide little help at any level higher than the generation of machine language, the assembly programmer must be capable both of coding any programming construct directly and of controlling the PC at the lowest practical level: the operating system, the BIOS, even the hardware where necessary.
High-level languages handle most of this transparently to the programmer, but in assembly everything is fair (and necessary) game, which brings us to another aspect of assembly optimization: knowledge. In the PC world, you can never have enough knowledge, and every item you add to your store will make your programs better.
Thorough familiarity with both the operating system APIs and BIOS interfaces is important; since those interfaces are well-documented and reasonably straightforward, my advice is to get a good book or two and bring yourself up to speed.
Similarly, familiarity with the PC hardware is required. While that topic covers a lot of ground—display adapters, keyboards, serial ports, printer ports, timer and DMA channels, memory organization, and more—most of the hardware is well-documented, and articles about programming major hardware components appear frequently in the literature, so this sort of knowledge can be acquired readily enough.
The single most critical aspect of the hardware, and the one about which it is hardest to learn, is the CPU. The x86 family CPUs have a complex, irregular instruction set, and, unlike most processors, they are neither straightforward nor well-documented with regard to true code performance.
In fact, since most articles and books are written for inexperienced assembly programmers, there is very little information of any sort available about how to generate high-quality assembly code for the x86 family CPUs. As a result, knowledge about programming them effectively is by far the hardest knowledge to gather. A good portion of this book is devoted to seeking out such knowledge.
Is the never-ending collection of information all there is to assembly optimization, then? Knowledge is simply a necessary base on which to build. Basically, there are only two possible objectives to high-performance assembly programming: Given the requirements of the application, keep to a minimum either the number of processor cycles the program takes to run, or the number of bytes in the program, or some combination of both.
You will notice that my short list of objectives for high-performance assembly programming does not include traditional objectives such as easy maintenance and speed of development. Those are indeed important considerations—to persons and companies that develop and distribute software. People who actually buy software, on the other hand, care only about how well that software performs, not how it was developed nor how it is maintained.
These days, developers spend so much time focusing on such admittedly important issues as code maintainability and reusability, source code control, choice of development environment, and the like that they often forget rule #1: From the user's perspective, performance is fundamental. Knowledge of the sort described earlier is absolutely essential to fulfilling either of the objectives of assembly programming. Knowledge makes that possible, but your programming instincts make it happen.
And it is that intuitive, on-the-fly integration of a program specification and a sea of facts about the PC that is the heart of Zen-class assembly optimization. As with Zen of any sort, mastering the Zen of assembly language is more a matter of learning than of being taught. You will have to find your own path of learning, although I will start you on your way with this book.
The subtle facts and examples I provide will help you gain the necessary experience, but you must continue the journey on your own.
Each program you create will expand your programming horizons and increase the options available to you in meeting the next challenge.
The ability of your mind to find surprising new and better ways to craft superior code from a concept—the flexible mind, if you will—is the linchpin of good assembler code, and you will develop this skill only by doing.
Never underestimate the importance of the flexible mind. Good assembly code is better than good compiled code. High-level languages are the best choice for the majority of programmers, and for the bulk of the code of most applications. When the best code—the fastest or smallest code possible—is needed, though, assembly is the only way to go.
Simple logic dictates that no compiler can know as much about what a piece of code needs to do or adapt as well to those needs as the person who wrote the code. Given that superior information and adaptability, an assembly language programmer can generate better code than a compiler, all the more so given that compilers are constrained by the limitations of high-level languages and by the process of transformation from high-level to machine language.
Consequently, carefully optimized assembly is not just the language of choice but the only choice for the 1 percent to 10 percent of code—usually consisting of small, well-defined subroutines—that determines overall program performance, and it is the only choice for code that must be as compact as possible, as well.
In the run-of-the-mill, non-time-critical portions of your programs, it makes no sense to waste time and effort on writing optimized assembly code—concentrate your efforts on loops and the like instead; but in those areas where you need the finest code quality, accept no substitutes. Note that I said that an assembly programmer can generate better code than a compiler, not will generate better code.
While it is true that good assembly code is better than good compiled code, it is also true that bad assembly code is often much worse than bad compiled code; since the assembly programmer has so much control over the program, he or she has virtually unlimited opportunities to waste cycles and bytes. The sword cuts both ways, and good assembly code requires more, not less, forethought and planning than good code written in a high-level language.
The gist of all this is simply that good assembly programming is done in the context of a solid overall framework unique to each program, and the flexible mind is the key to creating that framework and holding it together. To summarize, the skill of assembly language optimization is a combination of knowledge, perspective, and a way of thought that makes possible the genesis of absolutely the fastest or the smallest code.
With that in mind, what should the first step be? Development of the flexible mind is an obvious step. Still, the flexible mind is no better than the knowledge at its disposal.
The first step in the journey toward mastering optimization at that exalted level, then, would seem to be learning how to learn. A case in point: The author had, however, chosen a small, well-defined assembly language routine to refine, consisting of about 30 instructions that did nothing more than expand 8 bits to 16 bits by duplicating each bit. In short, he had used all the information at his disposal to improve his code, and had, as a result, saved cycles by the bushel.
There was, in fact, only one slight problem with the optimized version of the routine…. As diligent as the author had been, he had nonetheless committed a cardinal sin of x86 assembly language programming: He had assumed that the information available to him was both correct and complete. While the execution times provided by Intel for its processors are indeed correct, they are incomplete; the other—and often more important—part of code performance is instruction fetch time, a topic to which I will return in later chapters.
There you have an important tenet of assembly language optimization: I cannot emphasize this strongly enough—when you care about performance, do your best to improve the code and then measure the improvement.
Ignorance about true performance can be costly. When I wrote video games for a living, I spent days at a time trying to wring more performance from my graphics drivers. I rewrote whole sections of code just to save a few cycles, juggled registers, and relied heavily on blurry-fast register-to-register shifts and adds.
As I was writing my last game, I discovered that the program ran perceptibly faster if I used look-up tables instead of shifts and adds for my calculations. In truth, instruction fetching was rearing its head again, as it often does, and the fetching of the shifts and adds was taking as much as four times the nominal execution time of those instructions. Ignorance can also be responsible for considerable wasted effort. The letter-writers counted every cycle in their timing loops, just as the author in the story that started this chapter had.
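The two styles of calculation can be compared in a short C sketch. Multiplication by 10 is an assumed example, not the calculation from the game, and the function names are hypothetical.

```c
#include <assert.h>

static unsigned int times10_table[256];

/* Fills the look-up table once, before any timing-critical code runs. */
void init_times10_table(void)
{
    unsigned int i;

    for (i = 0; i < 256; i++)
        times10_table[i] = i * 10u;
}

/* Shift-and-add version: x * 10 computed as (x * 8) + (x * 2). */
unsigned int times10_shift_add(unsigned char x)
{
    return ((unsigned int)x << 3) + ((unsigned int)x << 1);
}

/* Table version: a single indexed memory read replaces the arithmetic. */
unsigned int times10_lookup(unsigned char x)
{
    assert(times10_table[1] == 10u);  /* catches a forgotten init call */
    return times10_table[x];
}
```

Both functions return the same values. Which one runs faster depends on the processor and its memory system, so the only safe conclusion is the one the text draws: time both versions on the target machine.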
Like that author, the letter-writers had failed to take the prefetch queue into account. In fact, they had neglected the effects of video wait states as well, so the code they discussed was actually much slower than their estimates.
The proper test would, of course, have been to run the code to see if snow resulted, since the only true measure of code performance is observing it in action. Clearly, one key to mastering Zen-class optimization is a tool with which to measure code performance. The PC's 8253 timer chip can be started at the beginning of a block of code of interest and stopped at the end of that code, with the resulting count indicating how long the code took to execute with an accuracy of about 1 microsecond.
To be precise, the 8253 counts once every 838.1 nanoseconds. (A nanosecond is one billionth of a second, and is abbreviated ns.) On the other hand, it is by no means essential that you understand exactly how the Zen timer works. Interesting, yes; essential, no. ZTimerOn is called at the start of a segment of code to be timed. ZTimerOn saves the context of the calling code, disables interrupts, sets timer 0 of the 8253 to mode 2 (divide-by-N mode), sets the initial timer count to 0, restores the context of the calling code, and returns.
Two aspects of ZTimerOn are worth discussing further. One point of interest is that ZTimerOn disables interrupts. Were interrupts not disabled by ZTimerOn, keyboard, mouse, timer, and other interrupts could occur during the timing interval, and the time required to service those interrupts would incorrectly and erratically appear to be part of the execution time of the code being measured.
As a result, code timed with the Zen timer should not expect any hardware interrupts to occur during the interval between any call to ZTimerOn and the corresponding call to ZTimerOff, and should not enable interrupts during that time. A second interesting point about ZTimerOn is that it may introduce some small inaccuracy into the system clock time whenever it is called. The 8253 actually contains three timers, as shown in Figure 3.
Each of the three timers counts down in a programmable way, generating a signal on its output pin when it counts down to 0. Timer 2 drives the speaker, although it can be used for other timing purposes when the speaker is not in use. As shown in Figure 3. On the other hand, the output of timer 2 is connected to nothing other than the speaker. Timer 1 is dedicated to providing dynamic RAM refresh, and should not be tampered with lest system crashes result.
Finally, timer 0 is used to drive the system clock. (A millisecond is one-thousandth of a second, and is abbreviated ms.) This line is connected to the hardware interrupt 0 (IRQ0) line on the system board, so every 54.925 ms timer 0 causes hardware interrupt 0 to occur. Each timer channel of the 8253 can operate in any of six modes.
Timer 0 normally operates in mode 3: square wave mode. In square wave mode, the initial count is counted down two at a time; when the count reaches zero, the output state is changed. The initial count is again counted down two at a time, and the output state is toggled back when the count reaches zero. The result is a square wave that changes state more slowly than the input clock by a factor of the initial count.
In its normal mode of operation, timer 0 generates an output pulse that is low for about 27.5 ms and high for about 27.5 ms. Square wave mode is not very useful for precision timing because it counts down by two twice per timer interrupt, thereby rendering exact timings impossible. Fortunately, the 8253 offers another timer mode, mode 2 (divide-by-N mode), which is both a good substitute for square wave mode and a perfect mode for precision timing. Divide-by-N mode counts down by one from the initial count.
When the count reaches zero, the timer turns over and starts counting down again without stopping, and a pulse is generated for a single clock period.
As a result, timer 0 continues to generate timer interrupts in divide-by-N mode, and the system clock continues to maintain good time. Why not use timer 2 instead of timer 0 for precision timing? We need the interrupt generated by the output of timer 0 to tell us when the count has overflowed, and we will see shortly that the timer interrupt also makes it possible to time much longer periods than the Zen timer shown in Listing 3.
In fact, the Zen timer shown in Listing 3. Fifty-four ms may not seem like a very long time, but even a CPU as slow as the 8088 can perform more than 1,000 divides in 54 ms, and division is the single instruction that the 8088 performs most slowly.
If a measured period turns out to be longer than 54 ms (that is, if timer 0 has counted down and turned over), the Zen timer will display a message to that effect. A long-period Zen timer for use in such cases will be presented later in this chapter. The Zen timer determines whether timer 0 has turned over by checking to see whether an IRQ0 interrupt is pending. Remember, interrupts are off while the Zen timer runs, so the timer interrupt cannot be recognized until the Zen timer stops and enables interrupts.
If an IRQ0 interrupt is pending, then timer 0 has turned over and generated a timer interrupt. Recall that ZTimerOn initially sets timer 0 to 0, in order to allow for the longest possible period—about 54 ms—before timer 0 reaches 0 and generates the timer interrupt.
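Since timer 0 is initially set to 0 by the Zen timer, and since the system clock ticks only when timer 0 counts off 54.925 ms and reaches 0 again, some inaccuracy is incurred each time the Zen timer is started. In addition, a timer interrupt is generated when timer 0 is switched from mode 3 to mode 2, advancing the system clock by up to 54.925 ms. Finally, up to 54.925 ms can again be lost when ZTimerOff is called. The net result is that the system clock will run up to 110 ms (about a ninth of a second) slow each time the Zen timer is used.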
Potentially far greater inaccuracy can be incurred by timing code that takes longer than about 110 ms to execute. Recall that all interrupts, including the timer interrupt, are disabled while timing code with the Zen timer. The interrupt controller is capable of remembering at most one pending timer interrupt, so all timer interrupts after the first one during any given Zen timing interval are ignored.
Consequently, if a timing interval exceeds that limit, the system clock loses time, since the ignored timer interrupts are never counted. Systems that have battery-backed clocks (AT-style machines; that is, virtually all machines in common use) automatically reset the correct time whenever the computer is booted, and systems without battery-backed clocks prompt for the correct date and time when booted.
Also, repeated use of the Zen timer usually makes the system clock slow by at most a total of a few seconds, unless code that takes much longer than 54 ms to run is timed (in which case the Zen timer will notify you that the code is too long to time). ZTimerOff saves the context of the calling program, latches and reads the timer 0 count, converts that count from the countdown value that the timer maintains to the number of counts elapsed since ZTimerOn was called, and stores the result.
Immediately after latching the timer 0 count—and before enabling interrupts— ZTimerOff checks the interrupt controller to see if there is a pending timer interrupt, setting a flag to mark that the timer overflowed if there is indeed a pending timer interrupt.
After that, ZTimerOff executes just the overhead code of ZTimerOn and ZTimerOff 16 times, and averages and saves the results in order to determine how many of the counts in the timing result just obtained were incurred by the overhead of the Zen timer rather than by the code being timed.
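The overhead calculation can be sketched in C. This only illustrates averaging 16 reference measurements; the function name, the array interface, and the round-to-nearest step are assumptions, not details taken from the Zen timer source.

```c
#include <assert.h>
#include <stddef.h>

#define REFERENCE_RUNS 16  /* number of overhead-only timing runs */

/* Returns the average of n overhead measurements, in timer counts.
   Rounding to the nearest count is an assumption made for this sketch. */
unsigned int average_overhead(const unsigned int counts[], size_t n)
{
    unsigned long total = 0;
    size_t i;

    assert(counts != NULL && n > 0);
    for (i = 0; i < n; i++)
        total += counts[i];
    return (unsigned int)((total + n / 2) / n);
}
```

Averaging several runs smooths out the small run-to-run variation in the timer's own overhead before that overhead is subtracted from the measured count.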
Finally, ZTimerOff restores the context of the calling program, including the state of the interrupt flag that was in effect when ZTimerOn was called to start timing, and returns.
One interesting aspect of ZTimerOff is the manner in which timer 0 is stopped in order to read the timer count. We simply tell the 8253 to latch the current count, and the 8253 does so without breaking stride. ZTimerReport first checks to see whether the timer overflowed (counted down to 0 and turned over) before ZTimerOff was called; if overflow did occur, ZTimerReport prints a message to that effect and returns. Otherwise, ZTimerReport subtracts the reference count (representing the overhead of the Zen timer) from the count measured between the calls to ZTimerOn and ZTimerOff, converts the result from timer counts to microseconds, and prints the resulting time in microseconds to the standard output.
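The arithmetic behind that report can be sketched in C. The sketch assumes 838.1 ns per timer count, which follows from the PC timer's 1.19318 MHz input clock; the function names and the rounding are hypothetical and are not taken from the Zen timer source.

```c
#include <assert.h>

#define TENTHS_OF_NS_PER_COUNT 8381UL  /* 838.1 ns per count, scaled by 10 */

/* Converts a latched countdown value into counts elapsed since the timer
   was started at 0. The 16-bit counter wraps, so elapsed = (0 - latched)
   modulo 65536. */
unsigned int elapsed_counts(unsigned int latched)
{
    return (unsigned int)((0x10000UL - (latched & 0xFFFFUL)) & 0xFFFFUL);
}

/* Subtracts the timer's own overhead (the reference count) and converts
   the remaining counts to microseconds, rounded to the nearest one. */
unsigned long counts_to_microseconds(unsigned int measured, unsigned int reference)
{
    unsigned long net;

    assert(measured >= reference);
    net = (unsigned long)measured - reference;
    return (net * TENTHS_OF_NS_PER_COUNT + 5000UL) / 10000UL;
}
```

With these constants, a full sweep of 65,535 counts converts to 54,925 microseconds, which is where the roughly 54 ms limit of the precision timer comes from.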
There are many ways to deal with this. A second approach is modification of ZTimerReport to place the result at some safe location in memory, such as an unused portion of the BIOS data area. A third approach is alteration of ZTimerReport to print the result over a serial port to a terminal or to another PC acting as a terminal.
Similarly, many debuggers can be run from a remote terminal via a serial link. A final approach is to modify ZTimerReport to print the result to the auxiliary output via DOS function 4, and to then write and load a special device driver named AUX, to which DOS function 4 output would automatically be directed. This device driver could send the result anywhere you might desire. The result might go to the secondary display adapter, over a serial port, or to the printer, or could simply be stored in a buffer within the driver, to be dumped at a later time.
Credit for this final approach goes to Michael Geary, and thanks go to David Miller for passing the idea on to me. The Zen timer subroutines are designed to be near-called from assembly language code running in the public segment Code.
The Zen timer subroutines can, however, be called from any assembly or high-level language code that generates OBJ files that are compatible with the Microsoft linker, simply by modifying the segment that the timer code runs in to match the segment used by the code being timed, or by changing the Zen timer routines to far procedures and making far calls to the Zen timer code from the code being timed, as discussed at the end of this chapter.
All three subroutines preserve all registers and all flags except the interrupt flag, so calls to these routines are transparent to the calling code. If you do change the Zen timer routines to far procedures in order to call them from code running in another segment, be sure to make all the Zen timer routines far, including ReferenceZTimerOn and ReferenceZTimerOff.
Please be aware that the inaccuracy that the Zen timer can introduce into the system clock time does not affect the accuracy of the performance measurements reported by the Zen timer itself. On the other hand, there is certainly no guarantee that code performance as measured by the Zen timer will be the same on compatible computers as on genuine IBM machines, or that either absolute or relative code performance will be similar even on different IBM models; in fact, quite the opposite is true.
The differences were minor, mind you, but my experience illustrates the risk of assuming that a specific make of computer will perform in a certain way without actually checking.
Not that this variation between models makes the Zen timer one whit less useful; quite the contrary. The Zen timer is an excellent tool for evaluating code performance over the entire spectrum of PC-compatible computers. This listing measures the time required to execute 1,000 loads of AL from the memory variable MemVar. Note that Listing 3. This approach lets us avoid reproducing Listing 3. Note that only after the initial jump is performed in Listing 3.
ASM contains Listing 3. The same is true of Listing 3. Assuming that Listing 3. ASM and Listing 3. BAT, the code in Listing 3. When the above command is executed on an original 4.
While the exact number is 3. Exactly why that is so is just what this book is all about. In order to perform any of the timing tests in this book, enter Listing 3.
ASM, enter Listing 3. ASM, and enter Listing 3. Then simply enter the listing you wish to run into the file filename and enter the command: Code fragments you write yourself can be timed in just the same way. If you wish to time code directly in place in your programs, rather than in the test-bed program of Listing 3. Occasionally, however, we will need to time longer intervals. The long-period Zen timer (so named by contrast with the precision Zen timer just presented) shown in Listing 3.
The key difference between the long-period Zen timer and the precision Zen timer is that the long-period timer leaves interrupts enabled during the timing period. As a result, timer interrupts are recognized by the PC, allowing the BIOS to maintain an accurate system clock time over the timing period. Theoretically, this enables measurement of arbitrarily long periods.
Practically speaking, however, there is no need for a timer that can measure more than a few minutes, since the DOS time of day and date functions (or, indeed, the DATE and TIME commands in a batch file) serve perfectly well for longer intervals. If a period longer than an hour is timed, the long-period Zen timer prints a message to the effect that it is unable to time an interval of that length. For implementation reasons, the long-period Zen timer is also incapable of timing code that starts before midnight and ends after midnight; if that eventuality occurs, the long-period Zen timer reports that it was unable to time the code because midnight was crossed.
You should not use the long-period Zen timer to time code that requires interrupts to be disabled for more than 54 ms at a stretch during the timing interval, since when interrupts are disabled the long-period Zen timer is subject to the same 54 ms maximum measurement time as the precision Zen timer.
While permitting the timer interrupt to occur allows long intervals to be timed, that same interrupt makes the long-period Zen timer less accurate than the precision Zen timer, since the time the BIOS spends handling timer interrupts during the timing interval is included in the time measured by the long-period timer.
Likewise, any other interrupts that occur during the timing interval, most notably keyboard and mouse interrupts, will increase the measured time. The long-period Zen timer does not, however, have the same potential for introducing major inaccuracy into the system clock time during a single timing run since it leaves interrupts enabled and therefore allows the system clock to update normally.
The problem is this: In order to measure times longer than 54 ms, we must maintain not one but two timing components, the timer 0 count and the BIOS time-of-day count.
The time-of-day count measures the passage of We need to read the two time components simultaneously in order to get a clean reading. Otherwise, we may read the timer count just before it turns over and generates an interrupt, then read the BIOS time-of-day count just after the interrupt has occurred and caused the time-of-day count to turn over, with a resulting 54 ms measurement inaccuracy. The opposite sequence—reading the time-of-day count and then the timer count—can result in a 54 ms inaccuracy in the other direction.
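The size of that error is easy to see in a small C sketch that combines the two components into one count. The function name and the assumption that one time-of-day tick equals 65,536 timer counts are illustrative; this is not code from the long-period Zen timer.

```c
#include <assert.h>

#define COUNTS_PER_TICK 65536UL  /* timer counts per time-of-day tick */

/* Combines the BIOS time-of-day tick count with the counts elapsed on
   timer 0 within the current tick into a single total count. */
unsigned long total_counts(unsigned long ticks, unsigned long timer_elapsed)
{
    assert(timer_elapsed < COUNTS_PER_TICK);
    return ticks * COUNTS_PER_TICK + timer_elapsed;
}
```

Suppose the true moment is 5 ticks plus 65,535 counts. If the timer count is read just before it turns over and the tick count just after, the pair read is 6 ticks plus 65,535 counts, which is a full 65,536 counts (about 54 ms) too large. Both values are individually valid, so the error cannot be detected after the fact.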
The only way to avoid this problem is to stop timer 0, read both the timer and time-of-day counts while the timer is stopped, and then restart the timer. The latched read feature we used in Listing 3. What should we do? As it turns out, an undocumented feature of the 8253 makes it possible to stop the timer dead in its tracks. Setting the timer to a new mode and waiting for an initial count to be loaded causes the timer to stop until the count is loaded.
Surprisingly, the timer count remains readable and correct while the timer is waiting for the initial load. In my experience, this approach works beautifully with fully compatible chips. The PS2 equate selects between the two modes of operation. If PS2 is 1 as it is in Listing 3. The latch-and-read method will work on all PC-compatible computers, but may occasionally produce results that are incorrect by 54 ms.
Rebooting should clear up any timer-related problems of the sort described above. This gives us another reason to reboot at the end of each code-timing session. You should immediately reboot and set the PS2 equate to 1 if you get erratic or obviously incorrect results with the long-period Zen timer when PS2 is set to 0.
If you want to set PS2 to 0, it would be a good idea to time a few of the listings in this book with PS2 set first to 1 and then to 0, to make sure that the results match. If you do leave the PS2 equate at 1 in Listing 3. The long-period Zen timer has exactly the same calling interface as the precision Zen timer, and can be used in place of the precision Zen timer simply by linking it to the code to be timed in place of linking the precision timer code.
Whenever the precision Zen timer informs you that the code being timed takes too long for the precision timer to handle, all you have to do is link in the long-period timer instead. While this program is similar to Listing 3. Since interrupts must be left on in order to time periods longer than 54 ms, the interrupts generated by keystrokes (including the upstroke of the Enter key press that starts the program), or any other interrupts for that matter, could incorrectly inflate the time recorded by the long-period Zen timer.
In light of this, resist the temptation to type ahead, move the mouse, or the like while the long-period Zen timer is timing.
As with the precision Zen timer, the program in Listing 3. This is just slightly longer than the time per load of AL measured by the precision Zen timer, as we would expect given that interrupts are left enabled by the long-period Zen timer. Note that the command can take as much as 10 minutes to finish on a slow PC if you are using MASM, with most of that time spent assembling Listing 3.
The Zen timer can be used to measure code performance when programming in C, but not right out of the box.
As presented earlier, the timer is designed to be called from assembly language; some relatively minor modifications are required before the ZTimerOn (start timer), ZTimerOff (stop timer), and ZTimerReport (display timing results) routines can be called from C. There are two separate cases to be dealt with here: the small code model and the large code model. Altering the Zen timer for linking to a small code model C program involves the following steps: These changes convert the code to use C-style external label names and the small model C code segment.
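A minimal sketch of what the calling pattern might look like from the C side. The three routine names come from the text, but the bodies below are portable stand-ins built on the standard `clock()` function, not the Zen timer itself; `zt_elapsed_ms` and `sum_to` are hypothetical helpers added only so the sketch is self-contained. When linking against the modified assembly timer, the stand-in bodies would presumably be replaced by `extern` declarations.
```c
#include <assert.h>
#include <stdio.h>
#include <time.h>

/* Stand-in state; the real timer keeps its own state in assembly. */
static clock_t zt_start, zt_stop;
static int zt_running = 0;

/* Start timing (stand-in for the assembly routine of the same name). */
void ZTimerOn(void)
{
    zt_running = 1;
    zt_start = clock();
}

/* Stop timing; take the reading first so the check adds no measured time. */
void ZTimerOff(void)
{
    zt_stop = clock();
    assert(zt_running);   /* ZTimerOff must follow ZTimerOn */
    zt_running = 0;
}

/* Hypothetical helper: elapsed processor time in milliseconds. */
double zt_elapsed_ms(void)
{
    return (double)(zt_stop - zt_start) * 1000.0 / CLOCKS_PER_SEC;
}

/* Display the timing result. */
void ZTimerReport(void)
{
    printf("Timed interval: %.3f ms\n", zt_elapsed_ms());
}

/* Hypothetical code under test: sum the integers 1..n. */
long sum_to(long n)
{
    long total = 0;
    long i;
    for (i = 1; i <= n; i++)
        total += i;
    return total;
}
```
The code to be measured is bracketed by the first two calls, and the report is requested afterward: `ZTimerOn(); sum_to(1000); ZTimerOff(); ZTimerReport();`. Note that `clock()` is far coarser than the hardware timer the chapter describes, so this shows the interface shape rather than comparable precision.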
The long-period timer—Listing 3. Again, the line numbers are specific to the precision timer, but the long-period timer is very similar. One important safety tip when modifying the Zen timer for use with large code model C code: Watch out for optimizing assemblers!
This is normally a great optimization, being both smaller and faster than a far call. By the way, there is, to the best of my knowledge, no such problem with MASM up through version 5. In my mind, the whole business of optimizing assemblers is a mixed blessing. The Zen timer is not perfect. Another problem is that the timing code itself interferes with the state of the prefetch queue and processor cache at the start of the code being timed, because the timing code is not necessarily fetched and does not necessarily access memory in exactly the same time sequence as the code immediately preceding the code under measurement normally does.
Similarly, the state of the prefetch queue at the end of the code being timed affects how long the code that stops the timer takes to execute. Consequently, the Zen timer tends to be more accurate for longer code sequences, since the relative magnitude of the inaccuracy introduced by the Zen timer becomes less over longer periods.
This chapter, adapted from my earlier book, Zen of Assembly Language (located on the companion CD-ROM), goes right to the heart of my philosophy of optimization: Understand where the time really goes when your code runs.
That may sound ridiculously simple, but, as this chapter makes clear, it turns out to be a challenging task indeed, one that at times verges on black magic. This chapter is a long-time favorite of mine because it was the first (and to a large extent the only) work that I know of that discussed this material, thereby introducing a generation of PC programmers to pedal-to-the-metal optimization. This chapter focuses almost entirely on the first popular x86-family processor, the 8088. Nonetheless, the overall theme of this chapter is every bit as valid today: understanding the dimly-seen and poorly-documented code gremlins called cycle-eaters that lurk in your system is essential to performance programming.
Also, later chapters often refer back to the basic cycle-eaters described in this chapter, so this chapter is the foundation for the discussions of x86-family optimization to come.
Programming has many levels, ranging from the familiar high-level languages, DOS calls, and the like down to the esoteric things that lie on the shadowy edge of hardware-land. Why start at the lowest level? Simply because cycle-eaters affect the performance of all assembler code, and yet are almost unknown to most programmers.
A full understanding of code optimization requires an understanding of cycle-eaters and their implications. Nearly all literature on assembly programming discusses only the programming interface:
Here are pictures of a fuel line hose at the back of one of my IO engines that had been lying in there for 9 years since the overhauled engine was installed. My IA discovered this near disaster when making up the new hoses!
Note how the hose looks very nice in the top picture and how little actual "bite" the connector fitting had on the hose! Here is a harrowing tale from Beech Talker Gary S. I lost my right engine on lift off climb out, low slow and trying to clear trees and houses, put it down in field no place to go, point is, NTSB, FAA and attorney all said, "Sounds like fuel line collapse, we see this quite often, but never got to talk to a survivor", I said REALLY, You know this but don't say anything?
I bought my new Baron 3 months ago and required an annual before closing. I asked the mechanic did the fuel lines need replacing around the engines and he replied, they will probably be ok for another year or two!
I said, are you kidding? They are 9 years old and the POH says to replace them every 5 years. He said nobody does that. (This was a different mechanic from before.) The inner liner separates and suction causes the line to collapse and starve your engine immediately, with no notice, on takeoff, when fuel demand is highest.
My goal is if I just help save one pilot from this type accident it's worth it to me. Here is a cockpit fuel line that was buried behind the panel when Dave B. The aluminum was riding against the steel wire in the defroster duct hose.
The pilot's wife was complaining about the fuel smell! Pictured below is a Bonanza fuel selector with telltale fuel leakage signs of o-ring death. This is why you want your IA to uncover it during the annual and give it a good inspection. How many years do you think you should get out of those WWII vintage nitrile rubber technology o-rings anyway? Fluorosilicone o-rings have been reported to give exceptional service life in fuel cap and brake caliper applications.
DUH: bad o-rings are a way for air to get into your fuel system and make for some troubling engine running. The engine was from a top-rated rebuilder. Using a clear section of hose, we attached his vac pump to the fuel supply line where it hooks to the engine driven fuel pump. When we applied vacuum, fuel was pulled thru the system. There were no fuel leaks aft of the firewall, so we figured it had to be the fuel selector valve.
The problem ended when we replaced the valve with a new unit. I ended up buying a new valve, because after calling. As for which parts inside the valve were bad, the major suspects were the small O-ring. The indents are located on the top of the top plate. The small O-ring can be replaced without removing the selector valve, but not the large one. Pictured below is a Baron 58 carry through spar section.
The picture on the left with the stop drill is the front. Picture on the right is the back of the spar on the lower left side. Some pictures from when I had an AN3 bolt come loose in the induction airbox behind the air filter and ahead of the turbo charger.
Trashed the turbo and launched tiny bits of metal downstream through the intercooler and into the engine. The AN3 bolt and a washer were found loose in the airbox, having been ground down to almost nothing. The metal lock nut was found sitting below the airbox. The intercooler trapped a lot of the metal, but enough passed through into the engine to warrant a teardown. Beech Lister, Jeff W. Here is what happens when a loose 10 nut wanders unsupervised in the intake air box.
At least that's what we assume it was. It's the right size and mass, although we never did find any missing hardware anywhere on that intake or engine. I was fortunate: this happened at 11K ft about 20 miles from the town where my in-laws live, so it amounted to little more than an inconvenience. But an hour earlier I had taken off from Gaston's on a hot day, and I would not have liked to have lost a cylinder on the climb out of there. The little Travel Air doesn't have any cylinders to spare.
I'm "Saving" My Engine. Here's the rust evidence: High humidity and inactivity very likely contributed to this engine's very early demise. Be a CSOB by flying your plane more often and not letting corrosion be a possible outcome for your very expensive engine. Pirep and pics courtesy of Aerobatic Bo owner Chuck G. Here is a real gem - for an engine control cable attachment found by Stuart S.
Here is a picture, contributed by Kent F. I hope the elevator cable is OK! Battery to Alternator Wire Chafing. The engine gauges were rebooting occasionally, the EDM and FS as well as the P tachometer would shut down and start back up - with the attendant self-check - as if the battery master had been flicked off and back on.
Worse, it happened with no rhyme nor reason and last time out, the avionics stack rebooted too and thus, the MX20 rebooted as well. Confounded, I sought advice and set out to find the cause with instructions largely consisting of looking for bad connections, or grounds.
Imagine my surprise at finding the fat wire coming off the alternator, which goes through the firewall on its way to the battery master, grounding against a piece of sheet bracket for the baffles! Worse, each time it arced it was eating away a bit of the bracket - eventually perhaps even being enough to self-correct as it created clearance - picture attached.
Was it the fault of the shop, which performed the annual? However, the issue 'does' arise from being in a shop, giving credence to the old saw. You see, the very first day I owned her I found the terminal where that wire attached to the alternator was improperly crimped 'and' the wire was too short to permit properly adjusting the drive belt tension.
So the mechanic in Deer Valley (Phoenix) simply spliced in a section of wire to make it longer and crimped everything properly. End of story, right? Not quite, because in retrospect he also should have added an Adel clamp for support.
Fortunately, dressing the bracket (along with a dab of paint), plus once again splicing in a piece of wire (and this time supporting it with an Adel clamp), will put things right. Moral of the story? Don't be too quick to assign blame. Eyeball all wires to ensure they're properly tied-off or supported and NOT rubbing on other things! Be sure your mechanic follows this SB from TCM regarding the installation of the alternator coupling assembly!
Here is another geared alternator service letter from Hartzell: Bonanza owner Jim H. I have a backup and the airport was close, so I charlie miked and had them look at the alternator. It had failed electrically, so they replaced it. I flew home happy as clams. Until I got about 10 miles out of Montgomery. The guys at Montgomery Aviation fixed it. The guys at Abilene Aero stood behind it and everything was all good. The newest alternator was installed on August. Worked great for 17 hours (foreshadowing), and as I was flying toward Jacksonville to watch a football game it failed again.
Oh, pooh, I thought. I had just left. Today my guy called me to tell me that the gear had broken and there was metal in the engine, and I shudder a little to consider some of the possible outcomes if I had just shrugged it off and flown on. Probably would have when the pistons started coming through the crankcase, ya reckon?
It was a fairly mild IFR day, but