This post is about C++. If you don’t care about C++, don’t read it.
In fact it’s really just about a difference between compilers. If you’re already using g++ on your HP-UX system (not a bad idea, by the way), don’t read it.
The g++ compiler doesn’t like the type long long, at least not if you use the compile option -std=c++98. Oh, it will go along, as long as you don’t use the -pedantic option when you compile. But with -pedantic in charge, g++ doesn’t let you declare a long long. It doesn’t just complain; it refuses to compile. The reason is that long long is wrong, and doesn’t belong. The 1998 C++ Standard doesn’t recognize it as a type.
The ideal solution, to make a long story short, is to eliminate the use of
long long. On a typical 64-bit system, a
long long is no longer than a
long anyway, so just use a
long. In practice this solution may not be possible, either because it would take too long, or because you need to compile with C functions that have a
long long in their signatures.
Another solution, then, is to compile without the
-pedantic option. I don’t like this solution myself, because I need all the help I can get in finding mistakes and sloppiness.
If you like
-pedantic, but you still long for a
long long, add the option
-Wno-long-long. This option tells the compiler that a
long long is okay as a compiler extension – and wrong no longer.
On Linux I tried to compile a C program that I had brought over from HP-UX and got the following warning:
foo.c: In function 'bar':
foo.c:42: warning: assignment makes integer from pointer without a cast
The code in question terminates a character string with a nul character. However, instead of coding the nul character as some variety of zero, it codes it as
NULL. This is bad form, because
NULL is intended to be used as a pointer value, not as a number or character. The compiler may or may not issue a warning, depending on the vendor’s taste, and depending on how the
NULL macro is defined.
NULL may be defined in either of two ways:
#define NULL 0
#define NULL (void *) 0
In the former case,
NULL is of type
int, and may be assigned to a character variable without a qualm (apart from stylistic objections). In the latter case,
NULL is of type
void *, and the compiler may be a little queasy about assigning it to a character.
On Linux, gcc defines
NULL as a pointer, and issues the warning. I’m not sure what the HP-UX compiler does, because the
#define is tangled in a thicket of
#ifdefs. It may define
NULL as an
int, which would probably not result in a warning. Or maybe we did get a warning and ignored it.
The offending line is embarrassingly gauche, but not actually broken. Either compiler will almost certainly do the right thing.
I can fix the code before or after migration. I could even leave it unchanged, if I were willing to ignore the compiler warning. However I don’t want to get in the habit of ignoring compiler warnings, especially when I can easily eliminate them.
This assessment applies to any case where you misuse
NULL as a numeric value. If you see the same compiler warning in a context where the pointer value is anything other than
NULL, the issue is more serious and should be addressed immediately. However those cases probably would have already been noticed because they would introduce nasty bugs.
We use some utility encryption routines in C for various purposes. When I first tried to migrate these routines from HP-UX to Red Hat Linux, I ran into a problem right away: the linker couldn’t find the libcl.a library.
There is no library by that name in Linux, and I had no idea what the linker wanted to find there. As an experiment, I edited the make file to remove the reference to -lcl. Now the linker complained that it couldn’t find the functions setkey() and encrypt().
A little research showed that, in Linux, those functions reside in the library libcrypt.a. I edited the make file again, replacing the reference to -lcl with a reference to -lcrypt.
(Under HP-UX, the libcl.a library includes not only the encryption functions but also a bunch of math functions. Under Linux, the math functions reside in the standard math library, libm.a, linked with -lm. However I haven’t needed any of them so far.)
Now the linker was happy, but I wasn’t. Whenever I tried to encrypt anything, all I got was a string of zeros. Um, that’s not encryption. At best, it’s a really, really, really bad hash function.
At that point I fell down the rabbit hole of debugging – grepping the source, studying the code, inserting displays, Googling for clues, and trying experiments. Here’s what I found out.
The setkey() and encrypt() functions apply DES encryption to blocks of 64 bits at a time. That’s bits, not bytes. However, you don’t pass the bits as 8-character arrays. You pass them as 64-character arrays, one character for each bit. It’s the job of your application to translate back and forth between these bit arrays and whatever the data should really look like.
One reason for this arrangement is probably that not all machines use 8-bit bytes. The C language requires that a byte contain at least eight bits, but it can have more. I’ve heard of CPUs that use 9-bit bytes, and others that use 64-bit bytes. If you really want to be perverse, you could build a machine with 37-bit bytes. It may be easier for an encryption standard to support exotic architectures if it breaks everything down to the bit level.
In the HP-UX version of these functions, the bits in these arrays can be encoded as the ASCII characters ‘1’ and ‘0’. I haven’t found any documentation of that fact, but that’s how our programs were coded, and they seem to work okay.
In the Linux version of these functions, the bits in these bit arrays evidently must be encoded as the binary values 0x00 and 0x01. I haven’t found any documentation of that fact either. The man pages on both systems are ambiguous.
The best evidence is that, after I rewrote our code to pass the bit arrays with binary values instead of ASCII characters, the encryption started working on Linux. I can encrypt something and then decrypt it, and get the same thing I started with. Furthermore I can encrypt something on HP-UX, decrypt it on Linux, and get the right answer.
Out of curiosity, I tried the Linux version on HP-UX – and it worked! Evidently the HP-UX implementation of setkey() and encrypt() can accept either character values or binary values in their bit arrays, but the Linux version only accepts binary values. Once I converted the code to use binary values, I could use it on both platforms without a bunch of ugly #ifdefs cluttering up the code.
The C and C++ languages don’t try to define the results of every syntactically correct program. For example, if you dynamically allocate some memory, and then free it, and then try to access the memory you freed, you invoke undefined behavior. The C and C++ standards don’t specify how the program will respond.
Undefined behavior means that anything can happen, because the compiler is under no constraints. The traditional formulation is that undefined behavior can make demons fly out your nose. In practice the consequences are usually less dramatic.
If demons ever fly out your nose, you’ll know you have a bug. You can track it down and fix it. More insidious is undefined behavior that happens to be exactly what you want.
I ran across an example as I was preparing to port some code from HP-UX to Linux. The program was freeing a linked list, using code similar to the following:
Node * curr_node = first_node;
while( curr_node )
{
    free( curr_node );
    curr_node = curr_node->next;
}
To a long-time C coder, this code immediately looks fishy, because the loop has only two statements in it. Look a little closer. The second statement in the loop tries to access memory through a pointer that has already been freed. It invokes undefined behavior.
This program has been running for years with no obvious ill effects from this bug. Apparently HP-UX isn’t very persnickety about accessing previously freed memory. That’s legal. “Anything can happen” includes “what you want.”
When I first saw this code, I wasn’t ready to port the entire program to Linux yet, but I could experiment. I dashed off a little test program that built a linked list and then freed it, using the logic shown above. Under HP-UX this program ran to completion without incident. Under Linux, the same program stopped abruptly in the first iteration. It didn’t issue any messages, dump core, or even leave a non-zero condition code; it just stopped cold. That’s legal too. Anything can happen.
I don’t know whether this difference is attributable to the operating systems, the compilers, the libraries, or the machine architectures. I don’t care. What matters is that I can’t run this program under Linux without fixing the loop:
Node * curr_node = first_node;
while( curr_node )
{
    Node * temp = curr_node->next;
    free( curr_node );
    curr_node = temp;
}
A few days later, another example of undefined behavior popped up, in the form of a buffer overflow. Under HP-UX the overflow had no visible effect, at least not until I started poking around with printf statements. Under Linux the code just didn’t work. Probably the variables are arranged differently in memory. In HP-UX the overflow didn’t damage anything that mattered, and in Linux it did.
These examples are just things that I stumbled across. There will be more, and I won’t catch them all so painlessly. Fancy code analyzers may help catch things in advance, but there is no substitute for vigilance.
It’s tempting to conclude that HP-UX is more forgiving of blunders than Linux is, since some things work in HP-UX but don’t work in Linux. That conclusion is premature. Maybe the two platforms are just forgiving about different things. If the bugs had done obvious damage under HP-UX they would have been fixed already.
There’s a Darwinian process at work here. Bugs survive when they’re well adapted to the environment. When the environment changes, some of those bugs will go extinct. Unfortunately, new species will probably replace them.
You might think that migrating from HP-UX to Linux would be a piece of cake. Hey, they’re both UNIX, right? Sort of?
Naturally, it’s not that simple. Even among two UNIX-like environments, there are all sorts of pesky little differences to trip over. Typically the resulting problems are not hard to fix, but they can be hard to anticipate.
Lately we’ve been migrating an application from HP-UX to Red Hat Linux, and stumbling over a series of little gotchas. I plan to report some of them on this blog, and maybe save somebody else some hair-pulling.
For example: take the echo command.
Under HP-UX, echo recognizes escape sequences such as "\t" for horizontal tab, "\n" for newline, and so forth. Under Linux, it doesn’t, at least not by default. However the -e option tells the Linux version to recognize escape sequences.
One solution would be to look through all our shell scripts for ones that use echo with escape sequences, and fix them. Ideally, I should do that. However, the search would be tedious and time-consuming.
Instead, I defined the following alias in a logon script that everybody goes through:
alias echo="echo -e"
If you really want echo not to recognize escape sequences, use the -E option, which works even with the alias in place. When the -e and -E options are both present, the last one wins.
Under HP-UX, the echo command is built into the shell (ksh), and there is also a separate executable /usr/bin/echo. Both versions recognize escape sequences.
Likewise under Linux: the echo command is built into bash, and there is also a separate executable /bin/echo. Neither version recognizes escape sequences unless you include the -e option.
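A quick way to see the option behavior described above (run under bash, with no alias in place; the sample string is mine):

```shell
echo 'a\tb'        # bash builtin: prints the backslash literally
echo -e 'a\tb'     # escape interpreted: a, tab, b
echo -E 'a\tb'     # literal again, even if echo is aliased to 'echo -e'
echo -e -E 'a\tb'  # both given: the last option wins, so literal
```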