I've written a little test program that triggers FPU exceptions through feraiseexcept():
#include <iostream>
#include <cfenv>
using namespace std;
int main()
{
    auto test = []( int exc, char const *what )
    {
        feclearexcept( FE_ALL_EXCEPT );
        feraiseexcept( exc );
        exc = fetestexcept( FE_ALL_EXCEPT );
        int n = 0;
        auto print = [&]( int mask, char const *what )
        {
            if( !(exc & mask) )
                return;
            cout << (n++ ? ", " : "") << what;
        };
        cout << what << ": ";
        print( FE_DIVBYZERO, "div by zero" );
        print( FE_INEXACT, "inexact" );
        print( FE_INVALID, "invalid" );
        print( FE_OVERFLOW, "overflow" );
        print( FE_UNDERFLOW, "underflow" );
        cout << endl;
    };
    test( FE_DIVBYZERO, "div by zero" );
    test( FE_INEXACT, "inexact" );
    test( FE_INVALID, "invalid" );
    test( FE_OVERFLOW, "overflow" );
    test( FE_UNDERFLOW, "underflow" );
}
The results under Linux with glibc are as expected, i.e. only one exception is raised per test() call. On Windows, FE_INEXACT is raised in addition whenever FE_OVERFLOW or FE_UNDERFLOW is tested. As this isn't clean, I'd like to implement my own feraiseexcept() function. A pair of _mm_getcsr() and _mm_setcsr() doesn't work, since merely setting the exception flag doesn't trigger the corresponding trap, which could otherwise be caught with SEH under Windows and signal handling under Unix.
So with which arithmetic operations can I trigger the SSE exceptions individually, with only one flag per operation?
2 Answers
AFAICT, it's not possible to trigger only underflow or overflow without inexact. The only way to get that state is to use stmxcsr / or / ldmxcsr, or something like this: save the original MXCSR, trigger overflow and/or underflow, then, if FE_INEXACT wasn't also requested, merge the PE (inexact) bit from the original MXCSR back in.
You could temporarily mask the inexact exception if it's unmasked and FE_INEXACT wasn't requested. Then you could generate the flag and restore the original state of that bit without running an exception handler that shouldn't have run.
But if the U or O exceptions are unmasked, the fault will be taken with PE set (if MXCSR bits are updated before taking a fault?) and with the exception mask modified.
pcmpeqd xmm0, xmm0 / pslld xmm0, 0x18 (zeroing the mantissa and the low bit of the exponent, producing -1.70141183e+38) / addss xmm0, xmm0 raises the OE (overflow exception) and PE (precision exception) flags in MXCSR as it produces -inf.
(In GDB, single-step and use p $mxcsr.)
I think there's no way to raise an overflow exception without a precision (inexact) exception, because by definition overflow or underflow means the actual result (inf or zero) isn't the exact mathematical result. (I also tried -1.70141183e+38/0.0, but that apparently doesn't count as overflow or even invalid; it only raises ZE (divide-by-zero), producing -inf.)
The comments in glibc's implementation say that overflow and underflow can't be generated on their own: https://codebrowser.dev/glibc/glibc/sysdeps/x86_64/fpu/fraiseexcpt.c.html#120 is the glibc libm implementation for x86-64, using inline asm.
It makes the insane choice to only affect the x87 FP environment, not the SSE MXCSR, for FE_OVERFLOW or FE_UNDERFLOW, despite using SSE divss for FE_INVALID. (Does fetestexcept read both the x87 status word and MXCSR and OR their flags?)
Anyway, it uses fldenv for over/underflow instead of an operation, hence no other exception flags. So you wouldn't get SIGFPE even with those exceptions unmasked.
Presumably whichever Windows implementation you used performs an actual operation that underflows or overflows and hence also raises FE_INEXACT. That's allowed by C: "If one of the exceptions is FE_OVERFLOW or FE_UNDERFLOW, this function may additionally raise FE_INEXACT" - https://en.cppreference.com/w/c/numeric/fenv/feraiseexcept
What Peter Cordes mentioned in his comment seems right: only the x87 FPU status word is set for FE_OVERFLOW and FE_UNDERFLOW. glibc then combines both words in its return value from fetestexcept:
int
fetestexcept (int excepts)
{
  int temp;
  unsigned int mxscr;
  /* Get current exceptions. */
  __asm__ ("fnstsw %0\n"
           "stmxcsr %1" : "=m" (*&temp), "=m" (*&mxscr));
  return (temp | mxscr) & excepts & FE_ALL_EXCEPT;
}
libm_hidden_def (fetestexcept)
So there's no way to trigger FE_OVERFLOW or FE_UNDERFLOW individually with SSE.