c++ - How to trigger exactly only *one* SSE-exception - Stack Overflow

I've written a little test program that triggers FPU exceptions through feraiseexcept():

#include <iostream>
#include <cfenv>

using namespace std;

int main()
{
    auto test = []( int exc, char const *what )
    {
        feclearexcept( FE_ALL_EXCEPT );
        feraiseexcept( exc );
        exc = fetestexcept( FE_ALL_EXCEPT );
        int n = 0;
        auto print = [&]( int mask, char const *what )
        {
            if( !(exc &mask) )
                return;
            cout << (n++ ? ", " : "") << what;
        };
        cout << what << ": ";
        print( FE_DIVBYZERO, "div by zero" );
        print( FE_INEXACT, "inexact" );
        print( FE_INVALID, "invalid" );
        print( FE_OVERFLOW, "overflow" );
        print( FE_UNDERFLOW, "underflow" );
        cout << endl;
    };
    test( FE_DIVBYZERO, "div by zero" );
    test( FE_INEXACT, "inexact" );
    test( FE_INVALID, "invalid" );
    test( FE_OVERFLOW, "overflow" );
    test( FE_UNDERFLOW, "underflow" );
}

The results under Linux with glibc are as expected, i.e. only one exception is raised per test call. On Windows, FE_INEXACT is raised in addition when FE_OVERFLOW or FE_UNDERFLOW is test()ed. As this isn't clean, I'd like to implement my own feraiseexcept() function. A pair of _mm_getcsr() and _mm_setcsr() doesn't work, since setting the exception flag doesn't trigger the corresponding trap, which could be caught with SEH under Windows and signal handling under Unix.
So with which arithmetic operations can I trigger the SSE exceptions individually, with only one flag per operation?

asked Mar 12 at 18:24 by Edison von Myosotis, edited Mar 12 at 18:53 by Peter Cordes

What compiler / standard library are you using on Windows? If it's MSVC's standard library, you should probably tag [msvc] (aka [visual-c++]). I'd assume the problem is their implementation of feraiseexcept not taking care to avoid other exceptions. – Peter Cordes Commented Mar 12 at 18:55
codebrowser.dev/glibc/glibc/sysdeps/x86_64/fpu/… is the glibc libm implementation for x86-64 using inline asm. It makes the insane choice to only affect the x87 FP environment, not the SSE MXCSR, for FE_OVERFLOW or FE_UNDERFLOW, despite using SSE divss for INVALID. (Does fetestexcept read both x87 and MXCSR and OR their flags?) Anyway, it uses fldenv for that instead of an operation, hence no other exception flags, so you wouldn't get SIGFPE even with those exceptions unmasked. – Peter Cordes Commented Mar 12 at 19:08
Fwiw, ICX also does what MSVC does (while clang does what gcc does): godbolt.org/z/n5coThW7c – Ted Lyngmo Commented Mar 12 at 21:23

2 Answers

AFAICT, it's not possible to trigger only underflow or overflow without inexact. The only way to get that state is to use stmxcsr / or / ldmxcsr, or something like: save the original MXCSR, trigger overflow and/or underflow, then, if FE_INEXACT wasn't also requested, merge the original PE bit from the saved MXCSR back in.
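A minimal sketch of the flag-only route, assuming an x86 target with the SSE intrinsics from <xmmintrin.h>. The helper names set_mxcsr_flags and get_mxcsr_flags are hypothetical, chosen for illustration. This only ORs sticky bits into MXCSR; no FP instruction executes, so no trap is delivered even when the exception is unmasked.

```cpp
#include <cassert>
#include <xmmintrin.h>

// Hypothetical helper: set sticky SSE exception flags without running
// an FP operation. The low six MXCSR bits are IE, DE, ZE, OE, UE, PE.
void set_mxcsr_flags(unsigned flags)
{
    const unsigned all_flags = 0x3Fu;
    assert((flags & ~all_flags) == 0);   // caller should pass flag bits only
    _mm_setcsr(_mm_getcsr() | (flags & all_flags));
}

// Hypothetical helper: read back only the sticky flag bits.
unsigned get_mxcsr_flags()
{
    return _mm_getcsr() & 0x3Fu;
}
```

Calling set_mxcsr_flags(_MM_EXCEPT_OVERFLOW) after clearing the flags should leave OE as the only flag set, which is exactly the state that no single arithmetic instruction produces.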

You could temporarily mask the inexact exception if it's unmasked and FE_INEXACT wasn't requested; then you could raise it and restore the original state of that bit without running an exception handler that shouldn't have run.

But if the U or O exceptions are unmasked, the fault will be taken with PE set (if MXCSR bits are updated before taking a fault?) and with the exception mask modified.
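The save / mask / trigger / restore idea could be sketched roughly like this, again assuming x86 with <xmmintrin.h>. The function name raise_overflow_keep_inexact is hypothetical. It only covers the case where FE_INEXACT was not requested, and it inherits the caveat above: if overflow is unmasked, the handler would run while the inexact mask is still modified.

```cpp
#include <cassert>
#include <limits>
#include <xmmintrin.h>

// Hypothetical helper: raise overflow with a real SSE operation, then put
// the caller's inexact flag and inexact mask bit back as they were.
void raise_overflow_keep_inexact()
{
    const unsigned saved = _mm_getcsr();

    // Mask the precision (inexact) exception so it cannot trap on its own.
    _mm_setcsr(saved | _MM_MASK_INEXACT);

    // volatile keeps the compiler from folding the addition away.
    volatile float big = std::numeric_limits<float>::max();
    volatile float sink = big + big;   // sets OE and PE; traps here if OE is unmasked
    (void)sink;

    // Restore only the PE flag and PE mask bit from the saved MXCSR.
    const unsigned pe_bits = _MM_EXCEPT_INEXACT | _MM_MASK_INEXACT;
    _mm_setcsr((_mm_getcsr() & ~pe_bits) | (saved & pe_bits));

    assert((_mm_getcsr() & pe_bits) == (saved & pe_bits));
}
```

With all exceptions masked (the usual default), this should leave OE newly set and PE unchanged from before the call.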

pcmpeqd xmm0, xmm0 / pslld xmm0,0x18 (zeroing the mantissa and low bit of the exponent, producing -1.70141183e+38) / addss xmm0, xmm0 raises the OE (overflow exception) and PE (precision exception) flags in MXCSR as it produces -inf.
(In GDB, single-step and use p $mxcsr)

I think there's no way to raise an overflow exception without a precision (inexact) exception, because by definition over/underflow means the actual result (inf or zero) isn't the exact mathematical result. (I also tried -1.70141183e+38/0.0 but that apparently doesn't count as overflow or even invalid, it only raises ZE (divide-by-zero), producing -inf.)
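To make "which operation raises which flag" concrete, here is a small sketch using only <cfenv> and volatile floats. The names Op and flags_raised_by are hypothetical. On x86-64, where float arithmetic normally compiles to scalar SSE instructions, 1/0 should raise only divide-by-zero, 0/0 only invalid, and 1/3 only inexact, while an overflowing add or an underflowing multiply should set inexact as well. Compilers don't all honor FENV_ACCESS, so take this as an illustration rather than a guarantee.

```cpp
#include <cassert>
#include <cfenv>
#include <limits>

enum class Op { Add, Mul, Div };

// Hypothetical helper: clear all flags, perform one float operation,
// and return the exception flags that are set afterwards.
int flags_raised_by(Op op, float a, float b)
{
    // volatile forces the operation to happen at run time, in this order.
    volatile float x = a, y = b;
    volatile float sink = 0.0f;

    int rc = std::feclearexcept(FE_ALL_EXCEPT);
    assert(rc == 0);
    (void)rc;

    switch (op) {
        case Op::Add: sink = x + y; break;
        case Op::Mul: sink = x * y; break;
        case Op::Div: sink = x / y; break;
    }
    (void)sink;

    return std::fetestexcept(FE_ALL_EXCEPT);
}
```

For example, flags_raised_by(Op::Add, FLT_MAX, FLT_MAX) should return FE_OVERFLOW | FE_INEXACT, whereas flags_raised_by(Op::Div, 1.0f, 0.0f) should return just FE_DIVBYZERO.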

The comments in glibc's implementation say that overflow and underflow can't be generated on their own: https://codebrowser.dev/glibc/glibc/sysdeps/x86_64/fpu/fraiseexcpt.c.html#120 (the glibc libm implementation for x86-64, using inline asm).

It makes the insane choice to only affect the x87 FP environment, not the SSE MXCSR, for FE_OVERFLOW or FE_UNDERFLOW, despite using SSE divss for FE_INVALID. (Does fetestexcept read both x87 and MXCSR and OR their flags?)

Anyway, it uses fldenv for over/underflow instead of an operation, hence no other exception flags. So you wouldn't get SIGFPE with those exceptions unmasked.

Presumably whichever Windows implementation you used performs an actual operation that under/overflows and hence also raises FE_INEXACT. That's allowed by C: "If one of the exceptions is FE_OVERFLOW or FE_UNDERFLOW, this function may additionally raise FE_INEXACT" - https://en.cppreference.com/w/c/numeric/fenv/feraiseexcept

What Peter Cordes mentioned in his comment seems to be right: glibc sets only the x87 FPU status word for FE_OVERFLOW and FE_UNDERFLOW. So there's no way to trigger FE_OVERFLOW or FE_UNDERFLOW individually with SSE. glibc then combines both status words in the return value of fetestexcept:

int
fetestexcept (int excepts)
{
  int temp;
  unsigned int mxscr;

  /* Get current exceptions.  */
  __asm__ ("fnstsw %0\n"
       "stmxcsr %1" : "=m" (*&temp), "=m" (*&mxscr));

  return (temp | mxscr) & excepts & FE_ALL_EXCEPT;
}
libm_hidden_def (fetestexcept)
