Skip to content
  • Ivo Anjo's avatar
    b2be36ef
    [Backport #11036] Add explicit compiler fence when pushing frames to ensure safe profiling (#11090) · b2be36ef
    Ivo Anjo authored
    **What does this PR do?**
    
    This PR tweaks the `vm_push_frame` function to add an explicit compiler
    fence (`atomic_signal_fence`) to ensure profilers that use signals
    to interrupt applications (stackprof, vernier, pf2, Datadog profiler)
    can safely sample from the signal handler.
    
    This is a backport of #11036 to Ruby 3.3 .
    
    **Motivation:**
    
    The `vm_push_frame` was specifically tweaked in
    https://github.com/ruby/ruby/pull/3296 to initialize the a frame
    before updating the `cfp` pointer.
    
    But since there's nothing stopping the compiler from reordering
    the initialization of a frame (`*cfp =`) with the update of the cfp
    pointer (`ec->cfp = cfp`) we've been hesitant to rely on this on
    the Datadog profiler.
    
    In practice, after some experimentation + talking to folks, this
    reordering does not seem to happen.
    
    But since modern compilers have a way for us to exactly tell them
    not to do the reordering (`atomic_signal_fence`), this seems even
    better.
    
    I've actually extracted `vm_push_frame` into the "Compiler Explorer"
    website, which you can use to see the assembly output of this function
    across many compilers and architectures: https://godbolt.org/z/3oxd1446K
    
    On that link you can observe two things across many compilers:
    1. The compilers are not reordering the writes
    2. The barrier does not change the generated assembly output
       (== has no cost in practice)
    
    **Additional Notes:**
    
    The checks added in `configure.ac` define two new macros:
    * `HAVE_STDATOMIC_H`
    * `HAVE_DECL_ATOMIC_SIGNAL_FENCE`
    
    Since Ruby generates an arch-specific `config.h` header with
    these macros upon installation, this can be used by profilers
    and other libraries to test if Ruby was compiled with the fence enabled.
    
    **How to test the change?**
    
    As I mentioned above, you can check https://godbolt.org/z/3oxd1446K
    to confirm the compiled output of `vm_push_frame` does not change
    in most compilers (at least all that I've checked on that site).
    b2be36ef
    [Backport #11036] Add explicit compiler fence when pushing frames to ensure safe profiling (#11090)
    Ivo Anjo authored
    **What does this PR do?**
    
    This PR tweaks the `vm_push_frame` function to add an explicit compiler
    fence (`atomic_signal_fence`) to ensure profilers that use signals
    to interrupt applications (stackprof, vernier, pf2, Datadog profiler)
    can safely sample from the signal handler.
    
    This is a backport of #11036 to Ruby 3.3 .
    
    **Motivation:**
    
    The `vm_push_frame` was specifically tweaked in
    https://github.com/ruby/ruby/pull/3296 to initialize the a frame
    before updating the `cfp` pointer.
    
    But since there's nothing stopping the compiler from reordering
    the initialization of a frame (`*cfp =`) with the update of the cfp
    pointer (`ec->cfp = cfp`) we've been hesitant to rely on this on
    the Datadog profiler.
    
    In practice, after some experimentation + talking to folks, this
    reordering does not seem to happen.
    
    But since modern compilers have a way for us to exactly tell them
    not to do the reordering (`atomic_signal_fence`), this seems even
    better.
    
    I've actually extracted `vm_push_frame` into the "Compiler Explorer"
    website, which you can use to see the assembly output of this function
    across many compilers and architectures: https://godbolt.org/z/3oxd1446K
    
    On that link you can observe two things across many compilers:
    1. The compilers are not reordering the writes
    2. The barrier does not change the generated assembly output
       (== has no cost in practice)
    
    **Additional Notes:**
    
    The checks added in `configure.ac` define two new macros:
    * `HAVE_STDATOMIC_H`
    * `HAVE_DECL_ATOMIC_SIGNAL_FENCE`
    
    Since Ruby generates an arch-specific `config.h` header with
    these macros upon installation, this can be used by profilers
    and other libraries to test if Ruby was compiled with the fence enabled.
    
    **How to test the change?**
    
    As I mentioned above, you can check https://godbolt.org/z/3oxd1446K
    to confirm the compiled output of `vm_push_frame` does not change
    in most compilers (at least all that I've checked on that site).
Loading