Skip to content
  • Jeremy Evans's avatar
    99c6d19e
    Generalize cfunc large array splat fix to fix many additional cases raising SystemStackError · 99c6d19e
    Jeremy Evans authored
    
    
    Originally, when 2e7bceb3 fixed cfuncs to no
    longer use the VM stack for large array splats, it was thought to have fully
    fixed Bug #4040, since the issue was fixed for methods defined in Ruby (iseqs)
    back in Ruby 2.2.
    
    After additional research, I determined that same issue affects almost all
    types of method calls, not just iseq and cfunc calls.  There were two main
    types of remaining issues, important cases (where large array splat should
    work) and pedantic cases (where large array splat raised SystemStackError
    instead of ArgumentError).
    
    Important cases:
    
    ```ruby
    define_method(:a){|*a|}
    a(*1380888.times)
    
    def b(*a); end
    send(:b, *1380888.times)
    
    :b.to_proc.call(self, *1380888.times)
    
    def d; yield(*1380888.times) end
    d(&method(:b))
    
    def self.method_missing(*a); end
    not_a_method(*1380888.times)
    
    ```
    
    Pedantic cases:
    
    ```ruby
    def a; end
    a(*1380888.times)
    def b(_); end
    b(*1380888.times)
    def c(_=nil); end
    c(*1380888.times)
    
    c = Class.new do
      attr_accessor :a
      alias b a=
    end.new
    c.a(*1380888.times)
    c.b(*1380888.times)
    
    c = Struct.new(:a) do
      alias b a=
    end.new
    c.a(*1380888.times)
    c.b(*1380888.times)
    ```
    
    This patch fixes all usage of CALLER_SETUP_ARG with splatting a large
    number of arguments, and required similar fixes to use a temporary
    hidden array in three other cases where the VM would use the VM stack
    for handling a large number of arguments.  However, it is possible
    there may be additional cases where splatting a large number
    of arguments still causes a SystemStackError.
    
    This has a measurable performance impact, as it requires additional
    checks for a large number of arguments in many additional cases.
    
    This change is fairly invasive, as there were many different VM
    functions that needed to be modified to support this. To avoid
    too much API change, I modified struct rb_calling_info to add a
    heap_argv member for storing the array, so I would not have to
    thread it through many functions.  This struct is always stack
    allocated, which helps ensure sure GC doesn't collect it early.
    
    Because of how invasive the changes are, and how rarely large
    arrays are actually splatted in Ruby code, the existing test/spec
    suites are not great at testing for correct behavior.  To try to
    find and fix all issues, I tested this in CI with
    VM_ARGC_STACK_MAX to -1, ensuring that a temporary array is used
    for all array splat method calls.  This was very helpful in
    finding breaking cases, especially ones involving flagged keyword
    hashes.
    
    Fixes [Bug #4040]
    
    Co-authored-by: default avatarJimmy Miller <jimmy.miller@shopify.com>
    99c6d19e
    Generalize cfunc large array splat fix to fix many additional cases raising SystemStackError
    Jeremy Evans authored
    
    
    Originally, when 2e7bceb3 fixed cfuncs to no
    longer use the VM stack for large array splats, it was thought to have fully
    fixed Bug #4040, since the issue was fixed for methods defined in Ruby (iseqs)
    back in Ruby 2.2.
    
    After additional research, I determined that same issue affects almost all
    types of method calls, not just iseq and cfunc calls.  There were two main
    types of remaining issues, important cases (where large array splat should
    work) and pedantic cases (where large array splat raised SystemStackError
    instead of ArgumentError).
    
    Important cases:
    
    ```ruby
    define_method(:a){|*a|}
    a(*1380888.times)
    
    def b(*a); end
    send(:b, *1380888.times)
    
    :b.to_proc.call(self, *1380888.times)
    
    def d; yield(*1380888.times) end
    d(&method(:b))
    
    def self.method_missing(*a); end
    not_a_method(*1380888.times)
    
    ```
    
    Pedantic cases:
    
    ```ruby
    def a; end
    a(*1380888.times)
    def b(_); end
    b(*1380888.times)
    def c(_=nil); end
    c(*1380888.times)
    
    c = Class.new do
      attr_accessor :a
      alias b a=
    end.new
    c.a(*1380888.times)
    c.b(*1380888.times)
    
    c = Struct.new(:a) do
      alias b a=
    end.new
    c.a(*1380888.times)
    c.b(*1380888.times)
    ```
    
    This patch fixes all usage of CALLER_SETUP_ARG with splatting a large
    number of arguments, and required similar fixes to use a temporary
    hidden array in three other cases where the VM would use the VM stack
    for handling a large number of arguments.  However, it is possible
    there may be additional cases where splatting a large number
    of arguments still causes a SystemStackError.
    
    This has a measurable performance impact, as it requires additional
    checks for a large number of arguments in many additional cases.
    
    This change is fairly invasive, as there were many different VM
    functions that needed to be modified to support this. To avoid
    too much API change, I modified struct rb_calling_info to add a
    heap_argv member for storing the array, so I would not have to
    thread it through many functions.  This struct is always stack
    allocated, which helps ensure sure GC doesn't collect it early.
    
    Because of how invasive the changes are, and how rarely large
    arrays are actually splatted in Ruby code, the existing test/spec
    suites are not great at testing for correct behavior.  To try to
    find and fix all issues, I tested this in CI with
    VM_ARGC_STACK_MAX to -1, ensuring that a temporary array is used
    for all array splat method calls.  This was very helpful in
    finding breaking cases, especially ones involving flagged keyword
    hashes.
    
    Fixes [Bug #4040]
    
    Co-authored-by: default avatarJimmy Miller <jimmy.miller@shopify.com>
Loading