    Improve the performance of super · 637d1cc0
    eileencodes authored
    
    
    This PR improves the performance of `super` calls. While working on some
    Rails optimizations jhawthorn discovered that `super` calls were slower
    than expected.
    
    The changes here do the following:
    
    1) Adds a check for whether the call frame's iseq differs from the
    method entry's iseq. When the two are equal we know we can't have an
    instance eval, etc., so the `rb_obj_is_kind_of` check on the next
    line, which is quite slow, is skipped.
    2) Changes `FL_TEST` to `FL_TEST_RAW`. This is safe because we've
    already done the check for `T_ICLASS` above.
    3) Adds a benchmark for `T_ICLASS` super calls.
    4) Note: changes `method_entry_cref` to use `const`.
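
For readers unfamiliar with the `T_ICLASS` case, here is a minimal sketch of the kind of call involved. When a module is prepended (or included), CRuby places an internal proxy class for it in the ancestor chain, and `super` walks that chain. The names `Base`, `Loud`, and `Greeter` are hypothetical and chosen only for illustration; they do not appear in this change.

```ruby
# Hypothetical classes for illustration; not part of the patch.
class Base
  def greet
    "hello"
  end
end

module Loud
  # Runs first because the module is prepended; `super` continues
  # to Greeter#greet, the next `greet` in the ancestor chain.
  def greet
    super.upcase
  end
end

class Greeter < Base
  prepend Loud

  # `super` here continues to Base#greet.
  def greet
    "#{super} there"
  end
end

ANCESTORS = Greeter.ancestors.take(3) # Loud comes before Greeter itself
RESULT = Greeter.new.greet
puts ANCESTORS.inspect
puts RESULT
```

Each `super` in this chain is a separate super-method lookup, which is the path the change above is meant to make cheaper.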
    
    On master the benchmarks showed that `super` is 1.76x slower. Our
    changes improved the performance so that it is now only 1.36x slower.
    
    Benchmark IPS:
    
    ```
    Warming up --------------------------------------
                   super   244.918k i/100ms
             method call   383.007k i/100ms
    Calculating -------------------------------------
                   super      2.280M (± 6.7%) i/s -     11.511M in   5.071758s
             method call      3.834M (± 4.9%) i/s -     19.150M in   5.008444s
    
    Comparison:
             method call:  3833648.3 i/s
                   super:  2279837.9 i/s - 1.68x  (± 0.00) slower
    ```
    
    With changes:
    
    ```
    Warming up --------------------------------------
                   super   308.777k i/100ms
             method call   375.051k i/100ms
    Calculating -------------------------------------
                   super      2.951M (± 5.4%) i/s -     14.821M in   5.039592s
             method call      3.551M (± 4.9%) i/s -     18.002M in   5.081695s
    
    Comparison:
             method call:  3551372.7 i/s
                   super:  2950557.9 i/s - 1.20x  (± 0.00) slower
    ```
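
As a rough cross-check, the "x slower" figures in the two comparison blocks can be recomputed from the reported iterations per second. This is only a sketch: the constant and method names are hypothetical, and because the two runs were measured separately (the plain method call baseline also differs between them), the cross-run ratio is indicative at best.

```ruby
# i/s values copied from the two benchmark-ips comparison blocks above.
BEFORE = { method_call: 3_833_648.3, super_call: 2_279_837.9 }
AFTER  = { method_call: 3_551_372.7, super_call: 2_950_557.9 }

# How many times slower `super` is than a plain method call within one run.
def slowdown(run)
  (run[:method_call] / run[:super_call]).round(2)
end

SLOWDOWN_BEFORE = slowdown(BEFORE)
SLOWDOWN_AFTER  = slowdown(AFTER)

# Ratio of `super` throughput across the two (separately measured) runs.
SUPER_SPEEDUP = (AFTER[:super_call] / BEFORE[:super_call]).round(2)

puts "before: #{SLOWDOWN_BEFORE}x slower"
puts "after:  #{SLOWDOWN_AFTER}x slower"
puts "super throughput ratio: #{SUPER_SPEEDUP}x"
```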
    
    Ruby VM benchmarks also showed an improvement:
    
    Existing `vm_super` benchmark:
    
    ```
    $ make benchmark ITEM=vm_super
    
    |          |compare-ruby|built-ruby|
    |:---------|-----------:|---------:|
    |vm_super  |     21.555M|   37.819M|
    |          |           -|     1.75x|
    ```
    
    New `vm_iclass_super` benchmark:
    
    ```
    $ make benchmark ITEM=vm_iclass_super
    
    |                 |compare-ruby|built-ruby|
    |:----------------|-----------:|---------:|
    |vm_iclass_super  |      1.669M|    3.683M|
    |                 |           -|     2.21x|
    ```
    
    This is the benchmark script used for the benchmark-ips benchmarks:
    
    ```ruby
    require "benchmark/ips"
    
    class Foo
      def zuper; end
      def top; end
    
      last_method = "top"
    
      ("A".."M").each do |module_name|
        eval <<-EOM
        module #{module_name}
          def zuper; super; end
          def #{module_name.downcase}
            #{last_method}
          end
        end
        prepend #{module_name}
        EOM
        last_method = module_name.downcase
      end
    end
    
    foo = Foo.new
    
    Benchmark.ips do |x|
      x.report "super" do
        foo.zuper
      end
    
      x.report "method call" do
        foo.m
      end
    
      x.compare!
    end
    ```
    
    Co-authored-by: Aaron Patterson <tenderlove@ruby-lang.org>
    Co-authored-by: John Hawthorn <john@hawthorn.email>