    d42b9ffb
    Reuse Regexp ptr when recompiling · d42b9ffb
    Peter Zhu authored
    When a Regexp is matched against a string with an incompatible encoding,
    the Regexp needs to be recompiled. If `usecnt == 0`, nothing else is using
    the `ptr`, so we can reuse it instead of allocating another `regex_t`.
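
    The reuse rule can be illustrated with a toy Ruby model. This is only a
    sketch of the idea, not the C change itself: `CompiledPattern`,
    `PatternHolder`, `prepare`, and `with_use` are hypothetical names, and the
    "still in use" branch (building a separate copy) is an assumption about
    what happens when the use count is nonzero.

    ```ruby
    # Toy model only. These names are hypothetical and do not correspond to
    # Ruby's C internals.
    CompiledPattern = Struct.new(:source, :encoding)

    class PatternHolder
      attr_reader :compiled, :allocations

      def initialize(source, encoding)
        @compiled = CompiledPattern.new(source, encoding)
        @use_count = 0    # number of matches currently using @compiled
        @allocations = 1  # how many CompiledPattern objects were created
      end

      # Returns a compiled pattern usable for `encoding`.
      def prepare(encoding)
        return @compiled if @compiled.encoding == encoding

        if @use_count.zero?
          # Nothing else is using the compiled object: recompile in place.
          @compiled.encoding = encoding
          @compiled
        else
          # Assumed fallback: leave the shared object alone and build a copy.
          @allocations += 1
          CompiledPattern.new(@compiled.source, encoding)
        end
      end

      # Marks the compiled object as in use for the duration of the block.
      def with_use
        @use_count += 1
        yield
      ensure
        @use_count -= 1
      end
    end

    holder = PatternHolder.new("foo", Encoding::UTF_8)
    first = holder.compiled

    # Idle holder: switching encodings reuses the same object.
    reused = holder.prepare(Encoding::ASCII_8BIT)

    # Holder in use: switching encodings allocates a separate object.
    copied = holder.with_use { holder.prepare(Encoding::UTF_8) }
    ```

    In this model, `holder.allocations` stays at 1 after the idle switch and
    only rises when a switch happens while the pattern is in use, which is the
    allocation the commit avoids in the `usecnt == 0` case.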
    
    This speeds up matches that switch to incompatible encodings by about 15%
    (1.248M to 1.431M i/s in the benchmark below).
    
    Branch:
    
    ```
    Regex#match? with different encoding
                              1.431M (± 1.3%) i/s -      7.264M in   5.076153s
    Regex#match? with same encoding
                             16.858M (± 1.1%) i/s -     85.347M in   5.063279s
    ```
    
    Base:
    
    ```
    Regex#match? with different encoding
                              1.248M (± 2.0%) i/s -      6.342M in   5.083151s
    Regex#match? with same encoding
                             16.377M (± 1.1%) i/s -     82.519M in   5.039504s
    ```
    
    Script:
    
    ```
    require "benchmark/ips"

    regex = /foo/
    str1 = "日本語"
    str2 = "English".force_encoding("ASCII-8BIT")
    
    Benchmark.ips do |x|
      x.report("Regex#match? with different encoding") do |times|
        i = 0
        while i < times
          regex.match?(str1)
          regex.match?(str2)
          i += 1
        end
      end
    
      x.report("Regex#match? with same encoding") do |times|
        i = 0
        while i < times
          regex.match?(str1)
          i += 1
        end
      end
    end
    ```