    Make Active Record emit significantly smaller YAML · c4cb6862
    Sean Griffin authored
    This reduces the size of a YAML-encoded Active Record object by ~80%,
    depending on the number of columns. The previous encoding was wasteful
    in several ways, and fixing each of them has resulted in a win:
    
    - We were emitting the result of `attributes_before_type_cast` as a hack
      to work around some laziness issues
    - The name of an attribute was emitted multiple times, since the
      attribute objects were in a hash keyed by the name. We now store them
      in an array instead, and reconstruct the hash on load from each
      attribute's name
    - The types were included for every attribute. This would use backrefs
      if multiple objects were encoded, but really we don't need to include
      it at all unless it differs from the type at the class level. (The
      only time that will occur is if the field is the result of a custom
      select clause)
    - `original_attribute:` was emitted for every attribute, even though
      the ivar is almost always `nil`. We've added a custom implementation
      of `encode_with` on the attribute objects to ensure we don't write the
      key when the field is `nil`.
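
A minimal sketch of the last point and of the array-based storage, using only Psych's `encode_with` / `init_with` hooks. `SketchAttribute`, `SketchRecord`, and `sketch_load` are hypothetical stand-ins for illustration, not the actual Active Record classes touched by this commit:

```ruby
require "yaml"

# Hypothetical stand-in for an attribute object (not the Active Record class).
class SketchAttribute
  attr_reader :name, :value_before_type_cast, :original_attribute

  def initialize(name, value_before_type_cast, original_attribute = nil)
    @name = name
    @value_before_type_cast = value_before_type_cast
    @original_attribute = original_attribute
  end

  # Psych calls this when dumping; only keys set on the coder are written,
  # so the `original_attribute` key is skipped entirely when it is nil.
  def encode_with(coder)
    coder["name"] = name
    coder["value_before_type_cast"] = value_before_type_cast
    coder["original_attribute"] = original_attribute unless original_attribute.nil?
  end

  # Psych calls this when loading; a missing key simply reads back as nil.
  def init_with(coder)
    @name = coder["name"]
    @value_before_type_cast = coder["value_before_type_cast"]
    @original_attribute = coder["original_attribute"]
  end
end

# Hypothetical stand-in for a record holding a name => attribute hash.
class SketchRecord
  attr_reader :attributes

  def initialize(attributes)
    @attributes = attributes
  end

  # Dump the attributes as an array so each name is written only once.
  def encode_with(coder)
    coder["attributes"] = @attributes.values
  end

  # Rebuild the hash from each attribute's own name.
  def init_with(coder)
    @attributes = coder["attributes"].each_with_object({}) do |attribute, hash|
      hash[attribute.name] = attribute
    end
  end
end

# Newer Psych versions need `unsafe_load` to revive arbitrary classes;
# older ones only have `load`. Assumes trusted input.
def sketch_load(yaml)
  YAML.respond_to?(:unsafe_load) ? YAML.unsafe_load(yaml) : YAML.load(yaml)
end

record = SketchRecord.new(
  "title" => SketchAttribute.new("title", "Hello"),
  "views" => SketchAttribute.new("views", "3")
)
record_yaml = YAML.dump(record)
reloaded_record = sketch_load(record_yaml)

# An attribute that does carry an original still round-trips it.
changed_yaml = YAML.dump(
  SketchAttribute.new("title", "Bye", SketchAttribute.new("title", "Hello"))
)
reloaded_changed = sketch_load(changed_yaml)
```

The sketch leaves out types entirely; in the real change they are only written when they differ from the type at the class level.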
    
    This isn't without a cost, though. Since we're no longer including the
    types, an object can find itself in an invalid state if the type changes
    on the class after serialization. This is the same behavior as in 4.1
    and earlier, but I think it's worth noting.
    
    I was worried that I'd introduce some new state bugs as a result of
    doing this, so I've added an additional test that asserts that mutations
    are not lost when an object is round-tripped through YAML.
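
The shape of such a round-trip check can be sketched with plain Ruby and Psych. `SketchTrackedValue` is a hypothetical class invented for this illustration; it is not the test added in this commit, and it relies on Psych's default dumping of instance variables:

```ruby
require "yaml"

# Hypothetical value holder that remembers what it started with.
class SketchTrackedValue
  attr_reader :value

  def initialize(value)
    @original = value
    @value = value
  end

  def value=(new_value)
    @value = new_value
  end

  # True when the current value differs from the one it was created with.
  def changed?
    @value != @original
  end
end

tracked = SketchTrackedValue.new("draft")
tracked.value = "published"

yaml = YAML.dump(tracked)

# Newer Psych versions need `unsafe_load` to revive arbitrary classes;
# older ones only have `load`. Assumes trusted input.
reloaded = YAML.respond_to?(:unsafe_load) ? YAML.unsafe_load(yaml) : YAML.load(yaml)
```

After the round trip, `reloaded.changed?` should still be true and `reloaded.value` should still be the mutated value.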
    
    Fixes #25145