Skip to content
  • Felipe Sateler's avatar
    42ce7057
    postgres: Make the initial type map query less expensive · 42ce7057
    Felipe Sateler authored
    When the pg_type table is large, the initial type map query is very expensive.
    This is due to two reasons:
    
    1. The complete query does not allow use of any indexes.
    2. The complete query returns much more rows than we need
    
    Point 1 is fixed by splitting up each OR into its own query. For reasons
    unknown to me, the query plan is slower with OR instead of separate
    queries.
    
    Point 2 is a problem when databases have lots of tables. This is because
    for each table, postgres creates both a type for each record, and a type
    for an array of records. The latter types are matched by the condition
    `t.typelem != 0`. There is no easy way to filter out these types, but
    since we already make a new query due to point 1, we can change our query
    to bring us exactly the data we need: all the types that reference (via
    `typelem`) a type we already know about. This way we greatly reduce the
    number of rows returned.
    
    Additionally, we remove the condition
    `t.typinput = 'array_in(cstring,oid,integer)'::regprocedure`, since it
    brings no value. Array types have a nonzero `typelem`.
    
    This change brings a type map initialization performance improvement.
    A production system with aprox 1.6MM rows in `pg_type`, improves from
    2.5-4.1 seconds to 1.4-1.6 seconds.
    42ce7057
    postgres: Make the initial type map query less expensive
    Felipe Sateler authored
    When the pg_type table is large, the initial type map query is very expensive.
    This is due to two reasons:
    
    1. The complete query does not allow use of any indexes.
    2. The complete query returns much more rows than we need
    
    Point 1 is fixed by splitting up each OR into its own query. For reasons
    unknown to me, the query plan is slower with OR instead of separate
    queries.
    
    Point 2 is a problem when databases have lots of tables. This is because
    for each table, postgres creates both a type for each record, and a type
    for an array of records. The latter types are matched by the condition
    `t.typelem != 0`. There is no easy way to filter out these types, but
    since we already make a new query due to point 1, we can change our query
    to bring us exactly the data we need: all the types that reference (via
    `typelem`) a type we already know about. This way we greatly reduce the
    number of rows returned.
    
    Additionally, we remove the condition
    `t.typinput = 'array_in(cstring,oid,integer)'::regprocedure`, since it
    brings no value. Array types have a nonzero `typelem`.
    
    This change brings a type map initialization performance improvement.
    A production system with aprox 1.6MM rows in `pg_type`, improves from
    2.5-4.1 seconds to 1.4-1.6 seconds.
Loading