-
Mike Dalessio authored
Essentially, this change updates `yp_unescape_calculate_difference` to not create syntax errors, and we rely entirely on `yp_unescape_manipulate_string` to report syntax errors. To do that, this PR adds another (!) parameter to `unescape`: `yp_list_t *error_list`. When present, `unescape` reports syntax errors (and otherwise does not). However, an edge case that needed to be addressed is reporting syntax errors in this case: ?\u{1234 2345} In a string context, it's possible to have multiple codepoints by doing something like `"\u{1234 2345}"`; however, in the character literal context, this is a syntax error -- only a single codepoint is allowed. Unfortunately, when `yp_unescape_manipulate_string` is called, there's nothing to indicate that we are in a "character literal" context and that only a single codepoint is valid. To make this work, this PR: - introduces a new static utility function in yarp.c, `yp_char_literal_node_create_and_unescape`, which is called when we're parsing `YP_TOKEN_CHARACTER_LITERAL` - introduces a new (unexported) function, `yp_unescape_manipulate_char_literal` which does the same thing as `yp_unescape_manipulate_string` but tells `unescape` that only a single codepoint is expected https://github.com/ruby/yarp/commit/f6a65840b5
Mike Dalessio authoredEssentially, this change updates `yp_unescape_calculate_difference` to not create syntax errors, and we rely entirely on `yp_unescape_manipulate_string` to report syntax errors. To do that, this PR adds another (!) parameter to `unescape`: `yp_list_t *error_list`. When present, `unescape` reports syntax errors (and otherwise does not). However, an edge case that needed to be addressed is reporting syntax errors in this case: ?\u{1234 2345} In a string context, it's possible to have multiple codepoints by doing something like `"\u{1234 2345}"`; however, in the character literal context, this is a syntax error -- only a single codepoint is allowed. Unfortunately, when `yp_unescape_manipulate_string` is called, there's nothing to indicate that we are in a "character literal" context and that only a single codepoint is valid. To make this work, this PR: - introduces a new static utility function in yarp.c, `yp_char_literal_node_create_and_unescape`, which is called when we're parsing `YP_TOKEN_CHARACTER_LITERAL` - introduces a new (unexported) function, `yp_unescape_manipulate_char_literal` which does the same thing as `yp_unescape_manipulate_string` but tells `unescape` that only a single codepoint is expected https://github.com/ruby/yarp/commit/f6a65840b5
Loading