Skip to content

Conversation

@kuldeepkk-dev
Copy link
Contributor

Title: Fix: Escape Perl variable $1 in cloud-sql-proxy.sh to prevent 'unbound variable' error
Description
This PR resolves a shell execution error that occurs during the provisioning of Dataproc clusters using the cloud-sql-proxy.sh initialization action when set -u (or set -o nounset) is active.

🐛 Issue
The repair_old_backports function in cloud-sql-proxy.sh uses a perl substitution command within double quotes:

perl -pi -e "s{...}{\$1 ...}g"

The shell (running with set -u) attempts to perform variable substitution on the Perl back-reference $1 before passing the string to the perl interpreter. Since no positional parameter $1 is passed to the shell function, this causes the script to fail with the error:

/etc/google-dataproc/startup-scripts/dataproc-initialization-script-1: line 176: $1: unbound variable

✅ Solution
The fix involves triple-escaping the $1 within the double-quoted string:

# Corrected line 176:
perl -pi -e "s{...}{\\\$1 ...}g"

This ensures the string $1 is passed literally to perl, allowing perl to correctly interpret it as a reference to the first captured group, thus preventing the shell from attempting to interpret it as an unset shell variable.

@kuldeepkk-dev kuldeepkk-dev requested a review from cjac November 17, 2025 19:20
@gemini-code-assist
Copy link
Contributor

Summary of Changes

Hello @kuldeepkk-dev, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request resolves a critical shell execution error that occurred during Dataproc cluster provisioning when the cloud-sql-proxy.sh initialization action was used with set -u enabled. The fix prevents script failures by correctly escaping a Perl back-reference within a shell command, ensuring the Perl interpreter receives the intended string without premature shell variable expansion by the shell.

Highlights

  • Shell Variable Expansion Fix: Corrected an issue where the shell attempted to expand $1 as a shell variable within a Perl substitution command, leading to an 'unbound variable' error when set -u was active during Dataproc cluster provisioning.
  • Perl Back-reference Escaping: The Perl back-reference $1 in the repair_old_backports function was triple-escaped (\$1) to ensure it is passed literally to the Perl interpreter, allowing Perl to correctly interpret it as a captured group and preventing premature shell variable expansion.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@kuldeepkk-dev
Copy link
Contributor Author

/gcbrun

@kuldeepkk-dev
Copy link
Contributor Author

/gemini review

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request addresses a shell execution error in cloud-sql-proxy.sh when set -u is active, which was caused by the shell incorrectly trying to expand a Perl back-reference. The fix involves adding another layer of escaping to the back-reference, which is a valid solution for environments with multiple levels of command evaluation. While the fix is functional, I've provided a suggestion to refactor the code for better readability and maintainability, which would make the logic clearer and less reliant on complex escaping rules.

Comment on lines 172 to +173
perl -pi -e "s{^(deb[^\s]*) https?://[^/]+/debian ${oldoldstable}-backports }
{\$1 https://archive.debian.org/debian ${oldoldstable}-backports }g" "${filename}"
{\\\$1 https://archive.debian.org/debian ${oldoldstable}-backports }g" "${filename}"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The use of \\\$1 to escape the Perl back-reference is a valid way to handle nested evaluation by the shell, which seems to be the case in this execution environment. However, this triple-backslash escaping is quite subtle and can be difficult for other developers to understand and maintain.

For improved clarity and long-term maintainability, I recommend refactoring this command to use shell variables. This approach separates the search and replace patterns from the perl command itself, making the quoting much simpler and the overall logic easier to follow.

Here's a suggested alternative:

    local search_pattern="^(deb[^\s]*) https?://[^/]+/debian ${oldoldstable}-backports "
    # The '$1' here is for perl, so it's in single quotes to prevent shell expansion.
    local replace_pattern='$1 https://archive.debian.org/debian '${oldoldstable}'-backports'
    perl -pi -e "s{${search_pattern}}{${replace_pattern}}g" "${filename}"

This would replace the current multi-line perl command and make the script's intent more explicit.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, this is not true. The LLM is incorrect here. The $1 is already being escaped. I asked duckie to be sure:


You are correct to critically evaluate the analysis provided by the on-the-fly code assist. The explanation regarding \\\$1 appears to be incorrect.

Here's a breakdown of the escaping:

  1. The Goal: Inside the Perl substitution command s{...}{...}, we want to use the content of the first capturing group from the search pattern. In Perl, this is represented by $1 in the replacement section.

  2. Shell Quoting: The entire Perl script s{...}{...} is enclosed in double quotes (") in the shell command. The shell processes strings within double quotes before passing them to the command (in this case, perl).

    • $ has special meaning to the shell (variable expansion).
    • \ has special meaning to the shell (escaping).
  3. How the Shell Interprets Escapes:

    • \$1: The shell sees \$ and replaces it with a literal $. So, Perl receives the string $1. This is the correct way to pass the backreference variable $1 to Perl's replacement string when inside shell double quotes.

    • \\\$1:

      • The first \\ tells the shell to treat the second \ as a literal backslash.
      • The shell then sees \$, which it treats as a literal $.
      • So, the shell transforms \\\$1 into \$1 before passing it to Perl.
      • Perl's s/// operator, in the replacement part, would likely interpret \$1 as a literal dollar sign followed by the digit 1, NOT as the content of the first capture group.

Conclusion on the Analysis:

The statement "The use of \\\$1 to escape the Perl back-reference is a valid way to handle nested evaluation by the shell" is misleading in this context. It does not result in Perl using the value of the backreference; it results in a literal string \$1 being passed to Perl's replacement block, which isn't the intended backreference.

The original code snippet (presumably on line 172, using {\$1) was likely the correct one.

Gemini's Refactoring Suggestion:

The idea to use shell variables to construct the search and replace patterns is excellent for readability and maintainability, as it separates the concerns of shell quoting and Perl regex syntax. This aligns with Google's Shell Style Guide (go/shell-style) on making quoting clear.

However, there's a slight issue in the suggested replace_pattern:

# Gemini's suggestion for replace_pattern:
local replace_pattern='$1 https://archive.debian.org/debian '${oldoldstable}'-backports'

This would not expand the oldoldstable variable correctly because of the single quotes. A corrected version of the refactoring would be:

    local search_pattern="^(deb[^\s]*) https?://[^/]+/debian ${oldoldstable}-backports "
    # Ensure $1 is literal for Perl, and shell variables are expanded.
    local replace_pattern="\$1 https://archive.debian.org/debian ${oldoldstable}-backports"
    # Alternatively, using single quotes for the literal $1 part:
    # local replace_pattern='$1 https://archive.debian.org/debian '"${oldoldstable}"'-backports'

    perl -pi -e "s{${search_pattern}}{${replace_pattern}}g" "${filename}"

In this corrected version:

  • search_pattern has ${oldoldstable} expanded by the shell.
  • replace_pattern uses \$1 so the shell passes the literal $1 to Perl. The shell also expands ${oldoldstable}.
  • The final perl command is much cleaner.

In summary: Your skepticism is well-founded. The triple backslash \\\$1 is not the correct way to represent a Perl backreference in this shell-quoted context. The refactoring approach is good, but the example provided needed a minor quoting correction.

Sources:

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request addresses an 'unbound variable' error in a shell script caused by shell expansion of a Perl backreference. The proposed fix correctly escapes the variable to handle double evaluation. My review includes a suggestion to simplify the Perl command, which would eliminate the need for complex escaping and improve code readability.

# Fetch from archive.debian.org for ${oldoldstable}-backports
perl -pi -e "s{^(deb[^\s]*) https?://[^/]+/debian ${oldoldstable}-backports }
{\$1 https://archive.debian.org/debian ${oldoldstable}-backports }g" "${filename}"
{\\\$1 https://archive.debian.org/debian ${oldoldstable}-backports }g" "${filename}"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

While this change correctly fixes the escaping issue for the $1 backreference, the use of this backreference is the root cause of the escaping complexity and appears to be unnecessary.

The capture group (deb[^\s]*) on the previous line will only ever capture the literal string deb, because the [^\s]* part is followed by a space in the pattern and will match an empty string.

A simpler and more robust solution would be to remove the capture group from the regex on line 172 and replace $1 here with the literal string deb. This would eliminate the need for the complex \\\$ escaping, making the code easier to read and maintain.

Since line 172 is not part of this change, I cannot provide a direct code suggestion, but I recommend applying this simplification in a follow-up.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not true. /deb[^\s]* / matches both deb and deb-src

The LLM is confused here.

@soogata soogata assigned soogata and unassigned soogata Nov 17, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants