Skip to content

Conversation

@patriciapampanelli
Copy link
Collaborator

Adds JSON config file support alongside YAML, enabling users to use either format for garak configuration files. Extension-less config lookups (--config fast) now work JSON-only, while YAML configs require explicit .yaml extension.

issue #913

Extension-less Lookup: JSON-Only

  • Extension-less lookups (--config fast) only work for JSON files
  • YAML files must specify explicit .yaml extension

Behavior

User Input .json exists .yaml exists Behavior
--config fast Yes No Uses fast.json
--config fast No Yes Error: YAML needs explicit .yaml extension
--config fast Yes Yes Warning + uses .json
--config fast.yaml Any Yes Uses fast.yaml (explicit)
--config fast.json Yes Any Uses fast.json (explicit)

Site Config

  • Both garak.site.json and garak.site.yaml supported
  • Errors if both exist

Copy link
Collaborator

@jmartin-tech jmartin-tech left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks you, some thoughts for consideration.

probe_spec: latentinjection
If we save this as ``latent1.yaml`` somewhere, then we can use it with ``garak --config latent1.yaml``.
Note: YAML configs require the explicit ``.yaml`` extension.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should these also support .yml? This seems to be a somewhat contentious question on the internet. :-)

Comment on lines +165 to +168
if settings_filename.endswith(".json"):
settings = json.load(settings_file)
else:
settings = yaml.safe_load(settings_file)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this seems inverted based on the assumptions and resolutions in the description:

Suggested change
if settings_filename.endswith(".json"):
settings = json.load(settings_file)
else:
settings = yaml.safe_load(settings_file)
if settings_filename.endswith(".yaml"):
settings = yaml.safe_load(settings_file)
else:
settings = json.load(settings_file)

Consider expanding this to have a try block that can be used to retry load with the other format for cases where the filename does not contain a supported extension.

This would account for user provided cli arguments like:

  • --config my_config
  • --config my_config.file
  • --config my_config.yml
  • --config my_config.yaml
  • --config my_config.json

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would recommend accepting yaml, yml, json extensions in both upper & lower case

-- or we could even have --config_json and --config_yaml options, ignoring extension, and default to --config_yaml

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like both of these ideas, and we should consider just adding error handling here.



def test_load_json_config():
importlib.reload(_config)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider using a fixture to reload before and after each test.

@pytest.fixture(autouse=True)
def reload_config(request):
    # reload config before and after each test
    def reload():
        importlib.reload(_config)

    reload()
    request.addfinalizer(reload)

Testing is needed to determine how this interacts with the allow_site_config fixture which may require marking tests that needs this vs setting autouse=True.

Copy link
Collaborator

@leondz leondz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks pretty thorough. Maybe we can re-think how we do switching, and put the burden onto the user to specify their intent explicitly wrt. extension, rather than having to guess it.

Would also like deeper testing of edge cases & loading behaviour from YAML and JSON loading, so we know how to write garak configs in both these formats, and so we know we're treating them both evenly and not e.g. misapplying fixes on YAML loading to results from JSON loading

Comment on lines +165 to +168
if settings_filename.endswith(".json"):
settings = json.load(settings_file)
else:
settings = yaml.safe_load(settings_file)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would recommend accepting yaml, yml, json extensions in both upper & lower case

-- or we could even have --config_json and --config_yaml options, ignoring extension, and default to --config_yaml

fq_site_config_filename = str(transient.config_dir / site_config_filename)
if os.path.isfile(fq_site_config_filename):
settings_files.append(fq_site_config_filename)
if site_config_filename == "garak.site.yaml":
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what does this if intent to test for? the literal of a .yaml makes me think it could be re-thought

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would appreciate a more detailed test of YAML and JSON loading - perhaps a test version of garak.core.yaml in JSON, for example - to be sure that we're getting the same results out of both YAML and JSON loaders. YAML->Python has pecularities around Nones, for example, and it's not impossible JSON->Python has its own idiosyncracies.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

garak.core.yaml is not standard user controlled as it should only source from the distributed install location. The core config should not support a json version being found. For other files further more complex examples of each format would be reasonable though I think the current tests cover the required set of expectations.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants