fix: Fable5 scan report fixes Part 1 refs #953#955
Open
ilayfalach wants to merge 23 commits into
Open
Conversation
Tests were leaking MongoDB projects and on-disk directories: - A project stays in getProjectList() as long as any document carries its name, including the hidden <projectName>__config__ Cache document created on every Project() construction. Teardowns deleted only measurement docs, so the config doc kept every project alive. - Project.__init__ unconditionally creates ~/.hera/<projectName>; nothing removed it. - The session fixture persisted an ephemeral mkdtemp path into the durable DB config, so later Project() opens resurrected the tmp dir via makedirs. Fix: - Add purge primitives that delete all docs (incl. the config doc, via the collection layer) and remove on-disk dirs. - Add a session-scoped autouse safety-net that purges every project created during the session plus leftover test-named projects/dirs. Only touches test artifacts; defaultProject and pre-existing data are never modified. - Pass filesDirectory at construction so ~/.hera/<name> is never created, and purge the config doc on teardown so tmp dirs cannot be resurrected. - dynamic_loading conftest now purges the DB, not just the disk dir. Verified: 275 passed; zero new DB/dir/tmp artifacts; idempotent across runs. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
config.json contains personal MongoDB credentials committed to the public repo. Untrack with git rm --cached; add to .gitignore to prevent future accidental commits. Password rotation and git history scrub (git filter-repo) are required follow-up steps by the repo maintainer. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Debug-level logging of the full mongoConfig dict exposes plaintext passwords to log files. Replace with a filtered copy that omits 'password'. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
MONGO_ADMIN_USER/PWD and MONGO_HERA_USER/PWD env vars now control the credentials provisioned by the init scripts and Dockerfile, with the old hardcoded values as defaults. Production deployments must override these. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…link [1.4] refs #953 ~16 os.system() call sites across 6 files replaced: - Luigi execution: subprocess.run([list]) instead of f-string shell - symlinks: os.symlink() instead of ln -s - rm -r: shutil.rmtree() - cp -rf *: glob + shutil.copytree/copy2 - foamJob: subprocess.run([list]) with flag handling - sed pipeline: subprocess.run([list]) without shell=True - ./Allrun, ./a.out: subprocess.run([list]) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…rToUnum [1.5, 4.1] refs #953 parsers.py: eval(f'count_{inst}') replaced with a plain dict counter; eliminates arbitrary code execution from metadata instrument names. unitHandler.py: eval(str(value)) removed from the deprecated strToUnum; method now returns value unchanged with a migration note. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… handler without sys.path mutation [1.6, 3.3] refs #953 toolkit.py: validate toolkitPath exists and doesn't shadow stdlib modules before sys.path.insert(0, ...). datahandler.py: Class getData() now uses importlib.util.spec_from_file_location for resource-based fallback — sys.path is never mutated by DB-supplied paths. project.py: pickle.dump/load annotated with # nosec B301 and a migration note; full JSON replacement deferred to Part 2 to avoid breaking the export format. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
#953 Replace all mutable default-argument literals (`desc={}`, `getDataParams={}`, `actionList=[]`, `excludeFields=[]`) with `None` + guard idiom across: - hera/toolkit.py (addCacheDocument, addMeasurementsDocument, addSimulationsDocument) - hera/datalayer/project.py (same three methods) - hera/datalayer/datahandler.py (all 15 getData signatures + two body guards) - hera/datalayer/autocache.py (cacheFunction + cacheDecorators.__init__) - hera/utils/query.py (andClause) - hera/riskassessment/protectionpolicy/ProtectionPolicy.py (__init__) Prevents shared-state bugs where callers mutate the default dict/list across calls. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- srtm_datasource.json: add isRelativePath=True and convert absolute path to repo-relative path (hera/tests/UNIT_TEST_GIS_RASTER_TOPOGRAPHY/) - latex.py: remove __main__ block containing /home/yehudaa/... path - ml.py: remove __main__ block containing /ibdata2/nirb/... paths Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…2.7] refs #953 - project.py getAllDocuments: pass **desc instead of desc=desc so filter kwargs are correctly forwarded to each collection's getDocuments query - topography.py:298: fix getCacheDcouments → getCacheDocuments (typo) and spread the desc dict as **kwargs to match the correct signature Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…refs #953 - riskAreas.py: remove duplicate toMeteorologicalAngle/toMathematicalAngle lambdas; import the canonical implementations from hera.utils instead - turbulencestatistics.py: replace two apply(lambda x: 270-x ...) calls with apply(toMeteorologicalAngle) using the same import Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…refs #953 - wrfDatalayer.py: import WSG84/ITM from hera.measurements.GIS; replace raw 2039 and 4326 literals (geo.crs=2039, to_crs(epsg=4326), etc.) - thresholdGeoDataFrame.py: import ITM; replace {"init":"epsg:2039"} dict (old geopandas syntax) with ITM integer constant - buildings/analysis.py: import ITM; replace crs=2039 in __main__ block - topography.py: import ITM; replace desc.update({'crs': 2039}) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…#953 - toolkit.py OF_LSM registry: fix cls path from hera.simulations.openFoam.LSM.toolkit to hera.simulations.openFoam.lagrangian.LSM.toolkit.OFLSMToolkit - getDataSourceDocument: remove setConfig() side-effect; a read-only method must not write to the DB — callers who need a persisted default should call setDataSourceDefaultVersion() explicitly Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
….6, 3.7] refs #953 - abstractToolkit.__init__: add **kwargs so subclass constructors can pass extra keyword arguments without triggering TypeError on unexpected params - project.py: remove 'from hera import toolkit' circular import; the name was only referenced in docstring strings, never in actual code Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Bare except: catches KeyboardInterrupt, SystemExit, and GeneratorExit — signals that should propagate. Narrowed to except Exception: in the two affected locations: - windProfile/toolkit.py: IMS API retry loop - utils/data/CLI.py: datasource listing (two occurrences) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…refs #953 refs #943 5.3: Replace slow Project-based module-level MongoDB probe with a direct pymongo.MongoClient ping (serverSelectionTimeoutMS=1000) in both test_datalayer.py and test_experiment.py. The old probe triggered pymongo's 30 s server-selection timeout per test module during collection when MongoDB was down, freezing CI. The fast probe returns in ≤1 s. 5.4: Remove outer try-except in compare_outputs that swallowed all comparison crashes and silently returned False. Comparison errors now propagate as test errors instead of silent mismatches. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Replace AI-generated boilerplate that described Hera as a 'Django-based GIS web system' (completely wrong) with an accurate description of the actual stack: Python 3, MongoDB/mongoengine, pandas/dask/geopandas/xarray, domain-specific toolkits (GIS, meteorology, simulations, risk assessment). Remove the 'Live Demo' link that pointed to the generic Kaplan org page. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…] refs #953 6.2: Read version dynamically from hera/__init__.__version__ using regex so pip reports the correct version (2.16.3) instead of 0.0.0. 6.3: Add install_requires with the 11 core runtime dependencies that every hera import path needs (pandas, numpy, mongoengine, pymongo, pint, deprecated, scipy, xarray, dask, geopandas, shapely). This lets 'pip install pyhera' work for end-users without a separate requirements.txt step. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… 6.5, 6.6] refs #953 6.4: TEST_UI.md: replace all 6 occurrences of '/home/eran/Code/hera' with relative paths so every developer's machine works correctly. 6.5: Untrack 3 conda tarballs from git (gdal, icu, nodejs for Python 3.6, ~31 MB). Add *.tar.bz2 and *.conda under 'additional installations/' to .gitignore to prevent re-adding. Files kept on disk. 6.6: meta.yaml: bump version to 2.16.3, replace dead internal git URL with public GitHub URL, expand run dependencies to match setup.py install_requires, remove pytables (absent from requirements.txt). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…#953 Documents every finding from the 2026-06-11 Fable5 code review with one of three statuses: Fixed (32 items), Not an Issue (1 item), or Postponed (18 items). Provides commit references for all fixes and rationale for postponed items. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…port refs #953 - hermesWorkflowToolkit.py (main + LSM): combine master's buildLuigiExecutionCommand (scheduler/dispatch_id support) with our subprocess.run([shlex.split(...)]) security fix instead of os.system - latex.py: keep our __main__ removal (hardcoded dev path) over master's revert Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…topography [2.10] refs #953 buildings/analysis.py: from .... import ITM → from ...utils import ITM (4 dots reached hera.measurements which does not export ITM; 3 dots + .utils reaches hera.measurements.GIS.utils which does) topography.py: from ... import ITM → from ..utils import ITM (same root cause, one fewer level up from GIS/vector/) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
… [1.6, 3.3] refs #953 When resource points to the top-level package dir (e.g. /path/mypkg) and classpath is mypkg.impl.MyClass, the old code built: /path/mypkg/mypkg/impl.py ← double package name, file not found Fix: strip the leading package component from the file path when os.path.basename(resource) matches the first dotted component of module_name. Tested by test_dynamic_class_loading_via_getdata. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Implements 32 of 51 findings from the Fable5 automated code-scan report. All Critical (🔴) and Major (🟠) items that could be fixed without a full architectural redesign are addressed. Full status tracked in
SCAN_STATUS.md.Security (all 7 items fixed)
config.jsonfrom git tracking (live MongoDB credentials were committed) [1.1]mongo-init.d,dockerfile,init_with_mongo.sh) [1.3]os.system(f-string)withsubprocess.run([list]),os.symlink,shutil.*across 5 simulation files [1.4]eval()on data-controlled strings (parsers.pydict counter;unitHandler.pydeprecated) [1.5]sys.pathbefore inserting DB-supplied paths (existence + stdlib-shadow check); replacesys.path.insertin class handler withimportlib.util; annotate unavoidable pickle usage with# nosec B301[1.6]subprocesswithshell=Trueto argument-list form [1.7]Data Layer (7 of 13 fixed)
desc={}/getDataParams={}/actionList=[]across 6 files [2.2]srtm_datasource.json,latex.py,ml.py) [2.3]getAllDocumentssilently returning empty results (desc=desc→**desc) [2.5]getCacheDcoumentstypo →getCacheDocuments[2.7]hera.utils.toMeteorologicalAngle/toMathematicalAngle[2.9]WSG84/ITMnamed constants across 4 files [2.10]Architecture (6 of 11 fixed)
OF_LSMregistry cls path [3.1]setConfig()write fromgetDataSourceDocumentgetter [3.4]**kwargstoabstractToolkit.__init__[3.6]datalayer → toolkit[3.7]Code Quality (3 of 6 fixed)
except:withexcept Exception:inwindProfile/toolkit.pyandutils/data/CLI.py[4.2]Testing & CI (3 of 6 fixed / 1 confirmed not an issue)
pymongo.MongoClient(serverSelectionTimeoutMS=1000)[5.3]try/exceptfromcompare_outputsthat was swallowing test failures [5.4]Packaging & Hygiene (6 of 8 fixed)
version=dynamically fromhera/__init__.__version__insetup.py[6.2]setup.py install_requires[6.3]/home/eranpaths inui/client/TEST_UI.md[6.4]git rm --cached; add pattern to.gitignore[6.5]meta.yamlversion, git URL, and dependencies [6.6]Postponed items
Two Critical items are not in this PR due to architectural scope:
import heraperforms network I/O, filesystem writes, and DB writes at import time. Requires a full lazy-connection redesign. Tracked in a follow-up issue.simulations/(~70 modules) andriskassessment/(~17 modules). Substantial test-writing effort, tracked under Fix Matplotlib 3.10 compatibility in toGeopandas and expand toolkit unit test coverage #943 and a follow-up issue.Full item-by-item status in
SCAN_STATUS.md.Test plan
pytest hera/tests/passes (or skips cleanly when MongoDB is unavailable)git ls-files config.jsonreturns nothinggrep -rn "os\.system(" hera/simulations/returns no live hitsgrep -rn "getCacheDcouments" hera/returns no hitspython -c "import hera; from hera import Project"succeeds without circular import error🤖 Generated with Claude Code