Fixed invalid JSON format from the logs #588
base: master
Conversation
Walkthrough

Updates include a docstring edit in bayes_opt/logger.py and the addition of a new utility function load_logs in bayes_opt/util.py to import observations from JSON log files into a BayesianOptimization instance.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant User
    participant Util as util.load_logs
    participant FS as FileSystem
    participant JSON as json
    participant BO as BayesianOptimization

    User->>Util: load_logs(optimizer, logs)
    loop For each path in logs
        Util->>FS: open(path)
        FS-->>Util: file handle
        Util->>JSON: json.load(file)
        alt JSON decode error
            JSON-->>Util: error
            Util-->>Util: skip file
        else OK
            loop For each iteration in data
                Util->>BO: check duplicate (params)
                alt duplicate and not allowed
                    BO-->>Util: exists -> skip
                else not duplicate
                    Util->>BO: register(params, target, constraint_value?)
                end
            end
        end
    end
    Util-->>User: return optimizer
```
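In plain terms, the flow above replays logged observations into a fresh optimizer. A minimal usage sketch (the objective function, bounds, and log file name here are illustrative, not part of the PR):

```python
from bayes_opt import BayesianOptimization
from bayes_opt.util import load_logs  # the helper added in this PR

# Hypothetical objective and bounds, purely for illustration.
def black_box(x):
    return -(x - 1) ** 2

optimizer = BayesianOptimization(f=black_box, pbounds={"x": (-5, 5)}, verbose=0)

# Replay previously logged observations into the fresh optimizer.
optimizer = load_logs(optimizer, logs=["./bayes_opt_log.json"])
print(len(optimizer.space), "points loaded")
```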
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~8 minutes
Actionable comments posted: 3
🧹 Nitpick comments (1)
bayes_opt/util.py (1)
Lines 18-32: Clarify and correct the docstring (class name, accepted types, and return type)

The docstring is incomplete and references “BayesianOptimizer”. Tighten it up and describe Iterable inputs.
- """Load previous ... + """Load observations from one or more JSON log files into an optimizer. @@ - optimizer : BayesianOptimizer - Optimizer the register the previous observations with. + optimizer : BayesianOptimization + Optimizer to register the previous observations with. @@ - logs : str or os.PathLike - File to load the logs from. + logs : str | os.PathLike | Iterable[str | os.PathLike] + Path or iterable of paths to JSON log files to load. @@ - The optimizer with the state loaded. + BayesianOptimization + The optimizer after loading observations. """
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
- bayes_opt/logger.py (1 hunks)
- bayes_opt/util.py (1 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (1)
bayes_opt/util.py (2)
bayes_opt/bayesian_optimization.py (2)
- BayesianOptimization (42-445)
- register (178-208)

bayes_opt/target_space.py (3)
- register (424-518)
- params (149-156)
- target (159-166)
🪛 Ruff (0.12.2)
bayes_opt/util.py
16-16: Undefined name `BayesianOptimization` (F821)
17-17: Undefined name `BayesianOptimization` (F821)
```python
# Read current data
with self._path.open("r") as f:
    fileData = json.load(f)

# Append next data point
fileData.append(data)

# Writes content back to a file
with self._path.open("w") as f:
    json.dumps(fileData)

params_config: Mapping[str, ParamsType]
```
🛠️ Refactor suggestion
Remove stray JSON I/O pseudo-code from the docstring; it’s misleading and incorrect
This block references undefined identifiers (self._path, data), and json.dumps() is called without writing to the file. It also interrupts the Parameters section. If you want to document JSON logging, put it under the appropriate file-logger class, not ScreenLogger.
Apply this diff to drop the block:
```diff
-# Read current data
-with self._path.open("r") as f:
-    fileData = json.load(f)
-
-# Append next data point
-fileData.append(data)
-
-# Writes content back to a file
-with self._path.open("w") as f:
-    json.dumps(fileData)
-
```
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```suggestion
params_config: Mapping[str, ParamsType]
```
🤖 Prompt for AI Agents
In bayes_opt/logger.py around lines 274 to 285, remove the stray JSON I/O
pseudo-code block (references to self._path, data, and the incorrect
json.dumps() call) from the ScreenLogger docstring so it no longer interrupts
the Parameters section; if JSON logging needs documenting, add a concise,
correct example under the dedicated file-logger class or module docs and, when
showing real code, use json.dump(fileData, f) to write to the file.
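For reference, here is what a corrected version of that read-append-write cycle could look like, using json.dump instead of json.dumps. This is a sketch, not code from the PR; the helper name and the assumption that the log file already holds a JSON array are illustrative:

```python
import json
from pathlib import Path

def append_observation(path: Path, data: dict) -> None:
    """Append one observation to a log file that stores a JSON array."""
    # Read current data
    with path.open("r", encoding="utf-8") as f:
        file_data = json.load(f)

    # Append next data point
    file_data.append(data)

    # Write content back to the file: json.dump writes to the handle,
    # whereas json.dumps only returns a string and discards it here.
    with path.open("w", encoding="utf-8") as f:
        json.dump(file_data, f)
```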
```python
import json
from collections.abc import Iterable
from os import PathLike
from pathlib import Path

import numpy as np

# from bayes_opt.bayesian_optimization import BayesianOptimization
```
Fix undefined type reference and imports; address Ruff F821 and future-proof typing
Ruff flags BayesianOptimization as undefined. Import it under TYPE_CHECKING to avoid runtime deps/cycles. Also import NotUniqueError for duplicate handling and warnings for diagnostics.
```diff
+import warnings
 import json
 from collections.abc import Iterable
 from os import PathLike
 from pathlib import Path

 import numpy as np

-# from bayes_opt.bayesian_optimization import BayesianOptimization
+from typing import TYPE_CHECKING
+from bayes_opt.target_space import NotUniqueError
+
+if TYPE_CHECKING:
+    from bayes_opt.bayesian_optimization import BayesianOptimization
```
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
import warnings
import json
from collections.abc import Iterable
from os import PathLike
from pathlib import Path

import numpy as np

from typing import TYPE_CHECKING
from bayes_opt.target_space import NotUniqueError

if TYPE_CHECKING:
    from bayes_opt.bayesian_optimization import BayesianOptimization
```
🤖 Prompt for AI Agents
In bayes_opt/util.py around lines 5 to 13, Ruff reports an undefined reference
to BayesianOptimization, and NotUniqueError and warnings are also needed for
duplicate handling and diagnostics; add "from typing import TYPE_CHECKING" and
move the BayesianOptimization import into an if TYPE_CHECKING: block so it’s
only used for type hints (avoids runtime deps/cycles), and add imports for
NotUniqueError (from bayes_opt.target_space, matching the diff above) and
warnings at the top of the file so duplicate errors and warnings can be handled.
```python
    if isinstance(logs, (str, PathLike)):
        logs = [logs]

    for log in logs:
        try:
            with Path(log).open("r") as fil:
                fileData = json.load(fil)
        except json.JSONDecodeError:
            print(f"ERROR: JSON decode error when decoding '{log}'")
            continue

        for iteration in fileData:
            # Prevents duplicate points being registered when an exception can be raised
            if not optimizer._allow_duplicate_points and iteration["params"] in optimizer:
                continue

            optimizer.register(
                params=iteration["params"],
                target=iteration["target"],
                constraint_value=(iteration["constraint"] if optimizer.is_constrained else None),
            )

    return optimizer
```
🛠️ Refactor suggestion
Refactor duplicate detection and error handling; avoid invalid membership test and improve robustness

- `iteration["params"] in optimizer` is unsafe unless BayesianOptimization implements `__contains__`. Rely on TargetSpace.register to detect duplicates and catch NotUniqueError.
- Handle missing keys and constraint values gracefully.
- Prefer warnings over prints; handle file I/O errors and non-list JSON roots.
- Use snake_case variable names and explicit UTF-8.
```diff
     if isinstance(logs, (str, PathLike)):
         logs = [logs]

     for log in logs:
-        try:
-            with Path(log).open("r") as fil:
-                fileData = json.load(fil)
-        except json.JSONDecodeError:
-            print(f"ERROR: JSON decode error when decoding '{log}'")
-            continue
-
-        for iteration in fileData:
-            # Prevents duplicate points being registered when an exception can be raised
-            if not optimizer._allow_duplicate_points and iteration["params"] in optimizer:
-                continue
-
-            optimizer.register(
-                params=iteration["params"],
-                target=iteration["target"],
-                constraint_value=(iteration["constraint"] if optimizer.is_constrained else None),
-            )
+        path = Path(log)
+        try:
+            with path.open("r", encoding="utf-8") as fil:
+                file_data = json.load(fil)
+        except json.JSONDecodeError:
+            warnings.warn(f"Skipping '{path}': invalid JSON", RuntimeWarning)
+            continue
+        except OSError as e:
+            warnings.warn(f"Skipping '{path}': {e}", RuntimeWarning)
+            continue
+
+        if not isinstance(file_data, list):
+            warnings.warn(f"Skipping '{path}': expected a JSON array of iterations", RuntimeWarning)
+            continue
+
+        for iteration in file_data:
+            try:
+                params = iteration["params"]
+                target = iteration["target"]
+                constraint_value = iteration.get("constraint") if optimizer.is_constrained else None
+                optimizer.register(params=params, target=target, constraint_value=constraint_value)
+            except KeyError as e:
+                warnings.warn(f"Skipping malformed iteration in '{path}': missing key {e}", RuntimeWarning)
+                continue
+            except NotUniqueError:
+                # Duplicate point; skip when duplicates are not allowed
+                continue
+            except ValueError as e:
+                # e.g., missing constraint value when constrained
+                warnings.warn(f"Skipping iteration from '{path}': {e}", RuntimeWarning)
+                continue

     return optimizer
```
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
    if isinstance(logs, (str, PathLike)):
        logs = [logs]

    for log in logs:
        path = Path(log)
        try:
            with path.open("r", encoding="utf-8") as fil:
                file_data = json.load(fil)
        except json.JSONDecodeError:
            warnings.warn(f"Skipping '{path}': invalid JSON", RuntimeWarning)
            continue
        except OSError as e:
            warnings.warn(f"Skipping '{path}': {e}", RuntimeWarning)
            continue

        if not isinstance(file_data, list):
            warnings.warn(f"Skipping '{path}': expected a JSON array of iterations", RuntimeWarning)
            continue

        for iteration in file_data:
            try:
                params = iteration["params"]
                target = iteration["target"]
                constraint_value = iteration.get("constraint") if optimizer.is_constrained else None
                optimizer.register(
                    params=params,
                    target=target,
                    constraint_value=constraint_value,
                )
            except KeyError as e:
                warnings.warn(f"Skipping malformed iteration in '{path}': missing key {e}", RuntimeWarning)
                continue
            except NotUniqueError:
                # Duplicate point; skip when duplicates are not allowed
                continue
            except ValueError as e:
                warnings.warn(f"Skipping iteration from '{path}': {e}", RuntimeWarning)
                continue

    return optimizer
```
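A quick way to sanity-check the skip-and-warn behavior of this suggestion would be a small pytest case along these lines (a hypothetical sketch: it assumes the warning text from the refactor above and uses an objective-free optimizer; nothing here is from the PR itself):

```python
import json
import warnings

from bayes_opt import BayesianOptimization
from bayes_opt.util import load_logs


def test_load_logs_skips_invalid_json(tmp_path):
    optimizer = BayesianOptimization(f=None, pbounds={"x": (-5, 5)}, verbose=0)

    bad = tmp_path / "bad.json"
    bad.write_text("{not json", encoding="utf-8")

    good = tmp_path / "good.json"
    good.write_text(json.dumps([{"params": {"x": 1.0}, "target": 0.0}]), encoding="utf-8")

    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        load_logs(optimizer, logs=[bad, good])

    # The malformed file is skipped with a warning; the valid one is registered.
    assert any("invalid JSON" in str(w.message) for w in caught)
    assert len(optimizer.space) == 1
```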
Codecov Report

❌ Patch coverage is

Additional details and impacted files

```
@@            Coverage Diff             @@
##           master     #588      +/-   ##
==========================================
- Coverage   97.80%   96.68%   -1.13%     
==========================================
  Files          10       10              
  Lines        1186     1205      +19     
==========================================
+ Hits         1160     1165       +5     
- Misses         26       40      +14     
```

☔ View full report in Codecov by Sentry.
Hey @Fdms-3741, thanks for opening a PR; contributions are very welcome in this repository. That being said, I'm a bit confused. Is this LLM-generated code? I'm reasonably sure that the state-saving produces valid JSON; however, even if it did not, this would not be the right way to fix the problem.
The previous method saved data in an invalid JSON format that made it incompatible with any JSON parser. This modification changes the save and load functionality so that it writes and reads valid JSON.
Warning: This will break imports of previously saved data. The data itself will not be altered; thus, manual modification is required to turn the old log files into valid JSON files.
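For anyone holding old log files, a one-off conversion script along the following lines should suffice. This is a hedged sketch: it assumes the old format was one JSON object per line (JSON Lines), and the file names are illustrative:

```python
import json
from pathlib import Path

old_path = Path("old_logs.json")  # old format: one JSON object per line
new_path = Path("logs.json")      # new format: a single JSON array

records = []
with old_path.open("r", encoding="utf-8") as f:
    for line in f:
        line = line.strip()
        if line:  # skip blank lines
            records.append(json.loads(line))

with new_path.open("w", encoding="utf-8") as f:
    json.dump(records, f)
```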
Summary by CodeRabbit

- New Features
- Documentation