-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Feature: support the nbands autoset for monomer jy case #97
Conversation
📝 Walkthrough📝 WalkthroughWalkthroughThe pull request introduces modifications to several files, enhancing the functionality of input reading and validation processes. Key changes include the addition of a default fitting basis in the Changes
Possibly related PRs
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 5
🧹 Outside diff range and nitpick comments (6)
examples/jy.json (5)
1-4
: Clarify the purpose of the "environment" field and consider parameterizing the MPI command.The "environment" field is currently empty. Please clarify its intended use or remove it if it's not needed.
Additionally, consider making the number of MPI processes a configurable parameter rather than hardcoding it to 8. This would allow for easier adjustment based on the available computational resources.
6-11
: Consider using relative paths for pseudopotentials and document parameter choices.The pseudopotential directory is currently specified as an absolute path, which may not be portable across different systems. Consider using a relative path or an environment variable to make it more flexible.
Additionally, it would be helpful to add comments explaining the chosen values for
ecutwfc
,bessel_nao_smooth
,bessel_nao_rcut
, andsmearing_sigma
. This documentation would aid users in understanding the rationale behind these selections and how they might need to be adjusted for different scenarios.
13-18
: LGTM! Consider adding documentation for less obvious parameters.The optimization settings align well with the PR objectives, particularly the use of "jy" as the fitting basis and "atomic" for the spill_guess. The "reduced" jY_type is also consistent with the description.
To improve clarity, consider adding comments or documentation for the less obvious parameters such as
max_steps
andnthreads_rcut
. This would help users understand their impact on the optimization process.
20-35
: LGTM! Consider documenting the rationale for chosen parameters.The reference systems (dimer and trimer) are well-defined with appropriate parameters. The increasing number of bands from dimer to trimer and the consistent lmaxmax value of 2 align well with the PR objectives.
To enhance understanding and maintainability:
- Consider adding comments explaining the rationale behind the chosen
nbands
values for each system.- Document why these specific bond length ranges were selected for each system.
This additional context would be valuable for future users or modifications of this configuration.
36-55
: LGTM! Clarify the meaning and impact of orb_ref values.The orbital configurations are well-structured and align with the PR objectives, particularly the "onion" approach mentioned. The progression in
nbands_ref
values (4, 10, 20) across configurations suggests an increasing level of complexity or accuracy.To improve clarity:
- Consider adding comments to explain the significance of the different
orb_ref
values (none, 0, 1) and how they relate to the "onion" approach for orbital generation.- Document how the
nbands_ref
values were chosen and their impact on the orbital generation process.This additional information would help users understand the orbital generation strategy and its implications.
SIAB/spillage/api.py (1)
696-698
: Improve clarity of the error messageThe error message in the assertion can be rephrased for better understanding. Consider rewording it to clearly convey that the requested number of bands exceeds the calculated bands.
Apply this diff to improve the error message:
- f"ERROR: number of bands for orbgen is larger than calculated: {nband} > {wfc.shape[1]}" + f"ERROR: The requested number of bands ({nband}) exceeds the number of calculated bands ({wfc.shape[1]})."
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (4)
- SIAB/io/read_input.py (5 hunks)
- SIAB/spillage/api.py (2 hunks)
- examples/jy.json (1 hunks)
- examples/pw.json (1 hunks)
🧰 Additional context used
🪛 Ruff
SIAB/io/read_input.py
584-584: Loop control variable
iorb
not used within loop body(B007)
🔇 Additional comments (5)
examples/pw.json (2)
15-15
: Significant change in basis type - please clarifyThe change from
"basis_type": "jy"
to"fit_basis": "pw"
is a substantial modification that could significantly impact the simulation setup. This switch from jy (presumably Jastrow factor) to pw (plane wave) basis might affect various aspects of the calculation.Could you please clarify:
- Is this change intentional and aligned with the PR objectives?
- How does this relate to the stated goal of supporting nbands autoset for the monomer jy case?
- What are the implications of this change on other parameters in the configuration?
Additionally, it would be helpful to run the following script to check for any other occurrences of 'basis_type' or 'fit_basis' in the codebase:
#!/bin/bash # Search for 'basis_type' and 'fit_basis' in all files echo "Occurrences of 'basis_type':" rg --type json 'basis_type' echo "Occurrences of 'fit_basis':" rg --type json 'fit_basis'This will help ensure consistency across the project and identify any potential areas that might need updating due to this change.
Line range hint
1-62
: Verify configuration compatibility with new basis typeWhile the main change is in the basis type, it's crucial to ensure that the rest of the configuration remains compatible with the new "pw" (plane wave) basis. Please review the following aspects:
- Are the
ecutwfc
,bessel_nao_smooth
, andbessel_nao_rcut
parameters still relevant and correctly set for a plane wave basis?- Does the
spill_guess
parameter need any adjustment given the change in basis type?- Are the
reference_systems
andorbitals
configurations still appropriate for a plane wave calculation?To assist in this verification, you might want to run a script to check for any parameters that might need adjustment:
This will help identify any parameters that might need reconsideration given the change in basis type.
✅ Verification successful
Investigate usage of
jY_type
parameterThe
jY_type
parameter appears in the configuration and might be specific to the previous basis type. It's important to verify whether this parameter is still relevant and compatible with the new "pw" basis.To facilitate this, please run the following script to identify any references to
jY_type
in the codebase:
Configuration Compatibility Confirmed
The
jY_type
parameter is utilized across multiple parts of the codebase and is not exclusive to the "jy" basis. Its presence in theexamples/pw.json
is appropriate and compatible with the new "pw" (plane wave) basis.🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # List all keys in the JSON file echo "All keys in the JSON file:" jq 'keys' examples/pw.json # Check for any keys that might be specific to jy basis echo "Keys potentially specific to jy basis:" jq 'keys[] | select(. | test("jy|jastrow|basis_type"; "i"))' examples/pw.jsonLength of output: 615
Script:
#!/bin/bash # Search for references to 'jY_type' in the codebase rg 'jY_type' .Length of output: 948
examples/jy.json (1)
1-56
: Overall, the configuration is well-structured and aligns with PR objectives.The
jy.json
configuration file provides a comprehensive set of parameters for the jy case, including environment settings, optimization parameters, reference systems, and orbital configurations. The structure and values align well with the PR objectives, particularly in supporting the nbands autoset for monomer jy cases.To further enhance this configuration:
- Consider adding a top-level comment explaining the purpose and usage of this configuration file.
- Where possible, use relative paths or environment variables for better portability.
- Add inline comments or documentation for less obvious parameters to improve maintainability.
These improvements will make the configuration more robust and easier for other developers to understand and modify in the future.
SIAB/io/read_input.py (2)
16-16
: Initialize 'result' with default 'fit_basis'Initializing the
result
dictionary with{"fit_basis": "pw"}
ensures that a default fitting basis is set, which enhances the robustness of theread_siab_plaintext
function.
613-614
: Validate parameters before parsingCalling
_validate_param(user_settings)
at the beginning of theparse
function ensures that the user settings are validated before proceeding. This enhancement improves the robustness of the parsing process.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Outside diff range and nitpick comments (4)
SIAB/io/read_input.py (4)
16-16
: LGTM! Consider adding a comment for clarity.The addition of a default "fit_basis" key with the value "pw" is a good improvement. It provides a default configuration that can be overridden if needed.
Consider adding a brief comment explaining the purpose of this default value, such as:
# Set default fitting basis to plane wave (pw) result = {"fit_basis": "pw"}
51-59
: LGTM! Consider simplifying the path handling.The improved handling of the
pseudo_dir
path is excellent. It correctly manages paths starting with "~" and ensures all paths are absolute.You can simplify this code using
os.path.expanduser
andos.path.abspath
together:pseudo_dir = os.path.abspath(os.path.expanduser(result["pseudo_dir"])) print(f"NOTE: Redirecting `pseudo_dir` to {pseudo_dir}", flush=True) result["pseudo_dir"] = pseudo_dirThis approach handles both "~" paths and relative paths in one step, reducing the code complexity.
574-612
: Good addition of input validation. Consider some improvements.The new
_validate_param
function is a valuable addition that will help catch configuration errors early.Consider the following improvements:
- Use custom exceptions instead of assertions for better error handling in production:
class InvalidShapeError(ValueError): pass class InsufficientNbandsError(ValueError): pass # Then in the function: if not isinstance(s, (str, int)): raise InvalidShapeError(f"Shape {s} is not a valid shape") if orb["nbands_ref"] > user_settings["reference_systems"][shape2index[s]]["nbands"]: raise InsufficientNbandsError(f"nbands_ref for orbital {iorb} is larger than the number of bands set for reference system `{s}`")
Consider returning a boolean instead of using assertions, allowing the caller to decide how to handle validation failures.
Add type hints to improve code readability and enable static type checking:
def _validate_param(user_settings: dict) -> bool: # ... function body ... return True # if all validations passThese changes will make the function more robust and easier to use in different contexts.
🧰 Tools
🪛 Ruff
588-588: Loop control variable
iorb
not used within loop body(B007)
301-310
: LGTM! Consider using integer division for nbands_shapemax.The addition of specific handling for the 'jy' case in
fit_basis
is a good improvement. It provides more flexibility in setting the number of bands for the monomer case.In the calculation of
nbands_shapemax
, consider using integer division (//
) instead of regular division (/
) to ensure the result is always an integer:nbands_shapemax = max([i["nbands"] // natom_from_shape(j["shape"]) + 20 for i, j in zip(result, refsys)])This will prevent any potential issues with floating-point numbers being used for the number of bands.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (1)
- SIAB/io/read_input.py (5 hunks)
🧰 Additional context used
🪛 Ruff
SIAB/io/read_input.py
588-588: Loop control variable
iorb
not used within loop body(B007)
🔇 Additional comments (2)
SIAB/io/read_input.py (2)
617-618
: LGTM! Good addition of input validation.Adding the call to
_validate_param
at the beginning of theparse
function is an excellent improvement. This ensures that input parameters are validated before any processing occurs, which can prevent errors and improve the robustness of the code.
286-299
: Excellent addition of explanatory comments!The new block of comments provides valuable context about the changes introduced in ABACUS-ORBGEN v3.0, particularly regarding the inference of
nzeta
and the handling of the monomer case. These comments significantly improve the code's maintainability and readability.The detailed explanation of the different approaches for setting the number of bands in the monomer case is particularly helpful for understanding the code's behavior.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Outside diff range and nitpick comments (4)
SIAB/spillage/api.py (2)
389-399
: Improved error handling, but minor enhancement possibleThe additions to the error handling are good improvements. The detailed error message will help developers understand and debug potential issues more easily.
However, to fully address the previous review comment about potential IndexError, consider modifying line 389 as follows:
-assert len(ib) > 0, "ERROR: no band index found" +assert ib and len(ib[0]) > 0, "ERROR: no band index found"This change ensures that
ib
is not empty before accessingib[0]
, fully addressing the potential IndexError.🧰 Tools
🪛 Ruff
390-390: Ambiguous variable name:
l
(E741)
390-390
: Consider clarifying variable name 'l'While 'l' is commonly used to represent angular momentum in physics and chemistry calculations, it can be ambiguous in certain contexts or fonts. To improve code readability, consider using a more descriptive name such as
angular_momentum
or adding a comment to clarify its meaning.-for l, (nz, nz0) in enumerate(zip(nzeta, excluded))] +for l, (nz, nz0) in enumerate(zip(nzeta, excluded))] # l represents angular momentum🧰 Tools
🪛 Ruff
390-390: Ambiguous variable name:
l
(E741)
SIAB/io/read_input.py (2)
279-305
: Approved: Enhanced handling ofnbands
for monomer caseThe new implementation for setting
nbands
in the monomer case is a significant improvement. It uses a mixed approach, calculating bothnbands_lmax
andnbands_shapemax
to determine the appropriate number of bands. The distinction between "pw" and "jy" fit basis types is well-handled, and the informative print statements are helpful for understanding the autoset process.One minor suggestion for improvement:
Consider extracting the calculation of
nbands_shapemax
into a separate function for better readability and potential reuse. For example:def calculate_nbands_shapemax(result, refsys): return max([i["nbands"]//natom_from_shape(j["shape"]) + 20 for i, j in zip(result, refsys)]) # Then in the main function: nbands_shapemax = calculate_nbands_shapemax(result, refsys)This would make the main function slightly cleaner and easier to read.
569-607
: Approved: Good addition of input parameter validationThe new
_validate_param
function is a valuable addition that enhances the robustness of the code by validating input parameters. It checks for the validity of shapes assigned to orbitals and ensures that the number of bands set for reference systems is sufficient for fitting orbitals.Suggestion for improvement:
Consider using custom exceptions instead of assertions for better error handling and user feedback. For example:
class InvalidShapeError(ValueError): pass class InsufficientBandsError(ValueError): pass def _validate_param(user_settings: dict): # ... existing code ... for s in shape: if not isinstance(s, (str, int)): raise InvalidShapeError(f"Shape {s} is not a valid shape") if isinstance(s, str) and s not in shape2index: raise InvalidShapeError(f"Shape {s} is not found in reference systems") # ... existing code ... if orb["nbands_ref"] > user_settings["reference_systems"][shape2index[s]]["nbands"]: raise InsufficientBandsError(f"ERROR: `nbands_ref` for orbital {iorb} is larger than the number of bands set for reference system `{s}`") # ... rest of the function ...This approach would provide more informative error messages and allow for more flexible error handling in the calling code.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (2)
- SIAB/io/read_input.py (6 hunks)
- SIAB/spillage/api.py (2 hunks)
🧰 Additional context used
📓 Learnings (1)
SIAB/io/read_input.py (2)
Learnt from: kirk0830 PR: kirk0830/ABACUS-ORBGEN#97 File: SIAB/io/read_input.py:504-0 Timestamp: 2024-10-07T14:16:14.290Z Learning: In the `skip_ppread` function, returning `False` when the 'element' key is missing in `user_settings` is intentional, as it indicates that the pseudopotential read-in cannot be skipped.
Learnt from: kirk0830 PR: kirk0830/ABACUS-ORBGEN#97 File: SIAB/io/read_input.py:504-508 Timestamp: 2024-10-07T13:44:50.084Z Learning: In the `skip_ppread` function in `SIAB/io/read_input.py`, returning `False` when the 'element' key is missing in `user_settings` is intentional, as it indicates that the pseudopotential read-in cannot be skipped.
🪛 Ruff
SIAB/spillage/api.py
390-390: Ambiguous variable name:
l
(E741)
🔇 Additional comments (5)
SIAB/spillage/api.py (1)
697-699
: Good addition for input validationThe new assertion is a valuable addition to the
_nzeta_infer
function. It ensures that the number of bands for orbital generation doesn't exceed the number of calculated bands, preventing potential runtime errors. The error message is clear and informative, which will help developers quickly identify and resolve issues.SIAB/io/read_input.py (4)
16-16
: Approved: Good addition of a default valueThe addition of a default value for "fit_basis" is a good practice. It provides a fallback option if not specified in the input file, improving the robustness of the function.
52-54
: Approved: Improved path handling forpseudo_dir
The new implementation for handling the
pseudo_dir
path is more robust and versatile. By usingos.path.abspath(os.path.expanduser(result["pseudo_dir"]))
, it correctly handles paths starting with a tilde (~) and ensures the path is absolute. This change resolves the previous limitation where the tilde symbol couldn't be understood byos.path.abspath
.
612-613
: Approved: Good addition of input validationThe addition of the
_validate_param(user_settings)
call at the beginning of theparse
function is a good practice. It ensures that the input parameters are validated early in the process, which can help catch potential issues before they cause problems in later stages of the code. This improves the overall robustness and reliability of the parsing process.
Line range hint
1-698
: Overall: Solid improvements to input handling and validationThis review has covered several significant improvements to the
SIAB/io/read_input.py
file:
- Enhanced path handling for
pseudo_dir
usingos.path.abspath
andos.path.expanduser
.- Improved calculation of
nbands
for the monomer case in theabacus_settings
function.- Addition of a new
_validate_param
function for input parameter validation.- Integration of the validation step in the
parse
function.These changes collectively enhance the robustness, flexibility, and error-handling capabilities of the input processing. The code is now better equipped to handle various input scenarios and provide more informative feedback in case of issues.
While some minor suggestions for improvement were made (such as function extraction and custom exception handling), the overall changes are well-implemented and significantly improve the quality of the code.
In ABACUS-ORBGEN, the introduction of spill_guess atomic requires an implicit dft calculation on monomer (single atom). Most parameters can be shared, such as ecutwfc, ecutrho, etc, but the nbands, lmaxmax should be set with much care.
The lmaxmax is the maximal angular momentum (among all atom types, but in present orbital generation design this is not considered). For PW, each planewave can have components with angular momentum from 0 to infinity, so an explicit lmaxmax is needed to determine the range of information that extract from PW calculation. So in PW case, the lmaxmax should be the maximal of all orbitals required to generate.
For isolated atom, partially according to Hund's rule, it is possible to get a rough estimation of number of bands that can include significant component with lmaxmax angular momentum, which is done by a function called
cal_nbands_fill_lmax
function. So now thenbands
andlmaxmax
both can have their values.However, for jy, the logic is somewhat inverted. Instead of inferring lmaxmax from zeta_notation that user defined what orbital wants to generate, now one should always give a lmaxmax value for each reference structure to calculate because lmaxmax is required in jy basis preparation. Then in the range of bands calculated for each structure, a range of bands will be used to generate orbitals. These orbitals (or say radial functions) will have different angular momentums and should get good initial guess from atomic calculation. In the "onion" way to generate orbitals, radial 1 may refer to band 1st, 2nd and 3rd, then with the same angluar momentum the radial 2 will refer to band other than 1, 2 and 3. The number of bands to calculate in monomer case should avoid the case in which all the bands are depleted.
Summary by CodeRabbit
Release Notes
New Features
Bug Fixes
Documentation
Refactor
Configuration Changes