All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
- Fixed bug in implementation of `.fit` method of VIPRS models. Specifically, there was an issue with the `continued=True` flag not working because the `OptimizeResult` object wasn't refreshed.
- Replaced `print` statements with `logging` where appropriate (still needs some more work).
- Updated the way we measure peak memory in `viprs_fit`.
- Updated `dict_concat` to just return the element if there's a single entry.
- Refactored parts of `VIPRS` to cache some recurring computations.
- Updated `VIPRSBMA` & `VIPRSGridSearch` to only consider models that successfully converged.
- Fixed bug in `pseudo_metrics` when extracting summary statistics data.
- Streamlined evaluation code.
- Refactored code to slightly reduce import/load time.
- Added SNP position to output table from VIPRS objects.
- Added measure of time taken to prepare data in `viprs_fit`.
- Added option to keep long-range LD regions in `viprs_fit`.
- Added convergence check based on parameter values.
- Added `min_iter` parameter to `.fit` methods to ensure CAVI is run for at least `min_iter` iterations.
- Added separate method for initializing optimization-related objects.
- Fixed bugs in the E-Step benchmarking script.
- Re-wrote the logic for finding BLAS libraries in the `setup.py` script. 🤞
- Fixed bugs in CI / GitHub Actions scripts.
- `Dockerfile`s for both `cli` and `jupyter` modes.
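The `min_iter` parameter and the parameter-based convergence check listed above can be pictured with a small, generic sketch. This is not the `viprs` implementation: the helper `run_until_converged`, its `update` callback, and the `x_tol` threshold are hypothetical names chosen for illustration. The sketch only shows the general idea of not testing for convergence until a minimum number of iterations has run.

```python
import numpy as np


def run_until_converged(update, params, min_iter=3, max_iter=1000, x_tol=1e-6):
    """Iterate `update` until the parameters stop changing.

    Hypothetical helper (not part of viprs): `update` maps a parameter
    vector to its next value. Convergence is only tested once at least
    `min_iter` iterations have been performed.
    """
    params = np.asarray(params, dtype=float)
    for n_iter in range(1, max_iter + 1):
        new_params = np.asarray(update(params), dtype=float)
        # Largest absolute change in any parameter between two iterations.
        max_change = np.max(np.abs(new_params - params))
        params = new_params
        if n_iter >= min_iter and max_change < x_tol:
            return params, n_iter, True
    return params, max_iter, False


# Toy fixed-point update with solution p = 2. Starting at the solution,
# the change is zero from the first step, but min_iter still forces
# three iterations before the loop is allowed to stop.
params, n_iter, converged = run_until_converged(
    lambda p: 0.5 * p + 1.0, [2.0], min_iter=3
)
```

A minimum iteration count of this kind guards against stopping on an early step where the parameters happen to barely move.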
A large scale restructuring of the code base to improve efficiency and usability.
- Moved plotting script to its own separate module.
- Updated some method names / commandline flags to be consistent throughout.
- Updated the `VIPRS` class to allow for more flexibility in the optimization process.
- Removed the `VIPRSAlpha` model for now. This will be re-implemented in the future, using better interfaces / data structures.
- Moved all hyperparameter search classes/models to their own directory.
- Restructured the `viprs_fit` commandline script to make the code cleaner, do better sanity checking, and introduce process parallelism over chromosomes.
- Basic integration testing with `pytest` and GitHub workflows.
- Documentation for the entire package using `mkdocs`.
- Integration testing / automating building with GitHub workflows.
- New self-contained implementation of E-Step in `Cython` and `C++`.
    - Uses `OpenMP` for parallelism across chunks of variants.
    - Allows for de-quantization on the fly of the LD matrix.
    - Uses BLAS linear algebra operations where possible.
    - Allows model fitting with only
    - Uses
- Benchmarking scripts (`benchmark_e_step.py`) to compare computational performance of different implementations.
- Added functionality to allow the user to track time / memory utilization in `viprs_fit`.
- Added `OptimizeResult` class to keep track of the info/parameters of EM optimization.
- New evaluation metrics:
    - `pseudo_metrics` has been moved to its own module to allow for more flexibility in evaluation.
    - New evaluation metrics for binary traits: `nagelkerke_r2`, `mcfadden_r2`, `cox_snell_r2`, `liability_r2`, `liability_probit_r2`, `liability_logit_r2`.
    - New function to compute standard errors / test statistics for all R-Squared metrics.
- Removed the `--fast-math` compiler flag due to concerns about numerical precision (e.g. Beware of fast-math).
- New implementation for the e-step in `VIPRS`, where we multiply with the rows of the LD matrix only once.
- Added support for deterministic annealing in the `VIPRS` optimization.
- Added support for `pseudo_validation` as a metric for choosing models. Now, the `VIPRS` class has a method called `pseudo_validate`.
- New implementations for grid-based models: `VIPRSGrid`, `VIPRSGridSearch`, `VIPRSBMA`.
- New Python implementation of the `LDPredinf` model, using the `viprs`/`magenpy` data structures.
- MIT license for the software.
- Corrected implementation of Mean Squared Error (MSE) metric.
- Renamed the `c_utils.pyx` script to `math_utils.pyx`.
- Updated documentation in `README` to follow latest APIs.
- Updating the dependency structure between `viprs` and `magenpy`.
- Refactoring the code in the `viprs` repository and re-organizing it into a Python package.
- Added a module to compute predictive performance metrics.
- Added commandline scripts to allow users to access some of the functionalities of `viprs` without necessarily having to write Python code.
- Added the estimate of the posterior variance to the output from the module.
- Updated plotting script.
- Updated implementation of `VIPRSMix`, `VIPRSAlpha`, etc. to inherit most of their functionalities from the base `VIPRS` class.
- Cleaned up implementation of hyperparameter search modules.
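
For readers unfamiliar with the binary-trait metrics named in this changelog (`mcfadden_r2`, `cox_snell_r2`, `nagelkerke_r2`), the sketch below writes out their textbook definitions in terms of log-likelihoods. It is an illustration only: the `*_from_loglik` function names and their signatures are hypothetical and are not the `viprs` API, which may compute these quantities differently (for example, directly from phenotypes and polygenic scores).

```python
import math


def mcfadden_r2_from_loglik(ll_model, ll_null):
    """McFadden pseudo-R2: 1 - (log-likelihood of model / log-likelihood of null)."""
    return 1.0 - ll_model / ll_null


def cox_snell_r2_from_loglik(ll_model, ll_null, n):
    """Cox-Snell pseudo-R2: 1 - (L_null / L_model)^(2/n), for n samples."""
    return 1.0 - math.exp(2.0 * (ll_null - ll_model) / n)


def nagelkerke_r2_from_loglik(ll_model, ll_null, n):
    """Nagelkerke pseudo-R2: Cox-Snell R2 rescaled by its maximum attainable value."""
    max_r2 = 1.0 - math.exp(2.0 * ll_null / n)
    return cox_snell_r2_from_loglik(ll_model, ll_null, n) / max_r2


# Made-up log-likelihoods for illustration (not real results):
# a null (intercept-only) model and a model that includes a predictor.
ll_null, ll_model, n = -100.0, -80.0, 200
mcfadden = mcfadden_r2_from_loglik(ll_model, ll_null)
cox_snell = cox_snell_r2_from_loglik(ll_model, ll_null, n)
nagelkerke = nagelkerke_r2_from_loglik(ll_model, ll_null, n)
```

The Nagelkerke rescaling exists because the Cox-Snell value cannot reach 1 for a binary outcome, even for a perfectly fitting model.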