This change includes several new features and performance improvements.
- Drop Python 2 support (Python 2 reached end of life on January 1, 2020).
- Update dask and distributed support to versions >=2. (@bocklund)
- Users can tweak the AICc penalty factor for each phase to nudge parameter selection towards adding more or fewer parameters, based on the user's modeling intuition. (@bocklund)
- Allow for tracefile and probfile to be set to None. (@jwsiegel2510)
- Weighting individual datasets in single phase fitting is now implemented via scikit-learn. (@bocklund)
- Performance improvements by reducing overhead. (@bocklund)
- Increased solver accuracy by using pycalphad’s exact Hessian solver. (@bocklund)
- Support writing SER reference state information to the ELEMENT keyword in TDBs based on the SGTE unary 5 database. (@bocklund)
- MCMC now calculates the likelihood of the initial parameter set so the starting point can be reasonably compared. (@bocklund)
- Fixed a bug where misaligned configurations and site occupancies in single phase datasets passed the dataset checker. (@bocklund)
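To illustrate the AICc penalty factor mentioned in this release, here is a minimal sketch of a corrected AIC for least-squares fits. The function name `aicc`, the `penalty_factor` argument, and the choice to scale the parameter count by that factor are assumptions made for illustration; this is not necessarily how ESPEI applies the factor internally.

```python
import math


def aicc(rss, n_samples, n_params, penalty_factor=1.0):
    """Corrected AIC for a least-squares fit (illustrative sketch only).

    rss            -- residual sum of squares of the fit
    n_samples      -- number of data points
    n_params       -- number of fitted parameters
    penalty_factor -- hypothetical multiplier on the parameter count;
                      values above 1 penalize extra parameters more,
                      values below 1 penalize them less
    """
    k = penalty_factor * n_params
    if n_samples - k - 1 <= 0:
        # The small-sample correction is undefined here, so treat the
        # model as unacceptable.
        return math.inf
    aic = 2.0 * k + n_samples * math.log(rss / n_samples)
    correction = (2.0 * k**2 + 2.0 * k) / (n_samples - k - 1)
    return aic + correction


# Two candidate models fit to the same 10 data points (made-up numbers).
simple_default = aicc(rss=4.0, n_samples=10, n_params=2)
complex_default = aicc(rss=3.5, n_samples=10, n_params=3)

# The same comparison with a much smaller penalty factor.
simple_relaxed = aicc(rss=4.0, n_samples=10, n_params=2, penalty_factor=0.1)
complex_relaxed = aicc(rss=3.5, n_samples=10, n_params=3, penalty_factor=0.1)
```

With these made-up numbers the two-parameter model has the lower score at the default factor, while the three-parameter model wins once the factor is lowered, which is the kind of nudge the penalty factor is meant to provide.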
This is a small bugfix release that fixes the inability to provide the EmceeOptimizer a
This is a significant update reflecting many internal improvements, new features, and bugfixes. Users using the YAML input or the
run_espei Python API should see entirely backwards compatible changes with ESPEI 0.6.2.
pycalphad 0.8, which introduced many key features for these changes, is now required. This should almost completely eliminate the time to build phases due to the symengine backend (phases will likely build in less time than it takes to call the MCMC objective function). Users can expect a slight performance improvement for MCMC fitting.
- Priors can now be specified and are documented online.
- Weights for different datasets are added and are supported by a "weight" key at the top level of any dataset.
- Weights for different types of data are added. These are controlled via the input YAML and are documented there.
- A new internal API is introduced for generic fitting of parameters to datasets in the OptimizerBase class. The MCMC optimizer in emcee was migrated to this API (the mcmc_fit function is now deprecated, but still works until the next major version of ESPEI). A simple SciPy-based optimizer was implemented using this API.
- Parameter selection can now be passed initial databases with parameters (e.g. for adding magnetic or other parameters manually).
- pycalphad’s reference state support can now be used to properly reference out different types of model contributions (ideal mixing, magnetic, etc.). This is especially useful for DFT thermochemical data which does not include model contributions from ideal mixing or magnetic heat capacity. Useful for experimental data which does include ideal mixing (previously ESPEI assumed all data
- Datasets and input YAML files now have a tag system where tags that are specified in the input YAML can override any keys/values in the JSON datasets at runtime. This is useful for tagging data with different weights/model contribution exclusions (e.g. DFT tags may get lower weights and can be set to exclude model contributions). If no tags are applied, removing ideal mixing from all thermochemical data is applied automatically for backwards compatibility. This backwards compatibility feature will be removed in the next major version of ESPEI (all model contributions will be included by default and exclusions must be specified manually).
- Bug fixed where asymmetric ternary parameters were not properly replaced in SymPy
- Fixed error where ZPF error was considering the chemical potentials of stoichiometric phases in the target hyperplane (they are meaningless)
- Report the actual file paths when dask’s work-stealing is set to false.
- Errors in the ZPF error function are no longer swallowed with -np.inf error. Any errors should be reported as bugs.
- Fix bug where subsets of symbols to fit are not built properly for thermochemical data
- Documentation recipe added for plot_parameters
- [Developer] ZPF and thermochemical datasets now have a function to get all the data up front in a dictionary that can be used in the functions, for separation of concerns and for calculation efficiency by not recalculating the same thing every iteration.
- [Developer] A function to generate a context dict to pass to lnprob now exists. It gets the datasets automatically using the above.
- [Developer] transition to pycalphad’s new build_callables function, taking care of the
- [Developer] Load all YAML safely, silencing warnings.
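As a sketch of the dataset weights introduced in this release, a dataset could carry a top-level "weight" key like the one below. Only the "weight" key comes from these notes; the other keys, the phase, the values, and the reference are hypothetical placeholders, the dataset is abbreviated, and the exact schema should be checked against the ESPEI dataset documentation.

```python
import json

# Abbreviated, hypothetical dataset. Only the top-level "weight" key is
# taken from the release notes; everything else is an assumed placeholder.
dataset = {
    "components": ["CU", "MG"],
    "phases": ["LIQUID"],
    "conditions": {"P": 101325, "T": 1400},
    "output": "HM_MIX",
    "values": [[[-5000.0]]],
    "reference": "hypothetical-dft-study",
    "weight": 2.0,  # assumed meaning: count this dataset more heavily
}

# Datasets are JSON files, so round-trip through JSON to show the
# on-disk form and confirm the key survives.
dataset_json = json.dumps(dataset, indent=2)
loaded = json.loads(dataset_json)
```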
This backwards-compatible release includes several bug fixes and improvements.
- Updated branding to include the new ESPEI logo. See the logo in the
- Add support for fitting excess heat capacity.
- Bug fix for broken potassium unary.
- Documentation improvements for recipes
- pycalphad 0.7.1 fixes for dask, sympy, and gmpy2 should mean that ESPEI does not require package upgrades or downgrades. Please report any installation issues in ESPEI’s Gitter Channel <https://gitter.im/PhasesResearchLab/ESPEI>.
- [Developers] ESPEI’s
- [Developers] matplotlib plotting tests are removed because nose is no longer supported.
This is a major release with several important features and bug fixes.
- Enable use of ridge regression alpha for parameter selection via the
- Add ternary parameter selection. Works by default, just add data.
- Set memory limit to zero to avoid dask killing workers near the dask memory limits.
- Remove ideal mixing from plotting models so that plot_parameters gives the correct entropy values.
- Add recipes documentation that contains some Python code for common utility operations.
- Add documentation for running custom distributed schedulers in ESPEI
This is an update including breaking changes to the input files and several minor improvements.
- Update input file schema and Python API to be more consistent so that the trace always refers to the collection of chains and chain refers to individual chains. Additionally removed some redundancy in the parameters nested under the
save_interval in the input file and Python API. See Writing Input documentation for all of the updates.
- The default save interval is now 1, which is more reasonable for most MCMC systems with significant numbers of phase equilibria.
- Bug fixes for plotting and some better plotting defaults for plotting input data
- Dataset parsing and cleaning improvements.
- Documentation improvements (see the PDF!)
This is a major bugfix release for MCMC multi-phase fitting runs for single phase data.
- Fixed a major issue where single phase thermochemical data was always compared to Gibbs energy, giving incorrect errors in MCMC runs.
- Single phase errors in ESPEI incorrectly compared values with ideal mixing contributions to data, which is excess only.
- Fixed a bug where single phase thermochemical data that are dependent on composition and pressure and/or temperature were not fit correctly.
- Added utilities for analyzing ESPEI results and add them to the Cu-Mg example docs.
This is a minor bugfix release.
- Parameter generation for phases with vacancies would produce incorrect parameters because the vacancy site fractions were not being correctly removed from the contributions due to their treatment as
- Parameter selection now uses the corrected AIC, which further prevents overparameterization where there is sparse training data.
- Activity and single phase thermochemical data can now be included in MCMC fitting runs. Including single phase data can help anchor metastable phases to DFT data when they are not on the stable phase diagram. See the Gathering input data documentation for information on how to input activity data.
- Dataset checking has been improved. Now there are checks to make sure sublattice interactions are properly sorted and mole fractions sum to less than 1.0 in ZPF data.
- Support for fitting phases with arbitrary pycalphad Models in MCMC, including (charged and neutral) species and ionic liquids. There are several consequences of this:
- ESPEI requires support on
- ESPEI now uses pycalphad Model objects directly. Using the JIT compiled Models has shown up to a 50% performance improvement in MCMC runs.
- Using JIT compiled Model objects required the use of cloudpickle everywhere. Due to challenges in overriding pickle for upstream packages, we now rely solely on dask for scheduler tasks, including dask-mpi. Note that users must turn off
- [Developers] Each method for calculating error in MCMC has been moved into a module for that method in an error_functions subpackage. One top level function from each module should be imported into mcmc.py and used in lnprob. Developers should then just customize
- [Developers] Significant internal docs improvements: all non-trivial functions have complete docstrings.
- Enable plotting of isothermal sections with data using
- Tielines are now plotted in dataplot for isothermal sections and T-x phase diagrams
- Add a useful ravel_conditions method to unpack conditions from datasets
- MCMC is now deterministic by default (can be toggled off with the
- Added support for having no scheduler (running with no parallelism) with the mcmc.scheduler option set to None. This may be useful for debugging.
- Logging improvements
- Extraneous warnings that may be confusing for users and dirty the log are silenced.
- A warning is added for when there are no datasets found.
- Fixed a bug where logging was silenced with the dask scheduler
- Added an optimal_parameters utility function as a helper to get optimal parameter sets for analysis
- Several improvements to plotting
- Users can now plot phase diagram data alone with dataplot, useful for checking datasets visually. This changes the API for dataplot to no longer infer the conditions from an equilibrium Dataset (from pycalphad). That functionality is preserved in
- Experimental data points are now plotted with unique symbols depending on the reference key in the dataset. This is for both phase diagram and single phase parameter plots.
- Options to control plotting parameters (e.g. symbol size) and take user supplied Axes and Figures in the plotting functions. The symbol size is now smaller by default.
- Documentation improvements for API and separation of theory from the Cu-Mg example
- Fixes a bug where elements with single character names would not find the correct reference state (which are typically named GHSERCC for the example of C).
- [Developer] All MCMC code is moved from the paramselect module to the mcmc module to separate these tasks
- [Developer] Support for arbitrary user reference states (so long as the reference state is in the refdata module and follows the same format as SGTE91)
- Propagate the new entry point to setup.py
- Fix for module name/function conflict in entry point
- ESPEI is much easier to run interactively in Python and in Jupyter Notebooks
- Reference data is now included in ESPEI instead of in pycalphad
- Several reference data fixes including support for single character elements (‘V’, ‘B’, ‘C’, …)
- Support for using multiprocessing to parallelize MCMC runs, used by default (@olivia-higgins)
- Improved documentation for installing and developing ESPEI
- Add input-schema.yaml file to installer
- Add LICENSE to manifest
- ESPEI input is now described by a file. This change is breaking. Old command line arguments are not supported. See Writing input files for a full description of all the inputs.
- New input options are supported, including modifying the number of chains and standard deviation from the mean
- ESPEI is now available on conda-forge
- TinyDB 2 support is dropped in favor of TinyDB 3 for conda-forge deployment
- Allow for restarting previous mcmc calculations with a trace file
- Add Cu-Mg example to documentation
Fixes to the 0.2 release plotting interface
- multiplot is renamed from multi_plot, as in docs.
- Fixed an issue where phases in datasets, but not in equilibrium were not plotted by dataplot and raised an error.
multiplot interface for convenient plotting of phase diagrams + data.
dataplot function underlies key data plotting features and can be used with eqplot. See their API docs for examples. Will break existing code using multiplot.
MPI support for local/HPC runs. Only single node runs are explicitly supported currently. Use the --scheduler='MPIPool' command line option. Requires
Default debug reporting of acceptance ratios
Option (and default) to output the log probability array matching the trace. Use the --probfile option to control.
Optimal parameters are now chosen based on lowest error in chain.
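A minimal sketch of picking the optimal parameters from a trace and its matching log probability array follows. It assumes the trace has shape (chains, iterations, parameters) and the log probability array has shape (chains, iterations), and it treats "lowest error" as "highest log probability". The function name `best_parameters` and the sample numbers are hypothetical, not ESPEI's API.

```python
import numpy as np


def best_parameters(trace, lnprob):
    """Return the parameter set with the highest log probability.

    trace  -- array of shape (chains, iterations, parameters) (assumed)
    lnprob -- array of shape (chains, iterations) matching the trace
    """
    # Locate the (chain, iteration) pair where lnprob is largest.
    chain_idx, step_idx = np.unravel_index(np.argmax(lnprob), lnprob.shape)
    return trace[chain_idx, step_idx]


# Made-up example: 2 chains, 2 iterations, 2 parameters.
trace = np.array([[[1.0, 10.0], [2.0, 20.0]],
                  [[3.0, 30.0], [4.0, 40.0]]])
lnprob = np.array([[-5.0, -2.0],
                   [-0.5, -3.0]])

best = best_parameters(trace, lnprob)
```

Here the largest log probability sits in the second chain at its first iteration, so `best` is that chain's first parameter set.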
Bug fixes including
- py2/3 compatibility
- Unicode datasets
- handling of singular matrix errors from pycalphad’s
- reporting of failed conditions
- Significant error checking of JSON inputs.
- Add new --check-datasets option to check the datasets at path. It should be run before you run ESPEI fittings. All errors must be resolved before you run.
- Move the espei script module from
- Better docs building with mocking
- Google docstrings are now NumPy docstrings
- Documentation improvements for usage and API docs
- Fail fast on JSON errors
- Fix bad version pinning in setup.py
- Explicitly support Python 2.7
- Fix dask incompatibility due to new API usage
- Fix a bug that caused logging to raise if bokeh isn’t installed
ESPEI is now a package! New features include
- Fork https://github.com/richardotis/pycalphad-fitting
- Use emcee for MCMC fitting rather than pymc
- Support single-phase only fitting
- More control options for running ESPEI from the command line
- Better support for incremental saving of the chain
- Control over output with logging over printing
- Significant code cleanup
- Better usage documentation