espei package
Subpackages
Submodules
espei.core_utils module
Module for handling data
-
espei.core_utils.build_sitefractions(phase_name, sublattice_configurations, sublattice_occupancies)
Convert nested lists of sublattice configurations and occupancies to a list of dictionaries. The dictionaries map SiteFraction symbols to occupancy values. Note that zero occupancy site fractions will need to be added separately since the total degrees of freedom aren’t known in this function.
Parameters: - phase_name (str) – Name of the phase
- sublattice_configurations ([[str]]) – sublattice configuration
- sublattice_occupancies ([[float]]) – occupancy of each sublattice
Returns: a list of site fractions over sublattices
Return type: [[float]]
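The conversion described above can be illustrated with a minimal sketch. It is not the ESPEI implementation: the function name build_sitefractions_sketch is hypothetical, and plain (phase_name, sublattice_index, component) tuples stand in for the SiteFraction symbols so that the example runs without pycalphad. It also assumes that a sublattice holding a single species may be given as a bare string with a bare float occupancy.

```python
def build_sitefractions_sketch(phase_name, sublattice_configurations, sublattice_occupancies):
    """Map (phase, sublattice index, component) keys to occupancies.

    Assumption: tuple keys stand in for SiteFraction symbols.
    """
    result = []
    for config, occupancies in zip(sublattice_configurations, sublattice_occupancies):
        if len(config) != len(occupancies):
            raise ValueError("configuration and occupancies differ in length")
        site_fractions = {}
        for index, (species, values) in enumerate(zip(config, occupancies)):
            # Assumption: a single-species sublattice may be a bare string/float.
            if isinstance(species, str):
                species, values = [species], [values]
            for component, value in zip(species, values):
                site_fractions[(phase_name, index, component)] = value
        result.append(site_fractions)
    return result


site_fractions = build_sitefractions_sketch(
    "BCC_B2",
    [[["AL", "NI"], "VA"]],
    [[[0.25, 0.75], 1.0]],
)
```

As the description notes, species with zero occupancy do not appear in the result and would have to be added by the caller.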
-
espei.core_utils.canonical_sort_key(x)
Wrap strings in tuples so they’ll sort.
Parameters: x ([str]) – list of strings
Returns: tuple of strings that can be sorted
Return type: (str)
-
espei.core_utils.canonicalize(configuration, equivalent_sublattices)
Sort a sequence with symmetry. This routine gives the sequence a deterministic ordering while respecting symmetry.
Parameters: - configuration ([str]) – Sublattice configuration to sort.
- equivalent_sublattices ({{int}}) – Indices of ‘configuration’ which should be equivalent by symmetry, i.e., [[0, 4], [1, 2, 3]] means permuting elements 0 and 4, or 1, 2 and 3, respectively, has no effect on the equivalence of the sequence.
Returns: sorted tuple that has been canonicalized.
Return type: str
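A minimal sketch of symmetry-aware sorting, under stated assumptions, may help make the idea concrete. The names canonicalize_sketch and _sort_key are hypothetical and the code is not taken from ESPEI. Entries within each group of equivalent indices are sorted among themselves, and bare strings are wrapped in tuples so that they compare consistently with multi-species entries.

```python
def _sort_key(entry):
    # Wrap bare strings in a tuple so strings and tuples compare consistently.
    return tuple(entry) if isinstance(entry, (list, tuple)) else (entry,)


def canonicalize_sketch(configuration, equivalent_sublattices):
    """Sort entries only within each group of symmetry-equivalent indices."""
    canonical = [tuple(e) if isinstance(e, list) else e for e in configuration]
    for group in equivalent_sublattices:
        indices = sorted(group)
        ordered = sorted((canonical[i] for i in indices), key=_sort_key)
        for index, entry in zip(indices, ordered):
            canonical[index] = entry
    return tuple(canonical)


# Two configurations that differ only by permutations within the groups
# [0, 4] and [1, 2, 3] end up with the same canonical form.
first = canonicalize_sketch(["NI", "AL", "AL", "NI", "AL"], [[0, 4], [1, 2, 3]])
second = canonicalize_sketch(["AL", "NI", "AL", "AL", "NI"], [[0, 4], [1, 2, 3]])
```

Entries at indices outside every group keep their original positions, which is what "respecting symmetry" means in this sketch.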
-
espei.core_utils.endmembers_from_interaction(configuration)
-
espei.core_utils.get_data(comps, phase_name, configuration, symmetry, datasets, prop)
-
espei.core_utils.get_samples(desired_data)
-
espei.core_utils.list_to_tuple(x)
-
espei.core_utils.symmetry_filter(x, config, symmetry)
espei.datasets module
-
exception espei.datasets.DatasetError
Bases: Exception
Exception raised when datasets are invalid.
-
espei.datasets.check_dataset(dataset)
Ensure that the dataset is valid and consistent.
Currently supports the following validation checks:
* data shape is valid
* phases and components used match phases and components entered
Planned validation checks:
* all required keys are present
* individual shapes of keys, such as ZPF, sublattice configs and site ratios
Note that this follows some of the implicit assumptions in ESPEI at the time of writing, such that conditions are only P, T, configs for single phase and essentially only T for ZPF data.
Parameters: dataset (dict) – Dictionary of the standard ESPEI dataset.
Return type: None
Raises: DatasetError – If an error is found in the dataset
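The two supported checks can be sketched for a single-phase dataset as follows. This is an illustration only, not ESPEI's validation code. The names check_dataset_sketch and DatasetErrorSketch are hypothetical, and the dataset layout (the keys "components", "conditions", "solver", "sublattice_configurations" and "values", with values shaped as P by T by configurations) is an assumption based on the note above that single-phase conditions are only P, T and configs.

```python
class DatasetErrorSketch(Exception):
    """Hypothetical stand-in for espei.datasets.DatasetError."""


def _nested_shape(values):
    # Shape of a rectangular nested list, e.g. [[1, 2], [3, 4]] -> (2, 2).
    shape = []
    while isinstance(values, list):
        shape.append(len(values))
        values = values[0] if values else None
    return tuple(shape)


def _as_list(value):
    return value if isinstance(value, list) else [value]


def check_dataset_sketch(dataset):
    """Check value shape and component consistency (assumed dataset layout)."""
    conditions = dataset["conditions"]
    configs = dataset["solver"]["sublattice_configurations"]
    expected = (len(_as_list(conditions["P"])), len(_as_list(conditions["T"])), len(configs))
    actual = _nested_shape(dataset["values"])
    if actual != expected:
        raise DatasetErrorSketch(f"values have shape {actual}, expected {expected}")
    used = {c for config in configs for subl in config for c in _as_list(subl)}
    if not used <= set(dataset["components"]):
        raise DatasetErrorSketch("configurations use components that were not entered")


good = {
    "components": ["AL", "NI"],
    "conditions": {"P": 101325, "T": [300, 400]},
    "solver": {"sublattice_configurations": [["AL", "NI"], ["NI", "AL"]]},
    "values": [[[1.0, 2.0], [3.0, 4.0]]],
}
check_dataset_sketch(good)  # passes silently, mirroring the None return
```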
-
espei.datasets.load_datasets(dataset_filenames)
Create a PickleableTinyDB with the data from a list of filenames.
Parameters: dataset_filenames ([str]) – List of filenames to load as datasets
Returns: PickleableTinyDB
-
espei.datasets.recursive_glob(start, pattern)
Recursively glob for the given pattern from the start directory.
Parameters: - start (str) – Path of the directory to walk while globbing for files
- pattern (str) – Filename pattern to match in the glob
Returns: List of matched filenames
Return type: [str]
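One common way to implement a recursive glob with the standard library is to combine os.walk with fnmatch, as in the sketch below. The name recursive_glob_sketch is hypothetical, and whether ESPEI sorts its results or uses these exact calls is not stated here. The demonstration builds a throwaway directory tree so the example is self-contained.

```python
import fnmatch
import os
import tempfile


def recursive_glob_sketch(start, pattern):
    """Return sorted paths under `start` whose filename matches `pattern`."""
    matches = []
    for root, _dirnames, filenames in os.walk(start):
        for filename in fnmatch.filter(filenames, pattern):
            matches.append(os.path.join(root, filename))
    return sorted(matches)


# Demonstration on a temporary tree: a.json, sub/b.json and sub/c.txt.
with tempfile.TemporaryDirectory() as tmp:
    os.mkdir(os.path.join(tmp, "sub"))
    for relative in ("a.json", os.path.join("sub", "b.json"), os.path.join("sub", "c.txt")):
        with open(os.path.join(tmp, relative), "w") as handle:
            handle.write("{}")
    found_names = [os.path.basename(p) for p in recursive_glob_sketch(tmp, "*.json")]
```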
espei.paramselect module
The paramselect module handles automated parameter selection for linear models.
Automated Parameter Selection
End-members
Note: All magnetic parameters from literature for now.
Note: No fitting below 298 K (so neglect third law issues for now).
For each step, add one parameter at a time and compute AIC with max likelihood.
Cp - TlnT, T**2, T**-1, T**3 - 4 candidate models
(S and H only have one required parameter each. Will fit in full MCMC procedure)
Choose parameter set with best AIC score.
- G (full MCMC) - all parameters selected at least once by above procedure
MCMC uses an EnsembleSampler based on Goodman and Weare, Ensemble Samplers with Affine Invariance. Commun. Appl. Math. Comput. Sci. 5, 65-80 (2010).
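The AIC-based selection step described above can be sketched as follows. This is a minimal illustration, not ESPEI's code: it assumes Gaussian residuals, for which the maximum-likelihood AIC reduces (up to an additive constant) to n ln(RSS/n) + 2k, and it takes the residual sums of squares of the nested candidate models as given rather than fitting them. The function names and the RSS numbers are hypothetical.

```python
import math


def aic(rss, n_points, n_params):
    """AIC for a least-squares fit with Gaussian residuals (constant dropped)."""
    return n_points * math.log(rss / n_points) + 2 * n_params


def select_by_aic(candidate_rss, n_points):
    """Pick the number of parameters whose model has the lowest AIC.

    candidate_rss[i] is the residual sum of squares of the model that
    uses the first i + 1 candidate features.
    """
    scores = [aic(rss, n_points, k) for k, rss in enumerate(candidate_rss, start=1)]
    best_index = min(range(len(scores)), key=scores.__getitem__)
    return best_index + 1, scores


# Hypothetical RSS values for Cp models with 1, 2, 3 and 4 terms.
n_best, scores = select_by_aic([10.0, 4.0, 3.9, 3.89], n_points=20)
```

With these made-up numbers the third and fourth terms barely reduce the residual, so the penalty of 2 per parameter outweighs the gain and the two-term model is selected.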
-
espei.paramselect.estimate_hyperplane(dbf, comps, phases, current_statevars, comp_dicts, phase_models, parameters)
-
espei.paramselect.fit(input_fname, datasets, resume=None, scheduler=None, run_mcmc=True, tracefile=None, probfile=None, restart_chain=None, mcmc_steps=1000, save_interval=100, chains_per_parameter=2, chain_std_deviation=0.1)
Fit thermodynamic and phase equilibria data to a model.
Parameters: - input_fname (str) – name of the input file containing the sublattice models.
- datasets (PickleableTinyDB) – database of single- and multi-phase data to fit.
- resume (Database) – pycalphad Database of a file to start from. Using this parameter causes single phase fitting to be skipped (multi-phase only).
- scheduler (callable) – Scheduler to use with emcee. Must implement a map method.
- run_mcmc (bool) – Controls if MCMC should be run. Default is True. Useful for first-principles (single-phase only) runs.
- tracefile (str) – filename to store the flattened chain with numpy.save. Array has shape (nwalkers, iterations, nparams)
- probfile (str) – filename to store the flattened ln probability with numpy.save
- restart_chain (np.ndarray) – ndarray of the previous chain. Should have shape (nwalkers, iterations, nparams)
- mcmc_steps (int) – number of chain steps to calculate in MCMC. Note the flattened chain will have (mcmc_steps*DOF) values. Defaults to 1000.
- save_interval (int) – interval of steps to save the chain to the tracefile.
- chains_per_parameter (int) – number of chains for each parameter. Must be an even integer greater than or equal to 2. Defaults to 2.
- chain_std_deviation (float) – standard deviation of normal for parameter initialization as a fraction of each parameter. Must be greater than 0. Default is 0.1, which is 10%.
Returns: - dbf (Database) – Resulting pycalphad database of optimized parameters
- sampler (EnsembleSampler, ndarray) – emcee sampler for further data wrangling
- parameters_dict (dict) – Optimized parameters
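How chains_per_parameter and chain_std_deviation could interact when the chains are initialized is sketched below. This is an assumption-laden illustration rather than ESPEI's implementation: the function name initialize_walkers_sketch and its seed argument are hypothetical, and the sketch reads "as a fraction of each parameter" as drawing each starting value from a normal distribution centred on the parameter with standard deviation |parameter| * chain_std_deviation.

```python
import random


def initialize_walkers_sketch(initial_parameters, chains_per_parameter=2,
                              chain_std_deviation=0.1, seed=None):
    """Return a (n_walkers x n_params) nested list of starting positions."""
    if chains_per_parameter < 2 or chains_per_parameter % 2 != 0:
        raise ValueError("chains_per_parameter must be an even integer >= 2")
    if chain_std_deviation <= 0:
        raise ValueError("chain_std_deviation must be greater than 0")
    rng = random.Random(seed)
    n_walkers = chains_per_parameter * len(initial_parameters)
    return [
        [rng.gauss(p, abs(p) * chain_std_deviation) for p in initial_parameters]
        for _ in range(n_walkers)
    ]


# Two parameters and two chains per parameter give four walkers.
walkers = initialize_walkers_sketch([-10000.0, 5.0], seed=0)
```

The walker count scales with the number of parameters, which is why the flattened chain grows with the degrees of freedom as well as with mcmc_steps.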
-
espei.paramselect.fit_formation_energy(dbf, comps, phase_name, configuration, symmetry, datasets, features=None)
Find suitable linear model parameters for the given phase. We do this by successively fitting heat capacities, entropies and enthalpies of formation, and selecting against criteria to prevent overfitting. The “best” set of parameters minimizes the error without overfitting.
Parameters: - dbf (Database) – pycalphad Database. Partially complete, so we know what degrees of freedom to fix.
- comps ([str]) – Names of the relevant components.
- phase_name (str) – Name of the desired phase for which the parameters will be found.
- configuration (ndarray) – Configuration of the sublattices for the fitting procedure.
- symmetry ([[int]]) – Symmetry of the sublattice configuration.
- datasets (PickleableTinyDB) – All the datasets desired to fit to.
- features (dict) – Maps “property” to a list of features for the linear model. These will be transformed from “GM” coefficients e.g., {“CPM_FORM”: (v.T*sympy.log(v.T), v.T**2, v.T**-1, v.T**3)} (Default value = None)
Returns: {feature: estimated_value}
Return type: dict
-
espei.paramselect.lnprob(params, data=None, comps=None, dbf=None, phases=None, datasets=None, symbols_to_fit=None, phase_models=None, scheduler=None)
Returns the error from multiphase fitting as a log probability.
-
espei.paramselect.multi_phase_fit(dbf, comps, phases, datasets, phase_models, parameters=None, scheduler=None)
-
espei.paramselect.phase_fit(dbf, phase_name, symmetry, subl_model, site_ratios, datasets, refdata, aliases=None)
Generate an initial CALPHAD model for a given phase and sublattice model.
Parameters: - dbf (Database) – pycalphad Database to add parameters to.
- phase_name (str) – Name of the phase.
- symmetry ([[int]]) – Sublattice model symmetry.
- subl_model ([[str]]) – Sublattice model for the phase of interest.
- site_ratios ([float]) – Number of sites in each sublattice, normalized to one atom.
- datasets (PickleableTinyDB) – All datasets to consider for the calculation.
- refdata (dict) – Maps tuple(element, phase_name) -> SymPy object defining energy relative to SER
- aliases ([str]) – Alternative phase names. Useful for matching against reference data or other datasets. (Default value = None)
Returns: Modifies the dbf.
Return type: None
-
espei.paramselect.tieline_error(dbf, comps, current_phase, cond_dict, region_chemical_potentials, phase_flag, phase_models, parameters, debug_mode=False)
espei.plot module
Plotting of input data and calculated database quantities
-
espei.plot.dataplot(eq, datasets, ax=None)
Plot datapoints corresponding to the components and phases in the eq Dataset
Parameters: - eq (xarray.Dataset) – Result of equilibrium calculation.
- datasets (TinyDB) – Database of phase equilibria datasets
- ax (matplotlib.Axes) – Default axes used if not specified.
Returns: A plot of phase equilibria points as a figure
Examples
>>> from pycalphad import equilibrium, Database, variables as v
>>> from pycalphad.plot.eqplot import eqplot
>>> from espei.datasets import load_datasets, recursive_glob
>>> from espei.plot import dataplot
>>> datasets = load_datasets(recursive_glob('.', '*.json'))
>>> dbf = Database('my_databases.tdb')
>>> my_phases = list(dbf.phases.keys())
>>> eq = equilibrium(dbf, ['CU', 'MG', 'VA'], my_phases, {v.P: 101325, v.T: 1000, v.X('MG'): (0, 1, 0.01)})
>>> ax = eqplot(eq)
>>> ax = dataplot(eq, datasets, ax=ax)
-
espei.plot.multiplot(dbf, comps, phases, conds, datasets, eq_kwargs=None, plot_kwargs=None, data_kwargs=None)
Plot a phase diagram with datapoints described by datasets. This is a wrapper around pycalphad.equilibrium, pycalphad’s eqplot, and dataplot.
Parameters: - dbf (Database) – pycalphad thermodynamic database containing the relevant parameters.
- comps (list) – Names of components to consider in the calculation.
- phases (list) – Names of phases to consider in the calculation.
- conds (dict) – Maps StateVariables to values and/or iterables of values.
- datasets (TinyDB) – Database of phase equilibria datasets
- eq_kwargs (dict) – Keyword arguments passed to pycalphad equilibrium()
- plot_kwargs (dict) – Keyword arguments passed to pycalphad eqplot()
- data_kwargs (dict) – Keyword arguments passed to dataplot()
Returns: A phase diagram with phase equilibria data as a figure
Examples
>>> from pycalphad import Database, variables as v
>>> from espei.datasets import load_datasets, recursive_glob
>>> from espei.plot import multiplot
>>> datasets = load_datasets(recursive_glob('.', '*.json'))
>>> dbf = Database('my_databases.tdb')
>>> my_phases = list(dbf.phases.keys())
>>> multiplot(dbf, ['CU', 'MG', 'VA'], my_phases, {v.P: 101325, v.T: 1000, v.X('MG'): (0, 1, 0.01)}, datasets)
-
espei.plot.plot_parameters(dbf, comps, phase_name, configuration, symmetry, datasets=None)
espei.run_espei module
Automated fitting script.
A minimal run must specify an input.json and a datasets folder containing input files.
-
espei.run_espei.get_run_settings(input_dict)
Validate settings from a dict of possible input.
Performs the following actions:
1. Normalize (apply defaults)
2. Validate against the schema
Parameters: input_dict (dict) – Dictionary of input settings
Returns: Validated run settings
Return type: dict
Raises: ValueError
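The normalize-then-validate pattern can be sketched in plain Python as follows. This is not ESPEI's schema or validator: the setting names are borrowed from the keyword arguments of espei.paramselect.fit purely for illustration, and DEFAULTS, SCHEMA and get_run_settings_sketch are hypothetical names.

```python
# Assumption: these keys and defaults mirror the fit() keyword arguments;
# the real input schema may use different names and structure.
DEFAULTS = {
    "mcmc_steps": 1000,
    "save_interval": 100,
    "chains_per_parameter": 2,
    "chain_std_deviation": 0.1,
}

# Each entry pairs an accepted type with a predicate on the value.
SCHEMA = {
    "mcmc_steps": (int, lambda value: value > 0),
    "save_interval": (int, lambda value: value > 0),
    "chains_per_parameter": (int, lambda value: value >= 2 and value % 2 == 0),
    "chain_std_deviation": (float, lambda value: value > 0),
}


def get_run_settings_sketch(input_dict):
    """Apply defaults, then validate every setting against SCHEMA."""
    settings = dict(DEFAULTS)
    settings.update(input_dict)  # step 1: normalize
    for key, value in settings.items():  # step 2: validate
        if key not in SCHEMA:
            raise ValueError(f"unknown setting: {key}")
        expected_type, is_valid = SCHEMA[key]
        if not isinstance(value, expected_type) or not is_valid(value):
            raise ValueError(f"invalid value for {key}: {value!r}")
    return settings


settings = get_run_settings_sketch({"mcmc_steps": 50})
```

Applying defaults before validation means the defaults themselves are checked against the schema, so an inconsistent default is caught as early as a bad user value.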
-
espei.run_espei.main()
espei.utils module
Utilities for ESPEI
Classes and functions defined here should have some reuse potential.
-
espei.utils.sigfigs(x, n)
Round x to n significant digits
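Rounding to significant digits is usually done by finding the decimal order of magnitude of x and rounding relative to it. The sketch below shows that approach for a scalar; the name sigfigs_sketch is hypothetical and the handling of zero is an assumption, since the behaviour of espei.utils.sigfigs for that input is not described here.

```python
import math


def sigfigs_sketch(x, n):
    """Round the scalar x to n significant digits."""
    if x == 0:
        return 0.0  # assumption: zero is returned unchanged
    magnitude = math.floor(math.log10(abs(x)))  # position of the leading digit
    return round(x, n - 1 - magnitude)


rounded_large = sigfigs_sketch(1234.5678, 2)
rounded_small = sigfigs_sketch(0.00123456, 3)
```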