espei.error_functions package#

Submodules#

espei.error_functions.activity_error module#

Calculate error due to measured activities.

The residual function implemented in this module needs to exist because it is currently not possible to compute activity as a property via equilibrium calculations because as PyCalphad does not yet have a suitable notion of a reference state that could be used for equilibrium chemical potentials.

class espei.error_functions.activity_error.ActivityResidual(database: Database, datasets: PickleableTinyDB, phase_models: PhaseModelSpecification | None, symbols_to_fit: List[SymbolName] | None = None, weight: Dict[str, float] | None = None)#

Bases: ResidualFunction

get_likelihood(parameters: ndarray[Any, dtype[_ScalarType_co]]) float#

Return log-likelihood for the set of parameters.

Parameters:

parameters (ArrayLike) – 1D parameter vector. The size of the parameters array should match the number of fitting symbols used to build the models. This is _not_ checked.

Returns:

Value of log-likelihood for the given set of parameters

Return type:

float

get_residuals(parameters: Buffer | _SupportsArray[dtype[Any]] | _NestedSequence[_SupportsArray[dtype[Any]]] | bool | int | float | complex | str | bytes | _NestedSequence[bool | int | float | complex | str | bytes]) Tuple[List[float], List[float]]#

Return the residual comparing the selected data to the set of parameters.

The residual is zero if the database predictions under the given parameters agrees with the data exactly.

Parameters:

parameters (ArrayLike) – 1D parameter vector. The size of the parameters array should match the number of fitting symbols used to build the models. This is _not_ checked.

Returns:

Tuple of (residuals, weights), which must obey len(residuals) == len(weights)

Return type:

Tuple[List[float], List[float]]

espei.error_functions.activity_error.calculate_activity_error(dbf, comps, phases, datasets, parameters=None, phase_models=None, callables=None, data_weight=1.0) float#

Return the sum of square error from activity data

Parameters:
  • dbf (pycalphad.Database) – Database to consider

  • comps (list) – List of active component names

  • phases (list) – List of phases to consider

  • datasets (espei.utils.PickleableTinyDB) – Datasets that contain single phase data

  • parameters (dict) – Dictionary of symbols that will be overridden in pycalphad.equilibrium

  • phase_models (dict) – Phase models to pass to pycalphad calculations

  • callables (dict) – Callables to pass to pycalphad

  • data_weight (float) – Weight for standard deviation of activity measurements, dimensionless. Corresponds to the standard deviation of differences in chemical potential in typical measurements of activity, in J/mol.

Returns:

A single float of the likelihood

Return type:

float

espei.error_functions.activity_error.calculate_activity_residuals(dbf, comps, phases, datasets, parameters=None, phase_models=None, callables=None, data_weight=1.0) Tuple[List[float], List[float]]#

Notes

General procedure: 1. Get the datasets 2. For each dataset

  1. Calculate reference state equilibrium

  2. Calculate current chemical potentials

  3. Find the target chemical potentials

  4. Calculate error due to chemical potentials

espei.error_functions.activity_error.target_chempots_from_activity(component, target_activity, temperatures, reference_result)#

Return an array of experimental chemical potentials for the component

Parameters:
  • component (str) – Name of the component

  • target_activity (numpy.ndarray) – Array of experimental activities

  • temperatures (numpy.ndarray) – Ravelled array of temperatures (of same size as exp_activity).

  • reference_result (xarray.Dataset) – Dataset of the equilibrium reference state. Should contain a singe point calculation.

Returns:

Array of experimental chemical potentials

Return type:

numpy.ndarray

espei.error_functions.context module#

Convenience function to create a context for the built in error functions

espei.error_functions.context.setup_context(dbf, datasets, symbols_to_fit=None, data_weights=None, phase_models=None, make_callables=True)#

Set up a context dictionary for calculating error.

Parameters:
  • dbf (Database) – A pycalphad Database that will be fit

  • datasets (PickleableTinyDB) – A database of single- and multi-phase data to fit

  • symbols_to_fit (list of str) – List of symbols in the Database that will be fit. If None (default) are passed, then all parameters prefixed with VV followed by a number, e.g. VV0001 will be fit.

  • phase_models (Optional[Dict[str, Any]]) – Phase model dictionary that will be converted to PhaseModelSpecification if it is provided.

Notes

A copy of the Database is made and used in the context. To commit changes back to the original database, the dbf.symbols.update method should be used.

espei.error_functions.equilibrium_thermochemical_error module#

Calculate error due to equilibrium thermochemical properties.

class espei.error_functions.equilibrium_thermochemical_error.EqPropData(dbf, species, phases, potential_conds, composition_conds, models, params_keys, phase_records, output, samples, weight, reference)#

Bases: NamedTuple

composition_conds: Sequence[Dict[MoleFraction, float]]#

Alias for field number 4

dbf: Database#

Alias for field number 0

models: Dict[str, Model]#

Alias for field number 5

output: str#

Alias for field number 8

params_keys: Dict[str, float]#

Alias for field number 6

phase_records: Sequence[Dict[str, PhaseRecord]]#

Alias for field number 7

phases: Sequence[str]#

Alias for field number 2

potential_conds: Dict[StateVariable, float]#

Alias for field number 3

reference: str#

Alias for field number 11

samples: ndarray#

Alias for field number 9

species: Sequence[Species]#

Alias for field number 1

weight: ndarray#

Alias for field number 10

class espei.error_functions.equilibrium_thermochemical_error.EquilibriumPropertyResidual(database: Database, datasets: PickleableTinyDB, phase_models: PhaseModelSpecification | None, symbols_to_fit: List[SymbolName] | None = None, weight: Dict[str, float] | None = None)#

Bases: ResidualFunction

get_likelihood(parameters) float#

Return log-likelihood for the set of parameters.

Parameters:

parameters (ArrayLike) – 1D parameter vector. The size of the parameters array should match the number of fitting symbols used to build the models. This is _not_ checked.

Returns:

Value of log-likelihood for the given set of parameters

Return type:

float

get_residuals(parameters: Buffer | _SupportsArray[dtype[Any]] | _NestedSequence[_SupportsArray[dtype[Any]]] | bool | int | float | complex | str | bytes | _NestedSequence[bool | int | float | complex | str | bytes]) Tuple[List[float], List[float]]#

Return the residual comparing the selected data to the set of parameters.

The residual is zero if the database predictions under the given parameters agrees with the data exactly.

Parameters:

parameters (ArrayLike) – 1D parameter vector. The size of the parameters array should match the number of fitting symbols used to build the models. This is _not_ checked.

Returns:

Tuple of (residuals, weights), which must obey len(residuals) == len(weights)

Return type:

Tuple[List[float], List[float]]

espei.error_functions.equilibrium_thermochemical_error.build_eqpropdata(data: Document, dbf: Database, model: Dict[str, Type[Model]] | None = None, parameters: Dict[str, float] | None = None, data_weight_dict: Dict[str, float] | None = None) EqPropData#

Build EqPropData for the calculations corresponding to a single dataset.

Parameters:
  • data (tinydb.database.Document) – Document corresponding to a single ESPEI dataset.

  • dbf (Database) – Database that should be used to construct the Model and PhaseRecord objects.

  • model (Optional[Dict[str, Type[Model]]]) – Dictionary phase names to pycalphad Model classes.

  • parameters (Optional[Dict[str, float]]) – Mapping of parameter symbols to values.

  • data_weight_dict (Optional[Dict[str, float]]) – Mapping of a data type (e.g. HM or SM) to a weight.

Return type:

EqPropData

espei.error_functions.equilibrium_thermochemical_error.calc_prop_differences(eqpropdata: EqPropData, parameters: ndarray, approximate_equilibrium: bool | None = False) Tuple[ndarray, ndarray]#

Calculate differences between the expected and calculated values for a property

Parameters:
  • eqpropdata (EqPropData) – Data corresponding to equilibrium calculations for a single datasets.

  • parameters (np.ndarray) – Array of parameters to fit. Must be sorted in the same symbol sorted order used to create the PhaseRecords.

  • approximate_equilibrium (Optional[bool]) – Whether or not to use an approximate version of equilibrium that does not refine the solution and uses starting_point instead.

Returns:

Pair of * differences between the calculated property and expected property * weights for this dataset

Return type:

Tuple[np.ndarray, np.ndarray]

espei.error_functions.equilibrium_thermochemical_error.calculate_equilibrium_thermochemical_probability(eq_thermochemical_data: Sequence[EqPropData], parameters: ndarray, approximate_equilibrium: bool | None = False) float#

Calculate the total equilibrium thermochemical probability for all EqPropData

Parameters:
  • eq_thermochemical_data (Sequence[EqPropData]) – List of equilibrium thermochemical data corresponding to the datasets.

  • parameters (np.ndarray) – Values of parameters for this iteration to be updated in PhaseRecords.

  • approximate_equilibrium (Optional[bool], optional) –

  • eq_thermochemical_data

Returns:

Sum of log-probability for all thermochemical data.

Return type:

float

espei.error_functions.equilibrium_thermochemical_error.get_equilibrium_thermochemical_data(dbf: Database, comps: Sequence[str], phases: Sequence[str], datasets: PickleableTinyDB, model: Dict[str, Model] | None = None, parameters: Dict[str, float] | None = None, data_weight_dict: Dict[str, float] | None = None) Sequence[EqPropData]#

Get all the EqPropData for each matching equilibrium thermochemical dataset in the datasets

Parameters:
  • dbf (Database) – Database with parameters to fit

  • comps (Sequence[str]) – List of pure element components used to find matching datasets.

  • phases (Sequence[str]) – List of phases used to search for matching datasets.

  • datasets (PickleableTinyDB) – Datasets that contain single phase data

  • model (Optional[Dict[str, Type[Model]]]) – Dictionary phase names to pycalphad Model classes.

  • parameters (Optional[Dict[str, float]]) – Mapping of parameter symbols to values.

  • data_weight_dict (Optional[Dict[str, float]]) – Mapping of a data type (e.g. HM or SM) to a weight.

Notes

Found datasets will be subsets of the components and phases. Equilibrium thermochemical data is assumed to be any data that does not have the solver key, and does not have an output of ZPF or ACR (which correspond to different data types than can be calculated here.)

Return type:

Sequence[EqPropData]

espei.error_functions.non_equilibrium_thermochemical_error module#

Calculate error due to thermochemical quantities: heat capacity, entropy, enthalpy.

class espei.error_functions.non_equilibrium_thermochemical_error.FixedConfigurationPropertyResidual(database: Database, datasets: PickleableTinyDB, phase_models: PhaseModelSpecification | None, symbols_to_fit: List[SymbolName] | None = None, weight: Dict[str, float] | None = None)#

Bases: ResidualFunction

get_likelihood(parameters) float#

Return log-likelihood for the set of parameters.

Parameters:

parameters (ArrayLike) – 1D parameter vector. The size of the parameters array should match the number of fitting symbols used to build the models. This is _not_ checked.

Returns:

Value of log-likelihood for the given set of parameters

Return type:

float

get_residuals(parameters: Buffer | _SupportsArray[dtype[Any]] | _NestedSequence[_SupportsArray[dtype[Any]]] | bool | int | float | complex | str | bytes | _NestedSequence[bool | int | float | complex | str | bytes]) Tuple[List[float], List[float]]#

Return the residual comparing the selected data to the set of parameters.

The residual is zero if the database predictions under the given parameters agrees with the data exactly.

Parameters:

parameters (ArrayLike) – 1D parameter vector. The size of the parameters array should match the number of fitting symbols used to build the models. This is _not_ checked.

Returns:

Tuple of (residuals, weights), which must obey len(residuals) == len(weights)

Return type:

Tuple[List[float], List[float]]

espei.error_functions.non_equilibrium_thermochemical_error.calculate_non_equilibrium_thermochemical_probability(thermochemical_data: List[FixedConfigurationCalculationData], parameters=None)#

Calculate the weighted single phase error in the Database

Parameters:
  • thermochemical_data (list) – List of thermochemical data dicts

  • parameters (np.ndarray) – Array of parameters to calculate the error with.

Returns:

A single float of the residual sum of square errors

Return type:

float

espei.error_functions.non_equilibrium_thermochemical_error.calculate_points_array(phase_constituents, configuration, occupancies=None)#

Calculate the points array to use in pycalphad calculate calls.

Converts the configuration data (and occupancies for mixing data) into the points array by looking up the indices in the active phase constituents.

Parameters:
  • phase_constituents (list) – List of active constituents in a phase

  • configuration (list) – List of the sublattice configuration

  • occupancies (list) – List of sublattice occupancies. Required for mixing sublattices, otherwise takes no effect.

Return type:

numpy.ndarray

Notes

Errors will be raised if components in the configuration are not in the corresponding phase constituents sublattice.

espei.error_functions.non_equilibrium_thermochemical_error.compute_fixed_configuration_property_differences(calc_data: FixedConfigurationCalculationData, parameters)#
espei.error_functions.non_equilibrium_thermochemical_error.filter_sublattice_configurations(desired_data: List[Dict[str, Any]], subl_model) List[Dict[str, Any]]#

Modify the desired_data to remove any configurations that cannot be represented by the sublattice model.

espei.error_functions.non_equilibrium_thermochemical_error.get_prop_samples(desired_data, constituents)#

Return data values and the conditions to calculate them using pycalphad.calculate

Parameters:
  • desired_data (List[Dict[str, Any]]) – List of dataset dictionaries that contain the values to sample

  • constituents (List[List[str]]) – Names of constituents in each sublattice.

Returns:

Dictionary of condition kwargs for pycalphad’s calculate and the expected values

Return type:

Dict[str, Union[float, ArrayLike, List[float]]]

espei.error_functions.non_equilibrium_thermochemical_error.get_sample_condition_dicts(calculate_dict: Dict[Any, Any], configuration_tuple: Tuple[str | Tuple[str]], phase_name: str) List[Dict[Symbol, float]]#
espei.error_functions.non_equilibrium_thermochemical_error.get_thermochemical_data(dbf, comps, phases, datasets, model=None, weight_dict=None, symbols_to_fit=None)#
Parameters:
  • dbf (pycalphad.Database) – Database to consider

  • comps (list) – List of active component names

  • phases (list) – List of phases to consider

  • datasets (espei.utils.PickleableTinyDB) – Datasets that contain single phase data

  • model (Optional[Dict[str, Type[Model]]]) – Dictionary phase names to pycalphad Model classes.

  • weight_dict (dict) – Dictionary of weights for each data type, e.g. {‘HM’: 200, ‘SM’: 2}

  • symbols_to_fit (list) – Parameters to fit. Used to build the models and PhaseRecords.

Returns:

List of data dictionaries to iterate over

Return type:

list

espei.error_functions.residual_base module#

class espei.error_functions.residual_base.ResidualFunction(database: Database, datasets: PickleableTinyDB, phase_models: PhaseModelSpecification, symbols_to_fit: List[SymbolName] | None = None, weight: Dict[str, float] | None = None)#

Bases: Protocol

Protocol class for computing the error (residual) between data and a model prediction given a set of parameters.

Classes that implement this protocol typically will concerned with implementing a residual/likelihood function for a certain type of data. The protocol is intentitionally left to be simple to implement while enabling flexibility to implement additional methods and attributes for performance and debugging purposes.

Parameters:
  • database (Database) –

  • datasets (PickleableTinyDB) – The candidate datasets for the a contribution. Usually the datasets are a superset (in components, phases, data types, etc.) of the actual data used to compute the residual for any particular contribution.

  • phase_models (PhaseModels) – Defines the active set of components that should be fit and any user-provided overrides to the PyCalphad Model class for each phase.

  • symbols_to_fit (Optional[List[SymbolName]]) – User-provided symbols to fit. By default, the symbols to fit should be set by espei.utils.database_symbols_to_fit.

  • weight (Optional[float]) – When computing the likelihood, this should be used to modify the probability distribution. Higher weights should correspond to narrower probability distributions, but it’s exact use will depend on the particular probability distribution.

weight#
Type:

Optional[Union[float, Dict[str, float]]]

get_likelihood(parameters) float#

Return log-likelihood for the set of parameters.

Parameters:

parameters (ArrayLike) – 1D parameter vector. The size of the parameters array should match the number of fitting symbols used to build the models. This is _not_ checked.

Returns:

Value of log-likelihood for the given set of parameters

Return type:

float

get_residuals(parameters: Buffer | _SupportsArray[dtype[Any]] | _NestedSequence[_SupportsArray[dtype[Any]]] | bool | int | float | complex | str | bytes | _NestedSequence[bool | int | float | complex | str | bytes]) Tuple[List[float], List[float]]#

Return the residual comparing the selected data to the set of parameters.

The residual is zero if the database predictions under the given parameters agrees with the data exactly.

Parameters:

parameters (ArrayLike) – 1D parameter vector. The size of the parameters array should match the number of fitting symbols used to build the models. This is _not_ checked.

Returns:

Tuple of (residuals, weights), which must obey len(residuals) == len(weights)

Return type:

Tuple[List[float], List[float]]

class espei.error_functions.residual_base.ResidualRegistry#

Bases: object

get_registered_residual_functions() List[Type[ResidualFunction]]#
register(residual_function: Type[ResidualFunction])#

espei.error_functions.zpf_error module#

Calculate driving_force due to ZPF tielines.

The general approach is similar to the PanOptimizer rough search method.

  1. With all phases active, calculate the chemical potentials of the tieline endpoints via equilibrium calls. Done in estimate_hyperplane.

  2. Calculate the target chemical potentials, which are the average chemical potentials of all of the current chemical potentials at the tieline endpoints.

  3. Calculate the current chemical potentials of the desired single phases

  4. The error is the difference between these chemical potentials

There’s some special handling for tieline endpoints where we do not know the composition conditions to calculate chemical potentials at.

class espei.error_functions.zpf_error.PhaseRegion(hyperplane_vertices: Sequence[espei.error_functions.zpf_error.RegionVertex], vertices: Sequence[espei.error_functions.zpf_error.RegionVertex], potential_conds: Dict[pycalphad.variables.StateVariable, float], species: Sequence[pycalphad.variables.Species], phases: Sequence[str], models: Dict[str, pycalphad.model.Model])#

Bases: object

eq_str()#
hyperplane_vertices: Sequence[RegionVertex]#
models: Dict[str, Model]#
phases: Sequence[str]#
potential_conds: Dict[StateVariable, float]#
species: Sequence[Species]#
vertices: Sequence[RegionVertex]#
class espei.error_functions.zpf_error.RegionVertex(phase_name: str, composition: collections.abc.Buffer | numpy._typing._array_like._SupportsArray[numpy.dtype[Any]] | numpy._typing._nested_sequence._NestedSequence[numpy._typing._array_like._SupportsArray[numpy.dtype[Any]]] | bool | int | float | complex | str | bytes | numpy._typing._nested_sequence._NestedSequence[bool | int | float | complex | str | bytes], comp_conds: Dict[pycalphad.variables.MoleFraction, float], points: collections.abc.Buffer | numpy._typing._array_like._SupportsArray[numpy.dtype[Any]] | numpy._typing._nested_sequence._NestedSequence[numpy._typing._array_like._SupportsArray[numpy.dtype[Any]]] | bool | int | float | complex | str | bytes | numpy._typing._nested_sequence._NestedSequence[bool | int | float | complex | str | bytes], phase_records: Dict[str, pycalphad.core.phase_rec.PhaseRecord], is_disordered: bool, has_missing_comp_cond: bool)#

Bases: object

comp_conds: Dict[MoleFraction, float]#
composition: Buffer | _SupportsArray[dtype[Any]] | _NestedSequence[_SupportsArray[dtype[Any]]] | bool | int | float | complex | str | bytes | _NestedSequence[bool | int | float | complex | str | bytes]#
has_missing_comp_cond: bool#
is_disordered: bool#
phase_name: str#
phase_records: Dict[str, PhaseRecord]#
points: Buffer | _SupportsArray[dtype[Any]] | _NestedSequence[_SupportsArray[dtype[Any]]] | bool | int | float | complex | str | bytes | _NestedSequence[bool | int | float | complex | str | bytes]#
class espei.error_functions.zpf_error.ZPFResidual(database: Database, datasets: PickleableTinyDB, phase_models: PhaseModelSpecification | None, symbols_to_fit: List[SymbolName] | None = None, weight: Dict[str, float] | None = None)#

Bases: ResidualFunction

get_likelihood(parameters) float#

Return log-likelihood for the set of parameters.

Parameters:

parameters (ArrayLike) – 1D parameter vector. The size of the parameters array should match the number of fitting symbols used to build the models. This is _not_ checked.

Returns:

Value of log-likelihood for the given set of parameters

Return type:

float

get_residuals(parameters: Buffer | _SupportsArray[dtype[Any]] | _NestedSequence[_SupportsArray[dtype[Any]]] | bool | int | float | complex | str | bytes | _NestedSequence[bool | int | float | complex | str | bytes]) Tuple[List[float], List[float]]#

Return the residual comparing the selected data to the set of parameters.

The residual is zero if the database predictions under the given parameters agrees with the data exactly.

Parameters:

parameters (ArrayLike) – 1D parameter vector. The size of the parameters array should match the number of fitting symbols used to build the models. This is _not_ checked.

Returns:

Tuple of (residuals, weights), which must obey len(residuals) == len(weights)

Return type:

Tuple[List[float], List[float]]

espei.error_functions.zpf_error.calculate_zpf_driving_forces(zpf_data: Sequence[Dict[str, Any]], parameters: Buffer | _SupportsArray[dtype[Any]] | _NestedSequence[_SupportsArray[dtype[Any]]] | bool | int | float | complex | str | bytes | _NestedSequence[bool | int | float | complex | str | bytes] = None, approximate_equilibrium: bool = False, short_circuit: bool = False) Tuple[List[List[float]], List[List[float]]]#

Calculate error due to phase equilibria data

zpf_dataSequence[Dict[str, Any]]

Datasets that contain single phase data

parametersArrayLike

Array of parameters to calculate the error with.

approximate_equilibriumbool

Whether or not to use an approximate version of equilibrium that does not refine the solution and uses starting_point instead.

short_circuit: bool

If True, immediately return a size 1 array with a driving force of np.nan (failed hyperplane) or np.inf (failed driving force). Can save computational time if the caller will aggregate driving forces.

Returns:

Driving forces and weights as ragged 2D arrays with shape (len(zpf_data), len(vertices in each zpf_data))

Return type:

Tuple[List[List[float]], List[List[float]]]

Notes

The physical picture of the standard deviation is that we’ve measured a ZPF line. That line corresponds to some equilibrium chemical potentials. The standard deviation is the standard deviation of those ‘measured’ chemical potentials.

espei.error_functions.zpf_error.calculate_zpf_error(zpf_data: Sequence[Dict[str, Any]], parameters: ndarray = None, data_weight: int = 1.0, approximate_equilibrium: bool = False) float#

Calculate the likelihood due to phase equilibria data.

For detailed documentation, see calculate_zpf_driving_forces

Returns:

Log probability of ZPF driving forces

Return type:

float

espei.error_functions.zpf_error.driving_force_to_hyperplane(target_hyperplane_chempots: ndarray, phase_region: PhaseRegion, vertex: RegionVertex, parameters: ndarray, approximate_equilibrium: bool = False) float#

Calculate the integrated driving force between the current hyperplane and target hyperplane.

espei.error_functions.zpf_error.estimate_hyperplane(phase_region: PhaseRegion, parameters: ndarray, approximate_equilibrium: bool = False) ndarray#

Calculate the chemical potentials for the target hyperplane, one vertex at a time

Notes

This takes just one set of phase equilibria, a phase region, e.g. a dataset point of [[‘FCC_A1’, [‘CU’], [0.1]], [‘LAVES_C15’, [‘CU’], [0.3]]] and calculates the chemical potentials given all the phases possible at the given compositions. Then the average chemical potentials of each end point are taken as the target hyperplane for the given equilibria.

espei.error_functions.zpf_error.get_zpf_data(dbf: Database, comps: Sequence[str], phases: Sequence[str], datasets: PickleableTinyDB, parameters: Dict[str, float], model: Dict[str, Type[Model]] | None = None)#

Return the ZPF data used in the calculation of ZPF error

Parameters:
  • comps (list) – List of active component names

  • phases (list) – List of phases to consider

  • datasets (espei.utils.PickleableTinyDB) – Datasets that contain single phase data

  • parameters (dict) – Dictionary mapping symbols to optimize to their initial values

  • model (Optional[Dict[str, Type[Model]]]) – Dictionary phase names to pycalphad Model classes.

Returns:

List of data dictionaries with keys weight, phase_regions and dataset_references.

Return type:

list

Module contents#

Functions for calculating error.