Tools

Some tools are provided with the library.

Benchmarking

Benchmarking tools for testing and comparing outputs between different files. Some of these functions are also used for testing.

pysd.tools.benchmarking.assert_allclose(x, y, rtol=1e-05, atol=1e-05)

Asserts that all numeric values in two arrays are close.

Parameters:
  • x (ndarray) – Expected value.
  • y (ndarray) – Actual value.
  • rtol (float (optional)) – Relative tolerance on the error. Default is 1.e-5.
  • atol (float (optional)) – Absolute tolerance on the error. Default is 1.e-5.
Return type:

None
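
As a rough illustration of how rtol and atol combine, the sketch below assumes the usual NumPy-style criterion |x - y| <= atol + rtol * |y|. The text above does not state the exact formula, so treat this as an assumption rather than PySD's implementation; the helper name is_close is hypothetical.

```python
def is_close(x, y, rtol=1e-5, atol=1e-5):
    """Hypothetical helper: NumPy-style closeness test (assumed, not PySD's code)."""
    # The allowed error grows with the magnitude of y (relative part)
    # and never drops below atol (absolute part).
    return abs(x - y) <= atol + rtol * abs(y)


# 110 vs 100 with a 20 % relative tolerance: the error is 10, the limit about 20.
print(is_close(110, 100, rtol=0.2))
# 150 vs 100 with the same tolerance: the error is 50, above the limit.
print(is_close(150, 100, rtol=0.2))
# Near zero the relative part vanishes and only atol applies.
print(is_close(5e-6, 0.0))
```

Under this assumption, the absolute tolerance matters mainly for values close to zero, where a purely relative test would reject almost any difference.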

pysd.tools.benchmarking.assert_frames_close(actual, expected, assertion='raise', verbose=False, precision=2, **kwargs)

Compare DataFrame items by column and raise AssertionError if any column is not close.

Ordering of columns is unimportant, items are compared only by label. NaN and infinite values are supported.

Parameters:
  • actual (pandas.DataFrame) – Actual value from the model output.
  • expected (pandas.DataFrame) – Expected model output.
  • assertion (str (optional)) – Set to “raise” to raise an error when the two frames cannot be asserted to be close; otherwise a warning message is shown instead. Default is “raise”.
  • verbose (bool (optional)) – If True and any column is not close, the actual and expected values are printed in the error/warning message together with their difference. Default is False.
  • precision (int (optional)) – Precision used to print numerical values in the verbose assertion message. Default is 2.
  • kwargs – Optional rtol and atol values for assert_allclose.

Examples

>>> import pandas as pd
>>> from pysd.tools.benchmarking import assert_frames_close
>>> assert_frames_close(
...     pd.DataFrame(100, index=range(5), columns=range(3)),
...     pd.DataFrame(100, index=range(5), columns=range(3)))
>>> assert_frames_close(
...     pd.DataFrame(100, index=range(5), columns=range(3)),
...     pd.DataFrame(110, index=range(5), columns=range(3)),
...     rtol=.2)
>>> assert_frames_close(
...     pd.DataFrame(100, index=range(5), columns=range(3)),
...     pd.DataFrame(150, index=range(5), columns=range(3)),
...     rtol=.2)  # doctest: +IGNORE_EXCEPTION_DETAIL
Traceback (most recent call last):
...
AssertionError:
Following columns are not close:
    '0'
>>> assert_frames_close(
...     pd.DataFrame(100, index=range(5), columns=range(3)),
...     pd.DataFrame(150, index=range(5), columns=range(3)),
...     verbose=True, rtol=.2)  # doctest: +IGNORE_EXCEPTION_DETAIL
Traceback (most recent call last):
...
AssertionError:
Following columns are not close:
    '0'
Column '0' is not close.
Expected values:
    [150, 150, 150, 150, 150]
Actual values:
    [100, 100, 100, 100, 100]
Difference:
    [50, 50, 50, 50, 50]
>>> assert_frames_close(
...     pd.DataFrame(100, index=range(5), columns=range(3)),
...     pd.DataFrame(150, index=range(5), columns=range(3)),
...     rtol=.2, assertion="warn")
...
UserWarning:
Following columns are not close:
    '0'

References

Derived from:
http://nbviewer.jupyter.org/gist/jiffyclub/ac2e7506428d5e1d587b

pysd.tools.benchmarking.detect_encoding(filename)

Detects the encoding of a file.

Parameters:
  • filename (str) – Name of the file whose encoding is to be detected.
Returns:

encoding – The encoding of the file.

Return type:

str

pysd.tools.benchmarking.load_outputs(file_name, transpose=False, columns=None, encoding=None)

Load an outputs file.

Parameters:
  • file_name (str) – Output file to read. Must be csv or tab.
  • transpose (bool (optional)) – If True, reads a transposed outputs file, i.e. one variable per row. Default is False.
  • columns (list or None (optional)) – List of the column names to load. If None, all columns are loaded. Default is None. NOTE: if transpose=False, loading is faster because only the selected columns are read. If transpose=True, the whole file must be read and the columns are selected afterwards.
  • encoding (str or None (optional)) – Encoding used to read the output file. Needed if the file has special characters. Default is None.
Returns:

A pandas.DataFrame with the outputs values.

Return type:

pandas.DataFrame
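
To make the transpose option concrete: a regular outputs file has one variable per column, whereas a transposed file has one variable per row. The sketch below uses only the standard library to show how a transposed table maps back to the regular layout. The sample data and variable names are made up, and this is not how load_outputs is implemented.

```python
import csv
import io

# Hypothetical transposed outputs file: one variable per row,
# with the time values spread along the first row.
transposed_text = "Time,0,1,2\nStock,10,12,14\nFlow,2,2,2\n"

rows = list(csv.reader(io.StringIO(transposed_text)))

# zip(*rows) swaps rows and columns, giving one variable per column.
regular = [list(col) for col in zip(*rows)]

# First row holds the variable names, the rest one record per time step.
header, records = regular[0], regular[1:]
print(header)
print(records)
```

Because every row has to be available before the table can be flipped, this also hints at why column selection on a transposed file can only happen after the whole file has been read.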

pysd.tools.benchmarking.runner(model_file, canonical_file=None, transpose=False)

Translates and runs a model and returns its output and the canonical output.

Parameters:
  • model_file (str) – Name of the original model file. Must be ‘.mdl’ or ‘.xmile’.
  • canonical_file (str or None (optional)) – Canonical output file to read. If None, will search for ‘output.csv’ and ‘output.tab’ in the model directory. Default is None.
  • transpose (bool (optional)) – If True, reads a transposed canonical file, i.e. one variable per row. Default is False.
Returns:

output, canon – pandas.DataFrame of the model output and the canonical output.

Return type:

(pandas.DataFrame, pandas.DataFrame)
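
The functions on this page are naturally used together: runner produces the model output and the canonical output, and assert_frames_close compares them. A minimal sketch of that workflow follows, assuming the signatures documented above; the helper name check_model, its rtol default, and the example path are hypothetical.

```python
def check_model(model_file, canonical_file=None, rtol=0.05):
    """Hypothetical helper: run a model and compare it with its canonical output."""
    # Imported inside the function so that defining the helper
    # does not itself require PySD to be installed.
    from pysd.tools.benchmarking import assert_frames_close, runner

    # runner returns (model output, canonical output) as two DataFrames.
    output, canon = runner(model_file, canonical_file=canonical_file)

    # rtol is forwarded to assert_allclose through **kwargs.
    assert_frames_close(output, canon, rtol=rtol)
    return output


# Example call (the path is a placeholder, not a real file):
# check_model("path/to/model.mdl")
```

Passing assertion="warn" to assert_frames_close instead would report mismatching columns as a warning rather than stopping at the first failing model.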