Loss modelling framework.
Fatal Python error: PyGILState_Release when using remote storage; the issue comes from pyarrow==14.x.x. model_storage.listdir() is called to check the connection before continuing with the execution.
model_storage_config_fp is skipped if the file already exists and is the same.
The new static method getmodel/footprint.py::Footprint::get_footprint_fmt_priorities() can now be called from execution/bin.py::set_footprint_set() to get a list of footprint file format priorities. This removes the duplicate definition in that function. As before, the priority order is defined in getmodel/common.py.
Adjust gulmc to take into account cases where the order of the loaded vulnerability table is changed (in particular when loading with parquet). This fix uses the same solution as in gulpy: when loading the mapping between areaperil and vulnerability, the vulnerability id is used first. Once the vulnerability table is loaded and each index is known, the mapping is updated to point to the index directly (in the preparation phase), removing this lookup from the main loop.
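A minimal sketch of the remapping described above (all names and values are illustrative, not gulmc's actual data structures):

```python
# The areaperil -> vulnerability mapping is first keyed by vulnerability id;
# once the vulnerability table is loaded and row positions are known, the
# mapping is rewritten to point at table indices directly.
areaperil_to_vuln_id = {101: 7, 102: 9}   # built while reading the keys
vuln_id_to_index = {7: 0, 9: 1}           # known after loading the table
areaperil_to_vuln_index = {
    ap: vuln_id_to_index[vid] for ap, vid in areaperil_to_vuln_id.items()
}
# The main loop can now index the vulnerability table without a second lookup.
```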
Needed for https://github.com/OasisLMF/OasisPlatform/pull/994, otherwise running a keys lookup with multiprocessing will throw an exception:
[2024-03-14 12:19:54,399: ERROR/ForkPoolWorker-1] generate_input[d60b57a2-f8f9-4794-8f0a-831380a44ea0]: daemonic processes are not allowed to have children
See: https://github.com/OasisLMF/ktools/releases/tag/v3.12.1
Tests were added to CI to test recent financial module features including support for account level participation (only) and handling of duplicate locations in OED with one blank CondTag.
In the FM files fm_policytc.csv and fm_profile.csv, the field name policytc_id has been replaced by profile_id. This brings these files in line with those of RI.
Symlinks using the new pytest package are broken for some tests in test_generate_losses.py. Fixed by running all of these checks in a temporary model_run_dir.
Pandas 3 will bring several changes to the way pandas behaves. Pandas provides two options that can be set to mimic the future behaviour. This PR makes sure all tests pass when the following options are set: pd.options.mode.copy_on_write = True and pd.options.future.infer_string = True. (Setting the options is not part of the PR; they were just defined when testing locally.)
loc_id should exist in either keys.csv or keys_error.csv. This fix makes sure a header is written in keys.csv even if the first block of results sent to the keys writer is all failing.
footprint_set and vulnerability_set now correctly handle data provided in parquet format
Apply vulnerability filter before retrieving the vulnerability parquet to reduce memory usage and reading time
The correlation value for either damage or hazard is now optional and defaults to zero if not entered.
With OasisPlatform 2.3.0, the v2 endpoints can support both execution workflows (single server or distributed). Fix the OasisAPI client to check for the new run_mode={v1|v2} while waiting for an analysis to complete.
Support for concurrent net and gross reinsurance output streams has been introduced to fmpy. This change allows the user to request output at intermediate inuring priorities. This is facilitated by branching off gross losses at every requested inuring priority, establishing new streams. Requested reinsurance summaries are extracted from these streams.
When only aalcalcmeanonly output is requested and an identifier is used to identify the occurrence file to be used, a symbolic link to that file is created in the run's static directory. This fixes an issue where the symbolic link was not created in the aforementioned scenario.
Make sure the information in the account file is merged even if no financial terms are present.
Completed all units in validation/insurance_policy_coverages
Update the KeyLookupInterface class to have access to the lookup_complex_config_json.
Use by setting oasislmf api run --server-version v1 or oasislmf api run --server-version v2.
Add support for AccParticipation at all account levels. This introduces new calcrules where a share term is positive for each direct calcrule. These "duplicated" calcrules have an id corresponding to their no-share-term calcrule plus 100 (e.g. deductible and limit, id 1 => deductible, limit and share, id 101). Note that calcrules with the same terms can have different ids depending on whether they are performed at "direct" levels or "direct layer" levels, because at "direct" levels the share is applied on top of a policy that may have to keep track of deductible, underlimit and overlimit.
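The "+100" id convention described above can be sketched as follows (a toy illustration of the naming rule, not the actual calcrule table):

```python
SHARE_OFFSET = 100  # share-term variant id = base direct calcrule id + 100

def share_variant_id(base_calcrule_id):
    """Id of the calcrule combining the base terms with a positive share term."""
    return base_calcrule_id + SHARE_OFFSET

# e.g. id 1 (deductible and limit) -> id 101 (deductible, limit and share)
```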
Use the category dtype for peril_id when reading keys.csv. Use the index directly when creating the fm_xref file.
The analysis settings can contain a reference to a .csv file containing the changes or, directly, the necessary changes. If they do, when the specific vulnerability ids are loaded, they will be taken from the replacements file (if present there) and not from the vulnerability file.
When account level aggregation is performed but there are no terms, some needed columns were not taken from the account file, leading to an error in get_xref_df: KeyError: "['acc_idx', 'PolNumber'] not in index". This fixes the issue by using all useful columns when the account file is merged.
Add a missing calcrule for when Account Participation is the only financial term at account level.
By specifying adjustments to specific vulnerabilities in the analysis settings, adjustments can be applied to the probabilities of those vulnerabilities.
Also, to be able to run our tests using exposure run, perils need to be taken from LocPerilsCovered. This adds an option to use LocPerilsCovered for the peril id and to use only certain perils. During an exposure run, the perils used were determined based on num_subperils and their ids were 1 to num_subperils. With this change, the user can specify the perils covered by the deterministic model via --model-perils-covered; if nothing is given, all perils in LocPerilsCovered will be attributed a key and will receive a loss from the model.
It is also now possible to specify extra summary columns so they can be seen in the loss summary at the end of an exposure run using --extra-summary-cols.
Example:
oasislmf exposure run -s ~/test/peril_test -r ~/OasisLMF/runs/peril_test --extra-summary-cols peril_id --model-perils-covered WTC
Add the possibility to use a franchise deductible without an associated limit.
Choosing --verbose when running oasislmf will cause ods_tools logs at level DEBUG and above to be seen in the output.
v2: gulmc defaults to True; modelpy and gulpy default to False.
This adds the option to load model_data files from a remote object store like S3 or Azure Blob storage.
File access is configured via a file named model_storage.json
{
"storage_class": "oasis_data_manager.filestore.backends.aws_storage.AwsObjectStore",
"options": {
"bucket_name": "oasislmf-model-library-oasis-piwind",
"access_key": "<aws-s3-key-name>",
"secret_key": "<aws-s3-key-secret>",
"root_dir": "model_data/"
}
}
This file is then referenced from the oasislmf configuration, e.g. oasislmf.json:
{
"model_storage_json": "model_storage.json",
"analysis_settings_json": "analysis_settings.json",
"lookup_config_json": "keys_data/PiWind/lookup_config.json",
"lookup_data_dir": "keys_data/PiWind"
}
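As a rough sketch, the storage_class value in model_storage.json is a dotted path that splits into a module path and a class name; the loading mechanics below are illustrative only, not oasislmf's actual implementation:

```python
import json

cfg = json.loads("""
{
  "storage_class": "oasis_data_manager.filestore.backends.aws_storage.AwsObjectStore",
  "options": {"bucket_name": "oasislmf-model-library-oasis-piwind", "root_dir": "model_data/"}
}
""")
module_path, _, class_name = cfg["storage_class"].rpartition(".")
# importlib.import_module(module_path) would then expose class_name,
# which is constructed with **cfg["options"].
```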
oasislmf package documentation (PR #1320). This PR fixes #1249 by revamping the oasislmf package documentation.
The complete documentation of the full Python API of oasislmf is automatically generated using sphinx-autoapi. There is no need to manually update the docs pages whenever the oasislmf package is updated: sphinx-autoapi dynamically finds the changes and generates the docs for the latest oasislmf version.
The documentation is built using the build-docs.yml GH action workflow on all PRs targeting main, and is built & deployed to the gh-pages branch for all commits on main.
In order to save a bit of memory, delete DataFrames that are no longer used and collect their memory.
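The pattern is the standard Python one; a minimal stdlib-only illustration (a plain list stands in for a large DataFrame):

```python
import gc

big = list(range(1_000_000))   # stand-in for a large DataFrame
total = sum(big)               # ... last use of the object ...
del big                        # drop the reference so the object is collectible
gc.collect()                   # reclaim the memory now rather than later
```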
Making changes to the global variable key_columns, which is a list of location file columns used in the lookup process, can lead to errors. As the variable is only used in the method builtin.py::Lookup::process_locations, it can be defined locally in that method instead.
If both a complex model config and a model config are present, add the JSON dict from the complex config into the model config as below:
config['complex_config_dir'] = complex_config_dir
config['complex_config'] = complex_config
Work is in progress to have peril columns such as LocPerilsCovered, LocPeril, ... supported in oasislmf. This change aims at changing all perils to AA1 as they represent generic tests. Some more tests specific to perils covered will be added later on with the feature.
Also improve the split/combine scripts used to add FM unit tests by adding support for reinsurance files.
https://github.com/OasisLMF/OasisLMF/issues/1322
To enable the storage of footprints in multiple files rather than a single master file, optional identifiers in the form of footprint file suffixes are now supported. This is executed in a similar way to that currently in place to distinguish multiple events and event occurrences files. The footprint_set model settings option in the analysis settings file can be set to the desired file suffix for the footprint files to be used. A symbolic link to the desired footprint set is created in the static/ directory within the model run directory. Footprint file priorities are identical to those set by modelpy and gulmc, which in order of descending priority are: parquet, zipped binary, binary and csv.
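The descending priority order can be sketched as a simple selection (the priority list matches the text above; the function itself is illustrative, not the oasislmf implementation):

```python
FOOTPRINT_FMT_PRIORITIES = ["parquet", "zipped binary", "binary", "csv"]

def pick_footprint_format(available_formats):
    """Return the highest-priority footprint format that is available."""
    for fmt in FOOTPRINT_FMT_PRIORITIES:
        if fmt in available_formats:
            return fmt
    raise FileNotFoundError("no footprint file in a supported format")
```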
This PR adds extensive documentation about gulmc.
The document has been updated to reflect the recently added financial fields that are supported. In addition, a "Version introduced" field has been included to identify the version of OasisLMF in which each field was first supported, if later than v1.15 LTS.
The requirement for an amplifications file generated by the MDK as a trigger for the execution of Post Loss Amplification (PLA) has been replaced with the pla flag in the analysis settings file. This allows a user to enable or disable (default) the PLA component plapy.
Additionally, a secondary factor in the range [0, 1] can be specified from the command line with the argument -f when running plapy:
$ plapy -f 0.8 < gul_output.bin > plapy_output.bin
The secondary factor is applied to the deviation of the loss factor from 1. For example:
| event_id | factor from model | relative factor from user | applied factor |
|---|---|---|---|
| 1 | 1.10 | 0.8 | 1.08 |
| 2 | 1.20 | 0.8 | 1.16 |
| 3 | 1.00 | 0.8 | 1.00 |
| 4 | 0.90 | 0.8 | 0.92 |
Finally, an absolute, uniform, positive amplification/reduction factor can be specified from the command line with the argument -F:
$ plapy -F 0.8 < gul_output.bin > plapy_output.bin
This factor is applied to all losses, thus loss factors from the model (those in lossfactors.bin) are ignored. For example:
| event_id | factor from model | uniform factor from user | applied factor |
|---|---|---|---|
| 1 | 1.10 | 0.8 | 0.8 |
| 2 | 1.20 | 0.8 | 0.8 |
| 3 | 1.00 | 0.8 | 0.8 |
| 4 | 0.90 | 0.8 | 0.8 |
The absolute, uniform factor is incompatible with the relative, secondary factor. Therefore, if both are given by the user, a warning is logged and the secondary factor is ignored.
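The two factor modes in the tables above reduce to the following arithmetic (a sketch of the rules as described, not plapy's actual code):

```python
def relative_applied_factor(model_factor, secondary):
    """-f: scale the deviation of the model loss factor from 1."""
    return 1.0 + secondary * (model_factor - 1.0)

def uniform_applied_factor(model_factor, uniform):
    """-F: ignore the model loss factor and apply the uniform factor."""
    return uniform

# e.g. model factor 1.10 with secondary 0.8 gives 1.08; with uniform 0.8 gives 0.8
```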
Model vendors can supply a custom Python module that will be run after the analysis has completed. This module will have access to the run directory, model data directory and analysis settings. It could for instance modify the output files, parse logs to produce user-friendly reports or generate plots.
The two new Oasis settings required to use this feature are similar to the ones used for the pre-analysis hook.
post_analysis_module: Path to the Python module containing the class.
post_analysis_class_name: Name of the class.
The class must have a constructor that takes kwargs model_data_dir, model_run_dir and analysis_settings_json, plus a run method with no arguments. For example:
class MyPostAnalysis:
    def __init__(self, model_data_dir=None, model_run_dir=None, analysis_settings_json=None):
        self.model_data_dir = model_data_dir
        self.model_run_dir = model_run_dir
        self.analysis_settings_json = analysis_settings_json

    def run(self):
        pass  # do something
The TIV calculated in the output summaries was incorrect because the granularity changed after the implementation of stochastic disaggregation (when NumberOfBuildings > 1). Only 'loc_id' and 'coverage_type_id' were taken into account when detecting duplicates, leading to a lower TIV than it should be. With this change, 'building_id' and 'risk_id' are added to the summary_map, and building_id is added to the key used to detect duplicates when calculating the TIV.
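The effect of adding building_id to the duplicate-detection key can be shown with a toy example (a plain-Python sketch; field names follow the entry, the data is invented):

```python
rows = [
    {"loc_id": 1, "coverage_type_id": 1, "building_id": 1, "tiv": 100.0},
    {"loc_id": 1, "coverage_type_id": 1, "building_id": 2, "tiv": 100.0},
]

def total_tiv(rows, key_fields):
    seen, total = set(), 0.0
    for row in rows:
        key = tuple(row[f] for f in key_fields)
        if key not in seen:           # duplicate detection on the chosen key
            seen.add(key)
            total += row["tiv"]
    return total

# The old key treats the second building as a duplicate; the new key keeps both.
old = total_tiv(rows, ("loc_id", "coverage_type_id"))                 # 100.0
new = total_tiv(rows, ("loc_id", "coverage_type_id", "building_id"))  # 200.0
```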
The new ktools component aalcalcmeanonly (see PR https://github.com/OasisLMF/ktools/pull/357) calculates the overall average period loss but does not include the standard deviation. As a result, it has a faster execution time and uses less memory than aalcalc.
Support for executing this component as part of a model run has been introduced through the aalcalc_meanonly (legacy output) and alt_meanonly (ORD output) flags in the analysis settings file.
Summary info files are now written in the same format as the ORD output reports. Therefore, should a user request ORD output reports in parquet format, the summary info files will also be in parquet format.
The data type for vulnerability weights that are read from the binary file weights.bin by gulmc has been changed from 32-bit integer to 32-bit float.
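Concretely, a weight record is now unpacked with a float field rather than an int field. The record layout below is an assumption for illustration only (stdlib struct):

```python
import struct

# Assumed record layout (illustrative): areaperil_id (int32),
# vulnerability_id (int32), weight.
# Old reading: weight as 32-bit int ("<iii"); new: weight as 32-bit float ("<iif").
record = struct.pack("<iif", 1, 10, 0.25)
areaperil_id, vulnerability_id, weight = struct.unpack("<iif", record)
```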
If supported OED versions are reported in the model settings, exposure files are converted to the latest compatible OED version before running the model.
The ktools component summarycalc does not output zero loss events by default. These zero loss events are required when net loss is calculated in fmpy. Currently, net loss is calculated in all reinsurance instances, so the -z flag has been assigned to all executions of summarycalc when computing reinsurance losses.
The function str2bool(var) converts "False" (str) to False (bool) but is not correctly called from the oasislmf.json file.
So setting a boolean flag with:
{
    "do_disaggregation": "False"
}
evaluates to True because the type is str and not bool:
> self.do_disaggregation
'False'
> bool(self.do_disaggregation)
True
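A minimal str2bool-style helper illustrating the distinction (a sketch only; the real oasislmf helper may differ in accepted spellings):

```python
def str2bool(var):
    """Interpret common string spellings of booleans; pass booleans through."""
    if isinstance(var, bool):
        return var
    return str(var).strip().lower() in ("true", "1", "yes")

# bool("False") is True (any non-empty string is truthy),
# but str2bool("False") is False.
```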
The default model correlation factors can be overwritten by specific "correlation_settings" added to the analysis settings file.
Fix issue where CondTag was needed in the location file if it was present in the account file, making users have to add an empty CondTag column.
If "vulnerability_set" contains an identifier, the corresponding vulnerability file will be used.