Accelerating the computation of Shapley effects for datasets with many observations
<?xml version="1.0" encoding="UTF-8"?><collection xmlns="http://www.loc.gov/MARC21/slim" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.loc.gov/MARC21/slim http://www.loc.gov/standards/marcxml/schema/MARC21slim.xsd">
<record>
<leader>00000cab a2200000 4500</leader>
<controlfield tag="001">MAP20260002972</controlfield>
<controlfield tag="003">MAP</controlfield>
<controlfield tag="005">20260211184612.0</controlfield>
<controlfield tag="008">260206e20250811che|||p |0|||b|eng d</controlfield>
<datafield tag="040" ind1=" " ind2=" ">
<subfield code="a">MAP</subfield>
<subfield code="b">spa</subfield>
<subfield code="d">MAP</subfield>
</datafield>
<datafield tag="084" ind1=" " ind2=" ">
<subfield code="a">6</subfield>
</datafield>
<datafield tag="100" ind1=" " ind2=" ">
<subfield code="0">MAPA20260002330</subfield>
<subfield code="a">Rabitti, Giovanni</subfield>
</datafield>
<datafield tag="245" ind1="1" ind2="0">
<subfield code="a">Accelerating the computation of Shapley effects for datasets with many observations</subfield>
<subfield code="c">Giovanni Rabitti and George Tzougas</subfield>
</datafield>
<datafield tag="520" ind1=" " ind2=" ">
<subfield code="a">The document presents a strategy to accelerate the computation of Shapley effects, a sensitivity-analysis method used to identify the importance of risk factors in actuarial models. The traditional procedure becomes computationally expensive when dealing with large datasets. The authors propose reducing the sample size using techniques such as Latin Hypercube Sampling, Conditional Latin Hypercube Sampling, and Hierarchical k-means, selecting representative observations while preserving calculation accuracy. They apply this methodology to the well-known French automobile claim-frequency dataset, demonstrating drastic reductions in computation time with minimal loss of precision. The study concludes that this approach enables efficient estimation of Shapley effects even in big-data contexts, providing a relevant advancement for actuarial modeling and insurance risk analysis.</subfield>
</datafield>
<datafield tag="650" ind1=" " ind2="4">
<subfield code="0">MAPA20140022717</subfield>
<subfield code="a">Big data</subfield>
</datafield>
<datafield tag="650" ind1=" " ind2="4">
<subfield code="0">MAPA20080592011</subfield>
<subfield code="a">Modelos actuariales</subfield>
</datafield>
<datafield tag="650" ind1=" " ind2="4">
<subfield code="0">MAPA20140007837</subfield>
<subfield code="a">Clusters</subfield>
</datafield>
<datafield tag="650" ind1=" " ind2="4">
<subfield code="0">MAPA20080570651</subfield>
<subfield code="a">Siniestralidad</subfield>
</datafield>
<datafield tag="650" ind1=" " ind2="4">
<subfield code="0">MAPA20170005476</subfield>
<subfield code="a">Machine learning</subfield>
</datafield>
<datafield tag="700" ind1="1" ind2=" ">
<subfield code="0">MAPA20140009800</subfield>
<subfield code="a">Tzougas, George</subfield>
</datafield>
<datafield tag="710" ind1="2" ind2=" ">
<subfield code="0">MAPA20200009078</subfield>
<subfield code="a">Springer Nature</subfield>
</datafield>
<datafield tag="773" ind1="0" ind2=" ">
<subfield code="w">MAP20220007085</subfield>
<subfield code="g">11/08/2025 Volume 15 - Number 2 - August 2025, p. 885 - 898</subfield>
<subfield code="t">European Actuarial Journal</subfield>
<subfield code="d">Cham, Switzerland : Springer Nature Switzerland AG, 2021-2022</subfield>
</datafield>
</record>
</collection>