<?xml version="1.0" encoding="UTF-8"?><collection xmlns="http://www.loc.gov/MARC21/slim" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.loc.gov/MARC21/slim http://www.loc.gov/standards/marcxml/schema/MARC21slim.xsd">
<record>
<leader>00000cab a2200000 4500</leader>
<controlfield tag="001">MAP20220008020</controlfield>
<controlfield tag="003">MAP</controlfield>
<controlfield tag="005">20220911210945.0</controlfield>
<controlfield tag="008">220310e20211206esp|||p |0|||b|spa d</controlfield>
<datafield tag="040" ind1=" " ind2=" ">
<subfield code="a">MAP</subfield>
<subfield code="b">spa</subfield>
<subfield code="d">MAP</subfield>
</datafield>
<datafield tag="084" ind1=" " ind2=" ">
<subfield code="a">6</subfield>
</datafield>
<datafield tag="100" ind1="1" ind2=" ">
<subfield code="0">MAPA20220002462</subfield>
<subfield code="a">Maillart, Arthur</subfield>
</datafield>
<datafield tag="245" ind1="1" ind2="0">
<subfield code="a">Toward an explainable machine learning model for claim frequency: a use case in car insurance pricing with telematics data</subfield>
<subfield code="c">Arthur Maillart</subfield>
</datafield>
<datafield tag="520" ind1=" " ind2=" ">
<subfield code="a">In this paper, we suggest an explainable machine learning approach to model the claim frequency of a telematics car dataset. In fact, we use a data-driven method based on tree ensembles, namely, the random forest, to create a claim frequency model. Then, we present a method to build a tree that faithfully synthesizes the predictions of a tree ensemble model such as those derived from the random forest or gradient boosting. This tree serves as a global explanation of the predictions of the black-box. Thanks to this surrogate model, we can extract knowledge from a black-box tree ensemble model. Then, we provide an application to improve the performance of a generalized linear model. Indeed, we integrate this new knowledge into a generalized linear model to increase the predictive power.</subfield>
</datafield>
<datafield tag="650" ind1=" " ind2="4">
<subfield code="0">MAPA20170005476</subfield>
<subfield code="a">Machine learning</subfield>
</datafield>
<datafield tag="650" ind1=" " ind2="4">
<subfield code="0">MAPA20080603779</subfield>
<subfield code="a">Automobile insurance</subfield>
</datafield>
<datafield tag="650" ind1=" " ind2="4">
<subfield code="0">MAPA20080556730</subfield>
<subfield code="a">Telematics</subfield>
</datafield>
<datafield tag="650" ind1=" " ind2=" ">
<subfield code="0">MAPA20220007825</subfield>
<subfield code="a">Data driven</subfield>
</datafield>
<datafield tag="773" ind1="0" ind2=" ">
<subfield code="w">MAP20220007085</subfield>
<subfield code="g">06/12/2021 Volume 11 - Number 2 - December 2021, p. 579-617</subfield>
<subfield code="t">European Actuarial Journal</subfield>
<subfield code="d">Cham, Switzerland : Springer Nature Switzerland AG, 2021-2022</subfield>
</datafield>
</record>
</collection>