Web architecture for URL-based phishing detection based on Random Forest, Classification Trees, and Support Vector Machine

<?xml version="1.0" encoding="UTF-8"?><collection xmlns="http://www.loc.gov/MARC21/slim" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.loc.gov/MARC21/slim http://www.loc.gov/standards/marcxml/schema/MARC21slim.xsd">
  <record>
    <leader>00000cab a2200000   4500</leader>
    <controlfield tag="001">MAP20220013925</controlfield>
    <controlfield tag="003">MAP</controlfield>
    <controlfield tag="005">20220911185518.0</controlfield>
    <controlfield tag="008">220509e20220502esp|||p      |0|||b|spa d</controlfield>
    <datafield tag="040" ind1=" " ind2=" ">
      <subfield code="a">MAP</subfield>
      <subfield code="b">spa</subfield>
      <subfield code="d">MAP</subfield>
    </datafield>
    <datafield tag="084" ind1=" " ind2=" ">
      <subfield code="a">922.134</subfield>
    </datafield>
    <datafield tag="100" ind1="1" ind2=" ">
      <subfield code="0">MAPA20220004848</subfield>
      <subfield code="a">Lamas Piñeiro, Julio</subfield>
    </datafield>
    <datafield tag="245" ind1="1" ind2="0">
      <subfield code="a">Web architecture for URL-based phishing detection based on Random Forest, Classification Trees, and Support Vector Machine</subfield>
      <subfield code="c">Julio Lamas Piñeiro, Lenis R. Wong Portillo</subfield>
    </datafield>
    <datafield tag="520" ind1=" " ind2=" ">
      <subfield code="a">Phishing has long been a serious problem, but it has intensified considerably during the coronavirus pandemic, a time when people rely on the Internet more than ever, even for daily payments. Tools have been developed to detect phishing, but many of them are computationally complex and not easy for an ordinary user to operate. In this work, we therefore propose a web architecture based on 3 machine learning models, built mainly on Random Forest, Classification Trees, and Support Vector Machine, to predict whether a web address is a phishing address. Three models are developed, one with each of these techniques, along with 2 further models based on those models; all of them are applied to web addresses previously processed by a feature retrieval module. Everything is deployed in an API consumed by a frontend, so that any user can use it and choose which model to predict with. The results show that the best-performing model when predicting both outcomes is the Classification Trees model, which obtains precision and accuracy of 80%.</subfield>
    </datafield>
    <datafield tag="540" ind1=" " ind2=" ">
      <subfield code="a">The digital copy is distributed under the license "Attribution-NonCommercial 4.0 International (CC BY-NC 4.0)"</subfield>
      <subfield code="f"/>
      <subfield code="u">https://creativecommons.org/licenses/by-nc/4.0</subfield>
      <subfield code="9">64</subfield>
    </datafield>
    <datafield tag="650" ind1=" " ind2="4">
      <subfield code="0">MAPA20080611200</subfield>
      <subfield code="a">Inteligencia artificial</subfield>
    </datafield>
    <datafield tag="650" ind1=" " ind2="4">
      <subfield code="0">MAPA20080585389</subfield>
      <subfield code="a">Fraude informático</subfield>
    </datafield>
    <datafield tag="650" ind1=" " ind2=" ">
      <subfield code="0">MAPA20080541064</subfield>
      <subfield code="a">Fraude</subfield>
    </datafield>
    <datafield tag="700" ind1="1" ind2=" ">
      <subfield code="0">MAPA20220004855</subfield>
      <subfield code="a">Wong Portillo, Lenis R.</subfield>
    </datafield>
    <datafield tag="773" ind1="0" ind2=" ">
      <subfield code="w">MAP20200034445</subfield>
      <subfield code="g">02/05/2022 Volume 25 Number 69 - May 2022, p. 107-121</subfield>
      <subfield code="x">1988-3064</subfield>
      <subfield code="t">Revista Iberoamericana de Inteligencia Artificial</subfield>
      <subfield code="d"> : IBERAMIA, Sociedad Iberoamericana de Inteligencia Artificial , 2018-</subfield>
    </datafield>
    <datafield tag="856" ind1=" " ind2=" ">
      <subfield code="q">application/pdf</subfield>
      <subfield code="w">1115226</subfield>
      <subfield code="y">Recurso electrónico / Electronic resource</subfield>
    </datafield>
  </record>
</collection>