{"id":852,"date":"2017-02-09T18:00:42","date_gmt":"2017-02-09T18:00:42","guid":{"rendered":"http:\/\/blog.tiran.info\/?p=852"},"modified":"2017-02-09T18:00:42","modified_gmt":"2017-02-09T18:00:42","slug":"fonctions-approximatives-oracle-12-2-3","status":"publish","type":"post","link":"https:\/\/blog.tiran.stream\/?p=852","title":{"rendered":"Fonctions approximatives Oracle 12.2 (#3)"},"content":{"rendered":"<p style=\"text-align: justify;\">Avec Oracle 12.2, l&rsquo;\u00e9ventail des fonctions approximatives s&rsquo;est \u00e9largi avec l&rsquo;adjonction de fonctions de calcul de percentiles:\u00a0<a href=\"http:\/\/docs.oracle.com\/database\/122\/SQLRF\/APPROX_MEDIAN.htm#SQLRF-GUID-F6A11DF2-121A-4057-9D0B-BF1A221B5622\" target=\"_blank\">APPROX_MEDIAN<\/a>, <a href=\"http:\/\/docs.oracle.com\/database\/122\/SQLRF\/APPROX_PERCENTILE.htm#SQLRF-GUID-70D54091-EE2F-4283-A10B-1AB5A1242FE2\" target=\"_blank\">APPROX_PERCENTILE<\/a>,\u00a0<a href=\"http:\/\/docs.oracle.com\/database\/122\/SQLRF\/APPROX_PERCENTILE_AGG.htm#SQLRF-GUID-72A1DAB0-4A3E-42BF-9E20-92273AD62E11\" target=\"_blank\">APPROX_PERCENTILE_DETAIL<\/a>, <a href=\"http:\/\/docs.oracle.com\/database\/122\/SQLRF\/APPROX_PERCENTILE_DETAIL.htm#SQLRF-GUID-F9A0B9B5-671F-43CA-9FA7-69A2DD174F54\" target=\"_blank\">APPROX_PERCENTILE_AGG<\/a>, <a href=\"http:\/\/docs.oracle.com\/database\/122\/SQLRF\/TO_APPROX_PERCENTILE.htm#SQLRF-GUID-463702B2-9199-41ED-AE03-865CABAD3E23\" target=\"_blank\">TO_APPROX_PERCENTILE<\/a><\/p>\n<p style=\"text-align: justify;\">A l&rsquo;instar des fonctions de comptage distinct pr\u00e9sent\u00e9es dans l&rsquo;<a href=\"http:\/\/blog.tiran.info\/fonctions-approximatives-oracle-12-2-2\" target=\"_blank\">article pr\u00e9c\u00e9dent<\/a>, l&rsquo;id\u00e9e est ici de disposer de fonctions capables de traiter de mani\u00e8re performante de tr\u00e8s grands volumes avec des besoins limit\u00e9s de resource et cela au prix d&rsquo;une approximation &#8211; maitris\u00e9e &#8211; des r\u00e9sultats. Le m\u00e9canisme est bas\u00e9 sur l&rsquo;<a href=\"https:\/\/en.wikipedia.org\/wiki\/Count%E2%80%93min_sketch\" target=\"_blank\">algorithme Count-min sketch<\/a>.<\/p>\n<p style=\"text-align: justify;\">En utilisant les m\u00eames donn\u00e9es que dans les articles pr\u00e9c\u00e9dents, commen\u00e7ons par constituer une table listant pour chaque contributeur de la version anglaise de Wikipedia le nombre d&rsquo;actions enregistr\u00e9es:<\/p>\n<pre class=\"brush: sql; ruler: true;\">SQL&gt; CREATE TABLE T_NB_CONTRIBUTIONS\n  2  AS\n  3        SELECT uname, COUNT (*) nb\n  4          FROM log_contribs\n  5      GROUP BY uname;\n\nTable created.\n\nSQL&gt;\n<\/pre>\n<p style=\"text-align: justify;\">La table contient environ 29 millions d&rsquo;enregistrements pour un peu plus de 600MB:<\/p>\n<pre class=\"brush: sql; ruler: true;\">SQL&gt; SELECT COUNT (*) FROM T_NB_CONTRIBUTIONS;\n\n  COUNT(*)\n----------\n  28839445\n\nSQL&gt;\nSQL&gt; SELECT bytes \/ POWER (1024, 2)\n  2    FROM user_segments\n  3   WHERE segment_name = &#039;T_NB_CONTRIBUTIONS&#039;;\n\nBYTES\/POWER(1024,2)\n-------------------\n                624\n\nSQL&gt;\n<\/pre>\n<p style=\"text-align: justify;\">A partir de cette table, on peut d\u00e9terminer quel nombre de contributions correspond au 99eme percentile:<\/p>\n<pre class=\"brush: sql; ruler: true;\">SQL&gt; set timing on\nSQL&gt;\nSQL&gt; SELECT PERCENTILE_DISC (0.99) WITHIN GROUP (ORDER BY nb)\n  2    FROM T_NB_CONTRIBUTIONS;\n\nPERCENTILE_DISC(0.99)WITHINGROUP(ORDERBYNB)\n-------------------------------------------\n                                          3\n\nElapsed: 00:00:41.96\nSQL&gt;\nSQL&gt; SELECT a.name, b.VALUE\n  2     FROM v$statname a, v$mystat b\n  3    WHERE a.statistic# = b.statistic# AND a.name = &#039;session pga memory max&#039;;\n\nNAME                                                                  VALUE\n---------------------------------------------------------------- ----------\nsession pga memory max                                            109971816\n\nElapsed: 00:00:00.03\nSQL&gt; \n<\/pre>\n<p style=\"text-align: justify;\">42 secondes et environ 100MB de PGA ont \u00e9t\u00e9 n\u00e9cessaire pour ce calcul.<\/p>\n<p style=\"text-align: justify;\">Utilisons maintenant une fonction approximative:<\/p>\n<pre class=\"brush: sql; ruler: true;\">SQL&gt; set timing on\nSQL&gt; \nSQL&gt; SELECT APPROX_PERCENTILE (0.99 DETERMINISTIC) WITHIN GROUP (ORDER BY nb)\n  2    FROM T_NB_CONTRIBUTIONS;\n\nAPPROX_PERCENTILE(0.99DETERMINISTIC)WITHINGROUP(ORDERBYNB)\n----------------------------------------------------------\n                                                         3\n\nElapsed: 00:00:12.85\nSQL&gt; \nSQL&gt; SELECT a.name, b.VALUE\n  2     FROM v$statname a, v$mystat b\n  3    WHERE a.statistic# = b.statistic# AND a.name = &#039;session pga memory max&#039;;\n\nNAME                                                                  VALUE\n---------------------------------------------------------------- ----------\nsession pga memory max                                              4262248\n\nElapsed: 00:00:00.03\nSQL&gt;\n<\/pre>\n<p style=\"text-align: justify;\">Le temps d&rsquo;ex\u00e9cution a \u00e9t\u00e9 divis\u00e9 par 3.26 et l&rsquo;impact de l&rsquo;op\u00e9ration sur la PGA est marginal\u00a0(4MB correspond \u00e0 l&rsquo;allocation initiale de la session).<\/p>\n<p style=\"text-align: justify;\">Essayons maintenant de d\u00e9terminer combien d&rsquo;actions sont n\u00e9cessaires pour faire parti des 1000 utilisateurs les plus actifs de la plateforme anglaise de Wikipedia:<\/p>\n<pre class=\"brush: sql; ruler: true;\">SQL&gt; SELECT 1 - (1000 \/ COUNT (*))\n  2    FROM T_NB_CONTRIBUTIONS;\n\n1-(1000\/COUNT(*))\n-----------------\n       .999965325\n\nSQL&gt;\n<\/pre>\n<p style=\"text-align: justify;\">Cela correspond donc au percentile 99.9965325. Comparons le calcul exact et une approximation:<\/p>\n<pre class=\"brush: sql; ruler: true;\">SQL&gt; set timing on\nSQL&gt;\nSQL&gt; SELECT APPROX_PERCENTILE (0.999965325 DETERMINISTIC) WITHIN GROUP (ORDER BY nb)\n  2    FROM T_NB_CONTRIBUTIONS;\n\nAPPROX_PERCENTILE(0.999965325DETERMINISTIC)WITHINGROUP(ORDERBYNB)\n-----------------------------------------------------------------\n                                                             6620\n\nElapsed: 00:00:12.80\nSQL&gt;\nSQL&gt; SELECT PERCENTILE_DISC (0.999965325) WITHIN GROUP (ORDER BY nb)\n  2    FROM T_NB_CONTRIBUTIONS;\n\nPERCENTILE_DISC(0.999965325)WITHINGROUP(ORDERBYNB)\n--------------------------------------------------\n                                              6591\n\nElapsed: 00:00:39.50\nSQL&gt;\n<\/pre>\n<p style=\"text-align: justify;\">On obtient \u00e0 nouveau un facteur 3 dans le temps d&rsquo;ex\u00e9cution des deux requ\u00eates. Le calcul de la valeur approximative est <strong>3 fois plus rapide pour une erreur de l&rsquo;odre de 0.5%<\/strong> [100 * ABS (6620 &#8211; 6591) \/ 6591]<\/p>\n<p style=\"text-align: justify;\">A l&rsquo;instar des fonctions approximatives de comptage distinct, les fonctions approximatives de calcul de percentile autorisent la \u00ab\u00a0r\u00e9aggregation\u00a0\u00bb via APPROX_PERCENTILE_DETAIL &amp; APPROX_PERCENTILE_AGG. Cf le <a href=\"http:\/\/blog.tiran.info\/fonctions-approximatives-oracle-12-2-2\" target=\"_blank\">billet pr\u00e9c\u00e9dent<\/a> pour un exemple similaire.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Avec Oracle 12.2, l&rsquo;\u00e9ventail des fonctions approximatives s&rsquo;est \u00e9largi avec l&rsquo;adjonction de fonctions de calcul de percentiles:\u00a0APPROX_MEDIAN, APPROX_PERCENTILE,\u00a0APPROX_PERCENTILE_DETAIL, APPROX_PERCENTILE_AGG, TO_APPROX_PERCENTILE<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"colormag_page_container_layout":"default_layout","colormag_page_sidebar_layout":"default_layout","footnotes":""},"categories":[6,15],"tags":[],"class_list":["post-852","post","type-post","status-publish","format-standard","hentry","category-oracle","category-statistique-exploratoire"],"_links":{"self":[{"href":"https:\/\/blog.tiran.stream\/index.php?rest_route=\/wp\/v2\/posts\/852","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/blog.tiran.stream\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blog.tiran.stream\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blog.tiran.stream\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/blog.tiran.stream\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=852"}],"version-history":[{"count":0,"href":"https:\/\/blog.tiran.stream\/index.php?rest_route=\/wp\/v2\/posts\/852\/revisions"}],"wp:attachment":[{"href":"https:\/\/blog.tiran.stream\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=852"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blog.tiran.stream\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=852"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blog.tiran.stream\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=852"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}