{"id":1002,"date":"2017-07-08T22:12:42","date_gmt":"2017-07-08T22:12:42","guid":{"rendered":"http:\/\/blog.tiran.info\/?p=1002"},"modified":"2017-07-08T22:12:42","modified_gmt":"2017-07-08T22:12:42","slug":"annore-1","status":"publish","type":"post","link":"https:\/\/blog.tiran.stream\/?p=1002","title":{"rendered":"ANN\/ORE #1 &#8211; Character recognition on the MNIST dataset"},"content":{"rendered":"<p style=\"text-align: justify;\">Following on from the <a href=\"http:\/\/blog.tiran.info\/reseaux-de-neurones-avec-r-3\" target=\"_blank\" rel=\"noopener\">previous posts<\/a>, and in order to test larger ANNs, I will now use the <a href=\"https:\/\/fr.wikipedia.org\/wiki\/Base_de_donn%C3%A9es_MNIST\" target=\"_blank\" rel=\"noopener\">MNIST dataset<\/a>. It is a well-known OCR dataset popularized by the work of <a href=\"https:\/\/fr.wikipedia.org\/wiki\/Yann_Le_Cun\" target=\"_blank\" rel=\"noopener\">Yann Le Cun<\/a>.<\/p>\n<p style=\"text-align: justify;\">The dataset, <a href=\"http:\/\/yann.lecun.com\/exdb\/mnist\/\" target=\"_blank\" rel=\"noopener\">available here<\/a>, contains grayscale images of handwritten digits (0 to 9). It is split into a training set of 60000 images and a test set of 10000 images. Each image consists of 784 pixels (28&#215;28).<\/p>\n<p style=\"text-align: justify;\">The format in which the data is distributed is not directly usable from R (or at least not easily). I will therefore start by loading the data into tables in an Oracle database. 
I will then access them through ORE or ROracle, depending on the need&#8230;<\/p>\n<p style=\"text-align: justify;\">The file format is described at the bottom of the page <a href=\"http:\/\/yann.lecun.com\/exdb\/mnist\/\" target=\"_blank\" rel=\"noopener\">http:\/\/yann.lecun.com\/exdb\/mnist\/<\/a>.<\/p>\n<p style=\"text-align: justify;\">Here, the files are transferred to the database server so that <a href=\"https:\/\/docs.oracle.com\/database\/122\/ARPLS\/UTL_FILE.htm#ARPLS-GUID-46843148-6037-4881-A784-F5B93D5F5A21\" target=\"_blank\" rel=\"noopener\">UTL_FILE<\/a> can read them byte by byte:<\/p>\n<pre class=\"brush: sql; ruler: true;\">[rtiran@psu888 ~]$ ll \/tmp\/*ubyte\n-rw-r----- 1 rtiran dba  7840016 Jun  8 11:31 \/tmp\/t10k-images-idx3-ubyte\n-rw-r----- 1 rtiran dba    10008 Jun  8 11:31 \/tmp\/t10k-labels-idx1-ubyte\n-rw-r--r-- 1 rtiran dba 47040016 Jun  8 11:31 \/tmp\/train-images-idx3-ubyte\n-rw-r--r-- 1 rtiran dba    60008 Jun  8 11:31 \/tmp\/train-labels-idx1-ubyte\n[rtiran@psu888 ~]$\n<\/pre>\n<p style=\"text-align: justify;\">The train-images-idx3-ubyte file starts with four 4-byte (32-bit) big-endian header fields (magic number, number of images, number of rows, number of columns) that are not needed for the rest of the analysis. After this 16-byte header, each byte corresponds to one pixel, and each image contains 784 pixels (28*28).<\/p>\n<p style=\"text-align: justify;\">First, the value of each pixel is stored in the IMGS_FLAT table. 
Each row corresponds to one pixel (the image ID and the position of the pixel are stored as well):<\/p>\n<pre class=\"brush: sql; ruler: true;\">SQL&gt; CREATE TABLE imgs_flat\n  2  (\n  3      img_id NUMBER,\n  4      pix_id NUMBER,\n  5      pix_val NUMBER\n  6  );\n\nTable created.\n\nSQL&gt; CREATE OR REPLACE DIRECTORY D1 AS &#039;\/tmp&#039;;\n\nDirectory created.\n\nSQL&gt;\n<\/pre>\n<p style=\"text-align: justify;\">The following PL\/SQL routine scans the file and extracts the value of each pixel:<\/p>\n<pre class=\"brush: sql; ruler: true;\">SQL&gt; SET TIMING on\nSQL&gt; DECLARE\n  2      f_imgs          UTL_FILE.file_type;\n  3      l_buffer        RAW (32);\n  4      l_cnt           NUMBER := 0;\n  5      l_pix_id        NUMBER := 0;\n  6      l_img_id        NUMBER := 0;\n  7      l_eof           BOOLEAN := FALSE;\n  8\n  9      TYPE t_imgs_flat IS TABLE OF imgs_flat%ROWTYPE\n 10          INDEX BY BINARY_INTEGER;\n 11\n 12      arr_imgs_flat   t_imgs_flat;\n 13  BEGIN\n 14      f_imgs := UTL_FILE.fopen (&#039;D1&#039;, &#039;train-images-idx3-ubyte&#039;, &#039;rb&#039;);\n 15\n 16      FOR i IN 1 .. 
4\n 17      LOOP\n 18          UTL_FILE.get_raw (f_imgs, l_buffer, 4);\n 19          DBMS_OUTPUT.put_line (\n 20              UTL_RAW.cast_to_binary_integer (l_buffer,\n 21                                              endianess   =&gt; UTL_RAW.big_endian));\n 22      END LOOP;\n 23\n 24      LOOP\n 25          l_cnt := l_cnt + 1;\n 26          l_pix_id := MOD (l_cnt, 28 * 28);\n 27\n 28          IF l_pix_id = 0\n 29          THEN\n 30              l_pix_id := 784;\n 31          END IF;\n 32\n 33          l_img_id := CEIL (l_cnt \/ (28 * 28));\n 34\n 35          BEGIN\n 36              UTL_FILE.get_raw (f_imgs, l_buffer, 1);\n 37              arr_imgs_flat (l_cnt).img_id := l_img_id;\n 38              arr_imgs_flat (l_cnt).pix_id := l_pix_id;\n 39              arr_imgs_flat (l_cnt).pix_val :=\n 40                  UTL_RAW.cast_to_binary_integer (\n 41                      l_buffer,\n 42                      endianess   =&gt; UTL_RAW.big_endian);\n 43          EXCEPTION\n 44              WHEN NO_DATA_FOUND\n 45              THEN\n 46                  l_eof := TRUE;\n 47          END;\n 48\n 49          IF MOD (l_cnt, 1e6) = 0 OR l_eof\n 50          THEN\n 51              FORALL i IN arr_imgs_flat.FIRST .. 
arr_imgs_flat.LAST\n 52                  INSERT INTO imgs_flat\n 53                  VALUES arr_imgs_flat (i);\n 54\n 55              arr_imgs_flat.delete;\n 56\n 57              COMMIT;\n 58          END IF;\n 59\n 60          IF l_eof\n 61          THEN\n 62              EXIT;\n 63          END IF;\n 64      END LOOP;\n 65\n 66      COMMIT;\n 67  END;\n 68  \/\n\nPL\/SQL procedure successfully completed.\n\nElapsed: 00:11:31.96\nSQL&gt;\n<\/pre>\n<p style=\"text-align: justify;\">This load takes a little over ten minutes.<\/p>\n<p style=\"text-align: justify;\">The result is a table of 47040000 rows (60000 images * 784 pixels):<\/p>\n<pre class=\"brush: sql; ruler: true;\">SQL&gt; SET TIMING off\nSQL&gt; SELECT COUNT (*) FROM imgs_flat;\n\n  COUNT(*)\n----------\n  47040000\n\nSQL&gt; \n<\/pre>\n<p style=\"text-align: justify;\">The next step is to pivot these rows into a tabular structure with 784 columns (one per pixel) and 60000 rows (one per image).<\/p>\n<p style=\"text-align: justify;\">To do so, we create the target table IMGS. It has an IMG_ID column and 784 columns P1, P2, &#8230; P784 holding the value of the corresponding pixel:<\/p>\n<pre class=\"brush: sql; ruler: true;\">SQL&gt; CREATE TABLE imgs\n  2  (\n  3      img_id    NUMBER PRIMARY KEY\n  4  );\n\nTable created.\n\nSQL&gt; SET TIMING on\nSQL&gt; BEGIN\n  2      FOR i IN 1 .. 
28 * 28\n  3      LOOP\n  4          EXECUTE IMMEDIATE &#039;alter table IMGS add (p&#039; || i || &#039; number)&#039;;\n  5      END LOOP;\n  6  END;\n  7  \/\n\nPL\/SQL procedure successfully completed.\n\nElapsed: 00:00:20.90\nSQL&gt; SET TIMING off\nSQL&gt;      SELECT column_name\n  2         FROM user_tab_columns\n  3        WHERE table_name = &#039;IMGS&#039;\n  4     ORDER BY column_id\n  5  FETCH FIRST 10 ROWS ONLY;\n\nCOLUMN_NAME\n--------------------------------------------------------------------------------\nIMG_ID\nP1\nP2\nP3\nP4\nP5\nP6\nP7\nP8\nP9\n\n10 rows selected.\n\nSQL&gt; \n<\/pre>\n<p style=\"text-align: justify;\">We then use the <a href=\"https:\/\/docs.oracle.com\/database\/121\/SQLRF\/statements_10002.htm#CHDFAFIE\" target=\"_blank\" rel=\"noopener\">PIVOT clause<\/a> to retrieve all the pixel values of each image as a single row. That is what gets inserted into IMGS.<\/p>\n<p style=\"text-align: justify;\">Since the final SQL statement is huge, it is built dynamically. Along the way, a <a href=\"https:\/\/en.wikipedia.org\/wiki\/Feature_scaling\" target=\"_blank\" rel=\"noopener\">MIN\/MAX normalization<\/a> is applied to the pixel values: each value is divided by 255, which maps the 0 to 255 range onto 0 to 1:<\/p>\n<pre class=\"brush: sql; ruler: true;\">SQL&gt; SET TIMING on\nSQL&gt; DECLARE\n  2      l_pivot_clause   VARCHAR2 (32000);\n  3  BEGIN\n  4      FOR i IN 1 .. 
28 * 28\n  5      LOOP\n  6          l_pivot_clause := l_pivot_clause || i || &#039; as P&#039; || i || &#039;,&#039;;\n  7      END LOOP;\n  8\n  9      l_pivot_clause := RTRIM (l_pivot_clause, &#039;,&#039;);\n 10\n 11      EXECUTE IMMEDIATE &#039;INSERT INTO IMGS\n 12      SELECT *\n 13        FROM (\n 14          SELECT a.IMG_ID,\n 15                 a.PIX_ID,\n 16                 a.pix_val \/ 255 pix_val\n 17            FROM imgs_flat a\n 18        )\n 19             PIVOT\n 20                 (MAX (pix_val) FOR pix_id IN (&#039; || l_pivot_clause || &#039;))&#039;;\n 21\n 22      COMMIT;\n 23  END;\n 24  \/\n\nPL\/SQL procedure successfully completed.\n\nElapsed: 00:01:38.92\nSQL&gt;\n<\/pre>\n<p style=\"text-align: justify;\">At this point, we have a table in which each row holds the pixel values of one image.<\/p>\n<p style=\"text-align: justify;\">Next, the image labels are loaded into another table (IMGS_VAL):<\/p>\n<pre class=\"brush: sql; ruler: true;\">SQL&gt; CREATE TABLE imgs_val\n  2  (\n  3      img_id    NUMBER PRIMARY KEY,\n  4      img_val   NUMBER\n  5  );\n\nTable created.\n\nSQL&gt;\n<\/pre>\n<p style=\"text-align: justify;\">Here again, the train-labels-idx1-ubyte file is read byte by byte. The first 64 bits (8 bytes) of the file are skipped. Each subsequent byte is the label of the corresponding image:<\/p>\n<pre class=\"brush: sql; ruler: true;\">SQL&gt; SET TIMING on\nSQL&gt; DECLARE\n  2      f_imgs         UTL_FILE.file_type;\n  3      l_buffer       RAW (32);\n  4      l_cnt          NUMBER := 0;\n  5      l_val          NUMBER := 0;\n  6\n  7      TYPE t_imgs_val IS TABLE OF imgs_val%ROWTYPE\n  8          INDEX BY BINARY_INTEGER;\n  9\n 10      arr_imgs_val   t_imgs_val;\n 11  BEGIN\n 12      f_imgs := UTL_FILE.fopen (&#039;D1&#039;, &#039;train-labels-idx1-ubyte&#039;, &#039;rb&#039;);\n 13\n 14      FOR i IN 1 .. 
2\n 15      LOOP\n 16          UTL_FILE.get_raw (f_imgs, l_buffer, 4);\n 17          DBMS_OUTPUT.put_line (\n 18              UTL_RAW.cast_to_binary_integer (l_buffer,\n 19                                              endianess   =&gt; UTL_RAW.big_endian));\n 20      END LOOP;\n 21\n 22      LOOP\n 23          l_cnt := l_cnt + 1;\n 24\n 25          BEGIN\n 26              UTL_FILE.get_raw (f_imgs, l_buffer, 1);\n 27              arr_imgs_val (l_cnt).img_id := l_cnt;\n 28              arr_imgs_val (l_cnt).img_val :=\n 29                  UTL_RAW.cast_to_binary_integer (\n 30                      l_buffer,\n 31                      endianess   =&gt; UTL_RAW.big_endian);\n 32          EXCEPTION\n 33              WHEN NO_DATA_FOUND\n 34              THEN\n 35                  EXIT;\n 36          END;\n 37      END LOOP;\n 38\n 39      FORALL i IN arr_imgs_val.FIRST .. arr_imgs_val.LAST\n 40          INSERT INTO imgs_val\n 41          VALUES arr_imgs_val (i);\n 42\n 43      COMMIT;\n 44  END;\n 45  \/\n\nPL\/SQL procedure successfully completed.\n\nElapsed: 00:00:01.23\nSQL&gt;\n<\/pre>\n<p style=\"text-align: justify;\">At this point, the IMGS_VAL table contains 60000 rows (one per image), and the IMG_VAL column holds the label of each image in the training set:<\/p>\n<pre class=\"brush: sql; ruler: true;\">SQL&gt; SELECT COUNT (*) FROM imgs_val;\n\n  COUNT(*)\n----------\n     60000\n\nSQL&gt;\nSQL&gt;      SELECT *\n  2         FROM imgs_val\n  3     ORDER BY img_id\n  4  FETCH FIRST 5 ROWS ONLY;\n\n    IMG_ID    IMG_VAL\n---------- ----------\n         1          5\n         2          0\n         3          4\n         4          1\n         5          9\n\nSQL&gt;\n<\/pre>\n<p style=\"text-align: justify;\">A view can then be used to join the data from IMGS and IMGS_VAL. 
The IMG_VAL column is converted to text so that R will treat it as a factor:<\/p>\n<pre class=\"brush: sql; ruler: true;\">SQL&gt; CREATE OR REPLACE VIEW mnist_training_set\n  2  AS\n  3      SELECT to_char(img_val) img_lbl,\n  4             b.*\n  5        FROM imgs_val a, imgs b\n  6       WHERE a.img_id = b.img_id;\n\nView created.\n\nSQL&gt;\n<\/pre>\n<p style=\"text-align: justify;\">Note that a <a href=\"https:\/\/docs.oracle.com\/database\/121\/SQLRF\/clauses002.htm#i1002565\" target=\"_blank\" rel=\"noopener\">declarative integrity constraint is added on the view<\/a> so that R has an <a href=\"https:\/\/docs.oracle.com\/cd\/E67822_01\/OREUG\/GUID-738819E6-C049-4E0D-88D0-C58DB79490A4.htm\" target=\"_blank\" rel=\"noopener\">ordering criterion for the data<\/a>:<\/p>\n<pre class=\"brush: sql; ruler: true;\">SQL&gt; ALTER VIEW mnist_training_set ADD CONSTRAINT training_set_pk PRIMARY KEY (img_id)\n  2                               DISABLE NOVALIDATE;\n\nView altered.\n\nSQL&gt;\n<\/pre>\n<p style=\"text-align: justify;\">Without this constraint, the following warning would be raised when pulling the data into a data frame:<\/p>\n<pre class=\"brush: js; ruler: true;\">&gt; ts &lt;- ore.pull(MNIST_TRAINING_SET)\nWarning message:\nORE object has no unique key - using random order \n&gt;\n<\/pre>\n<p style=\"text-align: justify;\">The test data is loaded in the same way. 
The intermediate tables are named IMGS_FLAT_TEST, IMGS_TEST and IMGS_TEST_VAL, and the view that joins them is named MNIST_TEST_SET:<\/p>\n<pre class=\"brush: sql; ruler: true;\">SQL&gt; CREATE TABLE imgs_flat_test\n  2  (\n  3      img_id NUMBER,\n  4      pix_id NUMBER,\n  5      pix_val NUMBER\n  6  );\n\nTable created.\n\nSQL&gt;\nSQL&gt; CREATE OR REPLACE DIRECTORY D1 AS &#039;\/tmp&#039;;\n\nDirectory created.\n\nSQL&gt;\nSQL&gt; DECLARE\n  2      f_imgs               UTL_FILE.file_type;\n  3      l_buffer             RAW (32);\n  4      l_cnt                NUMBER := 0;\n  5      l_pix_id             NUMBER := 0;\n  6      l_img_id             NUMBER := 0;\n  7      l_eof                BOOLEAN := FALSE;\n  8\n  9      TYPE t_imgs_flat_test IS TABLE OF imgs_flat_test%ROWTYPE\n 10          INDEX BY BINARY_INTEGER;\n 11\n 12      arr_imgs_flat_test   t_imgs_flat_test;\n 13  BEGIN\n 14      f_imgs := UTL_FILE.fopen (&#039;D1&#039;, &#039;t10k-images-idx3-ubyte&#039;, &#039;rb&#039;);\n 15\n 16      FOR i IN 1 .. 
4\n 17      LOOP\n 18          UTL_FILE.get_raw (f_imgs, l_buffer, 4);\n 19          DBMS_OUTPUT.put_line (\n 20              UTL_RAW.cast_to_binary_integer (l_buffer,\n 21                                              endianess   =&gt; UTL_RAW.big_endian));\n 22      END LOOP;\n 23\n 24      LOOP\n 25          l_cnt := l_cnt + 1;\n 26          l_pix_id := MOD (l_cnt, 28 * 28);\n 27\n 28          IF l_pix_id = 0\n 29          THEN\n 30              l_pix_id := 784;\n 31          END IF;\n 32\n 33          l_img_id := CEIL (l_cnt \/ (28 * 28));\n 34\n 35          BEGIN\n 36              UTL_FILE.get_raw (f_imgs, l_buffer, 1);\n 37              arr_imgs_flat_test (l_cnt).img_id := l_img_id;\n 38              arr_imgs_flat_test (l_cnt).pix_id := l_pix_id;\n 39              arr_imgs_flat_test (l_cnt).pix_val :=\n 40                  UTL_RAW.cast_to_binary_integer (\n 41                      l_buffer,\n 42                      endianess   =&gt; UTL_RAW.big_endian);\n 43          EXCEPTION\n 44              WHEN NO_DATA_FOUND\n 45              THEN\n 46                  l_eof := TRUE;\n 47          END;\n 48\n 49          IF MOD (l_cnt, 1e6) = 0 OR l_eof\n 50          THEN\n 51              FORALL i IN arr_imgs_flat_test.FIRST .. arr_imgs_flat_test.LAST\n 52                  INSERT INTO imgs_flat_test\n 53                  VALUES arr_imgs_flat_test (i);\n 54\n 55              arr_imgs_flat_test.delete;\n 56\n 57              COMMIT;\n 58          END IF;\n 59\n 60          IF l_eof\n 61          THEN\n 62              EXIT;\n 63          END IF;\n 64      END LOOP;\n 65\n 66      COMMIT;\n 67  END;\n 68  \/\n\nPL\/SQL procedure successfully completed.\n\nSQL&gt; CREATE TABLE imgs_test\n  2  (\n  3      img_id    NUMBER PRIMARY KEY\n  4  );\n\nTable created.\n\nSQL&gt;\nSQL&gt; BEGIN\n  2      FOR i IN 1 .. 
28 * 28\n  3      LOOP\n  4          EXECUTE IMMEDIATE &#039;alter table IMGS_TEST add (p&#039; || i || &#039; number)&#039;;\n  5      END LOOP;\n  6  END;\n  7  \/\n\nPL\/SQL procedure successfully completed.\n\nSQL&gt; DECLARE\n  2      l_pivot_clause   VARCHAR2 (32000);\n  3  BEGIN\n  4      FOR i IN 1 .. 28 * 28\n  5      LOOP\n  6          l_pivot_clause := l_pivot_clause || i || &#039; as P&#039; || i || &#039;,&#039;;\n  7      END LOOP;\n  8\n  9      l_pivot_clause := RTRIM (l_pivot_clause, &#039;,&#039;);\n 10\n 11      EXECUTE IMMEDIATE &#039;INSERT INTO IMGS_TEST\n 12      SELECT *\n 13        FROM (\n 14          SELECT a.IMG_ID,\n 15                 a.PIX_ID,\n 16                 a.pix_val \/ 255 pix_val\n 17            FROM imgs_flat_test a\n 18        )\n 19             PIVOT\n 20                 (MAX (pix_val) FOR pix_id IN (&#039; || l_pivot_clause || &#039;))&#039;;\n 21\n 22      COMMIT;\n 23  END;\n 24  \/\n\nPL\/SQL procedure successfully completed.\n\nSQL&gt;\nSQL&gt; CREATE TABLE imgs_test_val\n  2  (\n  3      img_id    NUMBER PRIMARY KEY,\n  4      img_val NUMBER\n  5  );\n\nTable created.\n\nSQL&gt;\nSQL&gt; DECLARE\n  2      f_imgs              UTL_FILE.file_type;\n  3      l_buffer            RAW (32);\n  4      l_cnt               NUMBER := 0;\n  5      l_val               NUMBER := 0;\n  6\n  7      TYPE t_imgs_test_val IS TABLE OF imgs_test_val%ROWTYPE\n  8          INDEX BY BINARY_INTEGER;\n  9\n 10      arr_imgs_test_val   t_imgs_test_val;\n 11  BEGIN\n 12      f_imgs := UTL_FILE.fopen (&#039;D1&#039;, &#039;t10k-labels-idx1-ubyte&#039;, &#039;rb&#039;);\n 13\n 14      FOR i IN 1 .. 
2\n 15      LOOP\n 16          UTL_FILE.get_raw (f_imgs, l_buffer, 4);\n 17          DBMS_OUTPUT.put_line (\n 18              UTL_RAW.cast_to_binary_integer (l_buffer,\n 19                                              endianess   =&gt; UTL_RAW.big_endian));\n 20      END LOOP;\n 21\n 22      LOOP\n 23          l_cnt := l_cnt + 1;\n 24\n 25          BEGIN\n 26              UTL_FILE.get_raw (f_imgs, l_buffer, 1);\n 27              arr_imgs_test_val (l_cnt).img_id := l_cnt;\n 28              arr_imgs_test_val (l_cnt).img_val :=\n 29                  UTL_RAW.cast_to_binary_integer (\n 30                      l_buffer,\n 31                      endianess   =&gt; UTL_RAW.big_endian);\n 32          EXCEPTION\n 33              WHEN NO_DATA_FOUND\n 34              THEN\n 35                  EXIT;\n 36          END;\n 37      END LOOP;\n 38\n 39      FORALL i IN arr_imgs_test_val.FIRST .. arr_imgs_test_val.LAST\n 40          INSERT INTO imgs_test_val\n 41          VALUES arr_imgs_test_val (i);\n 42\n 43      COMMIT;\n 44  END;\n 45  \/\n\nPL\/SQL procedure successfully completed.\n\nSQL&gt; CREATE OR REPLACE VIEW mnist_test_set\n  2  AS\n  3      SELECT to_char(img_val) img_lbl,\n  4             b.*\n  5        FROM imgs_test_val a, imgs_test b\n  6       WHERE a.img_id = b.img_id;\n\nView created.\n\nSQL&gt; ALTER VIEW mnist_test_set ADD CONSTRAINT test_set_pk PRIMARY KEY (img_id)\n  2                               DISABLE NOVALIDATE;\n\nView altered.\n\nSQL&gt;\n<\/pre>\n<p>We now have two views, MNIST_TRAINING_SET and MNIST_TEST_SET, that expose the dataset in a format usable by the R functions that build neural network models.<\/p>\n<p>The scripts run above are available here: <a href=\"https:\/\/blog.tiran.stream\/wp-content\/uploads\/2017\/09\/load_mnist_training_set.txt\">load_mnist_training_set<\/a> and <a 
href=\"https:\/\/blog.tiran.stream\/wp-content\/uploads\/2017\/09\/load_mnist_test_set.txt\">load_mnist_test_set<\/a>.<\/p>\n<p>To be continued&#8230;<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Following on from the previous posts, and in order to test larger ANNs, I will now use the MNIST dataset.<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"colormag_page_container_layout":"default_layout","colormag_page_sidebar_layout":"default_layout","footnotes":""},"categories":[6,10],"tags":[],"class_list":["post-1002","post","type-post","status-publish","format-standard","hentry","category-oracle","category-preparation-des-donnees"],"_links":{"self":[{"href":"https:\/\/blog.tiran.stream\/index.php?rest_route=\/wp\/v2\/posts\/1002","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/blog.tiran.stream\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blog.tiran.stream\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blog.tiran.stream\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/blog.tiran.stream\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=1002"}],"version-history":[{"count":0,"href":"https:\/\/blog.tiran.stream\/index.php?rest_route=\/wp\/v2\/posts\/1002\/revisions"}],"wp:attachment":[{"href":"https:\/\/blog.tiran.stream\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=1002"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blog.tiran.stream\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=1002"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blog.tiran.stream\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=1002"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}