C.2 Information-extraction-style evaluation

With coreference - Focus absent - No rejection of answers without concordances
With coreference - Focus absent - Rejection of answers at threshold 0
With coreference - Focus absent - Rejection of answers at threshold 10
With coreference - Focus absent - Rejection of answers at threshold 20
With coreference - Focus absent - Rejection of answers at threshold 30
With coreference - Focus absent - Rejection of answers at threshold 40
With coreference - Focus present - No rejection of answers without concordances
With coreference - Focus present - Rejection of answers at threshold 0
With coreference - Focus present - Rejection of answers at threshold 10
With coreference - Focus present - Rejection of answers at threshold 20
With coreference - Focus present - Rejection of answers at threshold 30
With coreference - Focus present - Rejection of answers at threshold 40
Without coreference - Focus absent - No rejection of answers without concordances
Without coreference - Focus absent - Rejection of answers at threshold 0
Without coreference - Focus absent - Rejection of answers at threshold 10
Without coreference - Focus absent - Rejection of answers at threshold 20
Without coreference - Focus absent - Rejection of answers at threshold 30
Without coreference - Focus absent - Rejection of answers at threshold 40
Without coreference - Focus present - No rejection of answers without concordances
Without coreference - Focus present - Rejection of answers at threshold 0
Without coreference - Focus present - Rejection of answers at threshold 10
Without coreference - Focus present - Rejection of answers at threshold 20
Without coreference - Focus present - Rejection of answers at threshold 30
Without coreference - Focus present - Rejection of answers at threshold 40
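
The column headings of these tables are not reproduced in this section. As a reading aid only, the five scores in each row appear consistent with precision, recall, and the F-measure at β = 0.5, 1, and 2, computed from the first and third counts of the row (read here as correct and incorrect answers) and from a total of 249 expected answers. That total is inferred from the figures, not stated in the text. The Python sketch below checks this reading against the first row of the first table (83.18%, 35.74%, 65.73, 50.00, 40.34, with counts 89 and 18). The names `f_beta`, `score_row`, and `EXPECTED_ANSWERS` are illustrative assumptions, not identifiers from the thesis.

```python
# Illustrative sketch: recompute the score columns from the answer counts.
# Assumption: 249 expected answers in total, inferred from recall = 89 / 249.
EXPECTED_ANSWERS = 249


def f_beta(precision: float, recall: float, beta: float) -> float:
    """F-measure; beta < 1 favours precision, beta > 1 favours recall."""
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1.0 + b2) * precision * recall / (b2 * precision + recall)


def score_row(correct: int, incorrect: int,
              expected: int = EXPECTED_ANSWERS) -> dict:
    """Return precision, recall and F-scores as percentages (2 decimals)."""
    returned = correct + incorrect
    precision = correct / returned if returned else 0.0
    recall = correct / expected if expected else 0.0
    values = {
        "precision": precision,
        "recall": recall,
        "f0.5": f_beta(precision, recall, 0.5),
        "f1": f_beta(precision, recall, 1.0),
        "f2": f_beta(precision, recall, 2.0),
    }
    return {name: round(100.0 * value, 2) for name, value in values.items()}


if __name__ == "__main__":
    # First row of the first table: 89 correct and 18 incorrect answers.
    print(score_row(correct=89, incorrect=18))
```

Under the same reading, rows in which every answer is rejected have no returned answers, which is why all five of their scores are 0.00.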

With coreference - Focus absent - No rejection of answers without concordances
Reference        83.18%  35.74%  65.73  50.00  40.34   89   78   18  116    0    0
Blind synonyms   81.20%  38.15%  66.25  51.91  42.68   95   81   22  113    0    0
Syntax           31.63%  42.17%  33.29  36.14  39.53  105   82  227  110    0    0
Synonyms         30.41%  44.58%  32.48  36.16  40.78  111   86  254  106    0    0
D                31.51%  46.18%  33.65  37.46  42.25  115   90  250  102    0    0
D-Syn            30.34%  46.18%  32.58  36.62  41.82  115   90  264  102    0    0
D-B              31.45%  46.99%  33.68  37.68  42.76  117   91  255  100    0    0
D-EWN-M          32.35%  48.19%  34.62  38.71  43.89  120   95  251   97    0    0
D-Dér            31.28%  46.99%  33.52  37.56  42.70  117   91  257  101    0    0
D-B-EWN-M-Dér    32.04%  49.80%  34.50  38.99  44.83  124   97  263   93    0    0
All enrichments  31.00%  49.80%  33.53  38.21  44.41  124   97  276   93    0    0

With coreference - Focus absent - Rejection of answers at threshold 0
Reference         0.00%   0.00%   0.00   0.00   0.00    0    0    0  200  107    0
Blind synonyms    0.00%   0.00%   0.00   0.00   0.00    0    0    0  200  117    0
Syntax           32.89%  40.16%  34.13  36.17  38.46  100   79  204  113   28    8
Synonyms         31.33%  41.77%  32.97  35.80  39.16  104   82  228  110   33    9
D                32.63%  43.37%  34.33  37.24  40.69  108   86  223  106   34   10
D-Syn            31.40%  43.37%  33.23  36.42  40.30  108   86  236  106   35   10
D-B              32.64%  44.18%  34.44  37.54  41.26  110   87  227  104   35   10
D-EWN-M          33.53%  45.38%  35.38  38.57  42.39  113   91  224  101   34   10
D-Dér            32.45%  44.18%  34.27  37.41  41.20  110   87  229  105   35   10
D-B-EWN-M-Dér    33.33%  46.99%  35.39  39.00  43.43  117   93  234   97   36   10
All enrichments  32.23%  46.99%  34.39  38.24  43.05  117   93  246   97   37   10

With coreference - Focus absent - Rejection of answers at threshold 10
Reference         0.00%   0.00%   0.00   0.00   0.00    0    0    0  200  107    0
Blind synonyms    0.00%   0.00%   0.00   0.00   0.00    0    0    0  200  117    0
Syntax           33.00%  40.16%  34.22  36.23  38.49  100   79  203  114   29    8
Synonyms         31.42%  41.77%  33.06  35.86  39.19  104   82  227  111   34    9
D                32.73%  43.37%  34.42  37.31  40.72  108   86  222  107   35   10
D-Syn            31.49%  43.37%  33.31  36.49  40.33  108   86  235  107   36   10
D-B              32.74%  44.18%  34.53  37.61  41.29  110   87  226  105   36   10
D-EWN-M          33.63%  45.38%  35.47  38.63  42.42  113   91  223  102   35   10
D-Dér            32.54%  44.18%  34.35  37.48  41.23  110   87  228  106   36   10
D-B-EWN-M-Dér    33.43%  46.99%  35.48  39.07  43.46  117   93  233   98   37   10
All enrichments  32.32%  46.99%  34.47  38.30  43.08  117   93  245   98   38   10

With coreference - Focus absent - Rejection of answers at threshold 20
Reference         0.00%   0.00%   0.00   0.00   0.00    0    0    0  200  107    0
Blind synonyms    0.00%   0.00%   0.00   0.00   0.00    0    0    0  200  117    0
Syntax           84.00%  33.73%  64.71  48.14  38.32   84   74   16  120  232   20
Synonyms         82.41%  35.74%  65.35  49.86  40.31   89   77   19  117  257   22
D                80.53%  36.55%  64.91  50.28  41.03   91   80   22  114  252   22
D-Syn            79.31%  36.95%  64.52  50.41  41.37   92   80   24  114  263   23
D-B              77.50%  37.35%  63.79  50.41  41.67   93   81   27  111  252   22
D-EWN-M          80.67%  38.55%  66.21  52.17  43.05   96   85   23  109  252   22
D-Dér            80.87%  37.35%  65.59  51.10  41.85   93   81   22  113  259   23
D-B-EWN-M-Dér    78.74%  40.16%  66.05  53.19  44.52  100   87   27  105  260   23
All enrichments  78.29%  40.56%  66.01  53.44  44.89  101   87   28  105  271   24

With coreference - Focus absent - Rejection of answers at threshold 30
Reference         0.00%   0.00%   0.00   0.00   0.00    0    0    0  200  107    0
Blind synonyms    0.00%   0.00%   0.00   0.00   0.00    0    0    0  200  117    0
Syntax           85.11%  32.13%  64.00  46.65  36.70   80   72   14  122  238   22
Synonyms         84.85%  33.73%  65.12  48.28  38.36   84   74   15  120  266   25
D                81.90%  34.54%  64.28  48.59  39.06   86   77   19  117  260   24
D-Syn            82.08%  34.94%  64.64  49.01  39.47   87   77   19  117  273   26
D-B              78.57%  35.34%  63.13  48.75  39.71   88   78   24  114  260   24
D-EWN-M          81.98%  36.55%  65.66  50.56  41.10   91   82   20  112  260   24
D-Dér            82.24%  35.34%  64.99  49.44  39.89   88   78   19  116  267   25
D-B-EWN-M-Dér    79.83%  38.15%  65.52  51.63  42.60   95   84   24  108  268   25
All enrichments  80.00%  38.55%  65.84  52.03  43.01   96   84   24  108  280   26

With coreference - Focus absent - Rejection of answers at threshold 40
Reference         0.00%   0.00%   0.00   0.00   0.00    0    0    0  200  107    0
Blind synonyms    0.00%   0.00%   0.00   0.00   0.00    0    0    0  200  117    0
Syntax           86.36%  30.52%  63.23  45.10  35.06   76   68   12  126  244   22
Synonyms         86.02%  32.13%  64.41  46.78  36.73   80   70   13  124  272   24
D                82.83%  32.93%  63.57  47.13  37.44   82   73   17  120  266   23
D-Syn            83.00%  33.33%  63.94  47.56  37.86   83   73   17  120  279   25
D-B              79.81%  33.33%  62.41  47.03  37.73   83   74   21  118  268   24
D-EWN-M          82.69%  34.54%  64.66  48.73  39.09   86   77   18  116  267   23
D-Dér            83.17%  33.73%  64.32  48.00  38.29   84   74   17  119  273   24
D-B-EWN-M-Dér    81.65%  35.74%  64.96  49.72  40.27   89   79   20  114  278   25
All enrichments  81.82%  36.14%  65.31  50.14  40.69   90   79   20  114  290   26

With coreference - Focus present - No rejection of answers without concordances
Reference        83.18%  35.74%  65.73  50.00  40.34   89   78   18  116    0    0
Blind synonyms   81.20%  38.15%  66.25  51.91  42.68   95   81   22  113    0    0
Syntax           83.51%  32.53%  63.58  46.82  37.05   81   72   16  122    0    0
Synonyms         81.31%  34.94%  64.25  48.88  39.44   87   75   20  119    0    0
D                78.26%  36.14%  63.47  49.45  40.50   90   79   25  115    0    0
D-Syn            77.78%  36.55%  63.46  49.73  40.88   91   79   26  115    0    0
D-B              74.80%  36.95%  62.08  49.46  41.11   92   80   31  112    0    0
D-EWN-M          78.51%  38.15%  64.80  51.35  42.52   95   84   26  110    0    0
D-Dér            78.63%  36.95%  64.16  50.27  41.33   92   80   25  114    0    0
D-B-EWN-M-Dér    76.15%  39.76%  64.37  52.24  43.96   99   86   31  106    0    0
All enrichments  75.76%  40.16%  64.35  52.49  44.33  100   86   32  106    0    0

With coreference - Focus present - Rejection of answers at threshold 0
Reference         0.00%   0.00%   0.00   0.00   0.00    0    0    0  200  107    0
Blind synonyms    0.00%   0.00%   0.00   0.00   0.00    0    0    0  200  117    0
Syntax           84.78%  31.33%  63.21  45.75  35.85   78   70   14  124    5    1
Synonyms         83.67%  32.93%  63.96  47.26  37.48   82   72   16  122    9    2
D                81.73%  34.14%  63.91  48.16  38.64   85   76   19  118   11    3
D-Syn            81.13%  34.54%  63.89  48.45  39.02   86   76   20  118   11    3
D-B              78.38%  34.94%  62.77  48.33  39.30   87   77   24  115   12    3
D-EWN-M          81.82%  36.14%  65.31  50.14  40.69   90   81   20  113   11    3
D-Dér            82.08%  34.94%  64.64  49.01  39.47   87   77   19  117   11    3
D-B-EWN-M-Dér    79.66%  37.75%  65.19  51.23  42.19   94   83   24  109   12    3
All enrichments  79.83%  38.15%  65.52  51.63  42.60   95   83   24  109   13    3

With coreference - Focus present - Rejection of answers at threshold 10
Reference         0.00%   0.00%   0.00   0.00   0.00    0    0    0  200  107    0
Blind synonyms    0.00%   0.00%   0.00   0.00   0.00    0    0    0  200  117    0
Syntax           84.78%  31.33%  63.21  45.75  35.85   78   70   14  124    5    1
Synonyms         83.67%  32.93%  63.96  47.26  37.48   82   72   16  122    9    2
D                81.73%  34.14%  63.91  48.16  38.64   85   76   19  118   11    3
D-Syn            81.13%  34.54%  63.89  48.45  39.02   86   76   20  118   11    3
D-B              78.38%  34.94%  62.77  48.33  39.30   87   77   24  115   12    3
D-EWN-M          81.82%  36.14%  65.31  50.14  40.69   90   81   20  113   11    3
D-Dér            82.08%  34.94%  64.64  49.01  39.47   87   77   19  117   11    3
D-B-EWN-M-Dér    79.66%  37.75%  65.19  51.23  42.19   94   83   24  109   12    3
All enrichments  79.83%  38.15%  65.52  51.63  42.60   95   83   24  109   13    3

With coreference - Focus present - Rejection of answers at threshold 20
Reference         0.00%   0.00%   0.00   0.00   0.00    0    0    0  200  107    0
Blind synonyms    0.00%   0.00%   0.00   0.00   0.00    0    0    0  200  117    0
Syntax           84.62%  30.92%  62.81  45.29  35.42   77   69   14  125    6    1
Synonyms         83.51%  32.53%  63.58  46.82  37.05   81   71   16  123   10    2
D                81.37%  33.33%  63.17  47.29  37.80   83   74   19  120   13    3
D-Syn            80.77%  33.73%  63.16  47.59  38.18   84   74   20  120   13    3
D-B              77.98%  34.14%  62.04  47.49  38.46   85   75   24  117   14    3
D-EWN-M          81.48%  35.34%  64.61  49.30  39.86   88   79   20  115   13    3
D-Dér            81.73%  34.14%  63.91  48.16  38.64   85   75   19  119   13    3
D-B-EWN-M-Dér    79.31%  36.95%  64.52  50.41  41.37   92   81   24  111   14    3
All enrichments  79.49%  37.35%  64.85  50.82  41.78   93   81   24  111   15    3

With coreference - Focus present - Rejection of answers at threshold 30
Reference         0.00%   0.00%   0.00   0.00   0.00    0    0    0  200  107    0
Blind synonyms    0.00%   0.00%   0.00   0.00   0.00    0    0    0  200  117    0
Syntax           84.62%  30.92%  62.81  45.29  35.42   77   69   14  125    6    1
Synonyms         84.38%  32.53%  63.98  46.96  37.09   81   71   15  123   11    3
D                81.37%  33.33%  63.17  47.29  37.80   83   74   19  120   13    3
D-Syn            81.55%  33.73%  63.54  47.73  38.22   84   74   19  120   14    4
D-B              77.98%  34.14%  62.04  47.49  38.46   85   75   24  117   14    3
D-EWN-M          81.48%  35.34%  64.61  49.30  39.86   88   79   20  115   13    3
D-Dér            81.73%  34.14%  63.91  48.16  38.64   85   75   19  119   13    3
D-B-EWN-M-Dér    79.31%  36.95%  64.52  50.41  41.37   92   81   24  111   14    3
All enrichments  79.49%  37.35%  64.85  50.82  41.78   93   81   24  111   15    3

With coreference - Focus present - Rejection of answers at threshold 40
Reference         0.00%   0.00%   0.00   0.00   0.00    0    0    0  200  107    0
Blind synonyms    0.00%   0.00%   0.00   0.00   0.00    0    0    0  200  117    0
Syntax           85.88%  29.32%  61.97  43.71  33.77   73   65   12  129   12    1
Synonyms         85.56%  30.92%  63.22  45.43  35.45   77   67   13  127   17    2
D                82.29%  31.73%  62.40  45.80  36.17   79   70   17  123   19    2
D-Syn            82.47%  32.13%  62.79  46.24  36.60   80   70   17  123   20    3
D-B              79.21%  32.13%  61.26  45.71  36.46   80   71   21  121   22    3
D-EWN-M          82.18%  33.33%  63.55  47.43  37.83   83   74   18  119   20    2
D-Dér            82.65%  32.53%  63.18  46.69  37.02   81   71   17  122   19    2
D-B-EWN-M-Dér    81.13%  34.54%  63.89  48.45  39.02   86   76   20  117   24    3
All enrichments  81.31%  34.94%  64.25  48.88  39.44   87   76   20  117   25    3

Without coreference - Focus absent - No rejection of answers without concordances
Reference        83.75%  26.91%  58.88  40.73  31.13   67   59   13  136    0    0
Blind synonyms   80.90%  28.92%  59.50  42.60  33.18   72   61   17  133    0    0
Syntax           31.75%  32.13%  31.82  31.94  32.05   80   64  172  125    0    0
Synonyms         29.68%  33.73%  30.41  31.58  32.84   84   67  199  121    0    0
D                30.74%  34.94%  31.50  32.71  34.01   87   70  196  117    0    0
D-Syn            29.29%  34.94%  30.27  31.87  33.64   87   70  210  116    0    0
D-B              30.45%  35.34%  31.32  32.71  34.24   88   71  201  115    0    0
D-EWN-M          31.49%  36.55%  32.38  33.83  35.41   91   74  198  112    0    0
D-Dér            30.34%  35.34%  31.23  32.65  34.21   88   70  202  117    0    0
D-B-EWN-M-Dér    30.79%  37.35%  31.91  33.76  35.82   93   75  209  110    0    0
All enrichments  29.62%  37.35%  30.90  33.04  35.50   93   75  221  110    0    0

Without coreference - Focus absent - Rejection of answers at threshold 0
Reference         0.00%   0.00%   0.00   0.00   0.00    0    0    0  200   80    0
Blind synonyms    0.00%   0.00%   0.00   0.00   0.00    0    0    0  200   89    0
Syntax           33.33%  30.52%  32.73  31.87  31.05   76   61  152  128   24    8
Synonyms         31.25%  32.13%  31.42  31.68  31.95   80   64  176  124   27   10
D                32.55%  33.33%  32.70  32.94  33.17   83   67  172  120   28   10
D-Syn            30.97%  33.33%  31.42  32.11  32.83   83   67  185  119   29   11
D-B              32.31%  33.73%  32.58  33.01  33.44   84   68  176  118   29   10
D-EWN-M          33.33%  34.94%  33.64  34.12  34.61   87   71  174  115   28   10
D-Dér            32.18%  33.73%  32.48  32.94  33.41   84   67  177  120   29   11
D-B-EWN-M-Dér    32.72%  35.74%  33.28  34.17  35.09   89   72  183  113   30   11
All enrichments  31.34%  35.74%  32.13  33.40  34.77   89   72  195  113   30   11

Without coreference - Focus absent - Rejection of answers at threshold 10
Reference         0.00%   0.00%   0.00   0.00   0.00    0    0    0  200   80    0
Blind synonyms    0.00%   0.00%   0.00   0.00   0.00    0    0    0  200   89    0
Syntax           33.48%  30.52%  32.84  31.93  31.07   76   61  151  129   25    8
Synonyms         31.37%  32.13%  31.52  31.75  31.97   80   64  175  125   28   10
D                32.68%  33.33%  32.81  33.00  33.20   83   67  171  121   29   10
D-Syn            31.09%  33.33%  31.51  32.17  32.86   83   67  184  120   30   11
D-B              32.43%  33.73%  32.68  33.07  33.47   84   68  175  119   30   10
D-EWN-M          33.46%  34.94%  33.75  34.18  34.63   87   71  173  116   29   10
D-Dér            32.31%  33.73%  32.58  33.01  33.44   84   67  176  121   30   11
D-B-EWN-M-Dér    32.84%  35.74%  33.38  34.23  35.12   89   72  182  114   31   11
All enrichments  31.45%  35.74%  32.22  33.46  34.79   89   72  194  114   31   11

Without coreference - Focus absent - Rejection of answers at threshold 20
Reference         0.00%   0.00%   0.00   0.00   0.00    0    0    0  200   80    0
Blind synonyms    0.00%   0.00%   0.00   0.00   0.00    0    0    0  200   89    0
Syntax           86.67%  26.10%  59.20  40.12  30.35   65   57   10  138  177   16
Synonyms         84.34%  28.11%  60.24  42.17  32.44   70   60   13  134  200   19
D                81.61%  28.51%  59.46  42.26  32.78   71   62   16  131  196   18
D-Syn            80.00%  28.92%  59.11  42.48  33.15   72   62   18  130  207   20
D-B              78.26%  28.92%  58.35  42.23  33.09   72   63   20  129  197   18
D-EWN-M          80.65%  30.12%  60.39  43.86  34.44   75   66   18  126  196   18
D-Dér            81.61%  28.51%  59.46  42.26  32.78   71   62   16  131  203   19
D-B-EWN-M-Dér    76.77%  30.52%  58.91  43.68  34.70   76   67   23  123  203   19
All enrichments  77.00%  30.92%  59.32  44.13  35.13   77   67   23  124  214   20

Without coreference - Focus absent - Rejection of answers at threshold 30
Reference         0.00%   0.00%   0.00   0.00   0.00    0    0    0  200   80    0
Blind synonyms    0.00%   0.00%   0.00   0.00   0.00    0    0    0  200   89    0
Syntax           87.14%  24.50%  57.66  38.24  28.61   61   55    9  140  182   18
Synonyms         86.67%  26.10%  59.20  40.12  30.35   65   57   10  138  208   21
D                82.50%  26.51%  58.00  40.12  30.67   66   59   14  134  203   20
D-Syn            82.72%  26.91%  58.46  40.61  31.10   67   59   14  134  216   22
D-B              78.82%  26.91%  56.88  40.12  30.99   67   60   18  132  204   20
D-EWN-M          81.40%  28.11%  59.02  41.79  32.35   70   63   16  129  203   20
D-Dér            82.50%  26.51%  58.00  40.12  30.67   66   59   14  134  210   21
D-B-EWN-M-Dér    77.17%  28.51%  57.54  41.64  32.63   71   64   21  126  210   21
All enrichments  78.26%  28.92%  58.35  42.23  33.09   72   64   20  127  222   22

Without coreference - Focus absent - Rejection of answers at threshold 40
Reference         0.00%   0.00%   0.00   0.00   0.00    0    0    0  200   80    0
Blind synonyms    0.00%   0.00%   0.00   0.00   0.00    0    0    0  200   89    0
Syntax           87.88%  23.29%  56.53  36.83  27.31   58   52    8  143  186   18
Synonyms         87.32%  24.90%  58.16  38.75  29.05   62   54    9  141  212   20
D                82.89%  25.30%  56.96  38.77  29.38   63   56   13  136  207   19
D-Syn            83.12%  25.70%  57.45  39.26  29.82   64   56   13  136  220   21
D-B              79.75%  25.30%  55.75  38.41  29.30   63   56   16  135  210   19
D-EWN-M          82.50%  26.51%  58.00  40.12  30.67   66   59   14  132  209   20
D-Dér            82.89%  25.30%  56.96  38.77  29.38   63   56   13  136  214   20
D-B-EWN-M-Dér    79.52%  26.51%  56.80  39.76  30.58   66   59   17  131  219   21
All enrichments  80.72%  26.91%  57.66  40.36  31.05   67   59   16  132  231   22

Without coreference - Focus present - No rejection of answers without concordances
Reference        83.75%  26.91%  58.88  40.73  31.13   67   59   13  136    0    0
Blind synonyms   80.90%  28.92%  59.50  42.60  33.18   72   61   17  133    0    0
Syntax           84.93%  24.90%  57.30  38.51  29.00   62   55   11  140    0    0
Synonyms         81.48%  26.51%  57.59  40.00  30.64   66   57   15  137    0    0
D                77.27%  27.31%  56.57  40.36  31.37   68   60   20  133    0    0
D-Syn            76.67%  27.71%  56.65  40.71  31.77   69   60   21  132    0    0
D-B              73.40%  27.71%  55.20  40.23  31.65   69   61   25  131    0    0
D-EWN-M          76.60%  28.92%  57.60  41.98  33.03   72   64   22  128    0    0
D-Dér            77.27%  27.31%  56.57  40.36  31.37   68   60   20  133    0    0
D-B-EWN-M-Dér    72.28%  29.32%  55.90  41.71  33.27   73   65   28  125    0    0
All enrichments  73.27%  29.72%  56.66  42.29  33.73   74   65   27  126    0    0

Without coreference - Focus present - Rejection of answers at threshold 0
Reference         0.00%   0.00%   0.00   0.00   0.00    0    0    0  200   80    0
Blind synonyms    0.00%   0.00%   0.00   0.00   0.00    0    0    0  200   89    0
Syntax           86.76%  23.69%  56.62  37.22  27.73   59   53    9  142    5    1
Synonyms         85.14%  25.30%  57.80  39.01  29.44   63   55   11  139    7    2
D                82.28%  26.10%  57.52  39.63  30.23   65   58   14  135    9    3
D-Syn            81.48%  26.51%  57.59  40.00  30.64   66   58   15  134    9    3
D-B              78.57%  26.51%  56.41  39.64  30.56   66   59   18  133   10    3
D-EWN-M          81.18%  27.71%  58.57  41.32  31.91   69   62   16  130    9    3
D-Dér            82.28%  26.10%  57.52  39.63  30.23   65   58   14  135    9    3
D-B-EWN-M-Dér    76.92%  28.11%  57.10  41.18  32.20   70   63   21  127   10    3
All enrichments  78.02%  28.51%  57.91  41.76  32.66   71   63   20  128   10    3

Without coreference - Focus present - Rejection of answers at threshold 10
Reference         0.00%   0.00%   0.00   0.00   0.00    0    0    0  200   80    0
Blind synonyms    0.00%   0.00%   0.00   0.00   0.00    0    0    0  200   89    0
Syntax           86.76%  23.69%  56.62  37.22  27.73   59   53    9  142    5    1
Synonyms         85.14%  25.30%  57.80  39.01  29.44   63   55   11  139    7    2
D                82.28%  26.10%  57.52  39.63  30.23   65   58   14  135    9    3
D-Syn            81.48%  26.51%  57.59  40.00  30.64   66   58   15  134    9    3
D-B              78.57%  26.51%  56.41  39.64  30.56   66   59   18  133   10    3
D-EWN-M          81.18%  27.71%  58.57  41.32  31.91   69   62   16  130    9    3
D-Dér            82.28%  26.10%  57.52  39.63  30.23   65   58   14  135    9    3
D-B-EWN-M-Dér    76.92%  28.11%  57.10  41.18  32.20   70   63   21  127   10    3
All enrichments  78.02%  28.51%  57.91  41.76  32.66   71   63   20  128   10    3

Without coreference - Focus present - Rejection of answers at threshold 20
Reference         0.00%   0.00%   0.00   0.00   0.00    0    0    0  200   80    0
Blind synonyms    0.00%   0.00%   0.00   0.00   0.00    0    0    0  200   89    0
Syntax           86.57%  23.29%  56.09  36.71  27.28   58   52    9  143    6    1
Synonyms         84.93%  24.90%  57.30  38.51  29.00   62   54   11  140    8    2
D                81.82%  25.30%  56.55  38.65  29.36   63   56   14  137   11    3
D-Syn            81.01%  25.70%  56.64  39.02  29.77   64   56   15  136   11    3
D-B              78.05%  25.70%  55.46  38.67  29.68   64   57   18  135   12    3
D-EWN-M          80.72%  26.91%  57.66  40.36  31.05   67   60   16  132   11    3
D-Dér            81.82%  25.30%  56.55  38.65  29.36   63   56   14  137   11    3
D-B-EWN-M-Dér    76.40%  27.31%  56.20  40.24  31.34   68   61   21  129   12    3
All enrichments  77.53%  27.71%  57.02  40.83  31.80   69   61   20  130   12    3

Without coreference - Focus present - Rejection of answers at threshold 30
Reference         0.00%   0.00%   0.00   0.00   0.00    0    0    0  200   80    0
Blind synonyms    0.00%   0.00%   0.00   0.00   0.00    0    0    0  200   89    0
Syntax           86.57%  23.29%  56.09  36.71  27.28   58   52    9  143    6    1
Synonyms         86.11%  24.90%  57.73  38.63  29.03   62   54   10  141    9    2
D                81.82%  25.30%  56.55  38.65  29.36   63   56   14  137   11    3
D-Syn            82.05%  25.70%  57.04  39.14  29.80   64   56   14  137   12    3
D-B              78.05%  25.70%  55.46  38.67  29.68   64   57   18  135   12    3
D-EWN-M          80.72%  26.91%  57.66  40.36  31.05   67   60   16  132   11    3
D-Dér            81.82%  25.30%  56.55  38.65  29.36   63   56   14  137   11    3
D-B-EWN-M-Dér    76.40%  27.31%  56.20  40.24  31.34   68   61   21  129   12    3
All enrichments  77.53%  27.71%  57.02  40.83  31.80   69   61   20  130   12    3

Without coreference - Focus present - Rejection of answers at threshold 40
Reference         0.00%   0.00%   0.00   0.00   0.00    0    0    0  200   80    0
Blind synonyms    0.00%   0.00%   0.00   0.00   0.00    0    0    0  200   89    0
Syntax           87.30%  22.09%  54.89  35.26  25.97   55   49    8  146   10    1
Synonyms         86.76%  23.69%  56.62  37.22  27.73   59   51    9  144   13    1
D                82.19%  24.10%  55.45  37.27  28.06   60   53   13  139   15    2
D-Syn            82.43%  24.50%  55.96  37.77  28.50   61   53   13  139   16    2
D-B              78.95%  24.10%  54.25  36.92  27.99   60   53   16  138   18    2
D-EWN-M          81.82%  25.30%  56.55  38.65  29.36   63   56   14  135   17    3
D-Dér            82.19%  24.10%  55.45  37.27  28.06   60   53   13  139   15    2
D-B-EWN-M-Dér    78.75%  25.30%  55.36  38.30  29.28   63   56   17  134   21    3
All enrichments  80.00%  25.70%  56.24  38.91  29.74   64   56   16  135   21    3

Summary

This thesis presents an original method for identifying and structuring the information in documents and for querying it. Because linguistic methods improve the results of current systems, the approach relies on linguistic analyses and lexical resources. A high-level grammatical analysis (morphological, syntactic, and semantic) first identifies the pieces of information and links them to one another. Since queries provide little context, it is the texts that are analyzed. The content of the resources then gives each piece of information many realizations through contextual transformations: simple and complex synonymy, derivations with adaptation of the syntactic context, addition of semantic features, and so on. Finally, the querying of the texts is tested. A morpho-syntactic analysis of the question identifies its pieces of information and selects the type of the expected answer. The text fragment containing these data constitutes the answer to the question.

Keywords: question answering, information extraction, information retrieval, lexical semantic disambiguation, electronic dictionary

Abstract

Construct and question the informative structure from a French documentary base

This thesis presents an original methodology for identifying and structuring the information in a French textual base in order to query it. Linguistic techniques make it possible to improve on the results of current methods, so we propose a methodology that uses high-level linguistic analysis (morphology, syntax, and word sense disambiguation) to identify each piece of information and to connect the pieces to one another. Because queries carry little context, it is the texts that are analyzed. Information from lexico-semantic resources is then used to transform each piece of information into many realizations: simple and complex synonymy, derivation with adaptation to the syntactic context, addition of semantic features and categories, and so on. Finally, we tested the querying method, which uses a morpho-syntactic analysis to identify each piece of information in the question and to determine the semantic type of the required answer; the passage of the texts containing these data is the answer to the question.

Keywords: question answering (QA), information extraction (IE), information retrieval (IR), word sense disambiguation (WSD), electronic dictionary


Université de la Sorbonne Nouvelle – Paris III
ILPGA, Institut de Linguistique et Phonétique Générales et Appliquées
19, rue des Bernardins
75005 Paris