meteva

Python programs for the verification of meteorological products.


Numeric verification metrics

<pre><code class="language-python">import pandas as pd
import meteva.method as mem
import meteva.base as meb
import meteva.product as mpd
import meteva.perspact as mps  # perspective-analysis module
import datetime
import meteva
import numpy as np
import os
import traceback</code></pre>
<h1>Grouped statistics of verification metrics (_df)</h1>
<p><strong>score_df(df_mid,method,s = None,g = None,gll_dict = None,plot = None,** kwargs)</strong><br /> Computes verification metrics by group from intermediate verification quantities stored in a pandas.DataFrame.</p>
<table> <thead> <tr> <th style="text-align: left;">Parameter</th> <th style="text-align: left;">Description</th> </tr> </thead> <tbody>
<tr> <td style="text-align: left;"><strong>df_mid</strong></td> <td style="text-align: left;">Intermediate verification quantities computed from the forecast and observation data</td> </tr>
<tr> <td style="text-align: left;"><strong>method</strong></td> <td style="text-align: left;">A numeric verification metric function from meteva.method. For example, since this page has already run import meteva.method as mem, the TS score function can be written as mem.ts and the root-mean-square-error function as mem.rmse. The available functions include the scoring functions for <a href="https://www.showdoc.cc/nmc?page_id=2858658548509727">continuous forecasts</a>, <a href="https://www.showdoc.cc/nmc?page_id=2860336958932349">binary forecasts</a>, <a href="https://www.showdoc.cc/nmc?page_id=2859693269266585">multi-category forecasts</a>, <a href="https://www.showdoc.cc/nmc?page_id=3651805946039771">probabilistic forecasts</a> and <a href="https://www.showdoc.cc/nmc?page_id=3629735872716279">ensemble forecasts</a></td> </tr>
<tr> <td style="text-align: left;"><strong>s</strong></td> <td style="text-align: left;">A dictionary used to select data samples; for details see the <a href="https://www.showdoc.cc/meteva?page_id=3975604785954540">s</a> parameter of meb.sele_by_dict</td> </tr>
<tr> <td style="text-align: left;"><strong>g</strong></td> <td style="text-align: left;">Grouping parameter. It can be a string, meaning grouping by a single dimension, or a list of strings, meaning grouping by several dimensions at once. For a single grouping dimension, see meb.group, specifically its <a href="https://www.showdoc.cc/meteva?page_id=4071849185300418">&lt;font 
face=&quot;黑体&quot; color=red size=5&gt;g&lt;/font&gt;</a> parameter</td> </tr>
<tr> <td style="text-align: left;"><strong>gll_dict</strong></td> <td style="text-align: left;">Specifies how each grouping dimension is partitioned; for a single dimension, see the examples below and the <a href="https://www.showdoc.cc/meteva?page_id=4071849185300418">gll</a> parameter of meb.group</td> </tr>
<tr> <td style="text-align: left;"><strong>plot</strong></td> <td style="text-align: left;">Whether to plot the verification result directly. None draws nothing, &quot;bar&quot; draws a bar chart, and &quot;plot&quot; or &quot;line&quot; draws a line chart</td> </tr>
<tr> <td style="text-align: left;">**kwargs</td> <td style="text-align: left;">Optional parameters of meteva.base.bar or meteva.base.plot; see the examples below</td> </tr>
<tr> <td style="text-align: left;">return</td> <td style="text-align: left;">A tuple of two elements: element 0 is a numpy array of the metric values, and element 1 is a dictionary describing the dimensions of that array</td> </tr>
</tbody> </table>
<p><strong>Example usage:</strong></p>
<pre><code class="language-python"># Load the prepared observation and forecast data
path = r&quot;H:\test_data\input\mps\rain24.h5&quot;
sta_all = pd.read_hdf(path)
print(sta_all)</code></pre>
<pre><code>        level                 time  dtime     id     lon    lat  OBS  \
200113      0  2022-07-11 08:00:00     24  50136  122.52  52.97  0.0
200114      0  2022-07-11 08:00:00     24  50137  122.37  53.47  1.1
200115      0  2022-07-11 08:00:00     24  50246  124.72  52.35  7.1
200116      0  2022-07-11 08:00:00     24  50247  123.57  52.03  1.0
200117      0  2022-07-11 08:00:00     24  50349  124.40  51.67  1.3
...       ...                  ...    ...    ...     ...    ...  ...
96435       0  2022-07-20 08:00:00     72  59945  109.70  18.65  0.0
96436       0  2022-07-20 08:00:00     72  59948  109.58  18.22  0.0
96437       0  2022-07-20 08:00:00     72  59951  110.33  18.80  0.0
96438       0  2022-07-20 08:00:00     72  59954  110.03  18.55  0.0
96439       0  2022-07-20 08:00:00     72  59981  112.33  16.83  0.0

             ECMWF  SCMOC
200113       5.447   0.96
200114       2.250   1.06
200115       4.791   0.51
200116       6.302   0.77
200117       0.961   0.68
...            ...    ...
96435   999999.000   0.97
96436   999999.000   1.36
96437   999999.000   1.52
96438   999999.000   1.32
96439   999999.000   0.00

[126128 rows x 9 columns]</code></pre>
<pre><code class="language-python"># Load the grouping scheme for the stations.
# The grouping argument must be a DataFrame with two columns: the first is the station id, the second is the group label of each station.
# For example, to group stations by province, the label of each station is its province name.
path = r&quot;H:\test_data\input\mps\station_id_province_name.dat&quot;
id_province = pd.read_csv(path, sep=&quot;\\s+&quot;, header=None, usecols=[3, 4])
id_province.columns = [&quot;id&quot;, &quot;province&quot;]
print(id_province)</code></pre>
<pre><code>           id province
0       54398       北京
1       54399       北京
2       54406       北京
3       54416       北京
4       54419       北京
...       ...      ...
37139  899209       新疆
37140  899211       新疆
37141  899223       新疆
37142  899233       新疆
37143  899533       新疆

[37144 rows x 2 columns]</code></pre>
<pre><code class="language-python"># Compute the intermediate verification quantities
df_hfmc = mps.middle_df_sta(sta_all,meteva.method.hfmc,grade_list=[0.1,10,25,50,100,250],gid=id_province)
print(df_hfmc)  # H, F, M and C are the numbers of hits, false alarms, misses and correct negatives.</code></pre>
<pre><code>                    time  dtime member province  grade     H    F    M     C
0    2022-07-11 08:00:00     24  ECMWF       北京    0.1  15.0  0.0  0.0   0.0
1    2022-07-11 08:00:00     24  ECMWF       北京   10.0   0.0  0.0  6.0   9.0
2    2022-07-11 08:00:00     24  ECMWF       北京   25.0   0.0  0.0  0.0  15.0
3    2022-07-11 08:00:00     24  ECMWF       北京   50.0   0.0  0.0  0.0  15.0
4    2022-07-11 08:00:00     24  ECMWF       北京  100.0   0.0  0.0  0.0  15.0
..                   ...    ...    ...      ...    ...   ...  ...  ...   ...
181  2022-07-20 08:00:00     72  SCMOC       新疆   10.0   0.0  0.0  0.0  88.0
182  2022-07-20 08:00:00     72  SCMOC       新疆   25.0   0.0  0.0  0.0  88.0
183  2022-07-20 08:00:00     72  SCMOC       新疆   50.0   0.0  0.0  0.0  88.0
184  2022-07-20 08:00:00     72  SCMOC       新疆  100.0   0.0  0.0  0.0  88.0
185  2022-07-20 08:00:00     72  SCMOC       新疆  250.0   0.0  0.0  0.0  88.0

[16740 rows x 9 columns]</code></pre>
<pre><code class="language-python"># g = [&quot;grade&quot;,&quot;member&quot;,&quot;dtime&quot;] groups the verification by grade, member and dtime in turn, without distinguishing time or province.
# When plotting, the last grouping dimension becomes the x-axis, the second-to-last becomes the legend, and any earlier ones become subplots.
score_array,g_name_dict = mps.score_df(df_hfmc,mem.ts,g = [&quot;grade&quot;,&quot;member&quot;,&quot;dtime&quot;],plot = &quot;bar&quot;)</code></pre>
<p><img src="https://www.showdoc.com.cn/server/api/attachment/visitFile?sign=37c69343095ce0db79078b340ed1f78c&amp;file=file.png" alt="" /></p>
<pre><code class="language-python">print(g_name_dict)        # names and labels of the grouping dimensions
print(score_array.shape)  # shape of the result array
print(score_array)        # the score values; 999999 marks missing data or a 0/0 case</code></pre>
<pre><code>{'grade': array([1.0e-01, 1.0e+01, 2.5e+01, 5.0e+01, 1.0e+02, 2.5e+02]), 'member': array(['ECMWF', 'SCMOC'], dtype=object), 'dtime': [24, 48, 72]}
(6, 2, 3)
[[[5.66632155e-01 5.64092225e-01 9.99999000e+05]
  [6.36637435e-01 6.30343330e-01 6.20889081e-01]]

 [[4.09935058e-01 3.61929710e-01 9.99999000e+05]
  [4.16827438e-01 3.64711082e-01 3.71160505e-01]]

 [[3.04871661e-01 2.36801242e-01 9.99999000e+05]
  [3.19990954e-01 2.60084138e-01 2.42332831e-01]]

 [[1.96384552e-01 1.02092050e-01 9.99999000e+05]
  [2.57264352e-01 1.36645963e-01 1.10217216e-01]]

 [[3.20512821e-02 2.06896552e-02 9.99999000e+05]
  [7.72946860e-02 2.68817204e-02 3.27868852e-02]]

 [[0.00000000e+00 0.00000000e+00 9.99999000e+05]
  [9.99999000e+05 9.99999000e+05 9.99999000e+05]]]</code></pre>
<pre><code class="language-python"># Changing the order of the grouping dimensions changes the plot layout accordingly
score_array,g_name_dict = mps.score_df(df_hfmc,mem.ts,g = 
[&quot;dtime&quot;,&quot;grade&quot;,&quot;member&quot;],plot = &quot;bar&quot;)</code></pre>
<p><img src="https://www.showdoc.com.cn/server/api/attachment/visitFile?sign=f60464a88063c0091c0a61ef80519cb0&amp;file=file.png" alt="" /></p>
<pre><code class="language-python"># Every metric derived from hfmc can be computed from df_hfmc; for example, bias is computed as follows
score_array,g_name_dict = mps.score_df(df_hfmc,mem.bias,g = [&quot;dtime&quot;,&quot;grade&quot;,&quot;member&quot;],plot = &quot;bar&quot;)</code></pre>
<p><img src="https://www.showdoc.com.cn/server/api/attachment/visitFile?sign=eb3b28116c2419faf47cc39fa6ad4a30&amp;file=file.png" alt="" /></p>
<pre><code class="language-python"># s = {&quot;dtime&quot;:24} selects only the data at the 24-hour lead time
# g = [&quot;grade&quot;,&quot;member&quot;,&quot;province&quot;] groups the verification by grade, member and province in turn, without distinguishing the time dimension
score_array,g_name_dict = mps.score_df(df_hfmc,mem.ts,s = {&quot;dtime&quot;:24},g = [&quot;grade&quot;,&quot;member&quot;,&quot;province&quot;],plot = &quot;line&quot;, spasify_xticks = 1)</code></pre>
<p><img src="https://www.showdoc.com.cn/server/api/attachment/visitFile?sign=17d99de362616b079c77db933b6f2ad7&amp;file=file.png" alt="" /></p>
<pre><code class="language-python"># s = {&quot;dtime&quot;:24,&quot;grade&quot;:[25]} selects only the 25 mm grade at the 24-hour lead time
# g = [&quot;member&quot;,&quot;time&quot;] groups the verification by member and time in turn
score_array,g_name_dict = mps.score_df(df_hfmc,mem.ts,s = {&quot;dtime&quot;:24,&quot;grade&quot;:[25]},g = [&quot;member&quot;,&quot;time&quot;],plot = &quot;line&quot;)</code></pre>
<p><img src="https://www.showdoc.com.cn/server/api/attachment/visitFile?sign=14234d0bd3d1d80dd0a96fea8a7fea0e&amp;file=file.png" alt="" /></p>
<pre><code class="language-python"># Compute the event frequencies of the observations and the forecasts
score_array,g_name_dict = \
mps.score_df(df_hfmc,mem.ob_fo_hr,g=[&quot;grade&quot;,&quot;member&quot;,&quot;dtime&quot;], plot = &quot;bar&quot;,vmin = 0)</code></pre>
<p><img src="https://www.showdoc.com.cn/server/api/attachment/visitFile?sign=23248e49045f66416f5f6a9353e67977&amp;file=file.png" alt="" /></p>
<pre><code class="language-python"># Compute the intermediate quantities needed for mean error, mean absolute error and root-mean-square error
df_tase = mps.middle_df_sta(sta_all,meteva.method.tase,gid=id_province)
print(df_tase)  # T, E, A and S are the total sample count, the sum of errors, the sum of absolute errors and the sum of squared errors</code></pre>
<pre><code>                   time  dtime member province      T         E         A  \
0   2022-07-11 08:00:00     24  ECMWF       北京   15.0   -57.290    64.394
1   2022-07-11 08:00:00     24  ECMWF       天津   11.0   236.114   254.426
2   2022-07-11 08:00:00     24  ECMWF       河北  142.0   517.643  1761.689
3   2022-07-11 08:00:00     24  ECMWF       山西  107.0   642.745  1827.355
4   2022-07-11 08:00:00     24  ECMWF       辽宁   55.0     2.689     7.621
..                  ...    ...    ...      ...    ...       ...       ...
26  2022-07-20 08:00:00     72  SCMOC      内蒙古  107.0  -239.270   381.790
27  2022-07-20 08:00:00     72  SCMOC       西藏   39.0    68.820    86.900
28  2022-07-20 08:00:00     72  SCMOC       青海   43.0     0.930     0.930
29  2022-07-20 08:00:00     72  SCMOC       宁夏   19.0    -1.700     1.700
30  2022-07-20 08:00:00     72  SCMOC       新疆   88.0    -8.390     9.930

               S
0     326.258558
1    9374.976760
2   45901.448617
3   55506.393427
4       6.624533
..           ...
26   6130.694300
27    560.525200
28      0.349100
29      2.890000
30     25.391500

[2790 rows x 8 columns]</code></pre>
<pre><code class="language-python">score_array,g_name_dict = mps.score_df(df_tase,mem.rmse,g = [&quot;dtime&quot;,&quot;member&quot;,&quot;time&quot;],plot = &quot;line&quot;,ncol = 3)</code></pre>
<p><img src="https://www.showdoc.com.cn/server/api/attachment/visitFile?sign=ebad81d0d5eeeaa5d09d45b774388e87&amp;file=file.png" alt="" /></p>
<pre><code class="language-python"># Compute the intermediate quantities needed for rain/no-rain verification
df_hfmc_sr = mps.middle_df_sta(sta_all,meteva.method.hfmc_of_sun_rain,gid=id_province)
print(df_hfmc_sr)  # Hsr, Fsr, Msr and Csr are the numbers of hits, false alarms, misses and correct negatives for the rain event.</code></pre>
<pre><code>                   time  dtime member province    Hsr   Fsr  Msr   Csr
0   2022-07-11 08:00:00     24  ECMWF       北京   15.0   0.0  0.0   0.0
1   2022-07-11 08:00:00     24  ECMWF       天津   11.0   0.0  0.0   0.0
2   2022-07-11 08:00:00     24  ECMWF       河北  129.0  11.0  2.0   0.0
3   2022-07-11 08:00:00     24  ECMWF       山西  106.0   1.0  0.0   0.0
4   2022-07-11 08:00:00     24  ECMWF       辽宁    4.0   8.0  2.0  41.0
..                  ...    ...    ...      ...    ...   ...  ...   ...
26  2022-07-20 08:00:00     72  SCMOC      内蒙古   56.0  17.0  5.0  29.0
27  2022-07-20 08:00:00     72  SCMOC       西藏   12.0  11.0  0.0  16.0
28  2022-07-20 08:00:00     72  SCMOC       青海    0.0   3.0  0.0  40.0
29  2022-07-20 08:00:00     72  SCMOC       宁夏    0.0   0.0  1.0  18.0
30  2022-07-20 08:00:00     72  SCMOC       新疆    4.0   0.0  3.0  81.0

[2790 rows x 8 columns]</code></pre>
<pre><code class="language-python">score_array,g_name_dict = mps.score_df(df_hfmc_sr,mem.pc_of_sun_rain,g = [&quot;member&quot;,&quot;dtime&quot;],plot = &quot;bar&quot;,vmin =0,tag= 2)</code></pre>
<p><img src="https://www.showdoc.com.cn/server/api/attachment/visitFile?sign=ffd8b256f56597fe3bd2924bbadeaf89&amp;file=file.png" alt="" /></p>
<pre><code class="language-python"># Compute the intermediate quantities needed for the correlation coefficient
df_tmmsss = mps.middle_df_sta(sta_all,meteva.method.tmmsss,gid=id_province)
print(df_tmmsss)  # T, MX, MY, SX, SY and SXY are the total sample count, observation mean, forecast mean, observation variance, forecast variance and observation-forecast covariance</code></pre>
<pre><code>                   time  dtime member province      T         MX         MY  \
0   2022-07-11 08:00:00     24  ECMWF       北京   15.0   8.580000   4.760667
1   2022-07-11 08:00:00     24  ECMWF       天津   11.0  13.054545  34.519455
2   2022-07-11 08:00:00     24  ECMWF       河北  142.0  27.397183  31.042556
3   2022-07-11 08:00:00     24  ECMWF       山西  107.0  31.512150  37.519112
4   2022-07-11 08:00:00     24  ECMWF       辽宁   55.0   0.069091   0.117982
..                  ...    ...    ...      ...    ...        ...        ...
26  2022-07-20 08:00:00     72  SCMOC      内蒙古  107.0   3.779439   1.543271
27  2022-07-20 08:00:00     72  SCMOC       西藏   39.0   1.756410   3.521026
28  2022-07-20 08:00:00     72  SCMOC       青海   43.0   0.000000   0.021628
29  2022-07-20 08:00:00     72  SCMOC       宁夏   19.0   0.089474   0.000000
30  2022-07-20 08:00:00     72  SCMOC       新疆   88.0   0.170455   0.075114

             SX          SY         SXY
0      9.590933    3.599770    3.013720
1     27.453388  512.127872   74.026484
2    627.142668  677.666985  497.424380
3   1105.536955  681.324876  652.097045
4      0.083226    0.087628    0.026399
..          ...         ...         ...
26    52.432474    3.447757    1.792235
27    22.583997   24.258666   17.792045
28     0.000000    0.007651    0.000000
29     0.144100    0.000000    0.000000
30     0.497309    0.127011    0.172435

[2790 rows x 10 columns]</code></pre>
<pre><code class="language-python">score_array,g_name_dict = mps.score_df(df_tmmsss,mem.corr,g = [&quot;member&quot;,&quot;dtime&quot;],plot = &quot;bar&quot;,vmin =0,tag= 2)</code></pre>
<p><img src="https://www.showdoc.com.cn/server/api/attachment/visitFile?sign=129e99aacddac88de2bca103e3e732d6&amp;file=file.png" alt="" /></p>
<pre><code class="language-python">wind_all = pd.read_hdf(r&quot;H:\test_data\input\mps\wind.h5&quot;)
print(wind_all)</code></pre>
<pre><code>        level                 time  dtime     id     lon    lat              u  \
115728      0  2022-08-14 08:00:00     12  50136  122.52  52.97       0.554585
115729      0  2022-08-14 08:00:00     12  50137  122.37  53.47       0.583025
115730      0  2022-08-14 08:00:00     12  50246  124.72  52.35       0.311893
115731      0  2022-08-14 08:00:00     12  50247  123.57  52.03       0.461026
115732      0  2022-08-14 08:00:00     12  50349  124.40  51.67      -0.000000
...       ...                  ...    ...    ...     ...    ...            ...
120545      0  2022-08-17 20:00:00     48  59945  109.70  18.65  999999.000000
120546      0  2022-08-17 20:00:00     48  59948  109.58  18.22  999999.000000
120547      0  2022-08-17 20:00:00     48  59951  110.33  18.80  999999.000000
120548      0  2022-08-17 20:00:00     48  59954  110.03  18.55  999999.000000
120549      0  2022-08-17 20:00:00     48  59981  112.33  16.83  999999.000000

                    v  u_ECMWF  v_ECMWF  u_CMA_GFS  v_CMA_GFS
115728       0.576572 -0.32674  1.29144   0.535734   1.963796
115729       1.702963 -2.30537 -0.37547   1.381558  -1.434272
115730      -0.950117 -0.13760 -0.65160   0.626104   1.642932
115731       1.427395 -1.05139  0.90468   0.640829   1.452986
115732      -0.200000 -0.16590  0.86740   1.134700   1.234552
...               ...      ...      ...        ...        ...
120545  999999.000000 -0.36900  0.47650   0.138500   0.953140
120546  999999.000000 -2.11516 -1.77504   0.743881   1.147462
120547  999999.000000 -0.54240  0.10290   1.073980   2.832952
120548  999999.000000 -0.76325  0.34185   0.999144   1.847300
120549  999999.000000 -0.99463  4.39954  -0.359742   4.394487

[69919 rows x 12 columns]</code></pre>
<pre><code class="language-python">df_nasws = mps.middle_df_sta(wind_all,mem.nasws_uv)  # intermediates for wind-speed verification
df_nas = mps.middle_df_sta(wind_all,mem.nas_uv)      # intermediates for wind-direction verification
df_na = mps.middle_df_sta(wind_all,mem.na_uv)        # intermediates for the combined wind-direction and wind-speed score</code></pre>
<pre><code class="language-python">acs = mps.score_df(df_nasws,mem.acs_uv,g = [&quot;member&quot;])               # wind-speed accuracy
scs = mps.score_df(df_nasws,mem.scs_uv,g = [&quot;member&quot;])               # wind-speed score
sr = mps.score_df(df_nasws,mem.wind_severer_rate_uv,g = [&quot;member&quot;])  # rate of over-forecast wind speed
wr = mps.score_df(df_nasws,mem.wind_weaker_rate_uv,g = [&quot;member&quot;])   # rate of under-forecast wind speed
acd = mps.score_df(df_nas,mem.acd_uv,g = [&quot;member&quot;])                 # wind-direction accuracy
scd = mps.score_df(df_nas,mem.scd_uv,g = [&quot;member&quot;])                 # wind-direction score
acz = mps.score_df(df_na,mem.acz_uv,g = [&quot;member&quot;,&quot;dtime&quot;], plot =&quot;bar&quot;,vmin = 0,tag = 2)  # combined wind-direction and wind-speed score</code></pre>
<p><img src="https://www.showdoc.com.cn/server/api/attachment/visitFile?sign=cd9561d63fcc441c98cd1eb36edce05e&amp;file=file.png" alt="" /></p>
<pre><code class="language-python"># Verify the wind-speed metrics from wind-speed data
speed,_ = meb.wind_to_speed_angle(wind_all)
df_nasws1 = mps.middle_df_sta(speed,mem.nasws_s)  # intermediates for wind-speed verification
acs = mps.score_df(df_nasws1,mem.acs,g = [&quot;member&quot;])               # wind-speed accuracy
scs = mps.score_df(df_nasws1,mem.scs,g = [&quot;member&quot;])               # wind-speed score
sr = mps.score_df(df_nasws1,mem.wind_severer_rate,g = [&quot;member&quot;])  # rate of over-forecast wind speed
wr = mps.score_df(df_nasws1,mem.wind_weaker_rate,g = [&quot;member&quot;,&quot;dtime&quot;], plot =&quot;bar&quot;,vmin = 0,tag = 3)  # rate of under-forecast wind speed</code></pre>
<p><img 
src="https://www.showdoc.com.cn/server/api/attachment/visitFile?sign=a18b33b590ec8caf9ddb9fad7bf29eba&amp;file=file.png" alt="" /></p>
<h1>Horizontal distribution of verification metrics (_df)</h1>
<p><strong>score_xy_df(df_mid,method,s = None,g = None,gll_dict = None,save_path = None,** kwargs)</strong><br /> Computes verification metrics by group from intermediate verification quantities stored in a pandas.DataFrame and plots them as maps.</p>
<table> <thead> <tr> <th style="text-align: left;">Parameter</th> <th style="text-align: left;">Description</th> </tr> </thead> <tbody>
<tr> <td style="text-align: left;"><strong>df_mid</strong></td> <td style="text-align: left;">Intermediate verification quantities computed from the forecast and observation data</td> </tr>
<tr> <td style="text-align: left;"><strong>method</strong></td> <td style="text-align: left;">A numeric verification metric function from meteva.method. For example, since this page has already run import meteva.method as mem, the TS score function can be written as mem.ts and the root-mean-square-error function as mem.rmse. The available functions include those for <a href="https://www.showdoc.cc/nmc?page_id=2858658548509727">continuous forecasts</a> and <a href="https://www.showdoc.cc/nmc?page_id=2860336958932349">binary forecasts</a></td> </tr>
<tr> <td style="text-align: left;"><strong>s</strong></td> <td style="text-align: left;">A dictionary used to select data samples; for details see the <a href="https://www.showdoc.cc/meteva?page_id=3975604785954540">s</a> parameter of meb.sele_by_dict</td> </tr>
<tr> <td style="text-align: left;"><strong>g</strong></td> <td style="text-align: left;">Grouping parameter. It is a string, usually the name of a column of df_mid other than id</td> </tr>
<tr> <td style="text-align: left;"><strong>save_path</strong></td> <td style="text-align: left;">Path where the generated figure is saved</td> </tr>
<tr> <td style="text-align: left;">**kwargs</td> <td style="text-align: left;">Optional parameters of meteva.base.contourf_2d_grid; see the examples below</td> </tr>
<tr> <td style="text-align: left;">return</td> <td style="text-align: left;">A DataArray that holds the metric value for each region</td> </tr>
</tbody> </table> 
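<p>Conceptually, a per-region metric of this kind can be obtained by first summing the intermediate quantities for each region id and only then forming the metric. The block below is a minimal pure-Python sketch of that idea, not meteva's actual implementation: the sample rows and the helper name <code>scores_by_region</code> are hypothetical, and T, E, A, S carry the meaning given for the tase intermediates on this page (sample count, sum of errors, sum of absolute errors, sum of squared errors).</p>

```python
import math
from collections import defaultdict

# Hypothetical intermediate rows: (region_id, T, E, A, S).
# T = sample count, E = sum of errors,
# A = sum of absolute errors, S = sum of squared errors.
rows = [
    (5120, 4.0, 2.0, 6.0, 10.0),
    (5120, 6.0, -1.0, 4.0, 6.0),
    (5125, 5.0, 5.0, 5.0, 20.0),
]


def scores_by_region(rows):
    """Sum T/E/A/S per region, then derive ME, MAE and RMSE from the sums."""
    sums = defaultdict(lambda: [0.0, 0.0, 0.0, 0.0])
    for region_id, t, e, a, s in rows:
        acc = sums[region_id]
        acc[0] += t
        acc[1] += e
        acc[2] += a
        acc[3] += s

    result = {}
    for region_id, (t, e, a, s) in sums.items():
        result[region_id] = {
            "me": e / t,                # mean error
            "mae": a / t,               # mean absolute error
            "rmse": math.sqrt(s / t),   # root-mean-square error
        }
    return result


region_scores = scores_by_region(rows)
for region_id, score in sorted(region_scores.items()):
    print(region_id, score)
```

<p>Because the sums are additive, rows from different times, lead times or levels can be pooled in any order before the division, which is why the intermediates rather than the final scores are stored.</p>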
<p><strong>Example usage:</strong></p>
<pre><code class="language-python"># Read the gridded observations and gridded forecasts in a loop and build the intermediate results
grid0 = meb.grid([70, 140, 0.05], [0, 60, 0.05])
marker = mps.get_grid_marker(grid0,step = 5)  # partition the grid into equally spaced 5° boxes
dir_nmic = r&quot;K:\DATA\NAFP\NMIC\ART_ATM_GLB\6HOR\YYYY\YYYYMMDD\ART_ATM_GLB_0P10_6HOR_ANAL_YYYYMMDDHH.grib2&quot;
dtime_list = np.arange(6,25,6).tolist()
times = datetime.datetime(2022,6,15,0)
timee = datetime.datetime(2022,6,16,0)
time_ob = times
df_tase_list = []
df_hfmc_list = []
while time_ob &lt;= timee:
    level_list = [500,700,850,925]
    for level in level_list:
        try:
            dir_ob = r&quot;H:\test_data\input\mps\cldas\tmp\\&quot;+str(level)+r&quot;\YYMMDDHH.000.nc&quot;
            path_ob = meb.get_path(dir_ob,time_ob)
            grd_ob = meb.read_griddata_from_nc(path_ob,data_name = &quot;ob&quot;,grid = grid0)
            dir_fo_ec = r&quot;R:\SCMOC_DATABASE\MODEL\globalECMWF\nc\TMP\\&quot;+str(level)+r&quot;\YYYYMMDD\YYMMDDHH.TTT.nc&quot;
            dir_fo_ai = dir_fo_ec  # adjust this to the actual location of the second model
            for dtime in dtime_list:
                time_fo = time_ob - datetime.timedelta(hours=dtime)
                # The original referenced an undefined dir_fo_3d here; dir_fo_ai is assumed to be the intended name
                path_fo_ai = meb.get_path(dir_fo_ai,time_fo,dtime)
                if not os.path.exists(path_fo_ai): continue
                try:
                    path_fo = meb.get_path(dir_fo_ec, time_fo, dtime)
                    if not os.path.exists(path_fo): continue
                    grd_ECMWF = meb.read_griddata_from_nc(path_fo, data_name=&quot;ECMWF&quot;, grid=grid0)
                    #grd_ECMWF.values += meb.K
                    print(path_fo)
                    if grd_ECMWF is not None:
                        df = mps.middle_df_grd(grd_ob, grd_ECMWF, meteva.method.tase, marker=marker)
                        df_tase_list.append(df)
                        df = mps.middle_df_grd(grd_ob, grd_ECMWF, meteva.method.hfmc,grade_list = [270,271], marker=marker)
                        df_hfmc_list.append(df)
                except:
                    exstr = traceback.format_exc()
                    print(exstr)
                grd_ai = meb.read_griddata_from_nc(path_fo_ai,data_name=&quot;AI&quot;,grid=grid0)
                if grd_ai is not None:
                    if grd_ai.values[0,0,0,0,0,0] &lt;120: grd_ai.values += meb.K
                    grd_ai.values += 1  # added only so that the two models show different scores
                    df = mps.middle_df_grd(grd_ob, grd_ai, meteva.method.tase, marker=marker)
                    df_tase_list.append(df)
                    df = mps.middle_df_grd(grd_ob, grd_ai, meteva.method.hfmc,grade_list = [270,271], marker=marker)
                    df_hfmc_list.append(df)
        except:
            exstr = traceback.format_exc()
            print(exstr)
    time_ob = time_ob + datetime.timedelta(hours=6)
df_tase_all = meb.concat(df_tase_list)
df_hfmc_all = meb.concat(df_hfmc_list)
print(df_tase_all)</code></pre>
<pre><code>    level                 time  dtime member     id            T             E  \
0     500  2022-06-14 12:00:00   12.0  ECMWF   5120  9987.121541   3373.346488
0     500  2022-06-14 12:00:00   12.0  ECMWF  25090  9234.194216   3822.639197
0     500  2022-06-14 12:00:00   12.0  ECMWF   5125  9987.121541   8038.582762
0     500  2022-06-14 12:00:00   12.0  ECMWF  25095  9234.194216   1758.117448
0     500  2022-06-14 12:00:00   12.0  ECMWF   5130  9987.121541   6836.655794
..    ...                  ...    ...    ...    ...          ...           ...
0     925  2022-06-15 00:00:00   24.0     AI  25075  9234.194216   3214.616800
0     925  2022-06-15 00:00:00   24.0     AI   5110  9987.121541  -1705.558028
0     925  2022-06-15 00:00:00   24.0     AI  25080  9234.194216   5083.701532
0     925  2022-06-15 00:00:00   24.0     AI   5115  9987.121541   -938.013798
0     925  2022-06-15 00:00:00   24.0     AI  25085  9234.194216  18117.880440

               A             S
0    3698.008681   1885.681563
0    5737.406856   5402.063689
0    8038.582762   7510.503087
0    3416.197708   2301.212886
0    6890.874434   5505.708527
..           ...           ...
0    8682.849369  10858.347804
0    4752.523127   3365.128455
0   20175.935531  60452.670569
0    4915.136316   3572.277930
0   19171.324205  55285.910250

[15600 rows x 9 columns]</code></pre>
<pre><code class="language-python">result = mps.score_xy_df(df_hfmc_all,mem.ts,s = {&quot;member&quot;:[&quot;AI&quot;],&quot;level&quot;:500},g=&quot;grade&quot;,ncol = 2)</code></pre>
<p><img src="https://www.showdoc.com.cn/server/api/attachment/visitFile?sign=9f9f9472ac3f213fcc5cf0611f93fc8e&amp;file=file.png" alt="" /></p>
<pre><code class="language-python"># Plot the spatial distribution of the metric from the intermediates, grouped by model name, one subplot per model
result = mps.score_xy_df(df_tase_all,mem.mae,g=&quot;member&quot;,ncol = 2)</code></pre>
<p><img src="https://www.showdoc.com.cn/server/api/attachment/visitFile?sign=6f024c5822c5a692cf2841c6500a36b9&amp;file=file.png" alt="" /></p>
<pre><code class="language-python"># Select part of the data, group it by level, and draw each level in its own subplot
result = mps.score_xy_df(df_tase_all,mem.mae,s = {&quot;member&quot;:[&quot;AI&quot;]},g=&quot;level&quot;,ncol = 2)</code></pre>
<p><img src="https://www.showdoc.com.cn/server/api/attachment/visitFile?sign=8b2209f351388f7173e30a846210028a&amp;file=file.png" alt="" /></p>
<h1>Horizontal distribution of verification metric skill (_df)</h1>
<p><strong>score_xy_df_delta(df_mid,method,reference_member,s = None,save_path = None,** kwargs)</strong><br /> Computes the horizontal field of a verification metric from intermediate verification quantities stored in a pandas.DataFrame, then takes the score field of one model as the baseline and plots the positive or negative deviation of every other model's score from that baseline.</p>
<table> <thead> <tr> <th style="text-align: left;">Parameter</th> <th style="text-align: left;">Description</th> </tr> </thead> <tbody>
<tr> <td style="text-align: left;"><strong>df_mid</strong></td> <td style="text-align: left;">Intermediate verification quantities computed from the forecast and observation data</td> </tr>
<tr> <td style="text-align: left;"><strong>method</strong></td> <td style="text-align: 
left;">A numeric verification metric function from meteva.method. For example, since this page has already run import meteva.method as mem, the TS score function can be written as mem.ts and the root-mean-square-error function as mem.rmse. The available functions include those for <a href="https://www.showdoc.cc/nmc?page_id=2858658548509727">continuous forecasts</a> and <a href="https://www.showdoc.cc/nmc?page_id=2860336958932349">binary forecasts</a></td> </tr>
<tr> <td style="text-align: left;"><strong>reference_member</strong></td> <td style="text-align: left;">Name of the model chosen as the baseline</td> </tr>
<tr> <td style="text-align: left;"><strong>s</strong></td> <td style="text-align: left;">A dictionary used to select data samples; for details see the <a href="https://www.showdoc.cc/meteva?page_id=3975604785954540">s</a> parameter of meb.sele_by_dict</td> </tr>
<tr> <td style="text-align: left;"><strong>save_path</strong></td> <td style="text-align: left;">Path where the generated figure is saved</td> </tr>
<tr> <td style="text-align: left;">**kwargs</td> <td style="text-align: left;">Optional parameters of meteva.base.contourf_2d_grid; see the examples below</td> </tr>
<tr> <td style="text-align: left;">return</td> <td style="text-align: left;">A DataArray that holds the metric value for each region</td> </tr>
</tbody> </table>
<p><strong>Example usage:</strong></p>
<pre><code class="language-python"># Plot, from the intermediates, the spatial distribution of each model's MAE relative to the reference model ECMWF
result = mps.score_xy_df_delta(df_tase_all,mem.mae,reference_member = &quot;ECMWF&quot;,ncol = 2)</code></pre>
<p><img src="https://www.showdoc.com.cn/server/api/attachment/visitFile?sign=1b88fb5dd3f082c53b0f877f7cbf7d5e&amp;file=file.png" alt="" /></p>
<h1>Grouped statistics of verification metrics (_ds)</h1>
<p><strong>score_ds(ds_mid,method,s = None,g = None,gll_dict = None,plot = None,** kwargs)</strong><br /> Computes verification metrics by group from intermediate verification quantities stored in an xarray.Dataset.</p>
<table> <thead> <tr> <th style="text-align: left;">Parameter</th> <th style="text-align: left;">Description</th> </tr> </thead> <tbody>
<tr> <td style="text-align: left;"><strong>&lt;font face=&quot;黑体&quot; 
color=blue size = 5&gt;ds_mid &lt;/font&gt;</strong></td> <td style="text-align: left;">Intermediate verification quantities computed from the forecast and observation data</td> </tr>
<tr> <td style="text-align: left;"><strong>method</strong></td> <td style="text-align: left;">A numeric verification metric function from meteva.method. For example, since this page has already run import meteva.method as mem, the TS score function can be written as mem.ts and the root-mean-square-error function as mem.rmse. The available functions include the scoring functions for <a href="https://www.showdoc.cc/nmc?page_id=2858658548509727">continuous forecasts</a>, <a href="https://www.showdoc.cc/nmc?page_id=2860336958932349">binary forecasts</a>, <a href="https://www.showdoc.cc/nmc?page_id=2859693269266585">multi-category forecasts</a>, <a href="https://www.showdoc.cc/nmc?page_id=3651805946039771">probabilistic forecasts</a> and <a href="https://www.showdoc.cc/nmc?page_id=3629735872716279">ensemble forecasts</a></td> </tr>
<tr> <td style="text-align: left;"><strong>s</strong></td> <td style="text-align: left;">A dictionary used to select data samples; for details see the <a href="https://www.showdoc.cc/meteva?page_id=3975604785954540">s</a> parameter of meb.sele_by_dict</td> </tr>
<tr> <td style="text-align: left;"><strong>g</strong></td> <td style="text-align: left;">Grouping parameter. It can be a string, meaning grouping by a single dimension, or a list of strings, meaning grouping by several dimensions at once. For a single grouping dimension, see the <a href="https://www.showdoc.cc/meteva?page_id=4071849185300418">g</a> parameter of meb.group</td> </tr>
<tr> <td style="text-align: left;"><strong>gll_dict</strong></td> <td style="text-align: left;">Specifies how each grouping dimension is partitioned; for a single dimension, see the examples below and the <a href="https://www.showdoc.cc/meteva?page_id=4071849185300418">gll</a> parameter of meb.group</td> </tr>
<tr> <td style="text-align: left;"><strong>plot</strong></td> <td style="text-align: left;">Whether to plot the verification result directly. None draws nothing, &quot;bar&quot; draws a bar chart, and &quot;plot&quot; or &quot;line&quot; draws a line chart</td> </tr>
<tr> <td style="text-align: left;">**kwargs</td> <td style="text-align: left;">Optional parameters of meteva.base.bar or meteva.base.plot; see the examples below</td> </tr> 
<tr> <td style="text-align: left;">return</td> <td style="text-align: left;">A tuple of two elements: element 0 is a numpy array of the metric values, and element 1 is a dictionary describing the dimensions of that array</td> </tr>
</tbody> </table>
<p><strong>Example usage:</strong></p>
<pre><code class="language-python"># Compute the intermediate verification quantities
ds_hfmc = mps.middle_ds_sta(sta_all,meteva.method.hfmc,grade_list=[0.1,10,25,50,100,250],gid=id_province)
print(ds_hfmc)  # Under Data variables, H, F, M and C are the numbers of hits, false alarms, misses and correct negatives.</code></pre>
<pre><code>&lt;xarray.Dataset&gt;
Dimensions:   (dtime: 3, grade: 6, member: 2, province: 31, time: 19)
Coordinates:
  * time      (time) datetime64[ns] 2022-07-11T08:00:00 ... 2022-07-20T08:00:00
  * dtime     (dtime) int64 24 48 72
  * member    (member) object 'ECMWF' 'SCMOC'
  * province  (province) object '上海' '云南' '内蒙古' '北京' ... '重庆' '陕西' '青海' '黑龙江'
  * grade     (grade) float64 0.1 10.0 25.0 50.0 100.0 250.0
Data variables:
    H         (time, dtime, member, province, grade) float64 1.0 0.0 ... nan nan
    F         (time, dtime, member, province, grade) float64 7.0 1.0 ... nan nan
    M         (time, dtime, member, province, grade) float64 0.0 0.0 ... nan nan
    C         (time, dtime, member, province, grade) float64 2.0 9.0 ... nan nan</code></pre>
<pre><code class="language-python"># Compute the intermediate quantities and return them as an xarray Dataset
ds_tase = mps.middle_ds_sta(sta_all,meteva.method.tase,gid=id_province)
print(ds_tase)  # Under Data variables, T, E, A and S are the total sample count, the sum of errors, the sum of absolute errors and the sum of squared errors.</code></pre>
<pre><code>&lt;xarray.Dataset&gt;
Dimensions:   (dtime: 3, member: 2, province: 31, time: 19)
Coordinates:
  * time      (time) datetime64[ns] 2022-07-11T08:00:00 ... 2022-07-20T08:00:00
  * dtime     (dtime) int64 24 48 72
  * member    (member) object 'ECMWF' 'SCMOC'
  * province  (province) object '上海' '云南' '内蒙古' '北京' ... '重庆' '陕西' '青海' '黑龙江'
Data variables:
    T         (time, dtime, member, province) float64 10.0 125.0 ... 43.0 77.0
    E         (time, dtime, member, province) float64 23.39 106.8 ... -149.8
    A         (time, dtime, member, province) float64 23.39 343.5 ... 
0.93 181.3
    S         (time, dtime, member, province) float64 152.5 ... 4.071e+03</code></pre>
<pre><code class="language-python"># Merge several sets of intermediates into a single variable
ds_all = ds_hfmc.merge(ds_tase)
print(ds_all)</code></pre>
<pre><code>&lt;xarray.Dataset&gt;
Dimensions:   (dtime: 3, grade: 6, member: 2, province: 31, time: 19)
Coordinates:
  * time      (time) datetime64[ns] 2022-07-11T08:00:00 ... 2022-07-20T08:00:00
  * dtime     (dtime) int64 24 48 72
  * member    (member) object 'ECMWF' 'SCMOC'
  * province  (province) object '上海' '云南' '内蒙古' '北京' ... '重庆' '陕西' '青海' '黑龙江'
  * grade     (grade) float64 0.1 10.0 25.0 50.0 100.0 250.0
Data variables:
    H         (time, dtime, member, province, grade) float64 1.0 0.0 ... nan nan
    F         (time, dtime, member, province, grade) float64 7.0 1.0 ... nan nan
    M         (time, dtime, member, province, grade) float64 0.0 0.0 ... nan nan
    C         (time, dtime, member, province, grade) float64 2.0 9.0 ... nan nan
    T         (time, dtime, member, province) float64 10.0 125.0 ... 43.0 77.0
    E         (time, dtime, member, province) float64 23.39 106.8 ... -149.8
    A         (time, dtime, member, province) float64 23.39 343.5 ... 0.93 181.3
    S         (time, dtime, member, province) float64 152.5 ... 
4.071e+03</code></pre>
<pre><code class="language-python"># Every metric derived from hfmc can be computed from ds_hfmc; for example, the TS score:
score_array,g_name_dict = mps.score_ds(ds_hfmc,mem.ts,g = [&quot;dtime&quot;,&quot;grade&quot;,&quot;member&quot;],plot = &quot;bar&quot;)</code></pre>
<p><img src="https://www.showdoc.com.cn/server/api/attachment/visitFile?sign=86ae78513e0db9402199d5c81676ec64&amp;file=file.png" alt="" /></p>
<pre><code class="language-python"># Scores can likewise be computed from a Dataset that holds several kinds of intermediates
score_array,g_name_dict = mps.score_ds(ds_all,mem.ts,g = [&quot;dtime&quot;,&quot;grade&quot;,&quot;member&quot;],plot = &quot;bar&quot;)</code></pre>
<p><img src="https://www.showdoc.com.cn/server/api/attachment/visitFile?sign=86ae78513e0db9402199d5c81676ec64&amp;file=file.png" alt="" /></p>
<pre><code class="language-python"># Metrics derived from tase can be computed from the same merged Dataset; for example, the mean error:
score_array,g_name_dict = mps.score_ds(ds_all,mem.me,g = [&quot;member&quot;,&quot;dtime&quot;],plot = &quot;bar&quot;,vmin = 0)</code></pre>
<p><img src="https://www.showdoc.com.cn/server/api/attachment/visitFile?sign=5a9513c47d5b47d938da17f4c05df650&amp;file=file.png" alt="" /></p>
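<p>The grouped scores in the examples above rest on one idea: the contingency counts H, F, M, C are summed within each group first, and the ratio is formed afterwards. The block below is a minimal pure-Python sketch of that idea for TS and bias, using their standard definitions TS = H / (H + F + M) and bias = (H + F) / (H + M). It is an illustration, not meteva's implementation; the sample rows and the helper name <code>ts_and_bias</code> are hypothetical, and 999999 is used as the missing marker because the printed score array on this page uses it for missing data and 0/0 cases.</p>

```python
from collections import defaultdict

MISSING = 999999  # marker for missing data or a 0/0 case, as in the printed score array

# Hypothetical intermediate rows: (member, dtime, H, F, M, C).
rows = [
    ("ECMWF", 24, 3.0, 1.0, 2.0, 10.0),
    ("ECMWF", 24, 1.0, 1.0, 0.0, 14.0),
    ("SCMOC", 24, 0.0, 0.0, 0.0, 16.0),
]


def ts_and_bias(rows):
    """Sum H/F/M/C per (member, dtime) group, then compute TS and bias."""
    totals = defaultdict(lambda: [0.0, 0.0, 0.0, 0.0])
    for member, dtime, h, f, m, c in rows:
        acc = totals[(member, dtime)]
        for i, value in enumerate((h, f, m, c)):
            acc[i] += value

    scores = {}
    for key, (h, f, m, c) in totals.items():
        # TS ignores correct negatives; bias compares forecast to observed event counts.
        ts = h / (h + f + m) if (h + f + m) > 0 else MISSING
        bias = (h + f) / (h + m) if (h + m) > 0 else MISSING
        scores[key] = (ts, bias)
    return scores


scores = ts_and_bias(rows)
for key, (ts, bias) in sorted(scores.items()):
    print(key, ts, bias)
```

<p>Summing the counts before dividing weights every sample equally, whereas averaging per-row TS values would give a small province on a dry day the same weight as a large one on a wet day.</p>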

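<p>The tmmsss intermediates used with mem.corr earlier on this page (T, MX, MY, SX, SY, SXY) are not plain sums, so pooling two groups needs a correction term for the difference between the group means. The block below sketches the standard pooled mean, variance and covariance formulas in pure Python and checks them against a direct computation. It assumes SX, SY and SXY are population statistics (divided by T), which is an assumption about the data rather than something stated here, and the helper names are hypothetical; meteva's own merging code may differ.</p>

```python
import math


def tmmsss(obs, fcst):
    """Return (T, MX, MY, SX, SY, SXY) with population variance and covariance."""
    t = float(len(obs))
    mx = sum(obs) / t
    my = sum(fcst) / t
    sx = sum((x - mx) ** 2 for x in obs) / t
    sy = sum((y - my) ** 2 for y in fcst) / t
    sxy = sum((x - mx) * (y - my) for x, y in zip(obs, fcst)) / t
    return (t, mx, my, sx, sy, sxy)


def merge_tmmsss(a, b):
    """Pool two tmmsss tuples without access to the raw samples."""
    t1, mx1, my1, sx1, sy1, sxy1 = a
    t2, mx2, my2, sx2, sy2, sxy2 = b
    t = t1 + t2
    dx = mx2 - mx1
    dy = my2 - my1
    w = t1 * t2 / (t * t)  # weight of the between-group term
    mx = (t1 * mx1 + t2 * mx2) / t
    my = (t1 * my1 + t2 * my2) / t
    sx = (t1 * sx1 + t2 * sx2) / t + w * dx * dx
    sy = (t1 * sy1 + t2 * sy2) / t + w * dy * dy
    sxy = (t1 * sxy1 + t2 * sxy2) / t + w * dx * dy
    return (t, mx, my, sx, sy, sxy)


def corr(stats):
    """Correlation coefficient from a tmmsss tuple."""
    _, _, _, sx, sy, sxy = stats
    return sxy / math.sqrt(sx * sy)


# Two hypothetical groups of (observation, forecast) samples.
obs1, fcst1 = [1.0, 2.0, 3.0], [1.0, 3.0, 2.0]
obs2, fcst2 = [4.0, 6.0], [5.0, 5.0]

merged = merge_tmmsss(tmmsss(obs1, fcst1), tmmsss(obs2, fcst2))
direct = tmmsss(obs1 + obs2, fcst1 + fcst2)
print(corr(merged), corr(direct))
```

<p>The two printed values should agree to rounding error, which is what allows a correlation to be regrouped over any combination of time, lead time or province after the intermediates have been stored.</p>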