FSS
<p>[TOC]</p>
<pre><code class="language-python">%matplotlib inline
%load_ext autoreload
%autoreload 2
import meteva.base as meb
import meteva.product as mpd
import meteva.method as mem
import datetime
import meteva.perspact as mps # 透视分析模块</code></pre>
<pre><code>The autoreload extension is already loaded. To reload it, use:
%reload_ext autoreload</code></pre>
<pre><code class="language-python">grade_list = [5,10]
half_window_size_list = [5,10,15,20,25]
path_ob = r'H:\test_data\input\mem\mode\ob\rain03\20070111.000.nc'
path_fo = r'H:\test_data\input\mem\mode\ec\rain03\20070108.003.nc'
grd_ob1 = meb.read_griddata_from_nc(path_ob,time = &quot;2020070111&quot;,dtime = 0,level =0,data_name = &quot;ob&quot;)
grd_fo1 = meb.read_griddata_from_nc(path_fo,time = &quot;2020070108&quot;,dtime = 3,level =0,data_name = &quot;ECMWF&quot;)
meb.contourf_2d_grid(grd_ob1,cmap = &quot;rain_3h&quot;)
meb.contourf_2d_grid(grd_fo1,cmap = &quot;rain_3h&quot;)</code></pre>
<p><img src="https://www.showdoc.com.cn/server/api/attachment/visitfile/sign/0a1cf443fb8d1ef631ff4e94071dbee8" alt="" /></p>
<p><img src="https://www.showdoc.com.cn/server/api/attachment/visitfile/sign/b200d7d18f8abc898d063d078283109e" alt="" /></p>
<h1>统计fss中间结果</h1>
<p><font face="黑体" color=Blue size=3><strong>fbs_pobfo(ob_xy, fo_xy,grade_list=[1e-30],half_window_size_list=[1],compare = ">=", masker=None)</strong></font><br />
输入numpy形式的2维网格数据,计算fss评分相关的中间结果,计算步骤包括:</p>
<ol>
<li>设置1个阈值,将观测(预报)大于等于阈值的部分设置为1,其它部分设为0 </li>
<li>对网格数据进行滑动平均,滑动窗口大小为half_window_size * 2 + 1,所得结果记为窗口概率</li>
<li>将平滑后结果,乘以masker_xy </li>
<li>计算上一步之后的窗口概率观测值的平方的均值,窗口概率预报值的平方的均值,窗口概率误差(预报-观测)的平方的均值 </li>
<li>循环不同的阈值和窗口大小,重复上述步骤</li>
</ol>
<table>
<thead>
<tr>
<th style="text-align: left;">参数</th>
<th style="text-align: center;">说明</th>
<th style="text-align: left;">备注</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left;"><strong><font face="黑体" color=Blue size=5>ob_xy</font></strong></td>
<td style="text-align: center;">一个观测平面场数据</td>
<td style="text-align: left;">numpy数组</td>
</tr>
<tr>
<td style="text-align: left;"><strong><font face="黑体" color=Blue size=5>grd_fo</font></strong></td>
<td style="text-align: center;">一个预报平面场数据</td>
<td style="text-align: left;">numpy数组,shape和观测一致</td>
</tr>
<tr>
<td style="text-align: left;"><strong>grade_list</strong></td>
<td style="text-align: center;">该参数用于对连续变量做多种等级阈值的二分类检验,其中包含多个事件是否记录为发生的判断阈值,记其中一个阈值为g,则判断为事件发生的条件是要素值 >= g。该参数缺省时列表中只包含一个取值为1e-30的阈值,由于气象要素精度通常比该缺省值大,因此它相当于将 >0 作为事件发生的判据</td>
</tr>
<tr>
<td style="text-align: left;"><strong>half_window_size</strong></td>
<td style="text-align: center;">半边窗口的网格数据,例如当half_window_size = 1,时,计算区域中部的一个网格点的平均用到 3×3个格点,当half_window_size = 2时 用到了5×5个格点求平均, 在区域的边缘,计算平均的格点数为窗口内实际格点数,每个位置会略有差异,但会正确保持平均值的含义</td>
</tr>
<tr>
<td style="text-align: left;"><strong>compare</strong></td>
<td style="text-align: center;">比较方法,可选项包括">=",">","<=","<", 是要素原始值和阈值对比的方法,默认方法为">=",即原始值大于等于阈值记为1,否则记为0,当compare为"<="时,原始值小于等于阈值的站点记为1, 这在基于能见度标记大雾事件、低温事件或降温事件等场景中可以用到</td>
</tr>
<tr>
<td style="text-align: left;"><strong>masker</strong></td>
<td style="text-align: center;">用于裁剪部分区域的mask数据</td>
<td style="text-align: left;">numpy数据,shape和观测一致,其中只有0和1两种值,取值为1的部分是需要在计算中保留的部分</td>
</tr>
<tr>
<td style="text-align: left;"><font face="黑体" color=blue size=5>return</font></td>
<td style="text-align: center;">返回3维数组,shape = (窗口种类数,等级数,3),其中result[...,0],result[...,1],result[...,2]分别是概率观测值的平方的均值,窗口概率预报值的平方的均值,窗口概率误差(预报-观测)的平方的均值</td>
</tr>
</tbody>
</table>
<p>参数示例</p>
<pre><code class="language-python">ob_xy = grd_ob1.values.squeeze()
fo_xy = grd_fo1.values.squeeze()
fss_middle = mem.fbs_pobfo(ob_xy,fo_xy,grade_list=grade_list,half_window_size_list=half_window_size_list)
print(fss_middle)</code></pre>
<pre><code>[[[0.01026934 0.00910105 0.00344057]
[0.00686192 0.00180271 0.00339929]]
[[0.0064876 0.00541789 0.00129035]
[0.00412846 0.00095178 0.00157669]]
[[0.00455968 0.00382494 0.00072619]
[0.00280346 0.00061937 0.00098296]]
[[0.00345356 0.00297099 0.00047066]
[0.00206549 0.00044611 0.00070138]]
[[0.002774 0.00246103 0.00032273]
[0.00161626 0.0003422 0.00053346]]]</code></pre>
<h1>FSS</h1>
<p><font face="黑体" color=Blue size=3><strong>fss(grd_ob,grd_fo,grade_list=[1e-30],half_window_size_list=[1],compare = ">=", masker=None)</strong></font><br />
输入grid_data数据集,计算fss评分以及相关中间结果,计算步骤包括:</p>
<ol>
<li>调用fbs_pobfo函数计算中间结果</li>
<li>计算fss, fss = 1 - 窗口概率误差(预报-观测)的平方的均值/(窗口概率观测值的平方的均值+窗口概率预报值的平方的均值)</li>
<li>为了便于后续批量分析将结果重新组合成stadata形式</li>
</ol>
<table>
<thead>
<tr>
<th style="text-align: left;">参数</th>
<th style="text-align: center;">说明</th>
<th style="text-align: left;">备注</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align: left;"><strong><font face="黑体" color=Blue size=5>grd_ob</font></strong></td>
<td style="text-align: center;">一个观测平面场数据</td>
<td style="text-align: left;">grid_data格式的网格数据,时效维度取值必须是0</td>
</tr>
<tr>
<td style="text-align: left;"><strong><font face="黑体" color=Blue size=5>grd_fo</font></strong></td>
<td style="text-align: center;">一个预报平面场数据</td>
<td style="text-align: left;">ngrid_data格式的网格数据</td>
</tr>
<tr>
<td style="text-align: left;"><strong>grade_list</strong></td>
<td style="text-align: center;">该参数用于对连续变量做多种等级阈值的二分类检验,其中包含多个事件是否记录为发生的判断阈值,记其中一个阈值为g,则判断为事件发生的条件是要素值 >= g。该参数缺省时列表中只包含一个取值为1e-30的阈值,由于气象要素精度通常比该缺省值大,因此它相当于将 >0 作为事件发生的判据</td>
</tr>
<tr>
<td style="text-align: left;"><strong>half_window_size</strong></td>
<td style="text-align: center;">半边窗口的网格数据,例如当half_window_size = 1,时,计算区域中部的一个网格点的平均用到 3×3个格点,当half_window_size = 2时 用到了5×5个格点求平均, 在区域的边缘,计算平均的格点数为窗口内实际格点数,每个位置会略有差异,但会正确保持平均值的含义</td>
</tr>
<tr>
<td style="text-align: left;"><strong>compare</strong></td>
<td style="text-align: center;">比较方法,可选项包括">=",">","<=","<", 是要素原始值和阈值对比的方法,默认方法为">=",即原始值大于等于阈值记为1,否则记为0,当compare为"<="时,原始值小于等于阈值的站点记为1, 这在基于能见度标记大雾事件、低温事件或降温事件等场景中可以用到</td>
</tr>
<tr>
<td style="text-align: left;"><strong>masker</strong></td>
<td style="text-align: center;">用于裁剪部分区域的mask数据</td>
<td style="text-align: left;">numpy数据,shape和观测一致,其中只有0和1两种值,取值为1的部分是需要在计算中保留的部分</td>
</tr>
<tr>
<td style="text-align: left;"><font face="黑体" color=blue size=5>return</font></td>
<td style="text-align: center;">返回3维数组,shape = (窗口种类数,等级数,3),其中result[...,0],result[...,1],result[...,2]分别是概率观测值的平方的均值,窗口概率预报值的平方的均值,窗口概率误差(预报-观测)的平方的均值</td>
</tr>
</tbody>
</table>
<p>参数示例</p>
<pre><code class="language-python">fss_one = mem.fss(grd_ob1,grd_fo1,grade_list=grade_list,half_window_size_list = half_window_size_list)
print(fss_one)</code></pre>
<pre><code> level time dtime id lon lat fname \
0 NaN 2020-07-01 08:00:00 3 999999 NaN NaN ECMWF
1 NaN 2020-07-01 08:00:00 3 999999 NaN NaN ECMWF
2 NaN 2020-07-01 08:00:00 3 999999 NaN NaN ECMWF
3 NaN 2020-07-01 08:00:00 3 999999 NaN NaN ECMWF
4 NaN 2020-07-01 08:00:00 3 999999 NaN NaN ECMWF
5 NaN 2020-07-01 08:00:00 3 999999 NaN NaN ECMWF
6 NaN 2020-07-01 08:00:00 3 999999 NaN NaN ECMWF
7 NaN 2020-07-01 08:00:00 3 999999 NaN NaN ECMWF
8 NaN 2020-07-01 08:00:00 3 999999 NaN NaN ECMWF
9 NaN 2020-07-01 08:00:00 3 999999 NaN NaN ECMWF
half_window_size grade pob pfo fbs fss
0 5 5 0.010269 0.009101 0.003441 0.822380
1 5 10 0.006862 0.001803 0.003399 0.607682
2 10 5 0.006488 0.005418 0.001290 0.891618
3 10 10 0.004128 0.000952 0.001577 0.689642
4 15 5 0.004560 0.003825 0.000726 0.913391
5 15 10 0.002803 0.000619 0.000983 0.712824
6 20 5 0.003454 0.002971 0.000471 0.926740
7 20 10 0.002065 0.000446 0.000701 0.720742
8 25 5 0.002774 0.002461 0.000323 0.938351
9 25 10 0.001616 0.000342 0.000533 0.727612 </code></pre>
<p>在上述结果中,fname,half_window_size,grade,pob,pfo,fbs,fss 分布记录了预报名称,采用的半窗口宽度,阈值,窗口概率观测值的平方的均值,窗口概率预报值的平方的均值,窗口概率误差(预报-观测)的平方的均值,以及对一个的fss取值。 其中每一个fss值对应一个时刻、一个时效、一种参数下评分。 如果要进行批量数据的统计分析,则可以采用如下方式,循环的调用fss函数,生成一组结果,由于fss结果是以sta_data形式存储,因此批量结果可以拼接再一起,进一步可以基于拼接的结果开展分析。</p>
<pre><code class="language-python">#以下是批量读取观测和预报场数据并完成目标识别匹配和结果输出
dir_ob = r'H:\test_data\input\mem\mode\ob\rain03\YYMMDDHH.000.nc'
dir_fo1 = r'H:\test_data\input\mem\mode\ec\rain03\YYMMDDHH.TTT.nc'
dir_fo2 = r'H:\test_data\input\mem\mode\ncep\rain03\YYMMDDHH.TTT.nc'
save_dir = r&quot;H:\test_data\output\method\space\mode&quot;
time_s = datetime.datetime(2020,7,1,8,0)
time_e = datetime.datetime(2020,7,10,8,0)
grid0 = meb.grid([100,125,0.05],[20,35,0.05])
time_fo = time_s
fss_list = []
while time_fo &lt; time_e:
print(time_fo)
for dh in range(3,73,3):
time_ob = time_fo + datetime.timedelta(hours = dh)
path_ob = meb.get_path(dir_ob,time_ob)
grd_ob1 = meb.read_griddata_from_nc(path_ob,time = time_ob,dtime = 0,level =0,data_name = &quot;ob&quot;,grid = grid0)
path_fo = meb.get_path(dir_fo1,time_fo,dh)
grd_fo1 = meb.read_griddata_from_nc(path_fo,time = time_fo,dtime = dh,level =0,data_name = &quot;ECMWF&quot;,grid = grid0)
if grd_ob1 is not None and grd_fo1 is not None:
fss_one = mem.fss(grd_ob1,grd_fo1,grade_list=grade_list,half_window_size_list = half_window_size_list)
fss_list.append(fss_one)
path_fo = meb.get_path(dir_fo2,time_fo,dh)
grd_fo2 = meb.read_griddata_from_nc(path_fo,time = time_fo,dtime = dh,level =0,data_name = &quot;NCEP&quot;,grid = grid0)
if grd_ob1 is not None and grd_fo2 is not None:
fss_one = mem.fss(grd_ob1,grd_fo2,grade_list=grade_list,half_window_size_list = half_window_size_list)
fss_list.append(fss_one)
time_fo = time_fo + datetime.timedelta(hours = 12)
fss_all = meb.concat(fss_list)
print(fss_all)</code></pre>
<pre><code>2020-07-01 08:00:00
2020-07-01 20:00:00
2020-07-02 08:00:00
2020-07-02 20:00:00
2020-07-03 08:00:00
2020-07-03 20:00:00
2020-07-04 08:00:00
2020-07-04 20:00:00
2020-07-05 08:00:00
2020-07-05 20:00:00
2020-07-06 08:00:00
2020-07-06 20:00:00
2020-07-07 08:00:00
2020-07-07 20:00:00
2020-07-08 08:00:00
2020-07-08 20:00:00
2020-07-09 08:00:00
2020-07-09 20:00:00
level time dtime id lon lat fname \
0 NaN 2020-07-01 08:00:00 3 999999 NaN NaN ECMWF
1 NaN 2020-07-01 08:00:00 3 999999 NaN NaN ECMWF
2 NaN 2020-07-01 08:00:00 3 999999 NaN NaN ECMWF
3 NaN 2020-07-01 08:00:00 3 999999 NaN NaN ECMWF
4 NaN 2020-07-01 08:00:00 3 999999 NaN NaN ECMWF
.. ... ... ... ... ... ... ...
43 NaN 2020-07-09 20:00:00 72 999999 NaN NaN NCEP
44 NaN 2020-07-09 20:00:00 72 999999 NaN NaN NCEP
45 NaN 2020-07-09 20:00:00 72 999999 NaN NaN NCEP
46 NaN 2020-07-09 20:00:00 72 999999 NaN NaN NCEP
47 NaN 2020-07-09 20:00:00 72 999999 NaN NaN NCEP
half_window_size grade pob pfo fbs fss
0 5 0.1 0.194534 6.274214e-01 0.360689 0.561182
1 5 5.0 0.032807 3.808229e-02 0.037445 0.471784
2 5 10.0 0.013510 9.198992e-03 0.016541 0.271608
3 5 15.0 0.006045 2.196897e-03 0.007541 0.085087
4 5 20.0 0.002734 8.780202e-04 0.003548 0.017810
.. ... ... ... ... ... ...
43 40 5.0 0.006086 1.416543e-02 0.013688 0.324108
44 40 10.0 0.000715 5.947981e-04 0.001184 0.096224
45 40 15.0 0.000132 1.763409e-05 0.000147 0.016037
46 40 20.0 0.000014 4.863769e-08 0.000014 0.004465
47 40 25.0 0.000002 0.000000e+00 0.000002 0.000000
[41472 rows x 13 columns]</code></pre>
<p>上述结果中包含了1种模式、60种起报时间、24种时效、2种等级、5种窗口的检验fss结果,以及相应的中间结果。</p>
<pre><code class="language-python"># 从总结果中挑选某一种参数的结果进行绘图分析。
fss_one_para = meb.sele_by_para(fss_all,grade =[5], half_window_size =[10],fname = [&quot;ECMWF&quot;]) # 选取一种参数的结果
fss_part = meb.sele_by_para(fss_one_para,member = [&quot;fss&quot;]) # 选取fss评分
print(fss_part)</code></pre>
<pre><code> level time dtime id lon lat fss
7 NaN 2020-07-01 08:00:00 3 999999 NaN NaN 0.567072
7 NaN 2020-07-01 08:00:00 6 999999 NaN NaN 0.518486
7 NaN 2020-07-01 08:00:00 9 999999 NaN NaN 0.530505
7 NaN 2020-07-01 08:00:00 12 999999 NaN NaN 0.312065
7 NaN 2020-07-01 08:00:00 15 999999 NaN NaN 0.318812
.. ... ... ... ... ... ... ...
7 NaN 2020-07-09 20:00:00 60 999999 NaN NaN 0.672741
7 NaN 2020-07-09 20:00:00 63 999999 NaN NaN 0.578186
7 NaN 2020-07-09 20:00:00 66 999999 NaN NaN 0.360016
7 NaN 2020-07-09 20:00:00 69 999999 NaN NaN 0.214642
7 NaN 2020-07-09 20:00:00 72 999999 NaN NaN 0.188977
[432 rows x 7 columns]</code></pre>
<pre><code class="language-python">meb.tool.plot_tools.mesh_time_dtime(fss_part,annot = -1,title = [&quot;FSS评分随时间和时效变化&quot;],cmap = &quot;ts&quot;) # 将所选结果绘制成图形</code></pre>
<p><img src="https://www.showdoc.com.cn/server/api/attachment/visitFile?sign=abe33d2f3a5faad9977d390f78a6be68&amp;file=file.png" alt="" /></p>
<p>若需计算某种时效下,所有时间数据的总体的fss,不能直接将fss取平均,需要先对 pob、 pfo、 fbs列先求和,再利用fss = 1 - fbs/(pob+pfo)来计算</p>
<pre><code class="language-python">fss_one_para = meb.sele_by_para(fss_all,grade =[5], half_window_size =[10],dtime = 3,fname = [&quot;ECMWF&quot;]) # 选取一种参数的结果
fss_03 = 1 - fss_one_para[&quot;fbs&quot;].sum()/(fss_one_para[&quot;pob&quot;].sum()+fss_one_para[&quot;pfo&quot;].sum())
print(&quot;设置半窗口为10,阈值为5mm,则时效为3的所有起报时间的预报的总体fss为:&quot;)
print(fss_03)</code></pre>
<pre><code>设置半窗口为10,阈值为5mm,则时效为3的所有起报时间的预报的总体fss为:
0.713404057345036</code></pre>
<h1>绘制fss评分随阈值和窗口尺度的变化图</h1>
<pre><code class="language-python">result = mps.score_df(fss_all,mem.fss,g = [&quot;fname&quot;,&quot;half_window_size&quot;,&quot;grade&quot;],plot = &quot;mesh&quot;,ncol = 2)</code></pre>
<p><img src="https://www.showdoc.com.cn/server/api/attachment/visitFile?sign=fa2822ad215afb9b37a64497e61c625b&amp;file=file.png" alt="" /></p>