4. Baidu Tiangong Image Classification Competition

<pre><code class="language-python">import pandas as pd
import numpy as np</code></pre>
<pre><code class="language-python">df = pd.read_csv('./data/micro_height_data_lable.csv', header=0)</code></pre>
<pre><code class="language-python">df.head()</code></pre>
<pre><code class="language-python">import os
import shutil</code></pre>
Create six class folders so that the data is easy to import into MATLAB.
<pre><code class="language-python"># The folders are created relative to the current working directory;
# the cells below move images into ./data/&lt;label&gt;.
os.mkdir('OCEAN')
os.mkdir('MOUNTAIN')
os.mkdir('DESERT')
os.mkdir('LAKE')
os.mkdir('FARMLAND')
os.mkdir('CITY')</code></pre>
<pre><code class="language-python">shutil.move("./data/multi_test_data_lable.csv", "./data/CITY")  # move a file or directory</code></pre>
<pre><code>'out: ./data/CITY\\multi_test_data_lable.csv'</code></pre>
<pre><code class="language-python"># Image directory
file_dir = "D:\\work\\jupyterNotebook\\baidu_dianshi\\data\\multi_test_data"
# Read the label data
df = pd.read_csv('./data/multi_test_data_lable.csv', header=0)
# Get the lables and img_name columns
lables = df['lables']
img_name = df['img_name']
# Walk through every image in the folder and look its file name up in img_name.
# Once found, take the matching row index and move the file into the folder of its class.
for file in os.listdir(file_dir):
    # print(file)
    # img_path = file_dir + '\\' + file  # path of each image
    # img = Image.open(img_path)
    i = 0
    for img in img_name:
        if file == img:
            shutil.move(file_dir + '\\' + img, "./data/" + lables[i])
            break
        else:
            i = i + 1</code></pre>
<hr /> <hr />
<pre><code class="language-python">import numpy as np
from PIL import Image</code></pre>
<pre><code class="language-python">def read_image(img_name):
    im = Image.open(img_name)  # .convert('L')
    data = np.array(im)
    return data</code></pre>
<pre><code class="language-python">import os

images = []
# Image directory
file_dir = "D:\\work\\jupyterNotebook\\baidu_dianshi\\data\\pre_data"
for file in os.listdir(file_dir):
    img_path = file_dir + '\\' + file  # path of each image
    images.append(read_image(img_path))</code></pre>
Find the images in the directory whose shape is not (256, 256, 3); the loop prints the index of each one.
<pre><code class="language-python">for j, img in enumerate(images):
    # print(len(img))
    if img.shape != (256, 256, 3):
        print(j)</code></pre>
<pre><code
class="language-python">X_img = np.array(images)</code></pre>
<pre><code>ValueError                                Traceback (most recent call last)
<ipython-input-15-0f8e6620abfa> in <module>()
----> 1 X_img = np.array(images)

ValueError: could not broadcast input array from shape (256,256,3) into shape (256)</code></pre>
Import a MATLAB .mat variable into Python. The MATLAB variable is a cell array.
<pre><code class="language-python">import numpy as np
import pandas as pd
import scipy.io as sio</code></pre>
<pre><code class="language-python">pfile = sio.loadmat('./pfile.mat')</code></pre>
<pre><code class="language-python">pfile</code></pre>
<pre><code>{'__globals__': [],
 '__header__': b'MATLAB 5.0 MAT-file, Platform: PCWIN64, Created on: Mon Nov 5 16:59:22 2018',
 '__version__': '1.0',
 'pfiles': array([[array(['D:\\work\\jupyterNotebook\\baidu_dianshi\\data\\pre_data\\MWI_KD96UmGQ6KWVeohF.jpg'], dtype='<U76')],
        [array(['D:\\work\\jupyterNotebook\\baidu_dianshi\\data\\pre_data\\MWI_KDDTIKoSiQgwZiUQ.jpg'], dtype='<U76')],
        ...............
        [array(['D:\\work\\jupyterNotebook\\baidu_dianshi\\data\\pre_data\\MWI_s8cXI99nJNDiXXNc.jpg'], dtype='<U76')],
        [array(['D:\\work\\jupyterNotebook\\baidu_dianshi\\data\\pre_data\\MWI_s8gMVZYugHHFmUyg.jpg'], dtype='<U76')]], dtype=object)}</code></pre>
<pre><code class="language-python">pf = pfile['pfiles']</code></pre>
<pre><code class="language-python">pf[0,0]</code></pre>
<pre><code>array(['D:\\work\\jupyterNotebook\\baidu_dianshi\\data\\pre_data\\MWI_KD96UmGQ6KWVeohF.jpg'], dtype='<U76')</code></pre>
<pre><code class="language-python">pPre = sio.loadmat('./cell_str.mat')</code></pre>
<pre><code class="language-python">pPre</code></pre>
<pre><code>{'__globals__': [],
 '__header__': b'MATLAB 5.0 MAT-file, Platform: PCWIN64, Created on: Mon Nov 5 17:12:21 2018',
 '__version__': '1.0',
 'cell_pre': array([[array(['DESERT'], dtype='<U6')],
        [array(['OCEAN'], dtype='<U5')],
        [array(['MOUNTAIN'], dtype='<U8')],
        [array(['MOUNTAIN'], dtype='<U8')],
        [array(['LAKE'], dtype='<U4')],
        [array(['DESERT'],
               dtype='<U6')],
        [array(['LAKE'], dtype='<U4')],
        [array(['LAKE'], dtype='<U4')],
        [array(['DESERT'], dtype='<U6')],
        [array(['DESERT'], dtype='<U6')],
        ...........
        [array(['OCEAN'], dtype='<U5')],
        [array(['LAKE'], dtype='<U4')],
        [array(['DESERT'], dtype='<U6')]], dtype=object)}</code></pre>
<pre><code class="language-python">cell_pre = pPre['cell_pre']</code></pre>
<pre><code class="language-python">cell_pre[0]</code></pre>
<pre><code>array([array(['DESERT'], dtype='<U6')], dtype=object)</code></pre>
<pre><code class="language-python">pf[0,0]</code></pre>
<pre><code>array(['D:\\work\\jupyterNotebook\\baidu_dianshi\\data\\pre_data\\MWI_KD96UmGQ6KWVeohF.jpg'], dtype='<U76')</code></pre>
<pre><code class="language-python">pf[5,0][0]</code></pre>
<pre><code>'D:\\work\\jupyterNotebook\\baidu_dianshi\\data\\pre_data\\MWI_KGnibGL9KFBpcvjV.jpg'</code></pre>
<pre><code class="language-python">file_name_list = []
for i in range(1000):
    file_name_list.append(pf[i, 0][0][52:])</code></pre>
<pre><code class="language-python">file_name_list</code></pre>
<pre><code>['MWI_KD96UmGQ6KWVeohF.jpg',
 'MWI_KDDTIKoSiQgwZiUQ.jpg',
 'MWI_KERiJD55HvBKIhmL.jpg',
 'MWI_KF9KpqQNNMVSsUKH.jpg',
 'MWI_KFSnYaR40ro5CsiG.jpg',
 'MWI_KGnibGL9KFBpcvjV.jpg',
 ..........
 'MWI_s8Jos2wcHcmTqMJ6.jpg',
 'MWI_s8cXI99nJNDiXXNc.jpg',
 'MWI_s8gMVZYugHHFmUyg.jpg']</code></pre>
<pre><code class="language-python">cell_pre[0][0][0]</code></pre>
<pre><code>'DESERT'</code></pre>
<pre><code class="language-python">pre_name_list = []
for i in range(1000):
    pre_name_list.append(cell_pre[i][0][0])</code></pre>
<pre><code class="language-python">pre_name_list</code></pre>
<pre><code>['DESERT', 'OCEAN', 'MOUNTAIN', 'MOUNTAIN', .....
 'DESERT', 'FARMLAND', 'OCEAN', 'LAKE', 'DESERT']</code></pre>
Slicing the string: the directory prefix is 52 characters long, so everything from index 52 onward is the file name.
<pre><code class="language-python">a = 'D:\\work\\jupyterNotebook\\baidu_dianshi\\data\\pre_data\\'
len(a)</code></pre>
<pre><code>52</code></pre>
<pre><code class="language-python">pf[5,0][0][52:]</code></pre>
<pre><code>'MWI_KGnibGL9KFBpcvjV.jpg'</code></pre>
<hr /> <hr /> <hr />
<pre><code class="language-python">import pandas as pd
from pandas import Series, DataFrame</code></pre>
<pre><code class="language-python">frame1 = DataFrame(file_name_list)</code></pre>
<pre><code class="language-python">frame1</code></pre>
<pre><code>0    MWI_KD96UmGQ6KWVeohF.jpg
1    MWI_KDDTIKoSiQgwZiUQ.jpg
2    MWI_KERiJD55HvBKIhmL.jpg
3    MWI_KF9KpqQNNMVSsUKH.jpg
...
10   MWI_KKYSmFI1jRRVBfXF.jpg
...
992  MWI_s2l0T218Fq2lqtFS.jpg
993  MWI_s4BD8Ud5Gu91QwMl.jpg
994  MWI_s5GwvnKVlu0HDG2h.jpg
995  MWI_s5r88rnOtm3KriE1.jpg
996  MWI_s7JexiPnRc2cTiyZ.jpg
997  MWI_s8Jos2wcHcmTqMJ6.jpg
998  MWI_s8cXI99nJNDiXXNc.jpg
999  MWI_s8gMVZYugHHFmUyg.jpg</code></pre>
1000 rows × 1 columns
<pre><code class="language-python">frame1_name = DataFrame(pre_name_list)</code></pre>
<pre><code class="language-python">frame1_name</code></pre>
<pre><code>0    DESERT
1    OCEAN
2    MOUNTAIN
3    MOUNTAIN
4    LAKE
5    DESERT
6    LAKE
7    LAKE
...
979  OCEAN
980  DESERT
...</code></pre>
1000 rows × 1 columns
Merge the two columns
<pre><code
class="language-python">result = pd.concat([frame1, frame1_name], axis=1)
result</code></pre>
Save to a CSV file
<pre><code class="language-python">result.to_csv("save_data.csv", index=False)</code></pre>
Save without the header row
<pre><code class="language-python">result.to_csv("save_data2.csv", index=False, header=False)</code></pre>
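The slice `[52:]` used earlier only works while the directory prefix is exactly 52 characters long. The sketch below shows one way to avoid that dependency using only the standard library: `ntpath.basename` applies Windows path rules on any platform, and the `csv` module writes the (file name, label) pairs. The two sample paths and labels are the first two entries from the outputs above; the helper names `strip_dir` and `write_pairs` are hypothetical and not part of the original notebook.

```python
import csv
import io
import ntpath  # Windows path rules, importable on every platform


def strip_dir(path):
    """Return only the file name of a Windows-style path."""
    return ntpath.basename(path)


def write_pairs(names, labels, out, header=None):
    """Write (file name, label) rows as CSV; header is an optional list of column names."""
    writer = csv.writer(out, lineterminator="\n")
    if header is not None:
        writer.writerow(header)
    writer.writerows(zip(names, labels))


# First two paths and predicted labels shown in the outputs above.
paths = [
    "D:\\work\\jupyterNotebook\\baidu_dianshi\\data\\pre_data\\MWI_KD96UmGQ6KWVeohF.jpg",
    "D:\\work\\jupyterNotebook\\baidu_dianshi\\data\\pre_data\\MWI_KDDTIKoSiQgwZiUQ.jpg",
]
labels = ["DESERT", "OCEAN"]

names = [strip_dir(p) for p in paths]

# An in-memory buffer stands in for a file opened with open(..., "w", newline="").
buffer = io.StringIO()
write_pairs(names, labels, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

With the notebook's variables, the same idea would be `ntpath.basename(pf[i, 0][0])` in place of `pf[i, 0][0][52:]`.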

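The cell near the top that moves each image into its class folder scans the whole `img_name` column for every file. A possible alternative, sketched here under assumptions, is to build a dictionary from file name to label once and look each file up directly. The function name `sort_into_class_dirs`, the temporary directory, and the file names `a.jpg`, `b.jpg`, `c.jpg` are hypothetical and exist only to make the example runnable without the competition data.

```python
import os
import shutil
import tempfile


def sort_into_class_dirs(src_dir, label_of, dst_root):
    """Move every file in src_dir that has a label into dst_root/<label>/."""
    moved = []
    for name in sorted(os.listdir(src_dir)):
        label = label_of.get(name)
        if label is None:
            continue  # no label for this file: leave it where it is
        class_dir = os.path.join(dst_root, label)
        os.makedirs(class_dir, exist_ok=True)  # create the class folder on demand
        shutil.move(os.path.join(src_dir, name), os.path.join(class_dir, name))
        moved.append((name, label))
    return moved


# Demonstration on empty placeholder files in a temporary directory.
with tempfile.TemporaryDirectory() as root:
    src = os.path.join(root, "multi_test_data")
    os.makedirs(src)
    for name in ["a.jpg", "b.jpg", "c.jpg"]:
        open(os.path.join(src, name), "w").close()

    label_of = {"a.jpg": "OCEAN", "b.jpg": "CITY"}  # c.jpg is deliberately unlabeled
    moved = sort_into_class_dirs(src, label_of, root)
    left = os.listdir(src)
    ocean = os.listdir(os.path.join(root, "OCEAN"))

print(moved, left, ocean)
```

With the label table from the notebook, the dictionary could be built as `dict(zip(df['img_name'], df['lables']))`.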