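Medical Information Processing Practicum, Session 12: "Nonparametric Analysis of Quantitative Data" Assignment

January 25, 2010

(Note) An example answer to the previous assignment is available at http://phi.med.gunma-u.ac.jp/medstat/it2009-11r.html.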

Assignment

The file sample12.txt contains age-adjusted mortality rates by prefecture, sex, and major cause of death in Japan for 2005. Variable names ending in M refer to males and those ending in F to females: ALLM and ALLF are for all causes, CANCERM and CANCERF for malignant neoplasms (cancer), CARDIOM and CARDIOF for heart disease, CEREBROM and CEREBROF for cerebrovascular disease, PNEUMOM and PNEUMOF for pneumonia, ACCIDENTM and ACCIDENTF for accidents, SUICIDEM and SUICIDEF for suicide, SCENILM and SCENILF for senility, KIDNEYFM and KIDNEYFF for kidney failure, LIVERDM and LIVERDF for liver disease, COPDM and COPDF for chronic obstructive pulmonary disease (COPD), and DIABETESM and DIABETESF for diabetes.

The file also contains prefecture-level socioeconomic measures: HHSAVINGS is the average savings per household in 2004 (unit: 1,000 yen), AVETEMP is the annual mean temperature in 2005, MYCAR is the number of private cars owned per household in 2005, PRODUCTS is the total value of industrial production in 2005 (unit: million yen), POP is the 2005 population (counted as members of private households, 一般世帯人員), and PRODPP is industrial production per capita in 2005 (unit: million yen), that is, the ratio of PRODUCTS to POP. The data come from the website of the Ministry of Health, Labour and Welfare, from e-STAT (Statistics Bureau, Ministry of Internal Affairs and Communications), and from "Japanese industry from the viewpoint of statistics" (「統計から見る日本の工業」) on the website of the Ministry of Economy, Trade and Industry.

The variable PREF gives the name of the prefecture and AREA indicates whether the prefecture belongs to eastern or western Japan (both are written in Japanese). There are several ways to divide Japan into east and west; here eastern Japan is defined as broadly as possible, covering Hokkaido, Tohoku, Kanto, Chubu, Hokuriku, and Tokai including Mie, so that its western edge is formed by Fukui, Gifu, Aichi, and Mie prefectures.

Using the Wilcoxon rank sum test, test whether the prefectural age-adjusted mortality rates from kidney failure (KIDNEYFM for males, KIDNEYFF for females) differ significantly between eastern and western Japan (as indicated by AREA). Use a significance level of 5% (alpha = 0.05).

Write R scripts that carry out these hypothesis tests, with appropriate graphs, and paste them into the upper box of the form below. Discuss the results in the lower box. Be sure to enter your student ID and name in their respective fields before submitting.
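As a rough illustration of the kind of script the assignment asks for, the following minimal sketch runs a two-sided Wilcoxon rank sum test for each sex and draws a box plot per group. It is not an official answer. The data frame `toy` holds made-up numbers, not the contents of sample12.txt, and its "East"/"West" labels are placeholders (the real AREA values are in Japanese). The helper name `rank_sum_by_area` is hypothetical, and the commented-out `read.delim` call assumes the file is tab-delimited with a header row, which the assignment text does not state.

```r
# Hypothetical toy data: these are NOT the values in sample12.txt.
# Only the column names (AREA, KIDNEYFM, KIDNEYFF) follow the assignment text.
toy <- data.frame(
  AREA = factor(rep(c("East", "West"), each = 6)),
  KIDNEYFM = c(8.1, 9.4, 7.6, 8.8, 9.9, 8.3, 9.0, 10.2, 9.7, 8.6, 10.8, 9.5),
  KIDNEYFF = c(4.9, 5.6, 4.4, 5.1, 5.8, 4.7, 5.3, 6.1, 5.5, 5.0, 6.4, 5.9)
)

# With the real file, something like the line below may work. The delimiter,
# header row, and file location are assumptions, so adjust as needed.
# toy <- read.delim("sample12.txt")

# Two-sided Wilcoxon rank sum test of one variable between the two AREA groups.
# wilcox.test() computes an exact p-value for small samples without ties and
# falls back to a normal approximation (with a warning) when ties are present.
rank_sum_by_area <- function(data, varname, alpha = 0.05) {
  groups <- split(data[[varname]], data$AREA)
  stopifnot(length(groups) == 2)
  res <- wilcox.test(groups[[1]], groups[[2]])
  list(variable = varname,
       W = unname(res$statistic),
       p.value = res$p.value,
       significant = res$p.value < alpha)
}

results <- lapply(c("KIDNEYFM", "KIDNEYFF"), rank_sum_by_area, data = toy)

for (r in results) {
  cat(sprintf("%s: W = %g, p = %.4f, significant at 5%%: %s\n",
              r$variable, r$W, r$p.value, r$significant))
}

# Box plots by area for a visual check (drawn only in an interactive session).
if (interactive()) {
  op <- par(mfrow = c(1, 2))
  boxplot(KIDNEYFM ~ AREA, data = toy, main = "Kidney failure, males")
  boxplot(KIDNEYFF ~ AREA, data = toy, main = "Kidney failure, females")
  par(op)
}
```

The rank sum test is used here rather than a t-test because it compares the two groups through ranks and so does not rely on the mortality rates being normally distributed. The discussion should state, for each sex, whether the p-value falls below 0.05 and what that implies about an east-west difference.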

Answer form

Item | Input field
Student ID
Name
Mail address (if any)
Answer (R scripts)
Answer (Discussion)