Discriminant analysis example - SPSS

Data from http://core.ecu.edu/psyc/wuenschk/SPSS/SPSS-Data.htm, the DFA-step dataset.

The dataset is ready to use, and a short reference accompanying it can be found at the same URL. The reference's step-by-step instructions only go as far as the main dialog; its analysis discussion focuses on a territorial map.

All quoted passages are credited; anything unmarked is my own impression from working through the steps.

[Screenshot 1]

Drag group into the grouping-variable box and define its range as 1-3.

 

[Screenshot 2]

Enter x1-x4 as the independent variables and use the stepwise method (I am not sure every variable is suitable for discrimination, so I let stepwise selection pick them).

 

Of the four dialogs on the right, Statistics, Method, Classify and Save, check the options as needed.

The detailed meaning of every option is explained at http://wenku.baidu.com/view/7232170fba1aa8114431d9ef.html. A solid paper will usually spell out in the text what each option offered by the software means.

[Screenshots 3-6]
Interpreting the results:
[Screenshot 7]
This table shows that x1, x2 and x3 were entered into the discriminant analysis while x4 was left out.
[Screenshot 8]
This one shows how x4 was eliminated.

 

[Screenshot 9]

In the summary of canonical discriminant functions, two functions are at work.

They explain 72.3% and 27.7% of the variance respectively,

and both are highly significant.

 

[Screenshot 10]

x1 and x2 correlate more strongly with discriminant function 1,

while x3 correlates more strongly with function 2.

 

The territorial map from the output:

Canonical Discriminant Function 2

-8.0      -6.0      -4.0      -2.0        .0       2.0       4.0       6.0       8.0
+---------+---------+---------+---------+---------+---------+---------+---------+
8.0 +                             32                                                  +
I                             32                                                  I
I                              32                                                 I
I                              32                                                 I
I                               32                                                I
I                               32                                                I
6.0 +          +         +         +32       +         +         +         +          +
I                                32                                               I
I                                32                                               I
I                                32                                               I
I                                 32                                              I
I                                 32                                              I
4.0 +          +         +         +   32    +         +         +         +          +
I                                  32                                             I
I                                  32                                             I
I                                   32                                            I
I                                   32                                            I
I                                    32                                           I
2.0 +          +         +         +     32  +         +         +         +          +
I                                    32                                           I
I                                     32           *                              I
I                                     32                                          I
I                                     32                                          I
I                         *            32222                                      I
.0 +          +         +         +      311111222222 +         +         +          +
I                                     31    111111222222                          I
I                                    31           111111222222                    I
I                                   31                  111111222222              I
I                                  31                         111111222222        I
I                                 31          *                     111111222222  I
-2.0 +          +         +         + 31      +         +         +         +  11111122+
I                               31                                              11I
I                               31                                                I
I                              31                                                 I
I                             31                                                  I
I                            31                                                   I
-4.0 +          +         +      31 +         +         +         +         +          +
I                          31                                                     I
I                         31                                                      I
I                         31                                                      I
I                        31                                                       I
I                       31                                                        I
-6.0 +          +         + 31      +         +         +         +         +          +
I                     31                                                          I
I                    31                                                           I
I                   31                                                            I
I                   31                                                            I
I                  31                                                             I
-8.0 +                 31                                                              +
+---------+---------+---------+---------+---------+---------+---------+---------+
-8.0      -6.0      -4.0      -2.0        .0       2.0       4.0       6.0       8.0
Canonical Discriminant Function 1

Symbols used in territorial map

Symbol  Group  Label
------  -----  --------------
   1      1    Overpaid
   2      2    Correct
   3      3    Underpaid
   *           Indicates a group centroid

The results match those in the reference that accompanies the data:

“ the underpayers are on the left, having a low DF1 (high X1 and low X3).  The overpayers are on the lower right, having a high DF1  and a low DF2 (low X2, high X3, high X1).  Those who paid the correct amount are in the upper right, having a high DF1  and a high DF2  (low X1, high X2, high X3).”

Copyright 2008 Karl L. Wuensch - All rights reserved.

 

[Screenshot 11]

This plot is more intuitive: the distribution of the three groups is clear at a glance.

 

Classification results

[Screenshot 12]
91.7% of the originally grouped cases were correctly classified.

Predicted classification versus the original classification:

[Screenshots 13 and 14]
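For a cross-check outside SPSS, roughly the same analysis (minus the stepwise step) can be sketched in R with MASS::lda. This is only a sketch: the data frame name dfa and its columns group and x1-x4 are my assumed names for the loaded DFA-step data, not names from the original file.

library(MASS)
# linear discriminant analysis on all four predictors (R has no built-in stepwise DFA)
fit <- lda(group ~ x1 + x2 + x3 + x4, data = dfa)    # dfa is an assumed name
fit                                                  # "Proportion of trace" ~ the 72.3% / 27.7% above
pred <- predict(fit)
table(original = dfa$group, predicted = pred$class)  # cf. the classification table
mean(pred$class == dfa$group)                        # cf. the 91.7% hit rate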


Pros and cons of time series analysis [paper summary]

from 纵向数据分析方法 (Longitudinal Data Analysis Methods) by Liu Hongyun & Meng Qingmao, 2003

Time series analysis can be viewed as a generalized model of repeated-measures ANOVA.

Advantages:

1. Less demanding measurement requirements

It does not require every individual to have the same number of observations,

and the time intervals between measurements may differ.

2. Stronger explanatory power for the data

It captures both the differences between measurement occasions and the differences between individuals (thanks to the multilevel model).

Disadvantages:

Complexity.

It cannot handle indirect relationships between variables (LGM can address this).

 

mult comp = = one bleeding heart experience

After I tried reaaally hard to complete the mult comp, some of my classmates simply told me that SPSS can do this easily. Heart bleeding %>_<%.

No matter which software you use, multiple comparison is not the kind of thing where you just click buttons and wait for the processing to end. Well, I really think my classmates and I are each set in our one tool of choice, and neither of us wants to change.

** "mult comp "required

signal.lm=lm(d.~subject+store,a)

summary(glht(signal.lm,linfct=mcp(store="turkey")))

One example from the experimental psychology course - repeated measures

In this term's experimental psychology course, most teaching experiments are within-subject designs, so repeated measures has been the most frequently used method so far.

data: (material: light/sound; yesno: response yes/response no; e: error)

subject material yesno e
1 1 1 0.09765
2 1 1 0.2177
3 1 1 0.214
4 1 1 0.169947368
5 1 1 0.1204
6 1 1 0.098
7 1 1 0.14305
1 1 2 0.11145
2 1 2 0.1098
3 1 2 0.08785
4 1 2 0.149
5 1 2 0.0904
6 1 2 0.1183
7 1 2 0.1401
1 2 1 0.22455
2 2 1 0.2295
3 2 1 0.08815
4 2 1 0.19575
5 2 1 0.11695
6 2 1 0.0726
7 2 1 0.13295
1 2 2 0.1423
2 2 2 0.13675
3 2 2 0.0586
4 2 2 0.1464
5 2 2 0.18505
6 2 2 0.0684
7 2 2 0.12215
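The block above can be pasted into R directly. A minimal sketch, assuming the table (with its header row) has been copied to the Windows clipboard; the data frame is named a to match the aov() call below:

a <- read.table("clipboard", header = TRUE)  # on macOS: read.table(pipe("pbpaste"), header = TRUE)
# the design columns are coded as numbers, so turn them into factors before aov()
a$subject  <- factor(a$subject)
a$material <- factor(a$material)
a$yesno    <- factor(a$yesno)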

# two within-subject factors; the design columns were converted to factors above
summary(aov(e ~ material*yesno + Error(subject/(material*yesno)), data = a))

Things done!

R is at its best when automating frequently repeated processing steps.

Effect Size

There are too many kinds of effect size (ES). I will just collect some of them in this post in case they are needed.

Cohen's d

In general, d = (M1 - M2) / pooled standard deviation

  • pairwise = true:

d = (M1 - M2) / S

where S can be either Group 1's or Group 2's standard deviation.

  • pairwise = false:

d = (M1 - M2) / S, with the pooled variance

S^2 = [(n1-1)S1^2 + (n2-1)S2^2] / (n1+n2)

 

d = 0.2: small effect

d = 0.5: medium effect

d = 0.8: large effect
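The pooled version is easy to compute by hand in R. A minimal sketch; the function name cohens_d is my own, and it uses the n1+n2-2 pooled denominator that most texts give (some, like the formula above, divide by n1+n2 instead):

# Cohen's d for two independent groups, pooled SD with n1+n2-2 in the denominator
cohens_d <- function(x, y) {
  n1 <- length(x); n2 <- length(y)
  s_pooled <- sqrt(((n1 - 1) * var(x) + (n2 - 1) * var(y)) / (n1 + n2 - 2))
  (mean(x) - mean(y)) / s_pooled
}
set.seed(1)
cohens_d(rnorm(50, mean = 0.5), rnorm(50, mean = 0))  # should land near d = 0.5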

 

one-way ANOVA:

Eta^2 = SS_effect / SS_total

more factors:

partial Eta^2 = SS_effect / (SS_effect + SS_error)
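Both formulas drop straight out of an aov() table. A quick sketch on a built-in dataset (InsectSprays is just a stand-in example; for a one-way design, partial Eta^2 equals Eta^2):

# Eta^2 for a one-way ANOVA: SS_effect / SS_total
fit <- aov(count ~ spray, data = InsectSprays)
ss  <- summary(fit)[[1]][["Sum Sq"]]  # c(SS_effect, SS_error)
ss[1] / sum(ss)                       # Eta^2, since SS_total = SS_effect + SS_error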

 

Though Eta^2 is easy to calculate, omega^2 is the better measure: Eta^2 usually overestimates the effect size.

 

PZO (p-z-ordinate) conversion table

Quote

The table is usually paywalled online, so finding it is easy but copying it down is rather tough.
It seems useful in many psychology experiments.
p     z     ordinate     p     z     ordinate
0.01 2.326 0.0267 0.51 0.025 0.3988
0.02 2.053 0.0484 0.52 0.05 0.3984
0.03 1.881 0.0681 0.53 0.075 0.3978
0.04 1.75 0.0862 0.54 0.1 0.397
0.05 1.645 0.1032 0.55 0.125 0.3958
0.06 1.555 0.1192 0.56 0.15 0.3945
0.07 1.476 0.1343 0.57 0.176 0.3928
0.08 1.405 0.1487 0.58 0.201 0.3909
0.09 1.34 0.1625 0.59 0.227 0.3888
0.1 1.281 0.1756 0.6 0.253 0.3864
0.11 1.226 0.1881 0.61 0.279 0.3838
0.12 1.175 0.2001 0.62 0.305 0.3808
0.13 1.126 0.2116 0.63 0.331 0.3777
0.14 1.08 0.2227 0.64 0.358 0.3742
0.15 1.036 0.2333 0.65 0.385 0.3705
0.16 0.994 0.2434 0.66 0.412 0.3665
0.17 0.954 0.2532 0.67 0.439 0.3623
0.18 0.915 0.2625 0.68 0.467 0.3577
0.19 0.877 0.2715 0.69 0.495 0.3529
0.2 0.841 0.2801 0.7 0.524 0.3478
0.21 0.806 0.2883 0.71 0.553 0.3424
0.22 0.772 0.2962 0.72 0.582 0.3368
0.23 0.738 0.3038 0.73 0.612 0.3308
0.24 0.706 0.311 0.74 0.643 0.3245
0.25 0.674 0.3179 0.75 0.674 0.3179
0.26 0.643 0.3245 0.76 0.707 0.311
0.27 0.612 0.3308 0.77 0.738 0.3038
0.28 0.582 0.3368 0.78 0.772 0.2962
0.29 0.553 0.3424 0.79 0.806 0.2883
0.3 0.524 0.3478 0.8 0.841 0.2801
0.31 0.495 0.3529 0.81 0.877 0.2715
0.32 0.467 0.3577 0.82 0.915 0.2625
0.33 0.439 0.3623 0.83 0.954 0.2532
0.34 0.412 0.3665 0.84 0.994 0.2434
0.35 0.385 0.3705 0.85 1.036 0.2333
0.36 0.358 0.3742 0.86 1.08 0.2227
0.37 0.331 0.3777 0.87 1.126 0.2116
0.38 0.305 0.3808 0.88 1.175 0.2001
0.39 0.279 0.3838 0.89 1.226 0.1881
0.4 0.253 0.3864 0.9 1.281 0.1756
0.41 0.227 0.3888 0.91 1.34 0.1625
0.42 0.201 0.3909 0.92 1.405 0.1487
0.43 0.176 0.3928 0.93 1.476 0.1343
0.44 0.15 0.3945 0.94 1.555 0.1192
0.45 0.125 0.3958 0.95 1.645 0.1032
0.46 0.1 0.397 0.96 1.75 0.0862
0.47 0.075 0.3978 0.97 1.881 0.0681
0.48 0.05 0.3984 0.98 2.053 0.0484
0.49 0.025 0.3988 0.99 2.326 0.0267
0.5 0 0.3989
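Rather than copying the table down, it can be regenerated in R in one go, since z = |qnorm(p)| and the ordinate column is the standard normal density at that z. A sketch, rounded to match the table's digits:

p   <- seq(0.01, 0.99, by = 0.01)
z   <- abs(qnorm(p))                 # e.g. p = 0.01 gives z = 2.326
pzo <- data.frame(p = p, z = round(z, 3), ordinate = round(dnorm(z), 4))
head(pzo)                            # the rows reproduce the table above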

Lecture 18 PPT, page 9

Data: 18.ex5.csv
Using aov (or SPSS), modify the full design: what effect does removing the between-subject interaction have? What about removing the within-subject interactions?
full design output:

Call:
aov(formula = Recall ~ (Gender * Dosage) * (Task * Valence) + Error(Subject/(Task *
Valence)), data = data.ex5)

Grand Mean: 15.62963

Stratum 1: Subject

Terms:
Gender Dosage Gender:Dosage Residuals
Sum of Squares 542.2593 694.9074 70.7963 1144.5556
Deg. of Freedom 1 2 2 12

Residual standard error: 9.76625
25 out of 30 effects not estimable
Estimated effects may be unbalanced

Stratum 2: Subject:Task

Terms:
Task Gender:Task Dosage:Task Gender:Dosage:Task
Sum of Squares 96.33333 1.33333 8.16667 3.16667
Deg. of Freedom 1 1 2 2
Residuals
Sum of Squares 29.00000
Deg. of Freedom 12

Residual standard error: 1.554563
12 out of 18 effects not estimable
Estimated effects may be unbalanced

Stratum 3: Subject:Valence

Terms:
Valence Gender:Valence Dosage:Valence Gender:Dosage:Valence
Sum of Squares 14.68519 3.90741 20.25926 1.03704
Deg. of Freedom 2 2 4 4
Residuals
Sum of Squares 58.77778
Deg. of Freedom 24

Residual standard error: 1.564952
12 out of 24 effects not estimable
Estimated effects may be unbalanced

Stratum 4: Subject:Task:Valence

Terms:
Task:Valence Gender:Task:Valence Dosage:Task:Valence
Sum of Squares 5.38889 2.16667 2.77778
Deg. of Freedom 2 2 4
Gender:Dosage:Task:Valence Residuals
Sum of Squares 2.66667 49.00000
Deg. of Freedom 4 24

Residual standard error: 1.428869
Estimated effects may be unbalanced

Output with the within-subject interactions removed:

Call:
aov(formula = Recall ~ Gender * Dosage + Task + Valence + Error(Subject/(Task *
Valence)), data = data.ex5)

Corrected on Mar. 17; the call should have been:

aov.nwt <- aov(Recall ~ (Gender*Dosage)*(Task+Valence) + Error(Subject/(Task*Valence)), data = data.ex5)

Grand Mean: 15.62963

Stratum 1: Subject

Terms:
Gender Dosage Gender:Dosage Residuals
Sum of Squares 542.2593 694.9074 70.7963 1144.5556
Deg. of Freedom 1 2 2 12

Residual standard error: 9.76625
Estimated effects may be unbalanced

Stratum 2: Subject:Task

Terms:
Task Residuals
Sum of Squares 96.33333 41.66667
Deg. of Freedom 1 17

Residual standard error: 1.565561
Estimated effects are balanced

Stratum 3: Subject:Valence

Terms:
Valence Residuals
Sum of Squares 14.68519 83.98148
Deg. of Freedom 2 34

Residual standard error: 1.571637
Estimated effects may be unbalanced

Stratum 4: Subject:Task:Valence

Terms:
Residuals
Sum of Squares 62
Deg. of Freedom 36

Residual standard error: 1.312335

Note how the sums of squares of the dropped within-subject interaction terms are pooled into the residuals of each stratum: in Stratum 4, for example, the full design's 5.39 + 2.17 + 2.78 + 2.67 + 49.00 = 62 on df 2 + 2 + 4 + 4 + 24 = 36, exactly the residual line above.

Output with the between-subject interaction removed:

Call:
aov(formula = Recall ~ Gender + Dosage + Task * Valence + Error(Subject/(Task *
Valence)), data = data.ex5)

Corrected on Mar. 17; the call should have been:

aov.nbt <- aov(Recall ~ (Gender+Dosage)*(Task*Valence) + Error(Subject/(Task*Valence)), data = data.ex5)

Grand Mean: 15.62963

Stratum 1: Subject

Terms:
Gender Dosage Residuals
Sum of Squares 542.2593 694.9074 1215.3519
Deg. of Freedom 1 2 14

Residual standard error: 9.317234
Estimated effects may be unbalanced

Stratum 2: Subject:Task

Terms:
Task Residuals
Sum of Squares 96.33333 41.66667
Deg. of Freedom 1 17

Residual standard error: 1.565561
2 out of 3 effects not estimable
Estimated effects are balanced

Stratum 3: Subject:Valence

Terms:
Valence Residuals
Sum of Squares 14.68519 83.98148
Deg. of Freedom 2 34

Residual standard error: 1.571637
2 out of 4 effects not estimable
Estimated effects may be unbalanced

Stratum 4: Subject:Task:Valence

Terms:
Task:Valence Residuals
Sum of Squares 5.38889 56.61111
Deg. of Freedom 2 34

Residual standard error: 1.290361
Estimated effects may be unbalanced

Likewise, dropping Gender:Dosage pools its sum of squares into the Stratum 1 residual: 70.80 + 1144.56 = 1215.35 on df 2 + 12 = 14.

[Repost] Why "1+1" always turns out "> 2" in GDP accounting

Quote

Recently, provincial-level governments all submitted their economic report cards for the past year to their local people's congresses. Reporters noticed that the GDP figures calculated by the provinces (autonomous regions and municipalities) for last year added up to 57.69 trillion yuan, 5.76 trillion yuan more than the 51.93 trillion yuan preliminary national GDP for 2012 published earlier by the National Bureau of Statistics, roughly an extra Guangdong's worth of economic output. (China Youth Daily, February 4)

There is a joke in Zhao Benshan's comedy sketch "Selling Crutches": 1+1 equals 3 when you do the math wrong. In China's GDP accounting, however, a similar "joke" is staged almost every year. Ever since GDP began to be calculated separately at the national and local levels in 1985, the sum of the local figures has always exceeded the national total, not only producing a "1+1 > 2" situation (local + local > central) but with a widening gap. For example, in 2009 the sum of provincial GDP exceeded the national figure by 2.68 trillion yuan; in 2010 the provincial sum exceeded it by 3.2 trillion yuan; in 2010 the GDP of the 31 provinces, regions and municipalities exceeded the national total by 3.5 trillion yuan; in 2011 their sum exceeded the national total by 4.6 trillion yuan; and in 2012 the local sum exceeded the national figure by as much as 5.76 trillion yuan.

On this strange "1+1 > 2" loop in the GDP data, the National Bureau of Statistics once explained that the sum of regional GDP exceeds the nationally calculated GDP by so much partly because of duplicate counting, partly because the basic data used at the national and regional levels are not fully consistent, and partly because of external influences. True, duplicate counting and inconsistent sources can produce "miscalculations", but what kind of error amounts to a discrepancy of trillions? And what kind of error recurs year after year, with 1+1 > 2 growing ever more pronounced?

This is clearly no simple "miscalculation"; there are probably structural reasons behind it. To a large extent, the main reason the discrepancy is so striking is that GDP has been given a weighty performance-assessment function: to pass assessments or burnish their records, some localities routinely overstate or outright fabricate their GDP figures, injecting a great deal of "water" into the numbers.

For a long time GDP has been worshipped on a pedestal by officials. Precisely because GDP was given such an outsized assessment role, some localities fell into GDP one-upmanship, set GDP targets divorced from reality, and even resorted to fraud. Although the central government has in recent years shown an intention and a tendency to de-emphasize GDP assessment, as long as no other indicators have been recognized and established, the tie between GDP and officials' career prospects remains very close. To cure local GDP bloat at the root, therefore, reform of the cadre assessment mechanism must be put on the agenda soon, and made standardized, procedural and institutionalized.

In my view, while affirming the role of the GDP indicator, we must also give people's livelihood and public opinion more weight in assessing officials, so that officials step out of the habitual overemphasis on GDP and turn further toward being accountable for livelihood and public opinion. Specifically, the assessment of officials' performance should rest on two points:

First, it should reflect scientific development. The assessment should keep becoming more complete and systematic: it should weigh the speed of development, but weigh the mode and quality of development more, striving to avoid ineffective economic growth such as redundant construction and blind investment; it should weigh economic construction, but weigh more the practical, comprehensive results of maintaining social stability and of safeguarding and improving people's livelihood.

Second, it should reflect livelihood indicators. As a national macroeconomic indicator, GDP is after all rather remote from ordinary people's lives, and its rises and falls are hard to perceive directly in the short run. A locality's price level, incomes and living environment, by contrast, are things people genuinely feel, and the public naturally has the most say on how officials perform in these respects. In a political context of governing for the people, only when public satisfaction is tied directly to officials' performance records can popular evaluation exert real power, can the people truly be the masters, and can the man-made "water" be squeezed out of GDP accounting.

(Deng Ziqing)

Stroop effect - different approaches to multiple comparison (2)

Compared with exhaustive post-hoc tests, selective a priori (planned) tests are the wiser choice.

Implementation in R: the multcomp::glht(...) function

The two questions listed in the earlier post are examples of selective planned tests in the Stroop experiment.

** To do before starting

Arrange the data into the standard long format that could be imported directly into SPSS (glht does not accept composite tables inside a data frame).

Personal favor: tidy the data up in Excel first and copy it with Ctrl+C; R can read Excel data straight from the clipboard, which I find a bit more convenient than opening Rcmdr and importing the data into a database.
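A minimal sketch of what the glht() call can look like for the first of those two questions (testing A, B and D against the neutral set C). The data frame stroop and its columns rt and cond are names I am assuming for illustration, not names from my actual data:

library(multcomp)
stroop$cond <- factor(stroop$cond)   # card sets, levels A, B, C, D
fit <- aov(rt ~ cond, data = stroop)
# planned contrasts: each of A, B, D against C
K <- rbind("A - C" = c(1, 0, -1, 0),
           "B - C" = c(0, 1, -1, 0),
           "D - C" = c(0, 0, -1, 1))
summary(glht(fit, linfct = mcp(cond = K)))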



After that, compare groups A, B and D from the second experiment with groups A, B and D from the first.



 

Stroop effect - different approaches to multiple comparison

Experiment overview:

Materials: 16 cards in total, presented in random order during the experiment.

Set A (ink color and word congruent): the character "红" (red) printed in red (A1), "黄" (yellow) in yellow (A2), "蓝" (blue) in blue (A3), "绿" (green) in green (A4); 4 cards.

Set B (ink color and word conflicting): "红" (red) printed in green (B1), "黄" (yellow) in blue (B2), "蓝" (blue) in yellow (B3), "绿" (green) in red (B4); 4 cards.

Set C (ink color and word unrelated): "我" printed in red (C1), "爱" in yellow (C2), "中" in blue (C3), "华" in green (C4); 4 cards.

Set D (semantically unrelated to color but homophonic with color words): "洪" printed in green (D1), "皇" in blue (D2), "拦" in yellow (D3), "滤" in red (D4); 4 cards.

In the first experiment, subjects judged the ink color by button press; in the second, they judged the ink color while reading the character on the card aloud. Reaction time was measured in both experiments.

The questions involving multiple comparison are:

1. Test whether sets A, B and D each differ significantly from set C.

2. Compare each of sets A, B and D in the first experiment with the corresponding set in the second experiment.

The simplest multiple-comparison approach is to split the significance level and run several t tests; the most conservative is a Bonferroni post-hoc analysis.
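For the Bonferroni route, base R already has a one-liner. A sketch reusing the assumed names (stroop with columns rt and cond) from the glht sketch in the post above:

# all pairwise t tests with Bonferroni-adjusted p values
pairwise.t.test(stroop$rt, stroop$cond, p.adjust.method = "bonferroni")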