Fifth Nordic Conference on Mathematical Statistics
Mariehamn, Åland

S. Mustonen
Department of Statistics, University of Helsinki

ON INTERACTIVE STATISTICAL DATA PROCESSING

1. Introduction

The impact of automatic data processing has in recent years been enormous also in the field of statistics. In statistical research it is not enough to solve purely theoretical problems. Clear computational results are equally important. Although complicated mathematical models are applied in statistical analysis it does not imply that the problems of statistical data processing are only mathematical and related to statistical theory and numerical analysis. In statistical computing the knowledge of various fields of computer science and system analysis is essential as well.

The major efforts in statistical data processing have been concerned with the problems of data analysis. There are many large collections of programs available for the purpose. We feel, however, that in spite of these ingenious program packages many statisticians are not quite satisfied with the present situation. There are several reasons for this dissatisfaction:

1) Many of the program collections are really like 'canned packages'; they are simple to use in standard applications, but it is almost impossible to see what is really inside this 'package' and how to open it. Thus it is difficult to study the internal structure of the programs and make alterations whenever needed. The rigidity of these packages also restricts their use for teaching purposes.

2) Many statistical programs are often too automatic or they are automatic in the wrong places. After entering some initial information into the computer the user cannot but wait for the final results without any possibility to intervene. Thus even when there is a slight error in the initial information the whole process goes through and must then be restarted. It is also typical, for instance, that a program for linear regression analysis selects the regressors automatically, but when the residuals are to be plotted the user has to declare each tiny detail in order to have a decent graph. So in the worst cases the roles of the statistician and the computer have changed and the user seems to be controlled by the system and not vice versa.

3) There are situations where a statistical program may be quite satisfactory, but everything is spoiled by an inadequate operating system. For instance, in a time sharing environment strongly varying response times of the computer system may totally ruin a well designed interactive approach.

4) Many statistical packages are good for their special task, but they are too restrictive. A multistage research process cannot be carried out as a whole, but some steps in the process must be done by other means. Working with several, badly synchronized programs may be very frustrating.

5) It is common when writing a report containing numerical tables that the computer printout cannot be used as such, but the results have to be retyped manually. This may happen even if the computer output is well designed, since the needs of the user may change during the reporting phase. Only when the statistical system includes text editing facilities this offers no problems.

In general, there should be a tendency to move away from isolated packages and individual programs towards statistical operating systems which cover all the activities in the field of statistical computing in a unified form. A statistical operating system can be considered an enlargement of a normal operating system having the typical statistical operations among its constituents. In this way the user has total support from the computer to the various needs in statistical computing.

There have also been proposals for statistical programming languages. Special languages and codes are useful in restricted areas of statistical computing (as in simulation). In general, however, it is hardly possible to proceed in this direction, since there do not exist simple ways of expressing statistical operations in unified form.

One practical solution is an interactive statistical operating system where a natural language like English forms the essential link between the system and the statistician.

Some statisticians and computer specialists seem to be rather suspicious of the possibilities of interactive computing. There prevails, however, a strong agreement about the merits of interactivity in exploratory data analysis, especially with small data sets. But when working with large samples and using more sophisticated techniques the opinions are divided. For instance, Nelder (1978) says that "For larger problems interactive working may be less important, because the response time of the user to his intermediate results becomes the limiting factor in the analytical process". This is an interesting statement, since people usually report the experience that it is the response time of the computer which actually limits interactive working.

We feel, however, that, in principle, there are no restrictions in using interactive approach to all kinds of problems. In practice the limit of profitableness is continuously moving in favour of interactivity.

Although not everything pays off yet at this stage with interactive means, it is worthwhile to study this alternative also in more complicated tasks, since the rules of making good interactive software are not the same as in batch processing and it takes time to learn this new attitude. In recent years, it has been quite common to modify existing program packages into more interactive form, but we feel that this is not the best way to proceed. True interactivity needs and deserves another starting point.

This does not imply that in the future everything should be solved by interactive systems. The real needs, tastes and working habits of statisticians are extremely varying. So it is impossible to think that the progress will lead to some unique solution. We must continuously have several alternatives for different tasks.

In the Department of Statistics at the University of Helsinki we have studied various forms of interactive computing. Therefore this presentation will be devoted mainly to the direction we have chosen.

2. Principles of SURVO 76

In order to give a more precise account of the possibilities of an interactive statistical operating system we shall describe SURVO 76 which has been developed for the small desktop computer Wang 2200. This system has an early predecessor SURVO 66 which was the first general purpose statistical package in Finland and had many of the features now common in statistical systems (Alanko, Tienari, Mustonen 1968). However, in order to achieve true interactivity, only a minor part of the properties of this first SURVO has been accepted in SURVO 76.

It cannot be claimed that SURVO 76 is a statistical operating system in the true sense, since it is not a part of the basic operating system of the computer. We think, however, that many important aspects of such a statistical system can be illustrated using SURVO 76 as an example. In SURVO 76 we have tried to test various approaches covering a wide range of activities in statistical computing as a "laboratory experiment" in order to learn more about the rules of interactive work.

The SURVO 76 system has been intended to meet especially the needs of statisticians in both teaching and research work and its aims are slightly different from those of conventional statistical packages generally available for data analysis. In a certain sense the scope of SURVO 76 is wider, permitting extended possibilities for data and text editing, simulation, matrix computations and graphical analysis.

Our main goal has been to provide suitable tools for a statistician who likes to have a quick test of his research ideas by making a computational experiment. Usually such an experiment reveals that the idea was silly, but when we learn this fact in a few minutes or hours instead of wasting several days, our whole research process will be speeded up considerably.

SURVO 76 is at present a rather large system consisting of about 60 statistical programs and subsystems (SURVO 76 modules) and the total volume is almost 1 million bytes of program text. Formally SURVO 76 is a single program written in the extended BASIC language (BASIC-2) of Wang 2200VP. This may be a surprise for those who have been told that BASIC is an elementary language meant for simple tasks and short programs only. This is of course true for the original BASIC, but the various extensions in BASIC-2 have removed many of the drawbacks and there are no severe obstacles for making large programs. Even in this extended form BASIC is lacking many features existing in more sophisticated languages, but they need not be so important. We have a feeling that the importance of the programming language can be exaggerated by "computer specialists" who do not actually know the practical needs of programmers.

Discussion about the relative merits of various languages in statistical computing seems often to be on a wrong basis. An interpretative language like BASIC is, of course, inefficient with respect to computing time, but in an interactive mode of working this is seldom a real harm. On the other hand the possibility to make alterations in the programs rapidly without extra system commands and program compiling improves and speeds up both advanced use of the system and system development. We believe that the future technical progress will still increase the relative merits of the interpretative languages in statistical computing.

Portability (i.e. the possibility to use a program package easily in different machines) is another feature which has been emphasized when evaluating statistical programs. It is easy to agree, but we think again that even this property has been exaggerated. The truth is still at the moment that when one likes to create a universal solution working in all important computers very little can be realized without huge extra labor, time and costs. It is too restrictive to think in terms of an intersection of all the available alternatives. If we like to make progress in the area of statistical computing we must start from rather specialized computers having properties which we hope are common in the nearest future. It is the ideas which are portable and the computers which should be portable.

SURVO 76 is an interactive system and no special job describing language or code is needed. Using this system is like discussing with the computer; we speak about SURVO 76 conversations. The discussion is transmitted from the system to the user by a CRT display (speed is almost 5000 characters/sec.) and from the user to the system by a keyboard having also "soft keys" for various control tasks. For a more precise and detailed output a line printer, a graphic CRT and/or a plotter are available.

The possibility for rapid interchange of information between the user and the system is one cornerstone in a true interactive statistical system. It is also important that this property has been adopted in such a way that the user can instantly reach any part of the data to be analyzed for inspection. Equally important is a rapid access to the different modules of the statistical system to get an idea of how the system works and to make temporary modifications and enlargements to the modules.

Due to interactivity a user knowing the main principles of statistical computing can learn to use SURVO 76 by just starting to use it without any detailed instructions. No programming experience is necessary in standard application of SURVO 76 but in more advanced use command of BASIC and main construction principles of SURVO 76 is essential.

Even interactive systems are sometimes frustrating since they may in their own gentle way compel the user to a long unproductive conversation without a natural exit. In SURVO 76 this dependence is avoided by splitting the programs into a lot of small modules. When the user becomes exhausted with a certain module he can interrupt the conversation and call any of the neighbouring modules by pressing one single key on the keyboard, without losing contact with the previous stages of the job.

It is evident that many statisticians do not like to think in terms of computer programs. They prefer carrying out their computations and data manipulations in minor steps in the order they like. These preferences have been taken into consideration in the SURVO 76 system which can in many respects be operated like a desk calculator with very powerful keys.

On the Wang 2200 keyboard there are 32 special function keys (denoted by F0, F1, ..., F31) which can be defined as starting points for different parts of the program. In SURVO 76 the functions of these "soft keys" vary depending on the module in use. The user not knowing which F-key to press next can always resort to key F0 which in SURVO 76 displays on the CRT the functions of other F-keys operative in the present situation.
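
The soft-key arrangement described above can be pictured as a small dispatch table per module. The following Python sketch is only an illustration under stated assumptions: the function `make_module`, the module name `DEMO` and the two key assignments are hypothetical and are not taken from SURVO 76, which is written in BASIC-2.

```python
def make_module(name, keys):
    """Return a key handler for one module.

    `keys` maps an F-key number (1..31) to a (description, action) pair.
    Both the structure and the names are assumptions made for illustration.
    """
    def press(number):
        if number == 0:
            # F0 lists the keys that are operative in the present situation.
            return [f"F{k}: {keys[k][0]}" for k in sorted(keys)]
        if number in keys:
            # Any other defined key starts the corresponding part of the module.
            return keys[number][1]()
        return f"F{number} is not defined in module {name}"
    return press


# Hypothetical module with two F-starts.
demo = make_module("DEMO", {
    1: ("means and standard deviations", lambda: "computing means"),
    5: ("print results", lambda: "printing"),
})
```

Calling `demo(0)` returns the list of currently defined keys, while `demo(5)` runs the action bound to F5. A different module would simply supply a different table, which is the sense in which the same physical keys change meaning from module to module.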

Each F-start leads typically to a sequence of questions made by the system and these have to be answered by the user. The whole dialogue is displayed on the screen and this procedure allows the system to give the user many comments and hints relevant in the context without any waste of time and paper. In order to speed up the conversation SURVO 76 itself volunteers with a suggestion for an answer which is displayed after the question. To give reasonable suggestions SURVO 76 tries to remember the previous actions of the user or even to guess what he might attempt next. If the user agrees with the suggestion of SURVO 76 it is enough to press the RETURN key. Otherwise he must type his own answer.

Each interchange of questions and answers leads eventually to a series of different actions and computations. The results are printed on the CRT.
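
The question-and-suggestion dialogue can be sketched in a few lines. This is a minimal sketch, assuming that the suggestion is simply the user's previous answer to the same question; how SURVO 76 actually forms its guesses is not specified here, and the names `ask`, `memory` and `read_line` are hypothetical.

```python
def ask(question, memory, read_line=input, fallback=""):
    """Ask a question, showing a suggested answer in brackets.

    An empty reply (just pressing RETURN) accepts the suggestion.
    Assumption: the suggestion is the previous answer to the same question.
    """
    suggestion = memory.get(question, fallback)
    reply = read_line(f"{question} [{suggestion}] ").strip()
    answer = reply if reply else suggestion
    memory[question] = answer  # remember the answer for the next conversation
    return answer
```

In interactive use one would call `ask("Variable to be analysed?", memory)` with a dictionary `memory` that lives as long as the session. The `read_line` parameter only exists so that the dialogue can be driven by a script instead of a keyboard.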

When the computations are finished the user can select another F-start or another module. Certain F-starts are reserved for moving the results just obtained from the screen to the printer or for saving them on disk as intermediate results for subsequent analysis with other modules.

The modules performing various statistical analyses can co-operate and use the same original data files or intermediate results without any modifications whenever this is statistically reasonable. Each statistical method in SURVO 76 has been split into small sub-modules and the various computations and data manipulations can be carried out by combining the corresponding F-starts properly. Hence it is the user's responsibility to make good choices. It would, of course, be easy to connect different submodules in a fixed "right" order, but then the user would be at the mercy of the system which is the undesirable feature of some statistical packages.

The possibility to select different combinations of actions quite freely means that the user can employ the system in a creative manner and not only by repeating traditional computation chains. It also means that the user must know in advance a great deal of the method he likes to use, but not much of data processing in general. We think that easy use in connection with statistical programs must not imply that they could be used "easily" without any knowledge of statistics. There are nowadays plenty of rigid "automatic" statistical programs which can be mechanically operated by anybody, but this at the same time is a source for uncritical application of statistical methods.

3. [title illegible]

An ideal configuration for SURVO 76 is at the moment a Wang 2200VP having a central processing unit with a memory of at least [size illegible], a CRT display 24x80, a dual floppy disk drive, a printer, a graphic CRT and a plotter. Observe that the BASIC-2 interpreter and the operating system are in a separate control memory of ca. [size illegible]. When the SURVO 76 system is in use one of the disk drives is reserved for the SURVO 76 program disks and another is for the user's data and possible additional programs. Any of the disks can be changed in a few seconds whenever necessary.

The system consists of a central module and various statistical and special modules, one of which at a time can be in use together with the central module.
The central module takes care of the co-operation between the different statistical modules and it contains system subroutines, e.g. for data transfers between the central and the disk memory. Thus the user needs never worry about the location of the data during the computations.

The number of SURVO 76 modules is not in any way limited. New modules for simple data analysis can be generated even in an interactive mode by consulting a half prepared module FRAME. Employing FRAME to build up a new module guarantees that the module will be compatible with the requirements of the SURVO 76 system.

SURVO 76 contains several modules for statistical data analysis. When beginning to develop the system the most traditional and elementary forms of analysis were emphasized and they gave a natural basis for the system. Now the development has been directed towards more sophisticated and computationally demanding methods. The system includes modules, e.g. for following activities:
- basic statistics,
- frequency distributions and tables,
- data sorting, order statistics,
- statistical tests and tables,
- linear and nonlinear regression analysis,
- multivariate methods,
- cluster analysis,
- time series analysis.
Several non-standard methods are also available, samples with missing values can be treated and techniques for detecting outliers and for robust estimation are included.

The problems of data manipulations and transformation have received special attention. There are standard modules to cover the activities in this field and they make the system self-contained. The newest contribution to data management in SURVO 76 is a general purpose editing program. It is connected to the statistical modules and makes possible text editing and various report generating activities with numeric and alphanumeric data and results.

One of the basic principles in SURVO 76 is that any potentially important observations and intermediate results can be used in subsequent computations without extra modifications of the system and the data. We thus have uniform representations for various data structures. SURVO 76 allows both variables and observations to be labelled with alphanumeric names. This makes the results more readable and the monitoring of the computations easier.

Each module is supposed to record continuously on the CRT what it is doing. For example, when observations are processed the system displays the names of the observations. It is not necessary that the user has time to read all that is shown on the CRT; usually a crude impression is enough for monitoring. But when something unexpected seems to happen it is possible to stop the information flow on the screen and see what really is going on. If necessary the output rate can be slowed down to a normal reading level.

4. Special forms of interactivity

Some interactive approaches used in the SURVO 76 system will now be described, although we know that it is rather difficult to explain these dynamic properties without actual working with the system.

4.1. [title illegible]

In
SURVO 76 typical statistical graphs like histograms, scatter diagrams and plots of time series combined with analytical curves and surfaces can be produced interactively with the graphic CRT and plotter. Special graphs like Andrews' function plots and Chernoff's faces are also available.

SURVO 76 takes care of the scaling of the variables if desired and selects appropriate notations on the co-ordinate axes thus relieving the user of those nuisances.
On the other hand the user has a free choice in many really important matters. For instance, when plotting scatter diagrams any nonlinear scale on the axes can be defined by entering the equation of the corresponding scale transformation or by selecting it from certain standard alternatives. For example, various probability papers may be specified in this way.

It is essential that the user can employ various plotting modules one after another for the same picture to combine graphs. It may be useful to have, for instance, several related time series in the same picture. Likewise, after making a scatter diagram the user may estimate various models and return to plot the fitted curves on the same graph.

The graphs also have an important role in the preliminary investigation of the data. In SURVO 76 interactive techniques are available for detecting outliers by graphical means. It is typical that when, for instance, a scatter diagram is displayed on the CRT the user can point at any observation with the cursor and find the name of the observation simply by pressing key "?". The same search procedure applies in the display of the Mahalanobis' distance distribution when using the module CORROBU, intended for robust estimation of means, standard deviations and correlations along a modification of the technique presented in Gnanadesikan (1977). In addition, the user can point at the rejection threshold for the outliers with the cursor.
Using this interactive technique iteratively we have reached promising results.

In an interactive environment it is possible to revive techniques which have been difficult to computerize before. The problem of rotation in factor analysis is a good example. When the rotation is carried out with a computer without the possibility of instant graphical displays the criteria for suitable rotation have to be modified to a blind analytic form. Many analytic rotation programs give good results in standard applications, but they are rather insensible to the special needs of the user. In our system the factor rotations are performed graphically and stepwise on the CRT, but the user can also employ some analytic criteria as advice for each step.

4.2. Matrix operations

In many desk computers various arithmetic operations can be performed and results displayed just by operating the machine like a normal calculator. To a certain extent this also applies to matrix computations. We feel, however, that these standard operations as such are not sophisticated enough for the multifarious computational needs of statisticians. It is often desirable to have an opportunity to continue certain computations manually after the standard routines have been performed. For this purpose SURVO 76 contains a special subsystem called MATRI.

With MATRI the typical matrix operations needed in statistics can be performed using the computer like a calculator. In MATRI the "soft" keys are defined for various matrix operations. The matrices required as an input can be keyed in manually (usually by filling a form with proper dimensions and labels on the CRT) or transferred from different SURVO 76 files.
Results can be saved in special matrix files for later operations.

An essential feature of MATRI is that it does a lot of bookkeeping and labels each result with a name corresponding to the ordinary matrix notation. The columns and rows in matrices can also be labelled with names and these names will be moved in MATRI operations along certain rules.
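
The bookkeeping idea can be illustrated with a small labelled matrix type. This is a sketch under assumptions: the rules used below (a transpose swaps row and column labels, a product takes its row labels from the left factor and its column labels from the right factor, and result names are built as `X'` and `A*B`) are plausible guesses and not the documented MATRI rules; the class name `LabelledMatrix` is hypothetical.

```python
class LabelledMatrix:
    """A matrix that carries a name and row/column labels through operations."""

    def __init__(self, name, rows, row_labels, col_labels):
        assert len(rows) == len(row_labels)
        assert all(len(r) == len(col_labels) for r in rows)
        self.name = name
        self.rows = [list(r) for r in rows]
        self.row_labels = list(row_labels)
        self.col_labels = list(col_labels)

    def transpose(self):
        # Assumed rule: labels follow their rows and columns.
        return LabelledMatrix(
            self.name + "'",
            [list(col) for col in zip(*self.rows)],
            self.col_labels,
            self.row_labels,
        )

    def multiply(self, other):
        # Assumed rule: rows are named by the left factor, columns by the right.
        assert len(self.col_labels) == len(other.row_labels)
        product = [
            [sum(a * b for a, b in zip(row, col)) for col in zip(*other.rows)]
            for row in self.rows
        ]
        return LabelledMatrix(
            f"{self.name}*{other.name}", product,
            self.row_labels, other.col_labels,
        )


# Hypothetical data matrix: three observations, two variables.
X = LabelledMatrix("X", [[1, 2], [3, 4], [5, 6]],
                   ["obs1", "obs2", "obs3"], ["height", "weight"])
XtX = X.transpose().multiply(X)
```

Here `XtX` ends up with the name `X'*X` and with the variable names on both its rows and columns, so a cross-product matrix remains readable without any extra labelling by the user.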

The user can also define extra operations and make simple matrix programs (MATRI chains) by just carrying out a sequence of matrix operations and this sequence can be repeated automatically with other input matrices. These MATRI chains can be saved on disk and used in connection with other MATRI operations when needed.

4.3. Random data simulation

In methodological work and in teaching situations it is useful to analyze artificial random data whose origin is perfectly known. The planning of such experiments can be substantially facilitated by employing the module CHANCE which is a random data generator. The user has to type the statements needed to generate a typical observation according to the advice given by CHANCE. For this task, several subroutines are immediately available to generate pseudo random variates from various distributions. Thus it is easy to construct random data according to a given statistical model. The simulated files can subsequently be treated as ordinary data files in SURVO 76.

Using CHANCE the behaviour of different sample distributions can also be demonstrated on the CRT. The user selects the distribution and its parameters and CHANCE starts to generate and plot observations on the CRT one after another as a constantly growing histogram.

4.4. Testing of statistical hypotheses

As an example of the use of interactivity in simple statistical inference let us consider the technique used in the SURVO 76 module TABTEST. A typical display on the CRT during a TABTEST run is the following:

FREQUENCY TABLE
[2 x 4 table of frequencies; cell counts illegible]
X2= 9.33   DF= 3   P=0.02489   (CHI SQ. APPROXIMATION)
CASE 2: ONLY ROW TOTALS FIXED
REPLICATES   CRITICAL LEVEL P   S.E. OF P
   500           0.00800         0.00398
X2 IS SIGNIFICANT AT THE 1 % LEVEL WITH PROBABILITY 0.69217
TO STOP THE SIMULATION, PRESS RETURN(EXEC)

The user has started this job by entering 2 samples of 5 observations in the form of a 2x4 frequency table and the goal of this analysis is to decide whether these samples are from the same population. For this purpose TABTEST has computed the common X²-value 9.33 and indicates that its critical level is P=0.0249 according to the chi-squared approximation. We know, however, that in case of few observations this approximation may be rather poor and the exact distribution of X²-statistic should be used instead.

Nowadays it is typical to construct tables for complicated tests by numerical methods and simulation. Here, however, we are using simulation in a slightly different way. TABTEST does not consult any ready made tables, but tries to find the true critical level just for the case presented. After the user has specified the null hypothesis (here CASE 2: ONLY ROW TOTALS FIXED) TABTEST immediately starts to estimate the critical level by generating random samples according to the null hypothesis, forms the corresponding tables, computes the X²-value and the proportion of those tables for which X² exceeds the value 9.33 in our case. This proportion P will then approximate the true critical level. The underlined numbers in the display are changing during the simulation experiment and the user can watch the process as long as he likes.
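
The 'instant simulation' of a critical level can be sketched as follows. This is a minimal Python sketch and not the TABTEST program itself: it assumes that, with only the row totals fixed, each sample is drawn independently from common column probabilities estimated from the pooled table; the function names and the example table are hypothetical (the cell counts of the original display are not legible). The last function uses the normal approximation to give the probability that the true level lies below a standard level such as 1 %.

```python
import math
import random


def chi_squared(table):
    """Pearson X^2 of a table of counts; cells with zero expectation are skipped."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    x2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                x2 += (observed - expected) ** 2 / expected
    return x2


def simulate_critical_level(table, replicates=500, seed=1):
    """Estimate P(X^2 >= observed X^2) with only the row totals fixed."""
    rng = random.Random(seed)
    observed_x2 = chi_squared(table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # Assumption: common column probabilities are estimated from the pooled data.
    weights = [c / n for c in col_totals]
    columns = range(len(col_totals))
    exceed = 0
    for _ in range(replicates):
        simulated = []
        for total in row_totals:
            counts = [0] * len(col_totals)
            for j in rng.choices(columns, weights=weights, k=total):
                counts[j] += 1
            simulated.append(counts)
        # A small tolerance guards against rounding when X^2 values tie.
        if chi_squared(simulated) >= observed_x2 - 1e-9:
            exceed += 1
    p = exceed / replicates
    se = math.sqrt(p * (1 - p) / replicates)
    return observed_x2, p, se


def prob_below(p, se, level):
    """Normal-approximation probability that the true level is below `level`."""
    if se == 0:
        return 1.0 if p < level else 0.0
    return 0.5 * (1 + math.erf((level - p) / (se * math.sqrt(2))))


# Hypothetical 2 x 4 table with row totals 5 and 5.
example = [[1, 4, 0, 0], [0, 1, 3, 1]]
x2, p, se = simulate_critical_level(example, replicates=500)
```

The estimate `p` and its standard error `se` correspond to the two changing columns of the display; running more replicates shrinks `se` roughly in proportion to the square root of their number, which is why a crude answer is available almost at once and a sharper one only requires watching longer.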

Since P is approximately normal with mean equal to the true critical value, TABTEST displays also the probability for this estimate to go below the nearest standard level (1 % in this case). Usually it is not necessary to know the exact P-value, but a crude approximation is sufficient for practical purposes. Here it takes only a few seconds to obtain the display above and it reveals that the original chi-squared approximation seems to be rather conservative.

In SURVO 76 this 'instant simulation' approach has been used for various nonparametric tests and even Fisher's randomization principle becomes applicable for quite reasonable sample sizes. For instance, the SURVO 76 module COMPARE includes the Fisher-Pitman randomization test for comparing two independent samples. (For the definition of this test see, for instance, Conover 1971, pp. 357-364). The exhaustive enumeration of critical combinations needed for the traditional approach is formidable already for sample sizes 15 and 20, but 'instant simulation' usually gives satisfactory results without delay.

4.5. Program modifications in advanced use

Interactivity offers
many benefits for those users who like to modify existing programs temporarily for their special tasks. When the programming language is interpretative this is especially profitable, since alterations can be made as a part of the conversation even when running the program.

In SURVO 76 this approach is already adopted in some standard operations. For instance, specification of new transformed variables is carried out by inserting the transformation statements in the program according to instructions given by the system. Although this procedure presupposes rudimentary programming skills we have found it powerful compared with the normal convention of presenting lists of codes for specific standard alternatives.

In some other activities in SURVO 76 we do have such a list, but at the same time there is an option for a general user-defined approach. For example, in the module HISTO for plotting histograms and fitting univariate frequency distributions by theoretical models, the theoretical distribution can be selected among 5 alternatives or defined by the user quite freely by entering the equation of the corresponding density. (In fact, the kernel of the density up to a constant factor is sufficient, since HISTO takes care of scaling the integral to 1). The density function may include unknown parameters and before the fitted density is plotted on the histogram and the goodness-of-fit tests are performed, these parameters will be automatically estimated by HISTO using the maximum likelihood method. This procedure has proved to be useful even in estimating truncated and mixed distributions.
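
The idea of fitting a density given only by its kernel can be sketched as below. This is a hedged illustration, not HISTO's algorithm: the numerical methods (trapezoidal integration for the scaling constant and a golden-section search for a single parameter) are choices made for this sketch, the log-likelihood is assumed to be unimodal on the search interval, and all function names are hypothetical.

```python
import math


def normalizing_constant(kernel, theta, lower, upper, steps=2000):
    """Integral of kernel(x, theta) over [lower, upper] by the trapezoidal rule."""
    h = (upper - lower) / steps
    total = 0.5 * (kernel(lower, theta) + kernel(upper, theta))
    for i in range(1, steps):
        total += kernel(lower + i * h, theta)
    return total * h


def log_likelihood(kernel, theta, data, lower, upper):
    """Log-likelihood of the density kernel / constant, scaled to integrate to 1."""
    c = normalizing_constant(kernel, theta, lower, upper)
    return sum(math.log(kernel(x, theta)) for x in data) - len(data) * math.log(c)


def fit_one_parameter(kernel, data, lower, upper, theta_lo, theta_hi, iterations=60):
    """Golden-section search for the theta that maximizes the log-likelihood."""
    ratio = (math.sqrt(5) - 1) / 2
    a, b = theta_lo, theta_hi
    for _ in range(iterations):
        c = b - ratio * (b - a)
        d = a + ratio * (b - a)
        if (log_likelihood(kernel, c, data, lower, upper)
                < log_likelihood(kernel, d, data, lower, upper)):
            a = c  # the maximum lies to the right of c
        else:
            b = d  # the maximum lies to the left of d
    return (a + b) / 2


def exponential_kernel(x, theta):
    """Kernel of an exponential density; the constant factor is left out on purpose."""
    return math.exp(-theta * x)


# Hypothetical observations, fitted on an interval wide enough to act as untruncated.
sample = [0.5, 1.0, 1.5, 2.0]
theta_hat = fit_one_parameter(exponential_kernel, sample, 0.0, 50.0, 0.1, 5.0)
```

Because the scaling constant is recomputed over the stated interval for every trial value of the parameter, replacing the upper limit 50.0 by, say, 2.5 fits the same kernel as a truncated distribution without any other change, which is one way to read the remark about truncated distributions above.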

4.6. Text processing in connection with data analysis

We have pointed out that it may be frustrating for a statistician to retype the computer output manually to reach a form suitable for final reporting. We can, of course, have highly specialized systems for text processing, but usually they are not directly connected to statistical programs. To lessen the burden for a statistician in the report writing stage we have tried to develop an editor program as an integrated part of our system. This editor can be used not only for normal text processing purposes, but also for input of data in an unformatted form, for transferring data into SURVO 76 files and for editing SURVO 76 files and results together with normal text, by using powerful editing operations. These operations are for instance:
- to make up the text to a certain line length,
- to transform and edit numeric tables (new columns and rows can also be inserted by using numeric transformations),
- to carry out alphanumeric sorting of data,
- to print out selected parts of the text on the printer.

All the information is represented in an 'edit field' which consists, for example, of 100 columns and 250 rows. The field is always partially visible on the CRT. The editing operations are also typed in this field and they can be treated as normal text. Any operation can be activated by moving the cursor to the corresponding line and by pressing key CONTINUE. Whenever needed the contents of the edit field (tables, text and operations) can be saved in an edit file.
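
The principle that operations live in the edit field as ordinary text and are activated from the cursor line can be sketched like this. The command words `SORT` and `FILL` and their argument order are hypothetical; the actual operation syntax of the SURVO 76 editor is not given in the text, and a Python list of strings stands in for the fixed-size field.

```python
import textwrap


def activate(field, cursor_row):
    """Run the operation typed on line `cursor_row` (1-based) of the edit field.

    Hypothetical operations:
      SORT first last        sort lines first..last alphanumerically
      FILL first last width  make up lines first..last to the given line length
    Any other line is ordinary text and leaves the field unchanged.
    """
    words = field[cursor_row - 1].split()
    if not words or words[0].upper() not in ("SORT", "FILL"):
        return field
    args = [int(w) for w in words[1:]]
    if words[0].upper() == "SORT":
        first, last = args
        block = sorted(field[first - 1:last])
    else:
        first, last, width = args
        block = textwrap.wrap(" ".join(field[first - 1:last]), width)
    # The operation line itself stays in the field and can be activated again.
    return field[:first - 1] + block + field[last:]


# Hypothetical edit field: three data lines followed by an operation line.
field = ["pear", "apple", "fig", "SORT 1 3"]
sorted_field = activate(field, 4)
```

Since the operation line is kept as text, it can be edited and activated again, or saved together with the tables and text it refers to, which mirrors the statement that operations can be treated as normal text.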

It seems quite natural to extend editing operations towards normal statistical operations and this will be a new form of interactive statistical computing which covers the final documentation as well.

4.7. Documentation

Documentation is not only important for the results of a statistical analysis; it is equally important for the statistical programs, since a program without a decent description is often rather worthless.

In interactive systems the program text itself contains so much information concerning the discussion with the user that mere lists of the programs are helpful. Thus a user knowing the main constructing principles of SURVO 76 can find much information just by listing parts of programs on the CRT or on paper. In addition, non-standard activities will be declared to the user during the conversation. For some more comprehensive topics special interactive teaching programs are included. It is assumed that in ambivalent situations the user has courage to find his way by trial and error.
SURVO 76 is not an easy system in the sense that it does everything automatically for the user. On the contrary, it assumes that the statistician makes his own decisions and takes initiatives. On the other hand, this type of system offers information and suggestions to support the decisions. There are statisticians who love to work on this basis, but there are also people who find it difficult or too
vague.
Although we have normal program descriptions of SURVO 76, they cannot tell all essentials, since paper is too rigid a medium for the dynamic aspects. Therefore we have tried to compose automatic demonstration programs which contain ready-made SURVO 76 conversations between the system and a fictitious user. The user can watch these conversations like a TV program, but he can also break the conversation and continue in his own
fashion. This dynamic documentation approach seems to be fruitful also in teaching statistical methods. In theoretical and applied research work this type of documentation will obviously be of considerable support, and it could even offer an alternative to a traditional research paper.
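The demonstration programs described above replay a stored conversation until the viewer decides to break it. A minimal Python sketch of that control flow is given here. It is not taken from SURVO 76: the function `replay`, the callback `wants_to_take_over` and the dialogue lines are all hypothetical.

```python
def replay(script, wants_to_take_over):
    """Yield (speaker, text) pairs of a stored conversation, one step at a time.

    `wants_to_take_over(step)` is asked before each step; when it returns
    True the replay stops, leaving the viewer free to continue on his own.
    """
    for step, (speaker, text) in enumerate(script):
        if wants_to_take_over(step):
            return
        yield speaker, text


# A fictitious ready-made conversation (contents invented for illustration).
script = [
    ("system", "Name of the data file?"),
    ("user", "ATHLETES"),
    ("system", "Variables to analyse?"),
    ("user", "WEIGHT SHOTPUT"),
]

# The viewer watches the first two steps and then breaks the conversation.
watched = list(replay(script, lambda step: step == 2))
```

Writing the replay as a generator keeps the stored conversation and the decision to interrupt separate, so the same script can be watched to the end or abandoned at any step.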
APPENDICES:
1. Graphics with SURVO 76
2. List of SURVO 76 modules
APPENDIX 1
GRAPHICS WITH SURVO 76
Fig. 1. Density function of a two-dimensional normal distribution (plotted by a SURVO 76 module).
Fig. 2. A sample from a two-dimensional normal distribution (plotted by DIAGRAM). Confidence ellipses are plotted by CURVE. Axes: X, Y.
Fig. 3. Density functions of a normal distribution for several values of sigma, among them 0.5 and 1 (plotted by DIAGRAM and CURVE).
Fig. 4. Distributions of the order statistics of a sample (N=30) from a uniform distribution, shown as beta densities (plotted by DIAGRAM and CURVE).
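The beta densities in this figure follow from a standard result: for a sample of size N from the uniform distribution on (0, 1), the k-th order statistic has the Beta(k, N - k + 1) density. The short Python sketch below evaluates that density and checks it numerically. It is an illustration, not SURVO 76 code, and the names `order_stat_density` and `integrate` are assumptions of this example.

```python
from math import comb


def order_stat_density(x, k, n=30):
    """Density at x of the k-th order statistic of n uniform(0, 1) variables.

    This is the Beta(k, n - k + 1) density,
    n! / ((k - 1)! (n - k)!) * x**(k - 1) * (1 - x)**(n - k),
    where the constant is written as k * C(n, k).
    """
    return k * comb(n, k) * x ** (k - 1) * (1 - x) ** (n - k)


def integrate(func, steps=10000):
    """Midpoint rule on (0, 1); accurate enough for a sanity check."""
    h = 1.0 / steps
    return sum(func((i + 0.5) * h) for i in range(steps)) * h


# Each density integrates to 1, and the k-th mean is k / (n + 1).
total = integrate(lambda x: order_stat_density(x, 1))
mean_15 = integrate(lambda x: x * order_stat_density(x, 15))
```

Evaluating `order_stat_density` for k = 1, ..., 30 on a grid of x values gives the family of curves that a figure of this kind displays.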
Fig. 5. A histogram with a fitted normal distribution (plotted by HISTO).
Fig. 6. A correlation diagram of the weight and the result of shot put for 48 athletes (DIAGRAM). Also a regression line (computed by LINREG) is plotted (CURVE). Axes: WEIGHT, SHOT PUT.
Fig. 7. The same correlation diagram, but now with a quadratic curve for the maximum level in shot put (estimated by NONLIN and plotted by DIAGRAM and CURVE). Axes: WEIGHT, SHOT PUT.
Fig. 8. In the previous display it was assumed in estimating the maximum level that the residuals of the model have a lognormal distribution. Here the residuals are plotted on lognormal probability paper (DIAGRAM). Axes: RESIDUAL (LOG), PROBIT.
Fig. 9. A coin has been tossed repeatedly, and the stochastic convergence of the relative frequency of heads, N(H)/N, is illustrated in a graph plotted by CURVE. (A logarithmic scale for N is employed.) Axes: N (LOG), N(H)/N.
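An experiment of this kind is easy to imitate by simulation. The Python sketch below records the relative frequency of heads at logarithmically spaced values of N, which corresponds to reading the curve on a logarithmic N axis. It is only an illustration: the number of tosses, the seed and the function name `running_frequency` are assumptions of this example and are not taken from the figure.

```python
import random


def running_frequency(n_tosses, checkpoints, seed=0):
    """Simulate fair coin tosses and return {N: N(H)/N} at the given N values."""
    rng = random.Random(seed)  # fixed seed so that the run is repeatable
    wanted = set(checkpoints)
    heads = 0
    result = {}
    for n in range(1, n_tosses + 1):
        heads += rng.random() < 0.5  # True counts as one head
        if n in wanted:
            result[n] = heads / n
    return result


# Checkpoints 1, 10, 100, 1000, 10000 are equally spaced on a log scale.
checkpoints = [10 ** e for e in range(0, 5)]
freq = running_frequency(10_000, checkpoints)
```

Plotting `freq` against the logarithm of N shows the early fluctuations and the later settling near 1/2 in a single picture, which is the reason for the logarithmic scale.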
Fig. 10. Two monthly time series (plotted by DIAGRAM). Finland: indices for sales of alcoholic beverages (1951=100), real price index and volume index.
Fig. 11. Consumption of alcoholic beverages and tobacco in various countries, per inhabitant (plotted by DIAGRAM). The points are labelled with country names, among them Spain, Germany, Austria, France and Turkey.