Academic year: 2022
1 TAN = tree-augmented naive Bayes

The idea is to define a network structure which is like naive Bayes (i.e. the root node is FR and the leaf nodes are A and B for Prog. 1; for Prog. 2 the leaf nodes could be TP1, D, E), but now we also represent the strongest dependencies between the leaf nodes.

In the Prog. 1 model this simply means that we let B depend on A, in addition to FR1. In Prog. 2 we should define the optimal dependencies, but you could simply try the following: D depends on TP1, and E depends on D.
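As a sketch, the two structures described above can be written down as parent maps; the dictionary representation below is only an illustrative assumption, not a required format.

```python
# TAN structures from the text, written as parent maps (child -> parents).
# Variable names follow the text; the representation itself is an assumption.
tan_prog1 = {
    "A": ["FR1"],
    "B": ["FR1", "A"],   # B depends on A in addition to the root FR1
}
tan_prog2 = {
    "TP1": ["FR"],
    "D": ["FR", "TP1"],  # suggested extra dependency: D on TP1
    "E": ["FR", "D"],    # and E on D
}
```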

The parameters are simply calculated from frequencies. E.g. P(FR=1) is the number of rows where FR=1, divided by the total number of rows. The conditional probability P(B=1 | FR=1, A=1) is the number of rows where B=1, FR=1 and A=1, divided by the number of rows where FR=1 and A=1.
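A minimal sketch of this frequency counting, assuming each data row is a dict of attribute values (the helper names `prob` and `cond_prob` are hypothetical):

```python
def prob(rows, cond):
    """Relative frequency of rows matching all assignments in cond."""
    match = sum(1 for r in rows if all(r[k] == v for k, v in cond.items()))
    return match / len(rows)

def cond_prob(rows, event, given):
    """P(event | given) = count(event and given) / count(given)."""
    both = {**event, **given}
    num = sum(1 for r in rows if all(r[k] == v for k, v in both.items()))
    den = sum(1 for r in rows if all(r[k] == v for k, v in given.items()))
    return num / den

# Toy data: each dict is one row of the data set.
rows = [
    {"FR": 1, "A": 1, "B": 1},
    {"FR": 1, "A": 1, "B": 0},
    {"FR": 1, "A": 0, "B": 0},
    {"FR": 0, "A": 1, "B": 1},
]
print(prob(rows, {"FR": 1}))                         # 3/4 = 0.75
print(cond_prob(rows, {"B": 1}, {"FR": 1, "A": 1}))  # 1/2 = 0.5
```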

If you want better accuracy, you could also try the Dirichlet smoothing method for defining the parameters (and compare the results). (In fact, I could give an extra ECTS credit to whoever implements this – it is not required for your project work!)
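One common form of Dirichlet smoothing adds a pseudo-count α to every cell of the conditional probability table. The sketch below assumes binary attributes and α = 1 (Laplace smoothing), which is only one choice of prior:

```python
def smoothed_cond_prob(rows, event, given, alpha=1.0, n_values=2):
    """P(event | given) with Dirichlet (add-alpha) smoothing.
    n_values is the number of values the child variable can take
    (assumed 2 here for binary attributes)."""
    both = {**event, **given}
    num = sum(1 for r in rows if all(r[k] == v for k, v in both.items()))
    den = sum(1 for r in rows if all(r[k] == v for k, v in given.items()))
    return (num + alpha) / (den + alpha * n_values)
```

With no matching rows this backs off to the uniform 1/n_values instead of dividing by zero, which is the practical benefit over plain frequency estimates.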

2 Bayesian multinets

Now we define two Bayesian classifiers, one for failed students and the other for passed students. The model structures should be different, because otherwise there is no reason to have two networks. On the other hand, if it turns out that the optimal structure is the same for both networks, we can use just one network.

There are two ways to define the model structure. In both approaches you should first divide the data set into two parts, one containing only the failed students and the other only the passed students.

1. You can analyze the dependencies between attribute values in both data sets and try to find the strongest dependencies. Then you define a model which contains an arrow from X to Y if Y is strongly dependent on X. I suggest using simple models, like TANs.

2. You can learn the optimal network structure with some tool. Hugin may be able to do that too, but I have used another tool called camml, which you can install on either Linux or Windows:

http://www.datamining.monash.edu.au/software/camml/.
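For approach 1, one simple dependence score is the empirical mutual information between two attributes; note that this plain (not class-conditional) version is a simplification chosen here for illustration:

```python
from collections import Counter
from math import log

def mutual_information(rows, x, y):
    """Empirical mutual information I(X;Y) in nats between attributes
    x and y; a higher value means a stronger dependency."""
    n = len(rows)
    px = Counter(r[x] for r in rows)
    py = Counter(r[y] for r in rows)
    pxy = Counter((r[x], r[y]) for r in rows)
    mi = 0.0
    for (vx, vy), c in pxy.items():
        # (c/n) * log( p(x,y) / (p(x) * p(y)) )
        mi += (c / n) * log(c * n / (px[vx] * py[vy]))
    return mi
```

You would compute this score for every attribute pair separately in the failed-student and passed-student data sets, and keep the highest-scoring edges in each network.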

