• Ei tuloksia

Autumn2012 JyrkiKivinen Information-TheoreticModeling

N/A
N/A
Info
Lataa
Protected

Academic year: 2022

Jaa "Autumn2012 JyrkiKivinen Information-TheoreticModeling"

Copied!
137
0
0

Kokoteksti

(1)

Outline Administrative issues Overview of Contents Compression

Information-Theoretic Modeling

Jyrki Kivinen

Department of Computer Science, University of Helsinki

Autumn 2012

(2)

Outline Administrative issues Overview of Contents Compression

The content and form of these lecture notes have been mostly copied from the lecture notes made by Teemu Roos for the Autumn 2009 installation of this course.

Those lectures are available also on video fromhttp://www.cs.

helsinki.fi/group/cosco/Teaching/Information/2009/.

(3)

Outline Administrative issues Overview of Contents Compression

1 Administrative issues Course details Prerequisites

What do I need to do?

Grading and policies

2 Overview of Contents What is Information? Why Information?

Information vs. Complexity Information Theory

3 Compression Dots and Dashes Codes as Mappings Data Compression

Information vs. Complexity (contd.) Examples (with numbers)

(4)

Outline Administrative issues Overview of Contents Compression

1 Administrative issues Course details Prerequisites

What do I need to do?

Grading and policies

2 Overview of Contents What is Information?

Why Information?

Information vs. Complexity Information Theory

3 Compression Dots and Dashes Codes as Mappings Data Compression

Information vs. Complexity (contd.) Examples (with numbers)

(5)

Outline Administrative issues Overview of Contents Compression

1 Administrative issues Course details Prerequisites

What do I need to do?

Grading and policies

2 Overview of Contents What is Information?

Why Information?

Information vs. Complexity Information Theory

3 Compression Dots and Dashes Codes as Mappings Data Compression

Information vs. Complexity (contd.) Examples (with numbers)

(6)

Outline Administrative issues Overview of Contents Compression

Course details Prerequisites What do I need to do?

Grading and policies

582650 Information-Theoretic Modeling

An advanced studies course (“syvent¨av¨at opinnot”)

Algorithms and Machine Learningsub-programme, optional.

4 credit units.

Lectures: 5 September–12 October Wed & Fri 10–12 in C222.

Exercises: 11 September–9 October Tue 10–12 in C220. Instructor: Jyrki Kivinen, B229a,

jyrki.kivinen at cs.helsinki.fi

(no fixed office hours, make an appointment by e-mail). Course assistant: Hannes Wettig.

hannes.wettig at cs.helsinki.fi

(7)

Outline Administrative issues Overview of Contents Compression

Course details Prerequisites What do I need to do?

Grading and policies

582650 Information-Theoretic Modeling

An advanced studies course (“syvent¨av¨at opinnot”) Algorithms and Machine Learningsub-programme, optional.

4 credit units.

Lectures: 5 September–12 October Wed & Fri 10–12 in C222.

Exercises: 11 September–9 October Tue 10–12 in C220. Instructor: Jyrki Kivinen, B229a,

jyrki.kivinen at cs.helsinki.fi

(no fixed office hours, make an appointment by e-mail). Course assistant: Hannes Wettig.

hannes.wettig at cs.helsinki.fi

(8)

Outline Administrative issues Overview of Contents Compression

Course details Prerequisites What do I need to do?

Grading and policies

582650 Information-Theoretic Modeling

An advanced studies course (“syvent¨av¨at opinnot”) Algorithms and Machine Learningsub-programme, optional.

4 credit units.

Lectures: 5 September–12 October Wed & Fri 10–12 in C222.

Exercises: 11 September–9 October Tue 10–12 in C220. Instructor: Jyrki Kivinen, B229a,

jyrki.kivinen at cs.helsinki.fi

(no fixed office hours, make an appointment by e-mail). Course assistant: Hannes Wettig.

hannes.wettig at cs.helsinki.fi

(9)

Outline Administrative issues Overview of Contents Compression

Course details Prerequisites What do I need to do?

Grading and policies

582650 Information-Theoretic Modeling

An advanced studies course (“syvent¨av¨at opinnot”) Algorithms and Machine Learningsub-programme, optional.

4 credit units.

Lectures: 5 September–12 October Wed & Fri 10–12 in C222.

Exercises: 11 September–9 October Tue 10–12 in C220. Instructor: Jyrki Kivinen, B229a,

jyrki.kivinen at cs.helsinki.fi

(no fixed office hours, make an appointment by e-mail). Course assistant: Hannes Wettig.

hannes.wettig at cs.helsinki.fi

(10)

Outline Administrative issues Overview of Contents Compression

Course details Prerequisites What do I need to do?

Grading and policies

582650 Information-Theoretic Modeling

An advanced studies course (“syvent¨av¨at opinnot”) Algorithms and Machine Learningsub-programme, optional.

4 credit units.

Lectures: 5 September–12 October Wed & Fri 10–12 in C222.

Exercises: 11 September–9 October Tue 10–12 in C220.

Instructor: Jyrki Kivinen, B229a, jyrki.kivinen at cs.helsinki.fi

(no fixed office hours, make an appointment by e-mail). Course assistant: Hannes Wettig.

hannes.wettig at cs.helsinki.fi

(11)

Outline Administrative issues Overview of Contents Compression

Course details Prerequisites What do I need to do?

Grading and policies

582650 Information-Theoretic Modeling

An advanced studies course (“syvent¨av¨at opinnot”) Algorithms and Machine Learningsub-programme, optional.

4 credit units.

Lectures: 5 September–12 October Wed & Fri 10–12 in C222.

Exercises: 11 September–9 October Tue 10–12 in C220.

Instructor: Jyrki Kivinen, B229a, jyrki.kivinen at cs.helsinki.fi

(no fixed office hours, make an appointment by e-mail).

Course assistant: Hannes Wettig. hannes.wettig at cs.helsinki.fi

(12)

Outline Administrative issues Overview of Contents Compression

Course details Prerequisites What do I need to do?

Grading and policies

582650 Information-Theoretic Modeling

An advanced studies course (“syvent¨av¨at opinnot”) Algorithms and Machine Learningsub-programme, optional.

4 credit units.

Lectures: 5 September–12 October Wed & Fri 10–12 in C222.

Exercises: 11 September–9 October Tue 10–12 in C220.

Instructor: Jyrki Kivinen, B229a, jyrki.kivinen at cs.helsinki.fi

(no fixed office hours, make an appointment by e-mail).

Course assistant: Hannes Wettig.

(13)

Outline Administrative issues Overview of Contents Compression

Course details Prerequisites What do I need to do?

Grading and policies

Resources

There is no required textbook on the course, but the following are recommended.

Highly recommended: Cover & Thomas, Elements of Information Theory,

MacKay, Information Theory, Inference and Learning Algorithms,

Gr¨unwald, The Minimum Description Length Principle, Solomon,Data Compression: The Complete Reference.

(14)

Outline Administrative issues Overview of Contents Compression

Course details Prerequisites What do I need to do?

Grading and policies

582651 Project in Information-Theoretic Modeling

There is also a related project:

2 credit units.

Period II

This course is a prerequisite.

Together they replace the old Three Concepts: Information course—both old and new version cannot be included in your degree.

Groups of 2–3 persons. Task: compress data.

Best compressor wins! Intermediate results annonced periodically.

Programming + report.

(15)

Outline Administrative issues Overview of Contents Compression

Course details Prerequisites What do I need to do?

Grading and policies

582651 Project in Information-Theoretic Modeling

There is also a related project:

2 credit units.

Period II

This course is a prerequisite.

Together they replace the old Three Concepts: Information course—both old and new version cannot be included in your degree.

Groups of 2–3 persons. Task: compress data.

Best compressor wins! Intermediate results annonced periodically.

Programming + report.

(16)

Outline Administrative issues Overview of Contents Compression

Course details Prerequisites What do I need to do?

Grading and policies

582651 Project in Information-Theoretic Modeling

There is also a related project:

2 credit units.

Period II

This course is a prerequisite.

Together they replace the old Three Concepts: Information course—both old and new version cannot be included in your degree.

Groups of 2–3 persons. Task: compress data.

Best compressor wins! Intermediate results annonced periodically.

Programming + report.

(17)

Outline Administrative issues Overview of Contents Compression

Course details Prerequisites What do I need to do?

Grading and policies

582651 Project in Information-Theoretic Modeling

There is also a related project:

2 credit units.

Period II

This course is a prerequisite.

Together they replace the old Three Concepts: Information course—both old and new version cannot be included in your degree.

Groups of 2–3 persons. Task: compress data.

Best compressor wins! Intermediate results annonced periodically.

Programming + report.

(18)

Outline Administrative issues Overview of Contents Compression

Course details Prerequisites What do I need to do?

Grading and policies

582651 Project in Information-Theoretic Modeling

There is also a related project:

2 credit units.

Period II

This course is a prerequisite.

Together they replace the old Three Concepts: Information course—both old and new version cannot be included in your degree.

Groups of 2–3 persons.

Task: compress data.

Best compressor wins! Intermediate results annonced periodically.

Programming + report.

(19)

Outline Administrative issues Overview of Contents Compression

Course details Prerequisites What do I need to do?

Grading and policies

582651 Project in Information-Theoretic Modeling

There is also a related project:

2 credit units.

Period II

This course is a prerequisite.

Together they replace the old Three Concepts: Information course—both old and new version cannot be included in your degree.

Groups of 2–3 persons.

Task: compress data.

Best compressor wins! Intermediate results annonced periodically.

Programming + report.

(20)

Outline Administrative issues Overview of Contents Compression

Course details Prerequisites What do I need to do?

Grading and policies

582651 Project in Information-Theoretic Modeling

There is also a related project:

2 credit units.

Period II

This course is a prerequisite.

Together they replace the old Three Concepts: Information course—both old and new version cannot be included in your degree.

Groups of 2–3 persons.

Task: compress data.

Best compressor wins! Intermediate results annonced periodically.

Programming + report.

(21)

Outline Administrative issues Overview of Contents Compression

Course details Prerequisites What do I need to do?

Grading and policies

582651 Project in Information-Theoretic Modeling

There is also a related project:

2 credit units.

Period II

This course is a prerequisite.

Together they replace the old Three Concepts: Information course—both old and new version cannot be included in your degree.

Groups of 2–3 persons.

Task: compress data.

Best compressor wins! Intermediate results annonced periodically.

Programming + report.

(22)

Outline Administrative issues Overview of Contents Compression

Course details Prerequisites What do I need to do?

Grading and policies

Prerequisites

No formal prerequisitesbutyou will need

Calculus: integrals, derivatives, convergence, ... Probability theory: joint & conditional distributions, expectations, law of large numbers, ...

Programming: language is up to you (but need to work in groups in project).

(23)

Outline Administrative issues Overview of Contents Compression

Course details Prerequisites What do I need to do?

Grading and policies

Prerequisites

No formal prerequisitesbutyou will need

Calculus: integrals, derivatives, convergence, ...

Probability theory: joint & conditional distributions, expectations, law of large numbers, ...

Programming: language is up to you (but need to work in groups in project).

(24)

Outline Administrative issues Overview of Contents Compression

Course details Prerequisites What do I need to do?

Grading and policies

Prerequisites

No formal prerequisitesbutyou will need

Calculus: integrals, derivatives, convergence, ...

Probability theory: joint & conditional distributions, expectations, law of large numbers, ...

Programming: language is up to you (but need to work in groups in project).

(25)

Outline Administrative issues Overview of Contents Compression

Course details Prerequisites What do I need to do?

Grading and policies

Prerequisites

No formal prerequisitesbutyou will need

Calculus: integrals, derivatives, convergence, ...

Probability theory: joint & conditional distributions, expectations, law of large numbers, ...

Programming: language is up to you (but need to work in groups in project).

(26)

Outline Administrative issues Overview of Contents Compression

Course details Prerequisites What do I need to do?

Grading and policies

What do I need to do?

Weekly exercises:

Mathematical problems. Programming tasks.

Course exam on 18 October at 9:00am.

You donothave to attend the classes, unless otherwise stated. However, it is recommend that you do.

Alternatively, you can just take a separate exam later (next one is 23 November).

(27)

Outline Administrative issues Overview of Contents Compression

Course details Prerequisites What do I need to do?

Grading and policies

What do I need to do?

Weekly exercises:

Mathematical problems.

Programming tasks.

Course exam on 18 October at 9:00am.

You donothave to attend the classes, unless otherwise stated. However, it is recommend that you do.

Alternatively, you can just take a separate exam later (next one is 23 November).

(28)

Outline Administrative issues Overview of Contents Compression

Course details Prerequisites What do I need to do?

Grading and policies

What do I need to do?

Weekly exercises:

Mathematical problems.

Programming tasks.

Course exam on 18 October at 9:00am.

You donothave to attend the classes, unless otherwise stated. However, it is recommend that you do.

Alternatively, you can just take a separate exam later (next one is 23 November).

(29)

Outline Administrative issues Overview of Contents Compression

Course details Prerequisites What do I need to do?

Grading and policies

What do I need to do?

Weekly exercises:

Mathematical problems.

Programming tasks.

Course exam on 18 October at 9:00am.

You donothave to attend the classes, unless otherwise stated. However, it is recommend that you do.

Alternatively, you can just take a separate exam later (next one is 23 November).

(30)

Outline Administrative issues Overview of Contents Compression

Course details Prerequisites What do I need to do?

Grading and policies

What do I need to do?

Weekly exercises:

Mathematical problems.

Programming tasks.

Course exam on 18 October at 9:00am.

You donothave to attend the classes, unless otherwise stated.

However, it is recommend that you do.

Alternatively, you can just take a separate exam later (next one is 23 November).

(31)

Outline Administrative issues Overview of Contents Compression

Course details Prerequisites What do I need to do?

Grading and policies

What do I need to do?

Weekly exercises:

Mathematical problems.

Programming tasks.

Course exam on 18 October at 9:00am.

You donothave to attend the classes, unless otherwise stated.

However, it is recommend that you do.

Alternatively, you can just take a separate exam later (next one is 23 November).

(32)

Outline Administrative issues Overview of Contents Compression

Course details Prerequisites What do I need to do?

Grading and policies

Grading

The course grading is based on:

1 Exercises (40 %)

2 Exam (60 %)

Minimum 50 % of exercises have to be solved (or at least seriously attempted).

Feel free to discuss homework problems in groups. However, each student is expected to

independently write down his/her solutions to theory problems implement all programming tasks by him/herself.

(33)

Outline Administrative issues Overview of Contents Compression

Course details Prerequisites What do I need to do?

Grading and policies

Grading

The course grading is based on:

1 Exercises (40 %)

2 Exam (60 %)

Minimum 50 % of exercises have to be solved (or at least seriously attempted).

Feel free to discuss homework problems in groups. However, each student is expected to

independently write down his/her solutions to theory problems implement all programming tasks by him/herself.

(34)

Outline Administrative issues Overview of Contents Compression

Course details Prerequisites What do I need to do?

Grading and policies

Grading

The course grading is based on:

1 Exercises (40 %)

2 Exam (60 %)

Minimum 50 % of exercises have to be solved (or at least seriously attempted).

Feel free to discuss homework problems in groups. However, each student is expected to

independently write down his/her solutions to theory problems implement all programming tasks by him/herself.

(35)

Outline Administrative issues Overview of Contents Compression

Course details Prerequisites What do I need to do?

Grading and policies

Grading

The course grading is based on:

1 Exercises (40 %)

2 Exam (60 %)

Minimum 50 % of exercises have to be solved (or at least seriously attempted).

Feel free to discuss homework problems in groups.

However, each student is expected to

independently write down his/her solutions to theory problems implement all programming tasks by him/herself.

(36)

Outline Administrative issues Overview of Contents Compression

Course details Prerequisites What do I need to do?

Grading and policies

Grading

The course grading is based on:

1 Exercises (40 %)

2 Exam (60 %)

Minimum 50 % of exercises have to be solved (or at least seriously attempted).

Feel free to discuss homework problems in groups.

However, each student is expected to

independently write down his/her solutions to theory problems implement all programming tasks by him/herself.

(37)

Outline Administrative issues Overview of Contents Compression

What is Information?

Why Information?

Information vs. Complexity Information Theory

1 Administrative issues Course details Prerequisites

What do I need to do?

Grading and policies

2 Overview of Contents What is Information?

Why Information?

Information vs. Complexity Information Theory

3 Compression Dots and Dashes Codes as Mappings Data Compression

Information vs. Complexity (contd.) Examples (with numbers)

(38)

Outline Administrative issues Overview of Contents Compression

What is Information?

Why Information?

Information vs. Complexity Information Theory

What is Information?

Etymology: informare= give form, 14th century.

knowledge [...], intelligence, news, facts, data, [...], (as nucleotides in DNA or binary digits in a computer program) [...], a signal [...], a numerical quantity that measures the uncertainty in the outcome of an experiment to be performed. (source: Merriam-Webster).

Data vs. Information vs. Knowledge. Information technology.

Physical information.

This course: measuring the amountof information in data, and using such measures for automatically buiding models.

(39)

Outline Administrative issues Overview of Contents Compression

What is Information?

Why Information?

Information vs. Complexity Information Theory

What is Information?

Etymology: informare= give form, 14th century.

knowledge [...], intelligence, news, facts, data, [...], (as nucleotides in DNA or binary digits in a computer program) [...], a signal [...], a numerical quantity that measures the uncertainty in the outcome of an experiment to be performed.

(source: Merriam-Webster).

Data vs. Information vs. Knowledge. Information technology.

Physical information.

This course: measuring the amountof information in data, and using such measures for automatically buiding models.

(40)

Outline Administrative issues Overview of Contents Compression

What is Information?

Why Information?

Information vs. Complexity Information Theory

What is Information?

Etymology: informare= give form, 14th century.

knowledge [...], intelligence, news, facts, data, [...], (as nucleotides in DNA or binary digits in a computer program) [...], a signal [...], a numerical quantity that measures the uncertainty in the outcome of an experiment to be performed.

(source: Merriam-Webster).

Data vs. Information vs. Knowledge.

Information technology. Physical information.

This course: measuring the amountof information in data, and using such measures for automatically buiding models.

(41)

Outline Administrative issues Overview of Contents Compression

What is Information?

Why Information?

Information vs. Complexity Information Theory

What is Information?

Etymology: informare= give form, 14th century.

knowledge [...], intelligence, news, facts, data, [...], (as nucleotides in DNA or binary digits in a computer program) [...], a signal [...], a numerical quantity that measures the uncertainty in the outcome of an experiment to be performed.

(source: Merriam-Webster).

Data vs. Information vs. Knowledge.

Information technology.

Physical information.

This course: measuring the amountof information in data, and using such measures for automatically buiding models.

(42)

Outline Administrative issues Overview of Contents Compression

What is Information?

Why Information?

Information vs. Complexity Information Theory

What is Information?

Etymology: informare= give form, 14th century.

knowledge [...], intelligence, news, facts, data, [...], (as nucleotides in DNA or binary digits in a computer program) [...], a signal [...], a numerical quantity that measures the uncertainty in the outcome of an experiment to be performed.

(source: Merriam-Webster).

Data vs. Information vs. Knowledge.

Information technology.

Physical information.

This course: measuring the amountof information in data, and using such measures for automatically buiding models.

(43)

Outline Administrative issues Overview of Contents Compression

What is Information?

Why Information?

Information vs. Complexity Information Theory

What is Information?

Etymology: informare= give form, 14th century.

knowledge [...], intelligence, news, facts, data, [...], (as nucleotides in DNA or binary digits in a computer program) [...], a signal [...], a numerical quantity that measures the uncertainty in the outcome of an experiment to be performed.

(source: Merriam-Webster).

Data vs. Information vs. Knowledge.

Information technology.

Physical information.

This course: measuring the amountof information in data, and using such measures for automatically buiding models.

(44)

Outline Administrative issues Overview of Contents Compression

What is Information?

Why Information?

Information vs. Complexity Information Theory

Why Information?

The amount of data around us is exploding – internet!

Need to store, transmit, and process it efficiently.

Wish tounderstand more and more complex phenomena. Computer science: make things automatic (intelligent).

(45)

Outline Administrative issues Overview of Contents Compression

What is Information?

Why Information?

Information vs. Complexity Information Theory

Why Information?

The amount of data around us is exploding – internet!

Need to store, transmit, and process it efficiently.

Wish tounderstand more and more complex phenomena. Computer science: make things automatic (intelligent).

(46)

Outline Administrative issues Overview of Contents Compression

What is Information?

Why Information?

Information vs. Complexity Information Theory

Why Information?

The amount of data around us is exploding – internet!

Need to store, transmit, and process it efficiently.

Wish tounderstand more and more complex phenomena.

Computer science: make things automatic (intelligent).

(47)

Outline Administrative issues Overview of Contents Compression

What is Information?

Why Information?

Information vs. Complexity Information Theory

Why Information?

The amount of data around us is exploding – internet!

Need to store, transmit, and process it efficiently.

Wish tounderstand more and more complex phenomena.

Computer science: make things automatic (intelligent).

(48)

Outline Administrative issues Overview of Contents Compression

What is Information?

Why Information?

Information vs. Complexity Information Theory

Information vs. Complexity

On this course we do not define the terminformationas such.

Often a useful intuition is to think of theinformation content of some piece of data as the amount of bits needed to represent its relevantfeatures.

“Relevant” here is of course not well defined. The point is that generally for example random strings are not considered to contain much information, although they cannot be compressed much (i.e.

require many bits to encode).

(49)

Outline Administrative issues Overview of Contents Compression

What is Information?

Why Information?

Information vs. Complexity Information Theory

One way of thinking about this is

Complexity =Information+Noise so that for example random strings are complex, but the complexity mainly comes from noise, not information.

From a bit diffent points of view,

Complexity =Regularity+Randomness or

Complexity =Algorithm+Compressed file.

(50)

Outline Administrative issues Overview of Contents Compression

What is Information?

Why Information?

Information vs. Complexity Information Theory

Information Theory

”The real birth of modern information theory can be traced to the publication in 1948 of Claude Shannon’s“The Mathematical Theory of Communication”

in the Bell System Technical Journal. ”

(Encyclopædia Britannica)

(51)

Outline Administrative issues Overview of Contents Compression

What is Information?

Why Information?

Information vs. Complexity Information Theory

Course Topics

Information Theory:

entropy and information, bits,

compression, error correction.

Fundamental limits (mathematical and statistical) and practice (computer science).

Modeling:

statistical models,

complexity (in data and models),

over-fitting, Occam’s Razor, and MDL Principle.

(52)

Outline Administrative issues Overview of Contents Compression

What is Information?

Why Information?

Information vs. Complexity Information Theory

Course Topics

Information Theory:

entropy and information, bits, compression,

error correction.

Fundamental limits (mathematical and statistical) and practice (computer science).

Modeling:

statistical models,

complexity (in data and models),

over-fitting, Occam’s Razor, and MDL Principle.

(53)

Outline Administrative issues Overview of Contents Compression

What is Information?

Why Information?

Information vs. Complexity Information Theory

Course Topics

Information Theory:

entropy and information, bits, compression,

error correction.

Fundamental limits (mathematical and statistical) and practice (computer science).

Modeling:

statistical models,

complexity (in data and models),

over-fitting, Occam’s Razor, and MDL Principle.

(54)

Outline Administrative issues Overview of Contents Compression

What is Information?

Why Information?

Information vs. Complexity Information Theory

Course Topics

Information Theory:

entropy and information, bits, compression,

error correction.

Fundamental limits (mathematical and statistical) and practice (computer science).

Modeling:

statistical models,

complexity (in data and models),

over-fitting, Occam’s Razor, and MDL Principle.

(55)

Outline Administrative issues Overview of Contents Compression

What is Information?

Why Information?

Information vs. Complexity Information Theory

Course Topics

Information Theory:

entropy and information, bits, compression,

error correction.

Fundamental limits (mathematical and statistical) and practice (computer science).

Modeling:

statistical models,

complexity (in data and models),

over-fitting, Occam’s Razor, and MDL Principle.

(56)

Outline Administrative issues Overview of Contents Compression

What is Information?

Why Information?

Information vs. Complexity Information Theory

Course Topics

Information Theory:

entropy and information, bits, compression,

error correction.

Fundamental limits (mathematical and statistical) and practice (computer science).

Modeling:

statistical models,

complexity (in data and models),

over-fitting, Occam’s Razor, and MDL Principle.

(57)

Outline Administrative issues Overview of Contents Compression

What is Information?

Why Information?

Information vs. Complexity Information Theory

Course Topics

Information Theory:

entropy and information, bits, compression,

error correction.

Fundamental limits (mathematical and statistical) and practice (computer science).

Modeling:

statistical models,

complexity (in data and models),

over-fitting, Occam’s Razor, and MDL Principle.

(58)

Outline Administrative issues Overview of Contents Compression

What is Information?

Why Information?

Information vs. Complexity Information Theory

Course Topics

Information Theory:

entropy and information, bits, compression,

error correction.

Fundamental limits (mathematical and statistical) and practice (computer science).

Modeling:

statistical models,

complexity (in data and models),

over-fitting, Occam’s Razor, and MDL Principle.

(59)

Outline Administrative issues Overview of Contents Compression

Dots and Dashes Codes as Mappings Data Compression

Information vs. Complexity (contd.) Examples (with numbers)

1 Administrative issues Course details Prerequisites

What do I need to do?

Grading and policies

2 Overview of Contents What is Information?

Why Information?

Information vs. Complexity Information Theory

3 Compression Dots and Dashes Codes as Mappings Data Compression

Information vs. Complexity (contd.) Examples (with numbers)

(60)

Outline Administrative issues Overview of Contents Compression

Dots and Dashes Codes as Mappings Data Compression

Information vs. Complexity (contd.) Examples (with numbers)

Coding Game

Form groups of 3–4 persons. Each group constructs acodefor the letters A–Z by using ascode-wordsunique sequences of dots • and dashes (—) like “•”, “—•”, “ — •— —”, etc.

A G M S Y

B H N T Z

C I O U

D J P V

E K Q W

F L R X

(61)

Outline Administrative issues Overview of Contents Compression

Dots and Dashes Codes as Mappings Data Compression

Information vs. Complexity (contd.) Examples (with numbers)

Coding Game

Use your code toencode the message

“WHAT DOES THIS HAVE TO DO WITH INFORMATION”.

Now count how long the encoded message is using the rule: A dot •: 1 units.

A dash —: 2 units.

A space between words: 2 units.

• • •— — —• • •: 1 + 1 + 1 + 2 + 2 + 2 + 1 + 1 + 1 = 12. Thecoding rateof your code is the length of the encoded message divided by the length of the original message, including spaces (42).

(62)

Outline Administrative issues Overview of Contents Compression

Dots and Dashes Codes as Mappings Data Compression

Information vs. Complexity (contd.) Examples (with numbers)

Coding Game

Use your code toencode the message

“WHAT DOES THIS HAVE TO DO WITH INFORMATION”.

Now count how long the encoded message is using the rule:

A dot •: 1 units.

A dash —: 2 units.

A space between words: 2 units.

• • •— — —• • •: 1 + 1 + 1 + 2 + 2 + 2 + 1 + 1 + 1 = 12. Thecoding rateof your code is the length of the encoded message divided by the length of the original message, including spaces (42).

(63)

Outline Administrative issues Overview of Contents Compression

Dots and Dashes Codes as Mappings Data Compression

Information vs. Complexity (contd.) Examples (with numbers)

Coding Game

Use your code toencode the message

“WHAT DOES THIS HAVE TO DO WITH INFORMATION”.

Now count how long the encoded message is using the rule:

A dot •: 1 units.

A dash —: 2 units.

A space between words: 2 units.

• • •— — —• • •: 1 + 1 + 1 + 2 + 2 + 2 + 1 + 1 + 1 = 12.

Thecoding rateof your code is the length of the encoded message divided by the length of the original message, including spaces (42).

(64)

Outline Administrative issues Overview of Contents Compression

Dots and Dashes Codes as Mappings Data Compression

Information vs. Complexity (contd.) Examples (with numbers)

Coding Game

Use your code toencode the message

“WHAT DOES THIS HAVE TO DO WITH INFORMATION”.

Now count how long the encoded message is using the rule:

A dot •: 1 units.

A dash —: 2 units.

A space between words: 2 units.

• • •— — —• • •: 1 + 1 + 1 + 2 + 2 + 2 + 1 + 1 + 1 = 12.

Thecoding rateof your code is the length of the encoded message divided by the length of the original message, including spaces (42).

(65)

Outline Administrative issues Overview of Contents Compression

Dots and Dashes Codes as Mappings Data Compression

Information vs. Complexity (contd.) Examples (with numbers)

Coding Game

c 1989 A.G. Reinhold. Samuel F.M. Morse (1791–1872)

(66)

Outline Administrative issues Overview of Contents Compression

Dots and Dashes Codes as Mappings Data Compression

Information vs. Complexity (contd.) Examples (with numbers)

Coding Game

WHAT DOES THIS HAVE TO DO WITH INFORMATION

.-- .... .- - -.. --- . ... - .... .. ... .... .- ...- . - --- -.. --- .-- .. - .... .. -. ..-. --- .-. -- .- - .. --- -.

51 dots, 36 dashes, 7 spaces: 51 + 72 + 14 = 137 units. Morse code

Coding rate: 137

42 ≈3.26

Did you do better or worse? Why?

(67)

Outline Administrative issues Overview of Contents Compression

Dots and Dashes Codes as Mappings Data Compression

Information vs. Complexity (contd.) Examples (with numbers)

Coding Game

WHAT DOES THIS HAVE TO DO WITH INFORMATION

.-- .... .- - -.. --- . ... - .... .. ...

.... .- ...- . - --- -.. --- .-- .. - ....

.. -. ..-. --- .-. -- .- - .. --- -.

51 dots, 36 dashes, 7 spaces: 51 + 72 + 14 = 137 units. Morse code

Coding rate: 137

42 ≈3.26

Did you do better or worse? Why?

(68)

Outline Administrative issues Overview of Contents Compression

Dots and Dashes Codes as Mappings Data Compression

Information vs. Complexity (contd.) Examples (with numbers)

Coding Game

WHAT DOES THIS HAVE TO DO WITH INFORMATION

.-- .... .- - -.. --- . ... - .... .. ...

.... .- ...- . - --- -.. --- .-- .. - ....

.. -. ..-. --- .-. -- .- - .. --- -.

51 dots, 36 dashes, 7 spaces: 51 + 72 + 14 = 137 units.

Morse code Coding rate: 137

42 ≈3.26

Did you do better or worse? Why?

(69)

Outline Administrative issues Overview of Contents Compression

Dots and Dashes Codes as Mappings Data Compression

Information vs. Complexity (contd.) Examples (with numbers)

Coding Game

WHAT DOES THIS HAVE TO DO WITH INFORMATION

.-- .... .- - -.. --- . ... - .... .. ...

.... .- ...- . - --- -.. --- .-- .. - ....

.. -. ..-. --- .-. -- .- - .. --- -.

51 dots, 36 dashes, 7 spaces: 51 + 72 + 14 = 137 units.

Morse code Coding rate: 137

42 ≈3.26

Did you do better or worse? Why?

(70)

Outline Administrative issues Overview of Contents Compression

Dots and Dashes Codes as Mappings Data Compression

Information vs. Complexity (contd.) Examples (with numbers)

Codes as Mappings

Lossless compression:

injective mapping

Lossy compression: non-injective mapping

Source strings Code strings Source strings Code strings Source strings Code strings Source strings Code strings

?

?

Onlylossless codes areuniquely decodable.

(71)

Outline Administrative issues Overview of Contents Compression

Dots and Dashes Codes as Mappings Data Compression

Information vs. Complexity (contd.) Examples (with numbers)

Codes as Mappings

Lossless compression:

injective mapping

Lossy compression:

non-injective mapping

Source strings Code strings Source strings Code strings

Source strings Code strings Source strings Code strings

?

?

Onlylossless codes areuniquely decodable.

(72)

Outline Administrative issues Overview of Contents Compression

Dots and Dashes Codes as Mappings Data Compression

Information vs. Complexity (contd.) Examples (with numbers)

Codes as Mappings

Lossless compression:

injective mapping

Lossy compression:

non-injective mapping

Source strings Code strings Source strings Code strings

Source strings Code strings Source strings Code strings

?

?

(73)

Outline Administrative issues Overview of Contents Compression

Dots and Dashes Codes as Mappings Data Compression

Information vs. Complexity (contd.) Examples (with numbers)

Codes as Mappings

Lossless compression:

injective mapping

Lossy compression:

non-injective mapping

Source strings Code strings

Source strings Code strings Source strings Code strings

Source strings Code strings

?

?

Onlylossless codes areuniquely decodable.

(74)

Outline Administrative issues Overview of Contents Compression

Dots and Dashes Codes as Mappings Data Compression

Information vs. Complexity (contd.) Examples (with numbers)

Examples

general purpose

gzip

bzip

(75)

Outline Administrative issues Overview of Contents Compression

Dots and Dashes Codes as Mappings Data Compression

Information vs. Complexity (contd.) Examples (with numbers)

Examples

general purpose

image

gzip

bzip

png

jpeg

(76)

Outline Administrative issues Overview of Contents Compression

Dots and Dashes Codes as Mappings Data Compression

Information vs. Complexity (contd.) Examples (with numbers)

Examples

music general purpose

image

gzip bzip png jpeg

mp3

(77)

Outline Administrative issues Overview of Contents Compression

Dots and Dashes Codes as Mappings Data Compression

Information vs. Complexity (contd.) Examples (with numbers)

Examples

music video general purpose

image

gzip bzip png jpeg

mp3

mpeg

(78)

Outline Administrative issues Overview of Contents Compression

Dots and Dashes Codes as Mappings Data Compression

Information vs. Complexity (contd.) Examples (with numbers)

Examples

music video general purpose

image

gzip bzip png jpeg

mp3 mpeg

lossless

lossy

(79)

Outline Administrative issues Overview of Contents Compression

Dots and Dashes Codes as Mappings Data Compression

Information vs. Complexity (contd.) Examples (with numbers)

Examples

music video general purpose

image

gzip bzip png jpeg

mp3 mpeg

~ 1 : 2.5

~ 1 : 25

~ 1 : 12

~ 1 : 3

~ 1 : 3.5

~ 1 : 30

lossless

lossy ratio

compression

(80)

Outline Administrative issues Overview of Contents Compression

Dots and Dashes Codes as Mappings Data Compression

Information vs. Complexity (contd.) Examples (with numbers)

Compression

Is it always possible to compress data?

Theorem

The proportion of binary strings compressible by more thank bits is less than 2−k.

Proof. For all n≥1, the number of binary strings of lengthn is 2n.The number of binary code strings of length less thann−k is 20+ 21+ 22+. . .+ 2n−k−1 = 2n−k −1. Thus the ratio is

2n−k −1

2n < 2n−k

2n = 2−k.

(81)

Outline Administrative issues Overview of Contents Compression

Dots and Dashes Codes as Mappings Data Compression

Information vs. Complexity (contd.) Examples (with numbers)

Compression

Is it always possible to compress data?

Theorem

The proportion of binary strings compressible by more thank bits is less than 2−k.

Proof. For all n≥1, the number of binary strings of lengthn is 2n.

The number of binary code strings of length less thann−k is 20+ 21+ 22+. . .+ 2n−k−1 = 2n−k −1. Thus the ratio is

2n−k −1

2n < 2n−k

2n = 2−k.

(82)

Outline Administrative issues Overview of Contents Compression

Dots and Dashes Codes as Mappings Data Compression

Information vs. Complexity (contd.) Examples (with numbers)

Compression

Is it always possible to compress data?

Theorem

The proportion of binary strings compressible by more thank bits is less than 2−k.

Proof. For all n≥1, the number of binary strings of lengthn is 2n.The number of binary code strings of length less thann−k is 20+ 21+ 22+. . .+ 2n−k−1

= 2n−k −1. Thus the ratio is 2n−k −1

2n < 2n−k

2n = 2−k.

(83)

Outline Administrative issues Overview of Contents Compression

Dots and Dashes Codes as Mappings Data Compression

Information vs. Complexity (contd.) Examples (with numbers)

Compression

Is it always possible to compress data?

Theorem

The proportion of binary strings compressible by more thank bits is less than 2−k.

Proof. For all n≥1, the number of binary strings of lengthn is 2n.The number of binary code strings of length less thann−k is 20+ 21+ 22+. . .+ 2n−k−1 = 2n−k −1.

Thus the ratio is 2n−k −1

2n < 2n−k

2n = 2−k.

(84)

Outline Administrative issues Overview of Contents Compression

Dots and Dashes Codes as Mappings Data Compression

Information vs. Complexity (contd.) Examples (with numbers)

Compression

Is it always possible to compress data?

Theorem

The proportion of binary strings compressible by more thank bits is less than 2−k.

Proof. For all n≥1, the number of binary strings of lengthn is 2n.The number of binary code strings of length less thann−k is 20+ 21+ 22+. . .+ 2n−k−1 = 2n−k −1. Thus the ratio is

2n−k −1

2n < 2n−k

2n = 2−k.

(85)

Outline Administrative issues Overview of Contents Compression

Dots and Dashes Codes as Mappings Data Compression

Information vs. Complexity (contd.) Examples (with numbers)

Compression

Is it always possible to compress data?

Theorem

The proportion of binary strings compressible by more thank bits is less than 2−k.

Proof. For all n≥1, the number of binary strings of lengthn is 2n.The number of binary code strings of length less thann−k is 20+ 21+ 22+. . .+ 2n−k−1 = 2n−k −1. Thus the ratio is

2n−k −1

2n < 2n−k

2n = 2−k.

Less than 50 % of files are compressible by more than one bit.

(86)

Outline Administrative issues Overview of Contents Compression

Dots and Dashes Codes as Mappings Data Compression

Information vs. Complexity (contd.) Examples (with numbers)

Compression

Is it always possible to compress data?

Theorem

The proportion of binary strings compressible by more thank bits is less than 2−k.

Proof. For all n≥1, the number of binary strings of lengthn is 2n.The number of binary code strings of length less thann−k is 20+ 21+ 22+. . .+ 2n−k−1 = 2n−k −1. Thus the ratio is

2n−k −1

2n < 2n−k

2n = 2−k.

Less than 1 % of files are compressible by more than 7 bits.

(87)

Outline Administrative issues Overview of Contents Compression

Dots and Dashes Codes as Mappings Data Compression

Information vs. Complexity (contd.) Examples (with numbers)

Compression

Is it always possible to compress data?

Theorem

The proportion of binary strings compressible by more thank bits is less than 2−k.

Proof. For all n≥1, the number of binary strings of lengthn is 2n.The number of binary code strings of length less thann−k is 20+ 21+ 22+. . .+ 2n−k−1 = 2n−k −1. Thus the ratio is

2n−k −1

2n < 2n−k

2n = 2−k.

Less than 0.000000000000000000000000000001 % of files are compressible by 100 bits.

(88)

Outline Administrative issues Overview of Contents Compression

Dots and Dashes Codes as Mappings Data Compression

Information vs. Complexity (contd.) Examples (with numbers)

How is it possible?

Why was the compression ratio greater than one in all the examples we saw?

What are those rare files that are compressible?

Why are the files we use in practice so often compressible?

(89)

Outline Administrative issues Overview of Contents Compression

Dots and Dashes Codes as Mappings Data Compression

Information vs. Complexity (contd.) Examples (with numbers)

Compression

echo <x> | gzip - | wc -c # multiply by 8 for bits

Source string, x `(C(x)) ratio

aaa. . .a (10000×a) 368 27.2 : 1.

aabaabbbbabbbbb. . . (10000 random letters) 13456 0.74 : 1 abababab. . .ab (5000×ab) 368 27.2 : 1

aaa. . .abbb. . .b (5000×a,5000×b) 376 26.6 : 1

abbaababba. . . (1000×abbaababba) 488 20.5 : 1

Strings following a rule are compressible?

(90)

Outline Administrative issues Overview of Contents Compression

Dots and Dashes Codes as Mappings Data Compression

Information vs. Complexity (contd.) Examples (with numbers)

Compression

echo <x> | gzip - | wc -c # multiply by 8 for bits

Source string, x `(C(x)) ratio

aaa. . .a (10000×a) 368 27.2 : 1.

aabaabbbbabbbbb. . . (10000 random letters) 13456 0.74 : 1

abababab. . .ab (5000×ab) 368 27.2 : 1

aaa. . .abbb. . .b (5000×a,5000×b) 376 26.6 : 1

abbaababba. . . (1000×abbaababba) 488 20.5 : 1

Strings following a rule are compressible?

(91)

Outline Administrative issues Overview of Contents Compression

Dots and Dashes Codes as Mappings Data Compression

Information vs. Complexity (contd.) Examples (with numbers)

Compression

echo <x> | gzip - | wc -c # multiply by 8 for bits

Source string, x `(C(x)) ratio

aaa. . .a (10000×a) 368 27.2 : 1.

aabaabbbbabbbbb. . . (10000 random letters) 13456 0.74 : 1 abababab. . .ab (5000×ab) 368 27.2 : 1

aaa. . .abbb. . .b (5000×a,5000×b) 376 26.6 : 1

abbaababba. . . (1000×abbaababba) 488 20.5 : 1

Strings following a rule are compressible?

(92)

Outline Administrative issues Overview of Contents Compression

Dots and Dashes Codes as Mappings Data Compression

Information vs. Complexity (contd.) Examples (with numbers)

Compression

echo <x> | gzip - | wc -c # multiply by 8 for bits

Source string, x `(C(x)) ratio

aaa. . .a (10000×a) 368 27.2 : 1.

aabaabbbbabbbbb. . . (10000 random letters) 13456 0.74 : 1 abababab. . .ab (5000×ab) 368 27.2 : 1

aaa. . .abbb. . .b (5000×a,5000×b) 376 26.6 : 1

abbaababba. . . (1000×abbaababba) 488 20.5 : 1

Strings following a rule are compressible?

Viittaukset

LIITTYVÄT TIEDOSTOT

It is easily seen that all prefix codes are uniquely decodable: each symbol can be decoded as soon as its codeword is read.. Outline Codes

Note that since Shannon-Fano gives E [`(X )] ≤ H(X ) + 1, and Huffman is optimal, Huffman must satisfy the same bound. Jyrki Kivinen

1 We know (think) that the source symbols are generated by a Bernoulli model with parameter p ∈ [0, 1].. 2 We’d like to encode data at

2 MDL Principle Rules &amp; Exceptions Probabilistic Models Old-Style MDL Modern MDL.. Jyrki Kivinen

Encoding Continuous Data Differential Entropy Linear Regression Subset Selection Problem Wavelet Denoising..

To summarize, the prefix property is reasonable from a practical point of view, and from a theoretical point of view can be assumed without increasing input lengths too much....

Key ideas: motivation and basic definition universal and prefix machines invariance theorem, uncomputability basic upper bounds and examples More technical: proof of

Find out what compression algorithms are used in the gadgets you use—DVD player, digital TV, CD, iPod, cell-phone, laptop (WinZip), etc.. You can simply list the names of