This document has been downloaded from
TamPub โ The Institutional Repository of University of Tampere
Publisher's version
The permanent address of the publication is http://urn.fi/URN:NBN:fi:uta-201603161319
Author(s): Zolotavkin, Yevhen; Juhola, Martti
Title: A New Scalar Quantization Method for Digital Image Watermarking
Year: 2016
Journal Title: Journal of Electrical and Computer Engineering
Pages: 1-13
ISSN: 2090-0147
Discipline: Mathematics; Computer and information sciences;
Electronic, automation and communications engineering, electronics
School /Other Unit: School of Information Sciences Item Type: Journal Article
Language: en
DOI: http://dx.doi.org/10.1155/2016/9029745
URN: URN:NBN:fi:uta-201603161319
Additional Information: Artikkelinumero: 9029745
All material supplied via TamPub is protected by copyright and other intellectual
property rights, and duplication or sale of all part of any of the repository collections
is not permitted, except that material may be duplicated by you for your research use
or educational purposes in electronic or print form. You must obtain permission for
any other use. Electronic or print copies may not be offered, whether for sale or
otherwise to anyone who is not an authorized user.
Research Article
A New Scalar Quantization Method for Digital Image Watermarking
Yevhen Zolotavkin and Martti Juhola
Computer Science, School of Information Sciences, University of Tampere, Kanslerinrinne 1, 33014 Tampere, Finland
Correspondence should be addressed to Yevhen Zolotavkin; zhzolot@countermail.com Received 7 October 2015; Revised 18 January 2016; Accepted 24 January 2016
Academic Editor: Mazdak Zamani
Copyright ยฉ 2016 Y. Zolotavkin and M. Juhola. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
A new technique utilizing Scalar Quantization is designed in this paper in order to be used for Digital Image Watermarking (DIW).
Efficiency of the technique is measured in terms of distortions of the original image and robustness under different kinds of attacks, with particular focus on Additive White Gaussian Noise (AWGN) and Gain Attack (GA). The proposed technique performance is affirmed by comparing with state-of-the-art methods including Quantization Index Modulation (QIM), Distortion Compensated QIM (DC-QIM), and Rational Dither Modulation (RDM). Considerable improvements demonstrated by the method are due to a new form of distribution of quantized samples and a procedure that recovers a watermark after GA. In contrast to other known quantization methods, the detailed method here stipulates asymmetric distribution of quantized samples. This creates a distinctive feature and is expressed numerically by one of the proposed criteria. In addition, several realizations of quantization are considered and explained using a concept of Initial Data Loss (IDL) which helps to reduce watermarking distortions. The procedure for GA recovery exploits one of the two criteria of asymmetry. The accomplishments of the procedure are due to its simplicity, computational lightness, and sufficient precision of estimation of unknown gain factor.
1. Introduction
In modern communications, multimedia plays significant role. Ownership of multimedia data is important and needs to be protected [1]. As a part of nowadays popular multimedia content, digital images are an important class. A protection of digital rights of an owner is implemented by Digital Image Watermarking (DIW). A watermark that is inserted into an image has to be robust [2] as well as invisible [3].
Among the popular and efficient techniques in DIW, Quantization Index Modulation (QIM) is widely used in blind watermarking where neither original media nor water- mark is known to the receiver [4, 5]. One of the aspects of robustness of QIM is evaluated by attacking a watermarked image with Additive White Gaussian Noise (AWGN). Unfor- tunately, all the known on practice implementations of QIM are far from achieving the channel capacity limit that was first derived in [6].
Several different QIM-related approaches are known.
Some state-of-the-art realizations will be outlined briefly.
According to QIM, intervals of equal lengthฮare mapped
on the real number line. The oldest known approach is to replace all the original coefficients inside every interval with one of the two endpoints of that interval. The selection of the endpoint depends on a bit of a watermark [7]. The main disadvantage is that for high intensity of noise and the capacity of the oldest QIM is much lower than the theoretical limit. In a more advanced realization of DC-QIM, coefficients from every original interval are mapped into two disjoint subintervals. The gap between the subintervals is controlled by parameter๐ผ,0 โค ๐ผ โค 1[8]. Assuming that initial dis- tribution inside original interval and target distributions in subintervals are uniform, the mapping in accordance to DC- QIM is optimal in terms of Mean Square Error (MSE) of quantization. In order to maximize capacity for a given MSE under AWGN of different intensity, parametersฮ,๐ผhave to be adjusted. Nevertheless, the limit defined in [6] is still well above the one achievable by DC-QIM.
Not all the original coefficients in each interval need to be quantized. This idea has been explored by the authors of Forbidden Zone Data Hiding (FZDH) [9]. Another idea was proposed by the authors of Thresholded Constellation
Volume 2016, Article ID 9029745, 16 pages http://dx.doi.org/10.1155/2016/9029745
Modulation (TCM) that uses two different quantization rules to modify coefficients inside the original interval [10].
Despite sufficient robustness of QIM under AWGN, the limitation is that synchronization is required in order to reconstruct intervals that are necessary to extract (or decode) a watermark. A type of distortion which scales all the watermarked coefficients is called Gain Attack (GA).
The scaling factor might be close to 1 and cause very little visual distortion, but it is unknown to the receiver which causes asynchronous extraction. Retrieval of the watermark is usually complicated by AWGN that follows GA [11].
Improvement of QIM performance under GA is the task of numerous known approaches [12]. Most of them can be classified into two groups where the main idea of the first group is to estimate the unknown factor [13] while the idea of the second is to quantize coefficients of a different kind that are invariant to scaling of original signal.
The solution proposed in [11] contributes to robustness enhancement in case of GA and a constant offset attack followed by AWGN. A pilot signal is embedded for this pur- pose. Fourier analysis is used during extraction to estimate the gain factor and the offset. Another method of recovery after GA and AWGN is proposed in [14]. It uses information about dither sequence and applies Maximum Likelihood (ML) procedure to estimate the scaling factor.
Watermarking that is invariant to GA demands more complex transform of original signal (e.g., nonlinear) to obtain coefficients. One of the most popular watermarking methods in that category is Rational Dither Modulation (RDM) [15]. For a particular coefficient, a ratio that depends on a norm of other coefficients is being quantized instead of a coefficient itself. In order to quantize the ratio, RDM utilizes the simplest QIM scheme. This implies that the performance of RDM under AWGN (without GA) is close to the simplest QIM. Among others recent blind watermarking methods robust to GA are, for example, detailed in [16โ18].
A new scalar QIM-based watermarking method is pro- posed in this paper. It provides high robustness under conditions of AWGN and GA. Among the new features of the method are IDL and a new form of distribution of quantized samples.
The organization of the rest of the paper is as follows.
Section 2 explains the choice of the distribution of quan- tized samples and contains description of the procedure of recovery after GA. Concept of IDL and quantization model are described in Section 3 using formal logic approach. The aspects of analytic-based estimation of robustness under AWGN are discussed in Section 4. Next, Section 5 contains experimental results obtained under AWGN and GA. Dis- cussion of the details of the experiment and comparison of the performance are given in Section 6. Section 7 concludes the paper. The list of the key variables and their meaning is given in Nomenclature section.
2. Distribution of Quantized Samples and Procedure for Recovery after GA
An asymmetric distribution of quantized samples is proposed and parametrized in this section. Asymmetry is the quality
that can be easily expressed quantitatively. Under symmetric attack, like AWGN, such quantitative index remains suffi- ciently indicative. On the other hand, it can be affected by GA. Such semifragility is favorable for restoration of the right condition for decoding. The restoration is done by the procedure for recovery after GA which uses criterion of asymmetry. Compared to the known estimation procedures [14], the one proposed in this section depends on a single variable which is the unknown gain factor. This makes the technique simple and more precise.
For encoding, in our case, asymmetric distribution re- quires substantially more variables for description compared to common QIM methods. Because of that, it is advisable to refer to Nomenclature section.
2.1. Distribution of Quantized Samples. Symbolฮฃwill be used to denote a random variable whose domain is the space of original coefficients of a host. A particular realization ofฮฃ will be denoted as๐. We will further consider manipulation of original values๐ that are in some๐th interval of sizeฮ and its left endpoint is๐ฮ๐. Such an interval is referred further as embedding interval. For any๐ โ [๐๐ฮ, ๐๐ฮ + ฮ]we define ๐ฅ = ๐ โ ๐ฮ๐ and๐will be used to denote a random variable which represents๐ฅ. The value ofฮshould be small enough so that the distribution of๐can be considered uniform. A random variable that represents quantized coefficients inside ๐th interval is denoted as๐๓ธ and its realization is denoted as ๐ฅ๓ธ . Each pair of an original๐ฅand corresponding quantized ๐ฅ๓ธ belongs to the same๐th embedding interval so that an absolute shift is never larger than ฮ. Correspondingly, a random variable that represents quantized coefficients on the whole real number line is denoted asฮฃ๓ธ and its realization is denoted as๐๓ธ .
In order to provide efficient recovery after GA, we propose the following asymmetric distribution of quantized samples๐ฅ๓ธ inside๐th embedding interval (Figure 1(a)):
๐ (๐ฅ๓ธ ) = {{ {{ {{ {{ {
(๐พ0+ ๐1) ๐0(๐ฅ๓ธ ) , if๐ฅ๓ธ โ [0, ฮ (๐ฝ โ ๐ผ)] , (๐1+ ๐0) ๐1(๐ฅ๓ธ ) , if๐ฅ๓ธ โ [ฮ๐ฝ, ฮ] ,
0, otherwise,
(1)
where๐0(๐ฅ๓ธ )and๐1(๐ฅ๓ธ )are two different kinds of truncated distributions defined as
๐0(๐ฅ๓ธ ) ={ {{
๐๐ฅ๓ธ + ๐, if ๐ฅ๓ธ โ [0, ฮ (๐ฝ โ ๐ผ)] ,
0, otherwise, (2)
๐1(๐ฅ๓ธ ) ={ {{
๐, if ๐ฅ๓ธ โ [ฮ๐ฝ, ฮ] ,
0, otherwise. (3)
The other parameters are constrained in the following way:
0 โค ๐ผ โค ๐ฝ โค 1, ๐พ0 + ๐0 + ๐1 + ๐1 = 1 (see Nomenclature section). The meaning of parameters๐พ0,๐0,๐1, ๐1will be discussed later in Section 3. In Figure 1(b) we can see the distribution of the quantized coefficients outside๐th embedding interval as well.
x๓ณฐ f0(x๓ณฐ)
f1(x๓ณฐ) ๐ผฮ
๐ฝฮ ฮ
โ1โ ๐
lฮk lฮk+ ฮ ๐๓ณฐ
โ0โ
(a)
๐๓ณฐ
โ0โ
โ1โ
โ0โ
โ1โ
โ0โ
โ1โ Th Th
k โ 1 k k + 1 k + 2 k + 3
(b)
Figure 1: Distribution of the quantized coefficients: (a) Inside๐th embedding interval. (b) In five consecutive intervals.
2.2. Procedure for GA Recovery. It is assumed that under GA the original length of embedding interval ฮ is altered by unknown gain factor๐and the resulting length isฬฮ = ๐ฮ.
In addition to that, AWGN attack is applied. The procedure for GA recovery is the estimator whose result is based on a criterion having higher values for the right length ฬฮ of embedding interval. The uniqueness of the distribution of quantized samples is exploited by two different criteria ๐ถ1 and ๐ถ2. The procedure itself represents a brute force approach that substitutes guessed values ฬฮ๓ธ of the length of embedding interval into a criterion. Guessed value of ฬฮ๓ธ which maximizes it (๐ถ1or๐ถ2) should be selected:
ฬฮ๓ธ ๓ธ =arg max
{ฬฮ๓ธ } ๐ถ1,2(ฬฮ๓ธ ) , (4) whereฬฮ๓ธ ๓ธ is the final output of the procedure. Some interval [ฬฮ๓ธ min, ฬฮ๓ธ max] for guessed values ฬฮ๓ธ should be defined in advance. For instance,ฬฮ๓ธ min = 0.9ฮand ฬฮ๓ธ max = 1.1ฮworks well in most cases because the diapason of scaling factor๐is quite limited on practice.
For each particular valueฬฮ๓ธ , the index defined according to the criterion is calculated by projecting noisy quantized samples๐๐๓ธ on a single embedding interval:
๐ฅ๓ธ ๐
= {{ {{ {{ {{ {
๐๐๓ธ mod ฬฮ๓ธ , if [[ [ [
๐๓ธ ๐โ ๐๐ฮ
ฬฮ๓ธ ]] ] ]
mod2 = 0,
ฬฮ๓ธ โ (๐๐๓ธ modฬฮ๓ธ ) , otherwise.
(5)
This is needed to reconstruct the distribution of quantized samples inside embedding interval.
Two criteria are proposed for the assessment of the distribution of random variable๐๓ธ ๐ โ [0, ฬฮ๓ธ ](subscript โ๐โ
means affected by noise):
๐ถ1(ฬฮ๓ธ ) =๓ตจ๓ตจ๓ตจ๓ตจ
๓ตจ๓ตจ๓ตจ๓ตจ๓ตจ๓ตจ๓ตจ
median(๐๓ธ ๐)
ฬฮ๓ธ โ 0.5๓ตจ๓ตจ๓ตจ๓ตจ
๓ตจ๓ตจ๓ตจ๓ตจ๓ตจ๓ตจ๓ตจ, ๐ถ2(ฬฮ๓ธ ) =๓ตจ๓ตจ๓ตจ๓ตจ
๓ตจ๓ตจ๓ตจ๓ตจ๓ตจ๓ตจ๓ตจ๓ตจ
๓ตจ
๐๐ค(๐๓ธ ๐) (ฬฮ๓ธ )๐ค
๓ตจ๓ตจ๓ตจ๓ตจ๓ตจ๓ตจ๓ตจ๓ตจ
๓ตจ๓ตจ๓ตจ๓ตจ๓ตจ
, ๐ค = 2๐ + 1, ๐ โN.
(6)
Here, ๐๐ค is the ๐คth central moment. Odd moments are zero for symmetric distributions, but for asymmetric distributions their values can be sufficiently large. If the assumption aboutฬฮis wrong, then the values of both criteria are low. In that case the distribution of๐๓ธ ๐ is very close to uniform (which is symmetric). This is because of the effect caused by GA on calculation of๐ฅ๓ธ ๐in (5). Nevertheless, the distribution of๐๓ธ ๐demonstrates asymmetry in caseฬฮ๓ธ is close to ฬฮ. The explanation is that the distribution of quantized samples inside embedding interval (before GA is introduced) is indeed asymmetric. In spite of utilization of brute force optimization, the procedure is simple and the computational demand is low. On practice, the number of brute force steps is much smaller than the number of quantized elements.
Therefore, the complexity is๐(๐)in that case. For instance, for recovery with high accuracy it is enough to perform 103 brute force steps with values from the interval[ฬฮ๓ธ min, ฬฮ๓ธ max].
3. Quantization
A quantization model is introduced in this section. In order to represent it in a compact form, we combine all the quantization conditions in a single logical expression.
Previously proposed distribution of quantized samples is assured. However, additional parameter of the quantization model implies different distribution of the samples associated with labels โ0โ and โ1.โ
3.1. Two Approaches for Quantization. Quantized samples are modified according to the model described in this subsection.
A watermark bit is denoted as๐. Each sample with value๐ฅ inside๐th embedding interval has index๐ โ N according to its order in the host sequence. During watermarking a bit is assigned to each index๐. Different frameworks might be used for description of the quantization model. We will use first order predicate logic to describe our approach. This choice can be reasoned as follows. A closed-from expression has to be defined for quantization and it is important to show that the derived solution minimizes MSE between initial and target distribution. The kind of proposed target distribution is not common for QIM-based watermarking methods. Therefore, we find it necessary to explain in detail the process of derivation of quantization expression. Also, samples interpreting โ0โ should be quantized in a different way to samples interpreting โ1.โ Predicate logic is a suitable
lฮk+ L1 lฮk+ L2
lkฮ lฮk+ ฮ
๐ ๐1+ ๐1
ฮ 1
ฮ
๐พ0+ ๐0
ฮ NIDL0
NIDL1 IDL1
IDL0
Figure 2: Scheme of labeling and distribution of original samples prior to quantization.
tool for description of embedding because logical construc- tion can incorporate all the possible quantization conditions in a compact form.
Two-place predicate ๐ธ is to denote correspondence between some index and the value of coefficient. For example, ๐ธ๐๐ฅis true if a coefficient with order๐has value๐ฅ. We will further use notation of the setE which contains all the pairs (๐ฅ, ๐)that provide true value of๐ธ๐๐ฅ. One-place predicate๐ตis to denote bit value assigned to a coefficient with particular index. For instance, ๐ต๐ is true if watermark bit๐ = 1 is assigned to a coefficient with index๐andโผ ๐ต๐is true if๐ = 0.
Two-place predicates๐0 or๐1 will be used to define that some๐th sample with value๐ฅhas label โ0โ or โ1,โ respectively:
(๐0๐๐ฅ โก (๐ธ๐๐ฅ&โผ ๐ต๐)) , (โ๐) (โ๐ฅ) , (7) (๐1๐๐ฅ โก (๐ธ๐๐ฅ&๐ต๐)) , (โ๐) (โ๐ฅ) . (8) SetsX0 and X1 contain all the pairs(๐ฅ, ๐) that provide true values of๐0๐๐ฅand๐1๐๐ฅ, respectively. Initial PDFs of๐ insideX0,X1, andE are considered to be uniform:๐X0(๐ฅ) = ๐X1(๐ฅ) = ๐E(๐ฅ) = 1/ฮ(Figure 2).
Also, each coefficient is labeled either as IDL or non- IDL depending on its value๐ฅand index๐. Samples labeled as IDL are quantized in a different way which reduces the total embedding distortion. Both types of coefficients (IDL and non-IDL) are being modified during quantization.
However, after quantization, interpretation of a bit of each IDL coefficient is incorrect. The purpose of quantization is to provide that all the non-IDL samples can be extracted correctly and the resulting distribution of all the samples is the one depicted in Figure 1(a). Parameters ๐1 and ๐0 represent fractions of IDL for๐ = 1and๐ = 0, respectively.
Parameters๐1and๐พ0represent fractions of non-IDL samples for๐ = 1and๐ = 0, respectively. The fraction of zeros in a watermark data is๐พ0+ ๐0and fraction of ones is๐1+ ๐1. It is required that๐พ0+ ๐0+ ๐1+ ๐1= 1.
We define IDL and non-IDL samples using two-place predicates IDL0, IDL1, NIDL0, and NIDL1 in the following way (Figure 2):
(IDL0๐๐ฅ โก (๐0๐๐ฅ&(๐ฅ > ๐ฟ1))) , (โ๐) (โ๐ฅ) , (IDL1๐๐ฅ โก (๐1๐๐ฅ&(๐ฅ < ๐ฟ2))) , (โ๐) (โ๐ฅ) , (NIDL0๐๐ฅ โก (๐0๐๐ฅ&(๐ฅ โค ๐ฟ1))) , (โ๐) (โ๐ฅ) , (NIDL1๐๐ฅ โก (๐1๐๐ฅ&(๐ฅ โฅ ๐ฟ2))) , (โ๐) (โ๐ฅ) ,
(9)
where๐ฟ1= ฮ๐พ0/(๐พ0+ ๐0),๐ฟ2= ฮ๐1/(๐1+ ๐1), and๐ฟ1โฅ ๐ฟ2.
SetsIDL0,IDL1,NIDL0, andNIDL1will be used in order to specify all the coefficients that satisfy IDL0, IDL1, NIDL0, and NIDL1, respectively. Fractions๐พ0,๐0,๐1, and๐1 can be expressed in terms of cardinalities of setsIDL0,IDL1,NIDL0, NIDL1, and E. For example,|IDL0|/|E| = ๐0.
In this paper, two different quantization techniques are proposed. Since predicate logic is used to describe watermark embedding, a suitable logical construction should be able to distinguish between the techniques. According to our model, each kind of quantization can be represented by setting a corresponding logical value (โ0โ or โ1โ) for zero-place predicateฮฉ. Hence,ฮฉis used to define one out of two possible quantization techniques. For each kind of quantization,E is split on two subsetsE0andE1. For two-place predicates๐ธ0 and๐ธ1formulas๐ธ0๐๐ฅand๐ธ1๐๐ฅare defined in the following way:
(๐ธ0๐๐ฅ
โก (NIDL0๐๐ฅ โจ (IDL1๐๐ฅ&ฮฉ) โจ (IDL0๐๐ฅ&โผ ฮฉ))) , (โ๐) (โ๐ฅ) ,
(10)
(๐ธ1๐๐ฅ โก (๐ธ๐๐ฅ&โผ ๐ธ0๐๐ฅ)) , (โ๐) (โ๐ฅ) . (11) Using information about distribution insideIDL0,IDL1, NIDL0, andNIDL1 it is easy to derive distribution inside E0 andE1. Let us introduce variable๐ โ {0, 1}of natural numbers domainN (not a logical variable) which satisfies (ฮฉ โ (๐ = 1))&(โผ ฮฉ โ (๐ = 0)). Common arithmetical operations can be performed with๐which makes it possible to express PDF๐E0(๐ฅ)in the following compact form:
๐E0(๐ฅ)
= {{ {{ {{ {{ {{ {{ {{ {{ {
(๐พ0+ ๐0) ๐X0(๐ฅ) + ๐ (๐1+ ๐1) ๐X1(๐ฅ)
๐ท๐0 , if๐ฅ โค ๐ฟ2, (๐พ0+ ๐0) ๐X0(๐ฅ)
๐ท๐0 , if๐ฟ2< ๐ฅ โค ๐ฟ1, (1 โ ๐) (๐พ0+ ๐0) ๐X0(๐ฅ)
๐ท๐0 , otherwise,
(12)
where๐ท๐0= (๐๐1+ ๐พ0+ (1 โ ๐)๐0).
Therefore๐E1(๐ฅ)can be expressed as (Figures 3 and 4) ๐E1(๐ฅ) =๐E(๐ฅ) โ ๐ท๐0๐E0(๐ฅ)
1 โ ๐ท๐0 . (13)
Elements of setsE0andE1are modified during quanti- zation so that new setsE๓ธ 0andE๓ธ 1are obtained, respectively.
lkฮ
lฮk
lkฮ
๐๓ณฐ ๐๓ณฐ
๐
E๓ณฐ0 E0
E๓ณฐ1
E1
(๐พ0+ ๐1)f0(x๓ณฐ)
(๐1+ ๐0)f1(x๓ณฐ)
E0โE๓ณฐ0
E1โE๓ณฐ1
Figure 3: Scheme of redistribution of original samples during quantization,ฮฉis โtrue.โ
lkฮ
lkฮ
lkฮ ๐๓ณฐ
๐๓ณฐ
E๓ณฐ0 E0
E๓ณฐ1
E1
๐0f1(x๓ณฐ) ๐1f1(x๓ณฐ)
๐พ0f0(x๓ณฐ) ๐1f0(x๓ณฐ)
E0โE๓ณฐ0
E1โE๓ณฐ1
Figure 4: Scheme of redistribution of original samples during quantization,ฮฉis โfalse.โ
Therefore, for successful quantization, we require the follow- ing formula๐น1to be true:
๐น1 โก ((๐ธ0๐๐ฅ โ ๐ธ๓ธ 0๐๐ฅ๓ธ )&(๐ธ1๐๐ฅ โ ๐ธ๓ธ 1๐๐ฅ๓ธ )) ,
(โ๐) (โ๐ฅ) (โ๐ฅ๓ธ ) . (14)
As a result of quantization, variables ๐E0 and ๐E1 are modified in a way that the resulting๐๓ธ E๓ธ
0 and ๐๓ธ E๓ธ
1 are dis- tributed according to some desired distributions. For each kind of quantization (depending on the value ofฮฉ), the pair of desired distributions is different. We propose the following distributions that can be expressed as (Figures 3 and 4)
๐E๓ธ 0(๐ฅ๓ธ ) = ๐๐0(๐ฅ๓ธ ) + (1 โ ๐)๐พ0๐0(๐ฅ๓ธ ) + ๐0๐1(๐ฅ๓ธ ) ๐พ0+ ๐0 , ๐E๓ธ 1(๐ฅ๓ธ ) = ๐๐1(๐ฅ๓ธ ) + (1 โ ๐)๐1๐0(๐ฅ๓ธ ) + ๐1๐1(๐ฅ๓ธ ) ๐1+ ๐1 .
(15)
It can be seen that, for any logical value ofฮฉ, the distribution of๐๓ธ inside{E0โชE1}is the same and matches the distribution represented in Figure 1. It means that the efficiency of the procedure of GA recovery (proposed in the previous section) cannot be affected by the selection ofฮฉ.
In addition to the necessity of providing desired dis- tribution of the quantized samples, we need to minimize quantization distortions. Both requirements can be expressed by two two-place predicates๐and๐:
(๐ธ๓ธ 0๐๐ฅ๓ธ โก ๐ธ0๐๐ฅ&๐๐ฅ๐ฅ๓ธ ) , (โ๐) (โ๐ฅ) (โ๐ฅ๓ธ ) ,
(๐ธ๓ธ 1๐๐ฅ๓ธ โก ๐ธ1๐๐ฅ&๐๐ฅ๐ฅ๓ธ ) , (โ๐) (โ๐ฅ) (โ๐ฅ๓ธ ) . (16) The idea of minimization of embedding distortions can be explained in the following example. Assuming two samples ๐ฅ๐, ๐ฅ๐ โ E0, ๐ฅ๐ โค ๐ฅ๐, we infer that quantization in a way in which๐ฅ๓ธ ๐ โค ๐ฅ๓ธ ๐ implies less distortion than in case when ๐ฅ๓ธ ๐ > ๐ฅ๓ธ ๐. Let us sort elements inE0andE๓ธ 0in the dimension of ๐ฅand๐ฅ๓ธ , respectively. Then, for some๐ฅ๐(index๐is an order in a host sequence) the number of elements inE0with๐ฅvalue less than๐ฅ๐should be equal to the number of elements inE๓ธ 0 that have๐ฅ๓ธ value less than๐ฅ๓ธ ๐. Integration should be used in case we switch from discrete distribution of samples inE0 andE๓ธ 0to continuous one. Further, throughout the paper we assume that the constant of integration is zero for indefinite
integrals. Hence, the truth values for both predicates๐and๐ are defined as
(๐๐ฅ๐ฅ๓ธ โก (โซ ๐E0(๐ฅ) ๐๐ฅ = โซ ๐E๓ธ 0(๐ฅ๓ธ ) ๐๐ฅ๓ธ )) ,
(โ๐ฅ) (โ๐ฅ๓ธ ) , (17)
(๐๐ฅ๐ฅ๓ธ โก (โซ ๐E1(๐ฅ) ๐๐ฅ = โซ ๐E๓ธ 1(๐ฅ๓ธ ) ๐๐ฅ๓ธ )) ,
(โ๐ฅ) (โ๐ฅ๓ธ ) . (18)
Further, we introduce logical formula๐น2
๐น2 โก ((โ๐ฅ๓ธ ) ๐๐ฅ๐ฅ๓ธ &(โ๐ฅ๓ธ ) ๐๐ฅ๐ฅ๓ธ ) , (โ๐ฅ) (19) and state that argument
๐น2, (11) , (16) โจ ๐น1 (20) is valid. The task of watermark embedding is to assure that the mentioned argument is sound. For that purpose, a procedure that makes๐น2true should be proposed.
3.2. Quantization Equations. Quantization equations and their solutions are needed to satisfy formula ๐น2 during embedding. For this purpose, we will analyze conditions that enforce qualities of predicates ๐ and ๐. Due to the large number of variables in the text we recommend to refer to Nomenclature section for clarity. We can rewrite elements of (17) in the following way:
โซ ๐E0(๐ฅ) ๐๐ฅ = {{ {{ {{ {
min(๐ฅ, ๐ฟ2) ๐ (๐1+ ๐1) + ๐ฅ (๐พ0+ ๐0)
ฮ๐ท๐0 , if๐ฅ โค ๐ฟ1;
๐ +(1 โ ๐) ๐ฅ (๐พ0+ ๐0)
ฮ๐ท๐0 , otherwise,
โซ ๐E๓ธ 0(๐ฅ๓ธ ) ๐๐ฅ๓ธ = {{ {{ {{ {
(๐ + ๐พ0 1 โ ๐
๐พ0+ ๐0) โซ ๐0(๐ฅ๓ธ ) ๐๐ฅ๓ธ , if๐ฅ๓ธ โค ฮ๐ฝ;
(๐ + ๐พ0 1 โ ๐
๐พ0+ ๐0) + ๐0 1 โ ๐
๐พ0+ ๐0 (โซ ๐1(๐ฅ๓ธ ) ๐๐ฅ๓ธ + โซ0
ฮ๐ฝ๐1(๐ฅ๓ธ ) ๐๐ฅ๓ธ ) , otherwise.
(21)
From (21) it is clear that
โซ๐ฟ1
0 ๐E0(๐ฅ) ๐๐ฅ = โซฮ(๐ฝโ๐ผ)
0 ๐E๓ธ 0(๐ฅ๓ธ ) ๐๐ฅ๓ธ
= ๐ + ๐พ0 1 โ ๐
๐พ0+ ๐0. (22) The equation above means that the following is true:
(๐๐ฅ๐ฅ๓ธ โ (((๐ฅ โค ๐ฟ1)&(๐ฅ๓ธ โค ฮ๐ฝ))
โจ ((๐ฅ > ๐ฟ1)&(๐ฅ๓ธ > ฮ๐ฝ)))) , (โ๐ฅ) (โ๐ฅ๓ธ ) .
(23)
We introduce two two-place predicates๐1and๐2: (((๐๐ฅ๐ฅ๓ธ &(๐ฅ โค ๐ฟ1)&(๐ฅ๓ธ โค ฮ๐ฝ)) โก ๐1๐ฅ๐ฅ๓ธ )
&((๐๐ฅ๐ฅ๓ธ &(๐ฅ > ๐ฟ1)&(๐ฅ๓ธ > ฮ๐ฝ)) โก ๐2๐ฅ๐ฅ๓ธ )) , (โ๐ฅ) (โ๐ฅ๓ธ ) .
(24)
According to (21) and (24) the following can be derived:
(๐1๐ฅ๐ฅ๓ธ โก (ฮฅ1(๐ฅ, ๐, ๐พ0, ๐0, ๐1, ๐1) = 0.5๐๐ฅ๓ธ 2+ ๐๐ฅ๓ธ )) ,
(โ๐ฅ) (โ๐ฅ๓ธ ) , (๐2๐ฅ๐ฅ๓ธ โก (ฮฅ2(๐ฅ, ๐, ๐พ0, ๐0, ๐1, ๐1) = ๐ (๐ฅ๓ธ โ ฮ๐ฝ))) ,
(โ๐ฅ) (โ๐ฅ๓ธ ) , (25) where
ฮฅ1(๐ฅ, ๐, ๐พ0, ๐0, ๐1, ๐1)
= (๐พ0+ ๐0)min(๐ฅ, ๐ฟ2) ๐ (๐1+ ๐1) + ๐ฅ (๐พ0+ ๐0) ฮ๐ท๐0(๐พ0+ ๐๐0) , ฮฅ2(๐ฅ, ๐, ๐พ0, ๐0, ๐1, ๐1) = ๐ฅ (๐พ0+ ๐0)2โ ๐พ0ฮ๐ท๐0
๐0ฮ๐ท๐0 . (26) Now, let us analyze conditions that enforce quality of predi- cate๐. Elements of (18) can be represented as
โซ ๐E1(๐ฅ) ๐๐ฅ = {{ {{ {{ {{ {
(1 โ ๐) ๐ฅ (๐1+ ๐1)
ฮ (1 โ ๐ท๐0) , if๐ฅ โค ๐ฟ2;
max(๐ฅ โ ๐ฟ1, 0) ๐ (๐พ0+ ๐0) + (๐ฅ โ ๐ฟ2) (๐1+ ๐1)
ฮ (1 โ ๐ท๐0) , otherwise,
โซ ๐E๓ธ 1(๐ฅ๓ธ ) ๐๐ฅ๓ธ = {{ {{ {{ {
(1 โ ๐) ๐1
๐1+ ๐1 โซ ๐0(๐ฅ๓ธ ) ๐๐ฅ๓ธ , if ๐ฅ๓ธ โค ฮ๐ฝ;
(1 โ ๐) ๐1
๐1+ ๐1 + (๐ + ๐1 1 โ ๐
๐1+ ๐1) (โซ ๐1(๐ฅ๓ธ ) ๐๐ฅ๓ธ + โซ0
ฮ๐ฝ๐1(๐ฅ๓ธ ) ๐๐ฅ๓ธ ) , otherwise.
(27)
We can see that according to (25)
โซ๐ฟ2
0 ๐E1(๐ฅ) ๐๐ฅ = โซฮ(๐ฝโ๐ผ)
0 ๐E๓ธ 1(๐ฅ๓ธ ) ๐๐ฅ๓ธ = (1 โ ๐) ๐1 ๐1+ ๐1 . (28) This means that the following expression is true:
(๐๐ฅ๐ฅ๓ธ โ (((๐ฅ โค ๐ฟ2)&(๐ฅ๓ธ โค ฮ๐ฝ))
โจ ((๐ฅ > ๐ฟ2)&(๐ฅ๓ธ > ฮ๐ฝ)))) , (โ๐ฅ) (โ๐ฅ๓ธ ) .
(29)
Next, two two-place predicates๐1and๐2are defined as (((๐๐ฅ๐ฅ๓ธ &(๐ฅ โค ๐ฟ2)&(๐ฅ๓ธ โค ฮ๐ฝ)) โก ๐1๐ฅ๐ฅ๓ธ )
&((๐๐ฅ๐ฅ๓ธ &(๐ฅ > ๐ฟ2)&(๐ฅ๓ธ > ฮ๐ฝ)) โก ๐2๐ฅ๐ฅ๓ธ )) , (โ๐ฅ) (โ๐ฅ๓ธ ) .
(30) According to (27) and (30) the following can be derived:
(๐1๐ฅ๐ฅ๓ธ โก (ฮฅ3(๐ฅ, ๐, ๐พ0, ๐0, ๐1, ๐1) = 0.5๐๐ฅ๓ธ 2+ ๐๐ฅ๓ธ )) , (โ๐ฅ) (โ๐ฅ๓ธ ) , (๐2๐ฅ๐ฅ๓ธ โก (ฮฅ4(๐ฅ, ๐, ๐พ0, ๐0, ๐1, ๐1) = ๐ (๐ฅ๓ธ โ ฮ๐ฝ))) ,
(โ๐ฅ) (โ๐ฅ๓ธ ) , (31)
where
ฮฅ3(๐ฅ, ๐, ๐พ0, ๐0, ๐1, ๐1) = ๐ฅ (๐1+ ๐1)2 ๐1ฮ (1 โ ๐ท๐0),
ฮฅ4(๐ฅ, ๐, ๐พ0, ๐0, ๐1, ๐1) = (๐1+ ๐1) (max(๐ฅ โ ๐ฟ1, 0) ๐ (๐พ0+ ๐0) + (๐ฅ โ ๐ฟ2) (๐1+ ๐1)) โ ฮ (1 โ ๐ท๐0) (1 โ ๐) ๐1
ฮ (1 โ ๐ท๐0) (๐1+ ๐๐1) .
(32)
We can express๐using๐1and๐2in the following way:
(๐๐ฅ๐ฅ๓ธ
โก (((๐ฅ โค ๐ฟ1) โ ๐1๐ฅ๐ฅ๓ธ )&((๐ฅ > ๐ฟ1) โ ๐2๐ฅ๐ฅ๓ธ ))) , (โ๐ฅ) (โ๐ฅ๓ธ ) .
(33)
Also, we can express๐using๐1and๐2: (๐๐ฅ๐ฅ๓ธ
โก (((๐ฅ โค ๐ฟ2) โ ๐1๐ฅ๐ฅ๓ธ )&((๐ฅ > ๐ฟ2) โ ๐2๐ฅ๐ฅ๓ธ ))) , (โ๐ฅ) (โ๐ฅ๓ธ ) .
(34)
Begin
X,b,ฮ,๐, ๐, ฮฅ1, ฮฅ2, ฮฅ3, ฮฅ4
No
No No
No
Yes
Yes
Yes xiโฅ L2 Yes xiโค L1
x๓ณฐiโฮฅ4+ gฮ๐ฝ
g x๓ณฐiโฮฅ2+ gฮ๐ฝ g
i > |X| X๓ณฐ End
xiโE0
i โ1
x๓ณฐiโโ๐2+ 2ฮฅ3c โ ๐
c x๓ณฐiโโ๐2+ 2ฮฅ1c โ ๐
c
x๓ณฐiโX๓ณฐ, i โ i + 1
Figure 5: Quantization diagram for the๐th embedding interval.
Further, utilizing property๐ฟ2โค ๐ฟ1we can obtain ((๐ฅ โค ๐ฟ2) โ ((โ๐ฅ๓ธ ) (๐1๐ฅ๐ฅ๓ธ )&(โ๐ฅ๓ธ ) (๐1๐ฅ๐ฅ๓ธ ))) ,
(โ๐ฅ) , ((๐ฟ2< ๐ฅ โค ๐ฟ1)
โ ((โ๐ฅ๓ธ ) (๐1๐ฅ๐ฅ๓ธ )&(โ๐ฅ๓ธ ) (๐2๐ฅ๐ฅ๓ธ ))) , (โ๐ฅ) , ((๐ฟ1< ๐ฅ) โ ((โ๐ฅ๓ธ ) (๐2๐ฅ๐ฅ๓ธ )&(โ๐ฅ๓ธ ) (๐2๐ฅ๐ฅ๓ธ )))
โจ ๐น2, (โ๐ฅ) .
(35)
Here, each premise should be true. With the aim to provide this, equations in (25) and (31) should be solvable. It can be seen that the solutions are straightforward:
๐ฅ๓ธ ๐1,๐1= โ๐2+ 2ฮฅ1,3๐ โ ๐
๐ , (36)
๐ฅ๓ธ ๐2,๐2= ฮฅ2,4+ ๐ฮ๐ฝ
๐ , (37)
where, for example, in (36), ๐ฅ๓ธ ๐1,๐1 denotes the values of ๐ฅ๓ธ that turn either ๐1๐ฅ๐ฅ๓ธ or ๐1๐ฅ๐ฅ๓ธ true for ฮฅ1(โ )or ฮฅ3(โ ), respectively. The diagram of quantization is represented in
Figure 5. Each ๐th original sample is chosen from array X on ๐th iteration. The corresponding bit of a watermark is chosen from arrayb. Vector ๐contains parameters of the quantization. At the end of each iteration, quantized value of ๐th sample is written to arrayX๓ธ .
4. Robustness under AWGN
In this section, we will analytically estimate the robust- ness of the proposed watermarking scheme under AWGN.
Robustness is reflected by the term โextracted informationโ
which denotes mutual information between embedded and detected messages. In contrast to channel capacity, the index of extracted information is practical but depends on the algorithm of detection. Also, throughout this section we assume that the original samples are distributed uniformly inside the quantization interval.
The derivations for extracted information are less involved whenฮฉis โfalse.โ Therefore, only that condition is considered here. In order to estimate extracted information we first find error rates. The rates depend on the attack severity (represented by๐),ฮ, and parameter set๐ = {๐พ0, ๐1, ๐1, ๐0, ๐ผ, ๐ฝ}. Moreover, we derive a stronger statement that information aboutฮ/๐ and ๐ is sufficient to perform analytic estimation of error rates for our watermarking scheme. Finally, we will demonstrate how error rates can be expressed using WNR and๐.
4.1. Estimation of Error Rates. For our estimation, it is consid- ered that, during watermark extraction, in each embedding interval samples that interpret โ0โ are separated from samples that interpret โ1โ using a threshold (e.g., hard decision region detector). The position of the threshold in ๐th embedding interval is Th+ [ฮ โ 2Th]mod(๐ โ ๐, 2)(dashed vertical lines in Figure 1(b)). Therefore, the whole real number line can be seen as a union of two domains:
Z= โโ
๐=โโ[2ฮ๐ + ๐ฮ๐โTh, 2ฮ๐ + ๐๐ฮ+Th) , (38) O= โโ
๐=โโ[2ฮ๐ + ๐ฮ๐+Th, 2ฮ (๐ + 1) + ๐ฮ๐โTh) . (39) During extraction, all the elements inZ will be labeled โ0โ
and all the elements inO will be labeled โ1.โ
After noise is added, elements quantized in๐th embed- ding interval might spread over its limits and other notations should be used. We notate sample values that are affected by noise as๐๓ธ ๐. Also,๐๓ธ ๐belongs to some embedding interval and inside this interval we use๐ฅ๓ธ ๐= ๐๓ธ ๐ โ ฮโ๐๓ธ ๐/ฮโ. Random variablesฮฃ๓ธ ๐and๐๓ธ ๐represent๐๐๓ธ and๐ฅ๓ธ ๐, respectively (alterna- tively we use ฬฮฃ๓ธ and ฬ๐๓ธ to save space in lower subscript part).
Therefore, two modified sets are obtained:E๓ธ 0 ๓ณจ๓ณจ๓ณจ๓ณจ๓ณจโ ฬฮฃAWGN ๓ธ 0; E๓ธ 1 ๓ณจ๓ณจ๓ณจ๓ณจ๓ณจโ ฬฮฃAWGN ๓ธ 1. For noise variance๐2we might, for instance, estimate the expected fraction for each of the noisy sets ฬฮฃ๓ธ 0 and ฬฮฃ๓ธ 1inZ. Fractions of ฬฮฃ๓ธ 0and ฬฮฃ๓ธ 1that belong toO can be found in a trivial manner. In that way we obtain error rates for โ0โ and โ1.โ
However, instead of appealing directly to sets ฬฮฃ๓ธ 0 and
ฬฮฃ๓ธ 1, we use an indirect but computationally lighter approach.
In case ฮฉ is โfalseโ we can conclude for the following distributions of quantized samples (not affected by AWGN yet) that
๐Eฬ๓ธ
0(๐ฅ๓ธ ) = ๐Eฬ๓ธ
1(๐ฅ๓ธ ) = ๐0(๐ฅ๓ธ ) , (40) where
( ฬ๐ธ๓ธ 0๐๐ฅ๓ธ โก (๐ธ0๓ธ ๐๐ฅ๓ธ &(๐ฅ๓ธ โค ฮ (๐ฝ โ ๐ผ)))) , (โ๐) (โ๐ฅ๓ธ ) , ( ฬ๐ธ๓ธ 1๐๐ฅ๓ธ โก (๐ธ1๓ธ ๐๐ฅ๓ธ &(๐ฅ๓ธ โค ฮ (๐ฝ โ ๐ผ)))) , (โ๐) (โ๐ฅ๓ธ ) .
(41)
Also, we can conclude that the following distributions are also identical:
๐Eฬ๓ธ
0(๐ฅ๓ธ ) = ๐ฬE๓ธ
1(๐ฅ๓ธ ) = ๐1(๐ฅ๓ธ ) , (42) where
(ฬ๐ธ๓ธ 0๐๐ฅ๓ธ โก (๐ธ๓ธ 0๐๐ฅ๓ธ &(๐ฅ๓ธ โฅ ฮ๐ฝ))) , (โ๐) (โ๐ฅ๓ธ ) , (ฬ๐ธ๓ธ 1๐๐ฅ๓ธ โก (๐ธ๓ธ 1๐๐ฅ๓ธ &(๐ฅ๓ธ โฅ ฮ๐ฝ))) , (โ๐) (โ๐ฅ๓ธ ) .
(43)
For any ๐, (40) means that, for example, the fraction of elements fromEฬ๓ธ 0that after AWGN appear inZ is equal to
that ofEฬ๓ธ 1 and can be calculated using๐0(๐ฅ๓ธ ). This fraction will be denoted as ฬ๐นZ. The PDF of AWGN with variance๐๐2is denoted as๐N[๐๓ธ ๐โ๐๓ธ , 0, ๐๐]using parameters๐๓ธ = ๐ฅ๓ธ +๐ฮ๐and ๐๓ธ ๐. Therefore
ฬ๐นZ
= โซZโซฮ(๐ฝโ๐ผ)
0 ๐0(๐ฅ๓ธ ) ๐N[๐๐๓ธ โ ๐ฅ๓ธ โ ๐๐ฮ, 0, ๐๐] ๐๐ฅ๓ธ ๐๐๓ธ ๐. (44) Fraction of elements fromEฬ๓ธ 0that after AWGN appear inZ will be denoted asฬ๐นZ:
ฬ๐นZ= โซ
Zโซฮ
ฮ๐ฝ๐1(๐ฅ๓ธ ) ๐N[๐๓ธ ๐โ ๐ฅ๓ธ โ ๐๐ฮ, 0, ๐๐] ๐๐ฅ๓ธ ๐๐๓ธ ๐. (45) Error rates are calculated using ฬ๐นZandฬ๐นZ:
BER0= (1 โ ฬ๐นZ) ๐พ0
๐พ0+ ๐0 + (1 โ ฬ๐นZ) ๐0 ๐พ0+ ๐0, BER1= ฬ๐นZ ๐1
๐1+ ๐1 + ฬ๐นZ ๐1 ๐1+ ๐1.
(46)
In order to demonstrate that error rates can be calculated based on ฮ/๐, ๐we analyze expression for ฬ๐นZ (expression for ฬ๐นZ can be analyzed in a similar way). Function๐0(๐ฅ๓ธ ) is present in (44). According to (2) it is defined using parameters๐,๐. Parameters๐ผ,๐ฝare also present in (2) as well as in (44). Parameters๐ผ,๐ฝhave clear constraints (the same is true about๐พ0,๐0,๐1,๐1). It is possible to express๐,๐using๐ผ, ๐ฝ,๐พ0,๐0,๐1,๐1. In the realization of our method parameter๐ is set as
๐ = ๐พ0+ ๐0
ฮ๐พ0 . (47)
Defining new parameter ฬ๐as
ฬ๐ = ๐ฮ, (48)
it can be seen that ฬ๐ = (๐พ0+ ๐0)/๐พ0does not depend on the choice ofฮ.
Using property of PDF, the following is obtained from (2):
โซ(๐ฝโ๐ผ)ฮ
0 ๐0(๐ฅ๓ธ ) ๐๐ฅ๓ธ = ๐(๐ฝ โ ๐ผ)2ฮ2
2 + ๐ฮ (๐ฝ โ ๐ผ)
= 1.
(49)
It is easy to derive from (48) and (49) that ๐ฮ2= 21 โ ฬ๐ (๐ฝ โ ๐ผ)
(๐ฝ โ ๐ผ)2 . (50)
According to (50), it is also obvious that parameter
ฬ๐ = ๐ฮ2 (51)
is independent ofฮ.