
3.3 Test methods

3.3.2 Quantitative test setup

The quantitative evaluation assesses the absolute depth error of the measurements, the precision of the measurements, and the depth cameras' tolerance for different materials and lighting conditions. The test setup for absolute depth error and precision is similar to [24, 87]: a white, mostly diffuse planar target, visible in figure 3.2, is recorded in a static situation. To quantitatively evaluate camera tolerance to different materials and illuminations, a new approach is proposed: standard EUR-pallets made of different materials are recorded in a static situation while the illumination conditions are varied. The EUR-pallets used are wooden, cardboard, black plastic and red plastic, as seen in figure 3.3.

Figure 3.2. Quantitative test setup. The target object is a white, mostly diffusely reflective plane with a size of 520 mm x 370 mm against a neutral background. The cameras were attached to a rack along with other cameras that were not used in this thesis.

Figure 3.3. EUR-pallets used for the invalid pixel analysis.

Absolute depth error

A sufficiently accurate ground-truth distance to the target plane is needed for each camera to assess the absolute depth error. To simplify the test setup, the extrinsic calibration between the Realsense D415 camera and all of the other cameras is solved first. Then the distance from the Realsense D415 front glass to the target object was measured with a laser measurement device with a typical error of ±1,5 mm [97]. The distance from the D415 front glass to its depth ground-zero reference, 1,1 mm [37], is added to the measured distance to obtain the ground truth for the D415. This distance, combined with the extrinsic calibration, is used to obtain the ground truth for the other cameras.

For most of the cameras, the depth-frame-to-depth-frame calibration was done with the open-source calibration tool Kalibr [98–101], using the RGB images of each camera combined with each camera's manufacturer-provided calibration from the RGB frame to the depth frame. For the IFM camera, the calibration was done using a point-cloud-to-RGB calibration tool developed during the Himmeli Robotics project. Two further points need to be taken into account while collecting data: the depth images are taken from each camera individually to prevent any interference between the active cameras, and the ToF cameras are warmed up for at least 25 minutes before taking any measurements to reduce temperature-related effects.
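Propagating the D415 ground truth to the other cameras amounts to applying the extrinsic transform between the depth frames. The sketch below illustrates this with NumPy; the function name and the 4x4 homogeneous-matrix convention are assumptions, not the tooling actually used in the thesis (Kalibr and the project-specific tool).

```python
import numpy as np

def ground_truth_for_camera(d415_distance_m, T_d415_to_cam):
    """Propagate a ground-truth distance measured for the D415 to another
    camera. T_d415_to_cam is a 4x4 homogeneous transform from the D415
    depth frame to the other camera's depth frame (assumed convention)."""
    # Ground-truth point on the target plane, expressed in the D415 frame:
    # straight ahead along the optical (z) axis.
    p_d415 = np.array([0.0, 0.0, d415_distance_m, 1.0])
    p_cam = T_d415_to_cam @ p_d415
    # The per-camera ground truth is the depth (z) in that camera's frame.
    return p_cam[2]

# Example: a camera mounted 10 cm to the side of the D415, no rotation.
T = np.eye(4)
T[0, 3] = 0.10
print(ground_truth_for_camera(2.041, T))  # depth unchanged by a pure x-offset
```

Note that with a rotational component in the extrinsics the depth does change, which is why the calibration accuracy directly affects the ground truth of every camera except the D415.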

The absolute depth errors are measured from camera-to-object distances of 0,7 m to 7,5 m. Depth frames are collected at nine different distances, and 50 frames are recorded at each distance to capture any temporal noise. Similar to [24], a 20x20 pixel area is cropped from the middle of the target plane in the depth image; see the blue rectangle in figure 3.4. Invalid values are discarded, and values below the 1st percentile and above the 99th percentile are discarded as outliers. A mean value is calculated for each frame, the mean over the 50 frames is then calculated, and the result is compared against the ground truth.
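The crop-filter-average pipeline above can be sketched as follows; this is an illustrative reimplementation, not the thesis code, and it assumes invalid pixels are encoded as zeros or NaNs.

```python
import numpy as np

def mean_depth_of_crop(depth_frames, crop=20):
    """Statistic for the absolute depth error test: crop a 20x20 px window
    from the image centre, drop invalid (zero/NaN) pixels and 1st/99th
    percentile outliers, average per frame, then average over the frames."""
    per_frame_means = []
    for frame in depth_frames:
        h, w = frame.shape
        r0, c0 = h // 2 - crop // 2, w // 2 - crop // 2
        window = frame[r0:r0 + crop, c0:c0 + crop].astype(float).ravel()
        window = window[np.isfinite(window) & (window > 0)]   # drop invalid
        lo, hi = np.percentile(window, [1, 99])
        window = window[(window >= lo) & (window <= hi)]      # drop outliers
        per_frame_means.append(window.mean())
    return float(np.mean(per_frame_means))

# The absolute depth error is then mean_depth_of_crop(frames) minus the
# per-camera ground-truth distance.
```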

Precision

The test to assess precision is similar to [24, 102]. The same nine distances between 0,7 m and 7,5 m as in the absolute depth error test are used, and again 50 depth frames are recorded at each distance to capture any temporal noise in the data. The target plane is annotated in the depth data by hand; see the red square in figure 3.4. Invalid values are discarded, and values outside the 1st and 99th percentiles are discarded as outliers. To compensate for the camera possibly not being exactly parallel to the target, a plane is fitted to the combined depth data from the 50 frames using the MATLAB function pcfitplane [103], which uses MSAC [104], a variation of RANSAC, for plane fitting. Finally, the root mean squared error (RMSE) between the depth data and the fitted model is calculated as the measure of precision.
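The plane-fit RMSE can be sketched as below. The thesis uses MATLAB's pcfitplane (MSAC); here a plain total-least-squares fit via SVD stands in, which gives a comparable result once the percentile filtering has removed gross outliers.

```python
import numpy as np

def plane_fit_rmse(points):
    """Precision metric sketch: fit a plane to the target-plane points
    (N x 3 array) and return the RMSE of the point-to-plane residuals."""
    centroid = points.mean(axis=0)
    # The plane normal is the right singular vector associated with the
    # smallest singular value of the centred point cloud.
    _, _, vt = np.linalg.svd(points - centroid)
    normal = vt[-1]
    residuals = (points - centroid) @ normal   # signed point-to-plane distances
    return float(np.sqrt(np.mean(residuals ** 2)))
```

Unlike a least-squares fit, MSAC fits the plane to a consensus set, so it is more tolerant of flying pixels that survive the percentile filter; that is presumably why pcfitplane was chosen.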

Figure 3.4. Crops used in the absolute depth error test (blue) and the precision test (red), shown in an Azure Kinect DK depth image taken at 2041 mm from the target.

Fill rate of pallet front face

The tolerance for varying materials and lighting is tested quantitatively using the fill rate of the depth image as a measure. The proposed test is the following: four standard-sized EUR-pallets [4] are imaged from 2,0 meters while the illumination is varied.

50 images are recorded to capture any temporal noise. The pallet front faces are extracted from the depth images by first manually drawing the pallet front face on top of a clearer image from the same camera, as it is easier to draw the front face in e.g. the IR, RGB or amplitude image, which are conveniently in the same frame and at the same resolution as the corresponding depth images. A binary mask is created from the drawing and applied to the depth image. Then the fill rate, i.e. the ratio of valid depth pixels in the pallet front face area, is calculated. The collection of fill rates over all pallets and lighting settings is used as a measure of the camera's robustness to different pallet materials, colors and lighting conditions. The process is illustrated in figure 3.5.
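Once the mask exists, the fill rate itself is a one-line ratio. The sketch below assumes invalid depth pixels are encoded as zeros or NaNs; creating the mask from the hand-drawn front face is assumed to happen elsewhere.

```python
import numpy as np

def fill_rate(depth_image, front_face_mask):
    """Fill rate of the pallet front face: the fraction of valid depth
    pixels inside the (boolean) hand-drawn front-face mask."""
    face = depth_image[front_face_mask]
    valid = np.isfinite(face) & (face > 0)    # zeros/NaNs count as invalid
    return float(valid.sum() / face.size)

# Averaging fill_rate over the 50 recorded frames gives the per-pallet,
# per-lighting score.
```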

Figure 3.5. The process to find the fill rate of the pallet front face area in a depth image. The fill rate measures the depth camera's robustness to the recorded pallet material, color and the prevailing lighting.

The correctness of the data is not considered in this test, merely whether any valid depth data of the pallet front face is produced. The EUR-pallets used are wooden, cardboard, black plastic and red plastic, as shown in figure 3.3. The illumination is varied from office lighting to no lights and to no lights with an additional construction light. The 300 W construction light is placed at a 20° angle and 1 meter away from the middle point of the pallet front face, as illustrated in figure 3.6.

Figure 3.6. The construction light (blue box) is placed at a 20° angle and 1 meter away from the middle point of the pallet front face.

The results for the quantitative tests are discussed in chapter 5.

4 QUALITATIVE EVALUATION OF DEPTH CAMERAS

This chapter analyses the results of the test situations described in the first entry of table 3.2. The qualitative evaluation produced a great deal of information about the behaviour of the tested camera models individually and about the different depth camera technologies in general. When the material and shape of the target object and the illumination of the scene were varied, the depth camera performance varied as well, resulting in different noise patterns for most of the object and lighting combinations.

The cameras are tested with the objects and illumination conditions listed in the first entry of table 3.2. The camera performance and possible noise patterns are commented on for each material, lighting condition and object in tables 4.1, 4.2 and 4.3 below. Blank entries imply that the particular material, shape or illumination does not introduce any change compared to normal performance. After the tables, the chapter focuses on each camera individually, addresses the camera performance and the encountered noise patterns in more depth, and tries to find the underlying reasons for them. The observations are made at a 2 m camera-to-object distance and with a 1 m distance from the object to the background wall.

Table 4.1. Summary of the effects of varying the target object material.

Material Notes

ZED

cardboard

cardboard + wrapping

sheet metal Reflections in the metal cause erroneous depth values.

D435i

cardboard

cardboard + wrapping

sheet metal Reflections in the metal cause erroneous depth values.

IFM O3D303

cardboard

cardboard + wrapping

sheet metal Surfaces parallel to the camera are otherwise imaged successfully, but due to overexposure, some invalid pixel areas are present. Surfaces that are not parallel to the camera reflect the emitted pulses away and, depending on the object geometry, might cause invalid or wrong depth values.

Azure Kinect DK

cardboard

cardboard + wrapping

sheet metal Surfaces parallel to the camera are imaged successfully, though some overexposure might occur. Surfaces that are not parallel to the camera reflect the emitted pulses away and, depending on the object geometry, might cause invalid or wrong depth values.

L515

cardboard

cardboard + wrapping

sheet metal Surfaces parallel to the camera are imaged successfully. Surfaces that are not parallel to the camera reflect the emitted pulses away and, depending on the object geometry, might cause invalid or wrong depth values.

The diffuse cardboard on its own did not pose problems for any of the cameras, and adding plastic wrapping did not cause noticeable effects either. The reflective sheet metal caused problems with all of the depth cameras at least to some degree. In hindsight, the sheet metal used might have been too specular, and such object surfaces might be rare in real life, but it did highlight the potential problems with specular materials very clearly.

Table 4.2. Summary of the effects of varying scene illumination conditions.

Lighting Notes

ZED

Office lighting

Lights off The depth couldn’t be estimated in some areas of the image due to a lower amount of features.

Office lighting

+ construction light Distorted shape with objects where strong shadows occur, for example the convex corner. Missing depth values due to too large a gradient in intensity; the RGB images are overexposed and depth can't be calculated due to missing features.

Lights off

+ construction light Distorted shape with objects where strong shadows occur, for example the convex corner. Even more missing depth values due to too large a gradient in intensity; the RGB images are overexposed and depth can't be calculated due to missing features.

D435i

Office lighting

Lights off Almost as good performance as with lights on, small amounts of single invalid pixels. Better performance in the dark than the ZED as expected due to the active IR lighting.

Office lighting

+ construction light Mostly completely invalid values, or completely wrong depth values. Caused by too large a gradient in intensity; the combined monochrome+IR images are overexposed and depth can't be calculated, or is miscalculated, due to missing features.

Lights off

+ construction light Mostly completely invalid values, or depth values at the same level as the background wall. Caused by too large a gradient in intensity; the combined monochrome+IR images are overexposed and depth can't be calculated, or is miscalculated, due to missing features.

IFM O3D303

Office lighting No noticeable interference from the office lighting. The spectrum of the fluorescent lights might not contain the 850 nm [50] IR wavelength used.

Lights off

Office lighting

+ construction light A very small number of invalid pixels is noticeable near the construction light. The IFM does not seem severely affected by the interfering light.

Lights off

+ construction light A very small number of invalid pixels is noticeable near the construction light. The IFM does not seem severely affected by the interfering light.

Azure Kinect DK

Office lighting No noticeable interference from the office lighting. The spectrum of the fluorescent lights might not contain the 860 nm [85] IR wavelength used.

Lights off

Office lighting

+ construction light The additional halogen construction light introduces invalid pixels to the depth image, most probably because it has power at the IR wavelengths used.

Lights off

+ construction light The additional halogen construction light introduces invalid pixels to the depth image, most probably because it has power at the IR wavelengths used.

L515

Office lighting No noticeable interference from the office lighting. The spectrum of the fluorescent lights might not contain the 860 nm [86] IR wavelength used.

Lights off

Office lighting

+ construction light The additional halogen construction light introduces a lot of invalid pixels and random flying pixels to the depth image, most probably because it has power at the IR wavelengths used.

Lights off

+ construction light The additional halogen construction light introduces a lot of invalid pixels and random flying pixels to the depth image, most probably because it has power at the IR wavelengths used.

The CW-ToF cameras, the IFM O3D303 and the Azure Kinect, seem to fare better than the rest under different illumination settings. The L515 LiDAR camera has serious problems with interfering light, which are discussed later in section 4.5. The D435i does a better job than the ZED camera in dark conditions but has difficulties with overexposure when interfering light is added. However, the D415 was later tested in the same situation for comparison, and it does not have such problems with interfering light. This might be related to the shutter type, as rolling shutter cameras tend to have wider dynamic ranges than global shutter cameras [105, 106].

Table 4.3. Summary of the effects of varying the target object shape.

Shape Notes

ZED

Plane

Convex corner Distorted shape due to strong shadows, most visible when combined with the construction light so that only one side is illuminated.

Concave corner

Cylinder Invalid depth areas appear when the construction light is used and the RGB images are overexposed.

D435i

Plane

Convex corner

Concave corner A minor noise pattern occurs with the reflective sheet metal material. A symmetrical area of wrong and invalid values can be seen in the "deep" end of the corner, possibly caused by reflections of the emitted pattern.

Cylinder The cylinder radius seems much larger than in reality, i.e. the shape is flatter than the ground truth, probably due to reflections from the metal surface.

IFM O3D303

Plane

Convex corner Some problems occur in combination with specularly reflective materials. Depth values are valid only at the front edge of the corner; elsewhere the depth values appear several meters too far away, or no value could be estimated at all because the amplitude of the returning signal is very low. The too-far-away values are probably related to reflections and MPI.

Concave corner Deformation due to MPI. The more reflective the material, the more severe the deformation; with sheet metal the corner appears as a plane.

Cylinder The metal cylinder is deformed so that the edges appear closer than the center, as if the cylinder were inverted. This is related to an intensity-related error, which is discussed further in section 4.3.

Azure Kinect DK

Plane

Convex corner Problems occur in combination with specularly reflective materials. The depth values are valid only at the front edge of the corner; elsewhere the depth values are invalid or several meters off.

Concave corner Deformation due to MPI. The more reflective the material, the more severe the deformation; with sheet metal the corner appears as a plane.

Cylinder Invalid depth pixels in the edges of the metal cylinder due to the emitted light pulses scattering away from the reflective surface.

L515

Plane

Convex corner Problems occur in combination with specularly reflective materials. The depth values are valid only at the front edge of the corner; elsewhere the depth values are invalid.

Concave corner Deformation due to MPI. The more reflective the material, the more severe the deformation; with sheet metal the corner appears as a plane.

Cylinder Invalid depth pixels in the edges of the metal cylinder due to the emitted light pulses scattering away from the reflective surface.

The ToF cameras had the most difficulties with complex shapes when a reflective material was used. The stereo cameras also had some problems with the cylindrical shape due to reflections. The D435i seemed to be the most robust to different shapes.