The swiss army knife for image compression testing and format conversion
thorfdbg/difftest_ng
This is difftest_ng, "Difftest, the next generation".

Difftest is a program and framework that helps to find errors in image compression algorithms. It computes many error measures between a reference image and a compressed and re-expanded image. The error measures are targeted not at human vision, but at automatically detecting common problems in image codecs.

In addition, difftest_ng includes a couple of convenience functions, including restricting the measurement to a single component, computing the FFT (or weighted FFT), filtering the image, measuring the histogram and converting between various image formats.

Currently, difftest_ng supports pnm (ppm, pgm, pbm), pgx (the JPEG 2000 reference testing format), bmp, TIFF, multiple raw formats with very flexible specifications, pfm, rgbe, png, exr and dpx.

difftest_ng compiles under GNU/Linux and probably some other operating systems. It requires libpng, libgsl and libopenexr for its full function; without these additional libraries, some of its operations are not available.

difftest_ng is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

difftest_ng is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

-----------------------------------------------------------------------------------------------

Source and destination image formats are recognized by the file name extension. While this is certainly not the most elegant method to identify image formats, it has proven to be robust enough for the purpose of difftest_ng.
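Recognition by file name extension amounts to a table lookup. The following Python sketch is only an illustration and is not taken from the difftest_ng sources: the function name guess_format and the family names in the table are hypothetical, and the extensions are the ones this README lists as recognized. It assumes that a raw layout specification following an '@' sign is not part of the extension, and it does not address upper-case extensions.

```python
import os

# Extension table as listed in this README; the family names on the
# right-hand side are made up for this illustration.
FORMAT_BY_EXTENSION = {
    ".pnm": "pnm", ".pgm": "pnm", ".ppm": "pnm",
    ".pbm": "pnm", ".pfm": "pnm", ".pfs": "pnm",
    ".bmp": "bmp",
    ".pgx": "pgx",
    ".tif": "tiff", ".tiff": "tiff",
    ".png": "png",
    ".rgbe": "rgbe", ".hdr": "rgbe",
    ".dpx": "dpx",
    ".exr": "exr",
    ".raw": "raw", ".craw": "raw", ".v210": "raw", ".yuv": "raw",
}


def guess_format(file_name):
    """Return a format family for file_name, or None if it is unknown."""
    # Assumption: anything after the first '@' is a raw layout
    # specification and must not be mistaken for part of the extension.
    base = file_name.split("@", 1)[0]
    extension = os.path.splitext(base)[1]
    return FORMAT_BY_EXTENSION.get(extension)


if __name__ == "__main__":
    for name in ("lena.ppm", "frame.tiff", "image.raw@:{8=0}:{8=1}:{8=2}", "notes.txt"):
        print(name, "->", guess_format(name))
```

A lookup of this kind is cheap and predictable, at the cost of trusting the file name rather than the file contents.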
The following file name extensions are currently recognized:

.pnm, .pgm, .ppm, .pbm, .pfm, .pfs:
    File formats from the pnm (Portable Any Map) family. The precise format does not depend on the extension, but on the number of components and whether the data is float or integer. pnm is the superset, pgm is grey scale, ppm is RGB, pbm is binary data. While pnm, pgm, ppm and pbm represent integer samples, pfs and pfm are floating point sample formats.

.bmp:
    The Windows(tm) native bitmap format, covering 4 and 8 bits per sample, palette or true-color images of various bitdepths.

.pgx:
    The file format defined by ISO/IEC 15444-4 (JPEG 2000), used there for conformance testing. pgx can represent any integer type of samples. A pgx file consists of a directory file - the one with the .pgx ending that is specified - and header and data files. The header files define the dimensions and bitdepths of the data files. Data and header may be concatenated together; difftest_ng understands this case, but always writes separate headers as output.

.tif, .tiff:
    The TIFF file format, specified and owned by Adobe. TIFF supports multiple formats, including YUV subsampling, palette files and floating-point formats.

.png:
    PNG is a simple lossless image compression scheme for internet images, where it replaced gif images. This requires libpng to be available.

.rgbe, .hdr:
    RGBE high-dynamic range images. While rgbe and hdr are not exactly identical, they are close enough to be supported as sub-formats of the same format. When creating files, RGBE is written.

.dpx:
    The DPX format specified by SMPTE, used mainly to represent frame-based movie content. dpx can represent integer and floating point samples of various bit-depths, including YUV data and subsampling. Unfortunately, the format is quite underspecified and not all software seems to follow the specs (or an interpretation thereof) precisely.

.exr:
    The OpenEXR format by Industrial Light & Magic, a format for representing high-dynamic range images.
    This requires libopenexr to be available.

.raw, .craw, .v210, .yuv:
    Raw formats. All raw formats are covered by a single format converter that requires the specification of the layout of the format as part of the file name. Use the --rawhelp command line option to learn how to specify a raw format.

-----------------------------------------------------------------------------------------------

Usage: difftest_ng [options] original distorted

where original and distorted are ppm,pbm,pgm,pfm,pfs,bmp,pgx,tif,png,exr,rgbe or raw (craw,v12,yuv) images
and options are one or more of

--psnr : measure the psnr with equal weights over all components
--maxsnr : measure the snr with equal weights over all components, normalize to maximum signal
--snr : measure the snr with equal weights over all components, normalize to source energy
--mse : measure the mean square error with equal weights over all components
--rmse : measure the root mean square error with equal weights over all components
--minpsnr : measure the minimum psnr over all components
--ycbcrpsnr : measure the psnr with weights derived from the YCbCr transformation
--yuvpsnr : measure the psnr with weights derived from the YUV transformation
--swpsnr : measure the psnr with weights coming from the subsampling factors
--mrse : measure the log of the mean relative square error with equal weights
--minmrse : measure the minimum mrse over all components
--ycbcrmrse : measure the mrse with weights derived from the YCbCr transformation
--yuvmrse : measure the mrse with weights derived from the YUV transformation
--swmrse : measure the mrse with weights coming from the subsampling factors
--peak : measure the peak relative error in dB over all components
--avgpeak : measure the peak relative error averaged over all components, in dB
--peakx : find the x position of the largest pixel error
--peaky : find the y position of the largest pixel error
--min : find the minimum value of original - distorted
--max : find the maximum value of original - distorted
--toe : find the minimum of the original
--head : find the maximum of the original
--drift : find the mean error (drift) between original and distorted
--mae : find the mean absolute error
--pae : find the peak absolute error
--stripe : measure a striping indicator that detects horizontal or vertical artifacts
--width : print the width of the images
--height : print the height of the images
--depth : print the number of components of the images
--precision comp : print the bit precision of the given component
--signed comp : print the signedness of the given component (-1 = signed, +1 = unsigned)
--float comp : print whether the indicated component is IEEE floating point (1 = yes, 0 = no)
--diff target : save the difference image (-i is an alternative form of this option)
--rawdiff target : similar to --diff, except that it doesn't scale the difference to maximum range
--sdiff scale trgt : generate a differential signal with an explicitly given scale
--suppress thres : suppress all pixels in the target that are less than a threshold away from the source
--mask roi : mask the source image by the mask image before applying the comparison
--notmask roi : mask the source image by the inverse of the mask
--convert target : save the original image unaltered, but possibly in a new format
--merge target : merge the two images together, add second as components of first
--fft target : save the fft of the difference image
--wfft target : save the windowed fft of the difference image
--filt x y r dst : run a radial filter around frequency x,y with radius r, saves the filtered image as dst
--nfilt x y r dst : similar to --filt, but the output is normalized to the full range
--comb x y r dst : apply a comb filter in direction x y and radius r
--ncomb x y r dst : similar to --comb, but the output is normalized to the full range
--hist target : generate a histogram plot.
    If "target" is -, write to stdout
--thres threshold : compute the ratio of pixels whose difference is greater than the threshold
--colorhist size : generate reduced histogram separately for each component using the given bucket size
--maxfreqr : locate the absolute value of the most exposed frequency in the error image
--maxfreqx : locate the horizontal component of the most exposed frequency in the error image
--maxfreqy : locate the vertical component of the most exposed frequency in the error image
--maxfreqv : compute the domination ratio of the most exposed frequency in the error image
--patternidx : scan the FFT for suspicious patterns and output the likelihood of errors
--toflt dst : save a floating point version of the source image
--tohfl dst : save a half-float version of the source image
--touns bpp dst : save an unsigned integer version with bpp bits per pixel of the source image
--tosgn bpp dst : save a signed integer version with bpp bits per pixel of the source image
--asuns bpp : convert the two input images to unsigned bpp before further processing
--assgn bpp : convert the two input images to signed bpp before further processing
--gamma bpp gamma dst : perform a gamma correction on a floating point image to create integer output
--invgamma gamma dst : perform an inverse gamma correction creating a floating point image from integer
--toegamma slope gamma dst : perform a gamma transformation on the same data type with a given slope in its toe region
--invtoegamma slope gamma dst : perform an inverse gamma transformation on the same data with parameters as above
--halflog dst : represent a floating point image in IEEE half float format saved as 16 bit integers
--halfexp dst : read a 16-bit integer image using IEEE half float and save as floating point
--togamma bpp gamma : convert both images to gamma before applying the measurement (apply as a filter)
--fromgamma gamma : convert both images from a gamma before applying the measurement
--totoegamma slope gamma : perform a forwards gamma transformation (linear to gamma) with given slope in the toe region
--fromtoegamma slope gamma : perform an inverse gamma transformation (gamma to linear) with given slope in the toe region
--tohalflog : convert to 16-bit integer before comparing (apply as filter)
--fromhalflog : convert from 16-bit integer to float before comparing (apply as filter)
--tolog clamp : convert to logarithmic domain with clamp value before comparing
--topercept : convert from absolute luminance to a perceptually uniform space
--topq bits : convert floating point to SMPTE 2084 quantized to the given bits
--frompq : convert SMPTE 2084 quantized data to luminances
--tohlg bits : convert floating point to Hybrid Log Gamma with HEVC conventions
--fromhlg : convert Hybrid Log Gamma to linear luminance, 1000 nits peak
--invert : invert the source image before comparing
--flipx : flip the source horizontally before comparing
--flipy : flip the source vertically before comparing
--flipxextend : create a twice as wide image by flipping it over the right edge
--flipyextend : create a twice as high image by flipping it over the bottom edge
--shift dx dy : shift the image by the given amount right/bottom (or left/up if < 0)
--pad bpp dst : pad (right-aligned) a component into a larger bit-depth
--asprec bpp : set the bit-depth to bpp, padding input and output into the target bitdepth
--sub x y : subsample all components by the subsampling factors in x and y direction
--csub x y : subsample all but component 0 by the subsampling factors in x and y direction
--up x y : upsample all components by the subsampling factors in x and y direction
--up auto : upsample all components such that we get consistent 1x1 (444) sampling
--cup x y : upsample all but component 0 by the subsampling factors in x and y direction
--coup x y : co-sited upsampling of all components in x and y direction
--coup auto : co-sited upsampling, with automatic choice of upsampling factors
--cocup x y : co-sited upsampling of the chroma components in x and y direction
--boxup x y : upsample with a simple box filter
--boxcup x y : upsample the chroma components with a simple box filter
--clamp min max : clamp the image(s) to the specified range of sample values
--only component : acts as a filter and restricts all following operations to the given component
--upto component : restricts all following operations to components 0..component-1
--rgb : restricts the activity to at most the first three components
--crop x1 y1 x2 y2 : crop a rectangular image region (x1,y1)-(x2,y2). Edges are inclusive.
--cropd x1 y1 x2 y2 : crop a rectangular image region (x1,y1)-(x2,y2) from the distorted image only.
--restore : un-do the restrictions of --crop and --only or --rgb
--toycbcr : convert images to 601 YCbCr before comparing
--toycbcrbl : convert images to 601 YCbCr before comparing, and include a black level
--tosignedycbcr : convert images to 601 YCbCr with signed chroma components
--fromycbcr : convert images from 601 YCbCr to RGB before comparing
--fromycbcrbl : convert images from 601 YCbCr to RGB before comparing, and remove the black level
--toycbcr709 : convert images to 709 YCbCr before comparing
--toycbcr709bl : convert images to 709 YCbCr before comparing, and include a black level
--fromycbcr709 : convert images from 709 YCbCr to RGB before comparing
--fromycbcr709bl : convert images from 709 YCbCr to RGB before comparing, and remove the black level
--toycbcr2020 : convert images to 2020 YCbCr before comparing
--toycbcr2020bl : convert images to 2020 YCbCr before comparing, and include a black level
--fromycbcr2020 : convert images from 2020 YCbCr to RGB before comparing
--fromycbcr2020bl : convert images from 2020 YCbCr to RGB before comparing, and remove the black level
--fromgrey : convert a grey-scale image to color by duplicating components
--torct : convert an image with the RCT from JPEG 2000
--tosignedrct : convert an image with the RCT from JPEG 2000, leaving chroma signed
--torctd : convert a 4 component RGGB image to YCbCr+DeltaG with the RCT
--torctd1 agmnt : convert a Bayer pattern with given arrangement with the above RCTD
--torctx agmnt : convert a Bayer pattern with given arrangement to RCT with an improved longer averaging filter for green
--toydgcgcox agmnt : convert a Bayer pattern with given arrangement with an extended version of the YDgCgCo transformation
--to422rct : convert a 422 image with green in component 0 to YCbCr
--to422signedrct : convert a 422 image with green in component 0, leaving chroma signed
--fromrct : convert an image back to RGB with the inverse RCT
--fromrctd : convert a YCbCr+DeltaG to a four-component RGGB with the inverse RCTD
--fromrctd1 agmnt : convert a YCbCr+DeltaG to a 1-component RGGB Bayer pattern
--fromrctx agmnt : convert an RCTX-converted image to a Bayer pattern image with the given sample arrangement
--fromydgcgcox agmt : convert a YDgCgCoX-converted image to a Bayer pattern image with the given sample arrangement
--from422rct : convert from YCbCr to RGB with green in channel 0
--todeltag : convert RGGB to RGB+DeltaG
--toycgco : convert an image with the YCgCo transformation
--tosignedycgco : convert an image to YCgCo leaving chroma signed
--tocycbcod : convert an RGGB image to YCgCo+DeltaG
--fromycgco : convert an image back to RGB with the inverse YCgCo transformation
--fromycgcod : convert a YCgCo+DeltaG to RGGB with the inverse RCT
--fromdeltag : convert RGB+DeltaG to RGGB
--toxyz : convert images from RGB to XYZ before comparing
--fromxyz : convert images from XYZ to RGB before comparing
--tolms : convert images from RGB to LMS before comparing
--fromlms : convert images from LMS to RGB before comparing
--xyztolms : convert images from XYZ to LMS before comparing
--lmstoxyz : convert images from LMS to XYZ before comparing
--scale a,b,c... : scale components by the indicated factors before comparing
--offset a,b,c... : offset component values by the indicated values before comparing
--tobayer : convert a four-component image to a Bayer-pattern image
--frombayer : convert a Bayer patterned grey-scale image to four components
--tobayersh agmnt : convert a four-component image in component order RGGB to a Bayer patterned image of the given arrangement
--frombayersh agmnt : convert a Bayer patterned image in the given sample order into a four-component image in RGGB order
--422tobayer agmnt : convert a 422 three-component image to a Bayer pattern image where the argument describes the sample organization. It can be either grbg,rggb,gbrg or bggr, and green becomes the luma component
--bayerto422 agmnt : convert a Bayer pattern image to a 422 three component image with luma as green and Cb as red and Cr as blue component. The argument describes the Bayer pattern arrangement as above.
--debayer agmnt : de-Bayer a Bayer pattern image with bi-linear interpolation, the argument describes the sample organization and can be grbg,rggb,gbrg or bggr
--debayerahd argmt : de-Bayer with the Adaptive Homogeneity-Directed Demosaic Algorithm
--fill r,g,b,... : fill the source image with the given color
--paste x y : paste the distorted image at the given position into the source
--raw : encode output in raw if applicable
--ascii : encode output in ascii if applicable
--interleaved : encode output in interleaved samples if applicable
--separate : encode output in separate planes if applicable
--isyuv : override automatic YUV detection, sources are really in YUV
--isrgb : override automatic YUV detection, sources are really in RGB
--isfullrange : override automatic range detection, source has no head/toe region
--isreducedrange : override automatic range detection, source has head/toe region
--littleendian : use little endian output if applicable
--bigendian : use big endian output if applicable
--toabsradiance : multiply floating point samples by recorded radiance scale to convert to absolute radiance
--brief : use a brief (only numeric) output format
>,>=,==,!=,<=,< t : last result must be larger, larger or equal, equal, not equal, smaller or equal or smaller than given threshold t. Attention: Quoting required when used from the shell.

If the source image is '-/<width>x<height>x<depth>', it is replaced by a blank image of the given dimensions. This image can be filled with any other color by --fill, see above.
If the distorted image file name equals '-', then the image is replaced by a blank image.

--help : print this page
--rawhelp : print help on raw image formatting. First time users: PLEASE READ THIS.

-----------------------------------------------------------------------------------------------

Raw formats are unframed and hence the formatting of the file must be specified on the command line, here as part of the FILE NAME, not by separate options.

Raw files are specified by 'filename.raw@format', with an '@' (at) sign separating the format specification from the file name. Note that the extension .raw is necessary.
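The raw file name convention can be illustrated with a small parser. The Python sketch below is an assumption-laden illustration, not difftest_ng code: the function name split_raw_spec is hypothetical, and the sketch only separates the pieces named in the raw format help (file name, the optional <width>x<height>x<depth> dimensions, and the data layout after the mandatory colon) without validating the layout itself.

```python
def split_raw_spec(argument):
    """Split 'filename.raw@format' into (file_name, dimensions, layout).

    dimensions is a (width, height, depth) tuple, or None when the
    dimension part is omitted, which the help text permits for saving.
    """
    file_name, at_sign, spec = argument.partition("@")
    if not at_sign:
        raise ValueError("raw specification requires an '@' sign")

    # The first colon separates the dimensions from the data layout and
    # must be present even if the dimensions are left out.
    dims_text, colon, layout = spec.partition(":")
    if not colon:
        raise ValueError("raw specification requires a ':' before the layout")

    dimensions = None
    if dims_text:
        parts = dims_text.split("x")  # numbers are separated by lower-case x
        if len(parts) != 3:
            raise ValueError("expected <width>x<height>x<depth>")
        dimensions = tuple(int(part) for part in parts)

    return file_name, dimensions, layout


if __name__ == "__main__":
    print(split_raw_spec("image.raw@640x480x3:{8=0}:{8=1}:{8=2}"))
    print(split_raw_spec("image.raw@:{8=0}:{8=1}:{8=2}"))
```

Because the specification is part of the file name, it usually needs quoting on a shell command line: characters such as braces and brackets have a meaning to most shells.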
The format specification itself consists of two parts, the image dimensions and the layout of the data:

<width>x<height>x<depth>:<datalayout>

where <width> is the width, <height> the height and <depth> the number of components in the image, without the angle brackets (pure numerical values). Numbers are separated by 'x' (lower-case x). This part MAY be omitted for saving since image dimensions are known. The colon ':', however, must be present.

Image data can be either represented INTERLEAVED, that is, all components of a single pixel are adjacent to each other, or SEPARATE, that is, each component is described in a separate plane, and the planes are stored adjacent to each other.

Additional padding bits might be present in the interleaved representation. A single pixel is described by one or several fields, where each field encodes the data of a single component or may simply be present for padding. Fields can be either signed or unsigned, have a bit-width and an endianness.

In the interleaved representation, fields can be bit-packed together, i.e. may share bits in a byte, word or longword. If fields are bit-packed, the total number of bits must be either 8, 16 or 32, and the packed unit shares the endianness of all its components, i.e. all components must indicate the same endianness.

In the separate representation, bits of a component are packed near each other without any padding. However, some components might be subsampled, i.e. may contain fewer samples than others. This option does not exist for interleaved data.

The format specification looks like this for interleaved data:

<packing>{<sign-flag><bits><endian>=<target>}:{<sign-flag><bits><endian>=<target>}:...

where <packing> is either + or -, indicating the packing order within the field. With +, which is the default, packing is from MSB to LSB; with '-', bits are packed from LSB to MSB.
where the curly brackets indicate the interleaved format.
where <sign-flag> is an optional '+' or '-' sign indicating whether the component is signed (then '-') or unsigned (then '+'). If omitted, the component is unsigned.
where <bits> is a mandatory number of bits the component takes, e.g. 8.
where <endian> is an optional endian indicator. It is '+' for big-endian and '-' for little-endian data. If omitted, big-endian is assumed.
where <target> indicates to which component the data belongs, e.g. 0 for red. The target is separated by an equals sign '=' from the endianness. For padding data, equals sign and target are omitted.
where the colon ':' indicates that fields have separate endianness. This only works for fields of widths 8, 16, 32 or 64 bits. If the colon is replaced by a comma, the fields adjacent to the comma are bit-packed into a single data unit. In total, up to 32 bits can be packed together, and the total number of bits packed must be either 8, 16 or 32. Then, the endianness of all fields packed together applies to the packed result, and must be identical.

By default, bit-packing reads from the MSB to the LSB. With the optional minus sign in front, packing is from LSB to MSB.

Components may appear multiple times in the same format specification, which implies subsampling. The subsampling factors are inferred from the sample count per component. The sample pattern is filled at each line end, potentially creating dummy samples that will be skipped over when reading and which are written as zero on writing.

The format for the separate representation uses square brackets instead:

<packing>[<sign-flag><bits><endian>=<target>]/<subx>x<suby>:...

and the syntax is as above, except that subsampling factors can be added to a field description.
They are separated by a slash '/' from the field description, followed by the horizontal and vertical subsampling factors, separated by an 'x'.

The separate format also allows packing several components together into one field. Similar to the above, packed fields are separated by a comma instead of a colon. In such a case, the subsampling of the packed channels must be consistent and identical within the same plane. A padding channel is indicated by a missing target specification.

EXAMPLES:

A 640x480 RGB image with 8 bits per component encoded as RRRRRRRRGGGGGGGGBBBBBBBB is denoted as this:

image.raw@640x480x3:{8=0}:{8=1}:{8=2}

Image dimensions can be omitted when saving, it then may be simplified to:

image.raw@:{8=0}:{8=1}:{8=2}

Note both '@' and ':' must be present.

Typical image formats:

{8=2}:{8=1}:{8=0}
    24 bits, pixel layout: BBBBBBBBGGGGGGGGRRRRRRRR

{8=2}:{8=1}:{8=0}:{8}
    32 bits, plus a pad byte: BBBBBBBBGGGGGGGGRRRRRRRR00000000

{8=2}:{8=1}:{8=0}:{8=3}
    32 bits plus alpha channel: BBBBBBBBGGGGGGGGRRRRRRRRAAAAAAAA

[1=0]
    bit-packed 1bpp black & white image

[8=0]:[8=1]:[8=2]
    YUV or RGB in separate encoding, three planes, each 8bpp

[8=0]:[8=1]/2x2:[8=2]/2x2
    YUV 420 in separate planes

[8=0]:[8=1]/2x1:[8=2]/2x1
    YUV 422 in separate planes

[4],[12=0]:[4]/2x1,[12=1]/2x1:[4]/2x1,[12=2]/2x1
    YUV 422 in separate planes, 12 bits per component where each component is packed into 16 bits with padding bits upfront, represented in big-endian

[4-],[12-=0]:[4-]/2x1,[12-=1]/2x1:[4-]/2x1,[12-=2]/2x1
    YUV 422 12 bits/component as above, but little-endian.

{2-},{10-=2},{10-=1},{10-=0}
    32 bits, pixel layout is ten bits per component with two padding bits in front, packed into 32 bits which is written in little-endian format. Read as a 32-bit word on a little-endian machine, this format reads
    00BBBBBBBBBBGGGGGGGGGGRRRRRRRRRR
    but in the file, bytes are shuffled around:
    RRRRRRRR GGGGGGRR BBBBGGGG 00BBBBBB

{1-},{5-=2},{5-=1},{5-=0}
    16 bits, pixel layout is five bits per component with a single pad-bit upfront, packed into a 16-bit word which is written in little-endian format. On a little-endian machine, this reads as
    0BBBBBGGGGGRRRRR
    but in the file, bytes are ordered reversely:
    GGGRRRRR 0BBBBBGG

-{10-=1},{10-=0},{10-=2},{2-}:-{10-=0},{10-=1},{10-=0},{2-}:-{10-=2},{10-=0},{10-=1},{2-}:-{10-=0},{10-=2},{10-=0},{2-}
    This (single) line creates a raw format that conforms to the V210 pixel format. Each component is 10 bits wide, three samples are left-aligned into one 32 bit word, causing two padding bits per 32 bit word. Each 32 bit word is in little endian, and filled from LSB to MSB. The sampling order is UYV - YUY - VYU - YVY. Subsampling factors are derived from the sample counts and padding is applied at the end of each line to one complete cycle.

-----------------------------------------------------------------------------------------------
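The byte shuffling in the {2-},{10-=2},{10-=1},{10-=0} example can be reproduced with a few lines of Python. This is a sketch of the bit layout as described in the example above, not code from difftest_ng; the function names pack_rgb10_le and unpack_rgb10_le are hypothetical. It assumes component 0 is red, 1 is green and 2 is blue, as in the example's layout strings.

```python
import struct


def pack_rgb10_le(red, green, blue):
    """Pack three 10-bit samples into one little-endian 32-bit word.

    Bit layout of the word, MSB first: 00 BBBBBBBBBB GGGGGGGGGG RRRRRRRRRR
    """
    for sample in (red, green, blue):
        if not 0 <= sample < 1024:
            raise ValueError("samples must fit into 10 bits")
    word = (blue << 20) | (green << 10) | red
    # '<I' writes an unsigned 32-bit integer with the least significant
    # byte first, which produces the shuffled byte order seen in the file.
    return struct.pack("<I", word)


def unpack_rgb10_le(data):
    """Inverse of pack_rgb10_le: four bytes back to (red, green, blue)."""
    (word,) = struct.unpack("<I", data)
    return word & 0x3FF, (word >> 10) & 0x3FF, (word >> 20) & 0x3FF


if __name__ == "__main__":
    # A pure red sample only touches the first byte and the two low bits
    # of the second byte, matching RRRRRRRR GGGGGGRR in the description.
    print(pack_rgb10_le(0x3FF, 0, 0).hex(" "))
    print(unpack_rgb10_le(pack_rgb10_le(100, 200, 300)))
```

The same shift-and-mask approach applies to the 16-bit {1-},{5-=2},{5-=1},{5-=0} layout, with five-bit fields and a 16-bit word.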