Algorithmic processing of digitally-represented images
This article is about mathematical processing of digital images. For artistic processing of images, see Image editing. For compression algorithms, see Image compression.
Digital image processing is the use of a digital computer to process digital images through an algorithm.[1][2] As a subcategory or field of digital signal processing, digital image processing has many advantages over analog image processing. It allows a much wider range of algorithms to be applied to the input data and can avoid problems such as the build-up of noise and distortion during processing. Since images are defined over two dimensions (perhaps more), digital image processing may be modeled in the form of multidimensional systems. The generation and development of digital image processing are mainly affected by three factors: first, the development of computers;[3] second, the development of mathematics (especially the creation and improvement of discrete mathematics theory);[4] and third, the increased demand for a wide range of applications in environment, agriculture, military, industry and medical science.[5]
Many of the techniques of digital image processing, or digital picture processing as it often was called, were developed in the 1960s, at Bell Laboratories, the Jet Propulsion Laboratory, Massachusetts Institute of Technology, University of Maryland, and a few other research facilities, with application to satellite imagery, wire-photo standards conversion, medical imaging, videophone, character recognition, and photograph enhancement.[6] The purpose of early image processing was to improve the quality of the image for human viewers. In such processing, the input is a low-quality image, and the output is an image with improved quality. Common image processing tasks include image enhancement, restoration, encoding, and compression. The first successful application was at the American Jet Propulsion Laboratory (JPL). They used image processing techniques such as geometric correction, gradation transformation, and noise removal on the thousands of lunar photos sent back by the space probe Ranger 7 in 1964, taking into account the position of the Sun and the environment of the Moon. The computer-generated map of the Moon's surface was a great success. Later, more complex image processing was performed on the nearly 100,000 photos sent back by the spacecraft, so that a topographic map, color map and panoramic mosaic of the Moon were obtained, which achieved extraordinary results and laid a solid foundation for human landing on the Moon.[7]
The cost of processing was fairly high, however, with the computing equipment of that era. That changed in the 1970s, when digital image processing proliferated as cheaper computers and dedicated hardware became available. This led to images being processed in real-time, for some dedicated problems such as television standards conversion. As general-purpose computers became faster, they started to take over the role of dedicated hardware for all but the most specialized and computer-intensive operations. With the fast computers and signal processors available in the 2000s, digital image processing has become the most common form of image processing, and is generally used because it is not only the most versatile method, but also the cheapest.
The charge-coupled device was invented by Willard S. Boyle and George E. Smith at Bell Labs in 1969.[15] While researching MOS technology, they realized that an electric charge was analogous to the magnetic bubble and that it could be stored on a tiny MOS capacitor. As it was fairly straightforward to fabricate a series of MOS capacitors in a row, they connected a suitable voltage to them so that the charge could be stepped along from one to the next.[8] The CCD is a semiconductor circuit that was later used in the first digital video cameras for television broadcasting.[16]
Medical imaging techniques produce very large amounts of data, especially from CT, MRI and PET modalities. As a result, storage and communication of electronic image data are prohibitive without the use of compression.[31][32] JPEG 2000 image compression is used by the DICOM standard for storage and transmission of medical images. The cost and feasibility of accessing large image data sets over low or varying bandwidths are further addressed by use of another DICOM standard, called JPIP, to enable efficient streaming of the JPEG 2000 compressed image data.[33]
In 1972, engineer Godfrey Hounsfield from the British company EMI invented the X-ray computed tomography (CT) device for head diagnosis. The core of the CT method is to project X-rays through a section of the human head; the measurements are then processed by computer to reconstruct the cross-sectional image, a step known as image reconstruction. In 1975, EMI successfully developed a CT device for the entire body, enabling the clear acquisition of tomographic images of various parts of the human body. This revolutionary diagnostic technique earned Hounsfield and physicist Allan Cormack the Nobel Prize in Physiology or Medicine in 1979.[7] Digital image processing technology for medical applications was inducted into the Space Foundation's Space Technology Hall of Fame in 1994.[39]
Digital image processing allows the use of much more complex algorithms, and hence, can offer both more sophisticated performance at simple tasks, and the implementation of methods which would be impossible by analogue means.
In particular, digital image processing is a concrete application of, and a practical technology based on:
Images are typically padded before being transformed to Fourier space. The highpass-filtered images below illustrate the consequences of different padding techniques:
Zero padded
Repeated edge padded
Notice that the highpass filter shows extra edges when zero padded compared to the repeated edge padding.
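The border effect described above can be reproduced with a minimal sketch. It works in the spatial domain rather than in Fourier space, uses an assumed 3×3 Laplacian-style kernel, and the helper names `pad` and `highpass` are illustrative, not taken from the article. On a perfectly flat image a highpass filter should return zero everywhere; any non-zero output is an artificial edge introduced by the padding.

```python
def pad(image, mode):
    """Pad a 2-D list by one pixel on every side.

    mode="zero" fills the border with 0; mode="edge" repeats the nearest pixel.
    """
    rows, cols = len(image), len(image[0])
    padded = []
    for i in range(-1, rows + 1):
        row = []
        for j in range(-1, cols + 1):
            inside = 0 <= i < rows and 0 <= j < cols
            if inside or mode == "edge":
                # Clamp indices so border positions copy the nearest pixel.
                ci = min(max(i, 0), rows - 1)
                cj = min(max(j, 0), cols - 1)
                row.append(image[ci][cj])
            else:
                row.append(0)
        padded.append(row)
    return padded


# Assumed highpass kernel: weights sum to zero, so flat regions map to 0.
HIGHPASS = [[-1, -1, -1],
            [-1,  8, -1],
            [-1, -1, -1]]


def highpass(image, mode):
    """Apply the 3x3 kernel to every pixel after padding with the given mode."""
    padded = pad(image, mode)
    rows, cols = len(image), len(image[0])
    return [[sum(HIGHPASS[m][n] * padded[i + m][j + n]
                 for m in range(3) for n in range(3))
             for j in range(cols)]
            for i in range(rows)]


flat = [[10] * 4 for _ in range(4)]   # constant image: no real edges
zero_result = highpass(flat, "zero")  # non-zero along the border
edge_result = highpass(flat, "edge")  # zero everywhere
```

With zero padding the filter reports a response along the image border because the padded zeros differ from the image content, whereas repeated-edge padding leaves the flat image free of spurious edges.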
Filtering code examples
MATLAB example for spatial domain highpass filtering.
Affine transformations enable basic image transformations including scale, rotate, translate, mirror and shear as is shown in the following examples:[46]
To apply the affine matrix to an image, the image is converted to a matrix in which each entry corresponds to the pixel intensity at that location. Each pixel's location can then be represented as a vector indicating the coordinates of that pixel in the image, [x, y], where x and y are the row and column of a pixel in the image matrix. This allows the coordinate to be multiplied by an affine-transformation matrix, which gives the position that the pixel value will be copied to in the output image.
However, transformations that involve translation require 3-dimensional homogeneous coordinates. The third dimension is set to a non-zero constant, usually 1, so that the new coordinate is [x, y, 1]. This allows the coordinate vector to be multiplied by a 3 by 3 matrix, which makes translation shifts possible.
Because matrix multiplication is associative, multiple affine transformations can be combined into a single affine transformation by multiplying the matrices of the individual transformations in the order in which the transformations are applied (for column vectors [x, y, 1], the matrix applied first is the rightmost factor of the product). This results in a single matrix that, when applied to a point vector, gives the same result as all the individual transformations performed on the vector [x, y, 1] in sequence. Thus a sequence of affine transformation matrices can be reduced to a single affine transformation matrix.
For example, 2 dimensional coordinates only allow rotation about the origin (0, 0). But 3 dimensional homogeneous coordinates can be used to first translate any point to (0, 0), then perform the rotation, and lastly translate the origin (0, 0) back to the original point (the opposite of the first translation). These 3 affine transformations can be combined into a single matrix, thus allowing rotation around any point in the image.[47]
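The translate, rotate, translate-back composition can be sketched as follows. This is a minimal illustration using plain nested lists and column-vector convention; the function names (`translation`, `rotation`, `rotation_about`, `apply`) are hypothetical helpers chosen for this example.

```python
import math


def matmul(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def translation(tx, ty):
    """Homogeneous matrix that shifts a point by (tx, ty)."""
    return [[1, 0, tx], [0, 1, ty], [0, 0, 1]]


def rotation(theta):
    """Homogeneous matrix that rotates about the origin by theta radians."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0], [s, c, 0], [0, 0, 1]]


def apply(matrix, x, y):
    """Apply a 3x3 affine matrix to the homogeneous column vector [x, y, 1]."""
    v = (x, y, 1)
    out = [sum(matrix[i][k] * v[k] for k in range(3)) for i in range(3)]
    return out[0], out[1]


def rotation_about(cx, cy, theta):
    """Single matrix for a rotation about the point (cx, cy)."""
    # The rightmost factor acts first: move (cx, cy) to the origin,
    # rotate, then move the origin back to (cx, cy).
    return matmul(translation(cx, cy),
                  matmul(rotation(theta), translation(-cx, -cy)))


combined = rotation_about(1, 1, math.pi / 2)
rotated = apply(combined, 2, 1)   # approximately (1.0, 2.0)
centre = apply(combined, 1, 1)    # the centre of rotation stays fixed
```

Because the three matrices are multiplied once up front, every pixel coordinate needs only one matrix-vector product rather than three.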
The following examples illustrate structuring elements. The denoising function, the image I, and the structuring element B are shown below and in the table.
e.g.
Define Dilation(I, B)(i,j) = $\max_{(m,n)} \{ I(i+m,\, j+n) + B(m,n) \}$. Let Dilation(I,B) = D(I,B)
D(I', B)(1,1) =
Define Erosion(I, B)(i,j) = $\min_{(m,n)} \{ I(i+m,\, j+n) - B(m,n) \}$. Let Erosion(I,B) = E(I,B)
E(I', B)(1,1) =
After dilation
After erosion
The opening method is simply erosion followed by dilation, while the closing method is the reverse: dilation followed by erosion. In practice, D(I,B) and E(I,B) can be implemented by convolution.
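As a rough sketch of these four operations, the code below implements grayscale dilation and erosion as a sliding-window maximum and minimum, then builds opening and closing from them. The border handling (ignoring window positions outside the image), the flat all-zero structuring element, and all function names are assumptions made for this example.

```python
def _window(image, i, j, se):
    """Yield (pixel, se_value) pairs for the window centred on (i, j),
    skipping positions that fall outside the image."""
    rows, cols = len(image), len(image[0])
    half_h, half_w = len(se) // 2, len(se[0]) // 2
    for m in range(len(se)):
        for n in range(len(se[0])):
            y, x = i + m - half_h, j + n - half_w
            if 0 <= y < rows and 0 <= x < cols:
                yield image[y][x], se[m][n]


def dilate(image, se):
    """Grayscale dilation: maximum of pixel + structuring-element value."""
    return [[max(p + b for p, b in _window(image, i, j, se))
             for j in range(len(image[0]))] for i in range(len(image))]


def erode(image, se):
    """Grayscale erosion: minimum of pixel - structuring-element value."""
    return [[min(p - b for p, b in _window(image, i, j, se))
             for j in range(len(image[0]))] for i in range(len(image))]


def opening(image, se):
    """Erosion followed by dilation; removes small bright specks."""
    return dilate(erode(image, se), se)


def closing(image, se):
    """Dilation followed by erosion; fills small dark holes."""
    return erode(dilate(image, se), se)


FLAT_SE = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]  # assumed flat 3x3 element

speck = [[0] * 5 for _ in range(5)]
speck[2][2] = 9                     # isolated bright pixel (noise)
hole = [[9] * 5 for _ in range(5)]
hole[2][2] = 0                      # isolated dark pixel (gap)

opened = opening(speck, FLAT_SE)    # speck removed: all zeros
closed = closing(hole, FLAT_SE)     # hole filled: all nines
```

Opening suits salt-like noise on a dark background, while closing suits filling small gaps inside a bright region.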
Structuring element
Mask
Code
Example
Original Image
None
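Use MATLAB to read the original image:
original = imread('scene.jpg');        % load the RGB image
image = rgb2gray(original);            % convert to grayscale
[r, c, channel] = size(image);         % image dimensions
se = logical([1 1 1; 1 1 1; 1 1 1]);   % 3-by-3 structuring element
[p, q] = size(se);
halfH = floor(p/2);                    % half-height of the structuring element
halfW = floor(q/2);                    % half-width of the structuring element
time = 3;                              % denoise 3 times with each method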
Digital cameras generally include specialized digital image processing hardware, either dedicated chips or added circuitry on other chips, to convert the raw data from their image sensor into a color-corrected image in a standard image file format. Additional post-processing techniques increase edge sharpness or color saturation to create more natural-looking images.
Film
Westworld (1973) was the first feature film to use digital image processing to pixellate photography to simulate an android's point of view.[48] Image processing is also widely used to produce the chroma key effect that replaces the background of actors with natural or artistic scenery.
The feature-based method of face detection uses skin tone, edge detection, face shape, and facial features (such as the eyes and mouth) to detect faces. The skin tone, face shape, and all the unique elements that only the human face has can be described as features.
Process explanation
Given a batch of face images, first extract the skin tone range by sampling the face images. The skin tone range acts as a skin filter.
The structural similarity index measure (SSIM) can be applied to compare images when extracting the skin tone.
Normally, the HSV or RGB color spaces are suitable for the skin filter. For example, in HSV mode, the skin tone range is [0,48,50] ~ [20,255,255].
After filtering the images by skin tone, morphology and DCT are used to remove noise and fill in missing skin areas so that the face edge can be found.
The opening or closing method can be used to fill in missing skin areas.
DCT is used to reject objects with a skin-like tone, since human faces always have higher texture.
The Sobel operator or other operators can be applied to detect the face edge.
To locate facial features such as the eyes, project the image and find the peaks of the projection histogram; this helps to locate detailed features such as the mouth, hair, and lips.
Projection means projecting the image to reveal the high-frequency regions, which are usually the feature positions.
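The skin-filter and projection steps above can be sketched as follows. The sketch assumes the quoted HSV range uses the scale common in OpenCV-style code (H in 0 to 179, S and V in 0 to 255); the article does not state the scale. The sample colours, the tiny test image, and the function names are likewise made up for illustration, and the projection is taken over the skin mask rather than over a real face image.

```python
import colorsys

# Range quoted in the text; the H/S/V scale is an assumption (see above).
LOWER = (0, 48, 50)
UPPER = (20, 255, 255)


def to_hsv_255(r, g, b):
    """Convert 8-bit RGB to HSV with H in [0, 180) and S, V in [0, 255]."""
    h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
    return h * 180, s * 255, v * 255


def is_skin(r, g, b):
    """True if every HSV component lies inside the skin tone range."""
    hsv = to_hsv_255(r, g, b)
    return all(lo <= c <= hi for c, lo, hi in zip(hsv, LOWER, UPPER))


def skin_mask(image):
    """image is a 2-D list of (r, g, b) tuples; returns a 2-D list of 0/1."""
    return [[1 if is_skin(*px) else 0 for px in row] for row in image]


def column_projection(mask):
    """Sum the mask down each column; peaks mark columns rich in skin."""
    return [sum(col) for col in zip(*mask)]


SKIN, SKY = (224, 172, 105), (0, 0, 255)   # hypothetical sample colours
tiny = [[SKY, SKIN, SKIN, SKY],
        [SKY, SKIN, SKIN, SKY],
        [SKY, SKY,  SKIN, SKY]]

mask = skin_mask(tiny)
profile = column_projection(mask)
```

In a fuller pipeline the mask would next be cleaned with opening or closing, and an edge operator such as Sobel would be applied to the cleaned region.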
Improvement of image quality method
Image quality can be degraded by camera vibration, over-exposure, an overly concentrated gray level distribution, noise, and other factors. For example, noise can be reduced by the smoothing method, while a concentrated gray level distribution can be improved by histogram equalization.
In drawing, if a color is unsatisfactory, one can take the colors around it and average them. This is an easy way to think of the smoothing method.
The smoothing method can be implemented with a mask and convolution. Take the small image and mask below as an example.
Observe image[1, 1], image[1, 2], image[2, 1], and image[2, 2].
The original pixel values are 1, 4, 28, and 30. After applying the smoothing mask, they become 9, 10, 9, and 9 respectively.
new image[1, 1] = $\frac{1}{9}$ * (image[0,0]+image[0,1]+image[0,2]+image[1,0]+image[1,1]+image[1,2]+image[2,0]+image[2,1]+image[2,2])
new image[1, 1] = round($\frac{1}{9}$ * (2+5+6+3+1+4+1+28+30)) = round(80/9) = 9
new image[1, 2] = round($\frac{1}{9}$ * (5+6+5+1+4+6+28+30+2)) = round(87/9) = 10
new image[2, 1] = round($\frac{1}{9}$ * (3+1+4+1+28+30+7+3+2)) = round(79/9) = 9
new image[2, 2] = round($\frac{1}{9}$ * (1+4+6+28+30+2+3+2+2)) = round(78/9) = 9
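The arithmetic above can be checked with a short sketch. The 4×4 image is inferred from the terms of the four sums, since the original figure is not reproduced here, and the function name `smooth_pixel` is illustrative. The stated results 9, 10, 9, 9 correspond to rounding the mean to the nearest integer; truncating the same means would give 8, 9, 8, 8.

```python
# 4x4 image inferred from the neighbourhood sums in the worked example.
IMAGE = [[2, 5, 6, 5],
         [3, 1, 4, 6],
         [1, 28, 30, 2],
         [7, 3, 2, 2]]


def smooth_pixel(image, i, j):
    """Mean of the 3x3 neighbourhood centred on (i, j), rounded to nearest."""
    total = sum(image[i + di][j + dj]
                for di in (-1, 0, 1) for dj in (-1, 0, 1))
    # Adding 0.5 before truncation rounds half up for non-negative values.
    return int(total / 9 + 0.5)


# Only the four interior pixels have a complete 3x3 neighbourhood.
smoothed = {(i, j): smooth_pixel(IMAGE, i, j) for i in (1, 2) for j in (1, 2)}
```

The two bright pixels (28 and 30) are pulled down towards their darker surroundings, which is the intended noise-suppressing effect of the averaging mask.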
Gray Level Histogram method
Consider the gray level histogram of an image, as shown below. Transforming this histogram into a uniform distribution is what is usually called histogram equalization.
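In the discrete case, the area of the gray level histogram is $\sum_{i=0}^{k} H(p_i)$ (see figure 1), while the area of the uniform distribution is $\sum_{i=0}^{k} G(q_i)$ (see figure 2). It is clear that the area will not change, so $\sum_{i=0}^{k} H(p_i) = \sum_{i=0}^{k} G(q_i)$.
From the uniform distribution, the height of the histogram at $q_i$ is $\frac{N^2}{q_k - q_0}$ for $0 < i < k$, where $N^2$ is the number of pixels in the image.
In the continuous case, the equation is $\int_{q_0}^{q} \frac{N^2}{q_k - q_0}\, ds = \int_{p_0}^{p} H(s)\, ds$.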
Moreover, based on the definition of a function, the Gray level histogram method is like finding a function that satisfies f(p)=q.
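One common way to build such a mapping f(p) = q is from the cumulative histogram. The sketch below uses that widely used CDF-based formulation, which is not necessarily the exact derivation intended here; the function name `equalize` and the sample image are assumptions for illustration.

```python
def equalize(image, levels=256):
    """Histogram-equalize a 2-D list of integer gray levels in [0, levels)."""
    flat = [p for row in image for p in row]

    hist = [0] * levels
    for p in flat:
        hist[p] += 1

    # Cumulative histogram: cdf[p] = number of pixels with level <= p.
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)

    cdf_min = next(c for c in cdf if c > 0)
    total = len(flat)
    if total == cdf_min:
        # Constant image: there is nothing to spread out.
        return [row[:] for row in image]

    # f(p) = q: rescale the cumulative counts to span the full gray range.
    mapping = [max(0, round((c - cdf_min) / (total - cdf_min) * (levels - 1)))
               for c in cdf]
    return [[mapping[p] for p in row] for row in image]


dark = [[50, 50, 51, 51],
        [52, 52, 53, 53]]          # gray levels crowded into 50..53
equalized = equalize(dark)         # levels spread across 0..255
```

The four crowded input levels are spread evenly over the available range, which is the discrete counterpart of matching the histogram area to that of a uniform distribution.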
Improvement method
Issue
Before improvement
Process
After improvement
Smoothing method
noise
With MATLAB, salt-and-pepper noise with a parameter of 0.01 is added to the original image in order to create a noisy image.
Noise and Distortions: Imperfections in images due to poor lighting, limited sensors, and file compression can result in unclear images that impact accurate image conversion.
Variability in Image Quality: Variations in image quality and resolution, including blurry images and incomplete details, can hinder uniform processing across a database.
Object Detection and Recognition: Identifying and recognising objects within images, especially in complex scenarios with multiple objects and occlusions, poses a significant challenge.
Data Annotation and Labelling: Labelling diverse and multiple images for machine recognition is crucial for further processing accuracy, as incorrect identification can lead to unrealistic results.
Computational Resource Intensity: Accessing adequate computational resources for image processing can be challenging and costly, hindering progress without sufficient resources.
^ Azriel Rosenfeld, Picture Processing by Computer, New York: Academic Press, 1969
^ a b Gonzalez, Rafael C.; Woods, Richard E. (2008). Digital image processing (3rd ed.). Upper Saddle River, N.J.: Prentice Hall. pp. 23–28. ISBN 978-0-13-168728-8. OCLC 137312858.
^ Dhouib, D.; Naït-Ali, A.; Olivier, C.; Naceur, M.S. (June 2021). "ROI-Based Compression Strategy of 3D MRI Brain Datasets for Wireless Communications". IRBM. 42 (3): 146–153. doi:10.1016/j.irbm.2020.05.001. S2CID 219437400. Because of the large amount of medical imaging data, the transmission process becomes complicated in telemedicine applications. Thus, in order to adapt the data bit streams to the constraints related to the limitation of the bandwidths a reduction of the size of the data by compression of the images is essential.
^ Grant, Duncan Andrew; Gowar, John (1989). Power MOSFETS: theory and applications. Wiley. p. 1. ISBN 978-0-471-82867-9. The metal–oxide–semiconductor field-effect transistor (MOSFET) is the most commonly used active device in the very large-scale integration of digital integrated circuits (VLSI). During the 1970s these components revolutionized electronic signal processing, control systems and computers.
^ Zhang, M. Z.; Livingston, A. R.; Asari, V. K. (2008). "A High Performance Architecture for Implementation of 2-D Convolution with Quadrant Symmetric Kernels". International Journal of Computers and Applications. 30 (4): 298–308. doi:10.1080/1206212x.2008.11441909. S2CID 57289814.
^ a b Gonzalez, Rafael (2008). Digital Image Processing, 3rd. Pearson Hall. ISBN 978-0-13-168728-8.
^ House, Keyser (6 December 2016). Affine Transformations (PDF). Foundations of Physically Based Modeling & Animation. A K Peters/CRC Press. ISBN 978-1-4822-3460-2. Archived (PDF) from the original on 30 August 2017. Retrieved 26 March 2019.
Solomon, C.J.; Breckon, T.P. (2010). Fundamentals of Digital Image Processing: A Practical Approach with Examples in Matlab. Wiley-Blackwell. doi:10.1002/9780470689776. ISBN 978-0-470-84473-1.
R. Fisher; K Dawson-Howe; A. Fitzgibbon; C. Robertson; E. Trucco (2005). Dictionary of Computer Vision and Image Processing. John Wiley. ISBN 978-0-470-01526-1.
Rafael C. Gonzalez; Richard E. Woods; Steven L. Eddins (2004). Digital Image Processing using MATLAB. Pearson Education. ISBN 978-81-7758-898-9.
Tim Morris (2004). Computer Vision and Image Processing. Palgrave Macmillan. ISBN 978-0-333-99451-1.
Vipin Tyagi (2018). Understanding Digital Image Processing. Taylor and Francis CRC Press. ISBN 978-11-3856-6842.
Milan Sonka; Vaclav Hlavac; Roger Boyle (1999). Image Processing, Analysis, and Machine Vision. PWS Publishing. ISBN 978-0-534-95393-5.
Gonzalez, Rafael C.; Woods, Richard E. (2008). Digital image processing. Upper Saddle River, N.J.: Prentice Hall. ISBN 978-0-13-168728-8. OCLC 137312858.
Kovalevsky, Vladimir (2019). Modern algorithms for image processing: computer imagery by example using C#. [New York, New York]. ISBN 978-1-4842-4237-7. OCLC 1080084533.