Differentiable Compound Optics and Processing Pipeline Optimization for End-to-end Camera Design

Ethan Tseng; Ali Mosleh; Fahim Mannan; Karl St-Arnaud; Avinash Sharma; Yifan Peng; Alexander Braun; Derek Nowrouzezahrai; Jean-François Lalonde; Felix Heide

doi:10.1145/3446791

ACM Transactions on Graphics ◽

10.1145/3446791 ◽

2021 ◽

Vol 40 (2) ◽

pp. 1-19

Author(s):

Ethan Tseng ◽

Ali Mosleh ◽

Fahim Mannan ◽

Karl St-Arnaud ◽

Avinash Sharma ◽

...

Keyword(s):

Neural Network ◽

Image Processing ◽

Optical Design ◽

Neural Nets ◽

Fine Tuning ◽

Optical Systems ◽

Design Tools ◽

End To End ◽

Application Specific ◽

Network Processing

Most modern commodity imaging systems we use directly for photography, or indirectly rely on for downstream applications, employ optical systems of multiple lenses that must balance deviations from perfect optics, manufacturing constraints, tolerances, cost, and footprint. Although optical designs often have complex interactions with downstream image processing or analysis tasks, today’s compound optics are designed in isolation from these interactions. Existing optical design tools aim to minimize optical aberrations, such as deviations from Gauss’ linear model of optics, instead of application-specific losses, precluding joint optimization with hardware image signal processing (ISP) and highly parameterized neural network processing. In this article, we propose an optimization method for compound optics that lifts these limitations. We optimize entire lens systems jointly with hardware and software image processing pipelines, downstream neural network processing, and application-specific end-to-end losses. To this end, we propose a learned, differentiable forward model for compound optics and an alternating proximal optimization method that handles function compositions with highly varying parameter dimensions for optics, hardware ISP, and neural nets. Our method integrates seamlessly atop existing optical design tools, such as Zemax. We can thus assess our method across many camera system designs and end-to-end applications. We validate our approach in an automotive camera optics setting, together with hardware ISP post-processing and detection, outperforming classical optics designs for automotive object detection and traffic light state detection. For human viewing tasks, we optimize optics and processing pipelines for dynamic outdoor scenarios and dynamic low-light imaging. We outperform existing compartmentalized design or fine-tuning methods qualitatively and quantitatively, across all domain-specific applications tested.

Optical design tools for reflective optical systems

10.1117/12.478492 ◽

2002 ◽

Cited By ~ 1

Author(s):

Joseph M. Howard

Keyword(s):

Optical Design ◽

Optical Systems ◽

Design Tools

Recognize Number of fingers from Single hand gesture Image using Image processing and Neural Network

International Journal of Computer Sciences and Engineering ◽

10.26438/ijcse/v6i5.349352 ◽

2018 ◽

Vol 6 (5) ◽

pp. 349-352

Author(s):

H. Bhavsar ◽

J. Trivedi

Keyword(s):

Neural Network ◽

Image Processing ◽

Hand Gesture

Trauma Identification Using Image Processing and FeedForward Neural Network

SSRN Electronic Journal ◽

10.2139/ssrn.3734767 ◽

2020 ◽

Author(s):

Sofia R

Keyword(s):

Neural Network ◽

Image Processing ◽

Feedforward Neural Network

Efficient Computing in Image Processing and DSPs with ASIP Based Multiplier

Recent Patents on Engineering ◽

10.2174/1872212112666180810150357 ◽

2019 ◽

Vol 13 (2) ◽

pp. 174-180

Author(s):

Poonam Sharma ◽

Ashwani Kumar Dubey ◽

Ayush Goyal

Keyword(s):

Image Processing ◽

High Speed ◽

Computation Time ◽

Digital Signal ◽

Instruction Set ◽

Computationally Efficient ◽

Specific Instruction ◽

Processor Core ◽

Speed Performance ◽

Application Specific

Background: With the growing demand for image processing and the use of Digital Signal Processors (DSP), the efficiency of Multipliers and Accumulators (MAC) has become a bottleneck. We revised a few patents on an Application Specific Instruction Set Processor (ASIP), where design considerations are proposed for efficient application-specific computing to enhance throughput. Objective: The study aims to develop and analyze a computationally efficient method to optimize the speed performance of the MAC. Methods: The work presented here proposes the design of an Application Specific Instruction Set Processor that exploits a Multiplier Accumulator integrated as dedicated hardware. This MAC is optimized for high-speed performance and is the application-specific part of the processor; here it can be the DSP block of an image processor, while a 16-bit Reduced Instruction Set Computer (RISC) processor core gives the design the flexibility for general computing. The design was emulated on a Xilinx Field Programmable Gate Array (FPGA) and tested for various real-time computing tasks. Results: The synthesis of the hardware logic on FPGA tools gave the operating frequencies of the legacy methods and the proposed method; the simulation of the logic verified the functionality. Conclusion: With the proposed method, a 16% increase in throughput was observed for 256-step iterations of the multiplier and accumulator on 8-bit sample data. Such an improvement can help reduce the computation time in many digital signal processing applications where multiplication and addition are done iteratively.
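
For readers unfamiliar with the operation being accelerated, the following is a minimal software sketch of an iterated multiply-accumulate over 8-bit samples. It is not the authors' hardware design: the function name `mac`, the coefficient input, and the 32-bit accumulator width are assumptions made purely for illustration.

```python
def mac(samples, coeffs, acc_bits=32):
    """Iterated multiply-accumulate: acc += sample * coeff at every step.

    acc_bits is an assumed accumulator width; the result wraps around
    modulo 2**acc_bits, as a fixed-width hardware register would.
    """
    mask = (1 << acc_bits) - 1
    acc = 0
    for x, c in zip(samples, coeffs):
        acc = (acc + x * c) & mask  # one multiply and one add per step
    return acc


# 256 steps on 8-bit data (values 0..255), the workload size named above.
samples = [255] * 256
coeffs = [255] * 256
total = mac(samples, coeffs)  # 255 * 255 * 256 = 16646400
```

Each loop iteration corresponds to one multiply and one add, which is the step a dedicated MAC unit performs in hardware.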

Knowledge Transferred Fine-Tuning for Anti-Aliased Convolutional Neural Network in Data-Limited Situation

10.1109/icip42928.2021.9506696 ◽

2021 ◽

Author(s):

Satoshi Suzuki ◽

Shoichiro Takeda ◽

Ryuichi Tanida ◽

Hideaki Kimata ◽

Hayaru Shouno

Keyword(s):

Neural Network ◽

Convolutional Neural Network ◽

Fine Tuning

Convolutional Neural Network for the Semantic Segmentation of Remote Sensing Images

Mobile Networks and Applications ◽

10.1007/s11036-020-01703-3 ◽

2021 ◽

Vol 26 (1) ◽

pp. 200-215

Author(s):

Muhammad Alam ◽

Jian-Feng Wang ◽

Cong Guangpei ◽

LV Yunrong ◽

Yuanfang Chen

Keyword(s):

Neural Network ◽

Remote Sensing ◽

Neural Networks ◽

Image Processing ◽

Deep Learning ◽

Semantic Segmentation ◽

Natural Scene ◽

Remote Sensing Images ◽

Advantages And Disadvantages ◽

Target Segmentation

In recent years, the success of deep learning in natural scene image processing boosted its application in the analysis of remote sensing images. In this paper, we applied Convolutional Neural Networks (CNN) to the semantic segmentation of remote sensing images. We improve the Encoder-Decoder CNN structure SegNet with index pooling and U-net to make them suitable for multi-target semantic segmentation of remote sensing images. The results show that these two models have their own advantages and disadvantages in the segmentation of different objects. In addition, we propose an integrated algorithm that combines these two models. Experimental results show that the presented integrated algorithm can exploit the advantages of both models for multi-target segmentation and achieve better segmentation than either model alone.

Matching Large Baseline Oblique Stereo Images Using an End-to-End Convolutional Neural Network

Remote Sensing ◽

10.3390/rs13020274 ◽

2021 ◽

Vol 13 (2) ◽

pp. 274

Author(s):

Guobiao Yao ◽

Alper Yilmaz ◽

Li Zhang ◽

Fei Meng ◽

Haibin Ai ◽

...

Keyword(s):

Neural Network ◽

Deep Learning ◽

Convolutional Neural Network ◽

Stereo Matching ◽

Least Square ◽

Affine Invariant ◽

Stereo Images ◽

Distance Ratio ◽

Matching Algorithm ◽

End To End

The available stereo matching algorithms produce a large number of false positive matches or only a few true positives across oblique stereo images with a large baseline. This undesired result is due to the complex perspective deformation and radiometric distortion across the images. To address this problem, we propose a novel affine invariant feature matching algorithm with subpixel accuracy based on an end-to-end convolutional neural network (CNN). In our method, we adopt and modify a Hessian affine network, which we refer to as IHesAffNet, to obtain affine invariant Hessian regions using a deep learning framework. To improve the correlation between corresponding features, we introduce an empirical weighted loss function (EWLF) based on the negative samples using K nearest neighbors, and then generate highly discriminative deep learning-based descriptors with our multiple hard network structure (MTHardNets). Following this step, the conjugate features are produced by using the Euclidean distance ratio as the matching metric, and the accuracy of the matches is optimized through deep learning transform based least square matching (DLT-LSM). Finally, experiments on large-baseline oblique stereo images acquired by ground close-range and unmanned aerial vehicle (UAV) platforms verify the effectiveness of the proposed approach, and comprehensive comparisons demonstrate that our matching algorithm outperforms the state-of-the-art methods in terms of accuracy, distribution and correct ratio. The main contributions of this article are: (i) our proposed MTHardNets can generate high quality descriptors; and (ii) the IHesAffNet can produce substantial affine invariant corresponding features with reliable transform parameters.
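
The "Euclidean distance ratio" matching metric mentioned above can be illustrated with the standard nearest-neighbour distance ratio test. The sketch below is a generic version of that test, not the paper's implementation: the function name `ratio_test_matches`, the toy 2-D descriptors, and the 0.8 threshold are assumptions chosen for illustration.

```python
import math


def ratio_test_matches(desc_a, desc_b, max_ratio=0.8):
    """Match each descriptor in desc_a to its nearest neighbour in desc_b.

    A match is kept only when the nearest Euclidean distance is clearly
    smaller than the second-nearest one (d1 < max_ratio * d2).
    desc_b must contain at least two descriptors.
    """
    matches = []
    for i, a in enumerate(desc_a):
        # Rank every candidate in desc_b by Euclidean distance to a.
        ranked = sorted((math.dist(a, b), j) for j, b in enumerate(desc_b))
        (d1, j1), (d2, _) = ranked[0], ranked[1]
        if d1 < max_ratio * d2:  # unambiguous nearest neighbour
            matches.append((i, j1))
    return matches


# Toy descriptors (hypothetical values).
desc_a = [(0.0, 0.0), (5.0, 5.0)]
desc_b = [(0.1, 0.0), (5.0, 5.2), (9.0, 9.0), (5.2, 5.0)]
matches = ratio_test_matches(desc_a, desc_b)
```

Here the first descriptor has a single close candidate and is matched, while the second has two equally close candidates and is rejected as ambiguous. That rejection is how the ratio test suppresses false positives.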
