Large-field holographic projection system based on deep-learning acceleration calculation

Author(s):  
Chao Cai ◽  
Ping Su ◽  
Jianshe Ma

2021 ◽
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
BinBin Zhang ◽  
Fumin Zhang ◽  
Xinghua Qu

Purpose
Laser-based measurement techniques offer various advantages over conventional techniques, such as being non-destructive and non-contact, fast, and capable of long measuring distances. In cooperative laser ranging systems, it is crucial to extract the center coordinates of retroreflectors in order to accomplish automatic measurement. This paper aims to propose a novel method to solve this problem.

Design/methodology/approach
We propose a method using Mask RCNN (Region Convolutional Neural Network), with ResNet101 (Residual Network 101) and FPN (Feature Pyramid Network) as the backbone, to localize retroreflectors, realizing automatic recognition against different backgrounds. Compared with two other deep learning algorithms, experiments show that the recognition rate of Mask RCNN is better, especially for small-scale targets. On this basis, an ellipse detection algorithm is introduced to obtain the ellipses of the retroreflectors from the recognized target areas, and the center coordinates of the retroreflectors in the camera coordinate system are obtained by a mathematical method.

Findings
To verify the accuracy of the method, an experiment was carried out: the distance between two retroreflectors with a known separation of 1,000.109 mm was measured with a root-mean-square error of 2.596 mm, meeting the requirements for coarse location of retroreflectors.

Research limitations/implications
(i) As the data set contains only 200 pictures, and although data augmentation methods such as rotating, mirroring and cropping were used, there is still room for improvement in the generalization ability of the detection. (ii) The ellipse detection algorithm needs to work in relatively dark conditions, as the retroreflector is made of stainless steel, which easily reflects light.

Originality/value
The value of the article lies in obtaining the center coordinates of multiple retroreflectors automatically, even against a cluttered background; recognizing retroreflectors of different sizes, especially small targets; meeting the recognition requirement of multiple targets in a large field of view; and obtaining the 3D centers of the targets by monocular model-based vision.
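
The core of the described pipeline is an instance-segmentation detector followed by ellipse fitting on each detected region. A minimal sketch follows, using torchvision's off-the-shelf Mask R-CNN (which ships with a ResNet-50+FPN backbone rather than the ResNet-101+FPN used in the paper) and OpenCV ellipse fitting; the helper name, score threshold and COCO weights are illustrative assumptions, and in practice the detector would first be fine-tuned on the retroreflector dataset.

```python
# Sketch of the two-stage pipeline described above: a Mask R-CNN detector to
# localize retroreflectors, followed by ellipse fitting on each detected region.
# torchvision ships a ResNet-50+FPN backbone (not the ResNet-101+FPN of the paper),
# and the COCO weights below would need fine-tuning on a retroreflector dataset.
import cv2
import numpy as np
import torch
import torchvision

model = torchvision.models.detection.maskrcnn_resnet50_fpn(weights="DEFAULT")
model.eval()

def detect_retroreflector_centers(bgr_image, score_thresh=0.7):
    """Return pixel-space ellipse centers for each detected target (hypothetical helper)."""
    rgb = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2RGB)
    tensor = torch.from_numpy(rgb).permute(2, 0, 1).float() / 255.0
    with torch.no_grad():
        pred = model([tensor])[0]

    centers = []
    for mask, score in zip(pred["masks"], pred["scores"]):
        if score < score_thresh:
            continue
        binary = (mask[0].numpy() > 0.5).astype(np.uint8) * 255
        contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
        for c in contours:
            if len(c) >= 5:                      # fitEllipse needs at least 5 points
                (cx, cy), axes, angle = cv2.fitEllipse(c)
                centers.append((cx, cy))
    return centers
```

The 2D ellipse centers would then be lifted to camera-frame 3D coordinates by the monocular model-based step described in the abstract, which is not sketched here.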


2014 ◽  
Vol 2014 (1) ◽  
pp. 000178-000183 ◽  
Author(s):  
James Webb ◽  
Roger McCleary ◽  
Gerald Lopez ◽  
Qing Tan

Increasing volumes on larger substrates with decreasing process margins create new challenges for advanced packaging applications. Key step-and-repeat camera technology continues to be introduced for the mass production of the high-density interconnects used in 2.5D and 3D technologies and will provide solutions to these challenges. A 2X reduction stepper with unique features achieves the tighter specifications needed for many advanced packaging applications printed on large substrates. A large field-of-view optical projection system uses the 350–450 nm spectrum of a mercury arc lamp to expose circuit patterns from a reticle mask onto a substrate and to image features with the fidelity required for advanced packaging technologies. The imaging field prints a large 52 mm x 66 mm area, or 59.4 mm x 59.4 mm, in a single exposure. These features enable the system to process larger substrates in fewer shots, resulting in higher throughput at lower power. Details of the camera and the adjustments provided to extend its range of use for both high-power and high-fidelity applications are discussed. An extensive evaluation of the measured and modeled lithographic capabilities of the step-and-repeat camera to achieve critical dimensions with precise image placement is provided. Limiting resolution and depth-of-focus results sampled over the imaging field are provided and supported with simulation. Results of thin and thick resist patterning are presented and compared to simulated 3D resist profiles using the MACK4 model.
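
For a sense of how the single-exposure field translates into shots per substrate and throughput, a back-of-the-envelope sketch follows; the panel dimensions and per-shot time are assumptions made purely for illustration and do not come from the abstract.

```python
# Back-of-the-envelope shot-count estimate for the 52 mm x 66 mm exposure field.
# The 510 mm x 515 mm panel size and the per-shot time are assumptions for
# illustration only; they are not taken from the abstract.
import math

FIELD_W, FIELD_H = 52.0, 66.0        # mm, single-exposure field (from the abstract)
PANEL_W, PANEL_H = 510.0, 515.0      # mm, hypothetical advanced-packaging panel

shots = math.ceil(PANEL_W / FIELD_W) * math.ceil(PANEL_H / FIELD_H)
print(f"Shots per panel: {shots}")   # 10 x 8 = 80 shots with these assumptions

seconds_per_shot = 1.5               # assumed exposure plus stepping overhead
print(f"Exposure time per panel: {shots * seconds_per_shot:.0f} s")
```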


2021 ◽  
Author(s):  
Yifeng Zhou ◽  
Naidi Sun ◽  
Song Hu

By enabling simultaneous, high-resolution quantification of the total concentration of hemoglobin (CHb), the oxygen saturation of hemoglobin (sO2) and cerebral blood flow (CBF), multi-parametric photoacoustic microscopy (PAM) has emerged as a promising tool for functional and metabolic imaging of the live mouse brain. However, due to the limited depth of focus imposed by Gaussian-beam excitation, the quantitative measurements become inaccurate when the imaged object is out of focus. To address this problem, we have developed a combined hardware-software approach that integrates Bessel-beam excitation with deep learning based on a conditional generative adversarial network (cGAN). A side-by-side comparison of the new cGAN-powered Bessel-beam multi-parametric PAM against conventional Gaussian-beam multi-parametric PAM shows that the new system enables high-resolution, quantitative imaging of CHb, sO2 and CBF over a depth range of ~600 μm in the live mouse brain, with errors 13 to 58 times lower than those of the conventional system. By better fulfilling the rigid requirement of light focusing for accurate hemodynamic measurements, the deep-learning-powered Bessel-beam multi-parametric PAM may find applications in large-field functional recording across the uneven brain surface and beyond (e.g., tumor imaging).
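
The hardware half of the approach is the Bessel-beam excitation; the software half is the cGAN. A minimal pix2pix-style conditional-GAN training step is sketched below, in which a generator maps a raw Bessel-beam frame to a corrected reference frame; the tiny networks, image shapes and loss weights are placeholders and are not the architecture reported for the actual system.

```python
# Minimal conditional-GAN (pix2pix-style) training step, sketching how a generator
# could be trained to map raw Bessel-beam PAM frames onto corrected references.
# Networks, shapes and loss weights are illustrative assumptions.
import torch
import torch.nn as nn

class TinyGenerator(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 1, 3, padding=1),
        )

    def forward(self, x):
        return self.net(x)

class TinyDiscriminator(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(2, 32, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(32, 1, 4, stride=2, padding=1),   # PatchGAN-style score map
        )

    def forward(self, raw, img):
        # Condition on the raw input frame by channel-wise concatenation.
        return self.net(torch.cat([raw, img], dim=1))

G, D = TinyGenerator(), TinyDiscriminator()
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
adv, l1 = nn.BCEWithLogitsLoss(), nn.L1Loss()

def train_step(raw, reference):
    """One update on a batch of (raw, reference) image pairs of shape (B, 1, H, W)."""
    fake = G(raw)

    # Discriminator: real pairs -> 1, generated pairs -> 0.
    pred_real = D(raw, reference)
    pred_fake = D(raw, fake.detach())
    d_loss = adv(pred_real, torch.ones_like(pred_real)) + \
             adv(pred_fake, torch.zeros_like(pred_fake))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # Generator: fool the discriminator and stay close to the reference (L1 term).
    pred_gen = D(raw, fake)
    g_loss = adv(pred_gen, torch.ones_like(pred_gen)) + 100.0 * l1(fake, reference)
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
    return d_loss.item(), g_loss.item()
```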


2018 ◽  
Author(s):  
Xiongchao Chen ◽  
Hao Zhang ◽  
Tingting Zhu ◽  
Yao Yao ◽  
Di Jin ◽  
...  

We demonstrate deep learning based contact imaging on a CMOS chip, achieving ∼1 μm spatial resolution over a large field of view of ∼24 mm2. Using regular LED illumination, we acquire a single lower-resolution image of objects placed in close proximity to the sensor, with unit fringe magnification. For the raw contact-mode lens-free image, the pixel size of the sensor chip limits the spatial resolution. We apply a generative adversarial network (GAN), a type of deep learning algorithm, to circumvent this limitation and effectively recover a much higher-resolution image of the objects, permitting sub-micron spatial resolution to be achieved across the entire active area of the sensor chip, which is also equivalent to the imaging field of view (24 mm2) due to the unit magnification. This GAN-based contact imaging approach eliminates the need for either a lens or multi-frame acquisition, making it compact and cost-effective. We demonstrate the success of this approach by imaging the proliferation dynamics of cells cultured directly on the chip.
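
Because the recovered resolution applies across the full ~24 mm2 active area, inference has to cover a sensor-sized frame, which is commonly done by applying the trained generator tile by tile. A minimal sketch follows; the generator object, tile size and 4x upscale factor are assumptions for illustration, since the abstract does not state the actual network or magnification factor.

```python
# Sketch of tile-wise inference with a trained super-resolution generator over the
# full contact-imaging field of view. The generator, tile size and upscale factor
# are placeholders; the abstract does not specify them.
import numpy as np
import torch

def enhance_full_field(raw_frame, generator, tile=256, upscale=4):
    """Apply `generator` tile by tile so a whole sensor-sized frame fits in memory."""
    h, w = raw_frame.shape
    out = np.zeros((h * upscale, w * upscale), dtype=np.float32)
    generator.eval()
    with torch.no_grad():
        for y in range(0, h, tile):
            for x in range(0, w, tile):
                patch = raw_frame[y:y + tile, x:x + tile].astype(np.float32)
                t = torch.from_numpy(patch)[None, None]        # (1, 1, th, tw)
                sr = generator(t)[0, 0].numpy()                 # upscaled patch
                out[y * upscale:y * upscale + sr.shape[0],
                    x * upscale:x * upscale + sr.shape[1]] = sr
    return out
```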


2021 ◽  
Vol 3 (9) ◽  
Author(s):  
Sajjad Mardanirad ◽  
David A. Wood ◽  
Hassan Zakeri

Abstract
In this paper, we present how precisely deep learning algorithms can distinguish lost-circulation severities in oil drilling operations. Lost circulation is one of the costliest downhole problems encountered during oil and gas well construction. Applying artificial intelligence can help drilling teams be forewarned of pending lost-circulation events and thereby mitigate their consequences. Data-driven methods are traditionally employed to quantify fluid-loss complexity but are not able to achieve reliable predictions for field cases with large quantities of data. This paper investigates the performance of deep learning (DL) approaches in classifying the types of fluid loss from a very large field dataset. Three DL classification models are evaluated: a Convolutional Neural Network (CNN), a Gated Recurrent Unit (GRU) and a Long Short-Term Memory (LSTM) network. Five fluid-loss classes are considered: No Loss, Seepage, Partial, Severe and Complete Loss. Twenty wells drilled into the giant Azadegan oil field (Iran) provide the 65,376 data records used to predict the fluid-loss classes. The results obtained, based on multiple statistical performance measures, identify the CNN model as achieving superior performance (98% accuracy) compared to the LSTM and GRU models (94% accuracy). Confusion matrices provide further insight into the prediction accuracies achieved. All three DL models evaluated were able to classify the different types of lost-circulation events with reasonable prediction accuracy. Future work is required to evaluate the performance of the proposed DL approach on additional large datasets. The proposed method helps drilling teams deal with lost-circulation events efficiently.

Article Highlights
Three deep learning models classify fluid-loss severity in an oil field carbonate reservoir.
Deep learning algorithms are applied to a large resource dataset of 65,376 data records.
The convolutional neural network outperformed the other deep learning methods.
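
To give a sense of what the CNN variant might look like, a minimal 1D-CNN classifier over windows of drilling-parameter records is sketched below; the window length, number of input channels and layer sizes are illustrative assumptions, since the abstract does not specify the architecture.

```python
# Minimal 1D-CNN classifier for the five lost-circulation classes, applied to
# windows of drilling parameters. Input dimensions and layer sizes are assumptions.
import torch
import torch.nn as nn

NUM_FEATURES = 12    # assumed number of drilling parameters per record
WINDOW = 64          # assumed number of consecutive records per sample
NUM_CLASSES = 5      # No Loss, Seepage, Partial, Severe, Complete Loss

class FluidLossCNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(NUM_FEATURES, 32, kernel_size=5, padding=2), nn.ReLU(),
            nn.MaxPool1d(2),
            nn.Conv1d(32, 64, kernel_size=5, padding=2), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),
        )
        self.classifier = nn.Linear(64, NUM_CLASSES)

    def forward(self, x):            # x: (batch, NUM_FEATURES, WINDOW)
        return self.classifier(self.features(x).squeeze(-1))

model = FluidLossCNN()
logits = model(torch.randn(8, NUM_FEATURES, WINDOW))   # dummy batch of 8 windows
print(logits.shape)                                     # torch.Size([8, 5])
```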


2020 ◽  
Author(s):  
Nikolaos Ioannis Bountos ◽  
Melanie Brandmeier ◽  
Mark Günter

Urban landscapes are among the fastest-changing areas on the planet. However, regular monitoring of larger areas is not feasible using UAVs or costly airborne data. In such situations, satellite data with a high temporal resolution and a large field of view are more appropriate, but suffer from lower spatial resolution (tens of meters). In the present study we show that, using freely available Sentinel-2 data from the Copernicus program, we can extract anthropogenic features such as roads, railways and building footprints that are partly or completely at a sub-pixel level in this kind of data. Additionally, we propose a new metric for evaluating our methods on the sub-pixel objects. This metric measures the performance of detecting an object while penalizing false-positive classification. Given that our training samples contain one class, we define two thresholds that represent the lower bound of accuracy for the object to be classified and for the background. We thus avoid a good score in cases where the object is classified correctly but a wide area of the background has been included in the prediction.

We investigate the performance of different deep-learning architectures for sub-pixel classification of the different infrastructure elements based on Sentinel-2 multispectral data and labels derived from UAV data. Our study area is located in the Rhone valley in Switzerland, where very high-resolution UAV data was available from the University of Applied Sciences. Highly accurate labels for the respective classes were digitized in ArcGIS Pro and used as ground truth for the Sentinel data. We trained different deep learning models based on state-of-the-art architectures for semantic segmentation, such as DeepLab and U-Net. Our approach focuses on exploiting the multispectral information to improve on the performance of the RGB channels alone. For that purpose, we make use of the NIR and SWIR 10 m and 20 m bands of the Sentinel-2 data. We investigate early and late fusion approaches and the behavior and contribution of each multispectral band in improving performance compared to using only the RGB channels. In the early fusion approach, we stack nine (RGB, NIR, SWIR) Sentinel-2 bands together, pass them through two convolution layers followed by batch normalization and ReLU layers, and then feed the tiles to DeepLab. In the late fusion approach, we create a CNN with two branches, the first branch processing the RGB channels and the second branch the NIR/SWIR bands. We use modified DeepLab layers for the two branches and then concatenate the outputs into a total output of 512 feature maps. We then reduce the dimensionality of the result to a final output equal to the number of classes; this dimension reduction happens in two convolution layers. We experiment with different settings for all of the mentioned architectures. In the best-case scenario, we achieve 89% overall accuracy, with 60% building accuracy, 60% street accuracy, 73% railway accuracy, 92% river accuracy and 94% background accuracy.
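
A minimal sketch of the early-fusion variant is given below: the nine-band stack passes through two convolution + batch-normalization + ReLU blocks and is then fed to a segmentation network, with torchvision's DeepLabV3 (ResNet-50) standing in for the modified DeepLab used in the study; the channel counts and the reduction to three channels before the backbone are illustrative assumptions.

```python
# Early-fusion sketch: stack nine Sentinel-2 bands (RGB + NIR + SWIR), pass them
# through two conv + BN + ReLU blocks, then feed a standard segmentation backbone.
# torchvision's DeepLabV3 stands in for the modified DeepLab used in the study.
import torch
import torch.nn as nn
from torchvision.models.segmentation import deeplabv3_resnet50

NUM_BANDS = 9      # RGB, NIR and SWIR bands resampled to a common grid
NUM_CLASSES = 5    # building, street, railway, river, background

class EarlyFusionDeepLab(nn.Module):
    def __init__(self):
        super().__init__()
        self.stem = nn.Sequential(
            nn.Conv2d(NUM_BANDS, 16, 3, padding=1), nn.BatchNorm2d(16), nn.ReLU(),
            nn.Conv2d(16, 3, 3, padding=1), nn.BatchNorm2d(3), nn.ReLU(),
        )
        self.segmenter = deeplabv3_resnet50(weights=None, num_classes=NUM_CLASSES)

    def forward(self, x):            # x: (batch, 9, H, W) Sentinel-2 tile stack
        return self.segmenter(self.stem(x))["out"]

model = EarlyFusionDeepLab()
out = model(torch.randn(1, NUM_BANDS, 128, 128))
print(out.shape)                      # torch.Size([1, 5, 128, 128])
```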


2021 ◽  
Vol 15 ◽  
Author(s):  
Hristofor Lukanov ◽  
Peter König ◽  
Gordon Pipa

While abundant in biology, foveated vision is nearly absent from computational models and especially from deep learning architectures. Despite considerable hardware improvements, training deep neural networks still presents a challenge and constrains the complexity of models. Here we propose an end-to-end neural model for foveal-peripheral vision, inspired by the retino-cortical mapping in primates and humans. Our model uses an efficient sampling technique that compresses the visual signal such that a small portion of the scene is perceived at high resolution while a large field of view is maintained at low resolution. An attention mechanism for performing "eye movements" assists the agent in incrementally collecting detailed information from the observed scene. Our model achieves results comparable to a similar neural architecture trained on full-resolution data for image classification and outperforms it on video classification tasks. At the same time, because of the smaller size of its input, it can reduce computational effort tenfold and uses several times less memory. Moreover, we present an easy-to-implement bottom-up and top-down attention mechanism which relies on task-relevant features and is therefore a convenient byproduct of the main architecture. Apart from its computational efficiency, the presented work provides a means for exploring active vision for agent training in simulated environments and for anthropomorphic robotics.
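
A minimal sketch of the foveal-peripheral sampling idea is shown below, pairing a small high-resolution crop at the fixation point with a log-polar compressed view of the full frame (OpenCV's warpPolar); the crop and output sizes are illustrative assumptions, and this is only in the spirit of retino-cortical mapping rather than the sampling scheme actually used in the paper.

```python
# Sketch of a foveal-peripheral sampling step: a high-resolution crop around the
# fixation point plus a log-polar compressed view of the whole frame.
# Crop and output sizes are illustrative assumptions.
import cv2
import numpy as np

def foveate(frame, fixation, fovea_px=32, periphery_shape=(64, 64)):
    """Return (fovea, periphery) for a grayscale frame and an (x, y) fixation point."""
    h, w = frame.shape
    x, y = fixation
    half = fovea_px // 2
    fovea = frame[max(0, y - half):y + half, max(0, x - half):x + half]

    # Log-polar warp centered on the fixation point compresses the periphery
    # while keeping resolution near the center.
    max_radius = np.hypot(w, h) / 2
    periphery = cv2.warpPolar(frame, periphery_shape, (x, y), max_radius,
                              cv2.WARP_POLAR_LOG)
    return fovea, periphery

frame = np.random.randint(0, 255, (480, 640), dtype=np.uint8)
fovea, periphery = foveate(frame, fixation=(320, 240))
print(fovea.shape, periphery.shape)    # (32, 32) (64, 64)
```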


2019 ◽  
Vol 53 (3) ◽  
pp. 281-294
Author(s):  
Jean-Michel Foucart ◽  
Augustin Chavanne ◽  
Jérôme Bourriau

Many contributions of Artificial Intelligence (AI) are envisaged in medicine. In orthodontics, several automated solutions have been available for a few years in X-ray imaging (automated cephalometric analysis, automated airway analysis) and, for a few months, in digital model analysis (automatic model analysis, automated set-up; CS Model +, Carestream Dental™). The objective of this two-part study is to evaluate the reliability of automated model analysis, both in terms of digitization and of segmentation. Comparison of the model-analysis results obtained automatically with those obtained by several orthodontists demonstrates the reliability of the automatic analysis; the measurement error ultimately ranges between 0.08 and 1.04 mm, which is not significant and is comparable to the inter-observer measurement errors reported in the literature. These results open new perspectives on the contribution of AI to orthodontics, which, based on deep learning and big data, should make it possible in the medium term to move toward a more preventive and more predictive orthodontics.
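
As a minimal illustration of the kind of comparison reported (automated versus examiner measurements of the same model distances), a short sketch follows; the numbers are placeholders, not data from the study.

```python
# Paired differences between automated and examiner measurements of the same
# distances. The values below are placeholders, not data from the study.
import numpy as np

automated = np.array([24.31, 8.92, 37.10, 5.48])   # mm, hypothetical automated measurements
examiner  = np.array([24.25, 9.10, 36.60, 5.40])   # mm, hypothetical examiner means

diff = np.abs(automated - examiner)
print(f"error range: {diff.min():.2f}-{diff.max():.2f} mm, mean {diff.mean():.2f} mm")
```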

