Multi-task Video Enhancement for Dental Interventions (MICCAI 2022)

Abstract

A microcamera firmly attached to a dental handpiece lets dentists continuously monitor the progress of conservative dental procedures. Video enhancement in video-assisted dental interventions alleviates the low light, noise, blur, and camera shake that together degrade visual comfort. To this end, we introduce a novel deep network for multi-task video enhancement that enables macro-visualization of dental scenes. In particular, the network jointly leverages video restoration and temporal alignment in a multi-scale manner to enhance video effectively. Our experiments on videos of natural teeth in phantom scenes show that the proposed network achieves state-of-the-art results in multiple tasks with near real-time processing. We release Vident-lab at https://doi.org/10.34808/1jby-ay90, the first dataset of dental videos with multi-task labels, to facilitate further research in related video processing applications.

Related Work

UberNet [9] and cross-stitch networks [16] are encoder-focused architectures that propagate task outputs across scales in the encoder. PAD-Net [27] and PAP-Net [29], which rely on multi-modal distillation, are decoder-focused networks that fuse the outputs of task heads to make the final dense predictions, but only at a single scale. MTI-Net [24], which is most similar to our architecture, extends the decoder fusion by propagating task-specific features bottom-up across multiple scales through the encoder. Instead of propagating the task features in scale-specific distillation modules across scales to the encoder, our network simultaneously propagates task outputs to the encoder and to the task heads in the decoder. Furthermore, these networks make dense task predictions on static images, whereas we extend our network to videos.

Contribution

(i) a novel application of a microcamera in computer-aided dental intervention for continuous tooth macro-visualization during drilling (a hardware innovation, of all things; I walked away a little disappointed);
(ii) a new, asymmetrically annotated dataset of natural teeth in phantom scenes, with pairs of frames of compromised and good quality captured using a beam splitter;
(iii) a novel deep network for video processing that propagates task outputs to the encoder and decoder across multiple scales to model task interactions; and
(iv) a demonstration that an instantiated model effectively addresses multi-task video enhancement in our application by matching and surpassing state-of-the-art results of single-task networks in near real-time.

Method

The idea is to improve video processing through interactions between tasks. Video enhancement tasks are interrelated. For example:
-- Aligning video frames helps deblurring.
-- Denoising and deblurring can reveal image features that help motion estimation.
A multi-task model can be designed to exploit these interdependencies.

MOST-Net is a multi-output, multi-scale, multi-task network architecture. Its goal is to model the interactions between tasks through the multi-scale features shared between the encoder and the decoder. The network outputs cover T tasks, and each task has an output at every scale s.

Propagation:
-- Within a scale: task outputs are propagated within the current scale.
-- Across scales: task outputs from a lower scale are upsampled and then propagated to the decoder layers and task branches of the higher scale.

Constraint: u_i denotes some operator, for instance the upsampling operator for segmentation or the scaling operator for homography estimation.

Problem Statement

The model must solve video restoration, teeth segmentation, and motion estimation at the same time, and it is learned and optimized under an assumed degraded-image generation model.

Here T = 3, with O_1: video restoration, O_2: segmentation, O_3: homography estimation. A video stream generates observations {B_{t-p}} for p = 0, ..., P, where t is the time index and P is a non-negative scalar giving the number of past frames. The problem is to (1) estimate a clean frame, (2) estimate a binary teeth segmentation mask, and (3) approximate the inter-frame motion by a homography matrix. The joint output of the three tasks at scale s = 1 is written as the triplet (R_t, M_t, H_t).

Let x correspond to the pixel location. Given per-pixel blur kernels k_{x,t} of size K, the degraded image at s = 1 is generated from the clean frame by a model of the degradation in the input video, such as blur and noise. We assume multiple independently moving objects are present in the considered scenes, while our task is to estimate only the motion related to the object of interest (i.e. teeth), which is present in the region indicated by non-zero values of the mask M. (In the corresponding formula, ∀t ∀x means "for all t and all x".)

Training

This part defines the loss function and the optimization objective for a multi-task, multi-scale deep model.

Dataset

Loss Function

The loss sums over N samples, T tasks, and S scales, which gives N × T × S terms in total.

Loss function types

The model learns its parameters Θ by minimizing the total loss, so that the output predictions of all tasks at all scales are optimized simultaneously. The optimization has to account for the relationships between tasks and the synergy between scales, which is the core idea of multi-task, multi-scale learning.

I still feel I have not fully understood the multi-task learning part; I will read some other papers.

Structure

MOST-Net enables refinement of lower-scale segmentations by upsampling them and feeding them into the task-specific branches of higher scales.

Encoders

MOST-Net extracts features from two input frames B_{t−1} and B_t independently at three scales. In other words, the model processes the input at several scales at once.

U-shaped downsampling: features are extracted via 3 × 3 convolutions with strides of 1, 2, 2 for s = 1, 2, 3, followed by ReLU activations and 5 residual blocks [4] at each scale. The residual connections are augmented with an additional branch of convolutions in the Fast Fourier domain. The output channel dimension is 2^(s+4), i.e. 32, 64, and 128 channels for s = 1, 2, 3.

At each scale, the features of the two frames are concatenated, and a channel attention mechanism [30] follows to fuse them. MOST-Net uses homography outputs from lower scales to warp the encoder features of the previous time step.

Decoders

The fused encoder features are passed on to the expanding blocks scale-wise via skip connections. At the lowest scale (s = 3), they are passed directly to a stack of two residual blocks with 128 output channels. Transposed convolutions with strides of 2 are used twice to recover the resolution scale. At higher scales (s < 3), the features are first concatenated with the upsampled decoder features and convolved by 3 × 3 kernels to halve the number of channels (why halve them?). Subsequently, they are propagated onto two residual blocks with 64 and 32 output channels each. The residual block outputs constitute scale-specific shared backbones. Lightweight task-specific branches follow to estimate the dense outputs.
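
Looking back at the Loss Function paragraph, the N × T × S summation can be written out in a few lines. This is only a minimal sketch under stated assumptions: the per-term loss values are taken as already computed numbers, the function names `total_loss` and `count_terms` are hypothetical, and the idea that one weight per task multiplies every term of that task is an assumption based on the remark in the Setup that λ1, λ2, λ3 balance the tasks. Which weight belongs to which task is not stated in these notes.

```python
from typing import Sequence


def total_loss(
    losses: Sequence[Sequence[Sequence[float]]],
    task_weights: Sequence[float],
) -> float:
    """Sum weighted loss terms over samples, tasks and scales.

    losses[n][t][s] is the (already computed) loss of task t at scale s
    for sample n. task_weights[t] is an assumed per-task balancing weight.
    """
    total = 0.0
    for per_task in losses:  # N samples
        if len(per_task) != len(task_weights):
            raise ValueError("one weight per task is required")
        for weight, per_scale in zip(task_weights, per_task):  # T tasks
            for value in per_scale:  # S scales
                total += weight * value
    return total


def count_terms(losses: Sequence[Sequence[Sequence[float]]]) -> int:
    """Number of summed terms; N * T * S when every sample is complete."""
    return sum(len(per_scale) for per_task in losses for per_scale in per_task)


if __name__ == "__main__":
    # Toy values: N = 2 samples, T = 3 tasks, S = 3 scales, every term 1.0.
    toy = [[[1.0] * 3 for _ in range(3)] for _ in range(2)]
    # Weight order is hypothetical; the values are the lambdas from the Setup.
    print(count_terms(toy), total_loss(toy, (2e-4, 5e-5, 1.0)))
```

In a real training loop the inner values would be differentiable tensors rather than floats, but the structure of the triple sum stays the same.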
In the task-specific branches, at each scale one 3 × 3 convolution estimates one of the dense outputs (restored frame, segmentation mask), and two 3 × 3 convolutions, separated by ReLU, yield the other. At each scale, homography estimation modules estimate 4 offsets, related one-to-one to homographies via the Direct Linear Transformation (DLT) as in [5,12]. The motion gated attention modules multiply the features with the segmentations to filter out context irrelevant to the motion of the teeth. The channel dimensionality is then halved by a 3 × 3 convolution, while a second convolution extracts features from the restored output. The two streams are concatenated into a single feature map.

Homography Estimation Module

At each scale, the resulting features are employed to predict the offsets with shallow downstream networks. Predicted offsets at lower scales are transformed back to homographies and cascaded bottom-up [12] to refine the higher-scale ones. Similarly to [5], we use blocks of 3 × 3 convolutions coupled with ReLU, batch normalization, and max-pooling to reduce the spatial size of the features. Before the regression layer, a dropout of 0.2 is applied. For s = 1, the convolution output channels are 64, 128, 256, 256, and 256. For s = 2, 3 the network depth is cropped from the second and third layers onwards, respectively.

Task-Specific Branches

(I added this part myself based on GPT output. I had not worked with multi-task learning before, so it is here to make things easier to follow.)

Each task (video restoration, segmentation, motion estimation) is handled by a separate branch of the network. In the architecture figure, these branches are the paths where F1, F2, F3 (the features at different scales) pass through different processing stages (e.g., motion gated attention, channel attention, homography estimation) to produce task-specific outputs, such as the restored frame R_t, the mask M_t, and the homography H_t.
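
Going back to the homography estimation module: the notes say that 4 predicted offsets map one-to-one to a homography via DLT, and that lower-scale homographies are cascaded up to higher scales. The sketch below illustrates both steps in plain Python. It is an illustration under assumptions, not the paper's code: the offsets are taken to be (dx, dy) displacements of the four image corners, the helper names are hypothetical, and `rescale_homography` uses the standard change of coordinates S·H·S⁻¹ with S = diag(f, f, 1), which may differ from the exact scaling operator used in the paper.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(map(float, row)) + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def homography_from_offsets(corners, offsets):
    """DLT with h33 fixed to 1: four corner offsets -> 3x3 homography."""
    a, b = [], []
    for (x, y), (dx, dy) in zip(corners, offsets):
        xp, yp = x + dx, y + dy  # displaced corner
        a.append([x, y, 1, 0, 0, 0, -x * xp, -y * xp])
        b.append(xp)
        a.append([0, 0, 0, x, y, 1, -x * yp, -y * yp])
        b.append(yp)
    h = solve_linear(a, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def apply_homography(h, point):
    """Map a 2D point through a 3x3 homography."""
    x, y = point
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return (
        (h[0][0] * x + h[0][1] * y + h[0][2]) / w,
        (h[1][0] * x + h[1][1] * y + h[1][2]) / w,
    )


def rescale_homography(h, factor):
    """Express h in coordinates scaled by `factor` (S h S^-1)."""
    s = (factor, factor, 1.0)
    return [[h[i][j] * s[i] / s[j] for j in range(3)] for i in range(3)]


if __name__ == "__main__":
    corners = [(0.0, 0.0), (64.0, 0.0), (64.0, 48.0), (0.0, 48.0)]
    h_low = homography_from_offsets(corners, [(3.0, -2.0)] * 4)
    print(rescale_homography(h_low, 2.0))  # same motion at twice the resolution
```

With identical offsets on all four corners the result is a pure translation, and rescaling by 2 doubles the translation, which is the behavior one expects when a coarse-scale estimate initializes a finer scale.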
The network is optimized for multiple tasks by sharing features across the task-specific branches, while each branch focuses on the output of one task (video restoration, segmentation, motion estimation). The losses corresponding to each task are computed separately and combined in the final objective function, which allows the model to learn multiple tasks simultaneously while sharing common feature representations.

Experiment

Dataset

Vident-lab: a dataset for multi-task video processing of phantom dental scenes (Open Research Data, Bridge of Knowledge).

Frame-to-Frame (F2F) training: an image denoiser is trained on static video fragments recorded with camera C1. The trained denoiser is then applied to the noisy frames to obtain denoised frames and their noise maps.

Blur synthesis: the denoised frames are temporally interpolated [19] 8 times and averaged over a temporal window of 17 frames to synthesize realistic motion blur.

Adding noise: the noise maps are added to the blurry frames to form the input video frames (B). The noise maps represent the original noise that was present in the actual noisy frames.

Colorization: this requires registration of frames between two different modalities, C1 and C2. To generate the output video frames (R), frames from camera C1 are colorized using frames from a second camera (C2), which are mapped to create the ground truth frames. This overcomes the difficulty of aligning the frames between the two cameras and yields accurate pixel-to-pixel correspondences.
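
The blur and noise steps described above (average over a temporal window of 17 frames, then add the noise map back) can be illustrated with a small stdlib-only sketch. Several things here are assumptions for illustration: frames are flat lists of pixel intensities, the input sequence is taken to be already temporally interpolated (the interpolation method of [19] is not reproduced), the window is centered on the target frame, and the function name `synthesize_degraded_frame` is hypothetical.

```python
from typing import List, Sequence


def synthesize_degraded_frame(
    interpolated_frames: Sequence[Sequence[float]],
    center: int,
    noise_map: Sequence[float],
    window: int = 17,
) -> List[float]:
    """Average `window` neighbouring frames, then add a noise map.

    interpolated_frames: denoised frames after temporal interpolation,
        each a flat sequence of pixel intensities of equal length.
    center: index of the frame the window is centered on (an assumption).
    noise_map: per-pixel noise to add back onto the blurry frame.
    """
    half = window // 2
    start, stop = center - half, center + half + 1
    if start < 0 or stop > len(interpolated_frames):
        raise ValueError("temporal window does not fit inside the clip")
    clip = interpolated_frames[start:stop]
    # Per-pixel temporal mean: moving content smears into motion blur.
    blurry = [sum(pixels) / window for pixels in zip(*clip)]
    if len(noise_map) != len(blurry):
        raise ValueError("noise map and frame must have the same size")
    return [b + n for b, n in zip(blurry, noise_map)]


if __name__ == "__main__":
    # Toy clip: 17 two-pixel frames whose intensity grows linearly in time.
    clip = [[float(i), float(i)] for i in range(17)]
    print(synthesize_degraded_frame(clip, center=8, noise_map=[0.5, -0.5]))
```

Averaging after interpolation matters because the interpolated in-between frames make the temporal mean look like a continuous exposure instead of a few ghosted copies.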
Color Mapping Network: a color mapping (CM) network is learned to predict the parameters of 3D functions that map the dental scene colors of camera C2 to camera C1. This network achieves precise color mapping and ensures accurate spatial correspondence between frames B and R.

Segmentation masks and homographies

HRNet48 [22], pretrained on ImageNet, is fine-tuned on our annotations to automatically segment the teeth in the remaining frames in all three sets. We compute optical flows between consecutive clean frames with RAFT [23]. Motion fields are cropped with the teeth masks M_t to discard other moving objects, such as the dental bur or the suction tube, as we are interested in stabilizing the videos with respect to the teeth. Subsequently, a partial affine homography H is fitted by RANSAC to the segmented motion field.

Setup

We train, validate, and test all methods on our dataset (Tab. 1). In all MOST-Net training runs, we set λ1, λ2, λ3 to 2 × 10^−4, 5 × 10^−5, and 1 to balance the tasks in Eq. 4. Training data are augmented by horizontal and vertical flips with 0.5 probability, random channel perturbations, and color jittering, after [31]. The batch size is 16 and the optimizer is Adam, with a learning rate of 1e−4 decayed to 1e−6 by cosine annealing. The implementation uses PyTorch 1.10 (FP32). The inference speed is reported in frames per second (FPS) on an NVIDIA RTX 5000 GPU.

Results