Abstract: High-quality 3D reconstructions from endoscopy video
play an important role in many clinical applications, including surgical navigation where they enable direct video-CT registration. While many methods exist for general
multi-view 3D reconstruction, these methods often fail to
deliver satisfactory performance on endoscopic video. Part
of the reason is that local descriptors that establish pairwise point correspondences, and thus drive reconstruction,
struggle when confronted with the texture-scarce surface of
anatomy. Learning-based dense descriptors usually have
larger receptive fields enabling the encoding of global information, which can be used to disambiguate matches. In
this work, we present an effective self-supervised training
scheme and novel loss design for dense descriptor learning. In direct comparison to recent local and dense descriptors on an in-house sinus endoscopy dataset, we demonstrate that our proposed dense descriptor can generalize
to unseen patients and scopes, thereby largely improving
the performance of Structure from Motion (SfM) in terms
of model density and completeness. We also evaluate our
method on a public dense optical flow dataset and a small-scale public SfM dataset to further demonstrate its effectiveness and generality. The source code is
available at https://github.com/lppllppl920/DenseDescriptorLearning-Pytorch.