📝 ViewMix: Augmentation for Robust Representation in Self-Supervised Learning 🔭🧠 "Cut and paste patches from one view to another and create different views of the same image to form positive pairs, and the network is trained to maximize the agreement between positive pairs while minimizing the agreement between negative pairs." [gal30b+] 🤖 #CV #LG 🔗 https://arxiv.org/abs/2309.03360v1 #arxiv https://creative.ai/system/media_attachments/files/111/029/868/345/026/922/original/8fb681072c38509f.jpg https://creative.ai/system/media_attachments/files/111/029/868/394/954/621/original/02a525bfdd0dfb09.jpg https://creative.ai/system/media_attachments/files/111/029/868/455/432/545/original/c7a51405a244586b.jpg
📝 Source Camera Identification and Detection in Digital Videos Through Blind Forensics 🔭🧠 "Based on feature extraction from the video frames followed by feature selection and classification of the extracted features to identify the source (camera) from which the video was captured." [gal30b+] 🤖 #CV #LG 🔗 https://arxiv.org/abs/2309.03353v1 #arxiv https://creative.ai/system/media_attachments/files/111/029/750/369/206/037/original/fd5478f887c25579.jpg
📝 Relay Diffusion: Unifying Diffusion Process Across Resolutions for Image Synthesis 🔭🧠 "Relay Diffusion Model (RDM) transfers a low-resolution image or noise into an equivalent high-resolution one for diffusion model via blurring diffusion and block noise." [gal30b+] 🤖 #CV #LG ⚙️ https://github.com/THUDM/RelayDiffusion 🔗 https://arxiv.org/abs/2309.03350v1 #arxiv https://creative.ai/system/media_attachments/files/111/029/573/580/632/325/original/776e79282ba96b35.jpg https://creative.ai/system/media_attachments/files/111/029/573/645/221/853/original/05b26ff97a7d4859.jpg https://creative.ai/system/media_attachments/files/111/029/573/718/772/264/original/6fca0573a2a51380.jpg https://creative.ai/system/media_attachments/files/111/029/573/802/961/958/original/b5ceea05fe5728b4.jpg
📝 Robust Visual Tracking by Motion Analyzing 🔭 "Works by modeling the motion pattern using a tensor structure, obtained through Tucker2 decomposition, which is effective in describing the target's motion by analyzing the correlation between adjacent frames." [gal30b+] 🤖 #CV 🔗 https://arxiv.org/abs/2309.03247v1 #arxiv https://creative.ai/system/media_attachments/files/111/029/337/352/884/781/original/87d505c63db3500b.jpg https://creative.ai/system/media_attachments/files/111/029/337/493/719/029/original/2dd67ce6b45571b8.jpg https://creative.ai/system/media_attachments/files/111/029/337/569/850/597/original/7134aebde2bf0f2b.jpg https://creative.ai/system/media_attachments/files/111/029/337/621/951/806/original/72176c6f7de53883.jpg
📝 RepSGG: Novel Representations of Entities and Relationships for Scene Graph Generation 🔭 "RepSGG learns to sample semantically discriminative and representative points for relationship inference, formulating a subject as queries, an object as keys, and their relationship as the maximum attention weight between pairwise queries and keys." [gal30b+] 🤖 #CV 🔗 https://arxiv.org/abs/2309.03240v1 #arxiv https://creative.ai/system/media_attachments/files/111/029/042/612/271/004/original/6e93bc16cf7a454f.jpg https://creative.ai/system/media_attachments/files/111/029/042/686/573/480/original/ba32a9c7a4df1ebd.jpg https://creative.ai/system/media_attachments/files/111/029/042/816/593/129/original/b2fbbad9b83894b0.jpg https://creative.ai/system/media_attachments/files/111/029/042/895/270/010/original/b88d055652314d5e.jpg
📝 A Multisensor Hyperspectral Benchmark Dataset for Unmixing of Intimate Mixtures 🔭 "Generated using a set of different hyperspectral sensors to capture the reflectance spectra in the visible, near, short, mid, and long-wavelength infrared regions (350-15385) nm." [gal30b+] 🤖 #CV ⚙️ https://github.com/VisionlabUA/Multisensor_datasets 🔗 https://arxiv.org/abs/2309.03216v1 #arxiv https://creative.ai/system/media_attachments/files/111/028/865/703/469/356/original/3cc4268be2e795b3.jpg https://creative.ai/system/media_attachments/files/111/028/865/778/229/222/original/27ff1ebd160d85f2.jpg https://creative.ai/system/media_attachments/files/111/028/865/859/653/456/original/8e01a9c4c9ad019d.jpg https://creative.ai/system/media_attachments/files/111/028/865/923/428/465/original/9056fd5c456c9dd2.jpg
📝 Better Practices for Domain Adaptation 🧠🔭 "Analyzes state of domain adaptation (DA) in a rigorous way by benchmarking a suite of validation criteria with different DA algorithms including Unsupervised Domain Adaptation (UDA), Source-Free Domain Adaptation (SFDA), and Test Time Adaptation (TTA)." [gal30b+] 🤖 #LG #CV 🔗 https://arxiv.org/abs/2309.03879v1 #arxiv https://creative.ai/system/media_attachments/files/111/028/629/458/752/988/original/12a87bfe6f427112.jpg https://creative.ai/system/media_attachments/files/111/028/629/512/580/253/original/750472a294a877ba.jpg https://creative.ai/system/media_attachments/files/111/028/629/574/018/303/original/07d066916ae33869.jpg https://creative.ai/system/media_attachments/files/111/028/629/629/561/141/original/c2a1466e74e557b5.jpg
📝 PDiscoNet: Semantically Consistent Part Discovery for Fine-Grained Recognition 🔭 "Proposes an end-to-end trainable method to discover object parts by using only image-level class labels along with priors encouraging the parts to be discriminative, compact, distinct from each other, equivariant to rigid transforms, and active in at least some of the images." [gal30b+] 🤖 #CV ⚙️ https://github.com/robertdvdk/part_detection 🔗 https://arxiv.org/abs/2309.03173v1 #arxiv https://creative.ai/system/media_attachments/files/111/028/377/448/872/011/original/d5f3537f09610fb6.jpg https://creative.ai/system/media_attachments/files/111/028/377/509/948/933/original/e4e32bba1a37c5de.jpg https://creative.ai/system/media_attachments/files/111/028/377/577/888/338/original/98a4081a71fc9cc3.jpg https://creative.ai/system/media_attachments/files/111/028/377/717/058/933/original/658ea67f0d5b7dbf.jpg
📝 Do We Still Need Non-Maximum Suppression? Accurate Confidence Estimates and Implicit Duplication Modeling with IoU-Aware Calibration 🔭 "Uses IoU-Aware calibration, a conditional version of Beta calibration, to implicitly account for the likelihood of each detection being a duplicate and adjust the confidence score accordingly." [gal30b+] 🤖 #CV ⚙️ https://github.com/Blueblue4/IoU-AwareCalibration 🔗 https://arxiv.org/abs/2309.03110v1 #arxiv https://creative.ai/system/media_attachments/files/111/027/964/514/682/287/original/ee530fc0f8fdbd31.jpg https://creative.ai/system/media_attachments/files/111/027/964/568/678/528/original/7f98891df7768e81.jpg https://creative.ai/system/media_attachments/files/111/027/964/621/619/803/original/55ed4be077ec505a.jpg https://creative.ai/system/media_attachments/files/111/027/964/693/493/377/original/84c3e51bde07eb57.jpg
📝 Prompt-Based All-in-One Image Restoration Using CNNs and Transformer 🔭 "Our network architecture is based on a CNN-transformer backbone, the CAPTNet, which consists of multiple CAPT blocks; multi-scale information can be aggregated through the CAPTNet." [gal30b+] 🤖 #CV 🔗 https://arxiv.org/abs/2309.03063v1 #arxiv https://creative.ai/system/media_attachments/files/111/027/551/700/244/659/original/9ec39eb0b53e5bcb.jpg https://creative.ai/system/media_attachments/files/111/027/551/758/105/876/original/8572462184584bcd.jpg https://creative.ai/system/media_attachments/files/111/027/551/828/154/292/original/107ce19c113ac5a6.jpg https://creative.ai/system/media_attachments/files/111/027/551/890/119/498/original/ec4a2e7a45d949ce.jpg
📝 Adaptive Growth: Real-Time CNN Layer Expansion 🔭 "Iteratively introduces kernels to the convolutional layer, gauging its real-time response to varying data to guide its growth and improving the model's performance and scalability." [gal30b+] 🤖 #CV ⚙️ https://github.com/YunjieZhu/Extensible-Convolutional-Layer-git-version 🔗 https://arxiv.org/abs/2309.03049v1 #arxiv https://creative.ai/system/media_attachments/files/111/027/197/872/134/425/original/ab5addc19b92d07e.jpg https://creative.ai/system/media_attachments/files/111/027/197/926/181/761/original/0202a740ba064052.jpg https://creative.ai/system/media_attachments/files/111/027/197/978/866/860/original/88aaa221d8922d19.jpg https://creative.ai/system/media_attachments/files/111/027/198/032/623/710/original/d9d152b884e4a4c6.jpg
📝 MCM: Multi-Condition Motion Synthesis Framework for Multi-Scenario 🔭 "MCM adopts a two-branch architecture consisting of a main branch and a control branch, both built on MWNet, a Transformer-based diffusion model (DDPM-like) that can capture the spatial complexity and inter-joint correlations in motion sequences through a channel-dimension self-attention module." [gal30b+] 🤖 #CV 🔗 https://arxiv.org/abs/2309.03031v1 #arxiv https://creative.ai/system/media_attachments/files/111/026/784/779/920/597/original/4ab789f61e0c5f47.jpg https://creative.ai/system/media_attachments/files/111/026/784/835/906/969/original/d2c52201000a0b37.jpg https://creative.ai/system/media_attachments/files/111/026/784/908/283/939/original/d323e735d71af129.jpg https://creative.ai/system/media_attachments/files/111/026/784/966/607/804/original/519f09e0ac55b824.jpg
📝 SEAL: A Framework for Systematic Evaluation of Real-World Super-Resolution 🔭 "By clustering the extensive degradation space to create a set of representative degradation cases, which serves as a comprehensive test set; By proposing a coarse-to-fine evaluation protocol to measure the distributed and relative performance of real-SR methods on the test set." [gal30b+] 🤖 #CV 🔗 https://arxiv.org/abs/2309.03020v1 #arxiv https://creative.ai/system/media_attachments/files/111/026/490/050/311/117/original/d86896944ec324f4.jpg https://creative.ai/system/media_attachments/files/111/026/490/126/132/813/original/fa7620588db9e887.jpg https://creative.ai/system/media_attachments/files/111/026/490/201/032/642/original/8b7fd3b6e47079b3.jpg https://creative.ai/system/media_attachments/files/111/026/490/261/265/333/original/598b06a234124d2e.jpg
📝 Continual Evidential Deep Learning for Out-of-Distribution Detection 🔭 "Continual Evidential Deep Learning, which is an evidential deep learning method in a continual learning paradigm, has been evaluated on the CIFAR-100 dataset." [gal30b+] 🤖 #CV 🔗 https://arxiv.org/abs/2309.02995v1 #arxiv https://creative.ai/system/media_attachments/files/111/026/195/165/305/046/original/100f34e98c7d30aa.jpg https://creative.ai/system/media_attachments/files/111/026/195/233/565/435/original/fd66c56369e267cd.jpg https://creative.ai/system/media_attachments/files/111/026/195/315/245/713/original/2c4b7dd322b26496.jpg https://creative.ai/system/media_attachments/files/111/026/195/375/183/058/original/fd5e49786abc7a60.jpg
📝 FishMOT: A Simple and Effective Method for Fish Tracking Based on IoU Matching 🔭 "Combines object detection techniques with the IoU matching algorithm, thereby achieving efficient, precise, and robust fish detection and tracking, and the algorithm employed in this method addresses the issue of missed detections without relying on complex feature matching or graph optimization algorithms." [gal30b+] 🤖 #CV ⚙️ https://github.com/gakkistar/FishMOT 🔗 https://arxiv.org/abs/2309.02975v1 #arxiv https://creative.ai/system/media_attachments/files/111/025/900/179/392/685/original/144bf9796a09b368.jpg https://creative.ai/system/media_attachments/files/111/025/900/240/222/108/original/05f2d143a41619d0.jpg https://creative.ai/system/media_attachments/files/111/025/900/303/571/920/original/679d26f9a7e6e17c.jpg https://creative.ai/system/media_attachments/files/111/025/900/373/723/119/original/555ca3287156955e.jpg
📝 Towards Efficient Training with Negative Samples in Visual Tracking 🔭 "JN-256 introduces a distribution-based head that models the bounding box as a distribution of distances to express uncertainty about the target's location in the presence of negative samples." [gal30b+] 🤖 #CV 🔗 https://arxiv.org/abs/2309.02903v1 #arxiv https://creative.ai/system/media_attachments/files/111/025/605/305/881/490/original/00d66456145664a6.jpg https://creative.ai/system/media_attachments/files/111/025/605/364/954/172/original/762c6420aa21b7e7.jpg https://creative.ai/system/media_attachments/files/111/025/605/423/037/062/original/18547eec2500e284.jpg https://creative.ai/system/media_attachments/files/111/025/605/478/832/618/original/602a38cd263c2ce6.jpg
📝 Image Aesthetics Assessment via Learnable Queries 🔭 "IAA-LQ utilizes learnable queries, which are initialized by content-related knowledge, to extract aesthetic features by attentively selecting content-related image features." [gal30b+] 🤖 #CV 🔗 https://arxiv.org/abs/2309.02861v1 #arxiv https://creative.ai/system/media_attachments/files/111/025/251/293/364/183/original/97797627490185f9.jpg https://creative.ai/system/media_attachments/files/111/025/251/351/962/196/original/2cb3eeecd503aa20.jpg
📝 Bandwidth-Efficient Inference for Neural Image Compression 🔭 "An illustration of proposed method: (a) An end-to-end differentiable bandwidth efficient neural inference method with the activation compressed by neural data compression method, (b) An illustration of transform-quantization-entropy coding pipeline for activation compression." [gal30b+] 🤖 #CV ⚙️ https://github.com/rygorous/ryg 🔗 https://arxiv.org/abs/2309.02855v1 #arxiv https://creative.ai/system/media_attachments/files/111/024/956/504/517/295/original/9ab028c6a939d96e.jpg https://creative.ai/system/media_attachments/files/111/024/956/567/819/771/original/213a7b5769a40451.jpg https://creative.ai/system/media_attachments/files/111/024/956/630/382/365/original/86d3686217298726.jpg https://creative.ai/system/media_attachments/files/111/024/956/692/036/329/original/1b87373edeeee13e.jpg
📝 Knowledge Distillation Layer That Lets the Student Decide 🔭 "Proposes a learnable knowledge distillation (letKD) layer for the student which improves KD with two distinct abilities: i) learning how to leverage the teacher's knowledge, enabling it to discard nuisance information, and ii) feeding forward the transferred knowledge deeper." [gal30b+] 🤖 #CV ⚙️ https://github.com/adagorgun/letKD-framework 🔗 https://arxiv.org/abs/2309.02843v1 #arxiv https://creative.ai/system/media_attachments/files/111/024/602/518/498/251/original/5853116792dde4fb.jpg https://creative.ai/system/media_attachments/files/111/024/602/576/324/021/original/9571d8e6ee51e7ec.jpg
📝 Image-Object-Specific Prompt Learning for Few-Shot Class-Incremental Learning 🔭 "Proposes a novel FSCIL training framework, IOSFSCIL, based on CLIP's generalizability and image-object-specific (IOS) classifiers." [gal30b+] 🤖 #CV 🔗 https://arxiv.org/abs/2309.02833v1 #arxiv https://creative.ai/system/media_attachments/files/111/024/248/685/717/121/original/02538e3f0ed4f706.jpg https://creative.ai/system/media_attachments/files/111/024/248/746/711/630/original/b4a350919451ac44.jpg https://creative.ai/system/media_attachments/files/111/024/248/825/909/009/original/f82a7c73d080add6.jpg https://creative.ai/system/media_attachments/files/111/024/248/883/940/468/original/a5dc2720f53c2926.jpg
📝 Diffusion Model Is Secretly a Training-Free Open Vocabulary Semantic Segmenter 🔭 "DiffSegmenter utilizes an off-the-shelf pre-trained text-conditioned diffusion model to generate cross-attention and self-attention maps, which are further refined and completed for final predictions." [gal30b+] 🤖 #CV 🔗 https://arxiv.org/abs/2309.02773v1 #arxiv https://creative.ai/system/media_attachments/files/111/023/895/007/003/956/original/5b561a42ab2e2a2c.jpg https://creative.ai/system/media_attachments/files/111/023/895/085/425/336/original/4c264b7d1607aeac.jpg https://creative.ai/system/media_attachments/files/111/023/895/140/165/653/original/7600a030bd0b672e.jpg https://creative.ai/system/media_attachments/files/111/023/895/194/206/978/original/996fb71f77b7dead.jpg
📝 DMKD: Improving Feature-Based Knowledge Distillation for Object Detection via Dual Masking Augmentation 🔭 "Develops a Dual Masked Knowledge Distillation (DMKD) framework for compressing the backbone of one-stage and two-stage detectors (RetinaNet and Cascade Mask R-CNN)." [gal30b+] 🤖 #CV 🔗 https://arxiv.org/abs/2309.02719v1 #arxiv https://creative.ai/system/media_attachments/files/111/023/541/121/203/665/original/2d309c26d2fcfa58.jpg https://creative.ai/system/media_attachments/files/111/023/541/173/402/050/original/ccdb4b5ce8b2fddf.jpg https://creative.ai/system/media_attachments/files/111/023/541/230/684/963/original/99c55c79811d2f4b.jpg
📝 Efficient Training for Visual Tracking with Deformable Transformer 🔭 "Proposes a novel end-to-end visual tracking framework, DETRack, which achieves efficient training and low inference complexity with an encoder-decoder structure, one-to-many label assignment, and auxiliary denoising techniques." [gal30b+] 🤖 #CV 🔗 https://arxiv.org/abs/2309.02676v1 #arxiv https://creative.ai/system/media_attachments/files/111/023/128/001/343/756/original/650166001a479ebf.jpg https://creative.ai/system/media_attachments/files/111/023/128/056/582/898/original/1235e448e3fdcbfc.jpg https://creative.ai/system/media_attachments/files/111/023/128/115/729/461/original/b1f67fe6df188366.jpg https://creative.ai/system/media_attachments/files/111/023/128/175/310/065/original/d4925b91d5eee899.jpg
📝 Fast and Resource-Efficient Object Tracking on Edge Devices: A Measurement Study 🔭 "EMO is proposed to speed up the real time object tracking on the edge by leveraging both the window-based optimization and the similarity-based optimization, to achieve a better balance of speed and accuracy." [gal30b+] 🤖 #CV #DC ⚙️ https://github.com/git-disl/EMO 🔗 https://arxiv.org/abs/2309.02666v1 #arxiv https://creative.ai/system/media_attachments/files/111/022/774/118/348/233/original/b35e1d3a96ab56ff.jpg https://creative.ai/system/media_attachments/files/111/022/774/183/720/757/original/e0b4953ab02ff096.jpg https://creative.ai/system/media_attachments/files/111/022/774/278/228/136/original/d57dd039cf875176.jpg https://creative.ai/system/media_attachments/files/111/022/774/383/175/409/original/8b5cfc73b1ee6d56.jpg
📝 Self-Supervised Video Transformers for Isolated Sign Language Recognition 🔭 "Keywords: Self-Supervised Learning, Transformers, Isolated Sign Language Recognition, Phonological Features, Sign Language Representation Learning." [gal30b+] 🤖 #CV 🔗 https://arxiv.org/abs/2309.02450v1 #arxiv https://creative.ai/system/media_attachments/files/111/022/360/990/404/780/original/678c8d4be74f17af.jpg https://creative.ai/system/media_attachments/files/111/022/361/043/533/294/original/87f04435c26ab568.jpg https://creative.ai/system/media_attachments/files/111/022/361/120/021/420/original/d341a04cf503b55e.jpg https://creative.ai/system/media_attachments/files/111/022/361/172/222/123/original/8a45b853f0b854f8.jpg
📝 Classification Committee for Active Deep Object Detection 🔭 "A classification committee method for active deep object detection is proposed, introducing a discrepancy mechanism across multiple classifiers to select the most informative images, with a focus on the discrepancy and representativeness of instances." [gal30b+] 🤖 #CV 🔗 https://arxiv.org/abs/2308.08476v1 #arxiv https://creative.ai/system/media_attachments/files/110/910/204/072/472/389/original/365683f849c7bc80.jpg https://creative.ai/system/media_attachments/files/110/910/204/143/504/441/original/14b77902c98008a7.jpg https://creative.ai/system/media_attachments/files/110/910/204/202/123/040/original/18572d16e43e426d.jpg https://creative.ai/system/media_attachments/files/110/910/204/294/841/762/original/a98f7c8a9a34eeb0.jpg
📝 Fast Adaptation with Bradley-Terry Preference Models in Text-to-Image Classification and Generation 🔭🧠 "Leverages the Bradley-Terry preference model to develop a fast preference-aware adaptation method, using few samples and with minimal computing resources, which efficiently fine-tunes the original model and aligns it with the preferences of the user." [gal30b+] 🤖 #CV #LG 🔗 https://arxiv.org/abs/2308.07929v1 #arxiv https://creative.ai/system/media_attachments/files/110/905/190/621/565/699/original/df6f499dea85f5d9.jpg https://creative.ai/system/media_attachments/files/110/905/190/674/599/721/original/44b90f246934cd4e.jpg
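💡 Side note on the Bradley-Terry preference model named in the entry above: a minimal sketch of the general model, not the paper's adaptation method. Under Bradley-Terry, the probability that item i is preferred over item j is the logistic function of the score difference, sigmoid(s_i - s_j). The function names and the dictionary-of-scores layout are hypothetical, chosen only for illustration.

```python
import math


def bt_preference_prob(score_i: float, score_j: float) -> float:
    """Bradley-Terry probability that item i is preferred over item j.

    Equivalent to exp(s_i) / (exp(s_i) + exp(s_j)), written as a sigmoid
    of the score difference.
    """
    return 1.0 / (1.0 + math.exp(-(score_i - score_j)))


def bt_negative_log_likelihood(pairs, scores) -> float:
    """Negative log-likelihood of observed preferences.

    pairs  : iterable of (winner, loser) keys (hypothetical layout)
    scores : dict mapping each key to its latent score
    """
    return -sum(
        math.log(bt_preference_prob(scores[winner], scores[loser]))
        for winner, loser in pairs
    )
```

Fitting the scores then amounts to minimizing this loss over observed (winner, loser) pairs with any gradient-based optimizer.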
Notes by 9a622e93