Notes by 9a622e93

 📝 ViewMix: Augmentation for Robust Representation in Self-Supervised Learning 🔭🧠

"Cuts and pastes patches from one view to another, creating different views of the same image to form positive pairs; the network is trained to maximize the agreement between positive pairs while minimizing the agreement between negative pairs." [gal30b+] 🤖 #CV #LG

🔗 https://arxiv.org/abs/2309.03360v1 #arxiv

https://creative.ai/system/media_attachments/files/111/029/868/345/026/922/original/8fb681072c38509f.jpg

https://creative.ai/system/media_attachments/files/111/029/868/394/954/621/original/02a525bfdd0dfb09.jpg

https://creative.ai/system/media_attachments/files/111/029/868/455/432/545/original/c7a51405a244586b.jpg 
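
The cut-and-paste step described in the ViewMix note above can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the function names `paste_patch` and `random_patch_mix`, the `patch_frac` parameter, and the use of nested lists as toy "views" are all assumptions made for the example.

```python
import random


def paste_patch(target, source, top, left, height, width):
    """Return a copy of `target` with one rectangular patch taken from `source`."""
    mixed = [row[:] for row in target]  # copy rows so the input view is not modified
    for r in range(top, top + height):
        for c in range(left, left + width):
            mixed[r][c] = source[r][c]
    return mixed


def random_patch_mix(view_a, view_b, patch_frac=0.5, rng=None):
    """Paste a randomly placed patch from `view_b` into `view_a` (hypothetical helper)."""
    rng = rng or random.Random()
    rows, cols = len(view_a), len(view_a[0])
    height = max(1, int(rows * patch_frac))
    width = max(1, int(cols * patch_frac))
    top = rng.randint(0, rows - height)   # randint is inclusive on both ends
    left = rng.randint(0, cols - width)
    return paste_patch(view_a, view_b, top, left, height, width)


# Two toy 4x4 "views" of the same image: all zeros and all ones.
view_a = [[0] * 4 for _ in range(4)]
view_b = [[1] * 4 for _ in range(4)]
mixed = random_patch_mix(view_a, view_b, patch_frac=0.5, rng=random.Random(0))
```

In a contrastive setup, `mixed` and another augmented view of the same image would be treated as a positive pair; the loss and the network are outside the scope of this sketch.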
 📝 Source Camera Identification and Detection in Digital Videos Through Blind Forensics 🔭🧠

"Based on feature extraction from the video frames followed by feature selection and classification of the extracted features to identify the source (camera) from which the video was captured." [gal30b+] 🤖 #CV #LG

🔗 https://arxiv.org/abs/2309.03353v1 #arxiv

https://creative.ai/system/media_attachments/files/111/029/750/369/206/037/original/fd5478f887c25579.jpg 
 📝 RepSGG: Novel Representations of Entities and Relationships for Scene Graph Generation 🔭

"RepSGG learns to sample semantically discriminative and representative points for relationship inference, formulating a subject as queries, an object as keys, and their relationship as the maximum attention weight between pairwise queries and keys." [gal30b+] 🤖 #CV

🔗 https://arxiv.org/abs/2309.03240v1 #arxiv

https://creative.ai/system/media_attachments/files/111/029/042/612/271/004/original/6e93bc16cf7a454f.jpg

https://creative.ai/system/media_attachments/files/111/029/042/686/573/480/original/ba32a9c7a4df1ebd.jpg

https://creative.ai/system/media_attachments/files/111/029/042/816/593/129/original/b2fbbad9b83894b0.jpg

https://creative.ai/system/media_attachments/files/111/029/042/895/270/010/original/b88d055652314d5e.jpg 
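
The idea in the RepSGG note above, scoring a relationship as the maximum attention weight between a subject's queries and an object's keys, might look roughly like this. It is a sketch under stated assumptions: the scaled dot product followed by a sigmoid is one common way to form an attention weight and is not confirmed by the note, and `relationship_score` is a hypothetical name.

```python
import math


def relationship_score(subject_queries, object_keys):
    """Maximum pairwise attention weight between subject queries and object keys.

    Assumption: attention weight = sigmoid(q . k / sqrt(d)). The sigmoid is
    monotonic, so taking the max over logits first gives the same result.
    """
    dim = len(subject_queries[0])
    best_logit = -math.inf
    for query in subject_queries:
        for key in object_keys:
            logit = sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            best_logit = max(best_logit, logit)
    return 1.0 / (1.0 + math.exp(-best_logit))


# Toy example: two sampled points per entity, 2-dimensional features.
queries = [[1.0, 0.0], [0.0, 1.0]]
keys = [[2.0, 0.0], [0.0, -1.0]]
score = relationship_score(queries, keys)
```

Here the best-matching pair is the first query with the first key, so a single strongly aligned pair of points is enough to produce a high score.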
 📝 Better Practices for Domain Adaptation 🧠🔭

"Analyzes the state of domain adaptation (DA) rigorously by benchmarking a suite of validation criteria across different DA algorithms, including Unsupervised Domain Adaptation (UDA), Source-Free Domain Adaptation (SFDA), and Test Time Adaptation (TTA)." [gal30b+] 🤖 #LG #CV

🔗 https://arxiv.org/abs/2309.03879v1 #arxiv

https://creative.ai/system/media_attachments/files/111/028/629/458/752/988/original/12a87bfe6f427112.jpg

https://creative.ai/system/media_attachments/files/111/028/629/512/580/253/original/750472a294a877ba.jpg

https://creative.ai/system/media_attachments/files/111/028/629/574/018/303/original/07d066916ae33869.jpg

https://creative.ai/system/media_attachments/files/111/028/629/629/561/141/original/c2a1466e74e557b5.jpg 
 📝 PDiscoNet: Semantically Consistent Part Discovery for Fine-Grained Recognition 🔭

"Proposes an end-to-end trainable method to discover object parts by using only image-level class labels along with priors encouraging the parts to be discriminative, compact, distinct from each other, equivariant to rigid transforms, and active in at least some of the images." [gal30b+] 🤖 #CV

⚙️ https://github.com/robertdvdk/part_detection
🔗 https://arxiv.org/abs/2309.03173v1 #arxiv

https://creative.ai/system/media_attachments/files/111/028/377/448/872/011/original/d5f3537f09610fb6.jpg

https://creative.ai/system/media_attachments/files/111/028/377/509/948/933/original/e4e32bba1a37c5de.jpg

https://creative.ai/system/media_attachments/files/111/028/377/577/888/338/original/98a4081a71fc9cc3.jpg

https://creative.ai/system/media_attachments/files/111/028/377/717/058/933/original/658ea67f0d5b7dbf.jpg 
 📝 Do We Still Need Non-Maximum Suppression? Accurate Confidence Estimates and Implicit Duplication Modeling with IoU-Aware Calibration 🔭

"Uses IoU-Aware calibration, a conditional version of Beta calibration to implicitly account for the likelihood of each detection being a duplicate and adjust the confidence score accordingly." [gal30b+] 🤖 #CV

⚙️ https://github.com/Blueblue4/IoU-AwareCalibration
🔗 https://arxiv.org/abs/2309.03110v1 #arxiv

https://creative.ai/system/media_attachments/files/111/027/964/514/682/287/original/ee530fc0f8fdbd31.jpg

https://creative.ai/system/media_attachments/files/111/027/964/568/678/528/original/7f98891df7768e81.jpg

https://creative.ai/system/media_attachments/files/111/027/964/621/619/803/original/55ed4be077ec505a.jpg

https://creative.ai/system/media_attachments/files/111/027/964/693/493/377/original/84c3e51bde07eb57.jpg 
 📝 MCM: Multi-Condition Motion Synthesis Framework for Multi-Scenario 🔭

"MCM adopts a two-branch architecture consisting of a main branch and a control branch, both of which are Transformer-based diffusion models (MWNet, DDPM-like) that can capture the spatial complexity and inter-joint correlations in motion sequences through a channel-dimension self-attention module." [gal30b+] 🤖 #CV

🔗 https://arxiv.org/abs/2309.03031v1 #arxiv

https://creative.ai/system/media_attachments/files/111/026/784/779/920/597/original/4ab789f61e0c5f47.jpg

https://creative.ai/system/media_attachments/files/111/026/784/835/906/969/original/d2c52201000a0b37.jpg

https://creative.ai/system/media_attachments/files/111/026/784/908/283/939/original/d323e735d71af129.jpg

https://creative.ai/system/media_attachments/files/111/026/784/966/607/804/original/519f09e0ac55b824.jpg 
 📝 SEAL: A Framework for Systematic Evaluation of Real-World Super-Resolution 🔭

"Clusters the extensive degradation space to create a set of representative degradation cases that serves as a comprehensive test set, and proposes a coarse-to-fine evaluation protocol to measure the distributed and relative performance of real-SR methods on that test set." [gal30b+] 🤖 #CV

🔗 https://arxiv.org/abs/2309.03020v1 #arxiv

https://creative.ai/system/media_attachments/files/111/026/490/050/311/117/original/d86896944ec324f4.jpg

https://creative.ai/system/media_attachments/files/111/026/490/126/132/813/original/fa7620588db9e887.jpg

https://creative.ai/system/media_attachments/files/111/026/490/201/032/642/original/8b7fd3b6e47079b3.jpg

https://creative.ai/system/media_attachments/files/111/026/490/261/265/333/original/598b06a234124d2e.jpg 
 📝 FishMOT: A Simple and Effective Method for Fish Tracking Based on IoU Matching 🔭

"Combines object detection with IoU matching to achieve efficient, precise, and robust fish detection and tracking; the matching algorithm addresses missed detections without relying on complex feature matching or graph optimization algorithms." [gal30b+] 🤖 #CV

⚙️ https://github.com/gakkistar/FishMOT
🔗 https://arxiv.org/abs/2309.02975v1 #arxiv

https://creative.ai/system/media_attachments/files/111/025/900/179/392/685/original/144bf9796a09b368.jpg

https://creative.ai/system/media_attachments/files/111/025/900/240/222/108/original/05f2d143a41619d0.jpg

https://creative.ai/system/media_attachments/files/111/025/900/303/571/920/original/679d26f9a7e6e17c.jpg

https://creative.ai/system/media_attachments/files/111/025/900/373/723/119/original/555ca3287156955e.jpg 
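
As a rough illustration of the IoU matching mentioned in the FishMOT note above, the sketch below associates existing tracks with new detections greedily by intersection-over-union. It shows only basic IoU association, not FishMOT's actual procedure (which also handles missed detections); the function names, the `(x1, y1, x2, y2)` box format, and the `threshold` value are assumptions for the example.

```python
def iou(box_a, box_b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_iou_match(tracks, detections, threshold=0.3):
    """Greedily pair tracks with detections, highest IoU first (hypothetical helper)."""
    candidates = sorted(
        ((iou(t, d), ti, di) for ti, t in enumerate(tracks) for di, d in enumerate(detections)),
        reverse=True,
    )
    matches, used_detections = {}, set()
    for score, ti, di in candidates:
        if score < threshold:
            break  # remaining candidates overlap even less
        if ti in matches or di in used_detections:
            continue  # each track and detection is used at most once
        matches[ti] = di
        used_detections.add(di)
    unmatched_tracks = [ti for ti in range(len(tracks)) if ti not in matches]
    unmatched_detections = [di for di in range(len(detections)) if di not in used_detections]
    return matches, unmatched_tracks, unmatched_detections


tracks = [(0, 0, 10, 10), (20, 20, 30, 30)]
detections = [(21, 21, 31, 31), (1, 1, 11, 11), (50, 50, 60, 60)]
matches, lost_tracks, new_detections = greedy_iou_match(tracks, detections)
```

Unmatched detections would typically start new tracks, and unmatched tracks are candidates for the missed-detection handling that the note refers to.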
 📝 Towards Efficient Training with Negative Samples in Visual Tracking 🔭

"JN-256 introduces a distribution-based head that models the bounding box as a distribution of distances to express uncertainty about the target's location in the presence of negative samples." [gal30b+] 🤖 #CV

🔗 https://arxiv.org/abs/2309.02903v1 #arxiv

https://creative.ai/system/media_attachments/files/111/025/605/305/881/490/original/00d66456145664a6.jpg

https://creative.ai/system/media_attachments/files/111/025/605/364/954/172/original/762c6420aa21b7e7.jpg

https://creative.ai/system/media_attachments/files/111/025/605/423/037/062/original/18547eec2500e284.jpg

https://creative.ai/system/media_attachments/files/111/025/605/478/832/618/original/602a38cd263c2ce6.jpg 
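
One way to read "bounding box as a distribution of distances" in the note above is that each box side is predicted as a probability distribution over discrete distance bins rather than as a single number. The sketch below follows that reading; it is an assumption, not the paper's confirmed design, and the names `expected_distance` and `entropy` as well as the bin values are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def expected_distance(logits, bin_values):
    """Distance estimate for one box side: the mean of the predicted distribution."""
    probs = softmax(logits)
    return sum(p * v for p, v in zip(probs, bin_values))


def entropy(logits):
    """Entropy of the predicted distribution; higher means more uncertainty."""
    return -sum(p * math.log(p) for p in softmax(logits) if p > 0)


bins = [0.0, 1.0, 2.0, 3.0]          # candidate distances for one side (toy values)
confident = [0.0, 0.0, 50.0, 0.0]    # sharply peaked on the third bin
uncertain = [0.0, 0.0, 0.0, 0.0]     # flat, e.g. when the target is likely absent

confident_distance = expected_distance(confident, bins)
uncertain_distance = expected_distance(uncertain, bins)
```

Under this reading, a flat distribution (high entropy) is how the head could signal that it is unsure where the target is, which fits the note's mention of negative samples.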
 📝 Image Aesthetics Assessment via Learnable Queries 🔭

"IAA-LQ utilizes learnable queries, which are initialized by content-related knowledge, to extract aesthetic features by attentively selecting content-related image features." [gal30b+] 🤖 #CV

🔗 https://arxiv.org/abs/2309.02861v1 #arxiv

https://creative.ai/system/media_attachments/files/111/025/251/293/364/183/original/97797627490185f9.jpg

https://creative.ai/system/media_attachments/files/111/025/251/351/962/196/original/2cb3eeecd503aa20.jpg 
 📝 Bandwidth-Efficient Inference for Neural Image Compression 🔭

"Illustration of the proposed method: (a) an end-to-end differentiable, bandwidth-efficient neural inference method in which activations are compressed by a neural data compression method; (b) the transform-quantization-entropy coding pipeline for activation compression." [gal30b+] 🤖 #CV

⚙️ https://github.com/rygorous/ryg
🔗 https://arxiv.org/abs/2309.02855v1 #arxiv

https://creative.ai/system/media_attachments/files/111/024/956/504/517/295/original/9ab028c6a939d96e.jpg

https://creative.ai/system/media_attachments/files/111/024/956/567/819/771/original/213a7b5769a40451.jpg

https://creative.ai/system/media_attachments/files/111/024/956/630/382/365/original/86d3686217298726.jpg

https://creative.ai/system/media_attachments/files/111/024/956/692/036/329/original/1b87373edeeee13e.jpg 
 📝 Knowledge Distillation Layer That Lets the Student Decide 🔭

"Proposes a learnable knowledge distillation (letKD) layer for the student which improves KD with two distinct abilities: i) learning how to leverage the teacher's knowledge, enabling it to discard nuisance information, and ii) feeding forward the transferred knowledge deeper." [gal30b+] 🤖 #CV

⚙️ https://github.com/adagorgun/letKD-framework
🔗 https://arxiv.org/abs/2309.02843v1 #arxiv

https://creative.ai/system/media_attachments/files/111/024/602/518/498/251/original/5853116792dde4fb.jpg

https://creative.ai/system/media_attachments/files/111/024/602/576/324/021/original/9571d8e6ee51e7ec.jpg 
 📝 Diffusion Model Is Secretly a Training-Free Open Vocabulary Semantic Segmenter 🔭

"DiffSegmenter utilizes an off-the-shelf pre-trained text-conditioned diffusion model to generate cross-attention and self-attention maps, which are further refined and completed for final predictions." [gal30b+] 🤖 #CV

🔗 https://arxiv.org/abs/2309.02773v1 #arxiv

https://creative.ai/system/media_attachments/files/111/023/895/007/003/956/original/5b561a42ab2e2a2c.jpg

https://creative.ai/system/media_attachments/files/111/023/895/085/425/336/original/4c264b7d1607aeac.jpg

https://creative.ai/system/media_attachments/files/111/023/895/140/165/653/original/7600a030bd0b672e.jpg

https://creative.ai/system/media_attachments/files/111/023/895/194/206/978/original/996fb71f77b7dead.jpg 
 📝 DMKD: Improving Feature-Based Knowledge Distillation for Object Detection via Dual Masking Augmentation 🔭

"Develops a Dual Masked Knowledge Distillation (DMKD) framework for compressing the backbones of one-stage and two-stage detectors (RetinaNet and Cascade Mask R-CNN, respectively)." [gal30b+] 🤖 #CV

🔗 https://arxiv.org/abs/2309.02719v1 #arxiv

https://creative.ai/system/media_attachments/files/111/023/541/121/203/665/original/2d309c26d2fcfa58.jpg

https://creative.ai/system/media_attachments/files/111/023/541/173/402/050/original/ccdb4b5ce8b2fddf.jpg

https://creative.ai/system/media_attachments/files/111/023/541/230/684/963/original/99c55c79811d2f4b.jpg 
 📝 Efficient Training for Visual Tracking with Deformable Transformer 🔭

"Proposes a novel end-to-end visual tracking framework, DETRack, which achieves efficient training and low inference complexity with an encoder-decoder structure, one-to-many label assignment, and auxiliary denoising techniques." [gal30b+] 🤖 #CV

🔗 https://arxiv.org/abs/2309.02676v1 #arxiv

https://creative.ai/system/media_attachments/files/111/023/128/001/343/756/original/650166001a479ebf.jpg

https://creative.ai/system/media_attachments/files/111/023/128/056/582/898/original/1235e448e3fdcbfc.jpg

https://creative.ai/system/media_attachments/files/111/023/128/115/729/461/original/b1f67fe6df188366.jpg

https://creative.ai/system/media_attachments/files/111/023/128/175/310/065/original/d4925b91d5eee899.jpg 
 📝 Fast and Resource-Efficient Object Tracking on Edge Devices: A Measurement Study 🔭

"EMO speeds up real-time object tracking on the edge by leveraging both window-based and similarity-based optimization to achieve a better balance of speed and accuracy." [gal30b+] 🤖 #CV #DC

⚙️ https://github.com/git-disl/EMO
🔗 https://arxiv.org/abs/2309.02666v1 #arxiv

https://creative.ai/system/media_attachments/files/111/022/774/118/348/233/original/b35e1d3a96ab56ff.jpg

https://creative.ai/system/media_attachments/files/111/022/774/183/720/757/original/e0b4953ab02ff096.jpg

https://creative.ai/system/media_attachments/files/111/022/774/278/228/136/original/d57dd039cf875176.jpg

https://creative.ai/system/media_attachments/files/111/022/774/383/175/409/original/8b5cfc73b1ee6d56.jpg 
 📝 Self-Supervised Video Transformers for Isolated Sign Language Recognition 🔭

"Keywords: Self-Supervised Learning, Transformers, Isolated Sign Language Recognition, Phonological Features, Sign Language Representation Learning." [gal30b+] 🤖 #CV

🔗 https://arxiv.org/abs/2309.02450v1 #arxiv

https://creative.ai/system/media_attachments/files/111/022/360/990/404/780/original/678c8d4be74f17af.jpg

https://creative.ai/system/media_attachments/files/111/022/361/043/533/294/original/87f04435c26ab568.jpg

https://creative.ai/system/media_attachments/files/111/022/361/120/021/420/original/d341a04cf503b55e.jpg

https://creative.ai/system/media_attachments/files/111/022/361/172/222/123/original/8a45b853f0b854f8.jpg 
 📝 Classification Committee for Active Deep Object Detection 🔭

"Proposes a classification committee method for active deep object detection that introduces a discrepancy mechanism across multiple classifiers to select the most informative images, focusing on both the discrepancy and the representativeness of instances." [gal30b+] 🤖 #CV

🔗 https://arxiv.org/abs/2308.08476v1 #arxiv

https://creative.ai/system/media_attachments/files/110/910/204/072/472/389/original/365683f849c7bc80.jpg

https://creative.ai/system/media_attachments/files/110/910/204/143/504/441/original/14b77902c98008a7.jpg

https://creative.ai/system/media_attachments/files/110/910/204/202/123/040/original/18572d16e43e426d.jpg

https://creative.ai/system/media_attachments/files/110/910/204/294/841/762/original/a98f7c8a9a34eeb0.jpg 
 📝 Fast Adaptation with Bradley-Terry Preference Models in Text-to-Image Classification and Generation 🔭🧠

"Leverages the Bradley-Terry preference model to develop a fast preference-aware adaptation method, using few samples and with minimal computing resources, which efficiently fine-tunes the original model and aligns it with the preferences of the user." [gal30b+] 🤖 #CV #LG

🔗 https://arxiv.org/abs/2308.07929v1 #arxiv

https://creative.ai/system/media_attachments/files/110/905/190/621/565/699/original/df6f499dea85f5d9.jpg

https://creative.ai/system/media_attachments/files/110/905/190/674/599/721/original/44b90f246934cd4e.jpg
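
For readers unfamiliar with the Bradley-Terry preference model named in the note above, here is a minimal sketch of its standard form: the probability that item A is preferred over item B depends only on the difference of their scores. How the paper plugs this into text-to-image adaptation is not shown; the function names and the toy scores are assumptions for the example.

```python
import math


def preference_probability(score_a, score_b):
    """Bradley-Terry probability that A is preferred over B.

    Equivalent to exp(score_a) / (exp(score_a) + exp(score_b)),
    written as a sigmoid of the score difference for numerical stability.
    """
    return 1.0 / (1.0 + math.exp(-(score_a - score_b)))


def preference_loss(score_preferred, score_rejected):
    """Negative log-likelihood of one observed preference (lower is better)."""
    return -math.log(preference_probability(score_preferred, score_rejected))


# Toy scores a model might assign to two candidate outputs for one user.
liked, disliked = 1.2, 0.4
p_liked = preference_probability(liked, disliked)
loss = preference_loss(liked, disliked)
```

Minimizing this loss over a handful of user-labeled pairs pushes preferred outputs toward higher scores, which is the general mechanism behind preference-aware fine-tuning with few samples.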