Full list of publications. See also my Google Scholar profile.

2025

NVIDIA

Nemotron Parse 1.1

K Chumachenko, A Deshmukh, J Seppanen, I Karmanov, C Chen, L Voegtle, P Fischer, et al.

arXiv 2025

Follow-up to Eclair. Lightweight 885M-parameter model that adds a token-compressed variant (20% speed gain), improved reading order for floating elements, and longer output sequences. Released as open weights with an optimized NIM container.

NVIDIA

NVIDIA Nemotron Nano V2 VL

140+ authors including I Karmanov

arXiv 2025

Vision-language model built on a hybrid Mamba-Transformer architecture for document understanding, long video comprehension, and reasoning. Supports a 128K-token context, with token reduction for higher throughput.

NVIDIA

Nemotron-H: A Family of Accurate and Efficient Hybrid Mamba-Transformer Models

199 authors including I Karmanov

arXiv 2025

8B and 56B hybrid Mamba-Transformer models with up to 3x faster inference than comparable Transformers (Qwen-2.5, Llama-3.1) at equal or better accuracy. Eclair was used for PDF-to-text extraction in the pre-training data pipeline.

First Author

Eclair: Extracting Content and Layout with Integrated Reading Order for Documents

I Karmanov (first author), A Deshmukh, L Voegtle, P Fischer, K Chumachenko, T Roman, J Seppanen, J Parmar, J Jennings, A Tao, K Sapra

arXiv 2025

Multimodal encoder-decoder for document understanding. Extracts formatted text (markdown/LaTeX), bounding boxes with semantic classes, and reading order. Originated its architectural choices of no positional encoding in the decoder and chained multi-token prediction. Used across NVIDIA's training pipelines for LLM pre-training data, VLM distillation, pseudo-labeling, and synthetic VQA grounding. Introduces the DROBS benchmark.

NeurIPS

Eagle 2: Building Post-Training Data Strategies from Scratch for Frontier Vision-Language Models

Z Li, G Chen, S Liu, ... I Karmanov, L Voegtle, P Fischer, ... Z Yu (27 authors)

NeurIPS 2025

Data-centric approach to VLM post-training. Eagle2-9B matches models with up to 70B parameters. Later adopted as the VLM backbone of NVIDIA's GR00T-N1 robotic foundation model.

2022

BMVC

Revisiting Single-gated Mixtures of Experts

A Royer, I Karmanov, A Skliar, B Ehteshami Bejnordi, T Blankevoort

BMVC 2022

Revisits simple single-gate MoE architectures, adding a base-model branch for early exit and regularization. Achieves efficiency-accuracy trade-offs comparable to more complex MoE approaches.

2021

NeurIPS

Modality-Agnostic Topology Aware Localization

FG Zanjani, I Karmanov, H Ackermann, D Dijkman, S Merlin, M Welling, F Porikli

NeurIPS 2021

Unsupervised positioning using optimal transport on isometric embeddings, agnostic to input modality. Applied to WiFi and visual positioning.

NeurIPS

Deep Learning Frameworks for Weakly-Supervised Indoor Localization

FG Zanjani, H Ackermann, D Dijkman, I Karmanov, et al.

NeurIPS 2021 (Competition & Demos)

Deep learning frameworks for weakly-supervised indoor positioning using WiFi and visual data.

First Author

WiCluster: Passive Indoor 2D/3D Positioning using WiFi without Precise Labels

I Karmanov (first author), FG Zanjani, S Merlin, I Kadampot, D Dijkman

IEEE GLOBECOM 2021

First weakly-supervised method for passive indoor positioning using WiFi CSI without precise location labels. Featured by Qualcomm as an AI First and covered by Forbes.

ICCV

Motion-Augmented Self-Training for Video Recognition at Smaller Scale

K Gavrilyuk, M Jain, I Karmanov, C Snoek

ICCV 2021

Self-training approach for video recognition that leverages motion information to improve performance with limited labeled data.

WCNC

Hand Gesture Recognition using 802.11ad mmWave Sensor in the Mobile Device

Y Ren, J Lu, A Beletchi, Y Huang, I Karmanov, D Dijkman

IEEE WCNC 2021

Hand gesture recognition using mmWave radar sensing on mobile devices.

2015

Economics

European Identity and Redistributive Preferences

J Costa-Font, F Cowell (with research contributions from I Karmanov)

CESifo Working Paper / LSE

Empirical causal-inference study (difference-in-differences) examining how changes in European identity affect preferences for redistribution. Contributed data generation, simulations, and econometric analysis.