Current research in visual tracking is largely focused on the generic case, where no prior knowledge about the target object is assumed. However, many real-world tracking applications stem from specific scenarios where the class or type of object is known. In this work, we propose a tracking framework that can exploit this semantic information without sacrificing the generic nature of the tracker. In addition to the target-specific appearance, we model the class of the object through a semantic module that provides complementary class-specific predictions. By further integrating a semantic classification module, we can utilize the learned class-specific models even if the target class is unknown. Our unified tracking architecture is trained end-to-end on large-scale tracking datasets by exploiting the available semantic metadata. Comprehensive experiments are performed on five tracking benchmarks. Our approach achieves state-of-the-art performance while operating at real-time frame rates.