This thesis explores a novel voxel-based 3D modeling tool for virtual reality (VR) and assesses its usability with and without Automatic Speech Recognition (ASR). Despite VR's potential for immersive modeling, existing software often lacks functionality or is difficult to use. Through participant testing, evaluated with the Post-Study System Usability Questionnaire (PSSUQ) and qualitative questions, this study aims to bridge the gap in VR modeling tools and to cater to the needs of both laymen and professional modelers.