Koboldcpp should let you run much larger models by offloading part of the model to system RAM. There’s a fork that supports ROCm for AMD cards: https://github.com/YellowRoseCx/koboldcpp-rocm
Make sure to use quantized models for the best performance, Q4_K_M being the standard.
There should be no difference because the video track hasn’t been touched. Some software will display the length of the longest track rather than the length of the main video track. It’s likely that the audio track was originally longer than the video track, and because of the offset it’s now shorter.
You can use tools like ffmpeg and mediainfo to count the actual frames in each to verify.
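For the frame count, here is a rough sketch that calls ffprobe (it ships with ffmpeg) and asks it to decode and count the frames of the first video stream. The helper names and the file names are made up for illustration, and it assumes ffprobe is on your PATH.

```python
import subprocess


def build_ffprobe_cmd(path, stream="v:0"):
    """Build an ffprobe command that decodes and counts frames in one stream."""
    return [
        "ffprobe", "-v", "error",
        "-select_streams", stream,                 # first video stream by default
        "-count_frames",                           # actually decode, don't trust metadata
        "-show_entries", "stream=nb_read_frames",  # print only the counted frames
        "-of", "csv=p=0",                          # bare value, no section name
        path,
    ]


def parse_frame_count(output):
    """Turn ffprobe's text output into an int."""
    lines = [line.strip() for line in output.splitlines() if line.strip()]
    if not lines:
        raise ValueError("ffprobe returned no frame count")
    # Some builds leave a trailing comma in csv output.
    return int(lines[0].rstrip(","))


def count_frames(path, stream="v:0"):
    """Run ffprobe on a file and return the number of decoded frames."""
    result = subprocess.run(
        build_ffprobe_cmd(path, stream),
        capture_output=True, text=True, check=True,
    )
    return parse_frame_count(result.stdout)


if __name__ == "__main__":
    # Placeholder file names: substitute your own original and remuxed files.
    for name in ("original.mkv", "offset.mkv"):
        print(name, count_frames(name))
```

If both files report the same number, the video track is unchanged and only the reported duration differs. Counting frames decodes the whole stream, so it can take a while on long files.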