This script runs Optical Character Recognition (OCR) on the images in a specified folder, and separately extracts and transcribes the audio from a video file. The OCR text and the audio transcription are saved to separate output text files.
- OCR from Images: Extracts text from all image files in a specified folder (`frames`) and saves it to `knowledge2.txt` one directory level up.
- Audio Transcription: Extracts audio from a specified video file (`input.mkv`), transcribes it using the Whisper model, and saves the transcription to a text file named `input-transcription.txt`.
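
The two steps above could be sketched roughly as follows. This is a minimal illustration, not the script's actual code. Several things are assumptions: the OCR engine (`pytesseract` with Pillow; the description does not name one), the use of the `ffmpeg` command-line tool for audio extraction, the `openai-whisper` package with its `base` model, the list of image extensions, the reading of "one directory level up" as the parent of the image folder, and writing the transcription next to the video. The helper names (`list_images`, `ocr_output_path`, `transcription_path`, `ocr_folder`, `transcribe_video`) are hypothetical.

```python
from pathlib import Path

# Assumed set of image extensions; the description only says "image files".
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".bmp", ".tif", ".tiff"}


def list_images(folder):
    """Return the image files directly inside `folder`, sorted by path."""
    return sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
    )


def ocr_output_path(folder, name="knowledge2.txt"):
    """Assumption: 'one directory level up' means the parent of the image folder."""
    return Path(folder).resolve().parent / name


def transcription_path(video):
    """Map input.mkv to input-transcription.txt (assumed to sit next to the video)."""
    video = Path(video)
    return video.with_name(f"{video.stem}-transcription.txt")


def ocr_folder(folder="frames"):
    """OCR every image in `folder` and write the combined text to one file."""
    # Assumption: pytesseract and Pillow are installed; imported lazily so the
    # path helpers above work without them.
    import pytesseract
    from PIL import Image

    texts = [pytesseract.image_to_string(Image.open(p)) for p in list_images(folder)]
    out = ocr_output_path(folder)
    out.write_text("\n".join(texts), encoding="utf-8")
    return out


def transcribe_video(video="input.mkv", model_name="base"):
    """Extract the audio track with ffmpeg, then transcribe it with Whisper."""
    # Assumption: the ffmpeg CLI is on PATH and the openai-whisper package is installed.
    import subprocess
    import whisper

    audio = Path(video).with_suffix(".wav")
    subprocess.run(
        ["ffmpeg", "-y", "-i", str(video), "-vn", "-ac", "1", "-ar", "16000", str(audio)],
        check=True,  # raise if ffmpeg fails instead of transcribing a missing file
    )
    result = whisper.load_model(model_name).transcribe(str(audio))
    out = transcription_path(video)
    out.write_text(result["text"], encoding="utf-8")
    return out


if __name__ == "__main__":
    print("OCR text written to", ocr_folder("frames"))
    print("Transcription written to", transcribe_video("input.mkv"))
```

Keeping the path logic in small helpers separate from the OCR and Whisper calls makes the file-naming behavior easy to check without installing either dependency.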