YouTube Caption Search Tool (CapScript)
CapScript is a Python console script that uses the YouTube Data API and the YouTube Transcript API to search for specific words or phrases within the captions (subtitles) of YouTube videos. It can search individual videos, multiple videos specified through a list, or videos from a particular YouTube channel. Matching captions and their timestamps are saved to a text file for easy reference.
- Supports search for words or phrases in YouTube video captions.
- Three search modes available: Video ID(s), Channel ID, or Video ID(s) from a file.
- Lets you specify how many videos to search when searching by channel.
- Allows selection of the language for caption search (default: English - "en").
- Saves and loads preferences, including the YouTube Data API key.
- Displays progress during the search, along with an estimated time of completion.
- YouTube Data API Key: The script requires a valid YouTube Data API key. If you don't have one, you can obtain it by following the YouTube API Documentation.
- Python version >= 3.7: Download Python
- Python Libraries: Ensure you have the following Python libraries installed:
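- youtube_transcript_api (pip package: youtube-transcript-api)
- googleapiclient (pip package: google-api-python-client)
- google-auth-oauthlib
- configparser (included in the Python 3 standard library)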
You can install them using pip:
pip install youtube-transcript-api google-api-python-client google-auth-httplib2 google-auth-oauthlib configparser
- Monospaced Font for Terminal: To ensure proper display of Unicode characters in the terminal, a monospaced font is recommended. Most modern terminals and command prompts support Unicode characters, but a monospaced font can improve readability. Common monospaced fonts include "Courier New", "Consolas", "DejaVu Sans Mono", and "Monaco". Most terminals use a monospaced (TTF) font by default.
- API Key Configuration: Before running the script, you need to configure the YouTube Data API key. If you have not set it previously or want to change it, the script will prompt you to enter a valid API key.
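The key-handling behavior described above can be sketched with the standard-library configparser module. This is a minimal illustration, not CapScript's actual code: the section name `Preferences`, the option name `api_key`, and all three function names are hypothetical, and only the file name `preferences.ini` comes from this README.

```python
import configparser
from pathlib import Path

PREFS_FILE = Path("preferences.ini")  # file name taken from this README
SECTION = "Preferences"               # hypothetical section name
OPTION = "api_key"                    # hypothetical option name


def load_api_key(path=PREFS_FILE):
    """Return the stored API key, or None if nothing has been saved yet."""
    config = configparser.ConfigParser(interpolation=None)
    config.read(path)  # a missing file is ignored rather than raising
    return config.get(SECTION, OPTION, fallback=None)


def save_api_key(api_key, path=PREFS_FILE):
    """Write the API key to the preferences file, replacing any old value."""
    config = configparser.ConfigParser(interpolation=None)
    config.read(path)
    if not config.has_section(SECTION):
        config.add_section(SECTION)
    config.set(SECTION, OPTION, api_key)
    with open(path, "w", encoding="utf-8") as handle:
        config.write(handle)


def get_api_key(path=PREFS_FILE, prompt=input):
    """Load the saved key; if none exists, prompt for one and save it."""
    api_key = load_api_key(path)
    if not api_key:
        api_key = prompt("Enter your YouTube Data API key: ").strip()
        save_api_key(api_key, path)
    return api_key
```

Calling `get_api_key()` on a first run would prompt for the key and write it to the file; later runs would read it back without prompting.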
- Download the CapScript.py file or use git clone:
  git clone https://github.com/yanpuri/CapScript.git
- Inside the installation directory, run the script:
  python CapScript.py
- Search Mode Selection: The script will prompt you to choose a search mode: "Channel" or "Video".
  - "Channel": You will be asked to enter the YouTube channel ID, the number of videos to search, and the caption language.
  - "Video": You can either enter individual video IDs separated by commas or provide a path to a file containing the video IDs.
- Search Term Input: After selecting the search mode, enter the word or phrase you want to search for within the captions.
- Results: The script will begin the search process and display a progress indicator. After completion, it will show the total number of matches found. The matching captions and corresponding timestamps will be saved in a text file inside the "transcripts" folder.
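To illustrate the matching step, here is a minimal sketch of searching a transcript for a term and pairing each hit with a timestamp. It assumes the transcript is already available as a list of dicts with "text" and "start" (seconds) keys. Fetching captions is left out, and the function names, the case-insensitive matching, and the HH:MM:SS format are assumptions rather than CapScript's confirmed behavior.

```python
def format_timestamp(seconds):
    """Convert a number of seconds to an HH:MM:SS string."""
    total = int(seconds)
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def find_matches(transcript, term):
    """Return (timestamp, text) pairs for caption entries containing term.

    The transcript is assumed to be a list of dicts with "text" and
    "start" keys; matching ignores case.
    """
    needle = term.casefold()
    return [
        (format_timestamp(entry["start"]), entry["text"])
        for entry in transcript
        if needle in entry["text"].casefold()
    ]


# Made-up caption data, used only to show the shape of the result.
sample = [
    {"text": "Welcome back to the channel", "start": 0.0},
    {"text": "Today we talk about Python decorators", "start": 75.4},
    {"text": "PYTHON makes this easy", "start": 3725.9},
]

for timestamp, text in find_matches(sample, "python"):
    print(f"[{timestamp}] {text}")
```

Each returned pair could then be written as one line of the results file.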
To use CapScript, you need a valid YouTube Data API key. Follow the steps below to obtain one:
- Create a Google Developer Console Project: Go to the Google Developer Console and create a new project.
- Enable YouTube Data API: Inside your project, navigate to "APIs & Services" > "Dashboard" and click the "+ ENABLE APIS AND SERVICES" button. Search for "YouTube Data API v3" and enable it.
- Create Credentials: In the left-hand menu, click "Credentials" > "Create Credentials" > "API key". A pop-up will appear, showing your API key.
- Restrict API Key (Optional but Recommended): To improve security, restrict the API key to only the APIs you need. You can set restrictions for the YouTube Data API within the "Credentials" settings.
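Once a key exists, it travels with each Data API request as the `key` query parameter. The sketch below only builds a request URL for the v3 search endpoint so you can see where the key goes; it sends nothing. Whether CapScript uses this particular endpoint is an assumption, and the function name and placeholder values are hypothetical.

```python
from urllib.parse import urlencode

SEARCH_ENDPOINT = "https://www.googleapis.com/youtube/v3/search"


def build_channel_search_url(api_key, channel_id, max_results=10):
    """Build a search URL listing a channel's most recent videos."""
    # Keep the page size within 1..50, the upper bound this sketch assumes.
    page_size = max(1, min(int(max_results), 50))
    params = {
        "part": "id",
        "channelId": channel_id,
        "type": "video",
        "order": "date",
        "maxResults": page_size,
        "key": api_key,
    }
    return f"{SEARCH_ENDPOINT}?{urlencode(params)}"


# Placeholder values only; substitute your own key and channel ID.
print(build_channel_search_url("YOUR_API_KEY", "YOUR_CHANNEL_ID", 25))
```

Because the key appears in plain text in the URL, restricting it as described in the last step limits the damage if it leaks.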
To search for videos associated with a specific YouTube channel, you need a unique Channel ID. Here's how you can find it:
- Open YouTube Channel: In your web browser, go to the YouTube channel you want to search.
- View Page Source: Right-click anywhere on the page and select "View Page Source" or "Inspect" from the context menu.
- Search for Channel ID: In the page source view, press Ctrl+F (Windows/Linux) or Cmd+F (Mac) to open the search function. Enter ?channel_id in the search box.
- Locate the Channel ID: The search will highlight the "?channel_id" parameter in the page source, and the value next to it is the Channel ID. It is typically a string of letters and numbers.
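If you have saved the page source to a string, the manual search above can be approximated with a regular expression. This is a rough sketch: the exact markup YouTube serves is an assumption and may change, the function name is hypothetical, and the sample ID is made up.

```python
import re

# Matches the value that follows "?channel_id=" in the page source.
CHANNEL_ID_PATTERN = re.compile(r"\?channel_id=([A-Za-z0-9_-]+)")


def extract_channel_id(page_source):
    """Return the first channel ID found in the page source, or None."""
    match = CHANNEL_ID_PATTERN.search(page_source)
    return match.group(1) if match else None


# Made-up snippet standing in for a saved channel page.
sample_source = (
    '<link rel="alternate" type="application/rss+xml" '
    'href="https://www.youtube.com/feeds/videos.xml'
    '?channel_id=UCabcdefghijklmnopqrstuv">'
)

print(extract_channel_id(sample_source))
```

A result of None means the parameter was not present in the text you supplied, in which case the manual steps are the fallback.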
- The preferences.ini file will be created and used to store the API key. This ensures you don't need to re-enter the key each time you run the script.
- The script will skip videos without available captions or with disabled subtitles.
- If you run the script multiple times, make sure to use the same API key to avoid API usage issues.
Important: The script uses the YouTube Data API, which has request quotas and limitations. Make sure to respect the API usage guidelines to avoid potential restrictions.
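A little arithmetic makes the quota note concrete. The sketch below estimates how many paged list requests a channel search of a given size would need and what they would cost. The per-request cost of 100 units and the page size of 50 are assumptions used for illustration; check the current YouTube Data API quota documentation for the real figures. The function name is hypothetical.

```python
import math


def estimate_quota_cost(num_videos, cost_per_request, page_size=50):
    """Return (requests, total_units) needed to list num_videos video IDs."""
    if num_videos <= 0:
        return 0, 0
    requests = math.ceil(num_videos / page_size)
    return requests, requests * cost_per_request


# Assumed figures: 100 units per request, up to 50 results per page.
requests, units = estimate_quota_cost(num_videos=120, cost_per_request=100)
print(f"{requests} requests, about {units} quota units")
```

Comparing the estimate with your project's daily quota before a large channel search helps avoid running out partway through.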
If you find CapScript useful, consider supporting me by:
- Starring the repository on GitHub
- Sharing the tool with others
- Providing feedback and suggestions
- Following me for more :)
For any issues or feature requests, please open an issue on GitHub. Happy coding!